Here you find two datasets for Instance Segmentation based on Cataract Surgery Videos, that were created for content-based video analysis (i.e. surgical tool detection). The annotations for the images in both datasets are in COCO-Format and were created manually (dataset 1) / automatically (dataset 2).
This is a careful selection of frames out of our public collection of cataract surgery videos , that was manually annotated using COCO-annotator. In the selection process we focused on frames with the best visual quality (e.g. no blurriness) and made sure to have a variety of different poses for each class. The final dataset at publication includes annotations for 9 important instruments used in the widely-used COCO-Format. In total we have annotated 393 different images. The classes are balanced and you will not find images of the same video across the sets.
UPDATE, October 2020:
We have extended this dataset with additional images (of better quality) and added missing instrument classes. The full dataset now consists of 843 annotated images and 11 classes.
You find the newest version of the dataset by downloading InSegCat dataset 1 v2.0.
The following class mapping is found in dataset 1:
For dataset 2 we converted the information provided with the public CaDIS-dataset for Image Segmentation  from a Semantic Segmentation task to an Instance Segmentation task by taking the existing mask-images and extracting bounding-boxes as well as instance masks of the desired classes. As a result this dataset is 9x larger than dataset 1 and of the same format. In total there are 4738 annotated images and we keep the same split for training-/validation-/testset images as found in the original format: 3582 training, 542 validation and 614 testing.
The following classes are annotated in the dataset:
Original images and mask images were taken from CaDIS-dataset and Annotations were generated by us.
The datasets are exclusively provided for scientific research purposes and as such cannot be used commercially or for any other purpose. If any other purpose is intended, you may directly contact the originator of the videos, Assoc. Prof. Dr. Klaus Schöffmann.
In addition, reference must be made to the following publication  when this dataset is used in any academic and research reports:
M. Fox, M. Taschwer, K. Schoeffmann. 2020. Pixel-Based Tool Segmentation in Cataract Surgery Videos with Mask R-CNN. To appear: IEEE 33rd International Symposium on Computer Based Medical Systems (CBMS).
* Images of dataset 2 are taken from the CaDIS-dataset, hence, when utilizing this dataset you are requested to as well refer to Flouty et al .
The datasets are licensed under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0, ) and is created as well as maintained by Distributed Multimedia Systems Group of the Institute of Information Technology (ITEC) at Alpen-Adria Universität in Klagenfurt, Austria.
This license allows users of this dataset to copy, distribute and transmit the work under the following conditions:
If you agree to above conditions, you are free to download:
 K. Schoeffmann, M. Taschwer, S. Sarny, B. Münzer, M.J. Primus, and D. Putzgruber. Cataract-101: video dataset of 101 cataract surgeries. In Proceedings of the 9th ACM Multimedia Systems Conference, pages 421–425. ACM, 2018.
 M. Fox, M. Taschwer, K. Schoeffmann. 2020. Pixel-Based Tool Segmentation in Cataract Surgery Videos with Mask R-CNN. To appear: IEEE 33rd International Symposium on Computer Based Medical Systems (CBMS).
 E. Flouty, A. Kadkhodamohammadi, I. Luengo, F. Fuentes-Hurtado, H. Taleb, S. Barbarisi, G. Quellec, and D. Stoyanov. Cadis: Cataract dataset for image segmentation.