Dataset Release for "Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks"

This release contains two sets of annotations created for the paper "Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks".

Dataset 1

The first set contains manual annotations of idle and action frames in cataract surgery videos, intended for idle-frame-recognition networks. We selected 22 videos from the released Cataract-101 dataset [1], which was collected in 2017-2018 at Klinikum Klagenfurt (Austria), and labeled every frame of these videos as either an idle or an action frame. Of the 22 videos, 18 were randomly selected for training and the remaining 4 were used for testing. Subsequently, 500 idle and 500 action frames were uniformly sampled from each video, yielding 9000 frames per class in the training set (18 x 500) and 2000 frames per class in the testing set (4 x 500).
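The split and sampling described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' actual script: the video IDs, the seed, and the function names `split_videos` and `uniform_sample` are hypothetical, and "uniformly sampled" is interpreted here as evenly spaced positions within each class (a uniform random draw would be an equally plausible reading).

```python
import random


def split_videos(video_ids, n_train=18, seed=0):
    """Randomly split video IDs into a training and a testing list."""
    rng = random.Random(seed)  # seed is an assumption, chosen for reproducibility
    shuffled = list(video_ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[:n_train]), sorted(shuffled[n_train:])


def uniform_sample(frame_indices, k=500):
    """Pick k frames at evenly spaced positions from a list of frame indices."""
    n = len(frame_indices)
    if n < k:
        raise ValueError(f"need at least {k} frames, got {n}")
    # Integer arithmetic keeps the k positions distinct whenever n >= k.
    return [frame_indices[(i * n) // k] for i in range(k)]


# Hypothetical IDs standing in for the 22 annotated videos.
videos = [f"video_{i:02d}" for i in range(1, 23)]
train_videos, test_videos = split_videos(videos)

frames_per_class = 500
train_per_class = len(train_videos) * frames_per_class
test_per_class = len(test_videos) * frames_per_class
print(train_per_class, test_per_class)
```

With 18 training and 4 testing videos, the per-class totals work out to the 9000 and 2000 frames stated above. In practice, `uniform_sample` would be called once on the idle frame indices and once on the action frame indices of each video.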

Examples for idle frames:

Examples for action frames:

Dataset 2

The second set contains manual annotations of the cornea and instruments, created with the open-source Supervisely platform. We annotated the cornea in 262 frames from 11 cataract surgery videos for the eye segmentation network, and the instruments in 216 frames from the same videos for the instrument segmentation network.

Examples for cornea and instrument annotations:

Disclaimer

The datasets are provided exclusively for scientific research purposes and may not be used commercially or for any other purpose. If you intend any other use, please contact the originator of the videos, Prof. Yosuf El-Shabrawi, or Assoc. Prof. DI Dr. Klaus Schoeffmann, directly.

In addition, the following publication [2] must be cited whenever this dataset is used in academic or research reports:

N. Ghamsarian, H. Amirpour, C. Timmerer, M. Taschwer, K. Schoeffmann. 2020. Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks. In Proceedings of the 28th ACM International Conference on Multimedia (MM '20), pages 3577-3585. ACM, 2020. DOI:10.1145/3394171.3413658

BibTeX:

@inproceedings{DBLP:conf/mm/GhamsarianATTS20,
  author    = {Negin Ghamsarian and
               Hadi Amirpour and
               Christian Timmerer and
               Mario Taschwer and
               Klaus Sch{\"{o}}ffmann},
  editor    = {Chang Wen Chen and
               Rita Cucchiara and
               Xian{-}Sheng Hua and
               Guo{-}Jun Qi and
               Elisa Ricci and
               Zhengyou Zhang and
               Roger Zimmermann},
  title     = {Relevance-Based Compression of Cataract Surgery Videos Using Convolutional
                 Neural Networks},
  booktitle = {{MM} '20: The 28th {ACM} International Conference on Multimedia, Virtual
               Event / Seattle, WA, USA, October 12-16, 2020},
  pages     = {3577--3585},
  publisher = {{ACM}},
  year      = {2020},
  url       = {https://doi.org/10.1145/3394171.3413658},
  doi       = {10.1145/3394171.3413658},
  timestamp = {Wed, 26 Jan 2022 09:31:46 +0100},
  biburl    = {https://dblp.org/rec/conf/mm/GhamsarianATTS20.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

The datasets are licensed under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) and are created and maintained by the Multimedia Systems Group of the Institute of Information Technology (ITEC) at Alpen-Adria Universität in Klagenfurt, Austria.

This license allows users of this dataset to copy, distribute, and transmit the work under the following conditions:

Attribution: You must give appropriate credit (including a reference to the abovementioned publication), provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
Non-Commercial: You may not use the material for commercial purposes.

For further legal details, please read the complete license terms.

Download

If you agree to the above conditions, you are free to download:

Idle_frame_recognition dataset.zip (~6.6GB).
Mask_segmentation dataset.rar (~146MB).

References

[1] K. Schoeffmann, M. Taschwer, S. Sarny, B. Münzer, M.J. Primus, and D. Putzgruber. Cataract-101: video dataset of 101 cataract surgeries. In Proceedings of the 9th ACM Multimedia Systems Conference, pages 421–425. ACM, 2018.

[2] N. Ghamsarian, H. Amirpour, C. Timmerer, M. Taschwer, and K. Schoeffmann. Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks. In Proceedings of the 28th ACM International Conference on Multimedia (MM '20), pages 3577–3585. ACM, 2020. DOI:10.1145/3394171.3413658