Dataset Release for "Relevance Detection in Cataract Surgery Videos by Spatio-Temporal Action Localization"

This dataset contains the video annotations created for the paper "Relevance Detection in Cataract Surgery Videos by Spatio-Temporal Action Localization". The annotations support detecting and classifying the four relevant phases in cataract surgery. For each relevant phase, we provide a training set and a testing set. The training set contains 1000 three-minute sequences from the relevant phase and 1000 three-minute sequences extracted from the remaining phases, taken from eight cataract surgery videos. The testing set contains 200 three-minute sequences from the relevant phase and 200 three-minute sequences extracted from the remaining phases, taken from eight cataract surgery videos.

Dataset 1

The first set includes manual annotations of idle and action frames in cataract surgery videos for idle-frame-recognition networks. For these annotations, 22 videos were selected from the released Cataract-101 dataset, which was collected in 2017-2018 at Klinikum Klagenfurt (Austria). All frames of these 22 videos are annotated and categorized as idle or action frames. Of the annotated videos, 18 are randomly selected for training and the remaining four are used for testing. Subsequently, 500 idle and 500 action frames are uniformly sampled from each video, yielding 9000 frames per class in the training set and 2000 frames per class in the testing set.
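The split and sampling arithmetic described above can be sketched as follows. This is a minimal illustration and not the authors' actual script: the function names, the video IDs, and the random seed are hypothetical, and "uniformly sampled" is interpreted here as picking evenly spaced frames, which is an assumption.

```python
import random


def split_videos(video_ids, n_train=18, seed=0):
    """Randomly assign videos to a training and a testing set (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(video_ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[:n_train]), sorted(shuffled[n_train:])


def uniform_sample(frame_indices, n_samples=500):
    """Pick n_samples frames at evenly spaced positions in a list of frame indices.

    Assumption: "uniform" means evenly spaced, not uniformly at random.
    """
    if len(frame_indices) < n_samples:
        raise ValueError("not enough frames to sample from")
    step = len(frame_indices) / n_samples
    return [frame_indices[int(i * step)] for i in range(n_samples)]


video_ids = list(range(1, 23))  # 22 placeholder video IDs, not the real ones
train_videos, test_videos = split_videos(video_ids)

# Per-class totals follow from 500 sampled frames per class per video.
frames_per_class = 500
train_total = len(train_videos) * frames_per_class
test_total = len(test_videos) * frames_per_class
print(train_total, test_total)  # 9000 2000
```

Applying `uniform_sample` separately to the idle and the action frame indices of each video would give the 500 frames per class stated above.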

Examples for Implantation Phase:

Examples for Irrigation_Aspiration Plus Viscoelastic_suction Phase:

Examples for Phacoemulsification Phase:

Examples for Rhexis Phase:


The datasets are exclusively provided for scientific research purposes and as such cannot be used commercially or for any other purpose. If any other purpose is intended, you may directly contact the originator of the videos, Prof. Yosuf El-Shabrawi, or Assoc. Prof. DI Dr. Klaus Schoeffmann.

In addition, the following publication must be cited whenever this dataset is used in academic or research reports:

N. Ghamsarian, M. Taschwer, D. Putzgruber-Adamitsch, S. Sarny, K. Schoeffmann. 2020. Relevance Detection in Cataract Surgery Videos by Spatio-Temporal Action Localization. 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 2021, 8 pages. DOI:10.1109/ICPR48806.2021.9412525

    author    = {Negin Ghamsarian and
                 Mario Taschwer and
                 Doris Putzgruber{-}Adamitsch and
                 Stephanie Sarny and
                 Klaus Schoeffmann},
    title     = {Relevance Detection in Cataract Surgery Videos by Spatio-Temporal
                 Action Localization},
    booktitle = {25th International Conference on Pattern Recognition, {ICPR} 2020,
                 Virtual Event / Milan, Italy, January 10-15, 2021},
    pages     = {10720--10727},
    publisher = {{IEEE}},
    year      = {2020},
    url       = {},
    doi       = {10.1109/ICPR48806.2021.9412525},
    timestamp = {Fri, 07 May 2021 12:53:57 +0200},
    biburl    = {},
    bibsource = {dblp computer science bibliography,}

The datasets are licensed under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) and are created and maintained by the Multimedia Systems Group of the Institute of Information Technology (ITEC) at Alpen-Adria Universität in Klagenfurt, Austria.

This license allows users of this dataset to copy, distribute, and transmit the work, provided that the work is attributed to its creators and is not used for commercial purposes.

For further legal details, please read the complete license terms.


If you agree to the above conditions, you are free to download:

Relevance_Detection.rar (75 GB)


[1] N. Ghamsarian, H. Amirpour, C. Timmerer, M. Taschwer, K. Schoeffmann. 2020. Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks. In Proceedings of the ACM International Conference on Multimedia (ACMMM2020), pages 1-9. ACM, 2020 (to appear)

[2] K. Schoeffmann, M. Taschwer, S. Sarny, B. Münzer, M.J. Primus, and D. Putzgruber. Cataract-101: video dataset of 101 cataract surgeries. In Proceedings of the 9th ACM Multimedia Systems Conference, pages 421–425. ACM, 2018.