Challenge Overview
The Critical View of Safety (CVS) in laparoscopic cholecystectomy - one of the most meaningful, clinically validated surgical safety measures - consists of three visually distinct criteria to be achieved over the course of the surgical procedure. Numerous AI models have been proposed to assess the CVS or otherwise improve safety in laparoscopic cholecystectomy. While these algorithms generally show promising results, thorough testing on large, diverse, and representative datasets is necessary to assess real-world performance and build trust prior to clinical translation.
The goal of the SAGES Critical View of Safety Challenge is to bridge this crucial gap. To this end, we have built a uniquely large and diverse dataset of 1000 laparoscopic cholecystectomy videos from across the globe, meticulously annotated by clinical experts for the development of AI algorithms targeting CVS assessment. We strongly believe that this challenge, centered around such a dataset, can foster the development of effective, robust, and trustworthy AI solutions for safe laparoscopic cholecystectomy.
The 2024 edition of the challenge, conducted at MICCAI 2024 in Marrakesh, focused on frame-level CVS prediction, which is, at its essence, a multi-label classification task. It included 3 sub-challenges that measured accuracy, robustness, and uncertainty awareness respectively, and attracted 13 submissions that not only advanced the state of the art in CVS prediction accuracy but also demonstrated impressive robustness to data from under-represented clinical settings and strong uncertainty calibration.
This year, we are expanding the scope of the challenge to target two additional goals: (1) computational efficiency and (2) explicit surgical scene understanding. The former is particularly relevant for applicability in low-resource settings, while the latter is crucial in developing trustworthy AI systems. To measure progress towards these goals, we introduce two new sub-challenges: a CPU-only evaluation sub-challenge and a scene segmentation sub-challenge.
Dataset
Check out the Dataset page for details!
Sub-Challenges
We will have a total of 3 sub-challenges this year. Participants will make one submission FOR EACH sub-challenge. Note that this is a change from last year’s edition, where the same submission was evaluated for each sub-challenge.
Sub-Challenge A: CVS Classification
The CVS is a well-established surgical safety measure that targets the prevention of common bile duct injuries, a feared complication with a significant impact on patients' recovery and survival. The CVS consists of three criteria, defined by the dissection and isolation of three anatomic landmarks, which can be visually distinguished in the intraabdominal video obtained from the laparoscope. For this challenge, these criteria were defined in a consensus-based Annotation Protocol as follows:
Two Structures: two and only two tubular structures (the cystic duct and the cystic artery) are seen entering the gallbladder.
Hepatocystic Triangle: the hepatocystic triangle is cleared of fat and fibrous tissue.
Cystic Plate: the lower third of the gallbladder is dissected off the liver to expose the cystic plate.
Sub-Challenge B: Computationally Efficient CVS Classification
Computationally efficient CVS prediction models are essential for deployment in low-resource settings. To encourage development in this direction, the 2025 edition of the challenge introduces a computational efficiency sub-challenge, in which models are required to run entirely on CPU. The compute environment is as follows:
8 CPU cores
2.90 GHz clock speed
32 GB RAM
Submissions will be evaluated with the same metrics as in Sub-Challenge A. Participants should prepare a separate submission for this sub-challenge.
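To make the constraint concrete, here is a minimal sketch of per-frame CPU inference under these limits, assuming a PyTorch model exported with TorchScript; the file name, input resolution, and model are placeholders, not challenge specifics.

```python
# Minimal sketch: timing per-frame CVS inference on CPU.
# Assumptions: a TorchScript model file "cvs_classifier.pt" (hypothetical)
# and 224x224 RGB input; neither is specified by the challenge.
import time

import torch

torch.set_num_threads(8)  # match the 8-core evaluation environment

model = torch.jit.load("cvs_classifier.pt", map_location="cpu")
model.eval()

frame = torch.rand(1, 3, 224, 224)  # one dummy RGB frame

with torch.inference_mode():
    start = time.perf_counter()
    logits = model(frame)
    elapsed = time.perf_counter() - start

probs = torch.sigmoid(logits)  # 3 criteria -> independent sigmoid scores
print(f"per-frame latency: {elapsed * 1000:.1f} ms, scores: {probs.squeeze().tolist()}")
```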
Sub-Challenge C: Hepatocystic Anatomy Segmentation
Details coming soon!
Evaluation
In the 2024 challenge, the 3 sub-challenges evaluated A SINGLE per-frame CVS classification model in different scenarios, aiming to test different aspects of model performance. Namely:
Overall CVS Classification: Performance on per-frame CVS criteria prediction, where the ground truth is generated by majority-vote consensus of 3 annotators. Performance is measured as the mean average precision (mAP) across all annotated frames and each criterion.
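For concreteness, here is a minimal sketch of this metric, assuming binary labels and model scores are stored as (num_frames, 3) arrays, one column per criterion; scikit-learn's average_precision_score computes the per-criterion AP.

```python
# Sketch of the overall mAP: per-criterion average precision over all
# annotated frames, then the mean across the three CVS criteria.
import numpy as np
from sklearn.metrics import average_precision_score

def overall_map(y_true: np.ndarray, y_score: np.ndarray) -> float:
    """y_true, y_score: arrays of shape (num_frames, 3)."""
    per_criterion = [
        average_precision_score(y_true[:, c], y_score[:, c])
        for c in range(y_true.shape[1])
    ]
    return float(np.mean(per_criterion))
```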
CVS Classification Robustness: Performance on per-frame CVS criteria prediction, where the testing videos are divided into numerous subsets representing potential data distribution shifts when moving to "unseen" data. A subset mAP is computed for each of these subsets, then all the subset mAP values are averaged to obtain the Robustness mAP.
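A sketch of the corresponding computation, reusing overall_map from the snippet above; how frames are grouped into subsets is an assumption here, as the actual splits are defined by the organizers.

```python
# Sketch of the Robustness mAP: compute mAP per distribution-shift subset,
# then average the subset mAPs. `subsets` maps a subset name to the indices
# of its frames (a hypothetical representation of the official splits).
import numpy as np

def robustness_map(y_true, y_score, subsets: dict) -> float:
    subset_maps = [
        overall_map(y_true[idx], y_score[idx])  # from the sketch above
        for idx in subsets.values()
    ]
    return float(np.mean(subset_maps))
```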
Uncertainty-Aware CVS Classification: Performance on per-frame CVS criteria prediction, where the ground truth is recomputed to take annotation confidence into account. As the resulting ground truth is probabilistic, performance is measured using mean squared error (lower is better), averaged first across frames and then across criteria.
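The metric itself can be sketched as follows, assuming the probabilistic ground truth and the predicted probabilities are both (num_frames, 3) arrays; the construction of the confidence-aware ground truth follows the organizers' Annotation Protocol and is not reproduced here.

```python
# Sketch of the uncertainty-aware metric: mean squared error between
# predicted probabilities and the probabilistic ground truth.
import numpy as np

def uncertainty_mse(y_prob: np.ndarray, y_pred: np.ndarray) -> float:
    per_criterion = ((y_pred - y_prob) ** 2).mean(axis=0)  # over frames
    return float(per_criterion.mean())  # then over criteria
```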
In clinical practice, solutions should jointly optimize for each of these scenarios. As a result, for the 2025 challenge, we will rank each submission based on each of these 3 metrics, then average the 3 ranks to obtain the final rank. Ties will be broken based on overall CVS classification rank.
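A sketch of this rank aggregation, assuming each team has the three metric values defined above; note that higher is better for the two mAPs while lower is better for the MSE.

```python
# Sketch of the final ranking: rank teams per metric, average the three
# ranks, and break ties with the overall CVS classification rank.
import numpy as np

def final_ranking(scores: dict) -> list:
    """scores: team -> (overall_map, robustness_map, uncertainty_mse)."""
    teams = list(scores)
    vals = np.array([scores[t] for t in teams], dtype=float)
    vals[:, 2] = -vals[:, 2]  # negate MSE so "higher is better" everywhere
    # rank 1 = best on each metric
    ranks = np.stack([(-vals[:, m]).argsort().argsort() + 1 for m in range(3)], axis=1)
    mean_rank = ranks.mean(axis=1)
    order = sorted(range(len(teams)), key=lambda i: (mean_rank[i], ranks[i, 0]))
    return [teams[i] for i in order]
```

For example, with scores {"A": (0.71, 0.66, 0.040), "B": (0.69, 0.68, 0.035)}, team B wins on mean rank (1.33 vs 1.67) despite the lower overall mAP.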
Important Dates
03/28: Registration Opens!
04/21: Release of Segmentation Annotations, Sub-Challenge C Guidelines
05/30: Release of Official Submission Instructions
07/18: Registration Deadline
07/28 - 08/15: Submission Validation Period
08/29: Challenge Submission Deadline