Track Overview
The PROCEDURE Track is the long-context video track of the ORena SAVE FOCUS Challenge. It evaluates whether a submitted algorithm can answer clinically relevant questions about a laparoscopic video context that may span up to a full procedure, using only the provided video context and the provided question.
This page provides the technical track specification. Dataset composition, taxonomy details, resources, and general challenge background are described in the corresponding Overview and Data tabs.
PROCEDURE Track
Description
The PROCEDURE Track focuses on long-context surgical video understanding. Each task instance consists of a laparoscopic procedure-level video context and a natural-language question about foreign objects, actions, events, object-related context, or retrieval status within the provided video context. The algorithm must return a short text answer.
The track targets all capabilities listed in the taxonomy overview in the data section, provided that the answer can be inferred from the supplied procedure-level video context and its associated question metadata.
Algorithm Docker Input
The algorithm input consists of the procedure-level video context and the question. The question includes the metadata and the question text itself.
| Video context | Laparoscopic video context up to a full procedure. |
| Question | Natural-language VQA question including the relevant metadata, such as procedure name, expected output, and list of foreign objects. |
The exact file structure and schema will follow the official submission template repository.
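Until the template repository is consulted, the question input described above can be pictured roughly as follows. This is a minimal sketch only: the JSON layout, the key names (`question`, `procedure_name`, `expected_output`, `foreign_objects`), and the `Question` class are assumptions made for illustration, not the official schema.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class Question:
    """Hypothetical in-memory form of one PROCEDURE question (not the official schema)."""
    text: str
    procedure_name: str = ""
    expected_output: str = ""
    foreign_objects: List[str] = field(default_factory=list)


def parse_question(raw: str) -> Question:
    """Parse a JSON string with the assumed keys into a Question.

    Only the question text is treated as required; the metadata
    fields fall back to empty values when they are absent.
    """
    data = json.loads(raw)
    return Question(
        text=data["question"],
        procedure_name=data.get("procedure_name", ""),
        expected_output=data.get("expected_output", ""),
        foreign_objects=list(data.get("foreign_objects", [])),
    )
```

Keeping the metadata optional in the sketch means the parser does not break if the real template names or omits a field differently; the mapping should be adjusted once the official schema is known.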
Algorithm Docker Output
| Answer | Short text answer to the provided question. |
The exact output format and validation rules will follow the official submission template repository.
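Because the required output is a short text answer, it can help to normalize the raw model output before writing it. The following is a minimal sketch under stated assumptions: the function name `format_answer` and the 200-character cap are hypothetical and are not taken from the official validation rules.

```python
def format_answer(raw: str, max_chars: int = 200) -> str:
    """Collapse whitespace and cap the length of a raw model answer.

    `max_chars` is an illustrative limit, not an official one.
    """
    # split() without arguments drops leading/trailing whitespace and
    # treats any run of spaces, tabs, or newlines as one separator.
    text = " ".join(raw.split())
    # Truncate, then remove a trailing space left by the cut.
    return text[:max_chars].rstrip()
```

For example, `format_answer("  yes,\n retrieved  ")` yields a single-line string with no surrounding whitespace.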
Runtime Environment
| AWS Hardware | NVIDIA H100 GPU, 80GB VRAM |
| Time Limit | 30 seconds per question |
| Execution | Docker container. No internet access during inference. |
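With 30 seconds per question and a video context that may cover a full procedure, a submission will likely need to bound its own work. The sketch below shows one possible approach, not a prescribed one: evenly spaced frame sampling under a fixed frame budget, plus a loop that stops refining the answer when the deadline is near. The names `sample_frame_indices` and `answer_with_deadline`, the frame budget, and the 5-second safety margin are all assumptions.

```python
import time
from typing import Callable, Iterable, List


def sample_frame_indices(num_frames: int, budget: int) -> List[int]:
    """Pick at most `budget` evenly spaced frame indices from a video."""
    if num_frames <= 0 or budget <= 0:
        return []
    if budget >= num_frames:
        return list(range(num_frames))
    step = num_frames / budget
    return [int(i * step) for i in range(budget)]


def answer_with_deadline(
    steps: Iterable[Callable[[str], str]],
    fallback: str,
    limit_s: float = 30.0,   # per-question limit from the track description
    margin_s: float = 5.0,   # hypothetical safety margin
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Run refinement steps in order and return the latest answer.

    The deadline is checked only between steps, so a single
    long-running step is not interrupted by this function.
    """
    deadline = clock() + limit_s - margin_s
    answer = fallback
    for step in steps:
        if clock() >= deadline:
            break
        answer = step(answer)
    return answer
```

Each step takes the current answer and returns an improved one, so a coarse answer is always available if time runs out. Because the check happens between steps, individual steps should be kept short relative to the limit.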
Evaluation Scope
PROCEDURE submissions are evaluated on long-context surgical video question answering. Questions are restricted to information that can be inferred from the provided procedure-level video context and question metadata. The track evaluates persistent object tracking, temporal grounding, aggregation over time, event and procedural understanding, retrieval-status reasoning, and visually grounded complex reasoning over extended surgical context.
Official Track Document
For the full formal specification, please consult the official PROCEDURE Track document:
👉 PROCEDURE Track PDF