SAMannot

A Segment Anything (SAM2) based video annotation tool for efficient keyframe annotation and data propagation.

Gergely Dinya1, András Gelencsér1, Krisztina Kupán2, Clemens Küpper2, Kristóf Karacs3,
Anna Gelencsér-Horváth1,3*

1 Faculty of Informatics, Eötvös Loránd University, Budapest, Hungary
2 Max Planck Institute for Biological Intelligence, Seewiesen, Germany
3 Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary

* Corresponding author: gha@itk.ppke.hu


Citation

@misc{samannot,
  title={SAMannot: A Memory-Efficient, Local, Open-source Framework for Interactive Video Instance Segmentation based on SAM2},
  author={Gergely Dinya and Andr{\'a}s Gelencs{\'e}r and Krisztina Kup{\'a}n and Clemens K{\"u}pper and Krist{\'o}f Karacs and Anna Gelencs{\'e}r-Horv{\'a}th},
  year={2026},
  eprint={2601.11301},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2601.11301},
}

About

SAMannot is a versatile video annotation tool built on top of Meta's Segment Anything Model 2 (SAM2). It helps you create high-quality segmentation masks across video frames with minimal user interaction.
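In SAMannot, SAM2 performs the actual mask propagation from annotated keyframes to the remaining frames. The sketch below is only an illustration of that workflow in plain Python, not the tool's implementation: the nearest-keyframe assignment is a hypothetical stand-in for SAM2's learned video propagation.

```python
# Illustrative sketch of the keyframe-annotation workflow (NOT SAMannot's
# actual implementation): a user labels a few keyframes, and every other
# frame inherits the mask of its nearest annotated keyframe. In the real
# tool, SAM2 propagates masks using learned video features instead.

def propagate_keyframes(num_frames: int, keyframe_masks: dict[int, str]) -> list[str]:
    """Assign each frame the mask of its nearest annotated keyframe."""
    if not keyframe_masks:
        raise ValueError("at least one keyframe must be annotated")
    keyframes = sorted(keyframe_masks)
    masks = []
    for frame in range(num_frames):
        # Pick the annotated keyframe closest in time to this frame.
        nearest = min(keyframes, key=lambda k: abs(k - frame))
        masks.append(keyframe_masks[nearest])
    return masks

# Two annotated keyframes cover a ten-frame clip:
masks = propagate_keyframes(10, {0: "mask_A", 7: "mask_B"})
print(masks)  # frames 0-3 inherit mask_A, frames 4-9 inherit mask_B
```

The point of the sketch is the economy of annotation: a handful of keyframes is enough to label every frame, and the user only corrects frames where propagation fails.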

Getting Started

For full details, see the SAMAnnot repository. A minimal example workflow is shown below.

Clone the repository

git clone https://github.com/gergelydinya/SAMAnnot.git
cd SAMAnnot
conda create -n samannot python=3.10 -y
conda activate samannot
            

Installation (Linux)

pip install -r requirements.txt
cd sam2
pip install -e .
cd ..
pip install --index-url https://download.pytorch.org/whl/cu121 torch torchvision torchaudio
cd checkpoints
./download_ckpts.sh
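After installation, a quick sanity check helps catch a broken environment before launching the tool. The snippet below is a stdlib-only sketch; the package names checked are taken from the installation steps above.

```python
# Check that the key dependencies installed above are importable,
# without actually importing them (avoids slow CUDA initialization).
import importlib.util

def missing_packages(names: list[str]) -> list[str]:
    """Return the subset of `names` that cannot be found by the importer."""
    return [n for n in names if importlib.util.find_spec(n) is None]

missing = missing_packages(["torch", "torchvision", "sam2"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All key dependencies found.")
```

If anything is reported missing, re-run the corresponding `pip install` step inside the activated `samannot` environment.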

Run the tool

conda activate samannot
python main.py