VNL-STES: a benchmark dataset and model for spatiotemporal event spotting in volleyball analytics
Volleyball video analytics require precisely detecting both the timing and location of key events. We introduce a novel task: Precise Spatiotemporal Event Spotting, which seeks to accurately determine when and where important events occur within a video. To this end, we created the Volleyball Nations League (VNL) Dataset, including 8 full games, 1,028 rally videos, and 6,137 annotated events with both temporal and spatial localization. Our best model, the Spatiotemporal Event Spotter (STES), outperforms the current state-of-the-art (SOTA) in temporal action spotting by 9.86 mean Temporal Average Precision (mTAP) and achieves a notable 80.21 mAP for spatial localization, accurately pinpointing event locations within a 2-6 pixel range. To the best of our knowledge, this is the first work addressing Precise Spatiotemporal Event Spotting in volleyball, establishing a strong baseline for future research in this domain. The code and data for this paper are available publicly at: https://hoangqnguyen.github.io/stes
© Copyright 2025 Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. Published by IEEE. All rights reserved.
| Subjects: | |
|---|---|
| Notations: | sport games technical and natural sciences |
| Tagging: | artificial intelligence video analysis |
| Published in: | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops |
| Language: | English |
| Published: | Piscataway, NJ: IEEE, 2025 |
| Online Access: | https://openaccess.thecvf.com/content/CVPR2025W/CVSPORTS/html/Nguyen_VNL-STES_A_Benchmark_Dataset_and_Model_for_Spatiotemporal_Event_Spotting_CVPRW_2025_paper.html |
| Pages: | 5861-5870 |
| Document types: | article |
| Level: | advanced |