Multi-person physics-based pose estimation for combat sports

This paper introduces a novel framework for 3D pose estimation in combat sports. Using a sparse multi-camera setup, our approach employs a computer vision-based tracker to extract 2D pose predictions from each camera view, enforcing consistent tracking targets across views with epipolar constraints and long-term video object segmentation. A top-down transformer-based approach ensures high-quality 2D pose extraction. We estimate the 3D position via weighted triangulation and spline fitting. By employing kinematic optimization and multi-person physics-based trajectory refinement, we achieve state-of-the-art accuracy and robustness under challenging conditions such as occlusion, rapid movements and close interactions. Experimental validation on diverse datasets, including a custom dataset featuring elite boxers, underscores the effectiveness of our approach. Additionally, we contribute a valuable video dataset to advance research in multi-person tracking, in particular for combat sports.
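The weighted triangulation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a standard linear (DLT) triangulation in which each view's two equations are scaled by a per-view weight, for example the 2D keypoint confidence. The names `triangulate_weighted` and `project`, the toy camera matrices, and the weights are all hypothetical and chosen only for illustration.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 camera matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate_weighted(projections, points_2d, weights):
    """Weighted linear (DLT) triangulation of one joint from several views.

    projections: list of 3x4 camera matrices (assumed known from calibration)
    points_2d:   list of (u, v) detections, one per view
    weights:     list of per-view weights, e.g. 2D keypoint confidences (assumption)
    """
    rows = []
    for P, (u, v), w in zip(projections, points_2d, weights):
        # Each view contributes two linear constraints on the homogeneous point,
        # scaled by its weight so that unreliable views count less.
        rows.append(w * (u * P[2] - P[0]))
        rows.append(w * (v * P[2] - P[1]))
    A = np.stack(rows)
    # The least-squares solution is the right singular vector belonging
    # to the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    X_h = vt[-1]
    return X_h[:3] / X_h[3]

# Toy setup: three cameras with identity rotation and shifted centers (hypothetical).
cameras = [
    np.hstack([np.eye(3), np.array([[0.0], [0.0], [0.0]])]),
    np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])]),
    np.hstack([np.eye(3), np.array([[0.0], [-1.0], [0.0]])]),
]
joint_3d = np.array([0.2, -0.1, 4.0])
detections = [project(P, joint_3d) for P in cameras]
confidences = [0.9, 0.6, 0.8]

estimate = triangulate_weighted(cameras, detections, confidences)
```

With noise-free detections the estimate coincides with the original point up to numerical precision; with noisy detections, low-weight views pull the solution less than high-weight ones.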
© Copyright 2025 Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. Published by IEEE. All rights reserved.

Bibliographic Details
Subjects:
Notations:technical and natural sciences; combat sports
Tagging:pose recognition
Published in:Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops
Language:English
Published: Piscataway, NJ IEEE 2025
Online Access:https://openaccess.thecvf.com/content/CVPR2025W/CVSPORTS/html/Khoiee_Multi-person_Physics-based_Pose_Estimation_for_Combat_Sports_CVPRW_2025_paper.html
Pages:5832-5841
Document types:article
Level:advanced