Multi-person 3D pose estimation and tracking in sports

We present an approach to multi-person 3D pose estimation and tracking from multi-view video. Following independent 2D pose detection in each view, we: (1) correct errors in the output of the pose detector; (2) apply a fast greedy algorithm for associating 2D pose detections between camera views; and (3) use the associated poses to generate and track 3D skeletons. Previous methods for estimating the skeletons of multiple people either suffer from long processing times or rely on appearance cues, which reduces their applicability to sports. Our approach to associating poses between views seeks the best correspondences first in a greedy fashion, while reasoning about the cyclic nature of correspondences to constrain the search. From the associated poses we generate 3D skeletons via robust triangulation. Our method can track 3D skeletons in the presence of missing detections, substantial occlusions, and large calibration error. We believe ours is the first method for full-body 3D pose estimation and tracking of multiple players in highly dynamic sports scenes. The proposed method achieves a significant improvement in speed over state-of-the-art methods.
© Copyright 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE. All rights reserved.
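
The greedy, best-first association described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `greedy_associate`, the pairwise `costs` dictionary, and the `max_cost` threshold are all hypothetical, and the cycle reasoning is reduced to one simple rule, namely that a group of associated detections may contain at most one detection per camera view.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical detection id: (view index, detection index within that view).
Detection = Tuple[int, int]


def greedy_associate(
    costs: Dict[Tuple[Detection, Detection], float],
    max_cost: float,
) -> List[List[Detection]]:
    """Group 2D detections across views, cheapest pair first.

    Assumption for illustration only: a merge is rejected when the merged
    group would contain two detections from the same view, which is a
    simplified stand-in for reasoning about cyclic correspondences.
    """
    parent: Dict[Detection, Detection] = {}
    views: Dict[Detection, Set[int]] = {}  # group root -> views it covers

    def find(d: Detection) -> Detection:
        # Union-find lookup with path halving.
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    # Every detection starts in its own group.
    for a, b in costs:
        for d in (a, b):
            if d not in parent:
                parent[d] = d
                views[d] = {d[0]}

    # Visit candidate pairs from lowest to highest cost.
    for (a, b), cost in sorted(costs.items(), key=lambda kv: kv[1]):
        if cost > max_cost:
            break  # all remaining pairs are too expensive
        ra, rb = find(a), find(b)
        if ra == rb:
            continue  # already in the same group
        if views[ra] & views[rb]:
            continue  # merged group would hold two detections from one view
        parent[rb] = ra
        views[ra] |= views.pop(rb)

    groups: Dict[Detection, List[Detection]] = {}
    for d in parent:
        groups.setdefault(find(d), []).append(d)
    return sorted(sorted(g) for g in groups.values())
```

Each returned group lists the detections assumed to belong to one person; in a full pipeline such a group would then be passed to a triangulation step. The cost between two detections is left abstract here and could be, for example, a geometric distance between corresponding joints.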

Bibliographic Details
Subjects:
Notations: technical and natural sciences
Published in: IEEE/CVF Conference on Computer Vision and Pattern Recognition
Language: English
Published: 2019
Online Access: https://doi.org/10.1109/CVPRW.2019.00304
Pages: 2487-2496
Document types: article
Level: advanced