Beyond playing positions: categorizing soccer players based on match-specific running performance using machine learning

Soccer players are frequently categorized by playing positions, both in the scientific literature and in practice. However, the utility of this approach in evaluating physical match performance and optimizing physical training programs remains unclear. This study compares the effectiveness of categorizing soccer players by their playing position versus using unsupervised machine learning based on match-specific running performance. Match-specific running data were collected from 40 young elite male soccer players over two seasons. Thirty-one of these players completed a 20-meter sprint test and a maximal incremental treadmill test to measure maximal oxygen uptake. Players were categorized both by playing position and by subgroups derived through k-means clustering based on match-specific running performance. Differences in sprint capacity, endurance capacity, and match-specific running performance were compared between and within playing positions, as well as between and within clusters. The two categorization methods were further compared for variance within subgroups and standardized differences between subgroups for total distance (TD), low-intensity running (LIR), moderate-intensity running (MIR), high-intensity running (HIR), and sprint distance during matches. Match-specific running performance differed between playing positions, despite notable inter-individual differences in running intensities within playing positions. Clustering based on match-specific running performance revealed less variance within groups (TD: P = 0.049, LIR: P = 0.032, HIR: P = 0.033) and larger standardized differences between groups (LIR: P = 0.037, MIR: P = 0.041, HIR: P = 0.035, Sprint: P = 0.018) compared to grouping by playing position. Moreover, 20-meter sprint speed differed between the sprint and high intensity endurance clusters (25.22 vs 23.75 km/h, P = 0.012), but not between playing positions. 
Using unsupervised machine learning to categorize soccer players improves the identification of player groups with similar match-specific running performance, thereby supporting performance evaluation and contributing to the optimization of physical training.

Key Points
- There is considerable interindividual variation in match-specific running performance within positional groups.
- Studying the physical capacities, designing training programs or evaluating match-specific running performance of soccer players based on their playing positions is suboptimal.
- Particularly grouping by forwards, midfielders and defenders should be avoided when evaluating match-specific running performance.
- Identifying subgroups based on match-specific running performance using clustering analysis seems a promising alternative for categorizing soccer players.
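
The comparison described in the abstract (k-means clusters versus playing positions, judged by within-group variance) can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the abstract does not state the feature set, the preprocessing, or the number of clusters, so the z-score standardization, k = 3, the pooled-variance measure, the helper names, and all synthetic numbers below are assumptions made for the example.

```python
import random
import statistics

FEATURES = ("lir", "mir", "hir", "sprint")  # metres per match (hypothetical)


def zscore_columns(rows):
    """Standardize each column to mean 0, SD 1 so no single metric dominates."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, seed=0, max_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(r, centroids[j])) for r in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [statistics.fmean(c) for c in zip(*members)]
    return labels


def pooled_within_variance(values, groups):
    """Within-group sum of squares divided by (n - number of groups)."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    ss = sum(
        sum((v - statistics.fmean(vs)) ** 2 for v in vs)
        for vs in by_group.values()
    )
    return ss / (len(values) - len(by_group))


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical running profiles (mean metres per match), NOT study data.
    profiles = {
        "endurance": (4200, 2100, 700, 120),
        "sprint": (3900, 1600, 800, 300),
        "low": (4000, 1500, 500, 150),
    }
    positions = ["defender", "midfielder", "forward"]
    players, position_labels = [], []
    for i in range(30):
        centre = profiles[rng.choice(list(profiles))]
        players.append([rng.gauss(m, 0.08 * m) for m in centre])
        position_labels.append(positions[i % 3])  # unrelated to profile here

    clusters = kmeans(zscore_columns(players), k=3)
    # Print per-metric pooled variance: by position, then by cluster.
    for idx, name in enumerate(FEATURES):
        column = [p[idx] for p in players]
        print(
            name,
            round(pooled_within_variance(column, position_labels)),
            round(pooled_within_variance(column, clusters)),
        )
```

A lower pooled within-group variance for the cluster labels than for the position labels would correspond to the pattern the abstract reports; the study's own statistical comparison (the reported P values) is not reproduced here.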
© Copyright 2025 Journal of Sports Science & Medicine. Department of Sports Medicine - Medical Faculty of Uludag University. All rights reserved.

Bibliographic Details
Subjects:
Notations:technical and natural sciences; sport games; biological and medical sciences
Tagging:machine learning
Published in:Journal of Sports Science & Medicine
Language:English
Published: 2025
Online Access:https://doi.org/10.52082/jssm.2025.565
Volume:24
Issue:3
Pages:565-577
Document types:article
Level:advanced