The effectiveness of satisfying the assumptions of predictive modelling techniques: An exercise in predicting the FIFA World Cup 2010
(German title: Die Effektivität der Erfüllung der Voraussagen der Predictive-Modelling-Technik: Eine Übung zur Vorhersage der FIFA Weltmeisterschaft 2010)
The assumptions of statistical procedures are enforced more rigorously in some disciplines than in others. Previous research into the accuracy of predictive modelling techniques has provided examples where the accuracy of models based on data that violate the relevant assumptions is greater than that of models where the assumptions were satisfied. The purpose of this investigation was to develop two sets of four models: one set based on untransformed data that violated the assumptions of the modelling techniques, and a second set where the data were transformed in order to satisfy those assumptions. Data from 153 pool matches and 54 knockout matches from international soccer tournaments from July 2006 to February 2010 were used to produce predictive models of match outcome (win, draw or lose) or goal difference with respect to the higher ranked team within each match according to the FIFA World rankings. The independent variables were the difference between the teams' FIFA World ranking points, the difference in distance from each team's capital city to the capital city of the host nation, and the difference in recovery days from the previous match within the tournament. The two sets of models were used to predict the 2010 FIFA World Cup; 119 human predictions and 20 weighted random predictions were also produced. An evaluation process marked the predictions against the actual outcomes of matches in the 2010 FIFA World Cup, out of a total possible score of 64 points. The mean accuracy of the models where the assumptions were satisfied was 33.50 points, which was similar to the 35.13 points for those where the assumptions were violated. The accuracy score of the eight model-based predictions was 34.31 ± 2.70, compared to 33.75 ± 3.86 for the human predictions and 35.55 ± 2.50 for the weighted random predictions. There was no significant difference in the accuracy of the three types of prediction (p = 0.116).
These results provide evidence that challenges the value of satisfying the assumptions of discriminant function analysis, binary logistic regression and multiple linear regression.
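The three independent variables named in the abstract can be illustrated with a minimal sketch. It assumes that "higher ranked" means the team with more FIFA World ranking points and that each difference is taken as higher ranked minus lower ranked; the abstract does not state the sign convention. The `Team` class, its field names, `match_predictors` and the sample values are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Team:
    # All field names are hypothetical, chosen for illustration only.
    name: str
    ranking_points: float  # FIFA World ranking points
    distance_km: float     # own capital city to the host nation's capital city
    recovery_days: int     # days since the previous match within the tournament


def match_predictors(a: Team, b: Team) -> dict:
    """Return the three differences, oriented to the higher ranked team.

    Assumption: "higher ranked" means more ranking points, and each
    difference is computed as higher ranked minus lower ranked.
    """
    higher, lower = (a, b) if a.ranking_points >= b.ranking_points else (b, a)
    return {
        "higher_ranked": higher.name,
        "points_diff": higher.ranking_points - lower.ranking_points,
        "distance_diff": higher.distance_km - lower.distance_km,
        "recovery_diff": higher.recovery_days - lower.recovery_days,
    }


# Made-up values, not data from the study.
team_x = Team("X", ranking_points=1200.0, distance_km=9000.0, recovery_days=4)
team_y = Team("Y", ranking_points=1500.0, distance_km=8000.0, recovery_days=5)
features = match_predictors(team_x, team_y)
```

A dictionary like `features` would be one row of input to a model of match outcome or goal difference; how the study transformed these variables to satisfy the modelling assumptions is not described in the abstract and is not sketched here.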
© Copyright 2010 International Journal of Computer Science in Sport. Sciendo. All rights reserved.
| Keywords: | |
|---|---|
| Notations: | Training science; Natural sciences and technology; Game sports |
| Published in: | International Journal of Computer Science in Sport |
| Language: | English |
| Published: | 2010 |
| Online access: | http://iacss.org/fileadmin/user_upload/IJCSS_Abstracts/Vol9_Ed3/IJCSS-Volume9_Edition3_Abstract_ODonoghue.pdf |
| Volume: | 9 |
| Issue: | 3 |
| Pages: | 15-27 |
| Document types: | Article |
| Level: | high |