Post by chileduck on Mar 9, 2020 10:57:57 GMT -8
I've made an attempt at measuring the quality of a form chart projection by using a statistic that measures the similarity between two rank order lists. The statistic is the Kendall Tau; it is like the Spearman rho correlation but is better for small n's and makes fewer assumptions about the underlying distributions. The statistic ranges from 1.00 (complete agreement) to -1.00 (complete disagreement), so the higher the positive Tau score, the closer the agreement.
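(For anyone who wants to try this themselves, here's a minimal Python sketch of the comparison using scipy's kendalltau function. The team names below are made-up placeholders, not the actual 2019 data, so the printed numbers are just for illustration.)

# A minimal sketch of the Kendall Tau comparison using scipy.
# NOTE: the team lists are hypothetical placeholders, not the actual
# 2019 projections or results.
from scipy.stats import kendalltau

projected = ["Team A", "Team B", "Team C", "Team D", "Team E"]
actual    = ["Team B", "Team A", "Team C", "Team E", "Team D"]

# Express the actual finish as ranks within the projected order, so
# both lists rank the same set of items.
proj_ranks  = [projected.index(team) for team in actual]
final_ranks = list(range(len(actual)))

tau, p_value = kendalltau(proj_ranks, final_ranks)
print(f"tau = {tau:.2f}, p = {p_value:.3f}")  # p < .05 is the usual cutoff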
The following table shows, for five methods, the Tau scores comparing the projected rank order of the top ten teams with the actual final rank order after the completion of the 2019 NCAA outdoor championships.
The yellow cells indicate measures that show a significant correlation between projection and final. That is, yellow indicates there is a good chance that the relationship between the predictions and final outcomes is, at the least, not random. But the cutoff probability for this level is .05, which is rather arbitrary and may not be that informative (or significant?). It is interesting that not all measures even reached this criterion, and that T&F News was the only one that reached it for both men and women.
Probably, if anything, these measures could be used to compare the success of different prognosticators. For example, we could say FloTrack did better than T&F News for the men but did miserably for the women. I would be a little cautious about doing so until I see more examples. I was able to compute these comparisons fairly quickly because I've recently added the Kendall Tau calculation to the form chart, and you will now be able to see this measure in the bottom right of the top panel showing the top team scores. Below are some examples. I'd like to see how this measure looks for some previous years, but that will take me quite a bit longer. It's at least something to think about now.
Below I show the final results in the form chart tracker so you can compare Track and Field News with FloTrack. You can also see how I've decided to show the original projections compared to the final scores.
First the men... notice the Tau score in the bottom right.
The up and down arrows with numbers signify the shifts in positions from the original rank order (with a blank meaning the position didn't change).
So it looks like FloTrack did a little better because it got the first two teams (and two others) in the correct order, and Houston and Georgia were only off by one.
T&F News only got the top team correct; three others were only one position out, but almost all the rest were off by at least four positions.
Now Women:
Here the differences in the predictions don't seem to match the differences in the Tau scores. FloTrack doesn't seem so bad in that they got the first two (Arkansas and USC) in the correct positions and were correct for two others (Colorado and Stanford), but were off for LSU. I can't really see that the Tau score is very informative in this case, so I'm looking into implementing other rank-list comparison metrics (e.g. Rank-Biased Overlap) that weight agreement at the top of the list more heavily instead of treating every position equally.
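(Here's a rough Python sketch of how a truncated Rank-Biased Overlap could be computed. This is my own simplified reading of the metric, without the extrapolation term from the original paper, and the lists are placeholders rather than the real team data.)

def rbo(list1, list2, p=0.9):
    # Truncated Rank-Biased Overlap: at each depth d, measure how much
    # the top-d prefixes of the two lists overlap, then combine the
    # depths with geometrically decaying weights (parameter p controls
    # how heavily the top of the list counts).
    k = min(len(list1), len(list2))
    seen1, seen2 = set(), set()
    total = 0.0
    for d in range(1, k + 1):
        seen1.add(list1[d - 1])
        seen2.add(list2[d - 1])
        agreement = len(seen1 & seen2) / d
        total += (p ** (d - 1)) * agreement
    return (1 - p) * total

# Example with placeholder names (not the actual team lists):
print(rbo(["A", "B", "C", "D"], ["B", "A", "C", "E"]))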
Stay tuned and I welcome comments.