Performance

Walk-forward backtest over 1,748 completed series from 2024-01-13 through 2026-06-08. Each prediction is made before the ratings are updated with that series' result.

Brier score
0.2079

Lower is better; always predicting 50% (a coin flip) scores 0.250.
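
For readers who want to see where the 0.250 baseline comes from, here is a minimal sketch of the standard Brier score: the mean squared difference between the forecast probability and the 0/1 outcome. The function name and the sample data are illustrative assumptions, not the code behind the backtest.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Made-up outcomes: a constant 50% forecast is off by 0.5 every time,
# so each term is 0.25 and the mean is 0.25.
coin_flip = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 1])
print(coin_flip)  # 0.25
```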

Log loss
0.6025

Lower is better; penalizes overconfident misses. Always predicting 50% scores about 0.693 (ln 2).
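
A short sketch of the usual binary log loss (average negative log-likelihood) shows how sharply a confident miss is punished compared with a mild one. The function name, the clipping constant `eps`, and the sample forecasts are assumptions made for illustration; the page does not say how its figure is computed.

```python
import math


def log_loss(probs, outcomes, eps=1e-15):
    """Average negative log-likelihood of 0/1 outcomes under the forecasts."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        # Clip so that a forecast of exactly 0 or 1 does not produce log(0).
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(probs)


confident_miss = log_loss([0.95], [0])  # -ln(0.05), about 3.00
mild_miss = log_loss([0.60], [0])       # -ln(0.40), about 0.92
coin_flip = log_loss([0.5], [1])        # ln 2, about 0.693
```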

Reliability

Predicted bin    N   Avg pred   Win rate
10-20%          11      16.3%      18.2%
20-30%          58      26.5%      27.6%
30-40%         122      35.6%      35.2%
40-50%         320      45.7%      52.8%
50-60%         363      54.8%      62.3%
60-70%         356      65.2%      71.6%
70-80%         307      75.0%      74.6%
80-90%         176      84.5%      84.1%
90-100%         35      92.5%      94.3%
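
A reliability table like the one above can be built by grouping predictions into fixed-width probability bins and comparing each bin's average prediction with its observed win rate. The sketch below assumes ten equal-width bins and uses made-up sample data; `reliability_table` is a hypothetical helper, not the routine that produced the figures on this page.

```python
from collections import defaultdict


def reliability_table(probs, outcomes, n_bins=10):
    """Return (bin_low, bin_high, n, avg_pred, win_rate) for each non-empty bin."""
    bins = defaultdict(list)
    for p, y in zip(probs, outcomes):
        # A forecast of exactly 1.0 is placed in the top bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    rows = []
    for idx in sorted(bins):
        pairs = bins[idx]
        n = len(pairs)
        avg_pred = sum(p for p, _ in pairs) / n
        win_rate = sum(y for _, y in pairs) / n
        rows.append((idx / n_bins, (idx + 1) / n_bins, n, avg_pred, win_rate))
    return rows


# Made-up forecasts and results, only to exercise the function.
sample_probs = [0.15, 0.18, 0.55, 0.52, 0.58, 1.0]
sample_outcomes = [0, 1, 1, 0, 1, 1]
rows = reliability_table(sample_probs, sample_outcomes)
for low, high, n, avg_pred, win_rate in rows:
    print(f"{low:.0%}-{high:.0%}  N={n}  avg pred={avg_pred:.1%}  win rate={win_rate:.1%}")
```

A well-calibrated model has win rates close to the average prediction in every bin; bins with small N are noisy and should be read with caution.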