What is Elo Rating?

A continuously updated rating system where each team's score moves based on results against opponents of known strength.

TL;DR

Elo ratings assign each team a number that updates after every game based on the result and opponent rating. The system was invented for chess and adapted to almost every sport.

Full explanation

Arpad Elo's chess rating system, adopted by the US Chess Federation in 1960 and by FIDE in 1970, was the first widely used continuously updating rating method. Each player carries a single number, their Elo, that represents their estimated strength. After every game the number updates based on the result and the gap between the player's rating and the opponent's. Beat a much stronger opponent and the rating jumps; lose to a much weaker one and it drops sharply. Beat an equal opponent and the rating ticks up modestly.

Two adaptations make the system work across sports. First, the update is sized by an empirically tuned constant called K: a higher K means ratings move faster, which suits sports with shorter seasons. Second, the system can incorporate margin of victory, home-field advantage, and other adjustments. Nate Silver's FiveThirtyEight popularized sport-specific Elos for the NFL, NBA, MLB, and several other leagues using this framework.
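As a rough illustration of how those adjustments can be layered onto the basic update, here is a minimal Python sketch. The 65-point home bonus, the K of 20, and the logarithmic margin multiplier are hypothetical values chosen for the example; they are not taken from any published model, and a real system would tune each of them per sport.

```python
import math


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_with_adjustments(home: float, away: float, home_margin: float,
                            k: float = 20.0, home_advantage: float = 65.0):
    """Return (new_home, new_away) after one game.

    home_margin is home points minus away points. The home_advantage bonus
    and the margin multiplier below are illustrative assumptions.
    """
    # The home side is treated as if it were rated home_advantage points higher.
    expected_home = expected_score(home + home_advantage, away)

    if home_margin > 0:
        score_home = 1.0
    elif home_margin < 0:
        score_home = 0.0
    else:
        score_home = 0.5

    # Hypothetical margin-of-victory multiplier: grows slowly with the margin
    # and equals 1.0 for a draw.
    mov_multiplier = 1.0 + math.log(1.0 + abs(home_margin))

    delta = k * mov_multiplier * (score_home - expected_home)
    # Whatever the home side gains, the away side loses.
    return home + delta, away - delta


close_win = update_with_adjustments(1500.0, 1500.0, home_margin=3)
blowout = update_with_adjustments(1500.0, 1500.0, home_margin=21)
```

With equal ratings, the blowout moves both teams further than the close win, and in both cases the two ratings still sum to 3000.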

The strength of Elo is its self-correcting nature. A team that overperforms its rating early in the season gradually pulls the rating up; a team that underperforms gradually pulls it down. Ratings stabilize over enough games to a number that reflects long-run team strength. Predicting a future game is straightforward: the win probability is a logistic function of the Elo difference.
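The self-correcting behavior can be seen in a small Python sketch. To keep it deterministic, it makes a simplifying assumption: instead of feeding in individual wins and losses, each game contributes the team's long-run average result (a hypothetical 60% win rate against 1500-rated opponents). The function name `settle` and all parameter values are invented for the example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def settle(true_win_rate: float, opponent: float = 1500.0,
           start: float = 1500.0, k: float = 20.0, games: int = 500) -> float:
    """Apply the average Elo update repeatedly and return the final rating.

    Assumption: the per-game result is replaced by its long-run mean, so the
    rating drifts smoothly instead of bouncing around with each result.
    """
    rating = start
    for _ in range(games):
        rating += k * (true_win_rate - expected_score(rating, opponent))
    return rating


settled = settle(0.60)
implied = expected_score(settled, 1500.0)
```

The rating stops moving once the win probability it implies matches how often the team actually wins, which here is about 70 points above the opposition.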

Elo has limitations. It is one-dimensional — it doesn't separately rate offense and defense, or surfaces, or styles. It assumes a roughly fixed talent pool, which fits chess better than it fits sports with major roster churn. And it ignores injuries unless explicitly extended to do so. For most public modeling, however, an Elo system seeded carefully and tuned for the sport is a respectable baseline that's hard to beat without serious work.

Formula

Expected score: E_A = 1 / (1 + 10^((R_B − R_A)/400)). After the match: R'_A = R_A + K × (S_A − E_A), where S_A is the actual result (1 = win, 0.5 = draw, 0 = loss) and K is a tuning constant.
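The two formulas translate almost directly into code. The Python sketch below is a minimal illustration; the function names and the default K of 20 are assumptions made for the example, not values from our model.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """E_A = 1 / (1 + 10^((R_B - R_A) / 400))."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating_a: float, rating_b: float, score_a: float,
                  k: float = 20.0) -> float:
    """R'_A = R_A + K * (S_A - E_A), with S_A in {1, 0.5, 0}."""
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))


# A 1500-rated side beats a 1600-rated side.
new_a = update_rating(1500.0, 1600.0, score_a=1.0)
new_b = update_rating(1600.0, 1500.0, score_a=0.0)
```

Here the underdog gains roughly 12.8 points and the favorite loses the same amount, because the two expected scores always sum to 1.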

Why it matters in our model

Our MLB, NBA, and NHL team ratings start from Elo systems seeded from external priors and updated daily. Elo gives us a sport-agnostic backbone that pre-game model components can sit on top of.

Frequently asked

What does an Elo difference of 100 mean?

Roughly a 64% win probability for the higher-rated side. A 200-point gap implies about 76%; a 400-point gap implies about 91%.
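These figures follow from the expected-score formula. A short Python sketch (function names are illustrative) reproduces them and also inverts the formula to find the gap that corresponds to a given win probability:

```python
import math


def win_probability(elo_gap: float) -> float:
    """Win probability for the side rated elo_gap points higher."""
    return 1.0 / (1.0 + 10 ** (-elo_gap / 400.0))


def gap_for_probability(p: float) -> float:
    """Inverse of win_probability: the Elo gap implied by probability p."""
    return 400.0 * math.log10(p / (1.0 - p))


for gap in (100, 200, 400):
    print(f"{gap}-point gap: {win_probability(gap):.1%}")
# prints 64.0%, 76.0% and 90.9% for the three gaps
```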

How is K chosen?

Empirically, by backtesting predictive accuracy across past seasons. Sports where a single result carries more noise (baseball, soccer) need a lower K than sports where a single result is more informative (basketball).
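A backtest of this kind can be sketched in Python as a grid search: replay past games in order for each candidate K, score every pre-game forecast, and keep the K with the lowest error. Everything below is an assumption made for illustration: the games are synthetic (generated from made-up team strengths with a fixed random seed), the error metric is the Brier score, and the candidate list is arbitrary. A real backtest would use actual results and evaluate on seasons held out from tuning.

```python
import random


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def brier_for_k(games, k: float, start: float = 1500.0) -> float:
    """Replay games in order; score each forecast before updating on it."""
    ratings = {}
    total = 0.0
    for team_a, team_b, score_a in games:
        r_a = ratings.get(team_a, start)
        r_b = ratings.get(team_b, start)
        forecast = expected_score(r_a, r_b)
        total += (forecast - score_a) ** 2
        delta = k * (score_a - forecast)
        ratings[team_a] = r_a + delta
        ratings[team_b] = r_b - delta
    return total / len(games)


# Synthetic schedule: hypothetical strengths, not real teams or results.
rng = random.Random(0)
true_strength = {"A": 1600.0, "B": 1500.0, "C": 1450.0, "D": 1400.0}
teams = list(true_strength)
games = []
for _ in range(2000):
    a, b = rng.sample(teams, 2)
    p_a = expected_score(true_strength[a], true_strength[b])
    games.append((a, b, 1.0 if rng.random() < p_a else 0.0))

candidates = [2, 4, 8, 16, 32, 64]
scores = {k: brier_for_k(games, k) for k in candidates}
best_k = min(scores, key=scores.get)
```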

Does Elo work for individual sports?

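Yes. It was invented for chess, and Elo-style ratings are widely used in tennis analytics and in esports matchmaking. Note that the official ATP and WTA rankings and the Official World Golf Ranking are points-based rather than Elo-based.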
