## On comparing historical scores.

pier4r
Skynet Posts: 3390

### On comparing historical scores.

This may trigger a bit of drama, but hey, a bit of forum activity!

Disclaimer: there may be some typos, I did not proofread enough.

So on Telegram I had a bit of a discussion with nullpointer about comparing historical scores (in his example he questioned my calling the TcT score outstanding when, in his view, he had an even better historical score).

First and foremost: one can compare scores in whatever way one likes. One could say "hmm, if I take my score and the fourth root of my opponent's score, my score is obviously better". Say 1200 vs 4th_root(2000): 1200 is obviously higher (the 4th root of 2000 is about 6.7, or 7 rounded).

The point is that, for me, the comparison has to make sense, and the arguments put forward by nullpointer are confused at best, to use the least harsh adjective I can think of. (Make sense according to what? Eh, good question. Why does 1200 vs 4th_root(opponent_score) seem meaningless? I do not have an answer aside from "intuition".)

Second: when comparing historical scores one only has approximations. To really compare two players, one should compare them at the same time, in direct fights or, without direct fights, against the same opponents with the same abilities and tactics. For players from different periods this is obviously not possible.

--
So first one needs to be able to combine the effects of multiple formulas in the score model, since the model does not use a single formula but a set of them.

So we have the formula for the expected result of player1 (1), which is
f_1: 1 / [ 1 + 10^{(opponent_score - player1_score)/400}]

Notice that everything in the formula is fixed except for the subtraction (opponent_score - player1_score). Moreover, the result always lies between 0 and 1 (one can try plugging in numbers).

Then this expected result is used in the computation of the score change or delta for the player
f_2: round( k_factor * (actual_result_player1 - expected_result_player1))

Where k_factor changes in different leagues but otherwise is constant and actual_result_player1 is 1 for victory, 0.5 for draw and 0 for defeat.
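A minimal Python sketch of f_1 and f_2 as written above. The function names are mine, and k_factor = 20 is only an example value; the halving for draws and the matchmaking are not modelled here.

```python
def expected_result(player_score: float, opponent_score: float) -> float:
    """f_1: expected result for the player, a value between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((opponent_score - player_score) / 400.0))


def score_delta(k_factor: float, actual_result: float, expected: float) -> int:
    """f_2: rounded score change; actual_result is 1 (win), 0.5 (draw) or 0 (defeat)."""
    return round(k_factor * (actual_result - expected))


# Two players with equal scores: the expected result is 0.5,
# so a win with k_factor = 20 is worth 10 points.
expected = expected_result(1500, 1500)
print(expected)                        # 0.5
print(score_delta(20, 1.0, expected))  # 10
```

Only the difference between the two scores enters expected_result, which is the point the rest of the post relies on.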

Now, if k_factor is fixed, what can change in f_2 is actual_result_player1 and expected_result_player1.

Then there are the matchmaking and some fixes to take into account. For example, the delta from f_2 is halved in case of a draw. Moreover, with both the past matchmaking (before alpha 7.8) and the current one, top players mostly played within the top league (2). So we can consider the matchmaking as not adding complexity when two top players are both in the top league.

That said, let's go to the argument.

nullpointer's argument is: I measure the percentage difference between the scores of active (that is, not hidden) players, namely the top player (or a given position) against the second player (or another given position).

So, say we have player1 and player2, and player1_score > player2_score we can measure the percentage in at least two ways:
f_3: [(player1_score/player2_score) -1] * 100
or
f_4: [1 - (player2_score/player1_score)] * 100

First possible source of confusion if people do not use high school math enough: those two formulas do not always produce the same results.

Example:
f_3: [(1800/1600) -1] * 100 = 12.5
f_4: [1 - (1600/1800)] * 100 = 11.1

Second possible source of confusion if people do not use high school math enough: given the same difference, in subtraction terms, the smaller the numbers the higher the percentage.

Example:
f_3: [(800/600) -1] * 100 = 33.3
f_3: [(1800/1600) -1] * 100 = 12.5
f_3: [(10800/10600) -1] * 100 = 1.89
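A small Python sketch to check both sources of confusion. The function names are hypothetical; they just implement f_3 and f_4 from above.

```python
def pct_over_lower(higher: float, lower: float) -> float:
    """f_3: how much the higher score exceeds the lower one, in percent."""
    return (higher / lower - 1) * 100


def pct_under_higher(higher: float, lower: float) -> float:
    """f_4: how much the lower score falls short of the higher one, in percent."""
    return (1 - lower / higher) * 100


# Same gap of 200 points at three different score levels.
for higher, lower in [(800, 600), (1800, 1600), (10800, 10600)]:
    f3 = pct_over_lower(higher, lower)
    f4 = pct_under_higher(higher, lower)
    print(f"{higher} vs {lower}: f_3 = {f3:.2f}%, f_4 = {f4:.2f}%")
```

For the same pair of scores f_3 is always larger than f_4, and both shrink as the scores grow while the gap stays at 200.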

Therefore the same gap looks very different in percentage terms depending on how large the scores are, and this distorts the view of historical scores.

And why is using the percentage meaningless for me? Because one has to see how scores can evolve due to f_1 and f_2

For example, let's take k_factor = 20 and assume that player1 is the top player, player2 is the second top player, and player1 wins. Then let's use some scores with the same percentage difference according to f_3 (20% in every case).

Examples.
player1: 1680 , player2: 1400
f_1: 1 / [ 1 + 10^{(1400 - 1680)/400}] = 0.8337
f_2: round( 20 * (1 - 0.8337)) = 3
player1: 1920 , player2: 1600
f_1: 1 / [ 1 + 10^{(1600 - 1920)/400}] = 0.8632
f_2: round( 20 * (1 - 0.8632)) = 3
player1: 2880 , player2: 2400
f_1: 1 / [ 1 + 10^{(2400 - 2880)/400}] = 0.9406
f_2: round( 20 * (1 - 0.9406)) = 1
player1: 12000, player2: 10000
f_1: 1 / [ 1 + 10^{(10000 - 12000)/400}] = 0.99999
f_2: round( 20 * (1 - 0.99999)) = 0

So even with the same percentage difference, the top player finds it increasingly difficult to increase his score. A player with 1680 against a second player with 1400 needs fewer wins to accumulate more points than in the situation where one player has 2880 points and the other 2400. Accumulating more points means a bigger percentage difference. If one considers that players in the top leagues mostly play against different opponents, whose scores are normally lower than those of the top two, the possibility of gaining points is even lower for players with a big score difference (in subtraction terms). If one then has thousands of points of difference, but the same percentage difference, a win yields no points at all (the rounded delta is 0).
So for me the comparison using percentages is quite weak, or good enough for /r/badmathematics .
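To recheck the numbers, here is a rough Python sketch. It assumes k_factor = 20 as in the examples and a win for player1; win_delta is a hypothetical helper name that combines f_1 and f_2.

```python
def expected_result(player_score: float, opponent_score: float) -> float:
    """f_1: expected result for the player."""
    return 1.0 / (1.0 + 10 ** ((opponent_score - player_score) / 400.0))


def win_delta(player_score: float, opponent_score: float, k_factor: float = 20) -> int:
    """f_2 for a win (actual result = 1); k_factor = 20 is just the example value."""
    return round(k_factor * (1 - expected_result(player_score, opponent_score)))


# Every pair has the same 20% gap according to f_3.
pairs = [(1680, 1400), (1920, 1600), (2880, 2400), (12000, 10000)]
for p1, p2 in pairs:
    pct = (p1 / p2 - 1) * 100
    print(f"{p1} vs {p2}: {pct:.1f}% gap, a win gives +{win_delta(p1, p2)}")
```

The percentage gap is identical in all four cases, yet the points gained for a win drop from 3 to 0.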

Instead, as the example above shows, what is interesting is what result the score formula expects, that is, f_1. Since the only part of f_1 that can change is the difference in points between the two players, the comparison is easy: the bigger the gap between two players active at the same time, the better the score of the player with the higher score.

Let's see the comparison of scores using the score difference and not the percentage difference.
Examples.
player1: 1600, player2: 1400
f_1: 1 / [ 1 + 10^{(1400 - 1600)/400}] = 0.7597
f_2: round( 20 * (1 - 0.7597)) = 5
player1: 1800 , player2: 1600
f_1: 1 / [ 1 + 10^{(1600 - 1800)/400}] = 0.7597
f_2: round( 20 * (1 - 0.7597)) = 5
player1: 2600 , player2: 2400
f_1: 1 / [ 1 + 10^{(2400 - 2600)/400}] = 0.7597
f_2: round( 20 * (1 - 0.7597)) = 5
player1: 10200, player2: 10000
f_1: 1 / [ 1 + 10^{(10000 - 10200)/400}] = 0.7597
f_2: round( 20 * (1 - 0.7597)) = 5

Whatever the scores, the same difference ensures that if player1 wins over player2, the same amount of points is collected, so it is the same "score distance" according to the score formula.
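A quick Python sketch of the same idea, again assuming k_factor = 20. The helper win_delta_for_gap is a name I made up; it takes only the gap, because that is all f_1 depends on.

```python
def win_delta_for_gap(gap: float, k_factor: float = 20) -> int:
    """Points the higher-rated player gets for a win, given only the score gap."""
    expected = 1.0 / (1.0 + 10 ** (-gap / 400.0))  # f_1 with opponent - player = -gap
    return round(k_factor * (1 - expected))        # f_2 with actual result = 1


# The base score does not matter, only the gap does.
bases = [1400, 1600, 2400, 10000]
deltas = [win_delta_for_gap((base + 200) - base) for base in bases]
print(deltas)

# Points for a win as the gap grows.
for gap in (0, 100, 200, 400, 800):
    print(gap, win_delta_for_gap(gap))
```

So the gap in points works as a "distance" that reads the same at every score level, while the percentage does not.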

In particular, it is even better, in my understanding, if one does not rely only on single data points (for example a top score reached and immediately lost). Indeed I developed something like this to see how the strength of a player evolved over his games compared to the active playerbase that challenged him: viewtopic.php?f=5&t=9&start=100#p5171 . In that analysis TcT is outstanding, followed by mcompany, nullpointer and others.

(1) viewtopic.php?f=7&t=126&p=1253#p1253
(2) well, this was not true for some days in alpha 7.8, but now it is fixed

Kanishka
Skynet Posts: 1421

### Re: On comparing historical scores.

First thing that I notice: A loooooong forum post. Then I see logic and formulæ.
Fixes break an AI more than bugs do. 