ELO computation changes

GFX47
Dev
Posts: 2891

ELO computation changes

Post#1 » 14 Nov 2016, 16:59

I applied the K factor changes we talked about with @pier4r on the old forum:
- League 1: K factor = 40
- League 2: K factor = 35
- League 3: K factor = 30
- League 4: K factor = 25

Which means the higher your league, the more stable your score should become.
PS: for more details on the ELO rating system > https://en.wikipedia.org/wiki/Elo_rating_system
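To illustrate why a lower K factor should make a score more stable, here is a minimal Python sketch. The league-to-K mapping comes from the list above; the function name, the unrounded output and the 0.5 expected result (two equally rated players) are assumptions made for the example, not the game's actual code.

```python
# K factor per league, as listed in the post above.
K_FACTOR_BY_LEAGUE = {1: 40, 2: 35, 3: 30, 4: 25}


def unrounded_delta(league, result, expected):
    """Points gained (or lost) before rounding: K * (result - expected)."""
    return K_FACTOR_BY_LEAGUE[league] * (result - expected)


# Same situation in every league: a win (result 1) against an equally
# rated opponent (expected result 0.5, an assumption for the example).
for league in sorted(K_FACTOR_BY_LEAGUE):
    print(f"League {league}: {unrounded_delta(league, 1, 0.5):+.1f}")
```

The same win moves the score less and less as the league number goes up, which is the stabilising effect described above.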

GFX47
Dev
Posts: 2891

Re: ELO computation changes

Post#2 » 14 Nov 2016, 17:08

I also reduced by half the number of points exchanged in case of a draw.
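One possible way to read "reduced by half" is sketched below in Python: the usual Elo delta is computed and then multiplied by 0.5 when the match is a draw. This is an assumption about how the halving is applied; the names `score_delta` and `draw_factor` are hypothetical, and Python's `round` may round ties differently from the game.

```python
def expected_result(score_1, score_2):
    """Standard Elo expectation for player 1 (between 0 and 1)."""
    return 1 / (1 + 10 ** ((score_2 - score_1) / 400))


def score_delta(score_1, score_2, result, k_factor, draw_factor=0.5):
    """Player 1's score change; draws exchange only a fraction of the points."""
    delta = k_factor * (result - expected_result(score_1, score_2))
    if result == 0.5:
        # Assumed interpretation: halve the points exchanged on a draw.
        delta *= draw_factor
    return round(delta)


# A 1700 player drawing against a 1500 player, with K = 40.
print(score_delta(1700, 1500, 0.5, 40))                   # halved draw
print(score_delta(1700, 1500, 0.5, 40, draw_factor=1.0))  # plain Elo draw
```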

pier4r
Skynet
Posts: 3378

Re: ELO computation changes

Post#3 » 14 Nov 2016, 18:00

Great, let's see how it works :)
http://www.reddit.com/r/Gladiabots/wiki/players/pier4r_nvidia_shield_k1 -> Gladiabots CHAT, stats, insights and more ;

GFX47
Dev
Posts: 2891

Re: ELO computation changes

Post#4 » 28 Nov 2016, 17:58

Any feedback on THIS particular change?

NullPointer
Autonomous Entity
Posts: 539

Re: ELO computation changes

Post#5 » 28 Nov 2016, 18:27

This is how the average results work for me at 1650+:

win = +3
draw = -3
loss = -22

and in my opinion this is fair. I have to win about 7 matches to compensate for 1 loss, and a draw has to be compensated with 1 win.

I didn't much like it when 1 draw would take away 2 wins, given how easy it was to draw.
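As a rough cross-check of the numbers above (a sketch, not something from the game's code): in the plain Elo formula a win gives K * (1 - expected) and a loss gives -K * expected, so the two magnitudes add up to K. The helper names below are hypothetical, and since the reported values are rounded averages the output is only approximate.

```python
import math


def implied_k_and_expected(win_delta, loss_delta):
    """Recover K and the expected result from a typical win and loss delta."""
    k_factor = win_delta - loss_delta   # K*(1 - E) + K*E = K
    expected = -loss_delta / k_factor   # loss delta is -K*E
    return k_factor, expected


def implied_gap(expected, scale=400):
    """Rating gap that produces this expected result in the Elo formula."""
    return scale * math.log10(expected / (1 - expected))


k_factor, expected = implied_k_and_expected(win_delta=3, loss_delta=-22)
print(k_factor, expected, round(implied_gap(expected)))
```

With +3 and -22 this gives K = 25, which matches the league 4 value, and an expected result of 0.88, meaning the typical opponent is a few hundred points lower.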

Ritter Runkel
Neural Network
Posts: 498

Re: ELO computation changes

Post#6 » 28 Nov 2016, 20:17

I'm going up and down in the TOP 10, and for me it feels fair too.

When I slide down due to a buggy change in my tactics, I can go up pretty fast once the bugs are removed, until I'm in the TOP 10 again. After that it is pretty hard, so it feels like there is a saturation point for my current AIs. I think that is what an ELO system should be about.

pier4r
Skynet
Posts: 3378

Re: ELO computation changes

Post#7 » 28 Nov 2016, 22:34

Yes, the elo (with small variations, for example halving the points exchanged in case of draws) is meant to say: "look, you are stronger than this guy; if you lose or draw, whether by his merit or your demerit, your scores should move closer together". If instead one beats a player with a much lower elo, that is "expected" and few points change hands.

In short, the elo value tries to predict the next result: if one player has 1600+ and another has 1250, the system predicts that the one with 1600+ will win easily.

Anyone who has played chess for a while will see this immediately.

Of course GFX can change a lot of little details: the K factor (which can be asymmetric, for example), the points exchanged, and the range the elo considers (the "hundreds" in the division inside the exponent).

For example given the formula shared by GFX

Code: Select all

The exact formula is:
player 1 expected result = 1 / (1 + 10^( (player 2 score - player 1 score) / 400) )
player 1 result = 1 for victory, 0 for defeat, 0.5 for draw
player 1 score delta = round( K_factor * (player 1 result - player 1 expected result))
player 2 score delta = - player 1 score delta
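
The formula can be sketched in Python as follows, using the usual Elo convention in which the higher-rated player gets the higher expected result. The function names are illustrative, and Python's `round` is assumed to be close enough to whatever rounding the game uses. The 1600 vs 1250 scores are the example from earlier in this post, with K = 25 picked as the league 4 value.

```python
def expected_result(score_1, score_2):
    """Player 1's expected result, between 0 and 1."""
    return 1 / (1 + 10 ** ((score_2 - score_1) / 400))


def score_deltas(score_1, score_2, result_1, k_factor):
    """Return (player 1 delta, player 2 delta); result_1 is 1, 0 or 0.5."""
    delta_1 = round(k_factor * (result_1 - expected_result(score_1, score_2)))
    return delta_1, -delta_1


print(score_deltas(1600, 1250, 1, 25))  # favourite wins: small gain
print(score_deltas(1600, 1250, 0, 25))  # favourite loses: large loss
```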


The 400 is the range considered by the formula. As the wiki reports, tweaking that number quickly changes the expected results. For example, if one player is 400 points above the other, the expected result says "hey, that's obviously a win". To make a 400-point gap less decisive (a wider range), change the 400 to 500 or more; to make it more decisive (a narrower range), change it to 300 or less.
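
A small Python sketch of that effect, with the 400 turned into a parameter (called `scale` here, a name chosen only for this example): the same 400-point gap is evaluated with 300, 400 and 500.

```python
def expected_result(score_1, score_2, scale=400):
    """Player 1's expected result; `scale` replaces the fixed 400."""
    return 1 / (1 + 10 ** ((score_2 - score_1) / scale))


# The favourite is 400 points above the opponent in every case.
for scale in (300, 400, 500):
    print(scale, round(expected_result(1600, 1200, scale), 3))
```

A smaller scale pushes the favourite's expected result closer to 1, a larger one pulls it back towards 0.5.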

But since tweaks are needed only to find balance, I think it is OK as it is. I am now interested in the average elo over time/games, which is quite a good indicator of strength in my opinion.

Christian
Algorithm
Posts: 56

Re: ELO computation changes

Post#8 » 29 Nov 2016, 10:09

ELO works well in my opinion.

The main problem is that the elo does not go well with the rock-paper-scissors gameplay right now. ELO should be rather stable at some point. Instead, for most players it is going up and down by quite large numbers. At least for me, on some maps a win against the nemesis bot formation is nearly impossible. Getting pitted against exactly such formations 5 times in a row results in an almost instant elo drop of 100 points. The other way around, some days ago I got a lucky streak and ended up second with some 1600 points.

So all in all, elo is fine, but limiting special bots will probably improve its performance.

pier4r
Skynet
Posts: 3378

Re: ELO computation changes

Post#9 » 29 Nov 2016, 14:05

Christian wrote:ELO works well in my opinion.

The main problem is that the elo does not go well with the rock-paper-scissors gameplay right now. ELO should be rather stable at some point. Instead, for most players it is going up and down by quite large numbers. At least for me, on some maps a win against the nemesis bot formation is nearly impossible. Getting pitted against exactly such formations 5 times in a row results in an almost instant elo drop of 100 points. The other way around, some days ago I got a lucky streak and ended up second with some 1600 points.

So all in all, elo is fine, but limiting special bots will probably improve its performance.


Yup, but for example someone also applies a variant of elo to soccer (see http://clubelo.com/ ). Soccer is famous for the fact that there is no guarantee that if team A beats B and team B beats C, then A beats C. Chess is actually the same (see the World Chess Olympiad and so on), but if you look at the predictions, they work around 70% of the time.

Actually GFX could check (or dump the data! I love datasets!) how many times the elo correctly predicts a win/draw/loss. I would expect that even with rock/paper/scissors, the elo predicts 70% of the results properly.

GFX47
Dev
Posts: 2891

Re: ELO computation changes

Post#10 » 29 Nov 2016, 15:08

pier4r wrote:Actually GFX could check (or dump the data! I love datasets!) how many times the elo correctly predicts a win/draw/loss. I would expect that even with rock/paper/scissors, the elo predicts 70% of the results properly.


Enjoy!
https://dl.dropboxusercontent.com/u/243 ... rating.csv

Reminder: result values follow this rule:
- player 1 victory: 1
- player 2 victory: 0
- draw: 0.5
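
For anyone who wants to try the check on such a file, here is a minimal Python sketch. The column names (`score1`, `score2`, `result`) are guesses, since the real CSV header is not shown here, and the sample rows are made up purely to show the input shape. It counts how often the player with the higher score actually won and skips matches between equal scores.

```python
import csv
import io


def expected_result(score_1, score_2):
    """Player 1's expected result in the standard Elo formula."""
    return 1 / (1 + 10 ** ((score_2 - score_1) / 400))


def prediction_accuracy(rows):
    """Share of matches with a favourite in which the favourite won."""
    correct = total = 0
    for row in rows:
        expected = expected_result(float(row["score1"]), float(row["score2"]))
        if expected == 0.5:
            continue  # equal scores: no favourite to check
        predicted = 1.0 if expected > 0.5 else 0.0
        total += 1
        correct += float(row["result"]) == predicted
    return correct / total if total else 0.0


# Made-up rows (NOT real match data), using the result rule above.
sample = io.StringIO(
    "score1,score2,result\n"
    "1600,1250,1\n"
    "1300,1500,0.5\n"
    "1400,1450,0\n"
    "1500,1200,0\n"
)
print(prediction_accuracy(csv.DictReader(sample)))
```

Draws count as misses here because this simple version never predicts a draw.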

pier4r
Skynet
Posts: 3378

Re: ELO computation changes

Post#11 » 29 Nov 2016, 15:24

I will do a mindfap ASAP. Thanks a lot!

Johnbob
Algorithm
Posts: 62

Re: ELO computation changes

Post#12 » 29 Nov 2016, 16:21

Very quick analysis:

ExpectedResult = Draw

555 matches expected draw
67 matches ended draw
Correct prediction = 12.07%



ExpectedResult = Not Draw

85602 matches expected not draw
29297 matches wrongly predicted
56305 matches successfully predicted
Correct prediction = 65.78%

Johnbob
Algorithm
Posts: 62

Re: ELO computation changes

Post#13 » 29 Nov 2016, 16:24

Fun fact: the highest EloDelta for one match is 39 and it happened 10 times

MGBlitz81
Automaton
Posts: 135

Re: ELO computation changes

Post#14 » 29 Nov 2016, 16:52

40 is the max. I found that out in alpha 4 when I got bored and created fun ways to lose.

I'm not sure what the separation in rank is but I went from 1600s down to about 360ish.... If the gap is large enough, you gain nothing and risk 40 points.
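
The "gain nothing, risk 40" effect can be reproduced with a short Python sketch of the standard Elo formula with K = 40. The 1240-point gap is only an illustrative reading of "1600s down to about 360ish", the function names are made up for the example, and Python's `round` may differ from the game's rounding on exact ties.

```python
def expected_result(score_1, score_2):
    """Player 1's expected result in the standard Elo formula."""
    return 1 / (1 + 10 ** ((score_2 - score_1) / 400))


def win_and_loss_delta(gap, k_factor=40):
    """Rounded deltas for the favourite, who is `gap` points ahead."""
    expected = expected_result(gap, 0)  # only the gap matters
    return round(k_factor * (1 - expected)), round(k_factor * (0 - expected))


def smallest_gap_with_zero_gain(k_factor=40):
    """First gap (in whole points) at which a win rounds down to +0."""
    gap = 0
    while win_and_loss_delta(gap, k_factor)[0] != 0:
        gap += 1
    return gap


print(win_and_loss_delta(1240))
print(smallest_gap_with_zero_gain())
```

Under these assumptions the favourite's win is already worth 0 points well before the gap reaches 1000, while a loss costs close to the full K.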

pier4r
Skynet
Posts: 3378

Re: ELO computation changes

Post#15 » 29 Nov 2016, 17:49

Johnbob wrote:Fun fact: the highest EloDelta for one match is 39 and it happened 10 times

The highest elo delta is defined by the K factor (see https://www.reddit.com/r/Gladiabots/wiki/index ). In league 4 at most 25 points are exchanged; before that it was K=40 everywhere. (This is one example of "knowing the formula is faster than analysis".)

Thanks for the quick analysis. I still do not have enough time, but I will do a bit more. I do not know if GFX has the possibility, but if we also had player IDs (I do not mind nicknames) we could build the elo history of every player ID (that would be extra cool). Second question: were the 85K games all the registered ranked games or only a part of them, and are they only since 5.2 or also from before? If they are all since 5.2, that would be a nice sign for the growth of the game.

GFX47
Dev
Posts: 2891

Re: ELO computation changes

Post#16 » 29 Nov 2016, 18:13

I updated the file: added league and player IDs.
And it's only the matches from Alpha 5.

pier4r
Skynet
Posts: 3378

Re: ELO computation changes

Post#17 » 29 Nov 2016, 20:44

Great! Super thanks! (Which DB do you have? I remember doing quick exports with MySQL Workbench, HeidiSQL, and pgAdmin on MySQL/PostgreSQL systems.)

And having only matches from alpha 5 is neat: that is quite a lot of online activity.

GFX47
Dev
Posts: 2891

Re: ELO computation changes

Post#18 » 29 Nov 2016, 21:32

I use MySQL.

pier4r
Skynet
Posts: 3378

Re: ELO computation changes

Post#19 » 29 Nov 2016, 23:09

So, to back up Johnbob, here are the first results (more analysis will come):

results analyzed:86388
correct_expected_results_int:54156
ratio: 0.63 (or 63% )

Now just a quick view over league4
results analyzed league4:13019 (or 15% of the games)
correct_expected_results_league4_int:8257
ratio: 0.63

The problem with the draws is that the elo does not really care about the draw; it is a continuous function. So I arbitrarily decided that a draw is expected when the two scores, compared as the ratio higherscore/lowerscore, differ by 5% or less (it is one possible interpretation; I may have to change the threshold).

So, given that, the prediction accuracy is 0.63 (63%), but more will come considering the league and so on.
Anyway, the elo is evidently close to the 70% prediction accuracy that it has normally shown in other contexts.
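
Until the awk script is published, the draw rule described above could look roughly like this in Python. This is only one reading of the 5% threshold (higher score at most 5% above the lower one); the function and parameter names are hypothetical, and scores are assumed to be positive.

```python
def predicted_outcome(score_1, score_2, draw_band=0.05):
    """Predicted result for player 1: 1.0 win, 0.0 loss, 0.5 draw."""
    higher, lower = max(score_1, score_2), min(score_1, score_2)
    # Assumed interpretation: scores within 5% of each other mean "draw".
    if higher / lower - 1 <= draw_band:
        return 0.5
    return 1.0 if score_1 > score_2 else 0.0


print(predicted_outcome(1500, 1450))  # close scores
print(predicted_outcome(1600, 1250))  # clear favourite is player 1
print(predicted_outcome(1250, 1600))  # clear favourite is player 2
```

Changing `draw_band` moves matches between the "draw expected" and "win/loss expected" buckets, so the reported accuracy depends on that choice.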

edit: I will publish the code ASAP, it is just an awk script.

edit2: updated with a quick overview of league 4. Two notable observations: the prediction accuracy of the elo stays the same (more will come about this), and the number of games played in league 4 is not as high as I would have expected. I thought that players with hundreds of games were mostly in league 4 and not in the other 3; it seems that there is way more activity in the other leagues.
Last edited by pier4r on 29 Nov 2016, 23:30, edited 1 time in total.

Ritter Runkel
Neural Network
Posts: 498

Re: ELO computation changes

Post#20 » 29 Nov 2016, 23:30

Nice. More info plz and some fun facts
