Video Games

Expanding the Elo rating system

Oliver Brown
— This upcoming video may not be available to view yet.

One of the features I have planned for Tic-tac-toe Collection is a player rating system.

One such common rating system is the Elo rating system, most famously used in chess. There is a lot of discussion about how good it is, but for my purposes it seems fine - at least as a starting point.

There are two fundamental restrictions on Elo that I would have to overcome though:

  • It is limited to two players.
  • It assumes each player has an equal chance of winning if they have equal skill.

Elo

First an introduction into how Elo rating works. There are two parts:

Given two player ratings, calculate an expected result.

The expected result is essentially the chance of winning, given as a decimal between 0 and 1, in which 0 represents a guaranteed loss and 1 a guaranteed win.

Given a player rating, an expected result, and an actual result, calculate how much the rating should change.

If you do better than expected then your score will go up. If you do worse then your score will go down. With a bigger difference there will be a bigger change.

Multiplayer

Simple Multiplayer Elo

I found an existing implementation of Elo for more than two players by Tom Kerrigan called Simple Multiplayer Elo.

In SME, players are ordered by performance, and then a simulated match is played between pairs of consecutive players, with the players’ ratings updated using the normal Elo method for two players.

This is certainly acceptable, and Tom includes some tests to show it works well. One oddity is if you get a victory over someone with a much higher rating than you, but also beat them by more than one place, you essentially don’t get the benefit for it. For example, consider the following result:

Position Player Initial rating vs player above vs player below Total
1st A 1000 +24 +24
2nd B 1200 −24 +27 +3
3rd C 1500 −27 −27

Here, Player A with a rating of 1000 has beaten Player C with a rating of 1500, but essentially hasn’t gotten the credit for it.

My unnamed Elo

My solution is to conceptually run the algorithm on every pair of players (in this case there would be A v B, B v C and A v C). There is a straightforward optimization so that the estimating part of the operation does not directly depend on the other players, just the difference between a single player’s expected and actual scores. So the algorithm is actually:

  1. Calculate the expected scores for each pair of players.
  2. Sum the expected score for each player.
  3. Divide each score by the number of players to normalize back to the range 0 to 1.
  4. Calculate the rating change for each player using their actual score and the combined expected score.

The details

With the same data as above, the results are as follows:

Player Initial rating Expected score
A 1000 0.097
B 1200 0.304
C 1500 0.599

Which is made up of the following score pairs:

Pair A B C
A v B 0.24 0.26
B v C 0.15 0.85
A v C 0.05 0.95

This results in rating changes of:

Position Player Initial rating Change
1st A 1000 +29
2nd B 1200 −9
3rd C 1500 −19
  • Player A has gained more which is good. That was basically the goal of the change.
  • Player B has now changed a small gain into a moderate loss. That’s a little odd and probably not desired, after all the victory against C should be more significant than the loss against A
  • Player C has changed a big loss into a slightly smaller (but still big) loss. After some thought, that is probably reasonable. Although C is expected to win just having more players should generally reduce your expectation of winning and therefore how much you are penalized for failing.

Improvements

So why did these results happen? It might be better to look at a table also including expected and actual results, which reveals a choice that has to be made that I have not yet mentioned:

Position Player Initial rating Expected score Actual score Change
1st A 1000 0.097 1 +29
2nd B 1200 0.304 0 −9
3rd C 1500 0.599 0 −19

Notice the actual scores used. Player A has a score of 1 indicating a win. Players B and C have a score of 0, which indicates a loss, but an equal loss.

If we instead use different values for the actual score we get a more sensible result

Position Player Initial rating Actual score Change
1st A 1000 0.67 +18
2nd B 1200 0.33 +1
3rd C 1500 0 −19

In this case, the “actual score” for a player placed Nth out of C is:

2 N C × C - 1
Explain this formula

This formula is conceptually simple, but a bit opaque when simplified.

Think of it as each player getting a number of shares of the total score. The person in last gets 0 shares, the person second from last gets 1 share, then next player gets 2 shares, and so on. The person in first place gets C−1 shares.

That means each player gets N shares and the total number of shares is equal to the C−1th Triangular number.

The formula for a triangular number is:

T x = x × x + 1 2

Substituting C-1:

T C - 1 = C - 1 × C - 1 + 1 2

This simplifies to: T C - 1 = C × C - 1 2

Therefore each player gets:

N / C × C - 1 2

which simplifies to:

2 N C × C - 1

Handling ties

To be as general as possible (and because it actually happens in some of my multiplayer Tic-tac-toe variants) we need to handle arbitrary ties.

The simplest way is to evenly distribute the score to all the players that are tied. So if the top two players in a three-player game tie, their scores of 0.67 and 0.33 are summed and split between them (0.5 each).

As a more complex example, consider:

Position “Raw” score Actual score
1st 0.286 0.262
1st 0.238 0.262
3rd 0.190 0.143
3rd 0.143 0.143
3rd 0.095 0.143
6th 0.048 0.048
7th 0 0

Final thoughts

I’ve addressed the first restriction of Elo, only supporting two players. As for unfair games, that will have to wait as this post is long enough as it is.

Blazor Numbers Game

Oliver Brown
— This upcoming video may not be available to view yet.

Over the weekend I was playing around with Blazor and created a little number game based on part of a popular British TV show.

In the future this may expand in functionality as part of my exploration into Blazor.

You can try the number game full screen.

New Tic-tac-toe Collection website

Oliver Brown
— This upcoming video may not be available to view yet.

Tic-tac-toe Collection has a new website: [https://www.tictactoecollection.app/].

I switched from using Wordpress.com to using a static site generator, specifically Hugo.

Overcooked - Fun Coop Multiplayer Action

Oliver Brown
— This upcoming video may not be available to view yet.

I recently a recorded a bunch of videos of the game Overcooked on Xbox One.

Interacting with this video is done so under the Terms of Service of YouTube
View this video directly on YouTube

Your goal is to assemble meals out of various ingredients, cook them, and serve them. Over time the meals get a bit more complicated and the levels get a lot more complicated. It is strongly designed to be played cooperatively with up to four people, and even supports two players on a single controller.

My only complaint would be the difficulty is based too much on complicated level design (and jumps up a bit too quickly). Some times the controls are not exactly tight and you can end up selecting the wrong thing - having levels with moving targets or slippery floors for instance just accentuates an otherwise minor problem. I would have preferred more meal variations (that are also more complicated) on simpler levels.

But despite all that it’s a fun party game that almost anyone can play. And of course it is made in Unity.

One final note. The first video in the playlist above was generated by Google Photos. It turned out well, except for its automatic cropping.

Visualizing Gravity

Oliver Brown
— This upcoming video may not be available to view yet.

Unity has a pretty cool feature called “gizmos”, that are things rendered only in the scene view of the Unity editor. Many built in game object types render a gizmo of some sort, but you can freely add your own. This can be very useful for debugging.

This is a visualization of the gravity (more precisely it’s the low resolution grid of gravity mentioned in the previous post).

The direction of the line is the direction of the gravity, and the length is the strength.

The Gravity of the Situation

Oliver Brown
— This upcoming video may not be available to view yet.

Interacting with this video is done so under the Terms of Service of YouTube
View this video directly on YouTube
The most important feature in Gravitas is clearly the gravity, so getting it right is crucial. One important principal to be aware of when making games though is that “right” doesn’t necessarily mean “technically perfect” or “scientifically accurate”. “Feeling right” and being fun to play with is often more important. Despite that, the gravity in Gravitas is essentially correct. By correct, I mean a numerical approximation of Newton’s law of universal gravitation. The formula for calculating the gravitational force between two masses is:
Gravitation

Gravitation

where:

  • F is the force between the masses;
  • G is the gravitational constant (6.674×10−11 N · (m/kg)2);
  • _m_1 is the first mass;
  • _m_2 is the second mass;
  • r is the distance between the centers of the masses.

For simplicity and performance reasons, there are a few things to be aware of though. Firstly, I decided not to worry about real world units since the planets, ships and torpedoes are not realistically scaled to each other (either by size, mass or distance from each other). This means I can decide that my torpedoes have a mass of 1, thereby eliminating a multiplication for the most common case. This also means I can choose the value of G to be whatever looks or feels right.

This is still quite a significant calculation that has to be performed every frame for every torpedo/planet pair.

As it turns out though, the maximum number of torpedo/planet pairs is quite low. Based on the limits of the previous version of Gravitas, there were at most 9 ships, firing at most 3 torpedoes, with at most 11 planets. This means 9 × 3 × 11 = 297 times to calculate the strength of the gravity each frame. In practice I don’t think I ever saw so that many torpedoes (every ship using a triple-shot special power at once).

But then I wanted to add dust. By dust, I mean a trail from the torpedoes, that is also affected by gravity. At a minimum this should generate one particle per frame and be at least a couple of seconds long at normal torpedo speed. This means each torpedo could easily have a hundred dust particles all needing the same gravity calculation. All of a sudden there could be 300,000 gravitation calculations per frame. On my laptop it ran fine. On my phone, not so much.

There are a number of possible ways of solving this issue, but the one I settled on is based on the principle that accuracy of simulation of the dust particles (as opposed to the torpedoes) is not so important. That is, if the dust particles don’t behave perfectly, it doesn’t really matter. One thing to notice about the gravity is that the only dynamic data it depends on is the position of the dust. The mass of the dust is fixed, and the position and mass of the planets are fixed (at least per round).

My solution was to pre-calculate the strength of gravity on a dust particle for all the positions in a low-resolution grid. This means I can “calculate” the gravity by doing a relatively cheap lookup.