#### Introduction

Keith Burgun believes that the optimal win-rate for a game is 50%. This causes no end of debate every time he mentions it, which annoys me both because I mostly agree with him and because it’s the same conversation every time. Here I want to give my opinion, and by opinion I mean a strict proof of a very specific claim, which will hopefully inform, possibly entertain, and likely won’t stop anyone arguing. First, let me give my more formal hypothesis:

*For any game that ends with a binary win or loss with probabilities p and 1-p respectively, the expected information gain for the player on the quality of their play is at a maximum when p = 0.5.*

Here is an outline of the proof. First we give a formal definition of ‘information’. We then use Shannon entropy to derive a formula for the average information gain of a random variable. Finally, we find the maximum of that function using differentiation.

#### Information Defined

Let’s start with the particular example we are interested in. Imagine we have a player about to start a game. Say we also have records of a large number of their past games and we know that they have won 80% of them. Before we see anything of the game itself, it seems reasonable to assign an 80% probability to the event that they win. Given that win-rate, if I told you they won the game, how surprised would you be? How about if they lost the game: more or less surprised? If they had a 100% chance of winning, how surprised would you be when they won? This idea of ‘amount of surprise’ is synonymous with the idea of information content. For an event with probability *p*, the information content *I(p)* has the following properties:

- *I(p)* is anti-monotonic: as *p* increases *I(p)* decreases, and as *p* decreases *I(p)* increases. *I(0)* is undefined.
- *I(p)* is always greater than or equal to 0.
- *I(1) = 0*: events that are guaranteed to occur communicate no information.
- *I(p1 p2) = I(p1) + I(p2)*: information due to independent events is additive.

It turns out that the following function has all these properties,

*I(p) = log(1/p) = -log(p)*

We will use this function in the next section as our measurement of the information content of an event with probability *p*. The base of the logarithm must be some fixed number greater than 1, but beyond that the choice doesn’t matter, since converting between bases only rescales the result by a constant factor. For the rest of this article, assume unless stated otherwise that our logarithms are base 2, which keeps the working simple and consistent and means information is measured in bits.
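
This definition is easy to play with directly. Here is a quick sketch in Python (the function name is my own) showing the ‘surprise’ values for the 80%-win-rate player from above:

```python
import math

def information(p: float) -> float:
    """Information content (surprise), in bits, of an event with probability p."""
    if p <= 0 or p > 1:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)

# A certain event carries no information; rarer events carry more.
print(information(1.0))  # a guaranteed win: 0 bits
print(information(0.8))  # a win from an 80% player: mildly informative
print(information(0.2))  # their loss: considerably more surprising
```

Note that the additivity property also falls out: two independent coin flips, each probability 0.5, give `information(0.25) == information(0.5) + information(0.5)`.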

#### Shannon entropy

Now that we can calculate the information content of a single event, how can we calculate something similar for a random variable? To find the average information content of a random variable, we take the information content of each event and weight it by the probability that the event occurs. This average information content is known as Shannon entropy and is calculated as follows:

*H(X) = Σ p(x) I(p(x)) = -Σ p(x) log(p(x))*, summing over every possible outcome *x* of *X*.

When a random variable X has only two possible outcomes with probabilities p and q, we can redefine *q* as *q = 1-p*. This allows us to rewrite the formula for the entropy of a random variable with only two outcomes using just the probability of one event, as follows:

*H(p) = -p log(p) - (1-p) log(1-p)*
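
As a sanity check before the proof, here is a small Python sketch (function name mine) of the two-outcome entropy, swept over a few win-rates; the value visibly peaks at p = 0.5:

```python
import math

def binary_entropy(p: float) -> float:
    """Average information, in bits, of a win/loss outcome with win probability p."""
    if p <= 0 or p >= 1:
        return 0.0  # degenerate case: the outcome is certain, so no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Entropy is symmetric around 0.5 and largest there.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p = {p:.1f}  H = {binary_entropy(p):.3f} bits")
```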

#### Proof of Maximum

Now that we have a function of one variable for the average information content of a random variable with only two possible outcomes, we want to know which probability gives us the highest value. It is possible to find the maximum of a function using differentiation. Differentiating a function f(x) gives us another function f'(x) which describes the gradient of the original function. As an example, for f(x)=2x, f'(x)=2, because the gradient of f(x) is a constant upwards slope. For any function f(p) with derivative f'(p) over a closed interval [a,b], the maximum value of f(p) must occur at one of:

- the endpoint p=a,
- the endpoint p=b,
- a point where f'(p)=0, or
- a point where f'(p) is undefined.

If the gradient of a function at a point *x* is non-zero, the function is still either increasing or decreasing there. This means that if the gradient is positive we could increase *x* slightly and get a greater value, and if it is negative we could decrease *x* slightly and get a greater value. Because of this, we know the maximum value must be at a point where the gradient is 0 or undefined, or at an edge of our closed interval, since the edge is what stops us from nudging *x* any further.

Now the proof that the maximum entropy is at 0.5:

- Differentiate f(p) to find f'(p).
- Let f'(p)=0 and solve for p.
- Check the endpoints and show that f(0.5) is the maximum.

From this we see that the maximum entropy is achieved when both outcomes have probability 0.5. So, to bring this back to the point: the win-rate that communicates the most information on average to the player is a 50% win-rate. Extensions I may attempt to prove in the future are that you can increase information transmission by adding more possible outcomes than just win/loss, and that the closer the outcomes are to even probabilities, the greater the information transmission. This would suggest that a game with a fine-tuned, exactly 50% win-rate would be less efficient at transmitting information than a game with 5 possible ending outcomes, each only roughly 20% likely.
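
That last comparison is easy to check numerically with the general entropy formula (the Python function name is mine):

```python
import math

def entropy(probs) -> float:
    """Shannon entropy, in bits, of a distribution over match outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A perfectly tuned win/loss game versus five equally likely final outcomes.
print(entropy([0.5, 0.5]))  # 1 bit per game
print(entropy([0.2] * 5))   # log2(5), more than double the feedback per game
```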

-Conor.

“This would suggest that a game with a fine tuned exactly 50% win-rate would be less efficient at transmitting information than a game with 5 possible ending outcomes, each only roughly 20% likely.”

Would you say that there is a turning point, where X possible outcomes is more informative than X+1 possible outcomes? Or do you think that more possible outcomes always means better feedback?

I ask this because although I agree that 5 possible outcomes is more informative than 2, it seems to me that after a certain amount of possible outcomes there is not enough of a clear distinction between them.


Good question, and one I am planning on riddling out in full at a later date, but I’m happy to give an informal answer. So let’s consider entropy to be a good measure of player feedback: the greater the entropy, the greater the feedback. If we run with that, the question we want the answer to is “given that higher entropy implies better feedback, in what ways can we increase entropy?”. The awkward thing about this is that entropy depends on two things: the number of possible outcomes and their relative probabilities. X+1 possible outcomes guarantees a higher maximum possible entropy than just X outcomes, but that maximum is only reached if the relative probabilities of the outcomes are right. This is what I’m planning on giving a proof for at a later date.
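
To make that concrete with some invented numbers: three outcomes can beat a fair coin’s 1 bit of feedback, but only when their probabilities are reasonably balanced. A skewed third and second outcome can even make the game less informative than the coin:

```python
import math

def entropy(probs) -> float:
    """Shannon entropy, in bits, of an outcome distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Three even outcomes reach the maximum log2(3) bits, beating a fair coin...
print(entropy([1/3, 1/3, 1/3]))
# ...but the same three outcomes with badly skewed probabilities fall
# well below the coin's 1 bit, despite the extra outcome.
print(entropy([0.96, 0.02, 0.02]))
```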

Now in reality I agree with you: the model starts to fall apart when we get to big enough numbers, but only for psychological reasons. When told they scored 124876 where last time they only scored 124875, it doesn’t really click with players that they are doing better, because big numbers are difficult to think of and remember, so most people would just classify the two scores as the same thing: “around about 125000… ish”. So really my argument here is to suggest that, to teach players the game quickly, it’s generally a better idea not to limit yourself to the win/lose binary but to use things like scoring, ranks and placements as the final outcomes of a match. So yeah, entirely fair intuition there VMaikel.
