AnyDice Classic Archive 20

Dice and Averages

Fri, 25 Dec 2009 00:00:00 +0000

I had a fun chat through the contact page recently, which – among other things – dealt with averages.

How do you calculate the average of something? You add the lowest and highest values, then divide by two, right? Well, yes, but that's actually a special case. How do you calculate the average of an arbitrary collection of values, in general?

In the general case, you sum all elements of the collection, then divide that by the number of elements. 1d4 is the collection {1,2,3,4}. Its average is (1 + 2 + 3 + 4) / 4 = 10/4 = 2.5.

2d4 is the collection {2,3,3,4,4,4,5,5,5,5,6,6,6,7,7,8}. Its average is (2 + 3*2 + 4*3 + 5*4 + 6*3 + 7*2 + 8) / 16 = 80/16 = 5.

2d4h1 is the collection {1,2,2,2,3,3,3,3,3,4,4,4,4,4,4,4}. Its average is (1 + 2*3 + 3*5 + 4*7) / 16 = 50/16 = 3.125.

So if you know how often each element occurs, then you can calculate the average, no matter how strange the distribution.
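To make the general approach concrete, here is a minimal Python sketch that builds the three collections above by listing every possible roll and then averages them. The helper name `average` and the variable names are my own for illustration; this is not how AnyDice itself computes anything.

```python
from fractions import Fraction
from itertools import product


def average(values):
    """Sum all elements, then divide by the number of elements."""
    values = list(values)
    return Fraction(sum(values), len(values))


d4 = range(1, 5)  # the faces 1, 2, 3, 4

# Every possible outcome, listed once per way of rolling it.
one_d4 = list(d4)
two_d4 = [a + b for a, b in product(d4, repeat=2)]
two_d4_h1 = [max(a, b) for a, b in product(d4, repeat=2)]  # highest 1 of 2d4

print(float(average(one_d4)))     # 2.5
print(float(average(two_d4)))     # 5.0
print(float(average(two_d4_h1)))  # 3.125
```

Using `Fraction` keeps the arithmetic exact, so the results match the hand calculations digit for digit.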

But what if you only know the odds per value? That's not a problem. The only difference is that the division at the end has already been done for you. So all that's left to do is sum each value multiplied by its odds. Demonstrated using 2d4h1: (1 + 2*3 + 3*5 + 4*7) / 16 = 1/16 + 2*3/16 + 3*5/16 + 4*7/16 = 1 * 0.0625 + 2 * 0.1875 + 3 * 0.3125 + 4 * 0.4375 = 3.125.
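The same calculation as a small Python sketch, assuming the odds are given as a mapping from value to probability. The function name `average_from_odds` is made up for this example; the odds are the 2d4h1 counts from above (1, 3, 5 and 7 out of 16).

```python
from fractions import Fraction

# Odds per value for 2d4h1: 1, 3, 5 and 7 chances out of 16.
odds = {
    1: Fraction(1, 16),
    2: Fraction(3, 16),
    3: Fraction(5, 16),
    4: Fraction(7, 16),
}


def average_from_odds(odds):
    """Sum each value multiplied by its odds; no final division needed."""
    # Sanity check: the odds must add up to exactly 1.
    assert sum(odds.values()) == 1
    return sum(value * p for value, p in odds.items())


result = average_from_odds(odds)
print(float(result))  # 3.125
```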

The special case

If you want to sum a straight sequence of 1 + 2 + 3 + … + N, then you can use the famous formula (1 + N) * N/2. It works because it takes advantage of the symmetry of the problem; you can calculate it as (lowest + highest) + (2nd lowest + 2nd highest) + … up to the middle. Each of those pairs sums to 1 + N, and there are N/2 of them. Since N is also the number of elements in this case, you have to divide the result by N to get the average. So the average becomes (1 + N) * N/(2*N) = (1 + N) / 2.
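If you'd rather check the formula than take it on faith, a quick sketch like the following compares it against plain summation for a few die sizes. The function names are hypothetical, chosen just for this example.

```python
def sum_1_to_n(n):
    """The closed-form sum 1 + 2 + ... + n, i.e. (1 + n) * n / 2."""
    return (1 + n) * n // 2  # (1 + n) * n is always even, so // is exact


def average_1_to_n(n):
    """Average of 1..n: the sum divided by the n elements."""
    return (1 + n) / 2


for n in (4, 6, 20):
    values = range(1, n + 1)
    # The shortcut must agree with adding everything up the long way.
    assert sum(values) == sum_1_to_n(n)
    assert sum(values) / n == average_1_to_n(n)
    print(n, sum_1_to_n(n), average_1_to_n(n))
```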

Of course, the lowest value needn't be one, so you get to the common formula (L + H) / 2.

The simple formula works for 1d4, but does it also work for 2d4, 3d4, and so on? Yes it does! It works for any distribution that is symmetrical around the average. In those cases you can ignore the odds of the individual elements and treat it as a simple linear range. In all other cases you can use the general approach.
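As a rough illustration, the sketch below compares the (L + H) / 2 shortcut with the general approach for two symmetrical distributions (2d4 and 3d4) and one lopsided one (2d4h1). The helper names `general_average` and `shortcut_average` are my own, not anything from AnyDice.

```python
from itertools import product


def general_average(values):
    """General approach: sum everything, divide by the count."""
    return sum(values) / len(values)


def shortcut_average(values):
    """Special case: (lowest + highest) / 2."""
    return (min(values) + max(values)) / 2


d4 = range(1, 5)
two_d4 = [sum(roll) for roll in product(d4, repeat=2)]
three_d4 = [sum(roll) for roll in product(d4, repeat=3)]
two_d4_h1 = [max(roll) for roll in product(d4, repeat=2)]

print(general_average(two_d4), shortcut_average(two_d4))        # 5.0 5.0
print(general_average(three_d4), shortcut_average(three_d4))    # 7.5 7.5
print(general_average(two_d4_h1), shortcut_average(two_d4_h1))  # 3.125 2.5
```

The shortcut agrees with the general approach for 2d4 and 3d4, but for 2d4h1 it gives 2.5 where the true average is 3.125, because that distribution leans toward the high values.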
