Counting to 10 on the Island of Catan
The night before our junior spring semester was brought to an abrupt end by the coronavirus pandemic, a couple of friends and I decided to bring out the Settlers of Catan game that had spent the semester collecting dust under the coffee table. For all but one of us, this was our first game. As luck would have it, Ares, the one friend who had played before, was going to show up a little late.
After briefly skimming the Game Almanac, we bravely set out to settle the island of Catan. That day, the dice decided to roll 4’s and 5’s over and over, and we quickly found ourselves in two groups: those with a lot of brick and those with a lot of nothing. So we rolled on for around half an hour, two of us continuing empty-handed and the other two amassing a healthy abundance of bricks and not much else. We didn’t make any progress until Ares showed up, took a quick glance at our growing collections of bricks, and asked us why we weren’t playing with the limit of 7 cards in hand, and why nobody had traded in their bricks for another resource.
Ten months after school ended, we are now about 2 weeks from returning. Maybe this is quarantine boredom finally getting to my head, but I thought I would take this time to try to understand the game of Catan and figure out some basic strategy. I have also always seen Peter Brand (Jonah Hill) in Moneyball as a bit of a personal hero, so look out for my first foray into sports analytics in the coming weeks!
The Game of Catan
The island of Catan is made up of hexagonal tiles (hexes), each providing a certain resource and each associated with a number between 2 and 12. Players can build settlements and cities on the intersections where 3 tiles meet, and roads on the edges where 2 tiles meet. (On the coast, players can simply build settlements and cities on corners, and roads on edges.)
When the 2 dice sum to some number, every player with a settlement or city adjacent to a tile bearing that number receives that tile’s resource: one card for each settlement and two for each city. If we take a quick look at the board, there are no hexes associated with 7. This is because when someone rolls a 7 (the most likely outcome, occurring 1/6 of the time), they get to move the robber to a new tile, stealing a resource card from a player on that tile and blocking all production on it until a 7 is rolled again or a Knight card from the Development Card supply is played (more on this later).
Placing The Robber
As a bit of a warmup and first question, then: supposing we have played a Knight or rolled a 7, where do we want to place the robber? There are many things to consider here, the most obvious being the number on the tile, which determines how often the robber’s effect will take hold. We might also consider how many settlements are on a given tile and who they belong to. The more settlements on a tile, the more “damage” each blocked roll will do. There may be a specific player you would like to target: perhaps they are winning, or maybe you just don’t like them very much (but then why would you be playing Catan with them?). Finally, we might also think about what type of resource the tile produces. For example, there are only 3 ore tiles, as opposed to 4 wheat tiles. On a board where the ore tiles carry the numbers 3, 8, and 10, the “3” tile is very unlikely to be rolled, so if we occupy the “10” tile, it may be tempting to block the “8” tile, temporarily monopolizing the supply of ore and simultaneously blocking what is likely a popular and productive tile.
First, let’s see how effective our robber will be until his next relocation. While a played Knight card will also force a relocation, this is a little hard to predict, so we will only consider rolls of 7 for now. We will answer two questions: how likely is it that the robber does nothing before we lose control of it, and how many times, on average, will the robber’s effect take place?
To answer the first question, let k denote the number on the tile. We can then rephrase this question as a simpler one. In particular, what is the probability that we roll a 7 before a k? To do this, let’s take a quick detour to Princeton Junction and watch some trains.
Poisson Processes, Superposition, and Thinning
Let’s say you are sitting at Princeton Junction, taking some time to watch the trains. After a day or two, you notice that an NJ Transit train passes by approximately once an hour, and the Dinky, being a little smaller, passes by the station 3 times every hour, or once every 20 minutes. We will model each of these as a Poisson process, an idea named after French mathematician Siméon Denis Poisson and used to model arrival times.
You might then draw the conclusion that approximately 4 trains pass by the station every hour. This is unfortunately not the best example, as train arrivals tend not to be independent: they are scheduled to be certain intervals apart, and indeed, the Dinky is probably scheduled to coincide with the Northeast Corridor trains. However, for argument’s sake, if these schedules were independent, and each train was operated independently of the previous one, you would be correct to predict 4 arrivals an hour, or one every 15 minutes on average. This combination of Poisson processes is called superposition.
Now, your friend comes along and bets that a Dinky train will show up before an NJ Transit train does. Being mathematically inclined, you want to make this a fair bet. Since 3 of every 4 arrivals in the combined stream are Dinkies, you conclude that the next arrival will be a Dinky with probability 3/(3 + 1) = ¾. This (roughly) is the concept of thinning. Now, let’s apply this to our problem.
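To see superposition and thinning in numbers, here is a small simulation sketch. It assumes, as in the thought experiment above, that the two train types really are independent Poisson processes with rates of 1 and 3 per hour. The function name `simulate_trains` and its parameters are made up for illustration.

```python
import random

def simulate_trains(trials=100_000, nj_rate=1.0, dinky_rate=3.0, seed=0):
    """Estimate (P(Dinky arrives first), mean wait in hours for the next train)."""
    rng = random.Random(seed)
    dinky_first = 0
    total_wait = 0.0
    for _ in range(trials):
        # Waiting times in a Poisson process are exponential with the same rate.
        nj_wait = rng.expovariate(nj_rate)
        dinky_wait = rng.expovariate(dinky_rate)
        if dinky_wait < nj_wait:
            dinky_first += 1
        # The next train of either kind is whichever shows up sooner.
        total_wait += min(nj_wait, dinky_wait)
    return dinky_first / trials, total_wait / trials

if __name__ == "__main__":
    p_dinky, mean_wait = simulate_trains()
    print(f"P(Dinky first) is about {p_dinky:.3f} (theory: 0.75)")
    print(f"mean wait is about {mean_wait:.3f} hours (theory: 0.25)")
```

The estimated mean wait should land near 15 minutes (superposition), and the fraction of Dinky-first trials near ¾ (thinning).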
Back to Catan
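First, we note, for each k, the probability p that a given roll of the two dice sums to k. Of the 36 equally likely outcomes, 6 − |k − 7| sum to k, so p is 1/36 for k = 2 or 12, 2/36 for 3 or 11, 3/36 for 4 or 10, 4/36 for 5 or 9, 5/36 for 6 or 8, and 6/36 for 7. These probabilities can also be found in the Catan Almanac itself.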
Now, k occurs with probability p, while a 7 occurs with probability 6/36 = 1/6. Modeling rolls of the dice as a Poisson process, a k arrives at a rate of p per roll, and a 7 arrives at a rate of 1/6 per roll. With superposition, we see that one or the other arrives at a rate of p + 1/6 per roll. Now, with thinning, if we begin with a single Poisson process with parameter p + 1/6, we can label each arrival either a k or a 7 with probabilities proportional to p and 1/6 respectively. Indeed, by the thinning principle, these labelled processes are Poisson processes with the original parameters. Then, the probability that a 7 occurs before a k is simply the probability that the first arrival is labelled a 7, which is P(7 before k) = (1/6) / (p + 1/6) = 1 / (6p + 1).
For example, if we place the robber on a tile with k = 2, there is a 6/7 ≈ 0.86 chance the robber will do nothing. On the other hand, if k = 8, there is a significantly smaller 6/11 ≈ 0.55 chance it does nothing.
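For readers who want the same numbers for every tile, the following sketch tabulates them exactly. It assumes two fair six-sided dice and uses the ratio (1/6) / (p + 1/6) for the chance that a 7 arrives before a k. The function names are illustrative only.

```python
from fractions import Fraction
from itertools import product

def roll_probability(k):
    """Probability that two fair six-sided dice sum to k."""
    ways = sum(1 for a, b in product(range(1, 7), repeat=2) if a + b == k)
    return Fraction(ways, 36)

def prob_robber_does_nothing(k):
    """Probability that a 7 is rolled before a k (k is a tile number, so not 7)."""
    p = roll_probability(k)
    p_seven = roll_probability(7)
    return p_seven / (p + p_seven)

if __name__ == "__main__":
    for k in [2, 3, 4, 5, 6, 8, 9, 10, 11, 12]:
        q = prob_robber_does_nothing(k)
        print(f"k={k:2d}  p={roll_probability(k)}  P(7 before k)={q} (about {float(q):.2f})")
```

Using `Fraction` keeps the results exact, so `prob_robber_does_nothing(2)` is exactly 6/7 and `prob_robber_does_nothing(8)` is exactly 6/11, matching the examples above.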
NOTE: This was a little (or a lot) too fast, but I didn’t want to delve into a whole semester’s worth of probability theory for this short article (which I feel is already getting a little long). If you’re interested in a more thorough explanation, leave a note in the comments and I’ll try to answer! Alternatively, check out a course in probability theory at any university; MIT’s course is probably pretty good.
Now, we turn to our next question: how often will the robber be effective? Again, we can use our ideas of superposition and thinning. After superposition, we again have our process with p + 1/6 arrivals per roll, and we want to know how many of these arrivals, on average, are labelled k before the first one is labelled 7. The number of arrivals up to and including the first 7 is a geometric random variable whose stopping probability is the answer to the first question, 1/(6p + 1), so its mean is 6p + 1. Subtracting 1 for the final roll of 7, we can expect the robber to be invoked 6p times on average. For example, if k = 6, then p = 5/36 and we can expect 5/6 uses of the robber on average.
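As a sanity check on the 6p figure, the sketch below skips the Poisson modeling entirely and rolls two simulated dice until a 7 appears, counting how many times k came up along the way. It assumes fair dice and ignores Knight cards, as in the discussion above. The function names and trial count are arbitrary choices.

```python
import random

def robber_hits_before_seven(k, rng):
    """Roll two dice until a 7 appears; return how many rolls summed to k first."""
    hits = 0
    while True:
        total = rng.randint(1, 6) + rng.randint(1, 6)
        if total == 7:
            return hits
        if total == k:
            hits += 1

def average_hits(k, trials=100_000, seed=1):
    """Monte Carlo estimate of the expected number of robber uses on a k tile."""
    rng = random.Random(seed)
    return sum(robber_hits_before_seven(k, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    for k in (2, 6, 8):
        ways = 6 - abs(k - 7)  # dice outcomes summing to k, so 6p = ways / 6
        print(f"k={k}: simulated {average_hits(k):.3f}, predicted 6p = {ways / 6:.3f}")
```

The simulated averages should sit close to 6p, for instance near 5/6 for k = 6.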
Putting it all together
This first post is starting to get pretty long, so I will wrap this up with a discussion of robber placement, and next time we will continue with settlement placement, development cards, and resource comparisons! Also, if there are other aspects of Catan, other board games, or just other fun ideas and puzzles you want to hear about, let me know in the comments!
We know that on average, for a given k, the robber will be invoked 6p times, but we are not yet done, as not all invocations are equally powerful. We want our robber to do something meaningful when his number is called. For example, if an opponent has 2 cities on one tile, each use of the robber is 4 times as powerful as if they only had a single settlement. Conversely, if there are no settlements on a tile, a robber placed there will do nothing, regardless of how many times his number is rolled. Let’s call this the damage coefficient, denoted D. If all players are treated equally, we can calculate D by adding 1 point for each enemy settlement and 2 points for each enemy city. For example, if a tile has 1 enemy city and 1 enemy settlement, we would find D = 3.
On the other hand, you might want to target a certain player, and in rare cases, perhaps one player is so dangerously close to winning that you are willing to sacrifice a settlement of your own to stop them, even if that means hurting your standing relative to the other players. As a very crude way to find D in general, then, you could decide for each opponent p how much you want to impede them, say with some coefficient Hₚ (higher for more dangerous opponents). Then, you could choose H₀, a coefficient for how unwilling you are to take damage yourself (lower if you are willing to hamper yourself temporarily, higher if you are not). For a given tile, H₀ could be the number of players without income from this tile. Then, with Dₚ and D₀ calculated as above as the income each opponent and you receive from the tile, we can calculate for any tile D = Σₚ Hₚ·Dₚ − H₀·D₀, where the sum runs over your opponents.
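One way to make this concrete is the sketch below. It assumes the weighted form D = (sum over opponents of Hₚ·Dₚ) − H₀·D₀, which is one reading of the description above rather than a definitive formula, and it counts income as 1 per settlement and 2 per city. All names and the input layout are hypothetical.

```python
def tile_income(settlements, cities):
    """Cards a player collects from one tile each time its number is rolled."""
    return settlements + 2 * cities

def damage_coefficient(opponents, own=(0, 0), own_weight=1.0):
    """
    Weighted damage D for one tile (assumed form: sum of H_p * D_p minus H_0 * D_0).

    opponents:  list of (H_p, settlements, cities), one entry per opponent on the tile
    own:        (settlements, cities) that you hold on the tile
    own_weight: H_0, how much you mind blocking your own income
    """
    gain = sum(h * tile_income(s, c) for h, s, c in opponents)
    cost = own_weight * tile_income(*own)
    return gain - cost

if __name__ == "__main__":
    # Equal treatment: one opponent with 1 settlement and 1 city on the tile.
    print(damage_coefficient([(1, 1, 1)]))
    # A dangerous leader (weight 2) with a city, another opponent with a
    # settlement, and one settlement of our own that we would rather not block.
    print(damage_coefficient([(2, 0, 1), (1, 1, 0)], own=(1, 0), own_weight=1.5))
```

With all weights set to 1 and none of your own buildings on the tile, this reduces to the simple count from before, so the first call gives D = 3.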
Recall that we also might be concerned about what type of resource we place the robber on. Let’s call this the resource multiplier, or R. I have not really come up with a great way to define R, but a possible option is to base it on the number of tiles on which this resource is available: the more common a resource, the less valuable it is to block a single tile. For ore, of which there are only 3 tiles, we may choose R = ⅓. For wheat, of which there are 4, we may choose R = ¼. Note that we are dividing so that when we multiply by R, tiles with more common resources become less valuable.
You may also want to adjust this coefficient based on which tiles you occupy. We may then choose R based on the number of tiles of this resource you do not occupy. For example, if you occupy one of the 3 ore tiles, then we may choose R = ½ as opposed to ⅓. This is pretty crude, so if you have any suggestions on how to compute R, leave them in the comments! Of course, you may also want to ignore the type of resource completely and simply block production, in which case setting R = 1 for all tiles will suffice.
Putting it all together, for a given tile of some resource and some number k, we can then compute the overall efficiency of placing a robber on this tile as the product of the three factors: Efficiency = 6p × D × R.
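If you did want to bring a calculator to game night, a scoring sketch might look like the following. It assumes the efficiency is the product of the expected number of uses 6p, the damage coefficient D, and the resource multiplier R. The candidate tiles and their D and R values are invented purely for illustration.

```python
from fractions import Fraction

def expected_robber_uses(k):
    """6p: expected rolls of k before the next 7, for tile numbers 2-12 except 7."""
    if k == 7 or not 2 <= k <= 12:
        raise ValueError("k must be a tile number: 2 to 12, excluding 7")
    ways = 6 - abs(k - 7)     # dice outcomes (out of 36) that sum to k
    return Fraction(ways, 6)  # 6 * (ways / 36)

def robber_efficiency(k, damage, resource_multiplier=1):
    """Score for a tile with number k, damage coefficient D and multiplier R."""
    return expected_robber_uses(k) * damage * resource_multiplier

if __name__ == "__main__":
    # Hypothetical candidates: (label, number k, D, R)
    candidates = [
        ("ore 8", 8, 3, Fraction(1, 3)),
        ("wheat 6", 6, 2, Fraction(1, 4)),
        ("brick 4", 4, 4, Fraction(1, 3)),
    ]
    ranked = sorted(candidates,
                    key=lambda t: robber_efficiency(t[1], t[2], t[3]),
                    reverse=True)
    for label, k, d, r in ranked:
        print(f"{label}: {robber_efficiency(k, d, r)}")
```

Sorting by the score gives a rough ranking of where to send the robber. Since D and R are crude by design, the ordering is a starting point for judgment rather than a verdict.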
Of course, it is pretty unlikely you will show up to your next Catan game with a calculator, or do any of this analysis, but hopefully this article was a fun and interesting read. Next time, we’ll look at some other aspects of Catan, and hopefully more posts about other board games and possibly sports are to come in the future!
Acknowledgements: Thank you Ares for teaching me how to play Catan. Thank you Byron for listening to my ideas and helping me put this together.