Class 6

The Shadow of Tomorrow: How Repetition Breeds Cooperation

When the future matters enough, even rivals learn to cooperate — and game theory can explain exactly when and why.

In 1980, political scientist Robert Axelrod sent an unusual invitation to game theorists, economists, psychologists, sociologists, and computer scientists around the world. The challenge: submit a computer program to play a repeated Prisoner's Dilemma tournament. Each strategy would face every other strategy in a round-robin of 200-round matches, accumulating points according to the standard payoff matrix. The world's leading strategists submitted entries ranging from elaborate conditional programs to ruthlessly exploitative algorithms. The winner? A four-line program submitted by mathematical psychologist Anatol Rapoport. It was called Tit-for-Tat, and its strategy was almost childishly simple: cooperate on the first move, then do whatever your opponent did last round.

That a strategy so simple could defeat far more sophisticated competitors stunned the academic world. But the deeper lesson was more stunning still: in a world of repeated interaction, niceness wins. The shadow of the future had transformed the Prisoner's Dilemma from a tragedy of mutual betrayal into an ecosystem where cooperation could not only survive but flourish. This chapter tells the story of how.

From One-Shot Despair to Repeated Hope

Recall the devastating conclusion from our earlier chapters: in a one-shot Prisoner's Dilemma, the unique Nash equilibrium is mutual defection. Both players betray each other, both end up worse off than if they had cooperated, and rational self-interest is the engine of their shared misery. If you're playing just once, there is no escape. Your opponent's choice is already made (or will be, independently of yours), and defection strictly dominates cooperation regardless of what they do.

But how often in life do we truly interact with someone just once? Businesses compete in the same markets quarter after quarter. Nations negotiate treaties knowing they will face each other at the next summit. Neighbors share a fence for decades. Colleagues collaborate on project after project. The one-shot game is the exception; the repeated game is the rule. And repetition changes everything.

The key insight is deceptively simple: when you will encounter the same player again tomorrow, your choice today affects not just today's payoff but the entire future trajectory of the relationship. Defecting now might earn you a quick windfall, but it could provoke retaliation that costs you dearly for rounds to come. Cooperating now might mean sacrificing a short-term advantage, but it could sustain a mutually profitable partnership far into the future. The future casts a shadow over the present — and if that shadow is long enough, cooperation becomes rational even for the purely self-interested.

This idea had been circulating informally for decades. But it was Axelrod's tournament that transformed it from an abstract possibility into a vivid, empirically demonstrated reality (Axelrod & Hamilton, 1981).


The Tournament That Changed Everything

Axelrod's first tournament in 1980 attracted fourteen entries from scholars across multiple disciplines. Each strategy played every other strategy (and a copy of itself) in matches of exactly 200 rounds. The standard Prisoner's Dilemma payoffs applied: mutual cooperation earned each player 3 points (the "Reward"), mutual defection earned 1 point each (the "Punishment"), and if one defected while the other cooperated, the defector received 5 (the "Temptation") while the cooperator received 0 (the "Sucker's payoff").
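
The scoring mechanics described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not Axelrod's original tournament code, and the two strategies shown are just examples of entrants:

```python
# Minimal sketch of Axelrod-style match scoring (illustrative).
# Payoffs: R = 3, T = 5, P = 1, S = 0.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_history, their_history):
    # Cooperate on the first move, then mirror the opponent's last move.
    return their_history[-1] if their_history else "C"

def always_defect(my_history, their_history):
    return "D"

def play_match(strat_a, strat_b, rounds=200):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = strat_a(hist_a, hist_b)
        move_b = strat_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

# Tit-for-Tat vs Always Defect: TFT is exploited exactly once (0 vs 5),
# then both earn the punishment payoff for the remaining 199 rounds.
print(play_match(tit_for_tat, always_defect))  # (199, 204)
```

A round-robin tournament simply runs `play_match` for every pair of strategies and totals the scores.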

Tit-for-Tat's victory was remarkable not because of any clever exploitation but because of its elegant simplicity. Axelrod (1984) identified four properties that made it so effective: it was nice (never the first to defect), retaliatory (it answered defection with immediate defection), forgiving (it returned to cooperation as soon as the opponent did), and clear (opponents could quickly learn the rule they were facing and adapt to it).

The results revealed a striking pattern: the top eight strategies in the tournament were all nice, meaning none of them was ever the first to defect, while none of the lower-ranked entries was nice. This was not a coincidence. Nice strategies prospered because they could cooperate with each other, earning the steady stream of mutual cooperation payoffs (3 per round). Nasty strategies might occasionally exploit a cooperative opponent, but they also triggered retaliation and locked into cycles of mutual defection (1 per round) against each other.

Think About It

If all the top strategies were "nice," does that mean being nice is always optimal? Consider: what would happen if you entered an Always Cooperate strategy into a tournament where most other strategies were exploitative? What does niceness require to succeed?

Axelrod published the results and invited a second, larger tournament. This time, sixty-two entries arrived — many deliberately designed to beat Tit-for-Tat. Once again, Tit-for-Tat won. The strategies designed to exploit it couldn't gain enough against it to overcome the costs of mutual defection when they fought each other.

The Ecological Tournament

Axelrod then ran a thought experiment that made the results even more dramatic. He imagined a population where each strategy's representation was proportional to its success. In each "generation," strategies that scored well would grow in numbers, while poor performers would shrink. This ecological simulation modeled a kind of natural selection among strategies.

The results were stunning. Early on, exploitative strategies did reasonably well because there were plenty of cooperative strategies to prey upon. But as the exploitative strategies drove the cooperative pushovers toward extinction, they lost their food supply and began cannibalizing each other. Tit-for-Tat, meanwhile, steadily grew. By around generation 200, the population was dominated by Tit-for-Tat and other nice, retaliatory strategies. The exploiters had driven themselves to extinction (Axelrod, 1984).

Axelrod's ecological simulation showed that cooperative strategies gradually dominated as exploitative strategies lost their victims and turned on each other.

The lesson was profound: cooperation can invade a population of defectors and, once established, resist invasion by exploiters — provided cooperators are retaliatory enough to punish defection and forgiving enough to restore cooperation after conflicts. Later research by Nowak and Sigmund (2008) showed that while Tit-for-Tat has vulnerabilities (particularly to noise and random errors), the broader principle of reciprocal cooperation is remarkably robust across many variants.



Making It Formal: The Repeated Game

Axelrod's tournaments gave us a vivid demonstration. Now we need the formal machinery to understand why repetition enables cooperation and exactly when cooperation is sustainable. This requires us to define repeated games precisely and introduce the concept that makes everything work: the discount factor.

A repeated game (sometimes called a supergame) consists of a base game — called the stage game — that is played over and over by the same set of players. After each round, all players observe the actions that were taken, and then the stage game is played again. A player's strategy in the repeated game is a complete contingent plan: it specifies what to do in every round as a function of the entire history of play up to that point.

There are two versions worth distinguishing. In a finitely repeated game, players know the game will last exactly T rounds. In an infinitely repeated game, either the game literally continues forever, or — more realistically — there is some probability that the game continues after each round, and the players don't know when it will end. Both interpretations lead to the same mathematical structure, but they have very different strategic implications.

The Unraveling Problem

Consider a finitely repeated Prisoner's Dilemma lasting exactly 100 rounds. Can cooperation be sustained? Surprisingly, the answer is no — at least not by the logic of backward induction. In round 100, there is no future to worry about, so both players defect (it's a one-shot game). But if both will defect in round 100 regardless, then round 99 is effectively the last strategically relevant round — so both defect in round 99 too. This logic unravels all the way back to round 1. The unique subgame perfect equilibrium of a finitely repeated Prisoner's Dilemma (with complete information) is defection in every round.

This result sounds absurd — and in practice, people do cooperate in finitely repeated games. But the theoretical point is important: for cooperation to be sustained by purely rational, self-interested agents, the game must either be infinitely repeated or have an uncertain endpoint. The shadow of the future must extend indefinitely.

The Discount Factor

In an infinitely repeated game, we need a way to compare payoff streams that extend forever. The discount factor, denoted δ (delta), serves this purpose. It is a number between 0 and 1 that represents how much a player values future payoffs relative to present ones. A payoff of x received one round from now is worth δx today. A payoff received two rounds from now is worth δ²x, and so on.

The discount factor captures two related ideas. First, it reflects patience: a player with δ close to 1 is very patient and cares almost as much about future payoffs as present ones. A player with δ close to 0 is impatient and cares mostly about the immediate round. Second, δ can represent the probability that the game continues after each round: if there is a 10% chance the game ends after any given round, then δ = 0.9, because each future round arrives only with probability 0.9 and is therefore worth 90% as much as a payoff received for certain today (MIT OpenCourseWare, 2010).

Using the discount factor, we can calculate the present value of any payoff stream. If a player receives payoff π in every round forever, the present value is:

PV = π + δπ + δ²π + δ³π + … = π / (1 − δ)

This geometric series formula is the engine of the entire analysis. It lets us compare the value of sustained cooperation against the value of defecting today and suffering the consequences tomorrow.
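
As a quick sanity check, the closed form can be compared against a long truncated sum (a minimal sketch):

```python
# Sanity-check the geometric series: summing pi * delta**t over a long
# finite horizon should match the closed form pi / (1 - delta).
def present_value(pi, delta, horizon=10_000):
    return sum(pi * delta**t for t in range(horizon))

pi, delta = 3.0, 0.9
print(pi / (1 - delta))          # closed form, approximately 30
print(present_value(pi, delta))  # truncated sum, approximately 30
```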

Think About It

Before reading on, try to work out the following: In our standard Prisoner's Dilemma (Cooperate/Cooperate = 3 each, Defect/Cooperate = 5 for defector), what is the present value of cooperating forever versus defecting once and then being punished with mutual defection forever? At what discount factor are these two options exactly equal?



Trigger Strategies: Cooperation Through Threat

Now we can state precisely how cooperation works in repeated games. The mechanism is a trigger strategy: a strategy that begins by cooperating and switches to punishment if the other player ever defects. The punishment creates a credible threat that makes defection unprofitable — provided the future matters enough.

Grim Trigger

The simplest and most severe trigger strategy is the grim trigger (sometimes simply called the grim strategy). It works as follows: cooperate in every round until the opponent defects. After the first defection, defect in every round forever, with no possibility of forgiveness. One strike and you're out, permanently.

Let's verify that grim trigger can sustain cooperation. Consider two players both using grim trigger, with our standard payoffs (R = 3, T = 5, P = 1, S = 0). If both cooperate forever, each receives:

PV(cooperate) = 3 / (1 − δ)

Now suppose a player considers deviating: defecting today while the opponent cooperates (earning 5), but then being punished with mutual defection forever (earning 1 per round). The present value of this deviation is:

PV(defect) = 5 + δ · 1 / (1 − δ)

Cooperation is sustainable when the cooperation payoff exceeds the defection payoff:

3 / (1 − δ) ≥ 5 + δ / (1 − δ)

Solving: 3 / (1 − δ) − δ / (1 − δ) ≥ 5, which gives (3 − δ) / (1 − δ) ≥ 5, so 3 − δ ≥ 5 − 5δ, yielding 4δ ≥ 2, or δ ≥ 1/2.

This is the critical discount factor, often written δ*. When δ ≥ 1/2, cooperation is a subgame perfect Nash equilibrium of the infinitely repeated Prisoner's Dilemma. When δ < 1/2, the future doesn't matter enough to deter defection, and the only equilibrium is perpetual mutual defection — just as in the one-shot game. Friedman (1971) first established this result formally for oligopoly supergames, providing the mathematical foundation for understanding how cartels and other cooperative arrangements can be self-enforcing.
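
The threshold can be verified numerically. Running the same algebra with symbolic payoffs gives the general form δ* = (T − R) / (T − P), which equals 1/2 for our values (a sketch, using the chapter's payoffs):

```python
# Numeric check of the grim-trigger condition with the chapter's payoffs.
# General threshold: delta* = (T - R) / (T - P) = (5 - 3) / (5 - 1) = 1/2.
R, T, P = 3, 5, 1

def pv_cooperate(delta):
    # Mutual cooperation forever.
    return R / (1 - delta)

def pv_defect(delta):
    # Temptation payoff today, then mutual defection forever.
    return T + delta * P / (1 - delta)

for delta in (0.4, 0.5, 0.6):
    print(delta, pv_cooperate(delta) >= pv_defect(delta))
# 0.4 False, 0.5 True, 0.6 True
```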

Beyond Grim Trigger

Grim trigger is powerful but extreme. In practice, a single defection, which might be a mistake or a misunderstanding, triggers eternal punishment. More forgiving strategies can also sustain cooperation: Tit-for-Tat punishes a defection once and then mirrors whatever the opponent does next; Tit-for-Two-Tats retaliates only after two consecutive defections; and limited-punishment strategies defect for a fixed number of rounds after a defection, then return to cooperation.

Each of these strategies sustains cooperation through a different balance of retaliatory severity and forgiveness. The grim trigger requires the lowest critical discount factor (maximizing the range of δ values that support cooperation) because its punishment is so severe. More forgiving strategies require a somewhat higher δ to work, but they are more robust when mistakes happen.
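
The claim that grim trigger minimizes the critical discount factor can be illustrated with a quick numeric comparison. This is a standard textbook calculation sketched with the chapter's payoffs, not from the cited sources: against grim trigger the relevant deviation is defecting forever, while against Tit-for-Tat it is defecting once and then returning to cooperation, which costs one round of the sucker's payoff:

```python
# Critical discount factors compared: delta* = 1/2 for grim trigger,
# delta* = 2/3 for Tit-for-Tat (standard result, checked numerically).
R, T, P, S = 3, 5, 1, 0

def pv_coop(delta):
    return R / (1 - delta)

def pv_dev_grim(delta):
    # Defect forever: temptation once, then mutual defection.
    return T + delta * P / (1 - delta)

def pv_dev_tft(delta):
    # Round 1: T; round 2: S while Tit-for-Tat retaliates; then cooperation.
    return T + delta * S + delta**2 * R / (1 - delta)

for delta in (0.5, 0.6, 0.7):
    print(delta,
          pv_coop(delta) >= pv_dev_grim(delta),   # grim deters deviation?
          pv_coop(delta) >= pv_dev_tft(delta))    # TFT deters deviation?
# 0.5 True False, 0.6 True False, 0.7 True True
```

At δ = 0.6 grim trigger already deters defection, but Tit-for-Tat does not yet; only above δ = 2/3 do both work.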

Four trigger strategies compared: each balances punishment severity against forgiveness, trading off deterrence against robustness to errors.



The Folk Theorem: Anything Is Possible

We've shown that cooperation can be sustained in the repeated Prisoner's Dilemma when players are patient enough. But the story goes much further. The folk theorem — so named because the result was widely known among game theorists before anyone published a formal proof — makes a startling claim: in an infinitely repeated game with sufficiently patient players, virtually any payoff profile that gives each player more than their minimax value can be sustained as a Nash equilibrium.

What does this mean? The minimax value is the worst payoff a player can guarantee for themselves regardless of what others do. In our Prisoner's Dilemma, each player can guarantee at least 1 (by always defecting). The folk theorem says that any combination of average payoffs where each player gets more than 1 can be an equilibrium outcome: not just mutual cooperation (3, 3), but also asymmetric outcomes like (4, 1.5), achieved by alternating between mutual cooperation and one-sided defection, and even near-mutual-defection payoffs like (1.5, 1.5), as long as δ is close enough to 1.
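
The minimax value can be computed directly from the stage-game payoffs (a short sketch; the dictionary layout is just one convenient encoding):

```python
# Security (minimax) value: the lowest payoff the opponent can force on a
# player who best-responds. payoff[my_move][their_move] is my payoff.
payoff = {"C": {"C": 3, "D": 0}, "D": {"C": 5, "D": 1}}

minimax_value = min(
    max(payoff[mine][theirs] for mine in payoff)  # my best response...
    for theirs in ("C", "D")                      # ...to each opponent move
)
print(minimax_value)  # 1: always-defect holds the opponent to 1 per round

# One feasible, individually rational asymmetric profile: alternate
# outcomes (D, C) and (C, C), giving average payoffs per round of
avg = ((5 + 3) / 2, (0 + 3) / 2)
print(avg)  # (4.0, 1.5)
```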

Fudenberg and Maskin (1986) provided the definitive formal proof, establishing that for any feasible, individually rational payoff vector, there exists a discount factor close enough to 1 such that those payoffs can be achieved as a subgame perfect equilibrium. The proof is constructive: it shows how to build reward-and-punishment strategies that sustain any target payoff.

Think About It

The folk theorem tells us that cooperation can be an equilibrium — but so can many other outcomes, including continued defection. What determines which equilibrium actually occurs? If the theory predicts "almost anything is possible," is it really predicting anything at all? This is one of the deepest critiques of repeated game theory.

This is both the folk theorem's power and its weakness. It tells us cooperation can emerge, but it also tells us that defection, alternating exploitation, and countless other patterns are equally valid equilibria. The theory of repeated games is a theory of possibility, not of prediction. To predict which equilibrium will actually be selected, we need additional concepts — focal points, norms, institutions, or evolutionary dynamics — that go beyond the folk theorem itself. We'll explore some of these in Chapters 9 and 12.


Cooperation in the Wild

The theory of repeated games is not just an elegant abstraction. Its logic explains real cooperative arrangements that would otherwise be deeply puzzling — situations where self-interested actors cooperate without contracts, courts, or centralized enforcement.

OPEC and the Temptation to Cheat

The Organization of the Petroleum Exporting Countries (OPEC) is essentially a cartel: member nations agree to restrict oil production to keep prices high. But every member faces a Prisoner's Dilemma: each country benefits from the high price sustained by others' restraint, yet each could earn even more by secretly exceeding its quota while others hold the line. If everyone cheats, prices collapse and everyone suffers.

OPEC's history is a case study in the fragility of repeated-game cooperation. For years, Saudi Arabia acted as a "swing producer," absorbing production cuts to maintain the cartel price. But when cheating by other members became too costly, Saudi Arabia dramatically increased production in 1985–86, crashing oil prices. Griffin (1994) interprets this as a shift to a tit-for-tat strategy: Saudi Arabia punished excessive cheating to restore discipline, then returned to cooperation once other members fell into line. The episode illustrates how punishment phases — costly and disruptive though they are — can be essential for sustaining long-run cooperation.

Trench Warfare and the Live-and-Let-Live System

Perhaps the most remarkable real-world example of repeated-game cooperation comes from an unlikely setting: the trenches of World War I. Historian Tony Ashworth (1980) documented a pervasive system of informal truces that emerged between opposing units along the Western Front. Soldiers on both sides developed tacit agreements not to shoot to kill, particularly during meals, rest periods, and predictable daily routines.

How could cooperation emerge between soldiers whose explicit orders were to destroy each other? The answer is repetition. Unlike the mobile warfare of later conflicts, WWI trench warfare kept the same units facing each other for weeks or months at a time. Each side could observe the other's actions and respond in kind. If one side showed restraint, the other reciprocated. If one side escalated, the other retaliated. The repeated-game structure naturally produced Tit-for-Tat-like dynamics.

One British soldier described ritualized, harmless artillery exchanges in which shells arrived daily at the same time and were fired at the same targets, a routine so dependable that it continued unchanged even after units were relieved (Ashworth, 1980).

The military high command, recognizing the problem, eventually broke the live-and-let-live system by ordering unpredictable raids that destroyed the repeated-game structure. When soldiers no longer knew whether their current opponents would be there tomorrow, the shadow of the future shortened, and cooperation collapsed. This tragic outcome perfectly illustrates the theory: shorten the shadow, and defection becomes inevitable.

Environmental Agreements and Sovereign Nations

International environmental agreements face the same structural challenge. Nations must cooperate to reduce emissions, preserve fisheries, or protect the ozone layer. But each nation has an incentive to free-ride on others' sacrifices. There is no global government to enforce agreements. Cooperation must be self-enforcing — sustained, as in the repeated game, by the threat that defection today will unravel cooperation tomorrow.

The repeated-game framework explains both the successes and failures of international cooperation. The Montreal Protocol on ozone-depleting substances succeeded in part because the interaction was indefinitely repeated, the consequences of defection were observable, and the number of key players was small enough for reputational mechanisms to work. Climate agreements have been far harder because the shadow of the future is diluted by discounting (the worst effects are decades away), monitoring is difficult, and the number of players is very large — all factors that the discount factor framework predicts would undermine cooperation. We will return to this analysis in depth in Chapter 12.

Three real-world domains — cartels, warfare, and international agreements — where repeated-game logic explains the emergence and fragility of cooperation.

The Fragility of Cooperation

The results in this chapter are among game theory's most hopeful. But they come with important caveats that any honest assessment must acknowledge.

First, cooperation requires a long enough shadow. If the discount factor is too low — if players are too impatient, if the relationship might end soon, if monitoring is too slow — cooperation collapses. The theory is precise about this: below δ*, no trigger strategy can sustain cooperation. This explains why cooperation breaks down in end-of-relationship situations, in environments of rapid turnover, and when players cannot observe each other's actions clearly.

Second, the folk theorem's multiplicity problem is genuine. The theory says many equilibria are possible, but it doesn't tell us which one will emerge. In some environments, cooperative norms take hold; in others, exploitative ones do. History, culture, institutions, and focal points determine which equilibrium is selected — forces that lie outside the formal model.

Third, cooperation in repeated games is fragile to noise. If players sometimes make mistakes — pressing the wrong button, misreading a signal — then even gentle trigger strategies can spiral into mutual punishment. Tit-for-Tat, for all its virtues, can get trapped in alternating defection cycles when a single error occurs. More sophisticated strategies like Pavlov or generous Tit-for-Tat are more robust to noise but harder to analyze formally (Nowak & Sigmund, 2008).

Finally, these results assume players can observe each other's actions perfectly. In many real-world settings — international relations, complex business environments, online markets — monitoring is imperfect. When you can't tell whether your partner defected or just got unlucky, the logic of trigger strategies becomes much harder to apply. Imperfect monitoring is one of the frontier topics of modern game theory research.

Despite these caveats, the central message endures: repetition creates the possibility of cooperation among the self-interested. This is one of social science's genuinely important insights. It tells us that trust, reciprocity, and cooperation are not naive sentiments but can be the strategic choices of rational agents who understand that today's decisions shape tomorrow's relationships.

Key Takeaways

Repetition transforms the Prisoner's Dilemma: when the same players meet again and again, cooperation can be rational even for purely self-interested agents. Axelrod's tournaments showed that nice, retaliatory, forgiving, and clear strategies such as Tit-for-Tat thrive in repeated play, and his ecological simulation showed that exploiters destroy their own food supply. Backward induction rules out cooperation when the endpoint is known, so the shadow of the future must extend indefinitely. The discount factor δ measures how much the future matters, a perpetual payoff π is worth π / (1 − δ) today, and grim trigger sustains cooperation whenever δ ≥ 1/2 in our standard game. Finally, the folk theorem shows that patience makes many outcomes sustainable, which means repeated game theory identifies possibilities rather than predicting a unique result.

Looking Ahead

We've seen that cooperation can emerge when purely self-interested players interact repeatedly. But what happens when players have private information — when you don't know your opponent's payoffs, their type, or even whether they're rational? Chapter 7 introduces Bayesian games and incomplete information, where the challenge shifts from sustaining cooperation to figuring out who you're dealing with. The discount factor and repeated-game logic we've built here will return in Chapter 10's bargaining models and Chapter 12's analysis of climate agreements.

References

Ashworth, T. (1980). Trench warfare 1914–1918: The live and let live system. Holmes & Meier.

Axelrod, R. (1984). The evolution of cooperation. Basic Books.

Axelrod, R., & Hamilton, W. D. (1981). The evolution of cooperation. Science, 211(4489), 1390–1396. https://doi.org/10.1126/science.7466396

Friedman, J. W. (1971). A non-cooperative equilibrium for supergames. The Review of Economic Studies, 38(1), 1–12. https://doi.org/10.2307/2296617

Fudenberg, D., & Maskin, E. (1986). The folk theorem in repeated games with discounting or with incomplete information. Econometrica, 54(3), 533–554. https://scholar.harvard.edu/files/maskin/files/folk_theorem_in_repeated_games_with_discounting_or_incomplete_information.pdf

Griffin, J. M. (1994). The 1985–86 oil price collapse and afterwards: What does game theory add? Economic Inquiry, 32(4), 543–561. https://doi.org/10.1111/j.1465-7295.1994.tb01350.x

MIT OpenCourseWare. (2010). Game theory with engineering applications: Lecture 15 — Repeated games. Massachusetts Institute of Technology. https://ocw.mit.edu/courses/6-254-game-theory-with-engineering-applications-spring-2010/

Nowak, M., & Sigmund, K. (2008). Tit-for-tat or win-stay, lose-shift? Journal of Theoretical Biology, 253(1), 129–136. https://pmc.ncbi.nlm.nih.gov/articles/PMC2460568/