module-ii/class-06

The Shadow of Tomorrow
How Repetition Builds Cooperation

Axelrod's tournaments, the discount factor delta, trigger strategies, and the folk theorem — why repeated interaction can convert the Prisoner's Dilemma from a tragedy into an ecosystem.

22 min read · 8 cited works

When the future matters enough, even rivals learn to cooperate — and game theory can explain exactly when and why.

In 1980, political scientist Robert Axelrod sent an unusual invitation to game theorists, economists, psychologists, sociologists, and computer scientists around the world. The challenge: submit a computer program to play a repeated Prisoner's Dilemma tournament. Each strategy would face every other strategy in a round-robin of two-hundred-round matches, accumulating points according to the standard payoff matrix. The world's leading strategists submitted entries ranging from elaborate conditional programs to ruthlessly exploitative algorithms. The winner? A four-line program submitted by mathematical psychologist Anatol Rapoport. It was called TIT-FOR-TAT, and its strategy was almost childishly simple: cooperate on the first move, then do whatever your opponent did last round.

That a strategy so simple could defeat far more sophisticated competitors stunned the academic world. But the deeper lesson was more stunning still: in a world of repeated interaction, niceness wins. The shadow of the future had transformed the Prisoner's Dilemma from a tragedy of mutual betrayal into an ecosystem where cooperation could not only survive but flourish. This chapter tells the story of how.

From One-Shot Tragedy to Repeated Game

Recall the devastating conclusion from our earlier chapters: in a one-shot Prisoner's Dilemma, the unique Nash equilibrium is mutual defection. Both players betray each other, both end up worse off than if they had cooperated, and rational self-interest is the engine of their shared misery. If you're playing just once, there is no escape. Your opponent's choice is already made, or will be, independently of yours, and defection strictly dominates cooperation regardless of what they do.

But how often in life do we truly interact with someone just once? Businesses compete in the same markets quarter after quarter. Nations negotiate treaties knowing they will face each other at the next summit. Neighbors share a fence for decades. Colleagues collaborate on project after project. The one-shot game is the exception; the repeated game is the rule. And repetition changes everything.

The key insight is deceptively simple: when you will encounter the same player again tomorrow, your choice today affects not just today's payoff but the entire future trajectory of the relationship. Defecting now might earn you a quick windfall, but it could provoke retaliation that costs you dearly for rounds to come. Cooperating now might mean sacrificing a short-term advantage, but it could sustain a mutually profitable partnership far into the future. The future casts a shadow over the present — and if that shadow is long enough, cooperation becomes rational even for the purely self-interested.

This idea had been circulating informally for decades. But it was Axelrod's tournament that transformed it from an abstract possibility into a vivid, empirically demonstrated reality, as Axelrod and Hamilton described in 1981.

Axelrod's First Tournament · 1980

Axelrod's first tournament in 1980 attracted fourteen entries from scholars across multiple disciplines. Each strategy played every other strategy, and a copy of itself, in matches of exactly two hundred rounds. The standard Prisoner's Dilemma payoffs applied: mutual cooperation earned each player three points, the Reward; mutual defection earned one point each, the Punishment; and if one defected while the other cooperated, the defector received five, the Temptation, while the cooperator received zero, the Sucker's payoff.
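The scoring rules above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not Axelrod's tournament code: the names PAYOFFS, play_match, tit_for_tat, and always_defect are introduced here for illustration, and always_defect is a hypothetical opponent rather than one of the actual entries.

```python
# Payoffs from the text: R=3 (Reward), P=1 (Punishment), T=5 (Temptation), S=0 (Sucker).
# Keys are (my_move, their_move); values are (my_points, their_points).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("D", "D"): (1, 1),
    ("D", "C"): (5, 0),
    ("C", "D"): (0, 5),
}

def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    """Hypothetical not-nice opponent, used only for illustration."""
    return "D"

def play_match(strategy_a, strategy_b, rounds=200):
    """Play a fixed-length match and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pts_a, pts_b = PAYOFFS[(move_a, move_b)]
        score_a += pts_a
        score_b += pts_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play_match(tit_for_tat, tit_for_tat))    # (600, 600)
print(play_match(tit_for_tat, always_defect))  # (199, 204)
```

In this sketch Tit-for-Tat loses its match against the permanent defector by five points, yet two copies of it earn three times as much as two permanent defectors would. A tournament total rewards the second fact far more than it penalizes the first.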

Tit-for-Tat's victory was remarkable not because of any clever exploitation but because of its elegant simplicity. Axelrod, in 1984, identified four properties that made it so effective:

Nice · Retaliatory · Forgiving · Clear

Nice: It never defected first. It started every relationship with an act of trust. Retaliatory: It immediately punished defection by defecting on the next round. It was not a pushover. Forgiving: After retaliating once, it was willing to return to cooperation if the opponent did. It did not hold grudges. Clear: Its pattern was so simple that opponents could quickly learn what to expect, reducing uncertainty and enabling coordination.

The results revealed a striking pattern: the top eight strategies in the tournament were all nice — meaning none of them were the first to defect. The bottom six were all not nice. This was not a coincidence. Nice strategies prospered because they could cooperate with each other, earning the steady stream of mutual cooperation payoffs, three per round. Nasty strategies might occasionally exploit a cooperative opponent, but they also triggered retaliation and locked into cycles of mutual defection, one per round, against each other.

Fig. 1 Axelrod's 1980 round-robin, sorted. The top half of the leaderboard is unbroken by a single not-nice strategy. The four properties of TIT-FOR-TAT — nice, retaliatory, forgiving, clear — turned out to be exactly the right virtues for an iterated prisoner's dilemma played against opponents you've never met.

The Second Tournament & the Ecological Sim

Axelrod published the results and invited a second, larger tournament. This time, sixty-two entries arrived — many deliberately designed to beat Tit-for-Tat. Once again, Tit-for-Tat won. The strategies designed to exploit it couldn't gain enough against it to overcome the costs of mutual defection when they fought each other.

Axelrod then ran a thought experiment that made the results even more dramatic. He imagined a population where each strategy's representation was proportional to its success. In each generation, strategies that scored well would grow in numbers, while poor performers would shrink. This ecological simulation modeled a kind of natural selection among strategies.

The results were stunning. Early on, exploitative strategies did reasonably well because there were plenty of cooperative strategies to prey upon. But as the exploitative strategies drove down the cooperative pushovers, they lost their food supply and began cannibalizing each other. Tit-for-Tat, meanwhile, steadily grew. By around generation two hundred, the population was dominated by Tit-for-Tat and other nice, retaliatory strategies. The exploiters had driven themselves to extinction, as Axelrod documented in 1984.
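The dynamic described above can be imitated with a toy model. The sketch below is an assumption-laden illustration, not a reconstruction of Axelrod's simulation: it uses only three strategies (Tit-for-Tat plus two hypothetical stand-ins, an unconditional cooperator ALLC and an unconditional defector ALLD), and it assumes a simple update rule in which each strategy's share grows in proportion to its average score against the current population.

```python
# Per-match totals over 200 rounds for three toy strategies, derived from the
# payoffs in the text (R=3, P=1, T=5, S=0). SCORES[a][b] is a's total against b.
SCORES = {
    "TFT":  {"TFT": 600, "ALLC": 600,  "ALLD": 199},
    "ALLC": {"TFT": 600, "ALLC": 600,  "ALLD": 0},
    "ALLD": {"TFT": 204, "ALLC": 1000, "ALLD": 200},
}

def next_generation(shares):
    """Rescale each share by its average score relative to the population mean."""
    fitness = {a: sum(shares[b] * SCORES[a][b] for b in shares) for a in shares}
    mean_fitness = sum(shares[a] * fitness[a] for a in shares)
    return {a: shares[a] * fitness[a] / mean_fitness for a in shares}

def simulate(generations=200):
    """Start from equal shares and return the list of share dicts per generation."""
    shares = {name: 1 / 3 for name in SCORES}
    history = [shares]
    for _ in range(generations):
        shares = next_generation(shares)
        history.append(shares)
    return history

history = simulate()
for gen in (0, 2, 20, 200):
    print(gen, {name: round(share, 3) for name, share in history[gen].items()})
```

Even in this stripped-down version the defector's share rises for the first couple of generations, while there are still unconditional cooperators to exploit, and then collapses as its prey thins out and Tit-for-Tat takes over.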

The lesson was profound: cooperation can invade a population of defectors and, once established, resist invasion by exploiters — provided cooperators are retaliatory enough to punish defection and forgiving enough to restore cooperation after conflicts. Later research by Nowak and Sigmund in 2008 showed that while Tit-for-Tat has vulnerabilities, particularly to noise and random errors, the broader principle of reciprocal cooperation is remarkably robust across many variants.


Formal Machinery · Repeated Games & the Discount Factor

Axelrod's tournaments gave us a vivid demonstration. Now we need the formal machinery to understand why repetition enables cooperation and exactly when cooperation is sustainable. This requires us to define repeated games precisely and introduce the concept that makes everything work: the DISCOUNT FACTOR.

A repeated game, sometimes called a supergame, consists of a base game — called the stage game — that is played over and over by the same set of players. After each round, all players observe the actions that were taken, and then the stage game is played again. A player's strategy in the repeated game is a complete contingent plan: it specifies what to do in every round as a function of the entire history of play up to that point.

There are two versions worth distinguishing. In a finitely repeated game, players know the game will last exactly T rounds. In an infinitely repeated game, either the game literally continues forever, or, more realistically, there is some probability that the game continues after each round, and the players don't know when it will end. Both interpretations of the infinite case lead to the same mathematical structure, but the finite and infinite versions have very different strategic implications.

The Backward-Induction Trap

Consider a finitely repeated Prisoner's Dilemma lasting exactly one hundred rounds. Can cooperation be sustained? Surprisingly, the answer is no — at least not by the logic of backward induction. In round one hundred, there is no future to worry about, so both players defect. It's a one-shot game. But if both will defect in round one hundred regardless, then round ninety-nine is effectively the last strategically relevant round — so both defect in round ninety-nine too. This logic unravels all the way back to round one. The unique subgame perfect equilibrium of a finitely repeated Prisoner's Dilemma with complete information is defection in every round.

This result sounds absurd — and in practice, people do cooperate in finitely repeated games. But the theoretical point is important: for cooperation to be sustained by purely rational, self-interested agents, the game must either be infinitely repeated or have an uncertain endpoint. The shadow of the future must extend indefinitely.

The Discount Factor · Delta

In an infinitely repeated game, we need a way to compare payoff streams that extend forever. The discount factor, denoted delta, serves this purpose. It is a number between zero and one that represents how much a player values future payoffs relative to present ones. A payoff of x received one round from now is worth delta times x today. A payoff received two rounds from now is worth delta squared times x, and so on.

The discount factor captures two related ideas. First, it reflects patience: a player with delta close to one is very patient and cares almost as much about future payoffs as present ones. A player with delta close to zero is impatient and cares mostly about the immediate round. Second, delta can represent the probability that the game continues after each round: if there is a ten percent chance the game ends after any given round, then delta equals zero point nine, because a payoff one round ahead arrives with only ninety percent probability and so is worth ninety percent of the same payoff received for certain today, as explained by MIT OpenCourseWare in 2010.

Using the discount factor, we can calculate the PRESENT VALUE of any payoff stream. If a player receives payoff pi in every round forever, the present value is pi divided by one minus delta. This geometric series formula is the engine of the entire analysis. It lets us compare the value of sustained cooperation against the value of defecting today and suffering the consequences tomorrow.
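As a quick numerical check on the geometric series formula, the sketch below compares the closed form with a long truncated sum. The function names present_value and truncated_value are assumptions introduced for this example.

```python
def present_value(payoff, delta):
    """Closed-form value of receiving `payoff` every round forever."""
    if not 0 <= delta < 1:
        raise ValueError("delta must satisfy 0 <= delta < 1")
    return payoff / (1 - delta)

def truncated_value(payoff, delta, rounds):
    """Sum of the first `rounds` discounted payoffs, for comparison."""
    return sum(payoff * delta ** t for t in range(rounds))

print(present_value(3, 0.9))         # approximately 30
print(truncated_value(3, 0.9, 500))  # approaches the same value
```

Under the continuation-probability reading, delta equal to zero point nine means the game is expected to last ten rounds in total, so three points per round is worth about thirty, which matches the closed form.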

Trigger Strategies & the Critical Discount Factor

Now we can state precisely how cooperation works in repeated games. The mechanism is a TRIGGER STRATEGY: a strategy that begins by cooperating and switches to punishment if the other player ever defects. The punishment creates a credible threat that makes defection unprofitable — provided the future matters enough.

The simplest and most severe trigger strategy is the GRIM TRIGGER, also called the grim strategy. It works as follows: cooperate in every round until the opponent defects. After the first defection, defect in every round forever, with no possibility of forgiveness. One strike and you're out, permanently.

Let's verify that grim trigger can sustain cooperation. Consider two players both using grim trigger, with our standard payoffs: R equals three, T equals five, P equals one, S equals zero. If both cooperate forever, each receives a present value of three divided by one minus delta.

Now suppose a player considers deviating: defecting today while the opponent cooperates, earning five, but then being punished with mutual defection forever, earning one per round. The present value of this deviation is five plus delta times one divided by one minus delta.

Cooperation is sustainable when the payoff from cooperating is at least as large as the payoff from deviating. Working through the algebra: three divided by one minus delta must be greater than or equal to five plus delta divided by one minus delta. Multiplying both sides by one minus delta gives three greater than or equal to five minus four delta, so delta must be greater than or equal to one-half.

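This is the CRITICAL DISCOUNT FACTOR, often written delta star. When delta is greater than or equal to one-half, mutual cooperation backed by grim trigger is a subgame perfect Nash equilibrium of the infinitely repeated Prisoner's Dilemma. When delta is less than one-half, the one-time gain from defecting outweighs the discounted cost of punishment, and because permanent defection is the harshest punishment available in this game, no gentler strategy can sustain uninterrupted mutual cooperation either.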
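The same comparison can be written out in the general payoff symbols T, R, and P defined above. The block below is a worked check of the grim-trigger condition; the closed form for delta star is a standard rearrangement of that inequality, not an additional result from the sources cited in this chapter.

```latex
\begin{align*}
\frac{R}{1-\delta} &\ge T + \frac{\delta P}{1-\delta} \\
R &\ge T(1-\delta) + \delta P \\
\delta\,(T - P) &\ge T - R \\
\delta &\ge \delta^{*} = \frac{T-R}{T-P} = \frac{5-3}{5-1} = \frac{1}{2}
\end{align*}
```

Read this way, delta star is the ratio of the one-round gain from cheating, T minus R, to the per-round gap between the temptation payoff and the punishment payoff, T minus P.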

Fig. 2 Two curves, one tipping point. The present value of cooperating forever and the present value of defecting once and being punished forever cross at δ* = 1/2. To the left of that crossing, the future is too short to be worth investing in; to the right, the shadow of tomorrow buys today's cooperation.

Beyond Grim · Forgiving Strategies

Grim trigger is powerful but extreme. In practice, a single defection — which might be a mistake or misunderstanding — triggers eternal punishment. More forgiving strategies can also sustain cooperation:

Tit-for-Tat

Cooperate initially, then copy the opponent's last move. Punishes defection but forgives immediately once the opponent returns to cooperation.

Tit-for-Two-Tats

Only punish after the opponent defects twice in a row. Even more forgiving, and more robust to occasional mistakes.

Pavlov · Win-Stay, Lose-Shift

If the last round's outcome was good — meaning you got three or five — repeat your action; if bad — meaning you got zero or one — switch. Nowak and Sigmund in 2008 showed this strategy can outperform Tit-for-Tat in noisy environments because it can correct mutual defection without outside help.

Each of these strategies sustains cooperation through a different balance of retaliatory severity and forgiveness. The grim trigger requires the lowest critical discount factor, maximizing the range of delta values that support cooperation, because its punishment is so severe. More forgiving strategies require a somewhat higher delta to work, but they are more robust when mistakes happen.
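To see how the three forgiving strategies differ after a mistake, the sketch below has each one play against a copy of itself and forces a single accidental defection by player A. It is a minimal illustration under stated assumptions: the function names and the self_play_with_slip helper are hypothetical, and the "slip" is injected at a fixed round rather than drawn at random.

```python
PAYOFFS = {("C", "C"): (3, 3), ("D", "D"): (1, 1),
           ("D", "C"): (5, 0), ("C", "D"): (0, 5)}

def tit_for_tat(mine, theirs):
    """Cooperate first, then copy the opponent's last move."""
    return "C" if not theirs else theirs[-1]

def tit_for_two_tats(mine, theirs):
    """Defect only if the opponent defected in both of the last two rounds."""
    return "D" if theirs[-2:] == ["D", "D"] else "C"

def pavlov(mine, theirs):
    """Win-stay, lose-shift: repeat after a payoff of 3 or 5, switch after 0 or 1."""
    if not mine:
        return "C"
    my_payoff = PAYOFFS[(mine[-1], theirs[-1])][0]
    if my_payoff in (3, 5):
        return mine[-1]
    return "C" if mine[-1] == "D" else "D"

def self_play_with_slip(strategy, rounds=8, slip_round=2):
    """Both players use `strategy`; player A's move is forced to D once."""
    a, b = [], []
    for t in range(rounds):
        move_a = strategy(a, b)
        move_b = strategy(b, a)
        if t == slip_round:
            move_a = "D"  # the one-off mistake
        a.append(move_a)
        b.append(move_b)
    return "".join(a), "".join(b)

print(self_play_with_slip(tit_for_tat))       # ('CCDCDCDC', 'CCCDCDCD')
print(self_play_with_slip(tit_for_two_tats))  # ('CCDCCCCC', 'CCCCCCCC')
print(self_play_with_slip(pavlov))            # ('CCDDCCCC', 'CCCDCCCC')
```

After the slip, two Tit-for-Tat players echo the defection back and forth indefinitely, Tit-for-Two-Tats ignores it, and Pavlov passes through one round of mutual defection before both players return to cooperation.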

The Folk Theorem

We've shown that cooperation can be sustained in the repeated Prisoner's Dilemma when players are patient enough. But the story goes much further. The FOLK THEOREM, so named because the result was widely known among game theorists before anyone published a formal proof, makes a startling claim: in an infinitely repeated game with sufficiently patient players, virtually any feasible payoff profile that gives each player more than their minimax value can be sustained as a Nash equilibrium.

What does this mean? The minimax value is the lowest payoff the other players can force on a player, given that the player responds as well as possible. In our Prisoner's Dilemma, each player can guarantee at least one by always defecting. The folk theorem says that any feasible combination of average payoffs where each player gets more than one can be an equilibrium outcome: not just mutual cooperation at three, three, but also asymmetric outcomes like three point five, two, or even near-mutual-defection payoffs like one point five, one point five, as long as delta is close enough to one.

Fudenberg and Maskin, in 1986, provided the definitive formal proof, establishing that for any feasible, individually rational payoff vector, there exists a discount factor close enough to one such that those payoffs can be achieved as a subgame perfect equilibrium. The proof is constructive: it shows how to build reward-and-punishment strategies that sustain any target payoff.
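The two conditions in the theorem, feasibility and individual rationality, can be checked mechanically for the Prisoner's Dilemma payoffs used in this chapter. The sketch below is only a membership test for the set of candidate payoffs; it does not construct the supporting strategies or the required delta. The function names are assumptions, and the inequalities are derived here from the four stage-game outcomes (3, 3), (1, 1), (5, 0), and (0, 5).

```python
MINIMAX = 1  # each player can secure 1 per round by always defecting

def is_feasible(u1, u2):
    """True if (u1, u2) lies in the convex hull of (3,3), (1,1), (5,0), (0,5)."""
    return (
        3 * u1 + 2 * u2 <= 15      # on or below the edge from (3,3) to (5,0)
        and 2 * u1 + 3 * u2 <= 15  # on or below the edge from (3,3) to (0,5)
        and u1 + 4 * u2 >= 5       # on or above the edge from (1,1) to (5,0)
        and 4 * u1 + u2 >= 5       # on or above the edge from (1,1) to (0,5)
    )

def is_individually_rational(u1, u2):
    """True if both players get strictly more than their minimax value."""
    return u1 > MINIMAX and u2 > MINIMAX

def is_folk_theorem_candidate(u1, u2):
    """Feasible and individually rational average payoffs."""
    return is_feasible(u1, u2) and is_individually_rational(u1, u2)

print(is_folk_theorem_candidate(3, 3))      # True
print(is_folk_theorem_candidate(1.5, 1.5))  # True
print(is_folk_theorem_candidate(5, 0))      # False
```

The last case fails because the exploited player would earn less than the one point per round that permanent defection secures, so no threat of future punishment could keep that player in line.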

This is both the folk theorem's power and its weakness. It tells us cooperation can emerge, but it also tells us that defection, alternating exploitation, and countless other patterns are equally valid equilibria. The theory of repeated games is a theory of possibility, not of prediction. To predict which equilibrium will actually be selected, we need additional concepts — focal points, norms, institutions, or evolutionary dynamics — that go beyond the folk theorem itself. We'll explore some of these in later chapters.

Three Real Cartels in the Wild

The theory of repeated games is not just an elegant abstraction. Its logic explains real cooperative arrangements that would otherwise be deeply puzzling — situations where self-interested actors cooperate without contracts, courts, or centralized enforcement.

OPEC · Saudi Arabia & the 1985–86 Punishment Phase

The Organization of the Petroleum Exporting Countries, OPEC, is essentially a cartel: member nations agree to restrict oil production to keep prices high. But every member faces a Prisoner's Dilemma: each country benefits from the high price sustained by others' restraint, yet each could earn even more by secretly exceeding its quota while others hold the line. If everyone cheats, prices collapse and everyone suffers.

OPEC's history is a case study in the fragility of repeated-game cooperation. For years, Saudi Arabia acted as a swing producer, absorbing production cuts to maintain the cartel price. But when cheating by other members became too costly, Saudi Arabia dramatically increased production in 1985 to 1986, crashing oil prices. Griffin, in 1994, interprets this as a shift to a tit-for-tat strategy: Saudi Arabia punished excessive cheating to restore discipline, then returned to cooperation once other members fell into line. The episode illustrates how punishment phases — costly and disruptive though they are — can be essential for sustaining long-run cooperation.

The Western Front · 1914–1918

Perhaps the most remarkable real-world example of repeated-game cooperation comes from an unlikely setting: the trenches of World War One. Historian Tony Ashworth, in 1980, documented a pervasive system of informal truces that emerged between opposing units along the Western Front. Soldiers on both sides developed tacit agreements not to shoot to kill, particularly during meals, rest periods, and predictable daily routines.

How could cooperation emerge between soldiers whose explicit orders were to destroy each other? The answer is repetition. Unlike the mobile warfare of later conflicts, World War One trench warfare kept the same units facing each other for weeks or months at a time. Each side could observe the other's actions and respond in kind. If one side showed restraint, the other reciprocated. If one side escalated, the other retaliated. The repeated-game structure naturally produced Tit-for-Tat-like dynamics.

In one sector the weights arrived daily at the same time, and the weights were fired at the same targets, at the same time. After being relieved, the weights continued at the same time, at the same targets.
British soldier on ritualised artillery exchanges, Ashworth (1980)

The military high command, recognizing the problem, eventually broke the live-and-let-live system by ordering unpredictable raids that destroyed the repeated-game structure. When soldiers no longer knew whether their current opponents would be there tomorrow, the shadow of the future shortened, and cooperation collapsed. This tragic outcome perfectly illustrates the theory: shorten the shadow, and defection becomes inevitable.

Climate · Why the Shadow Is Diluted

International environmental agreements face the same structural challenge. Nations must cooperate to reduce emissions, preserve fisheries, or protect the ozone layer. But each nation has an incentive to free-ride on others' sacrifices. There is no global government to enforce agreements. Cooperation must be self-enforcing — sustained, as in the repeated game, by the threat that defection today will unravel cooperation tomorrow.

The repeated-game framework explains both the successes and failures of international cooperation. The Montreal Protocol on ozone-depleting substances succeeded in part because the interaction was indefinitely repeated, the consequences of defection were observable, and the number of key players was small enough for reputational mechanisms to work. Climate agreements have been far harder because the shadow of the future is diluted by discounting — the worst effects are decades away — monitoring is difficult, and the number of players is very large. All of these are factors that the discount factor framework predicts would undermine cooperation. We will return to this analysis in depth in a later chapter.

Limits of the Theory

The results in this chapter are among game theory's most hopeful. But they come with important caveats that any honest assessment must acknowledge.

First, cooperation requires a long enough shadow. If the discount factor is too low — if players are too impatient, if the relationship might end soon, if monitoring is too slow — cooperation collapses. The theory is precise about this: below delta star, no trigger strategy can sustain cooperation. This explains why cooperation breaks down in end-of-relationship situations, in environments of rapid turnover, and when players cannot observe each other's actions clearly.

Second, the folk theorem's multiplicity problem is genuine. The theory says many equilibria are possible, but it doesn't tell us which one will emerge. In some environments, cooperative norms take hold; in others, exploitative ones do. History, culture, institutions, and focal points determine which equilibrium is selected — forces that lie outside the formal model.

Third, cooperation in repeated games is fragile to noise. If players sometimes make mistakes — pressing the wrong button, misreading a signal — then even gentle trigger strategies can spiral into mutual punishment. Tit-for-Tat, for all its virtues, can get trapped in alternating defection cycles when a single error occurs. More sophisticated strategies like Pavlov or generous Tit-for-Tat are more robust to noise but harder to analyze formally, as Nowak and Sigmund noted in 2008.

Finally, these results assume players can observe each other's actions perfectly. In many real-world settings — international relations, complex business environments, online markets — monitoring is imperfect. When you can't tell whether your partner defected or just got unlucky, the logic of trigger strategies becomes much harder to apply. Imperfect monitoring is one of the frontier topics of modern game theory research.

Despite these caveats, the central message endures: repetition creates the possibility of cooperation among the self-interested. This is one of social science's genuinely important insights. It tells us that trust, reciprocity, and cooperation are not naive sentiments but can be the strategic choices of rational agents who understand that today's decisions shape tomorrow's relationships.

Key Takeaways

In a one-shot Prisoner's Dilemma, defection is the only Nash equilibrium. In an infinitely repeated Prisoner's Dilemma, cooperation can be sustained as an equilibrium when the discount factor is high enough.
Axelrod's tournaments (Axelrod, 1984) demonstrated that simple, nice, retaliatory, and forgiving strategies like Tit-for-Tat outperform sophisticated exploitative strategies in iterated play.
The discount factor delta measures how much players value the future relative to the present. It can represent patience or the probability that the game continues.
Trigger strategies — especially grim trigger — sustain cooperation through the credible threat of future punishment.
For the standard Prisoner's Dilemma payoffs used here (T equals five, R equals three, P equals one, S equals zero), the critical discount factor under grim trigger is delta-star equals one-half.
The folk theorem (Fudenberg & Maskin, 1986) establishes that virtually any feasible, individually rational payoff can be sustained as an equilibrium in infinitely repeated games with sufficiently patient players — a powerful but also permissive result.
Real-world cooperation in OPEC cartels (Griffin, 1994), World War One trenches (Ashworth, 1980), and international agreements follows the logic of repeated games: repetition, observability, and patient players enable cooperation; disruptions to any of these factors cause it to collapse.
Cooperation in repeated games is fragile: it depends on a long enough shadow of the future, sufficient observability, and some mechanism for handling noise and mistakes (Nowak & Sigmund, 2008).

looking ahead · class-07 — Hidden Knowledge

We've seen that cooperation can emerge when purely self-interested players interact repeatedly. But what happens when players have private information — when you don't know your opponent's payoffs, their type, or even whether they're rational? The next chapter introduces Bayesian games and incomplete information, where the challenge shifts from sustaining cooperation to figuring out who you're dealing with. The discount factor and repeated-game logic we've built here will return in later analyses of bargaining models and climate agreements.

References

Ashworth, T. (1980). Trench Warfare 1914–1918: The Live and Let Live System. Holmes & Meier.

Axelrod, R. (1984). The Evolution of Cooperation. Basic Books.

Axelrod, R., & Hamilton, W. D. (1981). The evolution of cooperation. Science, 211(4489), 1390–1396.

Fudenberg, D., & Maskin, E. (1986). The folk theorem in repeated games with discounting or with incomplete information. Econometrica, 54(3), 533–554.

Griffin, J. M. (1994). OPEC and world oil prices: Is the genie back in the bottle? Energy Studies Review, 6(3), 211–218.

MIT OpenCourseWare. (2010). 14.12 Economic Applications of Game Theory · Lecture notes on repeated games. MIT.

Nowak, M. A., & Sigmund, K. (2008). Evolutionary dynamics of biological games. In Evolutionary Dynamics. Harvard University Press.

Rapoport, A. (1980). Tit-for-Tat (FORTRAN entry, Axelrod tournament). University of Toronto.