module-iii/class-09

Evolution's Strategists
Game Theory in Biology

When the players have no brains. ESS, replicator dynamics, kin selection — and the discovery that Nash equilibrium is older than rationality.

22 min read · 8 cited works

Two male red deer stand in a Scottish glen, antlers locked, muscles straining. They have been fighting for twenty minutes. One suddenly disengages, turns, and walks away, conceding the harem of females to his rival. Why did he quit? He wasn't outmatched; his antlers were equally impressive. But here's the deeper puzzle: why didn't this escalate further? Red deer could drive their antlers into a rival's unprotected flank, a move that would almost certainly win the contest. Yet across millions of years and billions of encounters, this almost never happens. Instead, stags follow an elaborate ritual: they roar, they walk in parallel, they lock antlers and push. The fight is fierce, but it's also restrained.

For decades, biologists assumed this restraint existed "for the good of the species." Then, in 1973, the evolutionary biologist John Maynard Smith and an eccentric polymath named George Price asked a different, game-theoretic question: what if each individual animal is playing a strategy, and natural selection is the force that determines which strategies survive? The answer would revolutionize evolutionary biology and, in the process, reveal that the logic of Nash equilibrium runs far deeper than human rationality. It is woven into the fabric of life itself.

Strategy Without a Brain

For the first eight chapters of this course, we have built game theory on a bedrock assumption: players are rational. They have preferences, they form beliefs about opponents, and they choose strategies that maximize their expected payoffs. This framework has taken us remarkably far, from the Prisoner's Dilemma to repeated games, from mixed strategies to mechanism design. But now we encounter a profound challenge: strategic behavior is everywhere in nature, yet the players (bacteria, insects, birds, bats) have no capacity for rational deliberation whatsoever.

Consider just a few examples. Female digger wasps decide whether to dig their own burrows or invade occupied ones, and the population-level proportions match the predictions of mixed-strategy Nash equilibrium, as Maynard Smith reported in 1982. Male dung beetles adopt one of two strategies: growing large horns to fight rivals, or remaining hornless to sneak copulations, at frequencies that equalize reproductive success. Bacteria engage in warfare, producing toxins at precise rates that balance the costs of production against the benefits of eliminating competitors.

None of these organisms is choosing a strategy. They are programmed by their genes to behave in particular ways, and those programs have been shaped by millions of years of natural selection. The insight that launched evolutionary game theory is breathtaking in its simplicity: if we replace "rational choice" with "natural selection" and "payoff" with "reproductive fitness," then the mathematical apparatus of game theory applies to any population of interacting organisms. Evolution, it turns out, is the ultimate strategist, one that never thinks, never plans, and never makes a mistake that it doesn't eventually correct.

This chapter traces that insight from its origins in a landmark 1973 paper by Maynard Smith and Price to its modern applications. We will discover that the central solution concept of evolutionary game theory, the Evolutionarily Stable Strategy, or E-S-S, is a close cousin of the Nash equilibrium we already know. And we will find that the dynamics of evolution, formalized as replicator dynamics, provide something our rational-agent models never did: a concrete mechanism explaining how populations reach equilibrium, as Taylor and Jonker described in 1978.

The Hawk-Dove Game

The foundational model of evolutionary game theory is the Hawk-Dove game, introduced by Maynard Smith and Price in 1973. You will recognize it immediately. It is the game of Chicken from Chapter 3, recast in biological terms. But the biological framing reveals insights that the original model obscured.

Imagine a population of animals competing over a resource: food, territory, a mate. Each animal is genetically programmed to play one of two strategies. Hawk: Always escalate. Fight until you win the resource or are seriously injured. Dove: Always display. If your opponent escalates, retreat immediately. If your opponent also displays, share the resource.

Let V represent the fitness value of the resource and C the fitness cost of injury in an escalated fight, with C > V: injury costs more than the resource is worth. The payoffs, expressed in units of reproductive fitness, are as follows. Hawk versus Hawk: both escalate and each wins half the time, so each expects (V − C) / 2. Hawk versus Dove: Hawk escalates and Dove retreats, so Hawk gets V and Dove gets 0. Dove versus Dove: both display and share, so each gets V / 2.

Here is the key evolutionary question: which strategy will natural selection favor? At first glance, Hawk seems dominant, since Hawks beat Doves every time. But watch what happens in a population of all Hawks. Every encounter is a brutal fight, and the average payoff is (V − C) / 2, which is negative when C > V. Now suppose a single Dove mutant appears. This Dove never wins a resource against a Hawk, but it also never gets injured, so it earns 0 in every encounter. Against other Doves, should any arise, it shares. A payoff of 0 is low, but it is still higher than the negative payoff the resident Hawks earn against one another. The Dove mutant can invade.

Now consider the reverse: a population of all Doves. Everyone shares politely, earning V divided by two each time. A Hawk mutant arrives and steamrolls every Dove it meets, earning V each time, double the Dove payoff. Hawks invade a Dove population effortlessly.

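So neither a pure Hawk nor a pure Dove population is stable against invasion. This is exactly the logic of the Chicken game from Chapter 3. Chicken does have pure-strategy Nash equilibria, (Hawk, Dove) and (Dove, Hawk), but they are asymmetric: they require the two players to take different roles, which a single population of interchangeable individuals cannot do. No symmetric pure-strategy equilibrium exists. But recall from Chapter 4 what happens when pure strategies fail: we find a mixed-strategy equilibrium.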

Let p be the proportion of Hawks. A Hawk's expected payoff against the population is p · (V − C) / 2 + (1 − p) · V. A Dove's expected payoff is p · 0 + (1 − p) · V / 2. Setting the two equal and solving gives p★ = V / C, the equilibrium proportion of Hawks. When V = 4 and C = 10, for instance, the population stabilizes at forty percent Hawks and sixty percent Doves. This can be interpreted in two equivalent ways: either each individual plays Hawk with probability V / C (a behavioral polymorphism), or a fraction V / C of the population is genetically programmed as Hawks (a genetic polymorphism). Evolution doesn't care about the mechanism. Only the frequencies matter.
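The indifference calculation can be checked numerically. What follows is a minimal Python sketch under the payoff definitions given above; the function names `hawk_dove_payoffs` and `ess_hawk_fraction` are illustrative choices, not part of any library.

```python
def hawk_dove_payoffs(p, V, C):
    """Expected payoffs (hawk, dove) against a population with hawk fraction p."""
    hawk = p * (V - C) / 2 + (1 - p) * V   # meets a Hawk w.p. p, a Dove w.p. 1 - p
    dove = p * 0 + (1 - p) * V / 2         # retreats from Hawks, shares with Doves
    return hawk, dove


def ess_hawk_fraction(V, C):
    """Mixed equilibrium hawk fraction; only meaningful when C > V > 0."""
    if not (C > V > 0):
        raise ValueError("the mixed equilibrium p* = V / C requires C > V > 0")
    return V / C


V, C = 4, 10
p_star = ess_hawk_fraction(V, C)
hawk, dove = hawk_dove_payoffs(p_star, V, C)
print(f"p* = {p_star:.2f}, hawk payoff = {hawk:.3f}, dove payoff = {dove:.3f}")

# Away from p*, the rarer strategy does better.
too_many_hawks = hawk_dove_payoffs(0.8, V, C)
too_few_hawks = hawk_dove_payoffs(0.1, V, C)
```

At p★ the two payoffs coincide (both about 1.2 for these values). With too many Hawks the Dove payoff is the larger one, and with too few Hawks the ordering reverses.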

This result is elegant and empirically testable. It predicts that in populations where the cost of fighting is high relative to the value of the resource, we should see more ritualized display and less escalation, exactly what we observe in species from stags to spiders, as Maynard Smith documented in 1982.

Fig. 1 The Hawk-Dove payoff matrix and the algebra that produces the ESS hawk frequency p★ = V / C. The derivation lives entirely in the population, not the individual: no animal needs to compute anything. Selection finds the mix in which Hawk and Dove break even.

Evolutionarily Stable Strategy · ESS

The Hawk-Dove game illustrates a general concept that Maynard Smith and Price formalized in 1973 as the Evolutionarily Stable Strategy, or E-S-S. An E-S-S is a strategy that, once adopted by nearly the entire population, cannot be invaded by any rare mutant strategy. Formally, a strategy S star is an E-S-S if, for every alternative strategy S not equal to S star, at least one of two conditions holds.

The best-response condition

E(S★, S★) > E(S, S★), where E(A, B) denotes the expected payoff of strategy A against strategy B. That is, S★ earns a strictly higher payoff against itself than the mutant S earns against S★. In this case, the rare mutant does worse against the resident population and is eliminated.

The stability condition

If the mutant does equally well against the resident, E(S★, S★) = E(S, S★), then the resident must do strictly better against the mutant than the mutant does against itself: E(S★, S) > E(S, S). This condition ensures that even when a mutant can tie against the resident, it cannot spread further once it becomes common enough to encounter copies of itself.

Notice the family resemblance to Nash equilibrium. An E-S-S must be a best response to itself. That's the Nash condition for symmetric games. But an E-S-S adds a stability requirement: it must also be robust against small perturbations in the population composition. This is a stricter criterion. Every E-S-S is a Nash equilibrium, but not every Nash equilibrium is an E-S-S, as Maynard Smith noted in 1982. The E-S-S concept thus provides a natural refinement of Nash equilibrium, one that doesn't depend on rational deliberation but on the blind persistence of evolutionary pressure.
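The two conditions translate directly into a test against a given mutant. The sketch below is illustrative only: `payoff` and `resists_invasion` are hypothetical helper names, strategies are probability vectors over (Hawk, Dove), and the check covers only the mutants it is handed, whereas the formal definition quantifies over every alternative strategy.

```python
def payoff(A, s, t):
    """Expected payoff to mixed strategy s against mixed strategy t under matrix A."""
    return sum(s[i] * A[i][j] * t[j]
               for i in range(len(s)) for j in range(len(t)))


def resists_invasion(A, resident, mutant, tol=1e-9):
    """True if the resident satisfies one of the two ESS conditions against this mutant."""
    e_rr = payoff(A, resident, resident)
    e_mr = payoff(A, mutant, resident)
    if e_rr > e_mr + tol:                 # first condition: strictly better vs. resident
        return True
    if abs(e_rr - e_mr) <= tol:           # tie: fall back to the stability condition
        return payoff(A, resident, mutant) > payoff(A, mutant, mutant) + tol
    return False                          # mutant does strictly better: it invades


V, C = 4, 10
# Rows: own strategy (Hawk, Dove). Columns: opponent's strategy (Hawk, Dove).
A = [[(V - C) / 2, V],
     [0,           V / 2]]

hawk, dove = [1.0, 0.0], [0.0, 1.0]
mixed = [V / C, 1 - V / C]

print(resists_invasion(A, hawk, dove))    # pure Hawk vs. a Dove mutant
print(resists_invasion(A, dove, hawk))    # pure Dove vs. a Hawk mutant
print(resists_invasion(A, mixed, hawk))   # mixed resident vs. a Hawk mutant
print(resists_invasion(A, mixed, dove))   # mixed resident vs. a Dove mutant
```

Both pure residents fail, while the mixed resident ties with each pure mutant against itself and then wins on the stability condition.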

Worked example: V (value of resource) = 4, C (cost of injury) = 10. Payoffs to self:

self HAWK vs. opp HAWK: (V − C) / 2 = −3.0
self HAWK vs. opp DOVE: V = 4.0
self DOVE vs. opp HAWK: 0.0
self DOVE vs. opp DOVE: V / 2 = 2.0

When C > V, the ESS is a mixed strategy with p★ = V / C: HAWK p★ = 0.40, DOVE 1 − p★ = 0.60.

Replicator Dynamics

The E-S-S tells us where evolution settles, but not how it gets there. For that, we need replicator dynamics, the mathematical machinery that describes how strategy frequencies change over time in a population, as Taylor and Jonker developed in 1978.

The idea is disarmingly simple. Suppose organisms reproduce asexually, and their offspring inherit their parent's strategy. An organism's reproductive rate, its fitness, depends on the payoffs it receives in interactions with others. Strategies that earn above-average payoffs produce more offspring and thus grow in frequency; strategies that earn below-average payoffs shrink. The replicator equation captures this formally: dxᵢ/dt = xᵢ · (fᵢ(x) − f̄(x)). Here xᵢ is the proportion of the population playing strategy i, fᵢ(x) is the fitness (the expected payoff) of strategy i given the current population mix x, and f̄(x) is the average fitness of the entire population. The per-capita growth rate of each strategy is the amount by which its fitness exceeds or falls short of the population average, as Taylor described in 2014.
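The replicator equation can be integrated numerically in a few lines. The following is a rough sketch, not a production solver: it assumes fitness is linear in the population mix (fᵢ(x) = Σⱼ A[i][j]·xⱼ for a payoff matrix A), uses a plain explicit Euler step, and the names `replicator_step` and `simulate` are illustrative.

```python
def replicator_step(A, x, dt=0.01):
    """One explicit Euler step of dx_i/dt = x_i * (f_i(x) - f_bar(x))."""
    n = len(x)
    f = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]  # fitness of each strategy
    f_bar = sum(x[i] * f[i] for i in range(n))                     # population average fitness
    return [x[i] + dt * x[i] * (f[i] - f_bar) for i in range(n)]


def simulate(A, x0, steps=20000, dt=0.01):
    """Iterate the Euler step from the initial frequencies x0."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(A, x, dt)
    return x


# Hawk-Dove with V = 4, C = 10 (rows: own strategy, columns: opponent's strategy).
A = [[-3.0, 4.0],
     [0.0,  2.0]]

from_hawkish = simulate(A, [0.9, 0.1])
from_dovish = simulate(A, [0.05, 0.95])
print(from_hawkish, from_dovish)
```

Both runs end close to 40 percent Hawks, whichever side of the equilibrium they start on. The step size and number of steps are arbitrary choices that happen to be ample for this game.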

Taylor and Jonker in 1978 established the link between this dynamic and game-theoretic equilibrium. Every Nash equilibrium of the underlying game is a rest point of the replicator dynamics, a population state where all frequencies stop changing. The converse fails: any state in which only one strategy is present is also a rest point, whether or not it is a Nash equilibrium, because an absent strategy has no copies to reproduce. Stability closes the gap. A rest point that the dynamics stay near after small perturbations must be a Nash equilibrium, and every E-S-S is asymptotically stable, so nearby populations converge to it. This establishes a profound bridge between the rational-agent framework of classical game theory and the evolutionary framework of biology. The same mathematical structures appear in both, derived from completely different foundations.

In the Hawk-Dove game, the replicator dynamics tell a vivid story. If the population starts with too many Hawks (p > V / C), Hawks are destroying each other in costly fights while Doves avoid the carnage. Dove fitness exceeds Hawk fitness, so the Dove share grows. If the population starts with too few Hawks (p < V / C), Hawks are feasting on Doves without competition, outperforming them. The Hawk share grows. Only at p = V / C do the fitnesses equalize and the frequencies stabilize. From any starting mix that contains both types, the population converges to the E-S-S.
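For two strategies the replicator equation collapses to a single equation in the Hawk share p. The derivation below is a sketch using the Hawk and Dove payoffs defined earlier, with f_H and f_D as shorthand (introduced here) for the two expected payoffs and \bar f = p f_H + (1 - p) f_D for the population average.

```latex
\begin{aligned}
f_H(p) - f_D(p)
  &= \left[\, p\,\frac{V - C}{2} + (1 - p)\,V \right] - (1 - p)\,\frac{V}{2}
   = \frac{V - pC}{2}, \\[4pt]
\dot{p}
  &= p\,\bigl(f_H - \bar f\bigr)
   = p\,(1 - p)\,\bigl(f_H - f_D\bigr)
   = \frac{p\,(1 - p)\,(V - pC)}{2}.
\end{aligned}
```

The right-hand side vanishes at p = 0, p = 1, and p = V / C. It is positive for 0 < p < V / C and negative for V / C < p < 1, which is the push toward the mixed state described in words above.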

The power of replicator dynamics extends well beyond two-strategy games. In games with three or more strategies, the dynamics can produce complex behavior, including cycles in which populations perpetually oscillate and never settle on a fixed equilibrium. The classic example is the Rock-Paper-Scissors dynamic observed in side-blotched lizards, Uta stansburiana, where three male color morphs (orange, blue, and yellow) cycle in frequency over years, each one beating the next in a never-ending evolutionary tournament, as Sinervo and Lively reported in 1996. The replicator dynamics can produce exactly this kind of cycling when no E-S-S exists.
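Cycling is easy to see in a toy model. The sketch below runs the replicator equation on the textbook zero-sum Rock-Paper-Scissors matrix (win = 1, loss = −1, tie = 0). These payoffs are an assumption chosen for simplicity, not values fitted to the lizard data, and the explicit Euler step drifts slowly outward, so the run is a qualitative illustration only.

```python
import math

# Rows: own strategy, columns: opponent (0 = rock, 1 = paper, 2 = scissors).
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]


def replicator_step(A, x, dt):
    """One explicit Euler step of the replicator equation."""
    n = len(x)
    f = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    f_bar = sum(x[i] * f[i] for i in range(n))
    return [x[i] + dt * x[i] * (f[i] - f_bar) for i in range(n)]


def run(x0, steps=5000, dt=0.01):
    """Return the final state, the sequence of most-common strategies, and
    the closest approach to the uniform mix (1/3, 1/3, 1/3)."""
    x = list(x0)
    center = [1 / 3] * 3
    leaders, min_dist = [], float("inf")
    for _ in range(steps):
        x = replicator_step(RPS, x, dt)
        leader = max(range(3), key=lambda i: x[i])
        if not leaders or leaders[-1] != leader:
            leaders.append(leader)          # record each change of leader
        min_dist = min(min_dist, math.dist(x, center))
    return x, leaders, min_dist


final, leaders, min_dist = run([0.5, 0.3, 0.2])
print(leaders)
print(round(min_dist, 3))
```

Each strategy takes a turn as the most common type, and the population never approaches the uniform mix, which is a rest point here but not one the dynamics converge to.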

Fig. 2 Replicator-dynamics phase portrait for Hawk-Dove with V = 4, C = 10. The ESS at p★ = 0.40 is a global attractor; the two pure states at p = 0 and p = 1 are unstable rest points the dynamics walk straight away from.

Cooperation Among Animals · Reciprocal Altruism

In Chapter 6, we explored how cooperation can be sustained among rational, self-interested players through repeated interaction. Axelrod's computer tournaments, reported in his 1984 book, demonstrated that Tit-for-Tat (cooperate on the first move, then mirror your opponent's previous move) outperformed far more complex strategies in the iterated Prisoner's Dilemma. But Axelrod's insight was not confined to human institutions. He explicitly argued that the same logic applies to organisms with no concept of "strategy" at all, as long as they interact repeatedly and can condition their behavior on past outcomes.

The most vivid natural demonstration of reciprocal altruism comes from an unlikely source: vampire bats. Gerald Wilkinson in 1984 documented that vampire bats, Desmodus rotundus, in Costa Rica regularly regurgitate blood to feed roost-mates who failed to find a meal, a genuinely costly act, since a bat that doesn't eat for two consecutive nights will die. Crucially, Wilkinson showed that food sharing was not random. Bats preferentially shared with individuals who had shared with them in the past, and they were less likely to share with past defectors. The bats were playing reciprocal altruism, a biological version of Tit-for-Tat.

Why does this work evolutionarily? Recall from Chapter 6 that cooperation in the iterated Prisoner's Dilemma is sustainable when players are sufficiently patient, when the shadow of the future is long enough. For vampire bats, the "shadow of the future" comes from their social structure: they roost in stable groups for years, with high survival rates from night to night. A bat that defects, accepting blood but refusing to share, risks being recognized and excluded from future sharing. The long-term cost of defection then outweighs the short-term gain, as the folk theorem predicts.

Axelrod in 1984 identified four properties that make Tit-for-Tat successful: it is nice (never defects first), retaliatory (punishes defection immediately), forgiving (returns to cooperation after the opponent does), and clear (its pattern is easy for opponents to recognize). These properties don't require a brain. They require only that an organism's behavior be conditioned on the interaction history, something that even relatively simple nervous systems can accomplish.
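Tit-for-Tat is simple enough to write out in full. The sketch below plays a short iterated Prisoner's Dilemma using the conventional illustrative payoffs (5 for defecting on a cooperator, 3 for mutual cooperation, 1 for mutual defection, 0 for being exploited). The payoff values, the ten-round horizon, and the function names are assumptions made for the example.

```python
# (my move, their move) -> (my payoff, their payoff)
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not their_history else their_history[-1]


def always_defect(my_history, their_history):
    """Defect unconditionally."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play the iterated game and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))      # (30, 30)
print(play(tit_for_tat, always_defect))    # (9, 14)
print(play(always_defect, always_defect))  # (10, 10)
```

Tit-for-Tat loses its one pairing with the defector by a small margin, yet a pair of Tit-for-Tat players earns three times what a pair of defectors does. That asymmetry is why the strategy accumulates a high total when matches are repeated and partners are mostly reciprocators.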

Interactive simulation: pick a game, set the initial frequency p₀ of strategy A (default 0.50) and the number of generations (default 100), then run. The mint trace is strategy A and the amber trace is strategy B. A dashed line marks the analytic ESS where applicable.

Kin Selection · Hamilton's Rule

Reciprocal altruism explains cooperation in repeated interactions, but some of the most dramatic cooperation in nature occurs in one-shot encounters, or in contexts where reciprocity is impossible. Worker honeybees sacrifice their lives to sting intruders. Ground squirrels give alarm calls that attract predators to themselves. Meerkat sentinels stand guard while the group forages, forgoing food and exposing themselves to danger. How can self-sacrifice evolve when the beneficiary has no opportunity to reciprocate?

The answer came from William Hamilton in 1964, whose theory of kin selection showed that natural selection operates not on individual organisms alone, but on genes. An organism shares copies of its genes with relatives: fifty percent with siblings, twenty-five percent with half-siblings, twelve point five percent with cousins. If an altruistic act helps relatives survive and reproduce, the genes causing that altruism can spread, even if the altruist suffers a fitness cost.

Hamilton formalized this with a brilliantly simple inequality known as Hamilton's rule: r · b > c, where r is the genetic relatedness between actor and recipient, b is the fitness benefit to the recipient, and c is the fitness cost to the actor. When this condition holds, the gene for altruism increases in frequency. The biologist J. B. S. Haldane reportedly quipped that he would lay down his life for "two brothers or eight cousins," a neat, if informal, statement of the break-even point of Hamilton's rule, since r = 1/2 for brothers and r = 1/8 for cousins.
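Haldane's arithmetic can be made explicit. The sketch below is illustrative: it assumes that benefits to several recipients simply add, so the rule becomes a sum of r · b over recipients compared with c, and it measures cost and benefit in units of one life. The function name `altruism_favored` is hypothetical.

```python
def altruism_favored(recipients, cost):
    """Hamilton's rule summed over recipients: sum(r * b) > c.

    recipients is a list of (relatedness, benefit) pairs.
    """
    weighted_benefit = sum(r * b for r, b in recipients)
    return weighted_benefit > cost


BROTHER, COUSIN = 0.5, 0.125   # relatedness values quoted in the text
ONE_LIFE = 1.0                 # cost to the actor and benefit to each relative saved

two_brothers = [(BROTHER, ONE_LIFE)] * 2
three_brothers = [(BROTHER, ONE_LIFE)] * 3
eight_cousins = [(COUSIN, ONE_LIFE)] * 8
nine_cousins = [(COUSIN, ONE_LIFE)] * 9

print(altruism_favored(two_brothers, ONE_LIFE))    # False: 1.0 is not greater than 1.0
print(altruism_favored(three_brothers, ONE_LIFE))  # True: 1.5 > 1.0
print(altruism_favored(eight_cousins, ONE_LIFE))   # False: exactly break-even again
print(altruism_favored(nine_cousins, ONE_LIFE))    # True: 1.125 > 1.0
```

Two brothers or eight cousins sit exactly on the boundary r · b = c, so the strict inequality needs one relative more. That is why the quip is best read as marking the break-even point.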

I would lay down my life for two brothers or eight cousins.
J. B. S. Haldane (attributed) · r·b > c

Bourke reviewed decades of empirical evidence in 2014 and found strong support for Hamilton's rule across taxa. Eusocial insects (ants, bees, wasps, termites) display the most extreme altruism in the animal kingdom, with sterile workers devoting their entire lives to their queen's reproduction. Strikingly, in Hymenoptera (ants, bees, and wasps), a peculiarity of sex determination called haplodiploidy means that full sisters share seventy-five percent of their genes, making the threshold for altruism toward sisters lower than in diploid organisms.

Hamilton's rule forces us to expand our notion of "payoff" in game theory. In the classical framework, a player's payoff is its own utility, its own wealth, happiness, or survival. In evolutionary game theory with kin selection, the relevant payoff is inclusive fitness: the organism's own reproductive success plus its weighted contribution to relatives' success. This isn't altruism in any moralistic sense. It's gene-level self-interest, operating through kin as well as through the individual. But it means that the simple payoff matrices we've been writing need adjustment when interactions occur among relatives.

This expansion has a direct game-theoretic implication. Consider a one-shot Prisoner's Dilemma played between siblings, r equals one half. The standard payoff matrix might predict universal defection. But if we weight the opponent's payoff by r and add it to the player's payoff, converting individual fitness to inclusive fitness, the resulting game can become a coordination game where mutual cooperation is a Nash equilibrium. Kin selection transforms the game itself, as Maynard Smith noted in 1982.
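The transformation described above can be carried out mechanically. In the sketch below, each inclusive-fitness payoff is the player's own payoff plus r times the opponent's payoff. The Prisoner's Dilemma values (temptation 5, reward 4, punishment 2, sucker 0) are hypothetical numbers chosen so that r = 1/2 yields a coordination game; other values can instead make cooperation dominant or leave defection in place.

```python
def inclusive_fitness_matrix(A, r):
    """B[i][j] = own payoff A[i][j] plus r times the opponent's payoff A[j][i]."""
    n = len(A)
    return [[A[i][j] + r * A[j][i] for j in range(n)] for i in range(n)]


def symmetric_pure_equilibria(A):
    """Strategies i that are a best response to themselves (symmetric pure Nash)."""
    n = len(A)
    return [i for i in range(n) if all(A[i][i] >= A[k][i] for k in range(n))]


COOPERATE, DEFECT = 0, 1
T, R, P, S = 5, 4, 2, 0          # hypothetical values with T > R > P > S
# Rows: own move, columns: opponent's move.
pd = [[R, S],
      [T, P]]

kin_game = inclusive_fitness_matrix(pd, r=0.5)

print(symmetric_pure_equilibria(pd))        # [1]: only mutual defection
print(kin_game)                             # [[6.0, 2.5], [5.0, 3.0]]
print(symmetric_pure_equilibria(kin_game))  # [0, 1]: cooperation is now an equilibrium too
```

Between unrelated players only mutual defection survives. Between siblings, mutual cooperation becomes a second equilibrium, so the dilemma has turned into a coordination problem.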

The Circuit Closed · Evolution Meets Rationality

We have now completed a remarkable intellectual circuit. We began the course with rational agents carefully computing best responses. We end this chapter with mindless organisms whose "strategies" are genetically hardwired, shaped only by differential reproduction. And yet the same equilibrium concepts appear in both worlds.

This convergence is not a coincidence. It reflects a deep mathematical regularity: in any system in which agents (or organisms, or algorithms, or firms) adopt behaviors with higher payoffs at the expense of behaviors with lower payoffs, the states at which behavior can stably settle are Nash equilibria, even though, as the cycling lizards show, settling is not guaranteed. The replicator dynamic is one such system. Rational best-response dynamics is another. Reinforcement learning in AI is yet another. The specific mechanism varies (neurons firing, genes replicating, managers reviewing quarterly reports), but the mathematical destination is the same, as Taylor noted in 2014.

This has profound implications for the interpretation of game theory. When we observe behavior in the field, whether in animal populations, market competition, or international relations, that conforms to Nash equilibrium predictions, we need not assume that the players are engaged in sophisticated reasoning. They might be. But they might also be responding to evolutionary pressures, cultural norms, trial-and-error learning, or institutional rules that push behavior toward equilibrium without anyone understanding why. Evolutionary game theory teaches us that strategic behavior is deeper and more universal than rational deliberation. It is a property of any sufficiently complex adaptive system.

Consider one final, striking implication. In Chapter 4, we computed mixed-strategy Nash equilibria and sometimes struggled with the interpretation: Why would a rational player randomize? The evolutionary perspective dissolves this puzzle. In nature, the "mixed strategy" is simply a population with a stable proportion of different types. No individual needs to randomize. The population-level mix emerges from the differential survival of pure types. The interpretation that seemed strained in classical game theory becomes perfectly natural in the evolutionary framework, a beautiful case of one field illuminating another.

Key Takeaways

Evolutionary game theory replaces rational choice with natural selection: organisms don't calculate payoffs, evolution calculates for them by favoring strategies with higher reproductive fitness.
The Hawk-Dove game (Maynard Smith & Price, 1973) is the biological equivalent of Chicken. When fighting costs exceed resource value, C > V, neither pure strategy is stable, and the E-S-S is a mixed strategy with proportion V / C playing Hawk.
An Evolutionarily Stable Strategy (E-S-S) is a strategy that cannot be invaded by any rare mutant. It satisfies two conditions: it must be a best response to itself, and it must satisfy a strict stability condition against equally-fit mutants.
Every E-S-S is a Nash equilibrium, but not vice versa. E-S-S is a stricter criterion that serves as a natural equilibrium refinement based on evolutionary stability rather than rational deliberation.
Replicator dynamics (Taylor & Jonker, 1978) describe how strategy frequencies evolve over time. Every Nash equilibrium is a rest point, every stable rest point is a Nash equilibrium, and every E-S-S is asymptotically stable: this is the dynamical bridge between evolution and game theory.
Cooperation evolves through two mechanisms: reciprocal altruism — Tit-for-Tat in repeated interactions, as seen in vampire bats (Wilkinson, 1984; Axelrod, 1984) — and kin selection, Hamilton's rule r·b > c (Hamilton, 1964; Bourke, 2014), explaining altruism among genetic relatives.
The convergence of rational-choice and evolutionary approaches reveals that strategic behavior is a universal property of adaptive systems, not a feature unique to human reason.

looking ahead · class-10

Evolution shows us strategic behavior emerging without rationality. In Splitting the Pie, we turn to bargaining theory: when two players must divide a surplus, how does the structure of the negotiation determine the split? We meet the Nash bargaining solution, the Rubinstein alternating-offers game, and the strategic power of patience.

References

Axelrod, R. (1984). The Evolution of Cooperation. Basic Books.

Bourke, A. F. G. (2014). Hamilton's rule and the causes of social evolution. Philosophical Transactions of the Royal Society B, 369(1642), 20130362.

Hamilton, W. D. (1964). The genetical evolution of social behaviour, I & II. Journal of Theoretical Biology, 7(1), 1–52.

Maynard Smith, J. (1982). Evolution and the Theory of Games. Cambridge University Press.

Maynard Smith, J., & Price, G. R. (1973). The logic of animal conflict. Nature, 246, 15–18.

Taylor, P. D. (2014). The replicator equation and other game dynamics. Proceedings of the National Academy of Sciences, 111(Suppl. 3), 10810–10817.

Taylor, P. D., & Jonker, L. B. (1978). Evolutionarily stable strategies and game dynamics. Mathematical Biosciences, 40(1–2), 145–156.

Wilkinson, G. S. (1984). Reciprocal food sharing in the vampire bat. Nature, 308, 181–184.