Abstract network of light, resembling a gene regulatory network

The Math That Biology Pretends It Doesn't Need

September 2025 · mathematical biology, dynamical systems

Here is a fact that should be more disorienting than it is: gene expression at the single-molecule level looks like someone flipping coins. The process by which RNA polymerase finds a promoter, the dwell times of transcription factors at their binding sites, the timing of individual splicing events — all of it is stochastic. Protein copy numbers between genetically identical sister cells, seconds after division, can differ by factors of two or three. And yet development is reproducible. Five fingers, every time, across billions of people and hundreds of millions of years of evolution.

The naive answer — the genome as blueprint, genes as instructions — breaks down the moment you look at it carefully. A blueprint specifies what to build. The genome specifies proteins and their interactions. These are not the same thing. The genome does not say "make a kidney with one million nephrons." It encodes transcription factors and signaling cascades and structural proteins, and somehow, reliably, a kidney results. The reliability has to come from the architecture of the system, not from explicit specification of every outcome.

That architectural reliability is what mathematical biology is trying to describe. The frameworks it has developed over the last sixty years are not just convenient approximations — they make specific, testable predictions about which cell states are stable, how development can fail, how much information a signaling pathway can carry, and why some perturbations derail fate commitment while others are corrected automatically. What follows works through those frameworks, each grounded in the biology that motivated it.


Attractors: the mathematical definition of a cell type

Start with the most basic question: what is a cell type? The traditional answer is transcriptional — a cell type is defined by its gene expression profile. But expression profiles are distributed, not precise. Different cells of the "same type" differ substantially in the expression levels of most genes. What makes them the same type is not a fixed expression state but convergent behavior: perturb them, and they return to the same neighborhood.

This is the definition of an attractor in dynamical systems. For a system described by dx/dt = f(x), where x is the gene expression state and f captures the regulatory interactions, a stable fixed point x* is a state where f(x*) = 0 and all perturbations decay back to it. The mathematical condition is that all eigenvalues of the Jacobian matrix J = ∂f/∂x at x* have negative real parts. A cell type, in this language, is a stable attractor of the gene regulatory network dynamics.
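This condition is concrete enough to check numerically. Here is a minimal Python sketch, using a toy two-gene mutual-repression system with made-up parameters (not fitted to any real network): relax the dynamics to a fixed point, estimate the Jacobian by finite differences, and test the eigenvalue condition.

```python
import math

# Illustrative two-gene mutual-repression network (hypothetical parameters):
# dx/dt = a/(1 + y^n) - x,  dy/dt = a/(1 + x^n) - y
a, n = 4.0, 2

def f(x, y):
    return (a / (1 + y ** n) - x, a / (1 + x ** n) - y)

def jacobian(x, y, h=1e-6):
    # finite-difference estimate of J = df/dx at (x, y)
    f0 = f(x, y)
    fx = f(x + h, y)
    fy = f(x, y + h)
    return [[(fx[0] - f0[0]) / h, (fy[0] - f0[0]) / h],
            [(fx[1] - f0[1]) / h, (fy[1] - f0[1]) / h]]

def eigen_real_parts(J):
    # eigenvalues of a 2x2 matrix from trace and determinant
    tr = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    disc = tr * tr - 4 * det
    if disc >= 0:
        s = math.sqrt(disc)
        return ((tr + s) / 2, (tr - s) / 2)
    return (tr / 2, tr / 2)  # complex pair shares real part tr/2

# Find a fixed point by relaxing the dynamics from a biased start.
x, y = 3.0, 0.1
for _ in range(20000):
    dx, dy = f(x, y)
    x += 0.01 * dx
    y += 0.01 * dy

residual = f(x, y)                      # ~ (0, 0) at the fixed point
eigs = eigen_real_parts(jacobian(x, y))
stable = all(e < 0 for e in eigs)       # all real parts negative: attractor
```

Starting from a y-biased state instead converges to the mirror-image fixed point, which passes the same stability test: that pair of stable fixed points is what bistability means in practice.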

Kauffman proposed this in 1969 (Journal of Theoretical Biology), before the tools existed to test it. By 2005, Huang and colleagues had the data: tracking neutrophil differentiation of HL-60 cells in a 2,773-dimensional gene expression space, they showed that trajectories initiated by two different stimuli converged on the same attractor — the gene expression profile of the differentiated cell type. The landscape was real. The attractors were measurable.

The most consequential prediction of this framework is about fate commitment. A bifurcation — a qualitative change in the number or stability of fixed points — is the mathematical event of differentiation. In a pitchfork bifurcation, one stable state becomes unstable and two new stable states emerge: one valley splits into two, and noise determines which one the cell falls into. In a saddle-node bifurcation, two fixed points collide and annihilate, forcing the cell into a different attractor with no way back. The GATA1/PU.1 toggle switch in blood development, where mutual repression between two transcription factors creates bistability, is the canonical biological example. Huang and colleagues (Developmental Biology, 2007) showed that the uncommitted progenitor state is a saddle point near the boundary between basins — metastable, easily perturbed, destabilized by small changes in external signals.
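The collapse of bistability can be watched directly in a toy model. In this sketch (illustrative parameters standing in for the GATA1/PU.1 pair, not a fitted model), strong mutual repression gives two attractors and the initial bias picks between them; weakening the repression merges the two valleys into one, and the bias stops mattering.

```python
def simulate(a, x0, y0, n=2, dt=0.01, steps=30000):
    # toy toggle switch: two factors repress each other, with linear decay
    x, y = x0, y0
    for _ in range(steps):
        dx = a / (1 + y ** n) - x
        dy = a / (1 + x ** n) - y
        x += dt * dx
        y += dt * dy
    return x, y

# Strong mutual repression (a = 4): bistable, a slight bias decides the fate.
fate_x = simulate(4.0, 1.2, 0.8)   # x-biased start -> x-high / y-low attractor
fate_y = simulate(4.0, 0.8, 1.2)   # y-biased start -> y-high / x-low attractor

# Weak repression (a = 1): the two valleys have merged; the bias no longer matters.
merged_1 = simulate(1.0, 1.2, 0.8)
merged_2 = simulate(1.0, 0.8, 1.2)
```

Sweeping the repression strength `a` between these two values traces out the bifurcation itself: the two asymmetric attractors approach each other and vanish.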

Blood cells flowing — hematopoiesis, the canonical bifurcation system

Hematopoiesis — the differentiation of blood cells — is the canonical attractor bifurcation system. The GATA1/PU.1 toggle is the best-characterized bistable switch in mammalian development.

The reason this matters practically is that it changes what you look for when development goes wrong. If a cell type is a stable attractor, then pathology — cancer, for instance — is what happens when the landscape is deformed: barriers lowered, new attractors created, old ones eliminated. A cancer cell is not a cell that has simply accumulated random damage. It is often a cell that has settled into an attractor that normal development never visits. Hanahan and Weinberg's "hallmarks of cancer" are, in dynamical systems language, the regulatory changes that reshape the landscape to make the cancer attractor accessible and stable. The implications for therapy are direct: treat cancer as a landscape deformation problem rather than a mutation enumeration problem.

The same framework is what Waddington was trying to capture with "canalization" — the tendency of development to reach the same endpoint despite perturbation. What he drew as hillsides and valleys in 1957 is, in this language, the basin structure of an attractor landscape shaped by regulatory network topology. This is why Kauffman's 1969 proposal and Wang's 2008 derivation matter not just as theoretical results but as foundations for the argument made in the next post in this series: that the landscape is an empirical object, measurable from trajectory data, and that understanding its topology — where the basins are, where the separatrices run, what the barriers look like — is precisely what organoid engineering lacks. You cannot guide self-organization toward a target cell type without knowing the shape of the landscape the cells are navigating.

interactive Waddington landscape — attractors, bifurcation, and fate commitment
A 1D cross-section of the Waddington landscape for a bistable toggle switch (like GATA1/PU.1 in blood development). Dots are individual cells rolling stochastically toward fate attractors. The quasi-potential U(x) is drawn — valleys are stable cell types, the peak is the unstable saddle between them. Reduce mutual repression: watch the barrier shrink and the bifurcation collapse two fates into one. Click anywhere on the landscape to drop a new cell.

Noise: not something to be eliminated, but something to be understood

The standard intuition about biological noise is that it is a problem — a consequence of working with small numbers of molecules that evolution has partially suppressed. The actual picture is considerably stranger. Elowitz and colleagues (Science, 2002) placed two identical promoters in E. coli driving different fluorescent proteins and decomposed cell-to-cell variation into intrinsic noise (specific to the stochastic process of individual gene expression events) and extrinsic noise (global cellular differences in ribosome counts, growth rate, and so on). Both were substantial. Neither could be easily eliminated. And subsequent work showed that cells had evolved machinery not just to tolerate noise but to exploit it.

The clearest example is bacterial persistence. Balaban and colleagues (Science, 2004) showed that E. coli populations maintain a small fraction of slow-growing "persister" cells that survive antibiotic treatment — not because of resistance mutations, but because a toxin-antitoxin module stochastically drives a small number of cells into a growth-arrested state. The population bets across the distribution: most cells grow fast in normal conditions, a few maintain dormancy as antibiotic insurance. Noise drives barrier-crossing between the two states. Remove the noise and you remove the bet-hedging. It is an evolved feature, not a defect.

In development, symmetry-breaking at bifurcation points is often initiated by noise. When a progenitor sits near a saddle point between two fates, the fluctuations in transcription factor concentrations — the same intrinsic noise Elowitz measured — push it across the basin boundary. Maamar, Raj, and Dubnau (Science, 2007) showed this directly for Bacillus subtilis competence: reducing transcriptional noise by replacing the native promoter with a lower-noise synthetic version dramatically reduced the fraction of cells entering the competent state. The stochastic commitment was not incidental. It was the mechanism. A 2025 paper in Developmental Cell (Garcia-Blay and colleagues) identified 32 proteins that specifically regulate transcriptional noise in mammalian cells, some of which alter noise without changing mean expression. Cells have dedicated molecular hardware for noise management.
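The barrier-crossing mechanism is easy to reproduce with a Langevin simulation. This sketch uses a symmetric quartic double well (a generic toy, not a model of any particular circuit): every cell starts in one basin, and the fraction that has crossed the barrier by the end rises steeply with the noise level, which is the qualitative content of the competence experiment run in reverse.

```python
import math, random
random.seed(0)

def switched_fraction(D, cells=200, T=50.0, dt=0.01):
    # quartic double well U(x) = (x^2 - 1)^2: minima at x = +/-1, barrier height 1 at x = 0
    s = math.sqrt(2 * D * dt)
    crossed = 0
    for _ in range(cells):
        x = -1.0                    # every cell starts in the "default fate" basin
        for _ in range(int(T / dt)):
            # Euler-Maruyama step: drift -dU/dx plus Gaussian noise of strength D
            x += -4 * x * (x * x - 1) * dt + s * random.gauss(0, 1)
        if x > 0:
            crossed += 1
    return crossed / cells

frac_low = switched_fraction(0.05)   # noise far below barrier: essentially no switching
frac_high = switched_fraction(0.5)   # noise comparable to barrier: frequent crossing
```

Lowering the noise while keeping the landscape fixed is exactly the promoter-swap manipulation: same attractors, same barrier, far fewer cells making the stochastic commitment.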

This has a direct consequence for tissue engineering that the next post examines in detail. When organoids fail to pattern correctly, the standard explanation is batch-to-batch variation in culture conditions. The attractor and noise frameworks suggest something more specific: early bifurcation events — the first hours of self-organization — are the moments where noise level relative to barrier height determines fate. A small stochastic fluctuation at the wrong moment propagates through subsequent cell divisions and contacts, producing an organoid committed to the wrong trajectory long before any morphological defect is visible. Understanding noise at bifurcation points is therefore not an academic exercise. It is the prerequisite for knowing where in development to intervene — and the demonstration that intervening after commitment is too late.

interactive Stochastic fate switching — noise unlocks an inaccessible outcome
The landscape is asymmetric — the left basin (dark red) is deep and always the deterministic outcome; the right basin (blue) is shallower but still a genuine attractor, only reachable by crossing the barrier stochastically. At low noise all cells commit to the default fate. Increase noise and watch some cells gain enough stochastic energy to cross the barrier and commit to the noise-driven fate. The bar chart tracks the population split across all runs. This is the mechanism of bacterial persistence and B. subtilis competence switching.

Signaling channels carry less information than you think

A cell making a fate decision needs information from its environment. It receives that information through signaling pathways — receptor activation, kinase cascades, transcription factor translocation. The question information theory asks is: how much information can these pathways actually carry?

The answer is surprising. Cheong, Levchenko, and colleagues (Science, 2011) measured the information capacity of the TNF-to-NF-κB pathway — one of the most studied signaling systems in mammalian biology, involved in inflammation, immunity, and cancer — and found that it transmits approximately 0.92 bits per cell per time point. Almost exactly one bit. A cell receiving TNF signaling through this pathway can barely distinguish "signal present" from "signal absent." It cannot reliably discriminate between two different signal strengths. The mutual information between the input concentration and the output NF-κB level, computed from the full distribution across thousands of cells, saturates at less than one bit because cell-to-cell variation in signaling components (extrinsic noise) overwhelms the input signal.

Shannon's mutual information I(X;Y) = H(X) + H(Y) − H(X,Y) measures how much knowing the input reduces uncertainty about the output. When this approaches zero, the output tells you almost nothing about the input. When it equals H(X), the channel is noiseless. For the NF-κB pathway, it is closer to the former.
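The saturation is easy to reproduce for a toy channel. The sketch below uses illustrative numbers, not the measured NF-κB distributions: an equiprobable binary input passes through Gaussian output noise, and I(X;Y) is computed by discretizing the output. With well-separated outputs the channel delivers its full one bit; as the noise grows, the distributions overlap and the information collapses.

```python
import math

def channel_info(mu0, mu1, sigma, bins=1000, lo=-20.0, hi=25.0):
    # I(X;Y) for equiprobable binary X and Gaussian output noise,
    # computed as sum over bins of p(x,y) * log2( p(y|x) / p(y) )
    w = (hi - lo) / bins
    def gauss(y, mu):
        return math.exp(-((y - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    info = 0.0
    for i in range(bins):
        y = lo + (i + 0.5) * w
        p0 = gauss(y, mu0) * w      # p(y | x = 0) mass in this bin
        p1 = gauss(y, mu1) * w      # p(y | x = 1) mass in this bin
        py = 0.5 * (p0 + p1)        # p(y)
        for p_cond in (p0, p1):
            if p_cond > 0 and py > 0:
                info += 0.5 * p_cond * math.log2(p_cond / py)
    return info

clean = channel_info(0.0, 5.0, 0.5)   # well-separated outputs: ~1 bit
noisy = channel_info(0.0, 5.0, 4.0)   # heavy overlap: well under 1 bit
```

The second case is a cartoon of the measured situation: the input is binary, but extrinsic cell-to-cell variation smears the output so much that even one bit cannot get through.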

This result seems to create a paradox. How does a cell make reliable fate decisions if its signaling pathways can barely transmit one bit? The answer, which Selimkhanov and colleagues (Science, 2014) demonstrated, is temporal coding. A single static measurement of NF-κB level at one time point carries less than one bit. But NF-κB oscillates — roughly 90-minute period, constant across stimulation intensities (Nelson et al., Science, 2004) — and measuring the full time trace carries substantially more information because the dynamics encode information the amplitude does not. The signaling code is written in time, not in amplitude. A system that reads only one frame reads noise.
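The difference between a frame and a trace can be made concrete with a toy discrimination task (a cartoon, not the measured NF-κB code): two inputs produce signals with the same mean level, one flat and one oscillating, both noisy. A classifier that sees one frame is at chance; a classifier that correlates the whole trace against the oscillation is nearly perfect.

```python
import math, random
random.seed(3)

T, period, amp, noise = 64, 16, 0.8, 0.8

def trace(osc):
    # both inputs share the same mean level; only the dynamics differ
    return [1.0 + (amp * math.sin(2 * math.pi * t / period) if osc else 0.0)
            + random.gauss(0, noise) for t in range(T)]

def guess_from_snapshot(tr):
    # one frame (t = 0, where sin = 0): the two conditions are indistinguishable
    return tr[0] > 1.0

def guess_from_trace(tr):
    # correlate the whole trace against the oscillation template
    score = sum(v * math.sin(2 * math.pi * t / period) for t, v in enumerate(tr))
    return score > amp * T / 4      # midpoint between the two score means

trials = 2000
snap = dyn = 0
for _ in range(trials):
    osc = random.random() < 0.5
    tr = trace(osc)
    snap += guess_from_snapshot(tr) == osc
    dyn += guess_from_trace(tr) == osc

snap_acc = snap / trials   # ~0.5: a single frame reads noise
dyn_acc = dyn / trials     # near 1.0: the code is written in time
```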

But temporal coding only partially resolves the paradox. The deeper answer is that individual pathways are not the unit of biological computation — networks are. The Drosophila gap gene network is the clearest demonstration. The Bicoid morphogen gradient along the anterior-posterior axis is noisy: Gregor and colleagues (Cell, 2007) measured that Bicoid concentration varies roughly 10% between embryos at equivalent positions. Yet the Hunchback boundary — the position at which Hunchback expression switches from high to low — varies less than 1% of embryo length across embryos. A tenfold reduction in positional error, achieved by the network. How? The four gap genes — Hunchback, Krüppel, Giant, Knirps — cross-repress each other and self-activate, effectively computing a spatial derivative of the Bicoid gradient rather than simply reading its amplitude. Dubuis, Tkačik, Wieschaus, Gregor, and Bialek (PNAS, 2013) showed that four gap genes together encode approximately 4.5 bits of positional information — nearly enough to uniquely identify each of the roughly 50 nuclei per row along the axis, distributed nearly uniformly, which is consistent with an optimization principle operating at the level of the network architecture.

This is the key shift. A signaling pathway is a noisy channel. A gene regulatory network is a computation — a transfer function that maps an input trajectory to an output trajectory. The gap gene network does not passively transmit Bicoid concentration. It filters it, sharpens it, and encodes it in a form suitable for downstream fate decisions. That computation is the biological information processing unit, and it only exists as a relationship between variables over time: the network's output at one moment depends on the history of its inputs, the states of its neighbors, and its own past states. You cannot reconstruct a transfer function from a single output measurement. Mathematically, this is identical to the reason you cannot identify a differential equation from one data point. The implication, which the next post in this series develops directly, is that understanding what a regulatory network is computing requires trajectory data — what signals the cell accumulated, what states it passed through — not a snapshot of where it ended up.

Abstract waves suggesting signal oscillation and frequency encoding

NF-κB oscillates at a roughly 90-minute period regardless of stimulus intensity. Frequency modulation, not amplitude, is the primary information-carrying mode. A single snapshot reads less than one bit.

interactive Signaling channel — how much information does a pathway carry?
Left: two input states (e.g. TNF absent / present). Right: distributions of the cellular output (NF-κB level) after passing through the noisy signaling channel. When the distributions overlap heavily, the output is ambiguous — the cell cannot reliably infer which input it received. The estimated bits transmitted updates in real time. The NF-κB pathway sits at ~0.92 bits — the noise slider shows why.

The implications for how we study signaling are significant. If a pathway carries less than one bit at any single moment, then single-cell snapshots of pathway activation state — the standard readout of most cell biology experiments — are extremely noisy measurements of the biological variable you actually care about. The signal is in the dynamics. A cell that pulses NF-κB four times in 30 minutes is receiving different information from a cell that pulses once, even if you cannot tell them apart in a fixed-timepoint assay. This is one of the reasons that the recorder tools described in the next post — systems that write transcriptional history into DNA or protein — are not a marginal improvement over standard single-cell sequencing. They are accessing a fundamentally different class of information.


The Waddington landscape is not equilibrium physics

The Waddington landscape is the most widely reproduced image in developmental biology. A ball rolling down branching valleys. Each valley a cell fate. The visual intuition is immediately compelling, and most presentations leave it there — a metaphor for canalization and commitment, made more rigorous by noting that valleys correspond to gene expression attractors and barriers to the energetic cost of reprogramming.

The problem with this presentation is that it imports an assumption from equilibrium statistical mechanics that cells violate constantly: the assumption that the system is at detailed balance. In an equilibrium system, the probability of finding the system in any state x is proportional to exp(−U(x)/kT), where U is a potential function. The probability current between any two states is zero at steady state. Paths between states are time-reversible.

None of this is true for cells. Living cells continuously consume energy — ATP hydrolysis, proton gradients, GTP-driven signaling cascades. They are driven, dissipative, far-from-equilibrium systems. Jin Wang and colleagues derived the consequence rigorously (PNAS, 2008, and a series of papers through 2024): the steady-state probability distribution of cell states is governed not by a single potential function but by two forces — a gradient force from a quasi-potential U = −ln P_ss, and a rotational probability flux J that circulates through the landscape even at steady state. This curl flux breaks time-reversal symmetry. It means that the optimal path from one cell fate to another does not follow naive gradient descent on the landscape, and does not necessarily pass through the saddle point between basins.
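The broken symmetry is measurable in simulation. This sketch uses a linear toy system, not a gene network: a rotational, non-gradient component ω is added to a simple gradient force, and the mean angular drift along a stochastic trajectory is zero at equilibrium (ω = 0, detailed balance) but persistently nonzero when driven.

```python
import math, random
random.seed(2)

def mean_circulation(omega, D=0.2, dt=0.005, steps=300000):
    # Ornstein-Uhlenbeck dynamics with a rotational (curl) part:
    #   dx = (-x + omega*y) dt + sqrt(2 D dt) xi
    #   dy = (-y - omega*x) dt + sqrt(2 D dt) xi
    # omega = 0: pure gradient force, detailed balance, zero circulation
    x = y = 0.0
    circ = 0.0
    s = math.sqrt(2 * D * dt)
    for _ in range(steps):
        fx = -x + omega * y
        fy = -y - omega * x
        circ += x * fy - y * fx        # instantaneous angular drift
        x += fx * dt + s * random.gauss(0, 1)
        y += fy * dt + s * random.gauss(0, 1)
    return circ / steps

eq = mean_circulation(0.0)     # equilibrium: time-reversible, exactly zero drift
neq = mean_circulation(1.0)    # driven: persistent probability current
```

The nonzero value is the curl flux in miniature: the steady-state distribution alone (a Gaussian blob in both cases) cannot distinguish the two systems; only the motion can.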

Wang's group validated this prediction against actual single-cell transcriptomic data in PNAS (2024): reprogramming paths from fibroblasts to iPSCs deviate from gradient descent in ways that the flux correctly predicts, and the flux determines whether reprogramming succeeds. Stem cells are maintained at higher non-equilibrium dissipation than differentiated cells — pluripotency costs energy not just metabolically but in the thermodynamic sense of maintaining a state that entropy alone would not favor.

The practical consequence is that knowing the quasi-potential landscape is not enough to predict how differentiation will proceed or how to optimally drive reprogramming. You also need the flux. And measuring the flux requires observing not where cells are but how they move — trajectory data, not snapshots.


Perfect adaptation: how cells maintain set points despite noise

The last framework is control theory. It is perhaps the most directly translatable to engineering, and it may be the most broadly applicable to developmental biology once the measurement tools catch up to the theory.

Bacterial chemotaxis is the canonical demonstration of biological feedback control, and specifically of one of its most powerful variants: integral feedback. When E. coli is exposed to a step increase in attractant concentration, its receptor activity increases initially, then returns exactly to its prestimulus level — regardless of how large the stimulus was. This is perfect adaptation. Barkai and Leibler (Nature, 1997) proved that this precision is structural: it is a property of the network topology, not of parameter tuning. It works even when the rate constants vary by a factor of two or three. Yi, Huang, Simon, and Doyle (PNAS, 2000) identified the mechanism: the methylation enzyme CheR modifies all receptors at a constant rate, while the demethylation enzyme CheB acts only on active receptors. This asymmetry means the methylation state accumulates a running integral of the error signal — the deviation of receptor activity from its set point. At steady state, the integral of the error must be zero. So receptor activity returns exactly to baseline. Structurally. Always.
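The adaptation logic fits in a dozen lines. This sketch is a generic integral controller rather than a mechanistic chemotaxis model: the variable m plays the role of the methylation state, accumulating the integrated error, and the output returns exactly to its set point whatever the size of the step disturbance.

```python
def simulate(d_after, y_ref=1.0, k=0.5, T=200.0, dt=0.01):
    # plant: dy/dt = m + d - y      (d: persistent disturbance, m: controller state)
    # integral controller: dm/dt = k * (y_ref - y)
    # at steady state dm/dt = 0 forces y = y_ref, regardless of d
    y, m, d = y_ref, y_ref, 0.0
    peak = y_ref
    for i in range(int(T / dt)):
        if i * dt > 10:
            d = d_after               # step disturbance at t = 10
        y += (m + d - y) * dt
        m += k * (y_ref - y) * dt
        peak = max(peak, y)
    return y, peak

y_small, peak_small = simulate(0.5)   # small step: transient bump, exact return
y_big, peak_big = simulate(3.0)       # large step: bigger bump, same exact return
```

The transient scales with the disturbance; the final value does not. That is the structural, parameter-independent precision Barkai and Leibler identified.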

This is literally an integral controller — the I in PID control — implemented in biochemistry. The same principle appears across biology at every scale. Mustafa Khammash's group at ETH Zurich formalized it for synthetic biology (Briat, Gupta, and Khammash, Cell Systems, 2016), designing an "antithetic integral feedback" motif using two molecular species that bind and annihilate each other stoichiometrically, proving it guarantees robust perfect adaptation regardless of intrinsic biochemical noise. Aoki and colleagues realized this experimentally in E. coli (Nature, 2019). Frei and colleagues extended it to mammalian cells (PNAS, 2022) — the first PI controller implemented in human cells, maintaining defined transcription factor expression at set points across perturbations.
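The antithetic motif itself is short enough to simulate deterministically (hypothetical rate constants; the stochastic guarantee in the paper is the hard part): two species are produced at rates μ and θ·y and annihilate each other on binding, so at steady state μ = θ·y and the output settles at μ/θ regardless of how the downstream degradation rate is perturbed.

```python
def antithetic(gamma, mu=2.0, theta=1.0, eta=50.0, k=1.0, T=400.0, dt=0.001):
    # z1 produced at rate mu, z2 at rate theta*y; the pair annihilates at eta*z1*z2.
    # z1 drives the output y, which decays at rate gamma.
    # Steady state of z1 and z2 forces mu = theta*y, i.e. y -> mu/theta,
    # independent of gamma and k (toy parameters, chosen for stability).
    z1 = z2 = 0.1
    y = 1.0
    for _ in range(int(T / dt)):
        ann = eta * z1 * z2
        dz1 = mu - ann
        dz2 = theta * y - ann
        dy = k * z1 - gamma * y
        z1 += dz1 * dt
        z2 += dz2 * dt
        y += dy * dt
    return y

y_base = antithetic(gamma=1.0)       # output settles at mu/theta = 2.0
y_perturbed = antithetic(gamma=3.0)  # tripled degradation: same set point
```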

Ben-Zvi and Barkai (PNAS, 2010) showed that morphogen gradient scaling during organ growth uses the same principle natively. The expander-repressor motif — a slowly diffusing expander mediates scaling of a morphogen gradient to tissue size — implements integral feedback. The gradient adjusts to maintain the correct proportion of the tissue expressing each target gene, regardless of how fast the tissue is growing. Development does not read fixed concentrations. It integrates deviations from a set point and corrects them. The feedback architecture is built in.

The reason this matters for tissue engineering is specific. An organoid developing outside the embryo has lost access to the mechanical boundaries, systemic gradients, and tissue-level feedbacks that implement these control laws in vivo. What remains is the cell-autonomous part of the regulatory network — the attractor structure — operating without the error-correction architecture that normally keeps it on track. The next post in this series argues that providing that architecture externally, by monitoring signaling dynamics in real time and adjusting boundary conditions before bifurcation points are crossed incorrectly, is the key engineering problem that sits upstream of every other organoid challenge: vascularization, maturation, innervation. You cannot solve those if you cannot first guide the developmental trajectory to the right attractor basin.

interactive Integral feedback — perfect adaptation in biochemical networks
The controlled output (dark red) tracks the set point (dashed line) exactly — this is perfect adaptation, the same mechanism in E. coli chemotaxis and morphogen gradient scaling. Move the set point slider to change the target. Apply a perturbation and watch the integral term (blue) drive the system back precisely. The return is exact regardless of perturbation magnitude — that is what makes integral feedback special.

Khammash's lab went further and built a real-time external control loop around living cells — the Cyberloop (Nature Communications, 2021). Cells' fluorescence is measured continuously, a computer computes the control input, and light pulses are delivered to optogenetic actuators inside the cell. Lugagne, Blassick, and Dunlop (Nature Communications, 2024) extended this to deep model predictive control using neural networks, imposing arbitrary defined gene expression dynamics on thousands of individual cells simultaneously. This is not metaphor. It is control engineering running on living cells in real time.

The implication for organ engineering is direct. Organoids fail not because cells lack the capacity to self-organize — every organoid experiment proves they have it — but because we cannot monitor the developmental process as it unfolds and correct early deviations before they propagate. The biology already implements feedback control natively. What it lacks, when running outside the embryo, is the measurement layer that tells it whether it has reached the right set point. The frameworks described here are ready to specify what measurements matter. The tools to make those measurements are what the next generation of molecular recorders — the subject of the preceding post — is being built to provide.


The same system, all at once

iPSC reprogramming — taking a differentiated skin cell and driving it back to a pluripotent state with four transcription factors — has become the proving ground for every framework described here, because it is the clearest example of a cellular state transition that we can initiate on demand and watch in real time.

The attractor framework says: fibroblast and iPSC are stable attractors separated by a high barrier. Reprogramming requires crossing that barrier. The Yamanaka factors reshape the landscape by modifying the regulatory interactions, lowering the barrier. Jost, Sáez, and colleagues (2019) showed that reprogramming follows a single reaction coordinate consistent with barrier-crossing dynamics. The stochastic framework explains the sub-1% efficiency: even with the barrier lowered, crossing requires a sufficient accumulation of stochastic kicks in the right direction. Most cells fluctuate indefinitely near the fibroblast attractor and never escape. The Kramers escape rate — exp(−ΔU/D) — depends exponentially on barrier height divided by noise. Information theory says that pluripotent cells have maximal signaling entropy (Teschendorff and Enver, 2017) and that reprogramming must open up a committed, low-entropy transcriptional state into a promiscuous, high-entropy one. The non-equilibrium landscape says that optimal paths are not gradient descent, they curve with the flux — Wang's group validated this against reprogramming data. And control theory, via Ronquist and colleagues (PNAS, 2017), solved the reprogramming problem as an optimal control problem: what combination and duration of transcription factor inputs steers the cell state from fibroblast to iPSC most efficiently?
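The exponential sensitivity is worth putting numbers on (illustrative barrier heights and noise level, not measured values for reprogramming): halving a barrier does not double the escape rate, it multiplies it by an exponential factor, which is why a modest landscape deformation can turn a once-in-a-lifetime transition into a routine one.

```python
import math

def kramers_rate(dU, D):
    # Kramers escape: rate proportional to exp(-dU / D); the prefactor is
    # omitted, so only ratios of rates are meaningful here
    return math.exp(-dU / D)

D = 0.2                           # illustrative effective noise level
intact = kramers_rate(4.0, D)     # full barrier between fibroblast and iPSC
lowered = kramers_rate(2.0, D)    # toy case: factors halve the barrier
fold = lowered / intact           # exp(10), about 2.2e4-fold faster
```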

The frameworks are not competing. They are different lenses on the same process, each making predictions the others cannot. Dynamical systems without stochasticity misses the noise-driven symmetry-breaking at bifurcations. Stochasticity without information theory cannot quantify how much the decision actually depended on the input. Information theory without control theory cannot explain why some perturbations are absorbed while others derail development permanently. Control theory without the attractor landscape has no target to steer toward.

What all five frameworks share is this: they are each describing some aspect of a biological system computing a transfer function across time. Not a lookup table — "if signal X, produce output Y" — but a dynamical computation in which the output depends on the history of inputs, the topology of the regulatory network, the noise level relative to the barrier structure, and the feedback architecture maintaining the set point. That computation is what development is. It is also what we cannot see from snapshots. Sequencing gives you the output state of the computation at one moment. Live imaging gives you one or two labeled variables in real time, but not the full state, not at scale, not with molecular identity. What you need to reconstruct the transfer function is trajectory data: what the cell accumulated, what states it passed through, what its neighbors were doing, all the way from signal to commitment. That is what the molecular recording tools in the preceding post are being designed to capture, and it is what the next post in this series argues must be measured before organ engineering can move from art to engineering.