Introduction

While the growth of cell populations appears deterministic, many processes occurring at the single cell level are stochastic. Among many possibilities, stochasticity at the single cell level can arise from stochasticity in the generation times1, from stochasticity in the partition at division2,3, or from the stochasticity of single cell growth rates, which are usually linked to stochastic gene expression4. Ideally one would like to be able to disentangle the various sources of stochasticity present in experimental data5. This would allow to understand and predict how the various sources of stochasticity affect macroscopic parameters of the cell population, such as the Malthusian population growth rate6,7. Beyond this specific question, research in this field attempts to elucidate the fundamental physical constraints which control growth and divisions in cell populations.

With the advances in single cell experiments, where the growth and divisions of thousand of individual cells can be tracked, robust statistics can be acquired. New theoretical methods are needed to exploit this kind of data and to relate experiments carried out at the population level with experiments carried out at the single cell level. For instance, one would like to relate single-cell time-lapse video microscopy experiments of growing cell populations8, which provide information on all the lineages in the branched tree, with experiments carried out with the mother machine configuration, which provide information on single lineages9,10.

Let us now review quickly how the issue was addressed theoretically. In 2015, a pathwise thermodynamic framework was built for population dynamics using large deviation theory. One important result was a variational principle for the population growth rate11, which was formulated in terms of two key path distributions, namely the chronological and the retrospective probability distributions. Then, in order to explain their experimental observation that populations of Escherichia coli double faster than the mean doubling time of their constituent single cells, Hashimoto et al. extended the classical work of Powell6 for age models without mother-daugher correlations12. Nozoe et al.13 then showed that the difference between the forward (chronological) and backward (retrospective) distributions can be used to define a quantity called phenotypic fitness landscape, which informs whether a specific phenotypic trait affects the population growth rate. In that work, they also derived the key relation between the two distributions, already known in the mathematical literature of branching processes14, but they did not connect this result with the field of fluctuation relations. Various theoretical works followed which addressed other aspects of the role of the stochasticity at the single cell level3,15,16. As far as we can tell, the connection between the results of Nozoe and the field of fluctuation relations was only made explicit in our first work on this topic17. In that work, we also derived inequalities for mean generation times, already obtained in12 for age models, but importantly we proved using the fluctuation relation that they are valid beyond age models, in particular for a broad class of size models.

In the present work, we further investigate the connection between the statistics at the single lineage and population levels using fluctuation relations. These fluctuation relations only depend on the structure of the branched tree but not on the class of dynamical variables (protein concentration, cell size or cell age.) defined on it. These relations imply that the above inequalities for the mean number of divisions or mean generation times hold regardless of the specific dynamics.

We then provide an interpretation of the fluctuation relations within Stochastic Thermodynamics18 and we explore some consequences. One application concerns the inference of the population growth rate using single lineage data19. We then introduce some specific dynamical models, which we simulate to confirm the fluctuation relations and their consequences for mean generation times. Then, for these specific models and for key phenotypic variables such as the size and the age, we also study the fitness landscape13.

Theoretical framework

The backward and forward processes

Let us consider a branched tree, starting with \(N_0\) cells at time \(t=0\) and ending with N(t) cells at time t as shown on Fig. 1. We assume that all lineages survive up to time t, and therefore the final number N(t) of cells corresponds to the number of lineages in the tree.

The most natural way to sample the lineages is to put uniform weights on all of them. This sampling is called backward, (or retrospective) because at the end of the experiment one randomly chooses one lineage among the N(t) with a uniform probability and then one traces the history of the lineage backward in time from time t to 0, until reaching the ancestor population. The backward weight associated with a lineage l is defined as

$$\begin{aligned} \omega _{\text {back}}(l)=N(t)^{-1} \,. \end{aligned}$$
(1)

In a tree, some lineages divide more often than others, which results in an over-representation of lineages that have divided more often than the average. Therefore by choosing a lineage with uniform distribution, we are more likely to choose a lineage with more divisions than the average number of divisions in the tree.

The other way of sampling a tree is the forward (or chronological) one and consists in putting the weight

$$\begin{aligned} \omega _\text {for}(l)= N_0^{-1} m^{-K(l)} \,, \end{aligned}$$
(2)

on a lineage l with K(l) divisions, where m is the number of offspring at division. This choice of weights is called forward because one starts at time 0 by uniformly choosing one cell among the \(N_0\) initial cells, and one goes forward in time up to time t, by choosing one of the m offspring with equal weight 1/m at each division. The backward and forward weights are properly normalized probabilities, defined on the N(t) lineages in the tree at time t: \(\sum _{i=1}^{N(t)} \omega _{\text {back}}(l_i) = \sum _{i=1}^{N(t)} \omega _{\text {for}}(l_i) =1\).

Figure 1
figure 1

Example of a tree with \(N_0=1\) and \(N(t)=10\) lineages at time t. Two lineages are highlighted, the first in blue with 2 divisions and the second in orange with 5 divisions. The forward sampling is represented with the green right arrows: it starts at time \(t=0\) and goes forward in time by choosing one of the two daughters lineages at each division with probability 1/2. The backward sampling is pictured by the left purple arrows: starting from time t with uniform weight on the 10 lineages it goes backward in time down to time \(t=0\).

Single lineage experiments are precisely described by a forward process since experimentally, at each division, only one of the two daughter cells is conserved while the other is eliminated (for instance flushed away in a microfluidic channel9, 10). In these experiments, a tree is generated but at each division only one of the two lineages is conserved, with probability 1/2, while the rest of the tree is eliminated. This means that single lineage observables can be measured without single lineage experiments, provided population experiments are analyzed with the correct weights on lineages.

Link with the population growth rate

Since the backward weight put on a lineage depends on the number of cells at time t, it takes into account the reproductive performance of the colony but it is unaffected by the reproductive performance of the lineage considered. On the contrary, the forward weight put on a specific lineage depends on the number of divisions of that lineage but is unaffected by the reproductive performance of other lineages in the tree. Therefore, the difference between the values of the two weights for a particular lineage informs on the difference between the reproductive performance of the lineage with respect to the colony.

We now introduce the population growth rate:

$$\begin{aligned} \Lambda _t=\frac{1}{t} \ln \frac{N(t)}{N_0} \,, \end{aligned}$$
(3)

which is linked to forward weights by the relation

$$\begin{aligned} \frac{N(t)}{N_0}=\sum _{i=1}^{N(t)} m^{K_i} \ \omega _\text {for}(l_i) = \langle m^K \rangle _\text {for} \,, \end{aligned}$$
(4)

where \(\langle \cdot \rangle _\text {for}\) is the average over the lineages weighted by \(\omega _\text {for}\), and \(K_i=K(l_i)\). Combining the two equations above, we obtain19:

$$\begin{aligned} \Lambda _t=\frac{1}{t} \ln \langle m^K \rangle _\text {for} \,, \end{aligned}$$
(5)

which allows an experimental estimation of the population growth rate from the knowledge of the forward statistics only.

Equation (4) can also be re-written to express the bias between the forward and backward weights of the same lineage

$$\begin{aligned} \frac{\omega _{\text {back}}(l)}{\omega _\text {for}(l)}=\frac{m^{K(l)}}{\langle m^K \rangle _\text {for}} \,, \end{aligned}$$
(6)

which is the reproductive performance of the lineage divided by its average in the colony with respect to \(\omega _\text {for}\).

A similar relation is derived using the relation

$$\begin{aligned} \frac{N_0}{N(t)}=\sum _{i=1}^{N(t)} m^{-K_i} \ \omega _\text {back}(l_i) = \langle m^{-K} \rangle _\text {back} \,. \end{aligned}$$
(7)

Combining Eqs. (5) and (7) we obtain:

$$\begin{aligned} \Lambda _t= - \frac{1}{t} \ln \langle m^{-K} \rangle _\text {back} \,. \end{aligned}$$
(8)

A similar equation as Eq. (6) can be obtained in terms of the backward sampling and reads: 

$$\begin{aligned} \frac{\omega _{\text {back}}(l)}{\omega _\text {for}(l)}=\frac{\langle m^{-K} \rangle _\text {back}}{m^{-K(l)}} \,. \end{aligned}$$
(9)

Combining Eqs. (1) to (3), we obtain the fluctuation relation13,17:

$$\begin{aligned} \omega _{\text {back}}(l)= \omega _\text {for}(l) \ e^{K(l) \ln m - t \Lambda _t} \,. \end{aligned}$$
(10)

If we now introduce the probability distribution of the number of divisions for the forward sampling \(p_\text {for}(K)=\sum _l \delta (K-K(l)) \omega _\text {for}(l)\) and similarly for the backward sampling, we can also recast the above relation as a fluctuation relation for the distribution of the number of divisions:

$$\begin{aligned} p_{\text {back}} (K,t)=p_{\text {for}} (K,t) \ e^{K \ln m - t \Lambda _t} \,. \end{aligned}$$
(11)

Let us now introduce the Kullback–Leibler divergence between two probability distributions p and q, which is the non-negative number:

$$\begin{aligned} {{\mathscr {D}}}_{\text {KL}}(p||q)=\int {\mathrm {d}}x \, p(x) \ln \frac{p(x)}{q(x)} \ge 0 \,. \end{aligned}$$
(12)

Using Eq. (10), we obtain

$$\begin{aligned} {{\mathscr {D}}}_{\text {KL}}(\omega _{\text {back}}|| \omega _\text {for}) = \langle K \rangle _{\text {back}} \ln m - t \Lambda _t \ge 0 \,. \end{aligned}$$
(13)

A similar inequality follows by considering \({{\mathscr {D}}}_{\text {KL}}(\omega _{\text {for}}|| \omega _\text {back})\). Finally we obtain

$$\begin{aligned} \frac{t}{\langle K \rangle _{\text {back}}} \le \frac{\ln m}{\Lambda _t} \le \frac{t}{\langle K \rangle _\text {for}} \,. \end{aligned}$$
(14)

In the long time limit, \(\lim \nolimits _{t \rightarrow + \infty } t/\langle K \rangle _{\text {back}} = \langle \tau \rangle _{\text {back}}\), where \(\tau\) is the inter-division time, or generation time, defined as the time between two consecutive divisions on a lineage. The same argument goes for the forward average. In the case of cell division where each cell only gives birth to two daughter cells (\(m=2\)), the center term in the inequality tends to the population doubling time \(T_d\). Therefore, this inequality reads in the long time limit:

$$\begin{aligned} \langle \tau \rangle _{\text {back}} \le T_d \le \langle \tau \rangle _\text {for} \,. \end{aligned}$$
(15)

Let us now mention a minor but subtle point related to this long time limit. For a lineage with K divisions up to time t, we can write \(t=a + \sum _{i=1}^{K} \tau _i\), where a is the age of the cell at time t and where \(\tau _i\) is the generation time associated with the ith division. Then \(t/ K= \tau _m + a/K\), where \(\tau _m\) is the mean generation time along the lineage. For finite times, all we can deduce is \(t/ K \ge \tau _m\). Therefore the left inequality of Eq. (15) always holds

$$\begin{aligned} \langle \tau \rangle _{\text {back}} \le \frac{t}{\langle K \rangle _{\text {back}}} \le \frac{\ln m}{\Lambda _t} \,, \end{aligned}$$
(16)

while the right inequality does not necessarily hold at finite time.

Inspired by work by Powell6, the inequalities of Eq. (15) have been theoretically derived in12 for age models. In our previous work17, we have replotted the experimental data of12 which confirm theses inequalities and we have shown theoretically that the same inequalities should also hold for size models. In fact, as the present derivation shows, the relation equation (14) is very general and only depends on the branching structure of the tree, while the relation equation (15) requires in addition the existence of a steady state. These inequalities and Eq. (11) express fundamental constraints between division and growth, which should hold for any model.

Stochastic thermodynamic interpretation

The results derived above have a form similar to that found in Stochastic Thermodynamics18. According to this framework, Eq. (5) is an integral fluctuation relation (similar to Jarzynski relation) while Eq. (11) is a detailed fluctuation relation (similar to Crooks fluctuation relation). Furthermore, the inequalities equation  (14) represent a constraint equivalent to the second law of thermodynamics, which classically follows from the Jarzynski or Crooks fluctuation relations. It is known that these inequalities take a slightly different form when expressed at finite time or at steady state, which is indeed the case here when comparing Eq. (14) with Eq. (15). A difference between work fluctuation relations like Crooks or Jarzynski and equations (5) and (11), is that Crooks or Jarzynski describe non-autonomous systems which are driven out of equilibrium by the application of a time-dependent protocol, whereas the relations for cell growth derived here concern autonomous systems, in the absence of any external protocol.

One of the main applications of Jarzynski or Crooks fluctuation relations concerns the thermodynamic inference of free energies from non-equilibrium fluctuations. Similarly, Eq. (5) or Eq. (11) can be used as estimators of the population growth rate. The specific advantage of Eq. (5) with respect to Eq. (11) is that it only requires single lineage statistics, which can be obtained from mother machine experiments. Let us now show how this can be done in practice. We use the data from20, where the growth of many independent lineages of E. coli have been recorded over 70 generations in a mother machine at three different temperatures (25 °C, 27 °C, and 37 °C), precisely 65 lineages for 25 °C, 54 for 27 °C, and 160 for 37 °C. For each temperature condition, we study the convergence of the estimator of the population growth rate based on Eq. (5), which we call \(\Lambda _{\mathrm{lin}}\) as a function of the length t of the lineages for a fixed number of independent lineages L, and as a function of the number of independent lineages for a fixed observation time.

Figure 2
figure 2

Estimator of the population growth rate \(\Lambda _{\mathrm{lin}}\) based on Eq. (5), (a) as function of the the length t of the lineages and (b) as function of the number L of lineages used in the estimation. In (a), the curves for the three temperatures converge to a constant value. In (b), only the curve for 37 °C is shown and the horizontal dashed line represents the quantity \(\ln (2)/\langle \tau \rangle _{\text {for}}\), which is smaller than the limit value of \(\Lambda _{\mathrm{lin}}\), as expected from the second law-like inequality, namely Eq. (15). In the inset, the purple histogram is the distribution of the number of divisions, while the green filled histogram is the histogram deduced from it by weighting it by a factor \(2^K\) and normalizing. All the 160 lineages were used to plot these histograms.

Firstly, for each temperature, we take into account all the lineages available and truncate them at an arbitrary time t smaller than the length of the shortest lineage of the set. On these portions of lineages of length t, we compute \(\Lambda _{\mathrm{lin}}\) versus the time t as shown in Fig. 2a. We see that the estimator \(\Lambda _{\mathrm{lin}}\) starts from zero, increases and eventually converges rather quickly towards a limiting value. The limit we found agree with the independent analysis carried out in19, with only one caveat, these authors reported that their estimator started at high values and then decreased towards the limit, while in our case, the estimator starts at zero and later increases towards the limit. In our case, the estimator needs to be zero at short times, before the first divisions occur.

Secondly, we truncate all the lineages at a fixed time equal to the length of the shortest lineage of the set, and compute \(\Lambda _{\mathrm{lin}}\) versus the number L of lineages considered for the estimation, which have been randomly selected from the ensemble of available lineages. As shown in Fig. 2b for the case at \(37^{\,\circ } \hbox {C}\) (curves for the other temperatures look exactly the same), the convergence is also excellent in that case. Although the value of the population growth rate which is obtained in this way can not be measured independently from the evolution of the population in the mother machine setup, this convergence is indicative of the success of the method. The figure also confirms that the value of the population growth rate deduced from the estimator \(\Lambda _{\mathrm{lin}}\) is larger than \(\ln (2)/\langle \tau \rangle _{\text {for}}\), as predicted by the right inequality of Eq. (15).

Here, the estimator is found to provide an excellent estimation, but this is not always so. For instance, for the inference of free energies from non-equilibrium work measurements, the exponential average of the estimator is often dominated by rare values, which are not accessible or not well sampled21. To understand why this problem does not arise here, we show in inset of Fig. 2b, the distribution P(K) of the number of divisions together with the same distribution weighted by the factor \(2^K\) and normalized. The peak of that modified distribution informs on the dominant values in the estimator21. Here, we observe that both distributions have a narrow support and are close to each other. The weighted distribution is peaked at \(K=67\) while P(K) is peaked at \(K=66\), therefore typical and dominating values are very close, which explains why the estimator is good.

Let us now further develop the Stochastic Thermodynamic interpretation of our results by analyzing the implications of the previous fluctuation relations when dynamical variables are introduced on the branched tree of the population. Let us introduce M variables labeled \((y_1,y_2, \ldots ,y_M)\) to describe a dynamical state of the system, then a path is fully determined by the values of these variables at division, and the times of each division. We call \({\mathbf {y}}(t)=(y_1(t),y_2(t), \ldots ,y_M(t))\) a vector state at time t and \(\{{\mathbf {y}}\}=\{{\mathbf {y}}(t_j)\}_{j=1}^{K}\) a path with K divisions. For cell growth models, the variables \(y_i\) can typically be the size and age of the cell, or the concentration of a key protein.

The probability \({{\mathscr {P}}}\) of path \(\{{\mathbf {y}}\}\) is defined as the sum over all lineages of the weights of the lineages that follow the path \(\{{\mathbf {y}}\}\):

$$\begin{aligned} {{\mathscr {P}}}(\{{\mathbf {y}}\},K,t)=\sum _{i=1}^{N(t)} \omega (l_i) \, \delta (K-K_i) \delta (\{{\mathbf {y}}\} - \{{\mathbf {y}}\}_i) \,, \end{aligned}$$
(17)

where \(\{{\mathbf {y}}\}_i\) is the path followed by lineage \(l_i\). Using the normalization of the weights \(\omega\) on the lineages, we show that \({{\mathscr {P}}}\) is properly normalized: \(\int \mathrm {d}\{{\mathbf {y}}\} \sum _K {{\mathscr {P}}}(\{{\mathbf {y}}\},K,t) = 1\). We then define the number \(n(\{{\mathbf {y}}\},K,t)\) of lineages in the tree at time t that follow the path \(\{{\mathbf {y}}\}\) with K divisions:

$$\begin{aligned} n(\{{\mathbf {y}}\},K,t)=\sum _{i=1}^{N(t)} \delta (K-K_i) \delta (\{{\mathbf {y}}\} - \{{\mathbf {y}}\}_i) \,. \end{aligned}$$
(18)

This number of lineages is normalized as \(\int \mathrm {d}\{{\mathbf {y}}\} \sum _K n(\{{\mathbf {y}}\},K,t) = N(t)\). Then, the path probability can be re-written as

$$\begin{aligned} {{\mathscr {P}}}(\{{\mathbf {y}}\},K,t) = n(\{{\mathbf {y}}\},K,t) \cdot \omega (l) \,. \end{aligned}$$
(19)

Since \(n(\{{\mathbf {y}}\},K,t)\) is independent of a particular choice of lineage weighting, we obtain

$$\begin{aligned} \frac{{{\mathscr {P}}}_{\text {back}}(\{{\mathbf {y}}\},K,t)}{{{\mathscr {P}}}_\text {for} (\{{\mathbf {y}}\},K,t)}=\frac{\omega _{\text {back}}(l)}{\omega _\text {for}(l)}= \ e^{K \ln m - t \Lambda _t} \, , \end{aligned}$$
(20)

which generalizes Eq. (11). In our previous work17, we have derived this relation for size models with individual growth rate fluctuations (i.e. \({\mathbf {y}}=(x,\nu )\)) but we were not aware of the weighting method introduced by13, and for this reason, we used the term ‘tree’ to denote the backward sampling, and the term ‘lineage’ to denote the forward sampling.

This relation has a familiar form in Stochastic Thermodynamics. The central quantity called entropy production can indeed be expressed similarly as the relative entropy between probability distributions associated with a forward and a backward evolution. In this analogy, \(\{{\mathbf {y}}\}\) is analog to the trajectory and \(t \Lambda _t - K \ln m\) is analog to the entropy production. Then, the equivalent of a reversible trajectory for which the entropy production is null is a lineage for which the number K of divisions is equal to \(t \Lambda _t / \ln m\), that is, a lineage having the same reproductive performance as that of the colony. When all the lineages in a tree have this property, there is no variability of the number of divisions among them. In that case, the forward and backward distributions are identical, and the cost function \(t \Lambda _t - K \ln m\) vanishes for all lineages.

Mixed age-size controlled models

Dynamics at the population level

The state of a cell is described by its size x, its age a and its individual growth rate \(\nu\), with \({\mathbf {y}}=(x,a,\nu )\). Such mixed size-age model includes the ‘adder’ in which the cell divides after adding a constant volume to its birth volume22,23,24,25.

The evolution of the number of cells \(n({\mathbf {y}},K,t)\) in the state \({\mathbf {y}}\) at time t, that belong to a lineage with K divisions up to time t is governed by the equation

$$\begin{aligned} \left( \partial _t + \partial _a \right) n({\mathbf {y}},K,t) +\partial _x \left[ \nu x n({\mathbf {y}},K,t) \right] + B({\mathbf {y}})n({\mathbf {y}},K,t) =0 \,, \end{aligned}$$
(21)

and the boundary condition

$$\begin{aligned} n(x,a=0,\nu ,K,t)= m \int {\mathrm {d}}{\mathbf {y}}' B({\mathbf {y}}') \Sigma ({\mathbf {y}}|{\mathbf {y}}') n({\mathbf {y}}',K-1,t) \,, \end{aligned}$$
(22)

where \(B({\mathbf {y}})\) is the division rate and \(\Sigma ({\mathbf {y}}|{\mathbf {y}}')\) is the conditional probability (also called division kernel) for a newborn cell to be in state \({\mathbf {y}}\) knowing its mother divided while in state \({\mathbf {y}}'\), normalized as \(\int \Sigma ({\mathbf {y}}|{\mathbf {y}}') \mathrm {d}{\mathbf {y}} =1\), for any \({\mathbf {y}}'\).

Dynamics at the probability level

While \(n({\mathbf {y}},K,t)\) in Eq. (21) is independent of the choice of weights put on the lineages, we now turn to a description in terms of the probability \(p({\mathbf {y}},K,t)\) for a cell to be in state \(({\mathbf {y}},K)\) at time t if chosen randomly among the N(t) cells in the tree at that time. To do so, one has to choose how to weight each cell in the colony, which is equivalent to weight each lineage, since at time t each cell is the ending point of one lineage.

The first possibility is the backward sampling, for which each lineage is weighted uniformly. In this case, we define \(p_{\text {back}}\) as

$$\begin{aligned} p_{\text {back}}({\mathbf {y}},K,t)=\frac{n({\mathbf {y}},K,t)}{N(t)} \,. \end{aligned}$$
(23)

Dividing Eq. (21) and the boundary condition equation (22) by N(t) we obtain

$$\begin{aligned} \left( \partial _t + \partial _a \right) p_{\text {back}}({\mathbf {y}},K,t) +\partial _x \left[ \nu x p_{\text {back}}({\mathbf {y}},K,t) \right] + \left[ B({\mathbf {y}}) + \Lambda _p(t) \right] p_{\text {back}}({\mathbf {y}},K,t) =0 \,, \end{aligned}$$
(24)

and

$$\begin{aligned} p_{\text {back}}(x,a=0,\nu ,K,t)= m \int {\mathrm {d}}{\mathbf {y}}' B({\mathbf {y}}') \Sigma ({\mathbf {y}}|{\mathbf {y}}') p_{\text {back}}({\mathbf {y}}',K-1,t) \,, \end{aligned}$$
(25)

where we defined the instantaneous population growth rate as

$$\begin{aligned} \Lambda _p(t)=\frac{{\dot{N}}}{N} \,. \end{aligned}$$
(26)

The instantaneous population growth rate and the population growth rate defined in Eq. (3) are related by:

$$\begin{aligned} \Lambda _t=\frac{1}{t} \int _{0}^{t} \Lambda _p(t') {\mathrm {d}}t' \,. \end{aligned}$$
(27)

In the long-time limit, N grows exponentially with constant rate \(\Lambda _p\), and thus \(\Lambda _t=\Lambda _p=\Lambda\).

The other possibility is to use the forward statistics, in which case we define the probability \(p_{\text {for}}\), as

$$\begin{aligned} p_{\text {for}}({\mathbf {y}},K,t)=\frac{n({\mathbf {y}},K,t)}{m^K} \,. \end{aligned}$$
(28)

Dividing Eq. (21) and the boundary condition equation (22) by \(m^K\) we obtain

$$\begin{aligned} \left( \partial _t + \partial _a \right) p_{\text {for}}({\mathbf {y}},K,t) +\partial _x \left[ \nu x p_{\text {for}}({\mathbf {y}},K,t) \right] + B({\mathbf {y}}) p_{\text {for}}({\mathbf {y}},K,t) =0 \,, \end{aligned}$$
(29)

and

$$\begin{aligned} p_{\text {for}}(x,a=0,\nu ,K,t)= \int {\mathrm {d}}{\mathbf {y}}' B({\mathbf {y}}') \Sigma ({\mathbf {y}}|{\mathbf {y}}') p_{\text {for}}({\mathbf {y}}',K-1,t) \,. \end{aligned}$$
(30)

One can notice that the backward statistics is well suited to study the population, while the forward statistics reproduce the behaviour of single lineage experiments. Indeed, by taking Eqs. (24) and (25) for the population/backward probability \(p_{\text {back}}\), and choosing \(\Lambda _p(t)=0\) and \(m=1\) we recover Eqs. (29) and (30). This equation is then a population equation in which we follow only one cell, so that \(\Lambda _p(t)=0\) and \(m=1\), which we call single lineage experiment.

Illustration of the fluctuation relation

We simulated the time evolution of colonies of cells, obeying Eqs. (21) and (22), for age and size models in order to illustrate the fluctuation relation. Since results are very similar—as expected—for age models, we restrict ourselves to size models. We tested two results: the fluctuation relation for the number of divisions Eq. (11) and one of its consequences: the inequality for the mean number of divisions Eq. (14).

All simulations for size models were conducted with the division rate \(B(x,\nu )=\nu x^{\alpha }\), where \(\alpha\) is the strength of the control and x is the dimensionless size. Power law were found to be good approximations for empirical division rates B(x)2,24,26. The factor \(\nu\), being the only time scale for size models, gives B(x) its proper dimension. Similarly for age models26, we choose \(B(a,\nu )=\nu a^{\alpha }\).

On Fig. 3a, the backward and forward probability distributions of the number of divisions are shown for a size model. The two distributions intersect at the number of divisions \(K=t \Lambda _t / \ln 2\). The inset of Fig. 3a shows the logarithm of the ratio \(q(K,t)=p_{\text {for}} (K,t)/p_{\text {back}} (K,t)\) of the two distributions, which is as expected a straight line of slope \(- \ln 2\) when plotted against the number of divisions. For convenience and for Fig. 3a only, noise in the volume partition at division has been introduced, by choosing for the conditional probability \(\Sigma (x | x')\) a uniform distribution between sizes \(x=0\) and \(x=x'\). This has the effect of broadening the distributions P(K) with respect to the case of deterministic symmetrical volume partition.

Then, we tested the inequality on the mean numbers of divisions by varying the strength of the size-control \(\alpha\). Results are shown on Fig. 3b. One one hand, we see that the less control on size, the more discrepancy between the two determinations \(\langle K \rangle _{\text {back}}\) and \(\langle K \rangle _{\text {for}}\). On the other hand, when increasing the control, the two determinations converge to the population doubling time, where no stochasticity in the number of divisions is left, and every lineage carries the same number of divisions, leading to the equality of the backward and forward statistics.

Figure 3
figure 3

Illustration of (a) the fluctuation relation for the number of divisions and (b) the inequality on the mean number of divisions, for a size-controlled model with division rate \(B(x)=\nu x^{\alpha }\). In (a), the forward distribution is shown as orange filled histogram and the backward distribution as blue empty histogram. The vertical dashed line at \(K=t \Lambda _t / \ln 2\) is the theoretical value at which the two distributions should intersect. The inset shows the logarithm of the ratio q(K) of the forward to backward probabilities (purple crosses), and the theoretical result \(t \Lambda _t-K \ln 2\) (green line). The conditional probability \(\Sigma (x | x')\) is uniform between sizes 0 and \(x'\). The simulation was conducted with constant \(\nu =1\), \(\alpha =2\) and \(t=7\). In (b), the quantity \(t/\langle K \rangle\), re-scaled by the population doubling time \(T_d=\ln 2 / \Lambda _t\), is plotted when \(\alpha\) is varied from 1 to 15, with orange diamonds for the forward sampling and with blue circles for the backward sampling. The volume partition between the two daughter cells at division is symmetrical, so that \(\Sigma (x | x')=\delta (x-x'/2)\). The simulation was conducted with constant \(\nu =1\), and \(t=6\).

Phenotypic fitness landscapes

The fitness of a phenotypic trait s is a measure of the reproductive success of individuals carrying it. It is usually defined as the number of offsprings of one individual with a given value of the trait and is quite difficult to evaluate. Nozoe et al. suggested that one way to measure it could be to compare the chronological and retrospective marginal probabilities13 and accordingly defined it as:

$$\begin{aligned} h(s)=\Lambda _t + \frac{1}{t} \ln \left[ \frac{P_{\text {back}}(s)}{P_{\text {for}}(s)} \right] \,, \end{aligned}$$
(31)

so that

$$\begin{aligned} P_{\text {back}}(s)=P_{\text {for}}(s) \exp \left[ (h(s)-\Lambda _t)t \right] \,. \end{aligned}$$
(32)

This has again the form of a fluctuation relation similar to Eq. (11), except for the replacement of the factor \(K \ln 2 /t\) by the function h(s). This suggests that the fitness landscape h(s) plays a role similar to that of an effective division rate, which depends on the trait s. In line with this interpretation, in the particular case where \(s=K\), Eq. (11) leads to \({\tilde{h}}(K)=K \ln 2 /t\), where the fitness landscape for trait K is called the lineage fitness and is written \({\tilde{h}}\). In a branched tree, lineages with a large number of divisions K are exponentially over-represented in the population with the backward sampling as compared to the forward sampling. This means that lineages with large K have a larger fitness than the ones with a small K, which is coherent with \({\tilde{h}}(K)\) being an increasing function of K.

In the following, we rewrite the definition of h(s) in a slightly different way17 using

$$\begin{aligned} P_{\text {back}}(s) =e^{-t \Lambda _t} P_{\text {for}}(s) \sum _K 2^K R_{\text {for}}(K|s), \end{aligned}$$
(33)

where we have introduced the probability of the number of division events conditioned on trait s at the forward level, \(R_{\text {for}}(K|s)\). Lastly, the fitness landscape reads17

$$\begin{aligned} h(s)=\frac{1}{t}\ln \left[ \sum _K 2^K R_{\text {for}}(K|s) \right] \,. \end{aligned}$$
(34)

An increasing or decreasing fitness landscape means a positive or negative correlation of the trait value with the capacity to divide, whereas a constant fitness landscape means that the trait is not correlated with the number of divisions. Indeed, if we consider a trait s which does not affect the number K of divisions, then \(R_{\text {for}}(K|s)=P_{\text {for}}(K)\) and Eq. (34) reads \(h(s)=\ln \left[ \sum _K 2^K P_{\text {for}}(K) \right] /t\), which is equal to \(\Lambda _t\) according to Eq. (5). In that case, we find that the backward and forward probabilities for that trait s are equal.

In the next sections, we evaluate the relevance of the key variables from our model, namely the size and the age by evaluating their fitness landscapes in size and age models.

Size models

We start with a case where the fitness landscape is fully solvable namely a size model with no individual growth rate fluctuations and with symmetric division. Let us consider a colony starting with one ancestor cell of size \(x_0\). Then, the available sizes at time t are discrete and given by \(x=x_0 \exp [\nu t] / 2^K\) where K is the number of divisions undergone by the cell. Therefore a particular size x can be reached only if there is an integer K satisfying this relation, and this integer is unique, leading to

$$\begin{aligned} R_{\text {for}}(K|x)=\delta \left( K - \frac{\ln \left[ \frac{x_0 e^{\nu t}}{x} \right] }{\ln 2} \right) \,. \end{aligned}$$
(35)

Using this relation in Eq. (34), one finds

$$\begin{aligned} h(x)= \nu + \frac{1}{t} \ln \left( \frac{x_0}{x} \right) \,. \end{aligned}$$
(36)

The fitness landscape of the size is a decreasing function, which is coherent with the over-representation of cells that divided a lot, since these cells are more likely to be small due to the numerous divisions. Reporting this result in Eq. (33), we obtain a fluctuation relation for the size

$$\begin{aligned} P_{\text {back}}(x) = e^{\left( \nu - \Lambda _t \right) t} \frac{x_0}{x} P_{\text {for}}(x) \,, \end{aligned}$$
(37)

which in the long time limit becomes

$$\begin{aligned} P_{\text {back}}(x) = \frac{x_0}{x} P_{\text {for}}(x) \, , \end{aligned}$$
(38)

where we used the property that in a steady state, the population growth rate and the individual growth rate are equal when there is no individual growth rate variability.

In some setups, experiments do not start with a unique ancestor cell but with \(N_0 > 1\) initial cells, with possibly heterogeneous sizes. We describe this heterogeneity by the average initial size \(\langle x_0 \rangle\) and the standard deviation \(\sigma _{x_0}\). In this case, accessible sizes are still discrete but depend on both the number of divisions and the initial cell that started the lineage, and are expressed as \(x_0^i \exp [\nu t] /2^K\), where K takes integer values from 0 to \(\infty\) and where \(x_0^i \in {{\mathscr {X}}}_0\), with \({{\mathscr {X}}}_0\) the set of initial sizes. Consequently, a final size x can possibly be reached by different couples \((K_i,x_0^i)\).

In order to go further, we now introduce explicitly the initial sizes \(x_0^i\) in Eq. (34) as

$$\begin{aligned} h(x)&=\frac{1}{t}\ln \left[ \sum _K \sum _{i} 2^K R_{\text {for}}(K,x_0^i|x) \right] \nonumber \\&= \frac{1}{t}\ln \left[ \sum _K \sum _{i} 2^K R_{\text {for}}(K|x,x_0^i) R_{\text {for}}(x_0^i|x)\right] \,. \end{aligned}$$
(39)

When conditioning on the initial size \(x_0^i\), there is only one possible number of divisions K to reach size x, so that \(R_{\text {for}}(K|x,x_0^i)\) obeys an equation similar to Eq. (35).

Let us examine two limit cases: (i) small variability in the initial sizes and (ii) large variability in the initial sizes.

Figure 4
figure 4

Size fitness landscapes for size models with \(B(x)=\nu x^{\alpha }\), constant \(\nu =1\), \(\alpha =1\), \(t=5\), symmetrical division and \(N_0\) initial cells following a Gaussian distribution of sizes \({{\mathscr {N}}}(\langle x_0 \rangle ,\sigma _{x_0})\). The black horizontal dashed lines represent the population growth rates \(\Lambda\). In (a), there is small variability in initial sizes: \(N_0=10\), \(\sigma _{x_0}/\langle x_0 \rangle = 0.15\). The grey dotted lines represent the plateaus available for h(x), depending on the number of divisions K. The right y-axis displays the values of K corresponding to these plateaus. In (b), there is large variability in initial sizes: \(N_0=100\), \(\sigma _{x_0}/\langle x_0 \rangle = 0.5\). The green curve is the theoretical prediction, on which the simulated purple dots align.

Case (i) is characterized by a small number \(N_0\) of initial cells and a small coefficient of variation \(\sigma _{x_0}/\langle x_0 \rangle\). In this case, it is realistic to say that a final size x can only be reached by one couple \((K^*,x^*)\), because the sets of accessible sizes generated by each initial cell do not overlap. Therefore, \(R_{\text {for}}(x_0^i|x)=\delta (x_0^i-x^*)\) and so for any final size x, only one initial size \(x^*\) survives in the sum, so that Eq. (39) reads \(h(x)=\nu + \ln \left( x^*/(x^* \exp [\nu t]/2^K) \right) /t ={\tilde{h}}(K)\). Thus cells that come from lineages with the same number of divisions K have the same fitness landscape value h(x) for the size, regardless of the size \(x^*\) of the initial cell of their lineages. Thus, available values for h(x) are quantified by K and form plateaus, where points representing cells coming from different ancestors but with the same number of divisions accumulate, as shown in Fig. 4a.

Case (ii) is characterized by a large number \(N_0\) of initial cells and a large coefficient of variation \(\sigma _{x_0}/\langle x_0 \rangle\). Unlike in case (i), the sets of accessible sizes generated by each initial cell have many overlaps, so that a final size x can be reached by many different couples \((K_i,x_0^i)\). We make the hypothesis that a final size x can be reached by any initial cell with uniform probability, so that \(R_{\text {for}}(x_0^i|x)=1/N_0\). Therefore, Eq. (39) becomes

$$\begin{aligned} h(x)&=\frac{1}{t}\ln \left[ \frac{1}{N_0} \sum _{i} \frac{x_0^i e^{\nu t}}{x} \right] \nonumber \\&=\nu + \frac{1}{t}\ln \frac{\langle x_0 \rangle }{x} \,. \end{aligned}$$
(40)

This behavior was tested numerically and the result plotted on Fig. 4b confirms that the plateaus observed in case (i) are replaced by a smooth curve depending on the mean initial size.

We observe the same effect, namely the loss of the plateaus, when introducing fluctuations in individual growth rates.

Age models

Constant individual growth rate

We consider the case where the individual growth rate is constant and equal to \(\nu\). In steady-state, the forward age distribution reads (see17 where \(p_{\text {for}}(a)\) (resp. \(p_{\text {back}}(a)\)) were denoted p(a) (resp. P(a))):

$$\begin{aligned} p_{\text {for}}(a)=p_{\text {for}}(0) \, \exp \left[ -\int _{0}^{a} B(a') {\mathrm {d}}a' \right] \,. \end{aligned}$$
(41)

To find the integration constant \(p_{\text {for}}(0)\), we use the normalization of probability \(p_{\text {for}}\):

$$\begin{aligned} Z=p_{\text {for}}(0)^{-1}=\int _{0}^{\infty } {\mathrm {d}}a \exp \left[ -\int _{0}^{a} B(a') {\mathrm {d}}a' \right] \,. \end{aligned}$$
(42)

Similarly, the steady-state backward distribution of ages reads

$$\begin{aligned} p_{\text {back}}(a)=p_{\text {back}}(0) \, \exp \left[ -\Lambda a -\int _{0}^{a} B(a') {\mathrm {d}}a' \right] \,. \end{aligned}$$
(43)

In this case, the integration constant \(p_{\text {back}}(0)\) can be expressed both using the normalization of \(p_{\text {back}}(a)\), as done for the forward case, or using \(p_{\text {back}}(0)=2 \Lambda\), as shown in17.

Therefore, the ratio of the age distributions using the backward and forward statistics reads

$$\begin{aligned} \frac{p_{\text {back}}(a)}{p_{\text {for}}(a)}= 2 Z \Lambda e^{-\Lambda a} \,, \end{aligned}$$
(44)

where Z is defined in Eq. (42) and only depends on the division rate B(a). This relation has a similar form as the relation derived by Hashimoto et al.12 for the distributions of generation times, except for the extra age-independent factor \(Z \Lambda\). Finally, the fitness landscape reads

$$\begin{aligned} h(a)=\frac{1}{t} \left[ \Lambda (t-a) + \ln (2 Z \Lambda ) \right] \,. \end{aligned}$$
(45)

For the same reason as for h(x) in size models, h(a) in age models is a decreasing function of a because lineages that divided a lot are over-represented in the population and are therefore more likely to contain young cells at time t.

The initial condition does not play any role in this derivation, therefore, unlike size models, the results obtained are unchanged for any number \(N_0\) of initial cells with heterogeneous initial ages.

The above calculation is general because we did not put any constraint on B(a). Let us now go into more details by choosing a power law for the division rate: \(B(a)=\nu a^{\alpha }\). In this case, the integral of Eq. (42) is solvable and gives

$$\begin{aligned} Z=\frac{1}{\alpha +1} \left( \frac{\alpha +1}{\nu } \right) ^{\frac{1}{\alpha +1}} \Gamma \left( \frac{1}{\alpha +1} \right) \,, \end{aligned}$$
(46)

in terms of Gamma function \(\Gamma (x)\). Results are plotted on Fig. 5a, which shows that theoretical predictions for the backward and forward age distributions are in good agreement with the numerical histograms. The inset plot shows the age fitness landscape, which follows the linear behavior predicted by Eq. (45).

Let us examine the particular case of uncontrolled models, for which the division rate is constant: \(B=\nu\). This corresponds to the case \(\alpha =0\) in the power law analysis conducted above. Replacing \(\alpha\) by 0 in Eq. (46) leads to \(Z=1/\nu\); moreover in steady state \(\Lambda =\nu\), so that

$$\begin{aligned} p_{\text {back}}(a) = 2 \, p_{\text {for}}(a) \, e^{-\Lambda a} \,. \end{aligned}$$
(47)

Moreover, the distributions themselves are greatly simplified and read

$$\begin{aligned} p_{\text {for}}(a)&=\nu e^{ - \nu a} \,, \end{aligned}$$
(48)
$$\begin{aligned} p_{\text {back}}(a)&=2 \nu e^{ - 2 \nu a} \,, \end{aligned}$$
(49)

which shows that in this special case the age distributions are themselves identical with the generation time distributions.

Fluctuating individual growth rates

Another interesting extension of this calculation concerns the case of fluctuating individual growth rate \(\nu\), for which the division rate then becomes a function of a and \(\nu\): \(B(a,\nu )\). Then, steady state age distributions are17:

$$\begin{aligned} p_{\text {for}}(a)= & {} \int {\mathrm{d}}\nu \, p_{\text {for}}(0,\nu ) \, \exp \left[ -\int _{0}^{a} B(a',\nu ) {\mathrm{d}}a' \right] \,, \end{aligned}$$
(50)
$$\begin{aligned} p_{\text {back}}(a)= & {} e^{-\Lambda a} \int {\mathrm {d}}\nu \, p_{\text {back}}(0,\nu ) \, \exp \left[ -\int _{0}^{a} B(a',\nu ) {\mathrm {d}}a' \right] \,, \end{aligned}$$
(51)

where \(p_{\text {for}}(0,\nu )\) and \(p_{\text {back}}(0,\nu )\) are given by the boundary conditions:

$$\begin{aligned} p_{\text {for}}(0,\nu )= & {} \int {\mathrm {d}}a {\mathrm {d}}\nu ' B(a,\nu ') \Sigma \left( \nu | \nu ' \right) p_{\text {for}}(a,\nu ') \,, \end{aligned}$$
(52)
$$\begin{aligned} p_{\text {back}}(0,\nu )= & {} 2 \int {\mathrm {d}}a {\mathrm {d}}\nu ' B(a,\nu ') \Sigma \left( \nu | \nu ' \right) p_{\text {back}}(a,\nu ') \,. \end{aligned}$$
(53)

In the absence of mother-daughter correlations for the individual growth rate, then \(\Sigma \left( \nu | \nu ' \right) = {\hat{\Sigma }} \left( \nu \right)\), which implies that \(p_{\text {for}}(0,\nu )\) and \(p_{\text {back}}(0,\nu )\) have the same dependency in \(\nu\):

$$\begin{aligned} p_{\text {for}}(0,\nu )&= {\hat{\Sigma }} \left( \nu \right) \int {\mathrm {d}}a {\mathrm {d}}\nu ' B(a,\nu ') p_{\text {for}}(a,\nu ') \,, \end{aligned}$$
(54)
$$\begin{aligned}&={\hat{\Sigma }}\left( \nu \right) \, {\hat{Z}}^{-1} \end{aligned}$$
(55)
$$\begin{aligned} p_{\text {back}}(0,\nu )&=2 {\hat{\Sigma }} \int {\mathrm {d}}a {\mathrm {d}}\nu ' B(a,\nu ') p_{\text {back}}(a,\nu ') \end{aligned}$$
(56)
$$\begin{aligned}&= 2 {\hat{\Sigma }} \left( \nu \right) \, \Lambda \,. \end{aligned}$$
(57)

Finally, the fluctuation relation for the age reads

$$\begin{aligned} \frac{p_{\text {back}}(a)}{p_{\text {for}}(a)}= 2 {\hat{Z}} \Lambda e^{-\Lambda a} \,, \end{aligned}$$
(58)

which is the equivalent of Eq. (44) for fluctuating growth rates without mother-daughter correlations. Therefore, the age fitness landscape features the same linear dependency in age with a slope \(- \Lambda\) as in the case of constant individual growth rate.

In the general case with mother-daughter correlations, this statement is not necessarily true though, because \(p_{\text {for}}(0,\nu )\) and \(p_{\text {back}}(0,\nu )\) do not have in general the same dependency in \(\nu\).

Consequently, looking at the slope of the age fitness landscape informs on the presence of mother-daughter correlations as illustrated numerically in Fig. 5b, where the age fitness landscape for models without mother-daughter correlations aligns with the theoretical prediction of slope \(- \Lambda\); while the same function for models with mother-daughter correlations presents a non-linear age dependency.

Figure 5
figure 5

Age distributions and age fitness landscape for constant individual growth rate \(\nu\) (a) and age fitness landscapes for fluctuating \(\nu\) (b). In (a), the forward (resp. backward) age distribution is shown with orange filled histogram (resp. blue empty histogram). The red and green curves are the corresponding theoretical predictions. The inset plot shows the age fitness landscape (purple crosses) and the theoretical linear law (green). The horizontal black dashed-line represents the population growth rate \(\Lambda\). The simulation was conducted with \(B(a)=\nu a^{\alpha }\), \(\nu =1\), \(\alpha =2\), \(t=12\) and \(N_0=1\). In (b), the age fitness landscapes are shown for models without (purple dots) and with (pink squares) mother-daughter correlations in individual growth rates. The green line of slope \(- \Lambda\) fits well the purple dots. For both models, we chose a Gaussian distribution for \(\Sigma\), of standard deviation \(\sigma _{\nu }\), and centred either on the mother growth rate \(\nu '\) for models with correlations or on a constant mean growth rate \(\nu _m\) for uncorrelated models. The simulation was conducted with \(\nu _m=1\), \(\sigma _{\nu }=0.4\), \(\alpha =5\), \(t=12\) and \(N_0=1\).

Discussion

We have studied the relation between two different samplings of lineages in a branched tree: one sampling called backward or retrospective presents a statistical bias with respect to the forward or chronological sampling, an observation which is important to relate experiments carried out at the population level with the ones carried out at the single lineage level. This statistical bias can be rationalized by a set of fluctuation relations, which relate the probability distributions in the two ensembles and which are similar to fluctuation relations known in Stochastic Thermodynamics. This analogy leads to efficient methods to infer the population growth rate from an analysis of lineages, as we demonstrated by the analysis of the mother machine data of Tanouchi et al.20. Important inequalities for the mean number of divisions or the mean generation times follow from these fluctuation relations, which have been verified experimentally12 for various strains of E Coli. It would be interesting to generalize these studies to other cell types, and in the particular context of this paper, it would be useful to perform experimental studies in bulk or semi open configurations, to test the predictions which involve a comparison between forward and backward samplings.

By measuring the difference between these two samplings for a specific trait, one obtains the fitness landscape, introduced by Nozoe et al.13. While these authors have applied that concept to variables which are not reset or redistributed at division in their work, in the present paper, we used the concept of fitness landscape for variables like the size and the age, which precisely undergo a reset at division in size and age models. We derived expressions for these fitness landscapes, which agree with the statistical bias which we expect when measuring size or age distributions in cell populations. In addition, we also find that the precise form of the age fitness function appears to inform whether or not mother-daughter correlations are present in age models.

In the future, it would be valuable to extend our approach to include other important phenotypic state variables besides size or age, such as variables controlling replication dynamics3,27. We hope that our work contributes to clarifying the connection between single lineage and population statistics and to understanding the fundamental constraints which cell growth and division must obey.