
   
Cell-cycle correction to the continuum Luria-Delbrück distribution for 2-phase models of the cell cycle

Recall that cLD is obtained from the Bartlett generating function, which is the generating function of the Luria-Delbrück distribution. Also recall that the mean proportion of mutants can be expressed as $c \mu \log(N)$, where the correction factor c depends only on the cell cycle parameters. We may view $c \mu$ in this formula as the effective mutation rate. This observation prompted us to generalize cLD to non-exponential distributions of cell cycle times. The approach is essentially to replace $\mu$ by the effective mutation rate $c \mu$. It turns out that, for cell cycle times that are distributed as a shifted exponential, this replacement alone does not give mutant distributions that fit the experimental ones. However, if we also assume that the effective initial and final numbers of cells in the culture are $(b/c) N_0$ and $(b/c) N$, we obtain a very good fit between the simulation data and the theoretical prediction. Note that here I grouped the correction factor for the mutation rate and the correction factor for the number of cells into a single parameter b. For the correction factor for the mutation rate we have an analytical expression. The correction factor for the cell number has to be determined by fitting the simulation data to the generalized form of cLD.
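The substitution described above can be written out as a small helper. This is only an illustrative sketch: the function name `effective_parameters` and the numerical values are assumptions made for the example (in particular, c = 0.8 is a hypothetical value, not one computed from the analytical expression).

```python
def effective_parameters(mu, n0, n_final, b, c):
    """Return (effective mutation rate, effective initial cells, effective final cells).

    mu      : mutation rate used in the simulation
    n0      : initial number of cells
    n_final : final number of cells
    b       : combined correction parameter (obtained by fitting)
    c       : cell-cycle correction factor for the mutation rate
    """
    eff_mu = c * mu
    eff_n0 = (b / c) * n0
    eff_n = (b / c) * n_final
    return eff_mu, eff_n0, eff_n


# Illustrative values only; c = 0.8 is hypothetical.
eff_mu, eff_n0, eff_n = effective_parameters(mu=1e-3, n0=1, n_final=1e4, b=2.769, c=0.8)
```

Whatever the value of c, the product of the effective mutation rate and the effective final number of cells is $b \mu N$, which is why the two corrections can be grouped into the single parameter b.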

I will now describe the fitting procedure for the parameter b. It turns out that the value of this parameter is determined by the ratio r of the division time ($T_B$) to the mean waiting time ($1/\lambda$). It is not affected by the mutation rate or by the final number of cells in the culture. The major implication of this result is that we can obtain the value of b for any r of interest by simulating cultures with relatively small numbers of cells. We can then use this value for any culture size, and thus infer mutation rates in cultures of realistic size.
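To make the role of r concrete, the following sketch draws cell cycle times from a shifted exponential. It assumes that a cycle consists of an exponentially distributed waiting time with rate $\lambda$ followed by a fixed division time $T_B = r/\lambda$. The function name `sample_cycle_times` is hypothetical, and this is not the simulation code used to produce the data.

```python
import random


def sample_cycle_times(r, lam, size, rng=None):
    """Draw `size` cell cycle times from a shifted exponential.

    r   : ratio of the division time T_B to the mean waiting time 1/lam
    lam : rate of the exponential waiting time
    rng : optional random.Random instance, for reproducibility
    """
    rng = rng or random.Random()
    t_b = r / lam  # fixed division time, from r = T_B * lam
    return [t_b + rng.expovariate(lam) for _ in range(size)]
```

With r = 0 this reduces to the plain exponential case. For r > 0 every cycle lasts at least $T_B$, and the mean cycle time is $T_B + 1/\lambda$.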

In the integral form of the distribution, we let $\beta = b \mu N$ and $\epsilon = b \mu N_0$. The scaled variable $\xi$ becomes

 \begin{displaymath}
\xi = \frac{x}{c \mu} - \frac{1}{b \mu N}.
\end{displaymath} (6.36)

We then perform a one-parameter optimization, using the $\chi^2$ value as the criterion for the goodness of fit. The procedure is the following. We generate the empirical distribution of the proportion of mutants (and the corresponding cumulative distribution) from the simulation data. This also gives us the distribution of the variable $\xi$, which is related to the proportion of mutants, x, through Eq. (6.36). Here c is the correction factor due to the cell cycle time distribution (Eq. [*]), N is the final number of cells in the culture, $\mu$ is the mutation rate that we used in the simulation, and b is the parameter that we need to identify. As a first choice for b we may use its value for the L-D distribution, which is 2. Let $D(\xi)$ denote the empirical cumulative distribution of $\xi$, and let $T(\xi)$ be the theoretical cumulative distribution of this variable. We can calculate $T(\xi)$ using the integral form of Eq. [*], with parameters $\beta = b \mu N$ and $\epsilon = b \mu N_0$. The quantity that we want to minimize is the $\chi^2$ value, calculated as:

 \begin{displaymath}
\chi^2 = \frac{1}{N}\sum_{\xi} \frac{\left(D(\xi) -
T(\xi)\right)^2}{T(\xi)}.
\end{displaymath} (6.37)

where $\xi$ takes values as given by Eq. (6.36), with the proportion of mutants varying between 0 and 1, in increments of 1/N. In fact, we neglected the values of the cumulative distribution below 0.01 and beyond 0.99 (in a few cases 0.98 or 0.97). They do not affect the fit significantly, while the computation of the integral becomes difficult in these regions. The simulation data are also less precise in these regions, as we would need a very large number of runs to observe events that have a very low probability. We then find the value of the parameter b that minimizes the $\chi^2$ value. The minimization algorithm is the Golden Section Search algorithm, described in Press et al. (1988). Table 6.3 gives these values for a number of data sets. Note that $N_0$, N, and $\mu$ are the values that we used in the simulations, whereas $(b/c) N_0$, $(b/c) N$, and $c \mu$ are the effective initial number of cells, final number of cells, and mutation rate. The cases where we truncated the right-hand tail at proportions different from 0.99 are marked.
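The fitting procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the fits: the theoretical cumulative distribution is passed in as a user-supplied callable `theoretical_cdf(xi, b)` (hypothetical, since the integral form of cLD is not reproduced here), the truncation is applied to the empirical cumulative values, and the golden section search is a plain textbook version rather than the routine from Press et al.

```python
import bisect
import math


def empirical_cdf(samples, grid):
    """Fraction of simulated cultures whose mutant proportion is <= each grid value."""
    ordered = sorted(samples)
    return [bisect.bisect_right(ordered, x) / len(ordered) for x in grid]


def chi_square(b, grid, d_values, theoretical_cdf, c, mu, n_final,
               lower=0.01, upper=0.99):
    """Chi-square distance between empirical and theoretical cumulative distributions.

    grid            : proportions of mutants x at which D was evaluated
    d_values        : empirical cumulative distribution D at those proportions
    theoretical_cdf : callable (xi, b) -> T(xi); hypothetical stand-in for cLD
    lower, upper    : tails of D outside [lower, upper] are neglected
    """
    total = 0.0
    for x, d in zip(grid, d_values):
        if d < lower or d > upper:
            continue  # neglect the poorly sampled tails
        xi = x / (c * mu) - 1.0 / (b * mu * n_final)  # scaled variable
        t = theoretical_cdf(xi, b)
        total += (d - t) ** 2 / t
    return total / n_final


def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = hi - inv_phi * (hi - lo)
    x2 = lo + inv_phi * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:
            # the minimum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - inv_phi * (hi - lo)
            f1 = f(x1)
        else:
            # the minimum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + inv_phi * (hi - lo)
            f2 = f(x2)
    return 0.5 * (lo + hi)
```

A fit would then be obtained with a call such as `golden_section_minimize(lambda b: chi_square(b, grid, d_values, theoretical_cdf, c, mu, n_final), 1.0, 5.0)`, where the bracketing interval [1, 5] is an arbitrary choice made for illustration. Golden section search needs only function evaluations, which suits an objective that is itself computed by numerical integration.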


 
Table 6.3: Fit of the b parameter. Right tails truncated at 0.99, unless otherwise specified (0.98 marked by $\dag$, 0.97 by $\ddag$)
 
$N_0$   $N$      $\mu$                 r   b       $\chi^2$
1       $10^4$   $10^{-3}$             0   1.979   0.000263
1       $10^5$   $10^{-4}$             0   2.003   0.000427
1       $10^5$   $10^{-3}$             0   1.974   0.000911
1       $10^4$   $3 \times 10^{-4}$    1   2.695   0.007509
1       $10^4$   $10^{-3}$             1   2.769   0.003966
1       $10^4$   $3 \times 10^{-3}$    1   2.749   0.000905
1       $10^5$   $10^{-4}$             1   2.821   0.00126
1       $10^5$   $3 \times 10^{-4}$    1   2.824   0.00112
1       $10^5$   $10^{-3}$             1   2.771   0.00335$^\dag$
1       $10^4$   $3 \times 10^{-4}$    3   2.889   0.00702
1       $10^4$   $10^{-3}$             3   2.979   0.00617
1       $10^4$   $3 \times 10^{-3}$    3   2.955   0.00159
1       $10^5$   $10^{-4}$             3   3.037   0.00272
1       $10^5$   $3 \times 10^{-4}$    3   3.076   0.000514
1       $10^5$   $10^{-3}$             3   3.016   0.00202$^\ddag$
1       $10^4$   $3 \times 10^{-4}$    9   2.949   0.00743
1       $10^4$   $10^{-3}$             9   3.062   0.00565
1       $10^4$   $3 \times 10^{-3}$    9   2.988   0.00991
1       $10^5$   $10^{-4}$             9   3.022   0.00379
1       $10^5$   $3 \times 10^{-4}$    9   3.163   0.00576
1       $10^5$   $10^{-3}$             9   3.081   0.00587$^\ddag$

The first three data sets in the table correspond to cultures with an exponential cell-cycle time distribution (r = 0). As we expect, the value of the parameter b for all these data sets is around 2. As the division time $T_B$ becomes a larger proportion of the cell cycle, the value of the parameter b increases. However, the most dramatic change occurs when $T_B$ changes from being negligible to being as large as the mean waiting time. The other parameter of these distributions, c, shows a similar behavior. The effective mutation rate is maximal for $T_B = 0$ and decreases with r, with the most dramatic change occurring at the transition between r = 0 and r = 1.
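The trend in b can be checked directly from the fitted values in Table 6.3. The short sketch below averages b over the data sets that share the same r. The names `FITTED_B` and `mean_b_by_ratio` are introduced only for this illustration, and the unweighted average is a convenience for summarizing the table, not part of the fitting procedure.

```python
from collections import defaultdict

# (r, b) pairs copied from Table 6.3.
FITTED_B = [
    (0, 1.979), (0, 2.003), (0, 1.974),
    (1, 2.695), (1, 2.769), (1, 2.749), (1, 2.821), (1, 2.824), (1, 2.771),
    (3, 2.889), (3, 2.979), (3, 2.955), (3, 3.037), (3, 3.076), (3, 3.016),
    (9, 2.949), (9, 3.062), (9, 2.988), (9, 3.022), (9, 3.163), (9, 3.081),
]


def mean_b_by_ratio(rows):
    """Group the fitted b values by r and return the unweighted mean for each r."""
    groups = defaultdict(list)
    for r, b in rows:
        groups[r].append(b)
    return {r: sum(values) / len(values) for r, values in sorted(groups.items())}


means = mean_b_by_ratio(FITTED_B)
```

The averages come out at roughly 1.99, 2.77, 2.99, and 3.04 for r = 0, 1, 3, and 9, so the step from r = 0 to r = 1 is larger than the whole increase from r = 1 to r = 9.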


Mihaela Oprea
1999-04-11