%%%%%%%%%%% Article sur IAP %%%%%%%%%%%%%%
% Preambule
\documentclass[a4paper, 12pt]{article}
\title{Markovian modelling of gene products synthesis}
\author{\normalsize{ Jean Peccoud} \\ {\small TIMC-IMAG} \\ {\small Facult\'e de m\'edecine}\\ {\small Institut Albert
Bonniot}\\{\small 38706 La Tronche Cedex} \\ {\small France}
\and \normalsize{ Bernard Ycart}\\ {\small LMC-IMAG}\\ {\small Universit\'e Joseph Fourier}\\ {\small BP53}\\ {\small 38041 Grenoble Cedex
9}\\{\small France}}
\date{}
\begin{document}
% Nouvelles commandes
\newcommand{\pun}[3]{\ensuremath{p_{#1,#2}(#3)}}
\newcommand{\dpun}[3]{\ensuremath{p'_{#1,#2}(#3)}}
\newcommand{\gizt}[3]{\ensuremath{G_{#1}(#2, #3)}}
\newcommand{\pgit}[2]{\ensuremath{g_{#1}(#2)}}
\newcommand{\pgait}[2]{\ensuremath{\gamma_{#1}(#2)}}
\newcommand{\elambdaetmu}{\ensuremath{e^{-(\lambda+\mu)t}}}
\newcommand{\steparray}{\\ [14pt]}
\newcommand{\Et}{ \ensuremath{\mathbf{E}(t)}}
\newcommand{\E}{ \hbox{I\kern-.2em\hbox{E}}}
\newcommand{\Vt}{ \ensuremath{\mathbf{V}(t)}}
\newcommand{\V}{ \ensuremath{\mathbf{V}}}
\newcommand{\queue}{$M/M/\infty$}
\newcommand{\pin}[2]{\ensuremath{p_{#1,#2}}}
\newcommand{\hgf}{\ensuremath{\ _{1}F_{1}}}
\newcommand{\dg}[2]{{\displaystyle \frac{\partial G_{#1}}{\partial #2}(z,t)}}
\newcommand{\NN}{\hbox{I\kern-.2em\hbox{N}}}
% Etiquettes
% eq:reactionset1-eq:reactionset4 : Les quatre reactions chimiques
% {eq:kolmo} : Equations de Kolmogorov
% {eq:deffunctgene} :Definitions de G1 et GO
% {eq:equadifffunctgene} : EDP verifiees par (G0, G1) dans le cas general (Eq 2 de Bernard)
% Page de titre
\maketitle
%Abstract
\begin{abstract}
We propose a Markovian model of the gene induction process. This model is a
particular case of a birth and death process in a Markovian environment. The environment
is represented by the state of the gene (active or inactive) and the protein population
size is governed by the birth and death process. Proteins can be produced by the gene when
it is activated (birth) and they can disappear through denaturation (death).
The distribution of the protein number is computed explicitly in both the transient
and the stationary cases. It is shown that three out of the four parameters of the model can be
estimated from an observed sample of protein numbers collected during the
stationary regime.
\end{abstract}
%%% Debut du texte
\section{Introduction}
One of the major achievements of molecular biology during the last two decades has been
the characterization of many phenomena ensuring the fine tuning of gene expression at
the transcriptional level. One of the simplest kinds of gene regulation is induction.
Basically, an inducible gene is a gene which is normally silent but can be expressed under the
control of external signals such as hormones, sugars, or temperature.
This reaction is usually mediated by a regulatory factor which can be activated by the
external signal through ligand binding or phosphorylation for instance. The activation
of this factor is reversible. The level of gene expression is proportional to the
fraction of the regulatory factor in the activated state.
Although the dissection of molecular mechanisms of gene regulation is presently a major
concern for molecular biologists, quantitative observations of this phenomenon, in
particular \textit{in vivo}, are still strikingly scarce \cite{maddox}. In 1990 Minoru Ko and coworkers
conducted a pioneering experiment in which they were able to quantify on a
cell-by-cell basis the induction of a reporter gene under the control of the glucocorticoid-inducible promoter/enhancer
of the mouse mammary tumor virus \cite{ko90}. They pointed out that the individual levels
of reporter activity were very heterogeneous among a single cell population grown in a
homogeneous concentration of inducer. When transcription studies are conducted
\textit{in vitro}, the collected data are smooth since they result from the average
behavior of large populations of transcription templates. In a single cell, by contrast,
the action of the inducer results
in a probability for the transcription template to be in the activated state, \textit{i.e.}
a probability for the gene to be in its ``on'' state \cite{ko92}. In this case, one can
imagine that large fluctuations of the gene activity can be observed if the half-life of
the ``on'' state is large compared to the average time of a transcription cycle
\cite{hippel}. Ko initiated the mathematical modelling of gene induction mainly through
Monte-Carlo simulations \cite{ko91} of a discrete-time, two-state Markov chain. The gene
activity level was assumed to be proportional to the total time spent by the process in
the ``on'' state.
In this paper a more detailed model of gene induction is presented. Both the random
switch between the active and inactive states of the gene and the randomness of
transcription initiation \cite{hippel} are considered. The set of interactions considered in this model is
\begin{eqnarray}
I & \stackrel{\lambda}{\longrightarrow} & A \label{eq:reactionset1} \\
A & \stackrel{\mu}{\longrightarrow} & I \label{eq:reactionset2} \\
A & \stackrel{\nu}{\longrightarrow} & A + P \label{eq:reactionset3} \\
P & \stackrel{\delta}{\longrightarrow} & \emptyset \label{eq:reactionset4}
\end{eqnarray}
where (\ref{eq:reactionset1}) and (\ref{eq:reactionset2}) express the change of the gene between
inactivity and activity, (\ref{eq:reactionset3}) the reporter gene protein synthesis, and
(\ref{eq:reactionset4}) the protein degradation. The symbol \( \emptyset \) plays the
role of a null species, meaning that P is transformed into something that is not
considered in the model. The time evolution of the gene product synthesis is modelled by a Markov
process in continuous time (see for instance \cite{barucha} for a general reference). The
state space of this process is
\[ S=\left\{(i,n)\; , \; i\in\{0, 1\}, n\in \NN \right\} . \]
In any state $(i, n)$ the first coordinate denotes
the status of the gene.
\begin{itemize}
\item[ ] \( i=0 \) when the gene is inactive,
\item[ ] \( i=1 \) when the gene is active.
\end{itemize}
The second coordinate $n$ represents the number of molecules (copies) of the protein P in the cell. It
is assumed that the time evolution of the state of the gene is Markovian. The gene
remains active during a random time exponentially distributed with parameter \( \mu \) (\( 1/\mu \)
time units on average). Then, the gene jumps to its inactive state where it remains
during a random time exponentially distributed with parameter $\lambda.$ When the gene is inactive no protein is
synthesized whereas during activity they are synthesized with a constant rate \( \nu \). Finally the
life durations of all copies of P are assumed to be independent, identically distributed
random variables, their common distribution being exponential with parameter \( \delta \).
With these notations, it is possible to detail the rates of transition on the state space
\( S\).
From the state $(0, n)$, inactive gene and $n$ copies of P, the process may jump to
\begin{itemize}
\item [ ] $(1, n)$ with rate \( \lambda \) (the gene switches to the active state),
\item [ ] $(0, n-1)$ with rate \( n\delta \) (degradation of one P).
\end{itemize}
From the state $(1, n)$, active gene and $n$ molecules of P, the process may
jump to
\begin{itemize}
\item [ ] $(0, n)$ with rate \( \mu \) (the gene switches to inactivity),
\item [ ] $(1, n-1)$ with rate \( n\delta \) (degradation of one P),
\item [ ] $(1, n+1)$ with rate \( \nu \) (synthesis of one P).
\end{itemize}
This set of transitions defines a Markov process on $S$ that will be referred to as the
IAP process.
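These transition rates lend themselves to direct stochastic simulation. The following Python sketch (an illustration of ours, not part of the paper's analysis; all function names are hypothetical) implements a Gillespie-style simulation of the IAP process and returns the copy number observed at a given time:

```python
import random

def simulate_iap(lam, mu, nu, delta, t_max, seed=None):
    """Gillespie simulation of the IAP process.

    State (i, n): i = 0 (gene inactive) or 1 (gene active), n = copy number of P.
    Starts from (0, 0), i.e. inactive gene and no protein, and returns the
    copy number n observed at time t_max.
    """
    rng = random.Random(seed)
    i, n, t = 0, 0, 0.0
    while True:
        # Enabled transitions and their rates from the current state (i, n).
        if i == 0:
            events = [("activate", lam), ("degrade", n * delta)]
        else:
            events = [("inactivate", mu), ("degrade", n * delta), ("synthesize", nu)]
        total = sum(rate for _, rate in events)
        if total == 0.0:
            return n  # only possible when lam == 0 and n == 0
        t += rng.expovariate(total)  # exponential waiting time before the next jump
        if t > t_max:
            return n
        # Choose the next event with probability proportional to its rate.
        u = rng.random() * total
        for name, rate in events:
            if u < rate:
                break
            u -= rate
        if name == "activate":
            i = 1
        elif name == "inactivate":
            i = 0
        elif name == "degrade":
            n -= 1
        else:  # "synthesize"
            n += 1
```

When $\delta>0$, averaging many such samples for large $t_{\max}$ gives an empirical estimate of the asymptotic mean $\lambda\nu/((\lambda+\mu)\delta)$ derived below.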
Let \pun{0}{n}{t} (respectively \pun{1}{n}{t}) be the probability that at time $t$ the gene
is inactive (resp. active) and that $n$ molecules of P are present. It is easy to derive the Kolmogorov system of equations
for the IAP process from the
rates of transition \cite{barucha}.
\begin{equation}
\left\{
\begin{array}{rll}
\forall n \geq 0,\ p'_{0,n}(t) & = &
-(\lambda + n \delta) \pun{0}{n}{t} + (n+1)\delta \pun{0}{n+1}{t} + \mu
\pun{1}{n}{t} \steparray
p'_{1,0}(t) & = &
-(\mu + \nu) \pun{1}{0}{t} + \delta \pun{1}{1}{t} + \lambda \pun{0}{0}{t} \steparray
\forall n \geq 1,\ p'_{1,n}(t) & = &
-(\mu+\nu + n\delta) \pun{1}{n}{t} + (n+1)\delta \pun{1}{n+1}{t} + \nu \pun{1}{n-1}{t}
+ \lambda \pun{0}{n}{t}
\end{array}
\right.
\label{eq:kolmo}
\end{equation}
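As a numerical sanity check of (\ref{eq:kolmo}), the system can be truncated at a maximal copy number and integrated by an explicit Euler scheme. The Python sketch below is an illustration of ours; the truncation level and step size are arbitrary choices:

```python
def evolve_kolmogorov(lam, mu, nu, delta, t_end, n_max=60, dt=1e-3):
    """Explicit Euler integration of the Kolmogorov system, truncated at n_max.

    p0[n] and p1[n] approximate p_{0,n}(t) and p_{1,n}(t); the initial
    condition is p_{0,0}(0) = 1 (inactive gene, no protein).
    """
    p0 = [0.0] * (n_max + 1)
    p1 = [0.0] * (n_max + 1)
    p0[0] = 1.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        q0, q1 = p0[:], p1[:]
        for n in range(n_max + 1):
            # Inflow from degradation of one P in state n+1 (zero at the boundary).
            in0 = (n + 1) * delta * p0[n + 1] if n < n_max else 0.0
            in1 = (n + 1) * delta * p1[n + 1] if n < n_max else 0.0
            # Inflow from synthesis of one P while the gene is active.
            birth = nu * p1[n - 1] if n >= 1 else 0.0
            q0[n] += dt * (-(lam + n * delta) * p0[n] + in0 + mu * p1[n])
            q1[n] += dt * (-(mu + nu + n * delta) * p1[n] + in1 + birth + lam * p0[n])
        p0, p1 = q0, q1
    return p0, p1
```

Probability mass is conserved up to a tiny leak through the truncation boundary, and the mean $\sum_n n\,(p_{0,n}+p_{1,n})$ can be compared with the closed-form expectations derived in the next sections.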
The model that has just been built is a particular case of a Markov process in
random environment. These processes were first introduced in different contexts by
Eisen and Tainiter \cite{eisen} for queuing processes and by Smith and
Wilkinson \cite{smith} for branching processes. A review of processes in random environments and their
applications to population studies and epidemiology can be found in \cite{bartlett}.
More precisely, the IAP process can be seen as a birth and death process with a two-state
Markovian environment in continuous time \cite{cogtor}. In this perspective, the evolution in time of
the gene between activity and inactivity is the environment process. The
state of that process determines the instantaneous rates of the birth and death process governing the population of protein molecules.
When the gene is active and the decay rate
$\delta$ is positive, the copy number of P evolves according to a linear growth birth
and death process known in queuing theory as the \queue\ queue \cite{barucha}. When
$\delta = 0$ the protein process is a Poisson
process since no protein denaturation occurs. The asymptotic properties of birth and death processes in random environments
have been extensively studied: stationarity conditions, central limit theorems, law of
large numbers, diffusion approximations, and many other important results have been
proved \cite{cogtor, karmeshu, torrez78}. However, explicit solutions to the Kolmogorov
system of equations, even in the stationary case, are out of reach in general
\cite{cogtor}. The main purpose of this paper is to show that the distribution of the
IAP process can be explicitly computed in the transient case as well as in the stationary one.
As a by-product
of these computations, it will be shown that three out of the four parameters of the
model can be estimated from a sample of the number of copies of P. This should allow an
experimental validation of the IAP model of gene induction.
Section \ref{sec:transient} deals with the transient
case. The general case $(\delta>0)$ is set apart from the Poissonian case
$(\delta=0)$ where no stationary regime exists. The asymptotics of the first two moments
are derived and interpreted. Section \ref{sec:asymptotic} is devoted to the stationary
regime. The generating function of the asymptotic distribution of the number of
copies of P is shown to be a degenerate hypergeometric function. Explicit convergent
estimators of the parameters in terms of the first three moments of that distribution are obtained.
%%%%%%%%%%%%%%%%%
% Transient behaviour section
%%%%%%%%%%%%%%%%%
\section{Transient behavior}\label{sec:transient}
Finding the transient behavior of the IAP process amounts to solving the Kolmogorov system
of differential equations (\ref{eq:kolmo}). Two generating functions are introduced for
this purpose.
\begin{equation}
G_{0}(z,t)\; = \; {\displaystyle \sum_{n=0}^{\infty} }z^{n}\pun{0}{n}{t} \quad;\quad
G_{1}(z,t) \; = \; {\displaystyle \sum_{n=0}^{\infty} }z^{n}\pun{1}{n}{t}
\label{eq:deffunctgene}
\end{equation}
Multiplying by $z^{n}$ the equations in \dpun{0}{n}{t} and \dpun{1}{n}{t} of the
Kolmogorov system and summing them up results in the following system of partial
differential equations in $G_{0}(z,t)$ and $G_{1}(z,t)$.
\begin{equation}
\left\{
{ \begin{array}{l}
\dg{0}{t}=-\lambda G_{0}(z,t) +\mu G_{1}(z,t) +\delta (1-z) \dg{0}{z}
\steparray
\dg{1}{t} = \lambda G_{0}(z,t)-\mu G_{1}(z,t)- \nu (1-z) G_{1}(z,t)+\delta (1-z) \dg{1}{z}
\end{array}}
\right.
\label{eq:equadifffunctgene}
\end{equation}
From the expression of \gizt{0}{z}{t} and \gizt{1}{z}{t} the two components of the IAP
process can be studied separately. The sum
\[ G(z,t)= \gizt{0}{z}{t}+\gizt{1}{z}{t} \]
is the generating function of the protein number at time $t$. Of course,
\[ \forall t \geq 0 ,\ G(1, t) = 1 \quad . \]
Moreover
\[ \pgait{0}{t} = \gizt{0}{1}{t} \quad\hbox{ and }\quad \pgait{1}{t} = \gizt{1}{1}{t}\]
are the probabilities for the gene to be inactive and active respectively at time $t$.
Since the gene activity is a two-state Markov process, its transient behavior is
easily computed.
\begin{eqnarray}
\forall t \geq 0,\ \pgait{0}{t} = \gizt{0}{1}{t}& = & \frac{\mu}{\lambda+\mu} +
\left(\pgait{0}{0} - \frac{\mu}{\lambda+\mu}\right) \elambdaetmu
\nonumber \\
\pgait{1}{t} = \gizt{1}{1}{t} & = & \frac{\lambda}{\lambda+\mu} +
\left(\pgait{1}{0} - \frac{\lambda}{\lambda+\mu}\right) \elambdaetmu
\label{eq:initialecondition}
\end{eqnarray}
Now for any given initial condition $\left(\gizt{0}{z}{0}, \gizt{1}{z}{0}\right)$, the
system (\ref{eq:equadifffunctgene}) admits a unique solution. Since the topic of this
paper is gene induction, we assume that there is no copy of P in the cell at time $t=0$.
\begin{equation}
\forall n \geq 1, \ \pun{0}{n}{0} = \pun{1}{n}{0} = 0
\label{eq:initialcondition1}
\end{equation}
It could seem mathematically reasonable to assume that the gene activity process is
already at equilibrium at $t=0$ and hence to start with
\[ \pun{0}{0}{0} \; = \;\frac{\mu}{\lambda + \mu} \quad \mbox{and} \quad
\pun{1}{0}{0} \; = \; \frac{\lambda}{\lambda + \mu}\; .
\]
This assumption would lead to simpler expressions since the functions \pgait{0}{t}
and \pgait{1}{t} would then be constant.
However, this hypothesis does not seem to be biologically realistic. It is more
consistent with the gene induction phenomenon to assume that the gene is inactive at
time $t=0$: $\pun{0}{0}{0}=1$. Thus the initial values for the
generating functions $G_{0}$ and $G_{1}$ are 1 and 0 respectively.
\begin{equation}
G_{0}(z,0) = 1 \mbox{ and }
G_{1}(z,0) = 0
\label{eq:initialconditionG}
\end{equation}
\vskip 1cm
At this stage the case where $\delta > 0$ has to be distinguished from $\delta = 0$ (no
protein degradation considered). In the latter case, the system of partial differential
equations (\ref{eq:equadifffunctgene}) simplifies into a linear system of differential
equations with constant coefficients, $z$ appearing only as a parameter.
\begin{equation}
\left\{
\begin{array}{l}
\dg{0}{t}= -\lambda \gizt{0}{z}{t} + \mu \gizt{1}{z}{t} \steparray
\dg{1}{t} = \lambda \gizt{0}{z}{t} - \left(\mu + \nu(1-z)\right)\gizt{1}{z}{t}
\end{array}
\right.
\label{eq:simpleequadiff}
\end{equation}
The solution of this system with initial condition (\ref{eq:initialconditionG}) is
easily computed by a symbolic calculation package such as \textit{Mathematica} \cite{mma}. Since the
number of copies of P is the only experimental data potentially available, only the expression of
$G(z,t)= \gizt{0}{z}{t}+\gizt{1}{z}{t}$ is of interest.
\begin{equation}
G\left(z, t \right)=\frac{1}{2\beta(z)}
\left( \left(\alpha(z) + \beta(z)\right) \,{e^{\left( \alpha (z) - \beta(z) \right) \,t}} +
\left(\beta (z) - \alpha (z)\right) \,{e^{\left( \alpha (z) + \beta (z) \right) \,t}}\right)
\label{eq:gene1}
\end{equation}
where
\begin{eqnarray*}
\alpha(z)& =& \frac{1}{2} \left(\nu(z-1) - (\lambda + \mu)\right)\\
\beta(z) & = & \frac{1}{2} \left({\nu^{2} \left( z-1 \right) ^{2}} +
2 \nu \left( \lambda - \mu \right) (z-1) +
{\left( \lambda + \mu \right)^{2} }\right)^{1/2} \ .
\end{eqnarray*}
It is easy to derive the expectation and variance of the copy number at time
$t$ from its generating function. The expectation \Et\ is the value of
the derivative of $G(z,t)$ with respect to $z$ at $z=1$.
\begin{equation}
\Et=\frac{\lambda}{\lambda + \mu}
\left(\nu t - \frac{\nu}{\lambda + \mu}\left(1 - \elambdaetmu\right)\right)
\label{eq:espstationnaire}
\end{equation}
This expression can be interpreted as follows. If the gene remains permanently active
and there is no protein denaturation the number of copies increases as a Poisson
process with intensity $\nu$. Therefore the average copy number of P at time $t$ is
$\nu t$. But if the gene alternates between activity and inactivity, the average
proportion of time it spends being active in the long run is $\lambda/(\lambda + \mu)$.
Therefore it could be anticipated that \Et\ should be equivalent for large
$t$'s to
\[ \frac{\lambda}{\lambda + \mu}\nu t \ .\]
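This asymptotic behavior is easy to confirm numerically; the short Python sketch below (the function name is ours, for illustration) evaluates (\ref{eq:espstationnaire}) and its slope at large $t$:

```python
import math

def mean_copies_delta0(lam, mu, nu, t):
    """Expectation E(t) of the copy number in the delta = 0 case."""
    s = lam + mu
    return (lam / s) * (nu * t - (nu / s) * (1.0 - math.exp(-s * t)))

# For large t, E(t) grows linearly with slope lam * nu / (lam + mu).
slope = (mean_copies_delta0(1.0, 3.0, 8.0, 1000.0)
         - mean_copies_delta0(1.0, 3.0, 8.0, 999.0))
```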
The variance \Vt\ of the copy number at time $t$ can be computed in a similar
way from the second derivative of $G(z,t)$ with respect to $z$ at $z=1$.
\begin{eqnarray}
\Vt & = & t \left( \frac{\lambda \nu}{\lambda + \mu} + 2 \frac{\lambda\mu\nu^2}{(\lambda+\mu)^3}\right)
\nonumber \\
& & \ + \frac{\lambda \nu}{(\lambda + \mu)^4}\left(\lambda \nu - 4 \mu \nu - (\lambda+\mu)^2\right)
\nonumber \\
& & \ + \frac{\lambda \nu}{(\lambda + \mu)^4} \elambdaetmu
\left(-\lambda\nu \elambdaetmu + 4 \mu \nu + (\lambda + \mu)^2 - 2 \nu t (\lambda^2 - \mu^2)\right)
\label{varstationnaire}
\end{eqnarray}
For large $t$'s, the dominant term of the above expression is
\[ t\left(\frac{\lambda \nu}{\lambda + \mu} + 2 \frac{\lambda\mu\nu^2}{(\lambda+\mu)^3}\right) . \]
In this leading term, it is possible to recognize a contribution corresponding to the variance $\nu t$ of a
Poisson process with intensity $\nu$ multiplied by the asymptotic proportion of activity
of the gene. This component of the variance is increased by another term in which one of
the factors is the asymptotic variance of the gene activity process $ \lambda \mu/(\lambda+\mu)^2$.
\vskip 1cm
The second case, where the denaturation rate $\delta$ of the P molecules
is strictly positive, has now to be considered. The system (\ref{eq:equadifffunctgene}) is then
not as easy to solve as in the previous case since it is singular at $z=1$. One
possible method is to express the solution as a power series in $(z-1)$, the
coefficients of which are the solution of a recurrence equation (see section
\ref{sec:asymptotic}). In our case, this leads rapidly to very complicated expressions.
However, the first terms of the power series expansion of
\[ G(z,t)= \gizt{0}{z}{t}+\gizt{1}{z}{t} \]
are worth computing since they are related to the moments of the distribution at
time $t$ of the copy number of P.
At least the computations of the expectation \Et\ and the variance \Vt\ are tractable.
\[
\Et = \frac{\lambda}{\lambda + \mu} \frac{\nu}{\delta} +
\frac{\lambda \nu}{(\lambda+\mu)(\lambda+\mu-\delta) } \elambdaetmu -
\frac{\lambda\nu}{\delta(\lambda+\mu-\delta)} e^{-\delta t}
\]
From this expression, it is clear that \Et\ converges exponentially fast to
\[
\mathbf{E}=\frac{\lambda\nu}{(\lambda+\mu)\delta} \quad .
\]
This asymptotic mean number of copies can be interpreted as follows. With a gene
permanently active, the copy number would evolve as an \queue\ queue
\cite{barucha}. So its asymptotic distribution would be Poisson with mean
$\nu/\delta$. But in the long run the gene is active only during a proportion
$\lambda/(\lambda+\mu)$ of the time. Notice also that the exponential rates of
convergence $-(\lambda+\mu)$ and $-\delta$ are those of the two states Markov
environment (activity of the gene) and of the \queue\ queue (number of copies) respectively.
The expression of the variance of the copy number is somewhat more complicated
and it would be pointless to reproduce it here in full. Its
limit is the following.
\[
\lim_{t \rightarrow \infty} \Vt = \V =
\frac{\lambda}{\lambda+\mu}\frac{\nu}{\delta} +
\frac{\lambda \mu}{(\lambda + \mu)^2} \frac{\nu^2}{\delta (\lambda + \mu + \delta)}
\]
The interpretation of this expression is analogous to the one already given in the case
where $\delta=0$.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%% Comportement asymptotique dans le cas ou delta >0
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Asymptotic behavior}\label{sec:asymptotic}
When $\delta$ is strictly positive, the expressions of the expectation and variance of
the number of copies at time $t$ suggest that the IAP process has a
stationary distribution to which the distribution at time $t$ converges exponentially
fast.
This is indeed the case. That the IAP process is ergodic can be deduced from the
general results of Cogburn and Torrez \cite{cogtor}. Only the existence of the stationary
regime can be proved in this way. To go further, it is possible to use stochastic comparison
(see Stoyan \cite{stoyan} for a general reference). The idea is to compare our model to
the \queue\ queue that would describe the copy number if the gene remained always active.
In particular it is easy to see that the copy number under the IAP process is stochastically dominated by a
Poisson distribution with parameter $\nu/\delta$, and hence admits moments of any order.
Now the environment process (status of the gene) as well as the underlying
birth and death processes in any environment state have exponential rates of convergence to
equilibrium. It can be proved that the IAP process inherits that property. Details of
proofs are not given here for the sake of brevity. This exponential rate of convergence to
equilibrium is of high practical interest: it makes it possible to collect experimental
data whose statistical distribution should be close to the asymptotic distribution.
In this section, an
explicit expression for the generating function of the stationary distribution is
first derived. Then the explicit expression of the parameters in terms of the first three
moments of that distribution is given. Thus the parameters of the model admit convergent
estimators in terms of a sample of copy numbers.
Let
\[\left\{\pin{i}{n}\; ,\; i\in\{0,1\}, n \in \NN\right\} \]
be the stationary distribution of the IAP process. The \pin{i}{n}'s are a constant solution
to the Kolmogorov system of equations (\ref{eq:kolmo}). The generating functions $g_0$
and $g_1$ are analogous to those of the previous section.
\[
\pgit{0}{z} = \sum_{n=0}^{\infty}z^n \pin{0}{n} \quad\mbox{ and } \quad
\pgit{1}{z} = \sum_{n=0}^{\infty}z^n \pin{1}{n}
\]
Naturally, the following relations hold.
\[
\pgit{0}{z} = \lim_{t \rightarrow \infty} \gizt{0}{z}{t} \mbox{ and }
\pgit{1}{z} = \lim_{t \rightarrow \infty} \gizt{1}{z}{t}
\]
The functions $g_0$ and $g_1$ satisfy the following system of differential
equations, deduced from (\ref{eq:equadifffunctgene}) by setting the time derivatives to zero.
\begin{equation}
\left\{
\begin{array}{l}
\delta (z-1) g'_{0}(z)=-\lambda g _{0}(z) + \mu g _{1} (z) \steparray
\delta (z-1) g'_{1}(z)=\lambda g _{0}(z) - \mu g_{1}(z) - \nu (1-z) g_{1}(z)
\end{array}
\right.
\label{eq:stationnaire}
\end{equation}
\vskip 1cm
The values at $z=1$ of $g_0$ and $g_1$ are the asymptotic probabilities for the gene to
be inactive and active respectively
\[
\pgit{0}{1}=\frac{\mu}{\lambda+\mu} \quad \mbox{ and } \quad
\pgit{1}{1}=\frac{\lambda}{\lambda+\mu}\; .
\]
The difficulty in solving this system comes from its singularity at $z=1$.
From (\ref{eq:stationnaire}) the following second order differential equation in $g_0$ is derived.
\begin{equation}
-\delta^{2} (1-z) g''_{0}(z) + \delta \left(\lambda + \mu + \delta + \nu
(1-z)\right)g'_{0}(z) - \lambda \nu g_{0}(z) = 0
\label{eq:8g''}
\end{equation}
Through the change of variable $x=1-z$ and setting $h(x)=g_{0}(1-x)$, (\ref{eq:8g''}) becomes
\begin{equation}
\delta^{2} x h''(x) + \delta \left(\lambda + \mu + \delta + \nu
x\right) h'(x) +\lambda \nu h(x)=0.
\label{eq:statreduced2}
\end{equation}
The ratios $a$, $b$, and $c$ are introduced to simplify the expressions.
\[ a = \frac{\lambda + \mu + \delta}{\delta} \ , \
b = \frac{\nu}{\delta} \ , \
c = \frac{\lambda \nu}{\delta^{2}}
\]
The values of the
derivatives of $h$ at $x=0$ are obtained by induction from (\ref{eq:statreduced2}).
\begin{equation}
\forall n\geq 1,\ h^{(n)}(0) = (-1)^{n}\
\frac{c(c+b)\cdots\left(c+b(n-1)\right)}{a(a+1)\cdots(a+n-1)} \cdot\frac{\mu}{\lambda+\mu}
\label{eq:hn}
\end{equation}
Hence, the expansion of $g_0$ as a power series in $z-1$ is
\begin{equation}
g_{0}(z)=\frac{\mu}{\lambda + \mu}\left[1 + \sum_{n=1}^{\infty}\frac{1}{n!}\frac{c(c+b)\cdots\left(c+b(n-1)\right)}
{a(a+1)\cdots (a+n-1)}(z-1)^{n}\right]\ .
\label{eq:solg0}
\end{equation}
This power series is a particular case of a degenerate hypergeometric function
\cite{gradshteyn}. The use of the classical notation \hgf \ for these
functions leads to
\[
g_{0}(z)=\frac{\mu}{\lambda + \mu} \hgf \left(\frac{\lambda}{\delta},
\frac{\lambda+\mu+\delta}{\delta};
\frac{\nu}{\delta}(z-1)
\right)\ .
\]
The expression of $g_{1}(z)$ is deduced from the first equation of
(\ref{eq:stationnaire}). Now the generating function of the asymptotic distribution of
the copy number is $$g(z)=g_{0}(z)+g_{1}(z)\quad . $$ We get
\begin{equation}
g(z)=1 + \sum_{n=1}^{\infty}\frac{1}{n!}\frac{c(c+b)\cdots\left(c+b(n-1)\right)}
{(a-1)a\cdots (a+n-2)}(z-1)^{n}\quad .
\label{eq:solg}
\end{equation}
Again, $g$ is a degenerate hypergeometric function.
\[
g(z)= \hgf \left(\frac{\lambda}{\delta},
\frac{\lambda+\mu}{\delta};
\frac{\nu}{\delta}(z-1)
\right).
\]
The expression of $g$ as a power series in $(z-1)$ makes it particularly easy to compute
the factorial moments, denoted by $e_n$, of the asymptotic distribution. Let $X$ be the
number of copies at equilibrium, and denote by \E \ the
mathematical expectation, then
\[
e_{n} = \E \left[ X(X-1) \cdots (X-n+1)\right]=g^{(n)}(1)=\frac{c(c+b)\cdots\left(c+b(n-1)\right)}
{(a-1)a\cdots (a+n-2)}\ .
\]
Thus, the mean of the asymptotic distribution is
\[ \mathbf{E}= e_1 = \frac{c}{a-1}= \frac{\lambda}{\lambda+\mu} \frac{\nu}{\delta}\ .\]
Its variance is
\[ \V = e_2 + e_1 -e_1^2 = \frac{\lambda}{\lambda+\mu} \frac{\nu}{\delta}
+ \frac{\lambda\mu}{(\lambda+\mu)^2}\frac{\nu^2}{\delta(\lambda+\mu+\delta)}\ .\]
These two expressions have already been obtained in section \ref{sec:transient} as the
limits of \Et\ and \Vt.
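These closed forms are straightforward to cross-check numerically; the helper below (ours, for illustration) computes $e_n$ from the product formula and recovers the mean and variance:

```python
def stationary_moment(n, lam, mu, nu, delta):
    """n-th moment e_n = E[X(X-1)...(X-n+1)] of the stationary distribution,
    from e_n = c(c+b)...(c+b(n-1)) / ((a-1)a...(a+n-2))."""
    a = (lam + mu + delta) / delta
    b = nu / delta
    c = lam * nu / delta ** 2
    num = den = 1.0
    for k in range(n):
        num *= c + b * k
        den *= a - 1.0 + k
    return num / den

# Mean and variance of the stationary copy number for sample parameter values.
lam, mu, nu, delta = 1.0, 2.0, 10.0, 1.0
e1 = stationary_moment(1, lam, mu, nu, delta)
e2 = stationary_moment(2, lam, mu, nu, delta)
mean, var = e1, e2 + e1 - e1 ** 2
```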
\vskip 1 cm
Since the parameters of this model of gene expression cannot be directly measured, their estimation on the basis of a given
sample of the asymptotic distribution (which is the only kind of experimental data potentially available)
would be a significant step forward. \textit{A priori} one should estimate four
parameters $\lambda$, $\mu$, $\nu$, and $\delta$. However, the model is homogeneous in
time. Multiplying the scale of time by a constant is equivalent to dividing the four
parameters by the same constant and this operation does not affect the asymptotic
distribution. Hence, only three of the four parameters can be estimated from the asymptotic distribution.
The scale of time is adjusted so as to have one of the parameters,
say $\delta$, equal to 1. This assumption is equivalent to choosing as time unit the average life
time of the P molecule. Let
$r_1$, $r_2$ and $r_3$ be the successive ratios of the first three exponential moments.
\[
\begin{array}{lllllll}
r_1 & = & e_1 & = & {\displaystyle \frac{c}{a-1}} & = & {\displaystyle
\frac{\lambda}{\lambda+\mu} \cdot \nu} \steparray
r_2 & = & e_2/e_1 & =& {\displaystyle \frac{c+b}{a} }& = & {\displaystyle \frac{\lambda+1}{\lambda+\mu+1}\cdot\nu }\steparray
r_3 & = & e_3/e_2 & = & {\displaystyle \frac{c+2b}{a+1}} & = & {\displaystyle \frac{\lambda+2}{\lambda+\mu+2}\cdot\nu}
\end{array}
\]
It is now easy to solve the above equations in $\lambda$, $\mu$, and $\nu$ and get
\[
\begin{array}{lll}
\lambda & =&{\displaystyle \frac{2r_1(r_3- r_2)}{r_1 r_2 - 2 r_1 r_3 + r_2 r_3} }\ , \steparray
\mu & =&{\displaystyle \frac{2(r_2 - r_1)(r_1 - r_3) (r_3 - r_2)}{(r_1 r_2 - 2 r_1 r_3 + r_2 r_3)(r_1
- 2 r_2 + r_3)}} \ , \steparray
\nu &=& {\displaystyle \frac{-r_1 r_2 + 2 r_1 r_3 - r_2 r_3}{r_1 - 2 r_2 + r_3}}\ .
\end{array}
\]
Provided that some sample $(X_1, \ldots, X_N)$ of the asymptotic distribution is available,
the natural estimators of the first three moments are
\[
\begin{array}{lll}
\hat{e}_1 & =&{\displaystyle \frac{1}{N}\sum_{i=1}^{N}X_i } \ ,\steparray
\hat{e}_2 & =&{\displaystyle \frac{1}{N}\sum_{i=1}^{N}X_i(X_i - 1) } \ ,\steparray
\hat{e}_3 & =&{\displaystyle \frac{1}{N}\sum_{i=1}^{N}X_i(X_i - 1)(X_i - 2)} \ .
\end{array}
\]
By the law of large numbers, these are convergent estimators of $e_1$, $e_2$ and $e_3$
respectively. $\lambda$, $\mu$ and $\nu$ were shown above to be a function $\phi$ of
$e_1$, $e_2$ and $e_3$.
\[ \left(\lambda, \mu, \nu\right)= \phi\left(e_1, e_2, e_3\right) \]
Since $\phi$ is a continuous function, the statistics
\[ \left(\hat{\lambda}, \hat{\mu}, \hat{\nu}\right)= \phi\left(\hat{e}_1, \hat{e}_2, \hat{e}_3\right) \]
is a convergent estimator of the 3-tuple of the parameters $(\lambda, \mu, \nu)$.
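The estimation scheme can be sketched in a few lines of Python (an illustration of ours; function names are hypothetical): invert the moment ratios via $\phi$, then plug in the empirical moments.

```python
def invert_moments(r1, r2, r3):
    """Map the moment ratios (r1, r2, r3) back to (lambda, mu, nu),
    the time scale being fixed so that delta = 1."""
    d1 = r1 * r2 - 2.0 * r1 * r3 + r2 * r3
    d2 = r1 - 2.0 * r2 + r3
    lam = 2.0 * r1 * (r3 - r2) / d1
    mu = 2.0 * (r2 - r1) * (r1 - r3) * (r3 - r2) / (d1 * d2)
    nu = -d1 / d2
    return lam, mu, nu

def estimate_parameters(sample):
    """Plug-in estimators from a sample (X_1, ..., X_N) of stationary copy numbers."""
    N = float(len(sample))
    e1 = sum(sample) / N
    e2 = sum(x * (x - 1) for x in sample) / N
    e3 = sum(x * (x - 1) * (x - 2) for x in sample) / N
    return invert_moments(e1, e2 / e1, e3 / e2)
```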
\bibliography{iap}
\bibliographystyle{plain}
\end{document}