10.4 Diffusion approximations for inference
The discussion in the previous section demonstrates that it is possible to construct exact MCMC algorithms for inference in discrete stochastic kinetic models based on discrete time observations (and it is possible to extend the techniques to more realistic data scenarios than those directly considered). The discussion gives great insight into the nature of the inferential problem and its conceptual solution. However, there is a slight problem with the techniques discussed there in the context of the relatively large and complex models of genuine interest to systems biologists. It should be clear that each iteration of the MCMC algorithm described in the previous section is more computationally demanding than simulating the process exactly using Gillespie's direct method (for the sake of argument, let us say that it is one order of magnitude more demanding). For satisfactory inference, a large number of MCMC iterations will be required. For models of the complexity discussed in the previous section, it is not uncommon for $10^7$--$10^8$ iterations to be required for satisfactory convergence to the true posterior distribution. Using such methods for inference therefore has a computational cost of $10^8$--$10^9$ times that required to simulate the process. As if this were not bad enough, it turns out that MCMC algorithms are particularly difficult to parallelise effectively (Wilkinson 2005). One possible approach to improving the situation is to approximate the algorithm with a much faster one that is less accurate, as discussed in Boys et al. (2004). Unfortunately even that approach does not scale up well to genuinely interesting problems, so a different approach is required.
A similar problem was considered in Chapter 8, from the viewpoint of simulation rather than inference. We saw there how it was possible to approximate the true Markov jump process by the chemical Langevin equation (CLE), which is the diffusion process that behaves most like the true jump process. It was seen there how simulation of the CLE can be many orders of magnitude faster than an exact algorithm. This suggests the possibility of using the CLE as an approximate model for inferential purposes. It turns out that the CLE provides an excellent model for inference, even in situations where it does not perform particularly well as a simulation model. This observation at first seems a little counter-intuitive, but the reason is that in the context of inference, one is conditioning on data from the true model, and this helps to calibrate the approximate model and stop MCMC algorithms from wandering off into parts of the space that are plausible in the context of the approximate model, but not in the context of the true model.
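To make the connection to simulation concrete, the short sketch below shows one way the CLE might be integrated numerically using an Euler--Maruyama scheme. It is an illustrative sketch rather than code from this book: it assumes the CLE written with one Brownian increment per reaction channel, $dX_t = S\,h(X_t,c)\,dt + S\,\operatorname{diag}\{\sqrt{h(X_t,c)}\}\,dW_t$, and the Lotka-Volterra stoichiometry, hazard function, and rate constants are chosen purely for illustration.

```python
import numpy as np

def cle_euler(x0, S, hazards, c, dt, n_steps, rng=None):
    """Euler-Maruyama integration of a CLE of the (assumed) form
    dX_t = S h(X_t, c) dt + S diag(sqrt(h(X_t, c))) dW_t,
    with one Brownian increment per reaction channel."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    path = np.empty((n_steps + 1, x.size))
    path[0] = x
    for i in range(n_steps):
        h = hazards(x, c)                       # reaction hazards at current state
        drift = S @ h * dt
        noise = S @ (np.sqrt(h * dt) * rng.standard_normal(h.size))
        x = np.maximum(x + drift + noise, 0.0)  # crude fix: keep state non-negative
        path[i + 1] = x
    return path

# Illustrative Lotka-Volterra system: prey birth, predation, predator death
S = np.array([[1, -1,  0],
              [0,  1, -1]])

def lv_hazards(x, c):
    return np.array([c[0] * x[0], c[1] * x[0] * x[1], c[2] * x[1]])

path = cle_euler(x0=[50, 100], S=S, hazards=lv_hazards,
                 c=[1.0, 0.005, 0.6], dt=0.01, n_steps=3000)
```

Writing the noise term with one increment per reaction channel avoids forming a matrix square root of the diffusion matrix at each step, at the cost of simulating as many Brownian increments as there are reactions rather than species.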
What is required is a method for inference for general non-linear multivariate diffusion processes observed partially, discretely, and with error. Unfortunately this too turns out to be a highly non-trivial problem, and is still the subject of a great deal of ongoing research. Such inference problems arise often in financial mathematics and econometrics, and so much of the literature relating to this problem can be found in that area; see Durham & Gallant (2002) for an overview.
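As a purely illustrative sketch of this observation regime (none of the quantities below come from the text), the following simulates a fine-grained latent path of a simple two-dimensional diffusion and then retains only sparse, noisy measurements of its first component; the drift, diffusion coefficient, observation spacing, and measurement error standard deviation are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)

# Fine-grained latent path of an illustrative two-dimensional diffusion,
# integrated by Euler-Maruyama (arbitrary drift and diffusion terms)
dt, n = 0.01, 2000
x = np.empty((n + 1, 2))
x[0] = [50.0, 100.0]
for i in range(n):
    drift = np.array([0.5 * (60.0 - x[i, 0]),     # illustrative mean reversion
                      0.3 * (x[i, 0] - x[i, 1])])
    diff = np.sqrt(np.maximum(x[i], 0.0) * dt) * rng.standard_normal(2)
    x[i + 1] = x[i] + drift * dt + diff

# Partial, discrete, noisy data: only the first component is observed,
# every 50th time point, corrupted by Gaussian measurement error
obs_idx = np.arange(0, n + 1, 50)
y = x[obs_idx, 0] + rng.normal(scale=2.0, size=obs_idx.size)
```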
The problem with diffusion processes is that any finite sample path contains an infinite amount of information, and so the concept of a complete-data likelihood does not exist in general. We will illustrate the problem in the context of high-resolution time-course data on the CLE. Starting with the CLE in the form of (8.3), define