
64 BAYESIAN INFERENCE
counted automatically (see (3.10)). This is particularly helpful in multilevel
models where the number of parameters is sometimes difficult to quantify. As
an example, consider the normal random effects models in Example 2.1 with
likelihood given by
\[ L(\theta, b_i \mid y) \propto \prod_{i=1}^{n} |\Sigma|^{-1/2} \exp\left\{ -\frac{1}{2}\, e_i(\beta, b_i)^{T} \Sigma^{-1} e_i(\beta, b_i) \right\}, \tag{3.14} \]
where e_i(β, b_i) = y_i − x_i β − w_i b_i and θ = (β, Σ). The random effects have
not been integrated out and are now treated as parameters along with θ. On
the surface, if we count the number of random effects (assume for simplicity
they are one-dimensional), there are n. However, the effective number can be
considerably smaller because the random effects distribution p(b_i | θ) shrinks the
random effects toward zero. As the variance of the random effects distribution
goes to zero, there are fewer effective parameters; if the variance is zero, all the
random effects are identically zero, so there are effectively no parameters. On the
other hand, as the variance increases, the number of parameters approaches n.
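
The shrinkage argument above can be checked numerically in a deliberately simplified special case. The sketch below assumes (these are illustrative assumptions, not the setup of Example 2.1) scalar observations y_i ~ N(b_i, sigma2) with b_i ~ N(0, tau2), no fixed effects, and both variances known. Under those assumptions the posterior of each b_i is normal with mean rho * y_i and variance rho * sigma2, where rho = tau2 / (tau2 + sigma2), and the effective number of parameters (posterior mean deviance minus deviance at the posterior mean) works out to n * rho. The function names are hypothetical.

```python
import numpy as np

def effective_params(n, sigma2, tau2):
    """Closed-form effective number of parameters, n * tau2 / (tau2 + sigma2),
    for the simplified model assumed here (known variances, no fixed effects)."""
    return n * tau2 / (tau2 + sigma2)

def effective_params_mc(y, sigma2, tau2, draws=20000, seed=0):
    """Monte Carlo version: mean deviance over posterior draws of b
    minus the deviance evaluated at the posterior mean of b."""
    rng = np.random.default_rng(seed)
    rho = tau2 / (tau2 + sigma2)          # shrinkage factor
    post_mean = rho * y                   # posterior mean of each b_i
    post_sd = np.sqrt(rho * sigma2)       # posterior sd of each b_i
    b = post_mean + post_sd * rng.standard_normal((draws, y.size))
    # Deviance up to an additive constant that cancels in the difference.
    dev_draws = ((y - b) ** 2).sum(axis=1) / sigma2
    dev_at_mean = ((y - post_mean) ** 2).sum() / sigma2
    return dev_draws.mean() - dev_at_mean

if __name__ == "__main__":
    y = np.linspace(-1.0, 1.0, 10)
    for tau2 in (0.0, 0.1, 1.0, 100.0):
        print(tau2, effective_params(y.size, 1.0, tau2),
              effective_params_mc(y, 1.0, tau2))
```

In this special case the count moves from 0 (zero random effects variance) toward n (large random effects variance), matching the limiting behavior described in the text.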
Despite its computational simplicity, the DIC does have drawbacks. The
best model as determined by the DIC can change depending on the choice of
‘likelihood’ (see Trevisani and Gelfand, 2003); for example, again revisiting
the normal random effects model (Example 2.1), the likelihood can take one
of two forms: the integrated likelihood given in (3.2), or the likelihood without
the random effects integrated out, given in (3.14).
In addition, the DIC is not invariant to the parameterization of θ. This occurs
because the fit term Dev{E(θ | y)} in (3.12) involves a plug-in estimator
for θ based on the posterior, E(θ | y); and in general, E{h(θ) | y} ≠ h{E(θ | y)}.
For the multivariate normal model in Example 2.3, θ could be defined
as (β, Σ^{-1}) or (β, Σ). Using Σ vs. Σ^{-1} will result in different values for the
DIC; see Section 4.2 for an illustration on the Growth Hormone data. For
covariance matrices, Spiegelhalter et al. (2002) recommend using the inverse
because its posterior mean is more stable.
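
A small numerical sketch may make the parameterization issue concrete. It assumes a univariate normal model with known mean zero and a hypothetical gamma posterior for the precision (the shape and rate values are purely illustrative, and this is not the Growth Hormone analysis of Section 4.2). The plug-in variance is computed in two ways: by inverting the posterior mean of the precision, and by taking the posterior mean of the variance directly. The two plug-ins differ, and so does the deviance at the plug-in value.

```python
import numpy as np

def plug_in_deviances(y, shape, rate, draws=200_000, seed=0):
    """Compare the plug-in deviance under two parameterizations.

    Assumes y_i ~ N(0, variance) and hypothetical posterior draws
    precision ~ Gamma(shape, rate); both are illustrative assumptions.
    Returns (dev_from_precision, dev_from_variance,
             variance_from_precision, variance_direct).
    """
    rng = np.random.default_rng(seed)
    # NumPy's gamma takes (shape, scale), so scale = 1 / rate.
    precision = rng.gamma(shape, 1.0 / rate, size=draws)
    variance = 1.0 / precision

    n, ss = y.size, np.sum(y ** 2)

    def deviance(v):
        # -2 * log-likelihood of N(0, v) for the data y
        return n * np.log(2.0 * np.pi * v) + ss / v

    variance_from_precision = 1.0 / precision.mean()  # h{E(theta | y)}
    variance_direct = variance.mean()                 # E{h(theta) | y}
    return (deviance(variance_from_precision), deviance(variance_direct),
            variance_from_precision, variance_direct)

if __name__ == "__main__":
    y = np.linspace(-2.0, 2.0, 20)
    print(plug_in_deviances(y, shape=5.0, rate=5.0))
```

Because 1/x is convex, the posterior mean of the variance exceeds the inverse of the posterior mean of the precision, so the two fit terms (and hence the two DIC values) do not agree.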
Another limitation, common to all likelihood-based criteria, is that for some
models, the likelihood is not available in closed form (e.g., the multivariate
probit model in Example 2.6). For many models, to evaluate the likelihood,
it is possible to use Monte Carlo integration and reweighting. For example, in
the multivariate probit model, we need to compute
\[ E_{z}\left[ \prod_{j=1}^{J} I\{z_{ij} > 0\}^{y_{ij}} \, I\{z_{ij} < 0\}^{1 - y_{ij}} \right] \]
given (β, Σ), where z_i follows a multivariate normal distribution with mean
x_i β and covariance matrix Σ. We can sample from the distribution of z_i and
compute the expectation by averaging the term in brackets over the samples.
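
The Monte Carlo step just described can be sketched as follows for a single subject i. This is a minimal illustration rather than a recommended implementation: the function name, the argument `mean_i` (standing for the J-vector x_i β), and the number of draws are assumptions introduced here, and plain averaging of indicators can be inefficient when the probability is small.

```python
import numpy as np

def probit_likelihood_mc(y_i, mean_i, Sigma, draws=100_000, seed=0):
    """Monte Carlo estimate of subject i's multivariate probit likelihood.

    y_i    : length-J array of 0/1 responses
    mean_i : length-J latent mean (x_i beta in the text's notation)
    Sigma  : J x J latent covariance matrix
    """
    rng = np.random.default_rng(seed)
    y_i = np.asarray(y_i)
    # Draw latent vectors z_i ~ N(mean_i, Sigma); shape (draws, J).
    z = rng.multivariate_normal(mean_i, Sigma, size=draws)
    # The product of indicators equals 1 exactly when every z_ij has the
    # sign implied by y_ij (z_ij > 0 if y_ij = 1, z_ij < 0 if y_ij = 0).
    agrees = np.where(y_i == 1, z > 0, z < 0)
    return agrees.all(axis=1).mean()

if __name__ == "__main__":
    Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
    print(probit_likelihood_mc([1, 1], np.zeros(2), Sigma))
```

For J = 2, zero latent means, and latent correlation 0.5, the exact value for y_i = (1, 1) is 1/4 + arcsin(0.5)/(2π) = 1/3, which gives a simple check on the estimate.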