output the values of many more fit statistics than are typically reported for the analysis,
which presents a few problems. One problem is that different fit statistics are reported
in different articles, and another is that different reviewers of the same manuscript may
request statistics that they know or prefer (Maruyama, 1998). It can therefore be difficult for a researcher to decide which particular statistics and which values to report.
There is also the possibility of selective reporting of values of fit statistics. For example,
a researcher keen to demonstrate acceptable model fit may report only those fit statistics
with favorable values. A related problem is “fit statistic tunnel vision,” a disorder apparent among practitioners of SEM who become so preoccupied with overall model fit that
other crucial information, such as whether the parameter estimates actually make sense,
is overlooked. Fortunately, there is a cure, and it involves close inspection of the whole
computer output (Chapter 7), not just the section on fit statistics.
A more fundamental issue is the ongoing debate in the field about the merits of the
two main classes of fit statistics described in the next section: model test statistics and
approximate fit indexes. To anticipate some of this debate now, some methodologists
argue strongly against what has become a routine—and bad—practice for researchers to
basically ignore model test statistics and justify retention of their preferred model based
on approximate fit indexes. Others argue that there is a role for reasoned use of approximate fit indexes in SEM, but not at the expense of what test statistics say about model
fit. I will try to convince you that (1) there is real value in the criticisms of those who
argue against the uncritical use of approximate fit indexes, and (2) we as practitioners
of SEM need to “clean up our act” by taking a more skeptical, discerning approach to
model testing. That is, we should walk the talk of disciplined model testing (practice the rigor that we as scientists preach).
The main benefit of hypothesis testing in SEM is to place a reasonable limit on
the extent of model–data discrepancy that can be attributed to mere sampling error.
Specifically, if the degree of this discrepancy does not clearly exceed what is expected by chance, there
is initial support for the model. This support may be later canceled by results of more
specific diagnostic assessments, however, and no testing procedure ever “proves” models in SEM (Chapter 1). Discrepancies between model and data that clearly surpass the
limits of chance require diagnostic investigation of model features that might need to be
respecified in order to make the model consistent with the evidence.
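The following is a minimal sketch, in Python, of the kind of model test this logic implies. It assumes that the minimized maximum likelihood discrepancy value (F_ML), the sample size, the number of observed variables, and the number of freely estimated parameters are taken from SEM software output; all numbers here are hypothetical, and the (N − 1) multiplier is one common definition of the test statistic (some programs use N instead).

# Hypothetical values; in practice these come from SEM software output.
from scipy.stats import chi2

N = 300       # sample size
F_ML = 0.030  # minimized maximum likelihood discrepancy function value
p = 6         # number of observed variables
q = 13        # number of freely estimated parameters

chi_square = (N - 1) * F_ML         # model test statistic
df = p * (p + 1) // 2 - q           # observations minus free parameters
p_value = chi2.sf(chi_square, df)   # chance probability of this much discrepancy

print(f"chi-square = {chi_square:.2f}, df = {df}, p = {p_value:.3f}")
# Here chi-square is about 8.97 with df = 8 and p is about .35, so the
# model-data discrepancy does not clearly exceed the limits of sampling
# error; this is only initial support for the model, not proof of it.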
Before any individual fit statistic is described, it is useful to keep in mind the following limitations of basically all fit statistics in SEM:
1. Values of fit statistics indicate only the average or overall fit of a model. That is,
fit statistics collapse many discrepancies into a single measure (Steiger, 2007). It is thus
possible that some parts of the model may fit the data poorly even if the value of a fit statistic seems favorable. In this case, the model may be inadequate despite the values of
its fit statistics. This is why I will recommend later that researchers report more specific
diagnostic information about model fit of the type that cannot be directly indicated by
fit statistics alone. Tomarken and Waller (2003) discuss potential problems with models that seem to fit the data well based on values of fit statistics. The brief sketch after this point illustrates how an averaged summary can mask a large local discrepancy.
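Here is a hypothetical numerical illustration, in Python, of this first limitation: an averaged, SRMR-like summary of correlation residuals can look acceptable even when one residual is large. The residual values below are invented for illustration and do not come from any real analysis.

import numpy as np

# Hypothetical correlation residuals (observed minus model-implied correlations)
# for the 15 unique variable pairs of a 6-variable model; 14 pairs are reproduced
# almost perfectly, but one pair is badly misfit.
residuals = np.array([0.01, -0.02, 0.02, 0.01, -0.01, 0.02, -0.02,
                      0.01, 0.02, -0.01, 0.01, -0.02, 0.02, 0.01,
                      0.25])

overall = np.sqrt(np.mean(residuals ** 2))  # an averaged, SRMR-like summary
worst = np.max(np.abs(residuals))           # the largest single residual

print(f"averaged summary = {overall:.3f}, largest residual = {worst:.3f}")
# The averaged summary is about .066, a value many would call favorable, yet one
# correlation is off by .25, so residual-level diagnostics are still needed.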