Small area estimation: The application of model-based or indirect estimators to link survey
outcome variables such as disease or substance use available for a national or regional
study, for example, a census, to local area predictors such as county demographic and
socioeconomic variables, to estimate local area disease or substance use prevalence rates.
The ‘areas’ in small area estimation may be defined by geographical domains such as a state
or county and by socio-demographic characteristics such as income, race, age, or gender
subgroups. Such an approach can be applied to cases where the number of area-specific
sample observations is not large enough to produce reliable direct estimates. [Small Area
Estimation, 2003, J. N. K. Rao, Wiley, New York.]
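One simple indirect estimator of this kind is a regression-synthetic estimate, sketched below under stated assumptions: a straight line is fitted to direct prevalence estimates from well-sampled areas against a single area-level predictor, and the fitted line is then used to predict prevalence in an area with too few observations. The function names, the single predictor and all the numbers are hypothetical illustrations and are not taken from Rao (2003).

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def synthetic_estimates(intercept, slope, predictors):
    """Predict area prevalence from the fitted line, clipped to [0, 1]."""
    return [min(1.0, max(0.0, intercept + slope * p)) for p in predictors]


# Made-up data: poverty rate (predictor) and direct prevalence estimate
# for four areas assumed to have adequate sample sizes.
poverty_rate = [0.10, 0.20, 0.30, 0.40]
direct_prevalence = [0.05, 0.07, 0.09, 0.11]

intercept, slope = fit_line(poverty_rate, direct_prevalence)

# A small area with too few sample observations for a direct estimate.
small_area_prevalence = synthetic_estimates(intercept, slope, [0.25])
```

In practice the linking model would usually include several predictors and area-level random effects; the single-predictor line is used here only to keep the idea visible.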
Small expected frequencies: A term that is found in discussions of the analysis of
contingency tables. It arises because the derivation of the chi-squared distribution as an
approximation for the distribution of the chi-squared statistic when the hypothesis of
independence is true, is
made under the assumption that the expected frequencies are not too small. Typically this
rather vague phrase has been interpreted as meaning that a satisfactory approximation is
achieved only when expected frequencies are five or more. Despite the widespread accept-
ance of this ‘rule’, it is nowadays thought to be largely irrelevant since there is a great deal of
evidence that the usual chi-squared statistic can be used safely when expected frequencies
are far smaller. See also STATXACT.[The Analysis of Contingency Tables, 2nd edition,
1992, B. S. Everitt, Chapman and Hall/CRC Press, London.]
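As a minimal sketch of the quantities involved, the following computes the expected frequencies under independence for a two-way table, the usual chi-squared statistic, and the cells whose expected frequency falls below a chosen threshold. The function names and the example counts are hypothetical, and the code assumes every row and column total is non-zero.

```python
def expected_frequencies(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_squared_statistic(table):
    """Sum over cells of (observed - expected)^2 / expected."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def cells_below_threshold(table, threshold=5.0):
    """(row, column) indices of cells with expected frequency < threshold."""
    expected = expected_frequencies(table)
    return [
        (i, j)
        for i, row in enumerate(expected)
        for j, e in enumerate(row)
        if e < threshold
    ]


# Made-up 2 x 2 table of counts.
observed = [[8, 2], [3, 7]]
expected = expected_frequencies(observed)   # [[5.5, 4.5], [5.5, 4.5]]
small_cells = cells_below_threshold(observed)
statistic = chi_squared_statistic(observed)
```

Here two cells have an expected frequency of 4.5, so the table would fail the traditional ‘rule’ even though, as noted above, the statistic is now thought to be usable in such cases.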
Smear-and-sweep: A method of adjusting death rates for the effects of confounding variables. The
procedure is iterative, each iteration consisting of two steps. The first entails ‘smearing’ the
data into a two-way classification based on two of the confounding variables, and the second
consists of ‘sweeping’ the resulting cells into categories according to their ordering on the
death rate of interest. [Encyclopedia of Statistical Sciences, 2006, eds. S. Kotz, C. B. Read,
N. Balakrishnan and B. Vidakovic, Wiley, New York.]
Smirnov, Nikolai Vasil’yevich (1900–1966): Born in Moscow, Russia, Smirnov graduated
from the University of Moscow in 1926 and then taught at Moscow University, Timiryazev
Agricultural Academy and Moscow City Pedagogical Institute. In 1938 he obtained his
doctorate with his dissertation, ‘On approximation of the distribution of random variables’.
From 1938 until his death Smirnov worked at the Steklov Mathematical Institute of the
USSR Academy of Sciences in Moscow, making significant contributions to the distributions
of statistics used in nonparametric tests and the limiting distributions of order statistics.
He died on June 2nd, 1966 in Moscow.
Smith, Cedric Austen Bardell (1917–2002): Born in Leicester, UK, Smith won a scholarship
to Trinity College, Cambridge, in 1935 from where he graduated in mathematics with first-
class honours in 1938. He then began research in statistics under Bartlett, J. Wishart and
Irwin, taking his doctorate in 1942. After World War II Smith became Assistant Lecturer at
the Galton Laboratory, eventually becoming Weldon Professor in 1964. It was during this
period that he worked on linkage analysis, introducing ‘lods’ (log-odds) to linkage studies
and showing how to compute them. Later he introduced a Bayesian approach to such
studies. Smith died on 16 January 2002.
Smoothing methods: A term that could be applied to almost all techniques in statistics that involve
fitting some model to a set of observations, but which is generally used for those methods
which use computing power to highlight unusual structure very effectively, by taking
advantage of people’s ability to draw conclusions from well-designed graphics. Examples
of such techniques include kernel methods, spline functions, nonparametric regression
and locally weighted regression. [TMS Chapter 2.]
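As an illustration of one such technique, the sketch below implements a simple kernel smoother (a Nadaraya-Watson weighted average with a Gaussian kernel). The function names, the bandwidth and the data are hypothetical choices made for the example, and the code assumes the evaluation points lie close enough to the data that the kernel weights do not all underflow to zero.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the weighting function."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_smooth(x, y, grid, bandwidth):
    """Kernel-weighted average of y at each point of grid."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    fitted = []
    for g in grid:
        # Observations near g receive larger weights than distant ones.
        weights = [gaussian_kernel((g - xi) / bandwidth) for xi in x]
        total = sum(weights)
        fitted.append(sum(w * yi for w, yi in zip(weights, y)) / total)
    return fitted


# Made-up observations and an evaluation grid.
x_obs = [0.0, 1.0, 2.0, 3.0, 4.0]
y_obs = [1.0, 3.0, 2.0, 5.0, 4.0]
smoothed = kernel_smooth(x_obs, y_obs, [0.0, 1.0, 2.0, 3.0, 4.0], bandwidth=1.0)
```

A larger bandwidth gives a smoother curve that follows the data less closely; a smaller one tracks individual observations more tightly.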