1430 Part H Automation in Medical and Healthcare Systems
are not based on testing the hypotheses made in advance
by people. Nevertheless, data mining techniques
can be applied to investigate medical improvement and
patient relationship management in general hospital
management based on the records of patient visits.
Major data mining tools that may help healthcare
administration or healthcare services include:
1. Classification tools, e.g., decision trees and neural
networks
2. Clustering techniques, e.g., K-Means and hierarchical
clustering
3. Association rules, e.g., the Apriori algorithm and rough
set theory.
Classification is a tool to identify the relationships
between certain conditional (independent) variables
and a decision (dependent) variable. The relationships
are mostly specified for a certain range of conditional
variables (e.g., age between 20 and 25 or systolic blood
pressure above 140 mmHg) and a certain range of decision
variables (e.g., annual cost of medical care between
$10,000 and $15,000). Because decision tree tools,
such as ID3 and CHAID, provide clear IF-THEN rules
for presenting the relationships between variables, they
are more readily accepted than neural networks, which
define variable relationships as weights on the
connections between neurons.
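As an illustration of how a decision tree yields IF-THEN rules, the following is a minimal ID3-style sketch that splits on the attribute with the highest information gain. The patient records, the variable names (age, bp, cost), and the helper functions are hypothetical and chosen only for this example; it is not a full ID3 or CHAID implementation.

```python
import math
from collections import Counter

# Hypothetical, made-up records (not from the source): two conditional
# variables ("age", "bp") and one decision variable ("cost").
RECORDS = [
    {"age": "20-25", "bp": "normal", "cost": "low"},
    {"age": "20-25", "bp": "high",   "cost": "high"},
    {"age": "26-40", "bp": "normal", "cost": "low"},
    {"age": "26-40", "bp": "high",   "cost": "high"},
    {"age": "41-60", "bp": "normal", "cost": "high"},
    {"age": "41-60", "bp": "high",   "cost": "high"},
]

def entropy(rows, target):
    """Shannon entropy of the decision variable over the given rows."""
    counts = Counter(r[target] for r in rows)
    total = len(rows)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def information_gain(rows, attr, target):
    """Entropy reduction obtained by splitting the rows on attr."""
    total = len(rows)
    remainder = 0.0
    for value in set(r[attr] for r in rows):
        subset = [r for r in rows if r[attr] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(rows, target) - remainder

def extract_rules(rows, attrs, target, conditions=()):
    """Return (conditions, decision) pairs, one per leaf of the tree."""
    labels = Counter(r[target] for r in rows)
    # Stop when the subset is pure or no attributes remain.
    if len(labels) == 1 or not attrs:
        return [(conditions, labels.most_common(1)[0][0])]
    best = max(attrs, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attrs if a != best]
    rules = []
    for value in sorted(set(r[best] for r in rows)):
        subset = [r for r in rows if r[best] == value]
        rules += extract_rules(subset, remaining, target,
                               conditions + ((best, value),))
    return rules

def format_rule(conditions, decision, target):
    """Render one leaf as an IF-THEN rule."""
    clause = " AND ".join(f"{a} = {v}" for a, v in conditions) or "TRUE"
    return f"IF {clause} THEN {target} = {decision}"

rules = extract_rules(RECORDS, ["age", "bp"], "cost")
for conds, decision in rules:
    print(format_rule(conds, decision, "cost"))
```

Each printed line corresponds to one path from the root of the tree to a leaf, which is why such output tends to be easier for practitioners to read than a table of network weights.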
Clustering is a tool to separate a data set into
multiple subsets. A predefined distance that measures
similarity between two cases and methods for finding
clusters (subsets of data) are the two key elements in
clustering. Clustering does not require a dependent
variable. Usually it is applied to handle data with a massive
number of variables. Clustering methods (algorithms)
iteratively check the distance between cases to find
clusters. Basically, the distance between two cases that
belong to the same cluster should be shorter than the
distance between two cases that belong to two different
clusters. Clustering results are usually difficult to
interpret and apply, since distance is an unfamiliar
concept in medicine, and the cases of each cluster may
include a large number of variables. Even so, clustering
has been applied as a technique for analyzing gene
expression data in the biological field, e.g., [80.14], and
to problems of clinical and medical care, e.g., [80.15]. For
example, Jannin and Morandi successfully applied decision
tree and clustering techniques to predict parts of the
surgical procedure for brain-tumor patients based on the
pathology-related characteristics of the patient [80.16].
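The two key elements of clustering, a distance measure and an iterative method for finding clusters, can be sketched with a plain K-Means loop. The cases below are made-up values, and the choice of Euclidean distance, two clusters, and the first and last case as starting centroids are assumptions made only for this illustration.

```python
import math

# Hypothetical cases with two made-up numeric variables
# (e.g., age in years and systolic blood pressure in mmHg).
CASES = [
    (22.0, 118.0), (24.0, 121.0), (25.0, 119.0),
    (61.0, 150.0), (64.0, 155.0), (66.0, 149.0),
]

def distance(a, b):
    """Euclidean distance, used here as the predefined distance measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def k_means(cases, initial_centroids, max_iter=100):
    """Assign each case to its nearest centroid, then move each centroid
    to the mean of its cases; repeat until assignments stop changing."""
    centroids = list(initial_centroids)  # copy so the caller's list is untouched
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centroids)), key=lambda k: distance(c, centroids[k]))
            for c in cases
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [c for c, a in zip(cases, assignment) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = mean(members)
    return assignment, centroids

# Deterministic start (an assumption): first and last case as centroids.
labels, centers = k_means(CASES, [CASES[0], CASES[-1]])
print(labels)
print(centers)
```

In practice, variables measured on different scales are usually standardized before distances are computed, since otherwise the variable with the largest numeric range dominates the result.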
Association rule tools aim to find associations
(co-occurrence relationships) between variables; a strong
rule may suggest causality but does not establish it. For
example: patients who have diabetes are also hypertensive
(15%, 90%). The 15% represents support for such a rule,
whereas the 90% represents confidence. Support is the
proportion of the number of patients who have both diabetes
and hypertension over the total number of patients.
Confidence is the proportion of the number of patients who
have both diabetes and hypertension over the number
of patients who have diabetes. An advantage of association
rules is that they can identify a relationship that has
low support but high confidence. Searching for a plausible
curative treatment or medication for patients with rare
disorders is a typical problem to which association rule
techniques can be applied, for two reasons. First, because
such patients are rare, the support is small. Second, if the
majority of the patients are cured by a common treatment
or medication, its confidence is high. Traditional
statistical tools, which mostly do not allow a large number
of variables in the analysis and rely on a large number
of samples, are not able to handle such a problem.
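The definitions of support and confidence can be made concrete with a short computation. The patient records below are made up so that the rule "diabetes, then hypertension" has support 9/60 = 15% and confidence 9/10 = 90%, matching the example in the text; the function names are hypothetical, and a real tool such as the Apriori algorithm would additionally search over many candidate rules.

```python
# Hypothetical patient records (made up for illustration): each frozenset
# lists the conditions recorded for one patient. 60 patients in total.
PATIENTS = (
    [frozenset({"diabetes", "hypertension"})] * 9   # both conditions
    + [frozenset({"diabetes"})] * 1                 # diabetes only
    + [frozenset({"hypertension"})] * 20            # hypertension only
    + [frozenset({"asthma"})] * 30                  # neither condition
)

def support(records, antecedent, consequent):
    """Share of all records that contain both sides of the rule."""
    both = antecedent | consequent
    return sum(1 for r in records if both <= r) / len(records)

def confidence(records, antecedent, consequent):
    """Share of records containing the antecedent that also contain
    the consequent; 0.0 if no record contains the antecedent."""
    both = antecedent | consequent
    n_antecedent = sum(1 for r in records if antecedent <= r)
    n_both = sum(1 for r in records if both <= r)
    return n_both / n_antecedent if n_antecedent else 0.0

s = support(PATIENTS, {"diabetes"}, {"hypertension"})
c = confidence(PATIENTS, {"diabetes"}, {"hypertension"})
print(f"support = {s:.0%}, confidence = {c:.0%}")
```

Note that confidence depends only on the patients who have diabetes, which is why a rule about a small subgroup can have low support and still high confidence.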
Though data mining provides an opportunity for
finding medical evidence that may not be found by traditional
statistical tools, few applications have been reported
in the healthcare field. Some researchers attribute
this to the large number of false alarms created by
data mining. In fact, the credibility of mining results
can be assessed by cross-validation or other related
tools [80.17]. From the author’s perspective, there
are two major obstacles to data mining applications in
healthcare. First, healthcare practitioners may not be
familiar with data warehousing and data mining. Second,
many legacy medical data tables are not normalized. It
is believed that both obstacles will be overcome in the
near future, because many multidisciplinary research
opportunities have been created for both healthcare and
computer science professionals.
80.7 Developing a Healthcare Information System
Developing a good healthcare information system relies
on at least three elements: the right people, the
right project management, and the right strategic plan.
Right people means that the healthcare information system
should cover the operational needs of major system
users (e.g., administrators, physicians, nurses, staff, pa-