886 Chapter 25 Distributed Databases
■
Differences in query languages. Even with the same data model, the lan-
guages and their versions vary. For example, SQL has multiple versions like
SQL-89, SQL-92, SQL-99, and SQL:2008, and each system has its own set of
data types, comparison operators, string manipulation features, and so on.
Semantic Heterogeneity. Semantic heterogeneity occurs when there are differ-
ences in the meaning, interpretation, and intended use of the same or related data.
Semantic heterogeneity among component database systems (DBSs) creates the
biggest hurdle in designing global schemas of heterogeneous databases. The design
autonomy of component DBSs refers to their freedom of choosing the following
design parameters, which in turn affect the eventual complexity of the FDBS:
■
The universe of discourse from which the data is drawn. For example, for
two customer accounts, databases in the federation may be from the United
States and Japan and have entirely different sets of attributes about customer
accounts required by the accounting practices. Currency rate fluctuations
would also present a problem. Hence, relations in these two databases that
have identical names—
CUSTOMER or ACCOUNT—may have some com-
mon and some entirely distinct information.
■
Representation and naming. The representation and naming of data ele-
ments and the structure of the data model may be prespecified for each local
database.
■
The understanding, meaning, and subjective interpretation of data. This is
a chief contributor to semantic heterogeneity.
■
Transaction and policy constraints. These deal with serializability criteria,
compensating transactions, and other transaction policies.
■
Derivation of summaries. Aggregation, summarization, and other data-
processing features and operations supported by the system.
The above problems related to semantic heterogeneity are being faced by all major
multinational and governmental organizations in all application areas. In today’s
commercial environment, most enterprises are resorting to heterogeneous FDBSs,
having heavily invested in the development of individual database systems using
diverse data models on different platforms over the last 20 to 30 years. Enterprises
are using various forms of software—typically called the middleware, or Web-
based packages called application servers (for example, WebLogic or WebSphere)
and even generic systems, called Enterprise Resource Planning (ERP) systems (for
example, SAP, J. D. Edwards ERP)—to manage the transport of queries and transac-
tions from the global application to individual databases (with possible additional
processing for business rules) and the data from the heterogeneous database servers
to the global application. Detailed discussion of these types of software systems is
outside the scope of this book.
Just as providing the ultimate transparency is the goal of any distributed database
architecture, local component databases strive to preserve autonomy.
Communication autonomy of a component DBS refers to its ability to decide
whether to communicate with another component DBS. Execution autonomy