ity at the information level. Whereas Java, VB, and C# provide certain levels of portability
at the programming language level, XML provides the information portability that we are
looking for.
The Extensible Markup Language (XML)
XML stands for Extensible Markup Language. You probably are already familiar with another
markup language called HTML (Hypertext Markup Language). Both XML and
HTML are descendants of SGML, the Standard Generalized Markup Language. Surprisingly,
SGML appeared as early as the 1970s and was standardized by ISO in 1986.
The primary function of HTML is to present data in a browser. It was actually developed
to organize data using hyperlinks, and the browser is a perfect vehicle for this purpose.
However, HTML is meant to format and present data, not to define and verify it.
HTML was defined as an application of SGML but did not include the data verification
constructs provided by the SGML specification. The reason for this is that SGML is very
complex and sophisticated, and implementing SGML completely can be quite expensive.
At least early on, HTML did not concern itself with data verification issues, among other things.
XML, on the other hand, does concern itself with data verification issues. XML was
defined in the late 1990s as a subset of SGML (XML 1.0 became a W3C Recommendation
in 1998). XML is much more strict with its format than HTML and was designed to
represent data. XML is not proprietary; the World Wide Web Consortium (W3C) is the
organization that publishes its recommendations and promotes their adoption.
In subsequent chapters, we will see how XML is used within various object-oriented
technologies such as distributed computing, object persistence, and so on.
One of the philosophical problems with Java is that it is proprietary (owned by Sun
Microsystems). The .NET framework is also proprietary (owned by Microsoft). The
beauty of XML is that it is an open technology. In fact, it is one of the few technologies
that have been embraced by most of the IT industry leaders: Sun, Microsoft, IBM, and so
on. Thus, XML is not about to go away anytime soon.
XML Versus HTML
Soon after XML emerged, there was speculation that XML would replace HTML. Many
believed that because they were both descendants of SGML, XML was an upgrade. In reality,
HTML and XML are designed for different purposes. HTML presents data, and
XML describes the data. Both HTML and XML are important tools in the development
of Web-based systems.
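
To make the distinction concrete, here is a minimal sketch in Python. The supplier record and all of its tag names are hypothetical, invented only for illustration. The same information is written once with HTML-style presentation tags and once with XML tags that describe what each value is. Because the XML version names its data, a program can ask for a field by meaning rather than by its position on the page.

```python
import xml.etree.ElementTree as ET

# Hypothetical record marked up for presentation: the tags say how the
# text should look (bold, line break), not what the text means.
HTML_VIEW = "<p><b>Acme Supply</b><br>Cleveland</p>"

# The same hypothetical record marked up for description: the invented
# tags <supplier>, <name>, and <city> say what each value is.
XML_VIEW = "<supplier><name>Acme Supply</name><city>Cleveland</city></supplier>"


def field(xml_text, tag):
    """Return the text of the first child element named `tag`, or None."""
    root = ET.fromstring(xml_text)
    return root.findtext(tag)


if __name__ == "__main__":
    print(field(XML_VIEW, "name"))
    print(field(XML_VIEW, "city"))
```

Nothing in the HTML string tells a program which piece of text is the supplier name and which is the city; that knowledge would have to be hard-coded by whoever reads the page layout.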
XML actually looks a lot like HTML. This is not surprising, because they come from
the same source. However, XML provides two primary advantages that HTML does
not: validity and well-formed documents.
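
As a rough illustration of the well-formedness half of that claim, the sketch below uses the XML parser in Python's standard library to accept or reject a document. The helper name `is_well_formed` and the sample documents are assumptions made for this example. Checking validity, that is, checking a document against a definition of which tags are allowed, requires a validating parser and is not shown here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise.

    This checks only well-formedness (proper nesting, matching tags,
    a single root element). It does not check validity.
    """
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents.
GOOD = "<supplier><name>Acme Supply</name></supplier>"
BAD = "<supplier><name>Acme Supply</supplier>"  # <name> is never closed

if __name__ == "__main__":
    print(is_well_formed(GOOD))
    print(is_well_formed(BAD))
```

A browser will usually render HTML that contains this kind of mistake anyway; an XML parser is required to reject it.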
HTML tags are all predefined. Tags such as <HTML>, <HEAD>, <BODY>, and so on are all
defined in the HTML specification. You cannot add your own tags. Because HTML is intended
for formatting purposes, this is not really a problem. XML, however, is meant to