XML: Grease for the wheels of electronic business
By Michael Champion, Research and Development Fellow at
Software AG
The extensible markup language (XML) has evolved as a promising
solution for a number of very real problems that face enterprises
aiming to exploit electronic business in their markets. It offers a
widely adopted standard way of representing text and data in a format
that can be easily processed and exchanged across diverse hardware,
operating systems, applications and the Web.
The current wave of worldwide interest in XML has come because it
is viewed as a panacea for a range of problems that hinder enterprises
striving to develop electronic business applications. These include:
- The most common format for document exchange, Microsoft Word, is
only available on the Windows and Macintosh platforms, thus
limiting information exchange and reuse.
- HTML is rigidly defined and cannot be extended without
destroying the interoperability of the data across applications
and platforms.
- Exchanging data across platforms and between applications is
difficult using traditional EDI, because solutions are cumbersome
and expensive.
XML has quickly gathered great interest because it is a relatively
simple standard to understand, use, and implement. It is a classic
"80:20" solution, meaning that it supplies about 80 percent
of the functionality of competing technologies (such as MS Word for
documents, EDI for data exchange) with perhaps 20 percent of the
effort required to build enterprise-level solutions.
Comparing HTML and XML
XML and HTML both stem from the Standard Generalized Markup Language
(SGML), an ISO standard, and therefore share many similarities. There
are, however, three fundamental differences:
- XML separates form and content: HTML mostly
consists of tags defining the appearance of text; in XML the tags
generally define the structure and content of the data, with the
actual appearance determined by a particular application or an
associated style sheet.
- XML makes data self-describing: The tags label
the meaning of a piece of data, greatly reducing the difficulty of
extracting useful information out of a data stream or of using
data from one application in another.
- XML is extensible: tags can be defined by
individuals or organizations for some specific application,
whereas the HTML standard tagset is defined by the World Wide Web
Consortium (W3C).
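The three differences above can be illustrated with a minimal sketch
using Python's standard xml.etree.ElementTree module. The element
names (product, name, price, warranty-months) and the sample values
are hypothetical, invented only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog entry: the tag names describe meaning, not appearance.
XML_RECORD = """
<product>
  <name>Widget</name>
  <price currency="USD">19.95</price>
</product>
"""

# Comparable HTML-style markup: the tags only say "bold" and "italic",
# so a program cannot tell which value is the price.
HTML_FRAGMENT = "<p><b>Widget</b> <i>19.95</i></p>"


def read_price(xml_text):
    """Return (amount, currency) by asking for the element by name."""
    product = ET.fromstring(xml_text)
    price = product.find("price")
    return float(price.text), price.get("currency")


def extend_record(xml_text):
    """Add an application-defined element; existing readers are unaffected."""
    product = ET.fromstring(xml_text)
    ET.SubElement(product, "warranty-months").text = "12"
    return ET.tostring(product, encoding="unicode")
```

Calling read_price on the output of extend_record still returns the
same price, which is the practical meaning of "extensible": a reader
that looks elements up by name simply ignores tags it does not know.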
Although XML and HTML both use tags to mark up content, computer
programs can "understand" the XML-tagged data better because
it carries meaningful semantic and structural information.
Take, for example, the healthcare data management initiative
"HL7," for which an XML solution is currently in
development. Health records are partly structured but highly variable,
made up of both text and data, and issues such as privacy, security,
accessibility, and integrity make this an extremely challenging
project. Attempted solutions based on RDBMS and HTML have not been
successful, and EDI solutions are too expensive for individual
physicians and small clinics.
How XML is being used today
Document Content Management
Various target groups inside and outside of organizations require
timely delivery of information in different formats and on numerous
media, for example product catalogs on paper, CD-ROM, or HTML.
Unfortunately, maintaining and formatting textual information is
costly because it must be performed manually.
Content management is therefore a prime application area for XML,
because information stored in XML format can be filtered quickly to
suit virtually any audience. Filtering is made possible by style
sheets, which are basically sets of rules (XSL) defining how various
XML elements are to be displayed for a particular output format. For
example, XML transformation systems such as XSL Transformations (XSLT)
convert XML for display on several types of front-ends, including
conventional browsers, XML browsers, and even lightweight devices such
as cell phones and PDAs that support the Wireless Markup Language (WML).
WML is itself an XML-based language.
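The idea of one content source and several rule sets can be sketched
as follows. This is not XSL or XSLT: it is a plain-Python stand-in in
which a small table of templates plays the role of a style sheet. The
catalog structure and the RULES table are assumptions made for the
example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical product catalog stored once, independent of any output format.
CATALOG = """
<catalog>
  <product><name>Widget</name><price>19.95</price></product>
  <product><name>Gadget</name><price>4.50</price></product>
</catalog>
"""

# One rule set per output format. These tables only mimic what a
# style sheet does; they are not XSL syntax.
RULES = {
    "html": {
        "product": "<li><b>{name}</b>: {price}</li>",
        "wrap": "<ul>{items}</ul>",
        "sep": "",
    },
    "text": {
        "product": "{name} {price}",
        "wrap": "{items}",
        "sep": "\n",
    },
}


def render(xml_text, output_format):
    """Render the same XML content using the rules for one output format."""
    rules = RULES[output_format]
    # Escape text only when the target format is HTML.
    quote = escape if output_format == "html" else (lambda s: s)
    items = []
    for product in ET.fromstring(xml_text).iter("product"):
        items.append(rules["product"].format(
            name=quote(product.findtext("name")),
            price=quote(product.findtext("price")),
        ))
    return rules["wrap"].format(items=rules["sep"].join(items))
```

Supporting a further medium in this sketch means adding one more entry
to RULES; the stored content itself does not change.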
A pioneering XML project that has been very successful, but has
received relatively little attention, is the Wall Street Journal (WSJ)
Interactive Edition, an online newspaper. XML allows intelligent
searching within the WSJ Interactive Edition archive, so that
searching for the "company name" tag "IBM," for
example, returns stories about IBM, but not stories in which IBM is
mentioned in passing.
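The difference between searching a tag and searching raw text can be
sketched as below. The archive layout and the element names (story,
headline, company, body) are hypothetical; the source does not
describe the WSJ's actual markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical archive: each story labels the company it is about.
ARCHIVE = """
<archive>
  <story id="1">
    <headline>IBM reports quarterly earnings</headline>
    <company>IBM</company>
    <body>Revenue rose on strong services demand.</body>
  </story>
  <story id="2">
    <headline>Chip makers rally</headline>
    <company>Intel</company>
    <body>Analysts also mentioned IBM in passing.</body>
  </story>
</archive>
"""


def stories_about(xml_text, company):
    """Tag-based search: match only the <company> element."""
    root = ET.fromstring(xml_text)
    return [
        story.get("id")
        for story in root.iter("story")
        if any(c.text == company for c in story.findall("company"))
    ]


def stories_mentioning(xml_text, word):
    """Plain full-text search: match the word anywhere in the story."""
    root = ET.fromstring(xml_text)
    return [
        story.get("id")
        for story in root.iter("story")
        if word in "".join(story.itertext())
    ]
```

Here stories_about(ARCHIVE, "IBM") selects only the first story, while
stories_mentioning(ARCHIVE, "IBM") also picks up the second, where the
name appears only in the body text.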
The power of XML was demonstrated when WSJ subsequently wanted to
provide news to PalmPilot users. Subscribers now simply download
information formatted for the limited Web browser capabilities of these
handheld devices to their PCs, then sync it to their palm computers
for later reading.
In short, the WSJ Interactive Edition successfully uses XML
technologies to separate form from content so that content can be
processed more efficiently. The same XML data is filtered for print,
for the Web, and for a stripped-down HTML version delivered to PDAs.
Enterprise integration
Much of the current excitement surrounding XML involves its
potential in linking existing enterprise systems, often ERP systems,
with:
- each other. Organizational changes, mergers,
and the general evolution of technology and business often force
enterprise-level systems that were designed to stand alone to
exchange data with other "monolithic" systems. ERP
vendors and some enterprise integration vendors are using XML as
the cornerstone for these integration efforts.
- suppliers and customers. Traditionally the
domain of EDI, projects in the area of business-to-business have
been relatively limited outside of very large enterprises because
of cost and complexity. XML-based EDI has come into its own
recently because it combines a universal transport mechanism, the
Internet, with a universal data format, XML.
- consumers. The explosion of interest in
e-commerce has made it imperative for many enterprises to offer
their goods or services directly over the Internet. Many products
are available for building online storefronts, but few offer links
to underlying ERP systems, and ERP vendors have not been quick to
offer sophisticated storefront software of their own. This has led
to a paradoxical situation where orders entered electronically by
consumers over the Internet often must be re-entered by hand into
the ERP systems that track inventory or schedule production.
Vendors such as Intershop are using XML as the basis for data
presented to and received from the user and to link with back-end
EDI and ERP systems.
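The re-keying problem described above goes away once an order arrives
in a form that a program can read directly. The following is a minimal
sketch, assuming a hypothetical order format and a hypothetical record
layout on the receiving side; no real storefront or ERP interface is
implied.

```python
import xml.etree.ElementTree as ET

# Hypothetical order as a storefront might emit it.
ORDER = """
<order id="A-1001">
  <customer>Example Corp</customer>
  <item sku="W-1" quantity="3"/>
  <item sku="G-7" quantity="1"/>
</order>
"""


def order_to_record(xml_text):
    """Turn an XML order into a plain record a back-end system could load."""
    order = ET.fromstring(xml_text)
    return {
        "order_id": order.get("id"),
        "customer": order.findtext("customer"),
        # Each line item becomes a (sku, quantity) pair.
        "lines": [
            (item.get("sku"), int(item.get("quantity")))
            for item in order.findall("item")
        ],
    }
```

Because the tags label each value, the receiving system needs no
knowledge of how the storefront displayed the order to the consumer.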
Another way that XML can be used to integrate diverse applications
deserves mention. Various initiatives, for example XML-RPC and
Microsoft's Simple Object Access Protocol (SOAP), have defined means
of performing remote procedure calls between client and server
applications using HTTP as the transport mechanism and XML to encode
the details. Microsoft plans to evolve COM (the Component Object
Model, which enables Windows applications to interoperate) to
incorporate SOAP, which will allow applications on any platform to
interoperate with others via XML messages.
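How a procedure call is encoded in XML can be seen with the
xmlrpc.client module from Python's standard library, which implements
XML-RPC (SOAP uses a different, richer envelope). The method name
"inventory.reserve" and its arguments are hypothetical. The sketch
only builds and decodes the message; in practice the payload would be
sent to a server in an HTTP POST request.

```python
import xmlrpc.client


def encode_call(method, *args):
    """Encode a remote procedure call as an XML-RPC request document."""
    return xmlrpc.client.dumps(args, methodname=method)


def decode_call(payload):
    """Decode an XML-RPC request back into (method name, arguments)."""
    params, method = xmlrpc.client.loads(payload)
    return method, params


# Hypothetical call: reserve three units of stock item "W-1".
payload = encode_call("inventory.reserve", "W-1", 3)
```

The payload is ordinary XML text containing a methodName element and
one param element per argument, so any platform with an XML parser and
an HTTP client can produce or consume it.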
Enterprise information portals
Enterprise information portals (EIPs) provide coherent,
personalized views of scattered data in an enterprise, delivered over
an intranet or the Internet to anyone with a browser (see Software
Report No. 49, September 1999). Portals can help classify and focus
information to support specific internal business objectives (such as
workgroup productivity) or to provide information services to
customers. XML is central to many portal products because it is rich
enough to represent data from a number of sources yet flexible enough
to be formatted for display in browsers. This allows the portal itself
to be a relatively thin application that relies on underlying
databases, ERP systems, search engines, and the Web itself, for the
underlying content and functionality.
Early adopters of the EIP concept have already begun projects.
General Motors (Detroit, Michigan, USA), for example, has developed an
engineering portal to support the design of its automobile components.
The solution is based on the product RIO by DataChannel Inc (Bellevue,
Washington, USA). The latest version of RIO, 3.2, applies
"X-Machine" technology developed by Software AG to store XML
information.
Conclusion
Today, XML plays a role similar to that of English for natural
languages. It is a language that anyone can easily learn and use to
communicate, and it has established itself as a universal standard.
Although XML is not yet fully mature, standards and products are
quickly evolving to support it deep in the native architecture of
electronic business applications. XML offers today's developers
powerful glue to connect disparate applications, but in the near
future it will play a more central role in application architectures.
Mike Champion can be contacted at mike.champion@sagus.com
Box: Criteria for using XML in projects
XML technology should be considered for projects in which:
- data must be exchanged across diverse systems;
- imported data must be processed with minimal human involvement;
- data must be presented in multiple formats or on a variety of
media;
- existing products or specialized tools can be leveraged to ease
development.