XML: Grease for the wheels of electronic business

By Michael Champion, Research and Development Fellow at Software AG

The Extensible Markup Language (XML) has evolved as a promising solution for a number of very real problems that face enterprises aiming to exploit electronic business in their markets. It offers a widely adopted, standard way of representing text and data in a format that can be easily processed and exchanged across diverse hardware, operating systems, applications, and the Web.

The current wave of worldwide interest in XML has come because it is viewed as a panacea for a range of problems that hinder enterprises striving to develop electronic business applications. These include:

  • The most common format for document exchange, Microsoft Word, is only available on the Windows and Macintosh platforms, thus limiting information exchange and reuse.
  • HTML is rigidly defined and cannot be extended without destroying the interoperability of the data across applications and platforms.
  • Exchanging data across platforms and between applications is difficult using traditional EDI, because solutions are cumbersome and expensive.

XML has quickly gathered great interest because it is a relatively simple standard to understand, use, and implement. It is a classic "80:20" solution, meaning that it supplies about 80 percent of the functionality of competing technologies (such as MS Word for documents, EDI for data exchange) with perhaps 20 percent of the effort required to build enterprise-level solutions.

Comparing HTML and XML

XML and HTML both stem from the Standard Generalized Markup Language (SGML), an ISO standard, and therefore share many similarities. There are, however, three fundamental differences:

  • XML separates form and content: HTML mostly consists of tags defining the appearance of text; in XML the tags generally define the structure and content of the data, with actual appearance specified by a specific application or an associated style sheet.
  • XML makes data self-describing: The tags label the meaning of a piece of data, greatly reducing the difficulty of extracting useful information out of a data stream or of using data from one application in another.
  • XML is extensible: tags can be defined by individuals or organizations for some specific application, whereas the HTML standard tagset is defined by the World Wide Web Consortium (W3C).

Although XML and HTML both apply tags to mark up content, computer programs can "understand" XML-tagged data better because it carries meaningful semantic and structural information.
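As a minimal sketch of what "self-describing" means in practice, the following Python snippet reads a price out of a small product record by asking for it by name. The element names (product, name, price) and the currency attribute are invented for this illustration and do not come from any standard vocabulary. In an HTML page the same figure would typically sit inside a presentational tag such as <b>, which tells a program nothing about what the number means.

```python
import xml.etree.ElementTree as ET

# Hypothetical product record; the tag names are illustrative assumptions.
PRODUCT_XML = """<product>
  <name>Desk lamp</name>
  <price currency="USD">19.95</price>
</product>"""

def read_price(xml_text):
    """Return (amount, currency) by looking the data up by its tag name."""
    root = ET.fromstring(xml_text)
    price = root.find("price")
    return float(price.text), price.get("currency")

amount, currency = read_price(PRODUCT_XML)
print(amount, currency)
```

Because the program addresses the data by meaning rather than by position or appearance, the record can gain new elements later without breaking this lookup.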

Take, for example, the healthcare data management initiative "HL7," for which an XML solution is currently in development. Health records are partly structured but highly variable, made up of both text and data, and issues such as privacy, security, accessibility, and integrity make this an extremely challenging project. Attempted solutions based on RDBMS and HTML have not been successful, and EDI solutions are too expensive for individual physicians and small clinics.

How XML is being used today

Document Content Management

Various target groups inside and outside of organizations require timely delivery of information in different formats and on numerous media, for example product catalogs on paper, CD-ROM, or HTML. Unfortunately, maintaining and formatting textual information is costly because it must be performed manually.

Content management is therefore a prime application area for XML, because information stored in XML format can be filtered quickly to suit virtually any audience. Filtering is made possible by style sheets, which are essentially sets of rules, written in the Extensible Stylesheet Language (XSL), that define how the various XML elements are to be displayed in a particular output format. For example, XSL Transformations (XSLT) processors convert XML for display on several types of front-ends, including conventional browsers, XML browsers, and even lightweight devices such as cell phones and PDAs that support the Wireless Markup Language (WML), which is itself defined in XML.
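The idea of one source with several renderings can be sketched in a few lines of Python. This is not XSLT: Python's standard library does not include an XSLT processor, so two ordinary functions stand in for two style sheets here. The catalog structure and element names are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical catalog; in a real system this would be the single stored source.
CATALOG_XML = """<catalog>
  <product><name>Widget</name><price>9.50</price></product>
  <product><name>Gadget</name><price>24.00</price></product>
</catalog>"""

def to_html(root):
    """Stand-in for a browser style sheet: render products as an HTML list."""
    items = "".join(
        f"<li><b>{escape(p.findtext('name'))}</b>: {escape(p.findtext('price'))}</li>"
        for p in root.findall("product")
    )
    return f"<ul>{items}</ul>"

def to_text(root):
    """Stand-in for a small-device style sheet: one plain line per product."""
    return "\n".join(
        f"{p.findtext('name')} {p.findtext('price')}"
        for p in root.findall("product")
    )

catalog = ET.fromstring(CATALOG_XML)
print(to_html(catalog))
print(to_text(catalog))
```

The content is maintained once; adding a new output medium means adding a new rendering rule set, not re-editing the content.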

A pioneering XML project that has been very successful, but has received relatively little attention, is the Wall Street Journal (WSJ) Interactive Edition, an online newspaper. XML allows intelligent searching within the WSJ Interactive Edition archive, so that searching for the "company name" tag "IBM," for example, returns stories about IBM, but not stories in which IBM is mentioned in passing.
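The difference between searching on a tag and searching on raw text can be illustrated with a small Python sketch. The archive layout and the element names (story, headline, company, body) are hypothetical and are not the markup actually used by the WSJ Interactive Edition, which the article does not describe.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-story archive; tag names are assumptions for illustration.
ARCHIVE_XML = """<archive>
  <story id="1">
    <headline>IBM reports quarterly earnings</headline>
    <company>IBM</company>
    <body>IBM said on Tuesday that revenue rose.</body>
  </story>
  <story id="2">
    <headline>Chip startup raises funding</headline>
    <company>Acme Silicon</company>
    <body>The founders previously worked at IBM.</body>
  </story>
</archive>"""

def stories_about(root, company):
    """Tag-based search: match only stories tagged with the company name."""
    return [
        s.get("id")
        for s in root.findall("story")
        if any(c.text == company for c in s.findall("company"))
    ]

def stories_mentioning(root, word):
    """Plain text search: match any story whose text contains the word."""
    return [
        s.get("id")
        for s in root.findall("story")
        if word in "".join(s.itertext())
    ]

archive = ET.fromstring(ARCHIVE_XML)
print(stories_about(archive, "IBM"))       # ['1']
print(stories_mentioning(archive, "IBM"))  # ['1', '2']
```

The tag-based query returns only the story that is about the company, while the text search also returns the story that mentions it in passing.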

The power of XML was demonstrated when WSJ subsequently wanted to provide news to PalmPilot users. Subscribers now simply download information formatted for the limited Web browser capabilities of these handheld devices to their PCs, then sync it to their palm computers for later reading.

In short, the WSJ Interactive Edition successfully uses XML technologies to separate form from content so that content can be processed more efficiently. The same XML data is filtered to produce the printed paper, the Web edition, and a stripped-down HTML version delivered to PDAs.

Enterprise integration

Much of the current excitement surrounding XML involves its potential in linking existing enterprise systems, often ERP systems, with:

  • each other. Organizational changes, mergers, and the general evolution of technology and business often force enterprise-level systems that were designed to stand alone to exchange data with other "monolithic" systems. ERP vendors and some enterprise integration vendors are using XML as the cornerstone for these integration efforts.
  • suppliers and customers. Traditionally the domain of EDI, projects in the area of business-to-business have been relatively limited outside of very large enterprises because of cost and complexity. XML-based EDI has come into its own recently because of the combination of the universal transport mechanism "Internet" and the universal data format "XML."
  • consumers. The explosion of interest in e-commerce has made it imperative for many enterprises to offer their goods or services directly over the Internet. Many products are available for building online storefronts, but few offer links to underlying ERP systems, and ERP vendors have not been quick to offer sophisticated storefront software of their own. This has led to a paradoxical situation where orders entered electronically by consumers over the Internet often must be re-entered by hand into the ERP systems that track inventory or schedule production. Vendors such as Intershop are using XML as the basis for data presented to and received from the user and to link with back-end EDI and ERP systems.

Another way that XML can be used to integrate diverse applications deserves mention. Various initiatives, for example XML-RPC and Microsoft’s Simple Object Access Protocol (SOAP), have defined means of performing remote procedure calls between client and server applications using HTTP as the transport mechanism and XML to encode the details of each call. Microsoft plans to evolve COM (the Component Object Model, which enables Windows applications to interoperate) to incorporate SOAP, which will allow applications on any platform to interoperate with others via XML messages.
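To make the "XML to encode the details" part concrete, the sketch below uses the xmlrpc.client module from Python's standard library to build the XML body of an XML-RPC call and then decode it again, as a server would. No network connection is made. The method name inventory.reserve and its arguments are hypothetical and stand for whatever procedure a real back-end system might expose.

```python
import xmlrpc.client

# Encode a call to a hypothetical remote procedure as an XML-RPC request body.
# In a live system this text would be sent to the server in an HTTP POST.
request_xml = xmlrpc.client.dumps(("SKU-1234", 3), methodname="inventory.reserve")
print(request_xml)

# Decode the same body back into Python values, as the receiving side would.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

The request is ordinary text that any platform with an XML parser can read, which is what allows the client and server to be written in different languages and run on different operating systems.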

Enterprise information portals

Enterprise information portals (EIPs) provide coherent, personalized views of scattered data in an enterprise, delivered over an intranet or the Internet to anyone with a browser (see Software Report No. 49, September 1999). Portals can help classify and focus information to support specific internal business objectives (such as workgroup productivity) or to provide information services to customers. XML is central to many portal products because it is rich enough to represent data from a number of sources yet flexible enough to be formatted for display in browsers. This allows the portal itself to be a relatively thin application that relies on underlying databases, ERP systems, search engines, and the Web itself, for the underlying content and functionality.

Early adopters of the EIP concept have already begun projects. General Motors (Detroit, Michigan, USA), for example, has developed an engineering portal to support the design of its automobile components. The solution is based on the product RIO by DataChannel Inc (Bellevue, Washington, USA). The latest version of RIO, 3.2, applies "X-Machine" technology developed by Software AG to store XML information.

Conclusion

Today, XML plays a role similar to that of English among natural languages. It is a language that anyone can easily learn and use to communicate, and it has established itself as a universal standard. Although XML is not yet fully mature, standards and products are quickly evolving to support it deep in the native architecture of electronic business applications. XML offers today's developers powerful glue to connect disparate applications, and in the near future it will play a more central role in application architectures.

 

Mike Champion can be contacted at mike.champion@sagus.com

Box: Criteria for using XML in projects

XML technology should be considered for projects in which:

  • data must be exchanged across diverse systems;
  • imported data must be processed with minimal human involvement;
  • data must be presented in multiple formats or on a variety of media;
  • existing products or specialized tools can be leveraged to ease development.