The importance of XSLT

By Juliane Harbarth, Software AG

About the Author: Juliane Harbarth is a technical consultant for database management systems R&D at Software AG. She is also a member of the XSL Working Group of the World Wide Web Consortium.

XSLT (XSL Transformations) is a standard way to describe how to transform the structure of an XML document. As a Recommendation of the World Wide Web Consortium (W3C), XSLT has established itself as a widely accepted standard in the XML field. Although its primary intent was to add style to XML data for presentation on various media, its application has expanded to other purposes. For example, because XSLT makes it possible to exchange XML data that conforms to different proprietary schemas, it has broad application in business-to-business data exchange. It can be regarded not only as ‘a certain XML-related standard,’ but also as being at the core of what makes XML really useful.

How XSLT began

In 1998 the World Wide Web Consortium (W3C) released its XML 1.0 Recommendation [1]. Amongst other things, XML claimed the benefit of the separation of content and presentation. This triggered the quest for a means of adding presentational information to XML instances.

Cascading Style Sheets

The first approach to XML style sheets was to use the Cascading Style Sheet Language (CSS) [2], which was originally used to style HTML Web sites. CSS expresses styling information in a very straightforward way. It allows presentational properties to be specified per tag as well as through ‘classes’. This is easily extended to work with XML instead of HTML.

Extensible Stylesheet Language

Another effort started by the W3C’s styling initiative is XSL, which is a much more ambitious approach than CSS. XSL comes in two parts: XSLT, a means of transforming XML, and XSL Formatting Objects (FO), an XML styling vocabulary intended to contain all the information necessary for output in every fashion imaginable, e.g. aural output, display on WAP devices, or Braille.

The process of displaying an XML instance with XSL involves two steps. In the first step the XML instance to be displayed is converted into another XML instance that conforms to XSL FO. The transformation to the XSL FO format is described in style sheets written in XSLT. In the second step a formatter renders the XSL FO instance on the target medium. XSL as a whole is still in Working Draft status; the latest draft was published on 27 March 2000 [3]. XSLT, however, although a part of XSL, has been a Recommendation since 16 November 1999 [4]. It is tightly integrated with other XML efforts such as XLink and XML Query.

In addition to CSS and XSL, a number of other style sheet languages are suitable for rendering XML, including Formatting Output Specification Instance (FOSI) and proprietary languages such as OmniMark. These, however, have not achieved widespread use.

The XSLT success story

Although XSLT’s primary intent was to transform XML into the XSL FO vocabulary, taken by itself it is a general mechanism to transform XML into other XML or even other end formats. XSLT can therefore be used as a means of XML transformation wherever such a transformation might be useful.

From XML to HTML

XSLT’s first application was the conversion from XML to HTML. Some users originally misunderstood this to be the whole purpose of XSL, a view that is hard to shake loose. There are a number of reasons for this.

Firstly, Microsoft’s Internet Explorer 5 (IE5), the first well-known tool to support something resembling XSLT, supported nothing but XML to HTML conversion. More recent versions of MSXML, Microsoft’s XML parser, conform much more closely to XSLT.

Secondly, this preoccupation helped users to avoid closer contact with XSL FO, whose intimidating spec size corresponds directly with its complexity.

The third reason lies in the much earlier and wider availability of XSLT tools compared to XSL FO tools. The first spec-conformant XSLT processor, which appeared more or less simultaneously with the XSLT Recommendation, was James Clark’s XT [5]. Other recommendable processors include Michael Kay’s SAXON [6] and Xalan from the Apache XML project [7]. Three tools implement various aspects and versions of XSL FO: FOP, donated to the Apache XML project by James Tauber [8], XEP by RenderX [9], and Sebastian Rahtz’s PassiveTeX [10]. This is also reflected in the fact that XSLT has been a Recommendation for more than six months now, while XSL is still only a Working Draft.

XPath and its relation to XML Linking

XSLT has gained remarkable influence within the realm of XML standards and techniques. As described above, XSLT style sheets are instructions about how XML elements are to be translated. They basically consist of a set of templates that specify how a node in the source instance is to be represented in the translated result. A mechanism for locating parts of XML documents is therefore a natural subtask of designing an XML transformation language.
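The template idea can be illustrated with a rough analogy in Python. This is not XSLT and does not use an XSLT processor; it is only a minimal sketch of rule-per-element dispatch, using the standard library's xml.etree.ElementTree. The sample document, the TEMPLATES table, and the function names are hypothetical and chosen for illustration only. The fallback rule is a simplification and does not reproduce XSLT's built-in template rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical source instance, not taken from the article.
SOURCE = "<article><title>Hello</title><para>First.</para><para>Second.</para></article>"

def default_template(node):
    # Simplified fallback for elements without a matching rule: emit their text.
    return node.text or ""

def apply_templates(node):
    # Transform each child element with the rule registered for its tag.
    return "".join(TEMPLATES.get(child.tag, default_template)(child) for child in node)

# One "template" per element name: how a source node appears in the result.
# Text is inserted without escaping, which is acceptable only for this sample.
TEMPLATES = {
    "article": lambda n: "<html><body>" + apply_templates(n) + "</body></html>",
    "title": lambda n: "<h1>" + (n.text or "") + "</h1>",
    "para": lambda n: "<p>" + (n.text or "") + "</p>",
}

def transform(xml_text):
    # Start at the root element and let the matching rule drive the recursion.
    root = ET.fromstring(xml_text)
    return TEMPLATES.get(root.tag, default_template)(root)

result = transform(SOURCE)
print(result)
```

In a real XSLT style sheet the rules would be declared as templates with match patterns, and the processor, not hand-written code, would decide which rule applies to which node.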

This ‘location language’ is necessary, for example, to specify which parts of the source a certain template applies to, and with which parts to continue transforming. On closer inspection, this ‘location language’ is not only applicable to styling tasks, but may be regarded as a general XML locating mechanism. It especially pertains to XML linking issues as worked on by the W3C’s XLink Working Group (WG) [11]. In accordance with the W3C’s general goal of ensuring interoperability within the World Wide Web, and even more so between the W3C’s WGs, the two groups (the XSL WG and the XLink WG) joined forces to develop the ‘location language’ described above. This language was then called XPath [12], because its main purpose is to locate parts of XML documents and, accordingly, one of its most important grammatical constructs is the ‘location path’.
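To give a feel for what a location path looks like, here is a small sketch in Python. It assumes the standard library's xml.etree.ElementTree, which supports only a limited subset of XPath, so it should be read as an approximation rather than a full XPath evaluation. The sample catalog document and its element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the article.
SOURCE = """
<catalog>
  <book id="b1"><title>XML Basics</title><price>30</price></book>
  <book id="b2"><title>XSLT in Practice</title><price>45</price></book>
  <journal id="j1"><title>Markup Monthly</title></journal>
</catalog>
"""

root = ET.fromstring(SOURCE)  # the context node is the catalog element

# Child steps: title elements that are children of book children.
book_titles = [t.text for t in root.findall("book/title")]

# Descendant step: every title element anywhere below the context node.
all_titles = [t.text for t in root.findall(".//title")]

# Predicate: restrict a step to elements with a given attribute value.
second_book = root.find("book[@id='b2']")

print(book_titles)
print(all_titles)
print(second_book.find("title").text)
```

Each expression only locates nodes. What happens to the located nodes (styling, linking, or querying) is left to the language that embeds the path.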

XSL’s impact on XML querying

Since the introduction of XML, its range of application has been extended from use as a document format to a general format with almost unlimited usability within the realm of data generation, storage and transmission. The range of possible XML instances now includes XML data that does not necessarily need to be displayed.

This broader common understanding of what XML is, and what it can be used for, strongly influenced the development and the perception of XML-related standards. The biggest impact that the concept of XML ‘data’ (as opposed to XML documents) had on the W3C world, however, was the founding of the XML Query WG in 1999 [13].

One point is the storage of XML instances. There is nothing wrong with simply storing XML documents as files on a file server, but this view ignores the advantages of modern database management systems, which can store large amounts of data conveniently and which include such database goodies as recovery and data integrity.

Looking at XML from this data-centric perspective raises questions such as "What is the best way to store large amounts of XML?", "What kinds of retrieval are appropriate?", "What would an XML database look like?"

One of the main directions in extending XML’s usability is the question of querying XML. Of all the XML-related efforts that existed in 1999, the XSL WG seemed the most likely place to discuss query issues. This is especially true of its XPath offspring. In fact, the styling and the querying issues overlap.

A style sheet that retrieves certain elements of an XML instance and displays them in a different order, each enriched with information items collected from the same instance or even from other instances, comes very close to what a database application does. To find out what actually had to be done to establish a standard for XML querying, it was necessary to resolve the following questions:

  1. Doesn’t XPath provide everything a query language needs?
  2. Can’t XSLT (together with XPath) accomplish most of the things that querying XML is about?

In answer to the first question, XPath lacks the ability to change the instance it is applied to, that is, to insert, update, or delete. It is merely a means of obtaining query results based upon an instance (or more than one instance). This alone does not prevent XPath from being a query language. With regard to retrieval capability, however, XPath lacks sorting, grouping and join methods. Basically, XPath selects items, i.e. nodes, from an XML instance, but has no means of specifying how to put them together to form a desired result.
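The gap between selecting nodes and building a result can be sketched as follows. The example again assumes Python's xml.etree.ElementTree and its limited XPath subset, and the sample orders document is hypothetical. The path expression only selects the order elements; sorting and grouping have to be done by the surrounding language.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, not taken from the article.
SOURCE = """
<orders>
  <order customer="c2"><total>45</total></order>
  <order customer="c1"><total>30</total></order>
  <order customer="c2"><total>12</total></order>
</orders>
"""

root = ET.fromstring(SOURCE)

# Selection: this is the part a location path covers.
orders = root.findall("order")

# Sorting: performed by the host language, not by the path expression.
by_total = sorted(orders, key=lambda o: int(o.findtext("total")))
sorted_totals = [int(o.findtext("total")) for o in by_total]

# Grouping: also outside the path expression; sum the totals per customer.
totals_per_customer = {}
for order in orders:
    customer = order.get("customer")
    amount = int(order.findtext("total"))
    totals_per_customer[customer] = totals_per_customer.get(customer, 0) + amount

print(sorted_totals)
print(totals_per_customer)
```

In XSLT the sorting step could be expressed with xsl:sort, which is one example of the query-like behavior that XSLT adds on top of XPath.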

XSLT is no more capable than XPath of updating the instances it is applied to. Looking only at the retrieval aspect, XSLT does provide some of the query-like behavior that XPath lacks, although not always in the way that seems most suitable for queries.

The decision about how much of XSLT/XPath goes into the future XML query language lies with the XML Query WG. At first glance it appears reasonable to use as much as possible from already mature and accepted technologies and just add those things that might be lacking. This would fit well with the W3C’s general policy of keeping its standards in sync.

Things appear quite different from another perspective. There are good reasons to handle some tasks differently in a query context than in a styling context. At the very least, it seems obvious that XPath will persist as the main standard for locating things within instances, so whatever query approach comes out of the WG’s work will probably be based on XPath, or at least use XPath-like means for those things that XPath covers. In any case, the styling and the querying topics are tightly coupled.

Why is XSLT so important?

Looking at the XML-related standards that the W3C is fostering, and at some standard-like achievements that live outside the W3C (e.g. SAX [14]), it is obvious that these efforts differ with respect to their success. A standard (meaning a Recommendation, Working Draft, or other appropriate term) can be considered ‘successful’ if it becomes accepted, gets implemented, and is subsequently used.

As with every other new technology that has popped up so far, XML fares very well in some respects and less well in others. XSLT and XPath can be considered W3C successes. XPath is probably even more successful than XSLT. It became a Recommendation early enough so that people did not give up waiting for it. It was implemented and is widely used.

XSLT (or rather XSL) was not always generally approved of, and no success (or failure) can be wholly explained by general demand or by the quality of what is at stake. There are always many personal, political, and social influences that affect the outcome. Last but not least, the efforts of many individuals, in particular technical editor James Clark, have contributed enormously to the rapid development and adoption of XSLT.

Links

  1. Extensible Markup Language (XML) 1.0
    W3C Recommendation 10-February-1998
    http://www.w3.org/TR/REC-xml
  2. W3C
    Cascading Style Sheets
    http://www.w3.org/Style/CSS/
  3. Extensible Stylesheet Language (XSL), Version 1.0
    W3C Working Draft 27 March 2000
    http://www.w3.org/TR/xsl
  4. XSL Transformations (XSLT), Version 1.0
    W3C Recommendation 16 November 1999
    http://www.w3.org/TR/xslt
  5. XT, Version 19991105
    Copyright (c) 1998, 1999 James Clark
    http://www.jclark.com/xml/xt.html
  6. About SAXON, version 5.4.1
    Michael H. Kay, 7 August 2000
    http://users.iclway.co.uk/mhkay/saxon/
  7. The APACHE XML Project
    Xalan-Java version 1.2.D01
    http://xml.apache.org/xalan/index.html
  8. The APACHE XML Project
    FOP
    http://xml.apache.org/fop/
  9. RenderX
    products: XEP Rendering Engine
    http://www.renderx.com/FO2PDF.html
  10. Oxford University Computing Services
    PassiveTeX, Sebastian Rahtz, March 2000
    http://users.ox.ac.uk/~rahtz/passivetex/
  11. W3C, Architecture Domain
    Extensible Markup Language (XML) Activity:
    XML Linking Working Group
    http://www.w3.org/XML/Activity#linking-wg
  12. XML Path Language (XPath), Version 1.0
    W3C Recommendation 16 November 1999
    http://www.w3.org/TR/xpath
  13. W3C, Architecture Domain
    Extensible Markup Language (XML) Activity:
    XML Query Working Group
    http://www.w3.org/XML/Activity.html#query-wg
  14. Megginson Technologies
    SAX 2.0: The Simple API for XML
    http://www.megginson.com/SAX/