The importance of XSLT
By Juliane Harbarth, Software AG
About the Author: Juliane Harbarth is a technical consultant for
database management systems R&D at Software AG. She is also a member
of the XSL Working Group of the World Wide Web Consortium.
XSLT (XSL Transformations) is a standard way to describe how to
transform the structure of an XML document. As a Recommendation of the
World Wide Web Consortium (W3C), XSLT has established itself as a
widely accepted standard in the XML field. Although its primary intent
was to add style to XML data for presentation on various media, its
application has expanded to other uses. For example, because XSLT can
convert XML data between different proprietary schemas, it has broad
application in business-to-business data exchange. It can be regarded
not merely as one XML-related standard among others, but as being at
the core of what makes XML really useful.
How XSLT began
In 1998 the World Wide Web Consortium (W3C) released its XML
1.0 Recommendation [1]. Amongst other things, XML claimed the
benefit of the separation of content and presentation. This triggered
the quest for a means of adding presentational information to XML
instances.
Cascading Style Sheets
The first approach to XML style sheets was to use the Cascading
Style Sheet Language (CSS) [2], which was originally used to style
HTML Web sites. CSS expresses styling information in a very
straightforward way. It allows presentational properties to be
specified per tag as well as through classes. This is easily extended
to work with XML instead of HTML.
Extensible Stylesheet Language
Another effort started by the W3C's styling initiative is XSL,
which is a much more ambitious approach than CSS. XSL comes in two
parts: XSLT, a means of transforming XML, and XSL Formatting Objects
(FO), an XML styling vocabulary intended to contain all the
information necessary for output in every fashion imaginable, e.g.
aural output, display on WAP devices, or Braille.
The process of displaying an XML instance with XSL involves two steps.
In the first step the XML instance to be displayed is converted into
another XML instance that conforms to XSL FO. The transformation to
the XSL FO format is described in style sheets written in XSLT. In the
second step, a formatter renders the XSL FO instance on the target
medium. XSL as a whole is still in Working Draft status; the latest
draft was published on 27 March 2000 [3]. XSLT, however, although it
is a part of XSL, has been a Recommendation since 16
November 1999 [4]. It is tightly integrated with other XML efforts
such as XLink and XML Query.
In addition to CSS and XSL, a number of other style sheet languages
are suitable for rendering XML, including the Formatting Output
Specification Instance (FOSI) and proprietary languages such as
OmniMark's. These, however, have not achieved widespread use.
The XSLT success story
Although XSLT's primary intent was to transform XML into the XSL
FO vocabulary, taken by itself it is a general mechanism to transform
XML into other XML or even other end formats. XSLT can therefore be
used as a means of XML transformation wherever such a transformation
might be useful.
From XML to HTML
XSLT's first application was the conversion from XML to HTML.
Some users originally misunderstood this to be the whole purpose of
XSL, a view that has proved hard to shake. There are a number of
reasons for this.
Firstly, Microsoft's Internet Explorer 5 (IE5), the first
well-known tool to support something resembling XSLT, supported
nothing but XML to HTML conversion. More recent versions of MSXML
conform much more closely to XSLT.
Secondly, this preoccupation helped users to avoid closer contact
with XSL FO, whose intimidating spec size corresponds directly with
its complexity.
The third reason is the much earlier and wider
availability of XSLT tools compared to XSL FO tools. The first
spec-conformant XSLT processor, which appeared more or less
simultaneously with the XSLT Recommendation, was James
Clark's XT [5]. Other recommendable processors include Michael
Kay's SAXON [6] and Xalan from the Apache XML project [7]. Three
tools implement various aspects and versions of XSL FO: FOP,
donated to the Apache XML project by James Tauber
[8]; XEP by RenderX [9]; and Sebastian
Rahtz's PassiveTeX [10]. All this is mirrored in the fact
that XSLT has been a Recommendation for more than six months now,
while XSL is still only a Working Draft.
XPath and its relation to XML Linking
XSLT has come to have remarkable influence and usefulness in
the realm of XML standards and techniques. As described above, XSLT
style sheets are instructions about how XML elements are to be
translated. They basically consist of a set of templates that specify
how a node in the source instance is to be represented in the
translated result. A mechanism for locating parts of XML documents is
naturally a subtask of designing an XML transformation language.
This locating language is necessary, for example, to specify
the parts of the source to which a certain template applies, and the
parts with which to continue transforming. On closer inspection, this
location language is not only applicable to styling tasks, but may be
regarded as a general XML locating mechanism. It is especially
relevant to XML linking issues as worked on by the W3C's XLink
Working Group (WG) [11]. In accordance with the W3C's general
goal of ensuring interoperability within the World Wide Web, and even
more so between the W3C's WGs, the two groups (the XSL WG and the
XLink WG) joined forces to develop the location language
described above. This language was named XPath
[12] because its main purpose is to locate parts of XML
documents and, accordingly, one of its most important
grammatical constructs is the location path.
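The template mechanism described above can be sketched in a few lines
of Python. This is a toy model of template-driven transformation, not
an XSLT processor: the element names (order, item), the function names,
and the use of plain tag names in place of XPath match patterns are all
assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source instance; the element names are invented.
SOURCE = "<order><item>Tea</item><item>Coffee</item></order>"


def order_template(node, apply_templates):
    # Represent <order> as <ul>, then continue transforming
    # with the selected child <item> nodes.
    ul = ET.Element("ul")
    ul.extend(apply_templates(node.findall("item")))
    return ul


def item_template(node, apply_templates):
    # Represent each <item> as an <li> carrying the same text.
    li = ET.Element("li")
    li.text = node.text
    return li


# Each template is keyed by the element name it applies to, a simple
# stand-in for the match pattern of a real XSLT template.
TEMPLATES = {"order": order_template, "item": item_template}


def apply_templates(nodes):
    # Look up the template for each selected node and collect results.
    return [TEMPLATES[n.tag](n, apply_templates) for n in nodes]


def transform(xml_text):
    root = ET.fromstring(xml_text)
    (result,) = apply_templates([root])
    return ET.tostring(result, encoding="unicode")


print(transform(SOURCE))
```

The two questions a locating language has to answer show up directly:
the dictionary key decides to which nodes a template applies, and the
findall call decides with which nodes the transformation continues.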
XSL's impact on XML querying
Since the introduction of XML, its range of application has been
extended from use as a document format to a general format with almost
unlimited usability within the realm of data generation, storage and
transmission. The range of possible XML instances now includes XML
data that does not necessarily need to be displayed.
This broadened understanding of what XML is, and what it can
be used for, strongly influenced the development and the perception of
XML-related standards. The biggest impact that the concept of
XML data (as opposed to XML documents) had on the W3C
world, however, was the founding of the XML Query WG in 1999
[13].
One issue is the storage of XML instances. There is nothing wrong
with simply storing XML documents as files on a file server, but this
view ignores the advantages of modern database management systems,
which can store large amounts of data conveniently and which
include database features such as recovery and data integrity.
Looking at XML from this data-centric perspective raises questions
such as "What is the best way to store large amounts of
XML?", "What kinds of retrieval are appropriate?",
"What would an XML database look like?"
One of the main directions for extending XML's usability is
querying XML. Of all the XML-related efforts that existed in
1999, the XSL WG seemed the most likely place to
discuss query issues. This is especially true of its XPath
offspring. In fact, the styling and the querying issues overlap.
A stylesheet that retrieves certain elements of an XML instance and
displays them in a re-ordered fashion, each including several
information items collected from the same instance, or even from
other instances, comes very close to applications that may also appear
within a database application environment. To find out what really had
to be done in order to establish a standard for XML querying, it was
necessary to resolve the following questions:
- Doesn't XPath provide everything necessary for a query
language?
- Can't XSLT (together with XPath) accomplish most of the things
that querying XML is about?
In answer to the first question, XPath lacks the ability to change
the instance it is applied to, that is, to insert, update, or delete.
It is merely a means of obtaining query results based upon an instance
(or more than one instance). This alone, however, does not prevent
XPath from being a query language. With regard to retrieval capability,
XPath lacks sorting, grouping and join methods. Basically, XPath selects
items, i.e. nodes, from an XML instance, but has no means of
specifying how to assemble them into a desired result.
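The split between selecting nodes and assembling a result can be
illustrated with a short Python sketch. It uses the limited XPath
subset offered by the standard xml.etree.ElementTree module, which is
not a full XPath implementation; the catalog data and all names are
invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for illustration.
CATALOG = """
<catalog>
  <book year="1999"><title>XSLT</title></book>
  <book year="1998"><title>XML</title></book>
  <book year="1999"><title>XPath</title></book>
</catalog>
"""

root = ET.fromstring(CATALOG)

# Selection: a location path with a predicate picks out nodes,
# returned in document order.
titles_1999 = [t.text for t in root.findall("./book[@year='1999']/title")]

# Sorting is not part of the path; it is done by the host code.
titles_sorted = sorted(t.text for t in root.findall("./book/title"))

# Grouping likewise has to be built around the selected nodes.
titles_by_year = {}
for book in root.findall("./book"):
    titles_by_year.setdefault(book.get("year"), []).append(
        book.findtext("title")
    )

print(titles_1999)
print(titles_sorted)
print(titles_by_year)
```

In each case the path expression only delivers nodes; ordering and
grouping happen in the surrounding program, which is the role that a
fuller query language would have to take over.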
XSLT is no more capable than XPath of updating the instances it is
applied to. Looking only at the retrieval aspect, XSLT does provide
some of the query-like behavior that XPath is lacking, though not
always in a way that seems best suited to queries.
The decision of how much of XSLT/XPath goes into the future XML query
language lies with the XML Query WG. At first glance it appears
reasonable to use as much as possible from already mature and accepted
technologies and just add those things that might be lacking. This
would fit well with the W3C's general policy of keeping its standards
in sync.
Things appear quite different when viewing the query issue from a
different perspective. There are good reasons for handling things
differently in a query context than in a styling context. At the very
least, it seems obvious that XPath
will persist as the main standard for locating things within
instances, so whatever query approach might come out of the WG's
work will probably be based on XPath, or at least use XPath-like
means for those things that XPath covers. In any case, the styling and
the querying topics are tightly coupled.
Why is XSLT so important?
When looking at the realm of XML-related standards that the W3C is
fostering, and at some other standard-like achievements that live
outside the W3C (e.g. SAX [14]), it is obvious that these
efforts differ with respect to their success. A standard (meaning a
Recommendation, Working Draft, or other appropriate term) can be
considered successful if it becomes accepted, gets implemented,
and is subsequently used.
As with every other new technology that has popped up so far, XML
fares very well in some respects and less well in others. XSLT and
XPath can be considered W3C successes. XPath is probably even more
successful than XSLT. It became a Recommendation early enough so that
people did not give up waiting for it. It was implemented and is
widely used.
XSLT (or rather XSL) was not always generally approved of, and no
success (or failure) can be wholly explained by general demand or by
the quality of what is at stake. There are always many
personal, political and social influences that affect the outcome.
Finally, it should be mentioned that the efforts of many
individuals, not least technical editor James Clark, have
enormously benefited the rapid development and adoption of XSLT.
Links
[1] Extensible Markup Language (XML) 1.0
W3C Recommendation 10 February 1998
http://www.w3.org/TR/REC-xml
[2] W3C
Cascading Style Sheets
http://www.w3.org/Style/CSS/
[3] Extensible Stylesheet Language (XSL), Version 1.0
W3C Working Draft 27 March 2000
http://www.w3.org/TR/xsl
[4] XSL Transformations (XSLT), Version 1.0
W3C Recommendation 16 November 1999
http://www.w3.org/TR/xslt
[5] XT, Version 19991105
Copyright (c) 1998, 1999 James Clark
http://www.jclark.com/xml/xt.html
[6] About SAXON, version 5.4.1
Michael H. Kay, 7 August 2000
http://users.iclway.co.uk/mhkay/saxon/
[7] The Apache XML Project
Xalan-Java version 1.2.D01
http://xml.apache.org/xalan/index.html
[8] The Apache XML Project
FOP
http://xml.apache.org/fop/
[9] RenderX
Products: XEP Rendering Engine
http://www.renderx.com/FO2PDF.html
[10] Oxford University Computing Services
PassiveTeX, Sebastian Rahtz, March 2000
http://users.ox.ac.uk/~rahtz/passivetex/
[11] W3C, Architecture Domain
Extensible Markup Language (XML) Activity:
XML Linking Working Group
http://www.w3.org/XML/Activity#linking-wg
[12] XML Path Language (XPath), Version 1.0
W3C Recommendation 16 November 1999
http://www.w3.org/TR/xpath
[13] W3C, Architecture Domain
Extensible Markup Language (XML) Activity:
XML Query Working Group
http://www.w3.org/XML/Activity.html#query-wg
[14] Megginson Technologies
SAX 2.0: The Simple API for XML
http://www.megginson.com/SAX/