Cover Pages Archive

SGML and XML News

By: Robin Cover

    [November 30, 2000]   
    XML Key Management Specification (XKMS).    

    VeriSign, Microsoft, and webMethods have "created the open XML Key Management Specification (XKMS) with the goal of efficient integration of digital signatures and encryption -- to simplify the integration of standard methods for securing Internet transactions (PKI and digital certificates) with XML applications." The version 1.0 document XML Key Management Specification (XKMS) "specifies protocols for distributing and registering public keys, suitable for use in conjunction with the proposed standard for XML Signature developed by the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF) and an anticipated companion standard for XML encryption. The XML Key Management Specification (XKMS) comprises two parts -- the XML Key Information Service Specification (X-KISS) and the XML Key Registration Service Specification (X-KRSS). The X-KISS specification defines a protocol for a Trust service that resolves public key information contained in XML-SIG elements. The X-KISS protocol allows a client of such a service to delegate part or all of the tasks required to process <ds:KeyInfo> elements. A key objective of the protocol design is to minimize the complexity of application implementations by allowing them to become clients and thereby shielded from the complexity and syntax of the underlying PKI used to establish trust relationships. These may be based upon a different specification such as X.509/PKIX, SPKI or PGP. The X-KRSS specification defines a protocol for a web service that accepts registration of public key information. Once registered, the public key may be used in conjunction with other web services including X-KISS. Both protocols are defined in terms of structures expressed in the XML Schema Language, protocols employing the Simple Object Application Protocol (SOAP) v1.1 and relationships among messages defined by the Web services Definition Language v1.0 (WSDL). Other compatible expressions are possible." 
The public announcement for XKMS reads, in part: "VeriSign Inc., Microsoft Corp. and webMethods Inc. have introduced a breakthrough XML-based framework, the XML key management specification (XKMS), to enable a broad range of software developers to seamlessly integrate digital signatures and data encryption into e-commerce applications. To accelerate the development of applications incorporating these advanced technologies, the XKMS specification -- jointly designed and prototyped by VeriSign, Microsoft and webMethods with industry support from other technology leaders -- was made publicly available today and will be submitted to the appropriate Web standards bodies for consideration as an open Internet standard. In addition, XKMS will be built into the Microsoft.NET architecture to ensure broad and rapid adoption of this framework in both B2B and B2C environments. The new XKMS specification revolutionizes the development of trusted B2B and B2C applications by introducing an open framework that enables virtually any developer to easily access applications from any public key infrastructure products and services. With the XKMS specification, developers are able to integrate advanced technologies such as digital signature handling and encryption into their web-based applications. The XKMS specification promotes the interoperability of advanced technologies because it is based on XML, a rapidly growing standard for application development. Currently, developers choosing to enable applications to handle digital keys for authentication and digital signatures are often required to purchase and integrate specialized toolkits from a Public Key Infrastructure (PKI) software vendor which only interoperate with that vendor's PKI offerings. Functions such as digital certificate processing, revocation status checking, and certification path location and validation are all built into the application via the toolkit. 
With the new XKMS specification, those functions are no longer built into the application but instead reside in servers that can be accessed via easily programmed XML transactions. The XKMS architecture, along with the recently drafted XML digital signature standards and the emerging XML encryption standard, provides a complete framework for ensuring broad interoperability across applications developed by enterprises, B2B exchanges and other Internet communities of interest. XKMS is also compatible with the emerging standards for Web Services Description Language (WSDL) and Simple Object Access Protocol (SOAP)..." For other description and references, see "XML Key Management Specification (XKMS)."
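
As a rough illustration of the delegation idea described above, the following Python sketch pulls the <ds:KeyInfo> element out of a signed document so that it could be handed to a trust service instead of being processed locally. The XML Signature namespace URI is the one defined by the W3C/IETF work; the sample document, the key name, and the "HypotheticalLocateRequest" wrapper element are assumptions made for this example and are not taken from the XKMS specification.

```python
import xml.etree.ElementTree as ET

DS_NS = "http://www.w3.org/2000/09/xmldsig#"  # XML Signature namespace

# Invented sample document; only the ds: element names follow XML Signature.
SIGNED_DOC = """\
<Order xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
  <Item>widget</Item>
  <ds:Signature>
    <ds:SignatureValue>AAAA</ds:SignatureValue>
    <ds:KeyInfo>
      <ds:KeyName>alice@example.com</ds:KeyName>
    </ds:KeyInfo>
  </ds:Signature>
</Order>
"""


def extract_key_info(xml_text):
    """Return the first ds:KeyInfo element in the document, or None."""
    root = ET.fromstring(xml_text)
    return root.find(f".//{{{DS_NS}}}KeyInfo")


def key_names(key_info):
    """List the text of every ds:KeyName child of a ds:KeyInfo element."""
    return [el.text for el in key_info.findall(f"{{{DS_NS}}}KeyName")]


def build_locate_payload(key_info):
    """Wrap ds:KeyInfo in a HYPOTHETICAL request element.

    The wrapper name is a placeholder, not an XKMS element; a real client
    would use the message structures defined in the specification.
    """
    request = ET.Element("HypotheticalLocateRequest")
    request.append(key_info)
    return ET.tostring(request, encoding="unicode")


key_info = extract_key_info(SIGNED_DOC)
print(key_names(key_info))
print(build_locate_payload(key_info))
```

The point of the sketch is the division of labor: the application only needs to find and forward the key information, while certificate parsing, path validation, and revocation checking would remain on the server side.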


    [November 30, 2000]   
    VeriSign's Extensible Provisioning Protocol (EPP).    

    Extensible Provisioning Protocol (EPP) is one of four principal components in the VeriSign XML Trust Services suite recently announced in connection with the XML Key Management Specification (XKMS). Overview: "To enable Internet registrars that sell online identity services to access central domain name registry data more efficiently, VeriSign has developed the EPP (Extensible Provisioning Protocol) to support an XML-based domain name management utility. EPP enables VeriSign Global Registry Services' accredited registrar partners to sell domain names, telephone numbers, and other identity assets via EPP, which permits greater information sharing and flexibility as new identification technologies gain acceptance... The Extensible Provisioning Protocol (EPP) is a connection-oriented, application layer client-server protocol for the provisioning and management of objects stored in a shared central repository. Specified in the schema notation of the Extensible Markup Language (XML), the protocol defines generic object management operations and an extensible framework that maps protocol operations to objects. A complete set of protocol specifications was recently published with the Internet Engineering Task Force (IETF) as Internet-Draft documents. XML provides a rich set of features that allows communicating peers to create data tags that have semantic meaning in the operating environment shared by the peers. While in general this is a very desirable feature, it introduces an element of instability for protocol designers. Once a protocol has been formally specified, adding new tags to extend the protocol means changes to published specifications. Over time this can lead to a lack of interoperable implementations and specification confusion. EPP takes a different approach. The base protocol itself is very simple, defining a set of object management features that are not explicitly tied to specific objects. 
The base protocol is intended to be stable and unchanging to ease development of interoperable implementations. EPP operations are mapped to objects using XML namespaces that provide 'hooks' to loosely coupled object specifications so that definitions for management of new objects can be done outside the base protocol. For example, the protocol can be extended to support provisioning of purchase orders by defining a new specification that defines how purchase order objects are managed. EPP provides features for session management, object query, and object management. Sessions are established between a client and a server, and once a session is established the client and server exchange commands and responses. Security services are available at both the application and transport layers. The EPP protocol suite currently contains a base protocol specification and mappings for three different objects: Internet domain names, Internet host names, and 'contact' identifiers associated with humans and organizations'. Specifications for other objects may be developed as needs are identified. EPP is connection oriented, but transport independent. A specification for transport using the Transmission Control Protocol (TCP) exists; specifications for transport using other protocols or applications frameworks may be produced in the future." There are five published components in the EPP Specification: (1) Base Specification, (2) Domain Name Mapping, (3) Host Mapping, (4) Contact Mapping, (5) Transport over TCP. For other references, see "Extensible Provisioning Protocol (EPP)."
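
The namespace "hook" mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names, the namespace URIs (urn:example:...), and the handler functions are hypothetical and are not taken from the EPP Internet-Drafts. What it tries to show is that the base dispatcher knows only generic operations, while object-specific handling is selected by the namespace of the payload element.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, chosen for illustration only.
DOMAIN_NS = "urn:example:epp:domain"
PO_NS = "urn:example:epp:purchase-order"

HANDLERS = {}  # namespace URI -> handler for that object mapping


def register_mapping(namespace, handler):
    """Add an object mapping without touching the base dispatcher."""
    HANDLERS[namespace] = handler


def split_qname(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def dispatch(command_xml):
    """Find the generic operation, then hand its namespaced payload to
    whichever object mapping has registered for that namespace."""
    command = ET.fromstring(command_xml)
    operation = command[0]   # generic operation element, e.g. <check>
    payload = operation[0]   # object-specific element in its own namespace
    namespace, _ = split_qname(payload.tag)
    handler = HANDLERS.get(namespace)
    if handler is None:
        return f"error: no mapping for {namespace}"
    return handler(operation.tag, payload)


def handle_domain(operation, payload):
    names = [el.text for el in payload]
    return f"domain {operation}: {', '.join(names)}"


def handle_purchase_order(operation, payload):
    ids = [el.text for el in payload]
    return f"purchase-order {operation}: {', '.join(ids)}"


register_mapping(DOMAIN_NS, handle_domain)
register_mapping(PO_NS, handle_purchase_order)

DOMAIN_CHECK = (
    f'<command><check><d:check xmlns:d="{DOMAIN_NS}">'
    "<d:name>example.com</d:name></d:check></check></command>"
)
PO_CREATE = (
    f'<command><create><po:order xmlns:po="{PO_NS}">'
    "<po:id>PO-1</po:id></po:order></create></command>"
)

print(dispatch(DOMAIN_CHECK))  # domain check: example.com
print(dispatch(PO_CREATE))     # purchase-order create: PO-1
```

Supporting a new object type here means one more call to register_mapping; the dispatch function itself does not change, which mirrors the stated goal of keeping the base protocol stable.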


    [November 30, 2000]   
    XMLPay Specification.    

    VeriSign, Ariba, and other vendors have created the XMLPay specification "for sending payment requests and responses through financial networks; [the specification is designed] to help Internet merchants process a broad range of Web-based payment types, including credit/debit card, purchase card, and Automated Clearinghouse, or ACH payments, for B2B and B2C e-commerce. The XMLPay Specification consists of three parts. (1) 'XMLPay: Core' is the heart of XMLPay. It defines the basic XML datatypes needed to unify B2C and B2B payment processing applications. (2) 'XMLPay: Registration' captures automation of payment-related enrollment functions, such as merchant registration and configuration. (3) 'XMLPay: Reports' specifies mechanisms for automating merchant transaction reporting functions in the payments back office. The first of these specifications, XMLPay Core, is available now [2000-11-30]. Teams working on XMLPay are planning to extend the functionality to registration and reporting. The driving goal is to provide a public specification for Web payment interoperability, from merchant service sign-up, to payment execution, to reporting functions after payments have taken place." From the text of the specification: "This document, the XMLPay 1.0 Core Specification, defines an XML syntax for payment transaction requests, responses, and receipts in a payment processing network. The typical user of XMLPay is an Internet merchant or merchant aggregator who wants to dispatch credit card, corporate purchase card, Automated Clearing House (ACH), or other payment requests to a financial processing network. Using the data type definitions specified by XMLPay, a user creates a client payment request and dispatches it -- using a mechanism left unspecified by XMLPay -- to an associated XMLPay-compliant server component. Responses are also formatted in XML and convey the results of the payment requests to the client. XMLPay includes support for digitally-signed XML objects. 
Digital signatures are used both for the purpose of authenticating requests and responses and as a foundation for a higher-level digital receipt architecture based on an X.509 Public Key Infrastructure. XMLPay uses the digital signature format being specified by the joint IETF/W3C XML Digital Signature Working Group." Appendix A, "XMLPay Schemas," provides standard W3C schemas for XMLPay and XMLPay Types; Appendix B, "XMLPay DTD," presents the Document Type Definition for XMLPay... XMLPay supports payment processing using the following payment instruments: (1) Retail credit and debit cards; (2) Corporate purchase cards: Levels 1, 2 and 3; (3) Internet checks; (4) ACH. Typical XMLPay operations include: (1) Funds authorization and capture; (2) Sales and repeat sales; (3) Voiding of transactions. XMLPay is intended for use in both Business-to-Consumer (B2C) and Business-to-Business (B2B) payment processing applications. In a B2C transaction, the Buyer presents a payment instrument (e.g., credit card number) to a Seller in order to move money from the Buyer to the Seller (or vice-versa in the case of a credit or refund). Use of XMLPay comes into play when the Seller needs to forward the Buyer's payment information on to a Payment Processor. The Seller formats a XMLPayRequest and submits it either directly to an XMLPay-compliant payment processor or, as pictured, indirectly via a XMLPay-compliant Payment Gateway. Responses have type XMLPayResponse. The Buyer-to-Seller and Payment Gateway-to-Payment Processor channels are typically left unaffected by use of XMLPay. For example, XMLPay is typically not used in direct communications between the buyer and the seller. Instead, conventional HTML form submission or other Internet communication methods are typically used. 
Similarly, because Payment Processors often differ considerably in the formats they specify for payment requests, it is often desired to localize XMLPay server logic at the Payment Gateway, leaving the legacy connections between gateways and processors unchanged. When used in support of B2B transactions, the Seller does not typically initiate XMLPay requests. Instead, an aggregator or trading exchange uses XMLPay to communicate business-focused purchasing information (such as level 3 corporate purchase card data) to a payment gateway. In this way, the trading exchange links payment execution to other XML-based communications between Buyers and Sellers such as Advance Shipping Notice delivery, Purchase Order communication, or other B2B communication functions..." For references, see "XMLPay Specification."


    [November 29, 2000]   
    Clinical Data Interchange Standards Consortium Publishes XML Specification for Drug Development and Regulatory Review Processes.    

    A recent announcement from the Clinical Data Interchange Standards Consortium (CDISC) describes the completed development of FDA safety domain metadata models and an XML DTD for clinical data interchange. The XML DTD and associated documentation from the CDISC Operational Data Modeling Group are available for download. The announcement says, in part: "The Clinical Data Interchange Standards Consortium has achieved two significant milestones towards its goal of standard data models to streamline drug development and regulatory review processes. CDISC participants have completed metadata models for the 12 safety domains listed in the FDA Guidance regarding Electronic Submissions and have produced a revised XML-based data model to support data acquisition and archive. The Submissions Data Standards team has been working since early 1999 to define a metadata model that is designed to: (1) provide regulatory submission reviewers with clear descriptions of the usage, structure, contents and attributes of all submitted datasets and variables; (2) allow reviewers to replicate analyses, tables, graphs and listings with minimal or no transformations; (3) enable reviewers to easily view and subset the data used to generate any analysis, table, graph or listing without complex programming. This team, under the leadership of Wayne Kubick of Lincoln Technologies, and Dave Christiansen of Genentech, presented their metadata models to a group of representatives at the FDA on October 10 and discussed future cooperative efforts with Agency reviewers. The CDISC Operational Data Modeling (ODM) Working Group released their Version 1.0 model for data acquisition, interchange and archive. 
A small, interdisciplinary team was formed in September of 1999 to examine two different XML-based data interchange models (which had been separately put forward by PHT/Lincoln Technologies and by Phase Forward) -- specifically to assess the feasibility of developing an integrated, CDISC standard data and metadata model to support data acquisition. The resulting CDISC model is based on the Extensible Markup Language (XML), which is gaining wide acceptance as a general data interchange framework, and has been determined to be an effective approach to clinical data interchange. The goal of the CDISC XML Document Type Definition (DTD) Version 1.0 is to make available a first release of the definition of this CDISC model, in order to support sponsors, vendors and CROs in the design of systems and processes around a standard interchange format. 'The release of the CDISC Version 1.0 DTD provides the industry with a foundation of standards that will support unprecedented improvements in the quality and efficiency of future data interchange,' said Ken Harter, senior systems analyst, Amgen Inc. Both CDISC models can be reviewed at http://www.cdisc.org/publications.html. Comments are requested by January 31, 2001 and should be posted using the CDISC Web site Discussions option. CDISC is a non-profit organization with a mission to lead the development of standard, vendor-neutral, platform-independent data models that improve process efficiency while supporting the scientific nature of clinical research in the biopharmaceutical and healthcare industries." For additional description and references, see "Clinical Data Interchange Standards Consortium."


    [November 29, 2000]   
    Submissions to the OMG Gene Expression RFP.
        

    Several submissions have now been published in response to the Object Management Group's Gene Expression RFP, originally issued in March 2000 (LSR RFP-7/lifesci/00-03-09). The RFP overview: "Life sciences research has experienced rapid growth in the number of gene expression analysis techniques and is faced with explosive growth in the amount of data produced by these experiments. The creation and adoption of standardized programmatic interfaces is a crucial step in support of automated data exchange and interoperability among different gene expression data systems. This RFP solicits proposals which define interfaces and services in support of array based gene expression data collection, management, retrieval, and analysis." The RFP also requests definition of one or more XMI compliant Document Type Definitions (DTDs) "intended for use as self-describing data structures for encapsulation of hybridization, expression, and cluster data." In response to this RFP, relevant documents have been submitted by the European Bioinformatics Institute, Rosetta Inpharmatics, and NetGenics. (1) The EBI Initial Submission regarding the Gene Expression RFP proposes "a framework for describing information about a DNA-array experiment and a data format -- Microarray Markup Language (MAML) -- for communicating this information... MAML is based on the Extensible Markup Language XML. MAML is independent of the particular experimental platform and provides a framework for describing experiments done on all types of DNA-arrays, including spotted and synthesized arrays, as well as oligo-nucleotide and cDNA arrays, and is independent of the particular image analysis and data normalization methods. 
MAML does not impose any particular image analysis or data normalization method, but instead provides format to represent microarray data in a flexible way, which allows to represent data obtained from not only any existing microarray platforms, but also many of the possible future variants, including protein arrays. The format allows representation of raw and processed microarray data. The format is compatible with the definition of the 'minimum information about a microarray experiment' (MIAME) proposed by the MGED group. (2) On behalf of the GEML Community, Rosetta Inpharmatics has submitted to the Object Management Group (OMG) a proposed DTD based on the new version of Gene Expression Markup Language - GEML 2.0. Rosetta Inpharmatics Initial Submission regarding the Gene Expression RFP describes work in connection with the GEML DTD: "Rosetta Inpharmatics and Agilent Technologies have been using the GEML 1.0 format as part of internal pipelines for the past year. Rosetta has been continuously loading XML files on the order of thirteen megabytes into the Rosetta Resolver system, an enterprise expression data analysis product. We recently used internal tools to export the more than one thousand profiles, assigned annotations, and supporting patterns that constituted the data for the article, Functional Discovery via a Compendium of Expression Profiles, that appeared in the July 7, 2000 issue of Cell. The total size of the export, when compressed, was a little over a half of gigabyte of data. That data was then imported by Harvard into their Rosetta Resolver system. We have not, as of yet, implemented the interfaces contained in this proposal but given that the size of the compressed XML files has proven no technical obstacle, we see no technical problems in implementing the interfaces. Rosetta has developed the freeware GEML Conductor tools for visualization of GEML formatted data and for conversion of gene expression data in other formats into GEML." 
See the XML DTD and IDL file. (3) In the NetGenics Submission, the UML model is normative. "The UML, which follows the recently adopted UML Profile for CORBA, permits semantic specifications that go beyond what is expressible in IDL. Given the size of typical data sets, a stream-based externalization approach makes sense. The stream would likely contain XML (e.g., Rosetta Inpharmatics' GEML), a popular means of representing gene expression data..." See the associated XMI file for details. See also: (1) "Gene Expression Markup Language (GEML)"; (2) "OMG Gene Expression RFP"; and (3) "Microarray Markup Language (MAML)."


    [November 29, 2000]   
    Conversion Tool for DAML-O, RDF Schema, and UML/XMI.
        

    A posting from Sergey Melnik and Stefan Decker (Stanford University) announces the availability of an online tool for converting between different data representations. The conversion tool is documented on the Interdataworking web site. This work-in-progress tool features: "(1) support for conversion between DAML-O, RDF Schema, and UML/XMI; (2) translation between quad and built-in reification, representation of order using 'RDF Seq' and 'order-by-reification' mechanisms; (3) support for conversion from Protégé RDF files to DAML-O restrictions; (4) a new XML serializer for RDF (trivial RDF/XML syntax) with support for embedded models and statements." The web site "provides a testbed for the concept of gateways in the 'interdataworking' technology. Interdataworking is a novel software structuring technique that facilitates data exchange between heterogeneous applications. The testbed supports data conversion from one format into another; the source data can be specified using a URL or uploaded as a file from your file system. You can choose a parser for your data. The object model delivered by the parser is sent through a sequence of gateways. The list of gateways can be selected, and the order is important. The result is output using a specified serializer..." Theoretical background may be found in papers written by the developers: (1) "A Layered Approach to Information Modeling and Interoperability on the Web" (Melnik and Decker), and (2) "Representing UML in RDF" (Melnik). See related resources in "XML and 'The Semantic Web'."
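
The parser / gateway / serializer pipeline described above, and the remark that gateway order matters, can be illustrated with a toy sketch. Everything here is an assumption made for the example: the one-statement-per-line input format, the gateway functions, and the predicate names are invented and do not reflect the actual Interdataworking tool or its APIs.

```python
def parse_triples(text):
    """Toy parser: one 'subject predicate object' statement per line."""
    return [tuple(line.split()) for line in text.splitlines() if line.strip()]


def rename_predicate(old, new):
    """Build a gateway that rewrites one predicate name into another."""
    def gateway(model):
        return [(s, new if p == old else p, o) for s, p, o in model]
    return gateway


def drop_predicate(name):
    """Build a gateway that removes every statement using a predicate."""
    def gateway(model):
        return [(s, p, o) for s, p, o in model if p != name]
    return gateway


def serialize(model):
    """Toy serializer: write statements back out one per line."""
    return "\n".join(" ".join(statement) for statement in model)


def convert(source, parser, gateways, serializer):
    """Parse the source, pass the model through each gateway in order,
    then serialize the result."""
    model = parser(source)
    for gateway in gateways:
        model = gateway(model)
    return serializer(model)


SOURCE = "Car subClassOf Vehicle\nCar label car"

rename_then_drop = [rename_predicate("subClassOf", "isA"), drop_predicate("isA")]
drop_then_rename = [drop_predicate("isA"), rename_predicate("subClassOf", "isA")]

print(convert(SOURCE, parse_triples, rename_then_drop, serialize))  # Car label car
print(convert(SOURCE, parse_triples, drop_then_rename, serialize))
```

With the same two gateways, the first ordering discards the renamed statement while the second keeps it, which is the sense in which the order of the gateway list is significant.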


    [November 28, 2000]   
    W3C Releases Jigsaw WebDAV Package.    

    The W3C's Jigsaw development team recently released a downloadable version of the Jigsaw Web server platform with a WebDAV package. Jigsaw is a W3C Open Source Project which provides a sample HTTP 1.1 implementation and a variety of other features on top of an advanced architecture implemented in Java. The new WebDAV implementation is "based on Jigsaw 2.1.2, and has been tested with cadaver, DAVExplorer and WebFolders." WebDAV (Web-based Distributed Authoring and Versioning) is an XML-based protocol which "defines a set of new methods (PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, UNLOCK) and a set of new headers (DAV, Depth, If, Destination, ...); it supplies a set of extensions to the HTTP protocol which allows users to collaboratively edit and manage files on remote web servers." For additional references, see: (1) the WebDAV FAQ document; (2) the article by Tom Bednarz showing how to enable WebDAV (mod_dav) for the Apache server that ships with Mac OSX (beta); (3) "WEBDAV (Extensions for Distributed Authoring and Versioning on the World Wide Web)."
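
To make the "new methods and new headers" concrete, here is a minimal Python sketch of a PROPFIND request using only the standard library. The request body (a propfind element in the "DAV:" namespace asking for allprop) and the Depth header values follow the WebDAV specification; the function names, the timeout, and any host or path passed in are assumptions for this example, and nothing here is specific to Jigsaw.

```python
import http.client


def build_propfind(depth="1"):
    """Return (headers, body) for a PROPFIND asking for all properties."""
    if depth not in ("0", "1", "infinity"):
        raise ValueError("Depth must be '0', '1', or 'infinity'")
    body = (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>'
    )
    headers = {
        "Depth": depth,  # WebDAV header: how far below the resource to look
        "Content-Type": 'text/xml; charset="utf-8"',
    }
    return headers, body


def send_propfind(host, path, depth="1", port=80):
    """Send the PROPFIND to a server supplied by the caller and return
    (status code, response text). Not called in this sketch."""
    headers, body = build_propfind(depth)
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("PROPFIND", path, body=body.encode("utf-8"), headers=headers)
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8", "replace")
    finally:
        conn.close()


headers, body = build_propfind("1")
print(headers["Depth"])
print(body)
```

Against a WebDAV-enabled server, a successful PROPFIND normally comes back as a 207 Multi-Status response whose XML body lists the properties of each resource found at the requested depth.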


    [November 27, 2000]   
    IEEE Workshop on XML-Enabled Wide Area Search in Bioinformatics (XEWA).    

    A two-day IEEE Workshop on XML-Enabled Wide Area Search in Bioinformatics (XEWA) will be held on December 13-14, 2000. The XEWA workshop is sponsored by the IEEE Computer Society, Mass Storage Systems and Technology Technical Committee. Workshop goals will be to: "(1) Enumerate relevant service types for bioinformatics; (2) Prioritize services according to those whose availability would provide the most bang for the buck; (3) Explore alternative representations for representing the schemata (e.g., RDF, XML, XOL), and converge on one or a few preferable options; (4) Produce several service-oriented schemata that provide the 'connective tissue' needed to access existing sites and services using a representation neutral format (e.g., ER / OO / UML diagrams)... The goal of the XEWA workshop is to define a format capable of describing how to interact with a data source. This format should be simple enough to enter the description by hand, flexible enough to link in to existing ontologies, and descriptive enough to be useful to automated tools trying to access the source." Background and rationale: "There are well over 500 public domain data sources of interest to genomics/proteomics researchers. Many of these 'data sources' do more than just provide data, they also provide access to a wide range of services. A good example of this are sequence homology search engines. Given the differences in interfaces, syntax and semantics between sites, there is no practical path for a given researcher or research team to use more than a few. Data warehouses, federated systems, and the like help, but only a little. The number of new sources coming online every year, and the number of changes to existing sources, is simply overwhelming. This is one of the major problems driving bioinformatics today. 
We picture a genomics world in which scientists, search engines, and soft-bots can browse and execute (limited) queries against a wide range of sites, with no significant per-site overhead. Rather than attempting to integrate these sources (thus allowing complex queries against few sites), we advocate providing just enough connective tissue to allow semi-intelligent agents or search engines to execute simplified queries against hundreds of sites. The connective tissue can take the form of a collection of loose, service-oriented "schemata" that provide such systems with the information needed to work their way through the interface at each site, to get to the underlying services. A schema might include structured metadata with domain-specific information, a thesaurus, service descriptions, and typical web interfaces." For additional information, see the call for papers and the workshop program. In addition to the "500+ public domain data sources", there are well over a dozen XML DTDs and schemas for bioinformatic and genome mapping disciplines. See for example: (1) Gene Expression Markup Language (GEML); (2) CellML; (3) Genome Annotation Markup Elements (GAME); (4) XML for Multiple Sequence Alignments (MSAML); (5) Systems Biology Markup Language (SBML); (6) Bioinformatic Sequence Markup Language (BSML); (7) BIOpolymer Markup Language (BIOML); (8) "The Clone Annotation DTD"; (9) MAML DTD (microarray format markup language, being developed by a community of developers in the Array XML Working Group -- including Berkeley, NCBI, EBI, NCGR, Stanford -- as part of the MGED initiative).


    [November 27, 2000]   
    OASIS XML-Based Security Services Technical Committee to Define Security Framework.    

    An OASIS Technical Committee for 'XML-Based Security Services' is being formed with the goal of defining a "framework for sharing security information and security services on the Internet through XML documents." The initial members are from Sun Microsystems, JamCracker, and Netegrity. Projected deliverables include "a set of XML Schemas and an XML-based request/response protocol for authentication and authorization services. A draft of the Committee Specification (Version 0.8) will be based on the Security Services Markup Language (S2ML) co-authored by Netegrity, Inc. and its partners. The Committee Specification Version 0.8 will be ready by December 15, 2000. The final Committee Specification (Version 1.0) is scheduled for the second quarter 2001. The XML-Based Security Services TC intends to submit the Committee Specification as an OASIS standard after sufficient implementation experience has been gathered..." Subscription to the associated OASIS mailing list is open to anyone: send subscribe as the body of an email message to security-services-request@lists.oasis-open.org. For additional description and references, see (1) "Security Services Markup Language (S2ML)" and (2) the text of the announcement.


    [November 25, 2000]   
    DAML-ONT Specification and DAML-ONT Theoretic Semantics Model.
        

    The "DARPA Agent Mark Up Language (DAML)" is part of a new effort to "help bring the 'semantic web' into being, focusing on the eventual creation of a web logic language. DAML is being designed as an XML-based semantic language that ties the information on a page to machine-readable semantics (ontology). DAML represents joint work between DoD, industry and academia in both the US and the European Community and we hope it will lead to the eventual web standard in this area." The W3C mailing list 'www-rdf-logic@w3.org' now hosts a very active discussion on the developing DAML Ontology Language Specification, released in October 2000. Several new resources are available from the project web sites. The DAML Ontology Library provides a summary of submitted ontologies, sortable by URI, Submission Date, Keyword, Open Directory Category, Class, Property, Funding Source, and Submitting Organization. A technical document by Richard Fikes and Deborah L. McGuinness "A Model Theoretic Semantics for DAML-ONT" outlines a "model-theoretic semantics for the DAML-ONT language by providing a set of first-order logic axioms that can be assumed to hold in any logical theory that is considered to be logically equivalent translation of a DAML-ONT ontology. The intent is to provide a precise, succinct, and formal description of the relations and constants in DAML-ONT (e.g., complementOf, intersectionOf, Nothing). The axioms provide that description by placing a set of restrictions on the possible interpretations of those relations and constants. The axioms are written in ANSI Knowledge Interchange Format (KIF), which is a proposed ANSI standard. The document is organized as an augmentation of the DAML-ONT specification. Each set of axioms and their associated comments have been added to the specification document immediately following the portion of the specification for which they provide semantics. 
For example, the axioms providing semantics for the property complementOf immediately follow the XML property element that defines complementOf. We have maintained the ordering of the definitions from the original DAML-ONT specification, although that ordering is not optimal for understanding the axioms. In particular, the following terms are used in axioms before they are defined in the document: Class, Property, domain, range, type, List." An "Annotated DAML Ontology Markup - Walkthrough" supplies an example DAML Ontology; the example ontology demonstrates each of the features in DAML-ONT, an initial specification for DAML Ontologies. Other recently published resources include documents comparing DAML (or DAML-ONT) to: (1) "Simple HTML Ontology Extensions (SHOE)" and (2) "Ontology Interchange Language (OIL)." For additional information, see (1) the archives of the W3C discussion list for DAML-ONT ('www-rdf-logic@w3.org'); (2) the DAML web site; (3) "DARPA Agent Mark Up Language (DAML)."


    [November 25, 2000]   
    RELAX Core Published as ISO/IEC DIS 22250-1 with Technical Report in English.    

    Murata Makoto recently announced that RELAX Core has been released as an ISO document: ISO/IEC DIS 22250-1. Text and office systems -- Regular Language Description for XML (RELAX) -- Part 1: RELAX Core. Document Type: DIS (Fast Track). Voting on the DIS will end on 2001-05-02. An English translation of the RELAX Core specification (JIS TR) is now available in PDF and .DOC formats. A copy of the DIS is available from ISO for standard ISO charges. The original TR specification (JIS TR X 0029:2000, Regular Language Description for XML (RELAX): RELAX Core) available in English is a 36-page technical report which "specifies mechanisms for formally specifying the syntax of XML-based languages. For example, the syntax of XHTML 1.0 can be specified in RELAX. Compared with DTDs, RELAX provides the following advantages: (1) Specification in RELAX uses XML instance (i.e., document) syntax, (2) RELAX provides rich datatypes, and (3) RELAX is namespace-aware. The RELAX specification consists of two parts, RELAX Core and RELAX Namespace. This Technical Report specifies RELAX Core, which may be used to describe markup languages containing a single XML namespace. Part 2 of this Technical Report specifies RELAX Namespace, which may be used to describe markup languages containing more than a single XML namespace, consisting of more than one RELAX Core document. Given a sequence of elements, a software module called the RELAX Core processor compares it against a specification in RELAX Core and reports the result. The RELAX Core processor can be directly invoked by the user, and can also be invoked by another software module called the RELAX Namespace processor. This Technical Report also specifies a subset of RELAX Core, which is restricted to DTD features plus datatypes. This subset is very easy to implement, and with the exception of datatype information, conversion between this subset and XML DTDs results in no information loss. 
RELAX Core uses the built-in datatypes of XML Schema Part 2. Datatypes can be used as conditions on attributes or used as hedge models. The TR also defines some datatypes specific to RELAX." Annex A supplies an XML DTD for RELAX Core. Annex B gives a RELAX Module for RELAX Core. For related XML schema research, see "XML Schemas."


    [November 24, 2000]   
    W3C's Amaya 4.1 Browser/Editor Supports Advanced Features.    

    W3C has announced the release of Amaya version 4.1, supporting HTML 4.0, XHTML 1.0, HTTP 1.1, MathML 2.0, and many CSS 2 features; it also provides RDF and XPointer/Xlink support in connection with its collaborative annotation system. Source code and binaries are available for download; see also the CVS database. Description: "Amaya is W3C's own versatile editor/browser. With the extremely fast moving nature of Web technology, Amaya plays a central role at the Consortium. Easily extended to integrate new ideas into its design, Amaya provides developers with many specialized features including multiple views, where the internal structural model of the document can be displayed alongside the browser's view of how it should be presented on the screen. Amaya has a counterpart called Jigsaw which plays a similar role on the server side. Amaya is a complete Web browsing and authoring environment and comes equipped with a WYSIWYG style of interface, similar to that of the more popular commercial browsers. Amaya maintains a consistent internal document model adhering to the Document Type Definition (DTD), meaning that it handles the relationships between various document components: paragraphs, headings, lists and so on, as laid down in the relevant W3C Recommendation." Amaya offers advanced transport protocols support (e.g., content negotiation and 'keep alive' connections per libwww and HTTP/1.1), CSS stylesheet editing/publishing, WYSIWYG interface editing and rendering of mathematical expressions (MathML), and advanced graphics support (e.g., PNG, Scalable Vector Graphics). Amaya 4.X also "includes a collaborative annotation application based on Resource Description Framework (RDF), XLink, and XPointer. From the technical point of view, annotations are usually seen as metadata, as they give additional information about an existing piece of data. In this project, we use a special RDF annotation schema for describing annotations. 
Annotations can be stored locally or in one or more annotation servers. When a document is browsed, Amaya queries each of these servers, requesting the annotations related to that document. Amaya uses XPointer to describe where an annotation should be attached to a document. With this technique, it is possible to annotate any Web document independently, without needing to edit that document. Finally Amaya presents annotations with pencil annotation icons and attaches XLink attributes to these icons. If the user single-clicks on an annotation icon, the text that was annotated is highlighted. If the user double-clicks on this icon, the annotation text and other metadata are presented in a separate window..." For documentation on the RDF/XPointer implementation, see: (1) "Annotations in Amaya"; (2) "Annotation Server HOWTO" ['how to set up and use a W3C-Perllib Annotations server']; (3) the special RDF annotation schema.


    [November 22, 2000]   
    XEXPR - A Scripting Language for XML.    

    W3C has acknowledged a submission from eBusiness Technologies, Inc. for XEXPR - A Scripting Language for XML. Reference: W3C Note 21 November 2000, by Gavin Thomas Nicol (Chief Scientist, eBusiness Technologies, Inc.). Document abstract: "In many applications of XML, there is a requirement for using XML in conjunction with a scripting language. Many times, this results in a scripting language such as JavaScript being bound within the XML content (like the <script> tag). XEXPR is a scripting language that uses XML as its primary syntax, making it easily embeddable in an XML document. In addition, XEXPR takes a functional approach, and hence maps well onto the syntax of XML." An associated specification XTND - XML Transition Network Definition (published as a separate NOTE) provides a generic DTD which uses XEXPR. Description: "In XML-based standards there often arises the need for two components: (1) A component for describing, declaratively, a set of states, and transitions between them: for example, when describing business processes, protocols, or decision trees. (2) A component allowing logic to be embedded into the XML. This submission is made up of two parts: XTND and XEXPR. XTND is a generic DTD that can be used for describing transition networks, and their interaction with the outside world. XEXPR is a scripting language that uses XML syntax, and hence is designed to be embedded in XML. XTND uses XEXPR. eBT is submitting these two specifications to the W3C in the hope that they will be incorporated into future specifications that need such functionality." The XTND part of the specification (XML Transition Network Definition), published as W3C Note 21-November-2000, provides formal constructs for encoding states and transitions in events in processes. Description: "In many systems, transition networks are used to describe a set of states and the transitions that are possible between them. 
Common examples are such things as ATM control flows, editorial review processes, and definitions of protocol states. Typically, each of these transition networks has its own specific data format, and its own specific editing tool. Given the rapid transition to pervasive networking, and to application integration and interchange, a standard format for transition networks is desirable. This document defines such an interchange format, defined in XML: the interchange language for the Internet... Loosely speaking, a transition network is a set of states and the transitions between them. They are good at capturing the notion of process. For example: (1) Control processes such as those in a digitally controlled heating system. (2) Processes controlling manufacture or design. (3) Workflow processes such as those found in product data management software. They are also useful in modeling the behavior of systems and can be used in object-oriented analysis to create formal models of object interaction and larger system behavior. Transition networks are closely related to finite state machines (FSM), and to data flow diagrams (DFD), but they are augmented with the following capabilities: (1) Transition networks are not limited to 'accepting or rejecting their input'. Transition networks may execute actions or fire off events during transitions. (2) Transition networks can interact with other objects, thereby affecting change in the transition network (or in other networks). (3) Transitions in transition networks can be controlled by guard conditions that prohibit or allow the transition to be followed. (4) These guard conditions can be dependent on any predicate involving objects from within the environment of the transition network. As such, transition networks can be used to describe far more complex interactions or processes than either FSMs or DFDs allow." 
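    [Illustration: the guard-and-action behaviour described in the quoted passage can be sketched in a few lines of Python. This is not XTND syntax and not code from the Note; the class names, the "tick" event, and the heating example are hypothetical, chosen only to show how a guard predicate over the environment allows or prohibits a transition and how an action runs when the transition is followed.]

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Env = Dict[str, Any]


@dataclass
class Transition:
    source: str
    target: str
    event: str
    # Guard: a predicate over the environment that allows or prohibits the transition.
    guard: Callable[[Env], bool] = lambda env: True
    # Action: executed when the transition is followed.
    action: Callable[[Env], None] = lambda env: None


@dataclass
class TransitionNetwork:
    state: str
    transitions: List[Transition] = field(default_factory=list)

    def fire(self, event: str, env: Env) -> bool:
        """Follow the first transition from the current state whose guard holds."""
        for t in self.transitions:
            if t.source == self.state and t.event == event and t.guard(env):
                t.action(env)
                self.state = t.target
                return True
        return False


def log_action(message: str) -> Callable[[Env], None]:
    """Build an action that records a message in the environment."""
    def action(env: Env) -> None:
        env.setdefault("log", []).append(message)
    return action


# Hypothetical heating controller with two states and two guarded transitions.
heater = TransitionNetwork(
    state="idle",
    transitions=[
        Transition("idle", "heating", "tick",
                   guard=lambda env: env["temperature"] < env["setpoint"],
                   action=log_action("burner on")),
        Transition("heating", "idle", "tick",
                   guard=lambda env: env["temperature"] >= env["setpoint"],
                   action=log_action("burner off")),
    ],
)

env = {"temperature": 17.0, "setpoint": 20.0}
heater.fire("tick", env)        # guard holds: idle -> heating
env["temperature"] = 21.0
heater.fire("tick", env)        # guard holds: heating -> idle
print(heater.state, env["log"])
```

    Unlike a plain finite state machine, the network here consults data outside itself (the temperature reading) before moving, which is the distinction the quoted description draws.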
The W3C staff comment says in part: "It is common to combine the declarative potential of XML with imperative scripting languages such as ECMAScript. The submission defines a new scripting language (XEXPR) which is itself expressed directly in XML. The language takes a functional approach and avoids the need for further parsing machinery as would be needed for a syntax featuring infix operators. The submission demonstrates the use of XML for defining a functional scripting language and for representing finite state transition networks. This may prove to be of interest to future W3C work on dialogs for human-computer interaction, and more generally as a component for a Web application framework. Current W3C work on voice browsers is taking a different approach, using a form filling metaphor for representing dialogs, with a focus on easy authoring for voice applications. This work is drawing upon rich experience with earlier markup languages for voice interaction, and it is unclear whether the more abstract approach presented in the submission is relevant. W3C's work on forms is using XML Schema as the basis for the modelling data, with the addition of dynamic integrity constraints that act over multiple fields. For example, the total value of an order can be defined in terms of a computation over the values of other fields such as unit prices, quantities, discounts, and tax and shipping costs. Such computations can be conveniently represented as expressions that evaluate to typed values. The focus is on a simple side-effect free representation of constraints, based upon the type system defined by XML Schema and the use of XPath for addressing form data. The XML scripting language proposed in the submission could be of interest to the XForms working group, but may prove to be too complicated, for the restricted requirements for forms. XForms is expected to have to interoperate with popular scripting languages such as ECMAScript. 
This avoids the need for the constraint language to evolve into a general purpose scripting language."
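    [Illustration: the point that a functional language written directly in XML "avoids the need for further parsing machinery" can be shown with a small Python sketch. The element names used here (subtract, multiply, add, num) are hypothetical and are NOT the vocabulary defined by the XEXPR Note; the sketch only assumes that each element names a function and that its children are the arguments. The order-total expression echoes the forms example in the staff comment.]

```python
import operator
import xml.etree.ElementTree as ET
from functools import reduce

# Hypothetical vocabulary for illustration only; these are NOT the element
# names defined by the XEXPR Note.
FUNCTIONS = {
    "add": lambda args: sum(args),
    "multiply": lambda args: reduce(operator.mul, args, 1),
    "subtract": lambda args: args[0] - sum(args[1:]),
}


def evaluate(node):
    """Evaluate an expression tree: each element is a function applied to its children."""
    if node.tag == "num":
        return float(node.text)
    args = [evaluate(child) for child in node]
    return FUNCTIONS[node.tag](args)


# (unit price * quantity) - discount, written in prefix form as nested elements.
ORDER_TOTAL = """
<subtract>
  <multiply><num>4.50</num><num>3</num></multiply>
  <num>1.50</num>
</subtract>
"""

total = evaluate(ET.fromstring(ORDER_TOTAL))
print(total)
```

    Because the XML parser already delivers the expression as a tree, the evaluator is a single recursive function; no operator-precedence or infix grammar is involved.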


    [November 22, 2000]   
    empolis K42 Knowledge Server.    

    Jasmin Franz (STEP Electronic Publishing Solutions GmbH) recently posted an announcement for the release of an evaluation version of its 'K42 knowledge server'. Excerpts: "empolis, a world class provider of knowledge management solutions, proudly announces the beta release of empolis K42, its cutting edge knowledge server that is fully compliant with the ISO standard Topic Maps (see www.topicmaps.com). The free evaluation is available at www.empolis.co.uk. Knowledge management is recognised as a crucial part of utilising information assets, whether it is for corporate or commercial publishers. empolis K42 Knowledge Server provides a real time, persistent and scalable solution to approaching knowledge management. Written in Java, in order to aid cross-platform support, it has an extensive API allowing it to be customised and extended to better meet customer's individual requirements. Utilising the latest standards including XML, XLink, Topic Maps, and XTM, empolis K42 provides access to knowledge through its Knowledge Author and Knowledge Navigator components - both of which run within a web browser. The Gartner Group said of Topic Maps: 'the paradigm is powerful, flexible and extensible, topic maps will become a mainstream technology by 2003.' empolis employees are actively involved in the Topic Maps and XTM standard developments. empolis K42 provides a new paradigm for organising, maintaining and navigating information. The information models it stores are independent of the physical domain in which that information resides. These models can provide the routes to information, such as a set of web resources on a server and do not have to be contained within that information. As a result they can be used to deploy information sets in different environments with different requirements, and can also be personalised by individual users and user communities... 
Some of the highlights of empolis K42 Knowledge Server: (1) empolis K42 provides a Knowledge Author component to enable the creation and maintenance of the knowledge data. It allows the knowledge server to be updated in real time. (2) Knowledge Navigator provides a delivery solution that can be rapidly implemented to enable companies to deliver the knowledge data in their own corporate style through the use of XML and XSL. (3) empolis K42 is written in Java in order to aid cross-platform support and has a comprehensive API to expose the functionality it provides and to enable customisation and integration of the software. (4) empolis K42 has already been tested to persist and provide access to over a million topics and is designed to scale to tens of millions. empolis K42, as a beta version, utilises and supports the Topic Map standard. But empolis K42 is a knowledge server that will enable portals, corporates and communities to capture, manage and deliver valuable knowledge assets. As such, empolis K42 will support not only Topic Maps but will include other such effective standards that help capture and express knowledge." For reference, see "(XML) Topic Maps."


    [November 22, 2000]   
    XSLTDoc for Browsing XSLT Stylesheets.    

    Jeni Tennison recently announced the (alpha) availability of a tool designed to help people browse their stylesheets. The tool itself is an XSLT application. "For beginners, it gives a description of what each instruction is doing in theory (it doesn't trace the actual running of the stylesheet), including a summary of any XPaths. For people writing complex stylesheets, it provides summary views. The XSLTDoc application gives you: (1) links to the called template from any xsl:call-template instruction; (2) links to the definitions of the variables/parameters wherever they're used; (3) sortable summary tables giving template matches and modes. It's all import/include aware, and tells you when a particular named template, variable declaration and so on are overridden in importing stylesheets. Getting linking done with matching/moded templates is a goal, but it's pretty tricky especially as there may be several templates that match in a particular case, and it's really impossible to know which will do so without having a specific source XML instance." The tool is available for download from the utilities page on Jeni Tennison's web site. Just "download the .ZIP archive, unzip it into a working directory, and load xslt-doc.xsl; you will be prompted for a stylesheet to load; enter its file name relative to the XSLTDoc directory." Note also Jeni's XSLT Pages with tutorials.
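    [Illustration: XSLTDoc itself is written in XSLT, and its source is not reproduced here. As a rough sketch of the first feature listed above (resolving each xsl:call-template to the named template it calls), the following Python uses only the standard library. The sample stylesheet and the function name index_calls are hypothetical, and unlike XSLTDoc the sketch looks at a single file and does not follow xsl:import or xsl:include.]

```python
import xml.etree.ElementTree as ET

# The XSLT 1.0 namespace, in ElementTree's {uri}localname notation.
XSL = "{http://www.w3.org/1999/XSL/Transform}"

# Hypothetical stylesheet: "header" is defined, "footer" is called but missing.
STYLESHEET = """\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:call-template name="header"/>
    <xsl:apply-templates/>
    <xsl:call-template name="footer"/>
  </xsl:template>
  <xsl:template name="header"><h1>Title</h1></xsl:template>
  <xsl:template match="para"><p><xsl:apply-templates/></p></xsl:template>
</xsl:stylesheet>
"""


def index_calls(stylesheet_text):
    """Map each called template name to whether a matching named template exists."""
    root = ET.fromstring(stylesheet_text)
    named = {t.get("name") for t in root.iter(XSL + "template") if t.get("name")}
    calls = [c.get("name") for c in root.iter(XSL + "call-template")]
    return {name: (name in named) for name in calls}


links = index_calls(STYLESHEET)
print(links)
```

    Named templates can be resolved statically in this way; as the announcement notes, match/mode templates cannot, because which template fires depends on the source document.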


    [November 22, 2000]   
    IBM alphaWorks Releases XSLbyDemo Tool for XSLT Rules Generation.    

    A new tool from IBM's alphaWorks XML Application Development team is XSLbyDemo. XSLbyDemo is a technology "for generating XSLT rules on the basis of editing operations conducted under the WYSIWYG mode of Page Designer, which is a full-fledged HTML authoring tool provided with IBM WebSphere Studio. The remarkable feature of XSLbyDemo is that users can create an XSLT stylesheet automatically solely on the basis of the knowledge of HTML editing. The users do not have to know anything about the syntax/programming of XSLT, and need not be aware of the rule generation process, which happens behind the HTML authoring in the WYSIWYG mode. The users are thus allowed to concentrate on the styling of the HTML document, relying on the Page Designer's full capabilities for HTML and CSS authoring. XSLbyDemo finally produces an XSLT stylesheet that transforms a given HTML document to a desired document obtained as the results of the WYSIWYG authoring. XSLbyDemo runs under Windows NT 4.0 with Service Pack 4, Windows 95, Windows 98, or Windows 2000." For related tools, see "XSL/XSLT Software Support."


    [November 22, 2000]   
    IBM's XML and Web Services Development Environment.    

    New from IBM alphaWorks labs: XML and Web Services DE. "The IBM XML and Web Services Development Environment is the first development environment that creates open, platform-neutral Web services for deployment across heterogeneous systems. This tool allows HTML, Java, SQL and XML developers to quickly extend existing e-business applications so that they can deliver business informational Web services. Database developers can also use SQL as a programming language to quickly build data-aware Web services. Web developers can create Web services with minimal knowledge of Java, XML or SOAP. It turns the power of XML and Java technology into competitive e-business advantage. It provides all of the tooling needed to create Web services... (1) Discover - Browse the UDDI Business Registry to locate existing Web services for integration. The Web becomes an extension of the development environment. (2) Create/Transform - Use powerful XML editing functions to quickly develop new Web services. Complete transformation (edit and mapping) tools are also provided so that developers can create Web services from existing XML, Java, or SQL applications. (3) Build - Wrap existing bean components as SOAP-accessible services and describe them in the Web services description language (WSDL). Generate SOAP proxies to Web services described in WSDL. Generate bean skeletons from WSDL. Minimal knowledge of SOAP or WSDL is required. (4) Deploy - Deploy the Web service on the developer's machine or to a remote, production-level server for testing right away. After testing, publish the Web service immediately to the application server (WebSphere Application Server or Apache Tomcat). (5) Test - Test applications as they run locally or remotely, and get instant feedback. (6) Publish - In addition to creating and deploying Web services, the development environment can also publish them to the UDDI Business Registry. This advertises your Web services so that other businesses can access them." 
See (1) "Universal Description, Discovery, and Integration (UDDI)"; (2) "Simple Object Access Protocol (SOAP)"; (3) "Web Services Description Language (WSDL)."

  • [November 21, 2000]   
    Extensible Stylesheet Language (XSL) Specification Becomes W3C Candidate Recommendation.    

    W3C has announced the promotion of the XSL specification to the status of a W3C Candidate Recommendation: Extensible Stylesheet Language (XSL) Version 1.0. Reference: W3C Candidate Recommendation 21-November-2000, edited by Sharon Adler (IBM), Anders Berglund (IBM), Jeff Caruso (Pageflex), Stephen Deach (Adobe), Paul Grosso (ArborText), Eduardo Gutentag (Sun), Alex Milowski (Lexica), Scott Parnell (Xerox), Jeremy Richman (BroadVision), Steve Zilles (Adobe). Document abstract: "XSL is a language for expressing stylesheets. It consists of two parts: (1) a language for transforming XML documents, and (2) an XML vocabulary for specifying formatting semantics. An XSL stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses the formatting vocabulary." Description: "XSL is a language for expressing stylesheets. Given a class of arbitrarily structured XML documents or data files, designers use an XSL stylesheet to express their intentions about how that structured content should be presented; that is, how the source content should be styled, laid out, and paginated onto some presentation medium, such as a window in a Web browser or a hand-held device, or a set of physical pages in a catalog, report, pamphlet, or book... An XSL stylesheet processor accepts a document or data in XML and an XSL stylesheet and produces the presentation of that XML source content that was intended by the designer of that stylesheet. There are two aspects of this presentation process: first, constructing a result tree from the XML source tree and second, interpreting the result tree to produce formatted results suitable for presentation on a display, on paper, in speech, or onto other media. The first aspect is called tree transformation and the second is called formatting. The process of formatting is performed by the formatter. This formatter may simply be a rendering engine inside a browser. 
Tree transformation allows the structure of the result tree to be significantly different from the structure of the source tree. For example, one could add a table-of-contents as a filtered selection of an original source document, or one could rearrange source data into a sorted tabular presentation. In constructing the result tree, the tree transformation process also adds the information necessary to format that result tree. Formatting is enabled by including formatting semantics in the result tree. Formatting semantics are expressed in terms of a catalog of classes of formatting objects. The nodes of the result tree are formatting objects. The classes of formatting objects denote typographic abstractions such as page, paragraph, table, and so forth. Finer control over the presentation of these abstractions is provided by a set of formatting properties, such as those controlling indents, word- and letter-spacing, and widow, orphan, and hyphenation control. In XSL, the classes of formatting objects and formatting properties provide the vocabulary for expressing presentation intent. The XSL processing model is intended to be conceptual only. An implementation is not mandated to provide these as separate processes. Furthermore, implementations are free to process the source document in any way that produces the same result as if it were processed using the conceptual XSL processing model." The new CR has been produced by the XSL Working Group as part of the W3C Style Activity. The Candidate Recommendation review period ends on February 28, 2001; meantime, comments may be sent to the publicly archived XSL mailing list. The following exit criteria for the CR (preceding advancement to PR) are proposed: "(1) Sufficient reports of implementation experience have been gathered to demonstrate that XSL processors based on the specification are implementable and have compatible behavior. 
(2) An implementation report shows that there is at least one implementation for each basic formatting object and property. (3) Providing formal responses to all comments received." The specification is available also in PDF, XML, HTML, and .ZIP archive formats. For related references, see (1) the W3C XSL specification work and (2) "Extensible Stylesheet Language (XSL/XSLT)."
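    [Illustration: the "tree transformation" half of the processing model can be sketched without an XSL processor. The Python below is not XSLT and not a conforming implementation; it simply builds a small result-tree fragment (a table of contents as a filtered selection of the source) whose nodes are formatting objects. The source vocabulary (report, section, title, para) is hypothetical. fo:block and the start-indent property come from the XSL formatting vocabulary, but the fragment is not a complete formatting-object document and would still need a formatter to be rendered.]

```python
import xml.etree.ElementTree as ET

# Namespace of the XSL formatting vocabulary.
FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)

# Hypothetical source document with arbitrary structure.
SOURCE = """
<report>
  <section><title>Scope</title><para>...</para></section>
  <section><title>Approach</title><para>...</para></section>
  <section><title>Budget</title><para>...</para></section>
</report>
"""


def build_toc(source_root):
    """Build a result-tree fragment: one indented fo:block per section title."""
    toc = ET.Element(f"{{{FO_NS}}}block")
    for section in source_root.iter("section"):
        entry = ET.SubElement(toc, f"{{{FO_NS}}}block")
        entry.set("start-indent", "1em")   # a formatting property on the object
        entry.text = section.findtext("title")
    return toc


toc = build_toc(ET.fromstring(SOURCE))
print(ET.tostring(toc, encoding="unicode"))
```

    The result tree has a different shape from the source tree (paragraph content is dropped, titles are gathered under one block), and it already carries the formatting information that the second stage, the formatter, would interpret.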


    [November 20, 2000]   
    W3C's Natural Language Semantics Markup Language for the Speech Interface Framework.    

    The W3C has issued a new working draft specification which describes markup for representing natural language semantics: Natural Language Semantics Markup Language for the Speech Interface Framework. Reference: W3C Working Draft 20-November-2000, by Deborah A. Dahl (Unisys). Document abstract: "The W3C Voice Browser working group aims to develop specifications to enable access to the Web using spoken interaction. This document is part of a set of specifications for voice browsers, and provides details of an XML markup language for describing the meanings of individual natural language utterances. It is expected to be automatically generated by semantic interpreters for use by components that act on the user's utterances, such as dialog managers." In this proposal, the NL semantics representation "uses the data models of the W3C XForms draft specification to represent application-specific semantics. While XForms syntax may change in future revisions of the specification, it is not expected to change in ways that affect the NL Semantics Markup Language significantly." The authors of the WD are members of the W3C Voice Browser Working Group. The specification has been produced as part of the W3C Voice Browser Activity, and forms part of the proposals for the W3C Speech Interface Framework. The specification includes a set of draft elements and attributes and [later will include] a draft DTD. Markup uses a root element <result> (with attributes grammar, x-model, and xmlns) which includes one or more <interpretation> elements. Multiple interpretations result from ambiguities in the input or in the semantic interpretation. The <interpretation> element has attributes confidence, grammar, x-model, and xmlns. The <interpretation> element includes an <input> element which contains the input being analyzed, optionally a <model> element defining the XForms data model and an <instance> element containing the instantiation of the data model for this utterance. 
Description: "The general purpose of the NL Semantics Markup is to represent information automatically extracted from a user's utterances by a semantic interpretation component, where utterance is to be taken in the general sense of a meaningful user input in any modality supported by the platform. Referring to the sample Voice Browser architecture in Introduction and Overview of the W3C Speech Interface Framework, a specific architecture can take advantage of this representation by using it to convey content among various system components that generate and make use of the markup. Components that generate NL Semantics Markup: (1) ASR, (2) Natural language understanding, (3) Other input media interpreters [e.g. DTMF, pointing, keyboard], (4) Reusable dialog component, (5) Multimedia integration component. Components that use NL Semantics Markup: (1) Dialog manager, and (2) Multimedia integration component. A platform may also choose to use this general format as the basis of a general semantic result that is carried along and filled out during each stage of processing. In addition, future systems may also potentially make use of this markup to convey abstract semantic content to be rendered into natural language by a natural language generation component..." Comments on the working draft may be sent to the publicly archived W3C mail list 'www-voice@w3.org'. See also the related grammar specification: Speech Recognition Grammar Specification for the W3C Speech Interface Framework.
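    [Illustration: a consumer such as a dialog manager might read this markup roughly as follows. The element and attribute names (result, interpretation, input, instance, confidence, grammar) are those listed in the working draft summary above; everything else is an assumption made for the sketch, including the sample utterances, the example.com grammar URI, the content of the instance elements, the omission of namespaces and of the optional model element, and the treatment of confidence as a number where higher means more likely.]

```python
import xml.etree.ElementTree as ET

# Hypothetical result with two competing interpretations of one utterance.
SAMPLE = """
<result grammar="http://example.com/drinks.grammar">
  <interpretation confidence="85">
    <input>I would like a coke</input>
    <instance><drink>coke</drink></instance>
  </interpretation>
  <interpretation confidence="40">
    <input>I would like a cone</input>
    <instance><dessert>cone</dessert></instance>
  </interpretation>
</result>
"""


def best_interpretation(xml_text):
    """Return (input text, instance data) for the highest-confidence interpretation."""
    root = ET.fromstring(xml_text)
    best = max(root.findall("interpretation"),
               key=lambda i: float(i.get("confidence", "0")))
    instance = best.find("instance")
    slots = {} if instance is None else {child.tag: child.text for child in instance}
    return best.findtext("input"), slots


utterance, slots = best_interpretation(SAMPLE)
print(utterance, slots)
```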


    [November 17, 2000]   
    Rule Markup Language (RuleML).    

    The RuleML Initiative represents a collaborative research effort by an international team of participants seeking to develop a shared Rule Markup Language (RuleML). The project is consciously related to other standards work, including Mathematical Markup Language (MathML), DARPA Agent Markup Language (DAML), Predictive Model Markup Language (PMML), Attribute Grammars in XML (AG-markup), and Extensible Stylesheet Language Transformations (XSLT). From the web site description: "The participants of the RuleML Initiative constitute an open network of individuals and groups from both industry and academia. We are not commencing from zero but have done some work related to rule markup or have actually proposed some specific tag set for rules. Our main goal is to provide a basis for an integrated rule-markup approach that will be beneficial to all involved and to the rule community at large. This shall be achieved by having all participants collaborate in establishing translations between existing tag sets and in converging on a shared rule-markup vocabulary. This RuleML kernel language can serve as a specification for immediate rule interchange and can be gradually extended - possibly together with related initiatives - towards a proposal that could be submitted to the W3C. Rules can be stated (1) in natural language, (2) in some formal notation, or (3) in a combination of both. Being in the third, 'semiformal' category, the RuleML Initiative is working towards an XML-based markup language that permits Web-based rule storage, interchange, retrieval, and firing/application. Rules in (and for) the Web have become a mainstream topic since inference rules were marked up for E-Commerce and were identified as a Design Issue of the Semantic Web, and since transformation rules were put to practice for document generation from a central XML repository (as used here). 
Rules have also continued to play an important role in Intelligent Agents and AI shells for knowledge-based systems, which need a Web interchange format, too. The Rule Markup Initiative has taken initial steps towards defining a shared Rule Markup Language (RuleML), permitting both forward (bottom-up) and backward (top-down) rules in XML for deduction, rewriting, and further inferential-transformational tasks. The initiative started during PRICAI 2000, as described in the Original RuleML Slide, and was launched in the Internet on 2000-11-10. A complementary effort coordinates the development of Java rule engines. A Rule Markup Workshop is planned in conjunction with the third International Conference on Electronic Commerce, ICEC2001, in Vienna, Austria, in October 2001." For background and references, see (1) the RuleML web site and (2) "Rule Markup Language (RuleML)." See similarly Relational-Functional Markup Language (RFML) and Business Rules Markup Language (BRML).
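    [Illustration: "forward (bottom-up)" rule application, mentioned above, means starting from known facts and repeatedly firing rules until nothing new can be derived. The Python sketch below shows that idea for simple propositional rules. It is not RuleML markup and not code from the initiative; the rule and fact names are hypothetical.]

```python
def forward_chain(facts, rules):
    """Apply rules bottom-up until no new facts can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and all(p in derived for p in premises):
                derived.add(conclusion)
                changed = True
    return derived


# Each rule is (premises, conclusion); the names are hypothetical.
RULES = [
    (("premium_customer", "regular_product"), "discount_5_percent"),
    (("spent_over_5000",), "premium_customer"),
]

facts = forward_chain({"spent_over_5000", "regular_product"}, RULES)
print(sorted(facts))
```

    A backward (top-down) engine would instead start from the goal, here the discount, and search for rules and facts that establish it.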


    [November 17, 2000]   
    Ontopia Topic Map Navigator Publicly Available.    

    A communiqué from Sylvia Schwab announces the public availability of Ontopia's Topic Map Navigator (limited edition): "Ontopia is pleased to announce that you can now download a free version of the Topic Map Navigator directly from the Ontopia website. The Navigator allows you to browse your topic maps in a convenient web interface with no need for programming or configuration. Ontopia will be adding support for the XTM (XML Topic Map) DTD as soon as it's been finalized; in the meantime an XML DTD (Document Type Definition) defined by Ontopia is required. If your topic map is valid against the following DTD (http://www.ontopia.net/ontopia/tmdtds.html) you can load it into the navigator and start browsing it right away. The free version of the Navigator is restricted to only accept topic maps smaller than 5 kilotao in size. This means that the topic map can have no more than 5000 topics, associations and occurrences. The Navigator will expire on 15 April 2001 and is intended for non-commercial use. Shortly before the expiry date, you will be able to upgrade to a trial of our 1.0 version of the software... The Ontopia Navigator is a navigational interface for topic maps built using the Ontopia Engine. It is written as a collection of Java Server Pages (JSPs) that use the Ontopia Engine to load a topic map and produce a navigational web interface to it. This means that the Navigator can be deployed on any web server that supports JSP. It includes a high-level API which enables any Java developer or web-developer with JSP skills to quickly create fully-functional, customised web applications. The resulting interface consists of simple HTML web pages using frames and some very simple JavaScript for the implementation of the default occurrence types extension. This means that the Navigator works with any web browser that supports frames. The Navigator package also includes a reference implementation to provide a starting point for developing new visualisations." 
[Note: 'kilotao' in the announcement is a suspected Ontopian neologism, derived from "kilo" (1000, given the announcement's limit of 5000 for 5 kilotao) + "TAO = 'topic, association, occurrence'," as in "The TAO of Topic Maps," by Steve Pepper.] For other TM information, see (1) "The Ontopia Topic Map Engine: A Technical Introduction" -- a brief introduction to the Ontopia Topic Map Engine and Navigator for technically oriented readers, by Lars Marius Garshol; (2) online demonstrations of the Navigator; (3) the XTM (XML Topic Maps) Document Web site; (4) "(XML) Topic Maps."


    [November 17, 2000]   
    IDOOX Releases XDB: XML Database.    

    Miloslav Nic announced the pre-release publication of XDB: XML Database. "XDB is an XML document repository providing structured storage of XML data, at present using an RDBMS (Relational Database Management System) mapping over PostgreSQL. As the first step, our plan is to develop a lightweight XML persistent storage engine on top of a relational database backend to come up with a UI and API in short time and replace it by our native XML storage system in the second step to satisfy complex XML processing requirements. XDB intention is to offer a fast, reliable and scalable XML database framework with powerful querying techniques according to W3C standards (XPath, XML Query) and standard XML processing APIs (SAX, DOM)... the main purpose of XDB is to provide native storage of XML data. RDBMS is not the target, but just temporal method which will be replaced by dedicated storage within couple of months. Principal features: (1) Ability to store and process large collections of XML documents; (2) Stores any well-formed document; (3) Provides SAX interface; (4) RDBMS mapping of XML documents; (5) Access via XPath based query language; (6) Independence on database system." See also the associated white paper. For related tools, see "XML and Databases."
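    [Illustration: the announcement does not describe XDB's actual relational schema, so the following is only a generic sketch of what an "RDBMS mapping of XML documents" can look like. It stores one row per element with a pointer to its parent, and answers a simple path-style lookup with a SQL self-join. The table layout and function names are hypothetical, attributes and mixed content are ignored, and SQLite from the Python standard library stands in for PostgreSQL so that the example is self-contained.]

```python
import sqlite3
import xml.etree.ElementTree as ET


def store_document(conn, xml_text):
    """Shred a well-formed document into one table row per element."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node ("
        "id INTEGER PRIMARY KEY, parent INTEGER, name TEXT, text TEXT)"
    )

    def insert(element, parent_id):
        cur = conn.execute(
            "INSERT INTO node (parent, name, text) VALUES (?, ?, ?)",
            (parent_id, element.tag, (element.text or "").strip()),
        )
        node_id = cur.lastrowid
        for child in element:
            insert(child, node_id)

    insert(ET.fromstring(xml_text), None)
    conn.commit()


def children_named(conn, parent_name, child_name):
    """Rough equivalent of the path //parent_name/child_name, as a self-join."""
    rows = conn.execute(
        "SELECT c.text FROM node c JOIN node p ON c.parent = p.id "
        "WHERE p.name = ? AND c.name = ? ORDER BY c.id",
        (parent_name, child_name),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
store_document(
    conn,
    "<library><book><title>XML</title></book>"
    "<book><title>SGML</title></book></library>",
)
titles = children_named(conn, "book", "title")
print(titles)
```

    Each additional step in a path costs another join in a mapping of this kind, which is one reason a project might treat the relational backend as a temporary measure before moving to dedicated XML storage.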

    [November 16, 2000]   
    MATE Project Uses XML Tools for Spoken Language Dialogue Corpora.    

    Numerous encoding initiatives now employ XML in the annotation of spoken language dialogue corpora. One such XML-based project is MATE (Telematics Project LE4-8370; Multilevel Annotation, Tools Engineering), which "aims to facilitate re-use of language resources by addressing the problems of creating, acquiring, and maintaining language corpora. The problems are addressed along two lines: (1) through the development of a standard for annotating resources; (2) through the provision of tools which will make the processes of knowledge acquisition and extraction more efficient. Specifically, MATE treats spoken dialogue corpora at multiple levels, focusing on prosody, (morpho-) syntax, co-reference, dialogue acts, and communicative difficulties, as well as inter-level interaction. The results of the project will be of particular benefit to developers of spoken language dialogue systems but will also be directly useful for other applications of language engineering." The 'MATE Dialogue Annotation Guidelines' provide "a comprehensive collection of recommendations or guidelines for representing descriptive annotation of spoken dialogue material. Descriptive annotation includes any information that encodes linguistic data with respect to their physical, perceptual, or functional dimensions. Spoken dialogue material refers to any collection of spoken dialogue data (human-human, human-system, or human-human-system), including not only speech files but also logfiles or scenarios which are related to the spoken dialogues. Spoken dialogue annotation is the only area considered in this report, however this does not exclude that the recommendations may apply to other areas as well. It builds on a common standard framework in terms of a coding module at the conceptual level and an underlying representation in XML at the implementational level. 
For each level considered by MATE, recommendations are provided on how to encode relevant phenomena, one or more best practice coding modules are provided and several examples are given. The descriptions given in this document allow a complete separation from the underlying machine representation for which MATE uses XML. The separation means that in principle one could decide to use other formats than XML at the implementational level without affecting the coding module in any way. In this document recommendations will be made that rely on a given markup language, XML, that has already found broad support. This is an important factor as the availability of parsers and other software enhances the integration of this proposal into existing environments." Annex C supplies the XML DTDs. The associated MATE workbench program "provides support for flexible display and editing of XML annotations, and complex querying of a set of linked files. The workbench was designed to support the annotation of XML coded linguistic corpora, but it could be used to annotate any kind of data, as it is not dependent on any particular annotation scheme. Rather than being a general purpose XML-aware editor it is a system for writing specialised editors tailored to a particular annotation task. A particular editor is defined using a transformation language, with suitable display formats and allowable editing operations. The workbench is written in Java, which means that it is platform-independent. This paper outlines the design of the workbench software and compares it with other annotation programs... The major features of the MATE workbench are: (1) An internal database - using arbitrary XML as an interchange format, extended to cover multiple hierarchies and arbitrary directed graphs using hyperlinks or ID/IDREF pointers between elements. This extension from trees to graphs is required to allow XML to represent more complex data. 
(2) A query language which is tailored to this internal representation. This language returns tuples instead of single elements (as in the XSLT query language). The architecture allows us to add new structure to the database by evaluating a query. (3) A transformation language and processor that goes beyond XSLT in some respects. (4) A display and editing engine for displaying to the user and enabling editing actions. The MATE workbench uses XML as its input/output format, and uses a similar internal data model. However, the strictly hierarchic nature of XML is at odds with certain aspects of linguistic (particularly speech) data. In multi-speaker dialogues, speech may overlap, and different annotation hierarchies coded on a corpus may overlap, for example prosody and syntax. One way to indicate this non-hierarchical structure in XML is by the use of standoff annotation. Linking between elements is done by means of a distinguished href attribute of elements, which uses a subset of the XPointer proposal to point to arbitrary elements in the same or different files. Such attributes are often called hyperlinks. This extended data model allows us to represent overlapping or crossing annotations...for example, such that XML represent a case where a contrastive marking is on the subject and verb and crosses a <vp> constituent..." For references, see "Multilevel Annotation, Tools Engineering (MATE)." Related speech data annotation projects include, for example: (1) DARPA Communicator Project and XML Log Standard; (2) Computing Environment for Linguistic, Literary, and Anthropological Research (CELLAR); (3) Architecture and Tools for Linguistic Analysis Systems (ATLAS); (4) TalkBank and the Codon XML-Based Annotation Framework; (5) ACE Pilot Format DTDs; (6) Transcriber - Speech Segmentation and Annotation DTD.
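
[Illustration: the standoff annotation described above can be sketched as follows. This is not MATE workbench code. The file names, the element names, and the simplified "file#id" form of the href value are assumptions; MATE itself uses a subset of the XPointer proposal. Two annotation layers point into one base file, so a contrastive marking and a <vp> constituent can overlap without either being nested inside the other.]

```python
import xml.etree.ElementTree as ET

# Hypothetical files, held in memory: a base transcription and two
# independent annotation layers that point into it.
FILES = {
    "words.xml": """<words>
      <w id="w1">the</w><w id="w2">dog</w><w id="w3">barked</w><w id="w4">loudly</w>
    </words>""",
    "syntax.xml": """<syntax>
      <vp id="vp1">
        <ref href="words.xml#w3"/><ref href="words.xml#w4"/>
      </vp>
    </syntax>""",
    "prosody.xml": """<prosody>
      <contrast id="c1">
        <ref href="words.xml#w2"/><ref href="words.xml#w3"/>
      </contrast>
    </prosody>""",
}


def load(name):
    """Parse one of the in-memory files."""
    return ET.fromstring(FILES[name])


def resolve(href):
    """Resolve 'file#id' to the element with that id in the named file."""
    file_name, _, target_id = href.partition("#")
    for elem in load(file_name).iter():
        if elem.get("id") == target_id:
            return elem
    raise KeyError(href)


def covered_words(file_name, elem_id):
    """Return the text of the words an annotation element points to."""
    for elem in load(file_name).iter():
        if elem.get("id") == elem_id:
            return [resolve(ref.get("href")).text for ref in elem.iter("ref")]
    raise KeyError(elem_id)
```

Here the contrast element covers "dog barked" and the vp element covers "barked loudly". The two spans cross, which a single XML tree could not express directly.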


    [November 16, 2000]   
    AuthXML Standard for Web Security.    

    Securant Technologies recently "announced the formation of an open industry working group to facilitate the creation of the first XML-based standard for Web security, called AuthXML. This standard will leverage XML, which is platform and programming language independent, to enable authentication and authorization functions to be performed across and interoperate with multi-vendor Web security systems, packaged and custom Web applications, and network level security systems. AuthXML will allow integrated Web commerce and a transparent user experience by providing a standardized approach for presenting and keeping track of security details as a transaction or session traverses linked Web sites based on disparate technologies, applications and platforms. Securant has been working with its key customers and partners for several months to develop a framework for the AuthXML specification, and is now opening up its research and design efforts to help foster and accelerate the adoption of a universal standard. AuthXML is a vendor-neutral standard that enables integration of Web security, network security, B2B infrastructures and applications. AuthXML is named as such because it comprises 2 primary components: Authentication and Authorization and is designed to ease integration of transactions between trading partner sites that may be using different security systems and within a given site that may be deploying multiple applications that need integrated security. AuthXML will enable: (1) Faster deployment for customers through standards based integration, (2) Interoperability between Web security vendors allowing for secure and simplified integrated commerce, (3) Simplified user experience through reduced sign-ons across Web networks, (4) More tightly integrated Web sites and applications based on non-proprietary integration. AuthXML is intended to be a completely open standard for Web-based application security and inter-application integration. 
The standard defines a set of XML message formats, XML schemas and interaction models that web sites can use in order to provide seamless user experience and business transactions that span multiple parties and security domains across the Internet. AuthXML is not owned by any one vendor. Instead, the standards proposal will be submitted to an appropriate open standards body to ensure that it remains an open industry standard in which any interested companies and organizations can participate. The AuthXML 1.0 Specification is currently [2000-11-16] under development by Securant Technologies and some of its key partners and customers." For references, see "AuthXML Standard for Web Security."


    [November 16, 2000]   
    Security Services Markup Language (S2ML).
        

    Netegrity, Inc. has "announced that it is working with a group of industry leading companies to define the first standard for enabling secure e-commerce transactions using XML. The industry's first major collaboration, called Security Services Markup Language (S2ML), will create a common language for sharing security information about transactions and end users between companies engaged in online B2B and B2B2C transactions. Authors of the S2ML specification are Bowstreet, Commerce One, Jamcracker, Netegrity, Sun Microsystems, VeriSign, and webMethods. Reviewers of the specification include Art Technology Group, Oracle, PricewaterhouseCoopers, and TIBCO. S2ML is intended to solve [security] problems by helping to unify access control methods through an open, standards-based framework for the next generation of secure e-commerce transactions. The S2ML specification addresses three main areas of security services: authentication, authorization, and entitlement/privilege. S2ML defines standard XML schemas, as well as an XML request/response protocol, for describing authentication and authorization services through XML documents. S2ML also will provide specific bindings for various protocols such as HTTP and SOAP and B2B messaging frameworks such as ebXML. S2ML will deliver the following benefits: (1) Interoperability: With S2ML e-marketplaces, service providers, and end user companies of all sizes will be able to securely exchange information about authenticated users, Web services, and authorization information without requiring partners to change their current security solutions. S2ML will become the common language for different infrastructures to communicate security data. (2) Open Solution: S2ML is designed to work with multiple XML document exchange protocols and frameworks such as SOAP, OAG, MIME, Biztalk, and ebXML. 
(3) Single Sign-On Across Partner Sites: S2ML will enable users to travel across sites with their credentials and entitlements so that companies and partners in a trusted relationship can deliver single sign-on across sites, regardless of the security infrastructures in place. The S2ML effort is an open industry initiative in which any organization can participate and implement the specifications. The vendors behind the S2ML initiative plan to submit the S2ML 0.8 specification to the World Wide Web Consortium (W3C) and OASIS for consideration within the next 30 days." For other details, see (1) the S2ML web site; (2) "Security Services Markup Language (S2ML)"; and (3) the full text of the announcement: "Netegrity And Industry Leaders To Define First XML Standard For Secure E-Commerce. Art Technology Group, Bowstreet, Commerce One, Jamcracker, Oracle, PricewaterhouseCoopers, Sun Microsystems, TIBCO Software Inc., VeriSign, and webMethods join Netegrity to Develop Security Services Markup Language (S2ML)."


  • [November 14, 2000] 
    "Free XML Starter Kit. Software AG now offering trial version of the new Tamino XML Platform.

    "Software AG is offering its new Tamino XML Platform free of charge for a 90-day evaluation period. The trial version contains Tamino XML database, X-Studio (development environment), and X-Bridge (integration tool). The software features a comprehensive set of functionalities, which introduce professional-level users to XML applications. For instance, they can test the products with a sample application from the real-estate industry included in the Kit. With the XML Starter Kit, testers see how easy it is to develop applications with the new Tamino XML Platform and acquire hands-on experience using XML. Everyone's talking about XML -- but only a very select few have actually employed XML applications, not to mention developed them. 'With the XML Starter Kit, Software AG is offering developers, partners and professional-scale users the chance to install and test all the components of the first complete XML architecture for electronic business applications,' explains Andreas Zeitler, Software AG board member for sales and marketing. Users quickly learn how to install an XML database. WithTamino X-Studio, the development tool of the Tamino XML Platform, they can develop their first XML-based application and link it to a web or application server with ease. And, with Tamino X-Bridge, the architecture's XML middleware, the different applications can be integrated seamlessly. The Starter Kit also includes an in-depth tutorial, comprising two versions. The full version requires installation of all components, allowing users to go through every step of the application development process. The demo version is for less experienced programmers and does not entail as much installation... XML Starter Kit from Software AG: You will find that this unique composition of tools, database and connectivity software will open up the world of XML for you. 
Find out how Software AG's native XML database Tamino, the XML development tool X-Studio and the XML connectivity tool X-Bridge, can simplify any XML project."


  • [November 14, 2000]   
    SpeechObjects Specification Published as a W3C NOTE.
        

    The W3C has acknowledged a submission from Nuance Communications, Inc. for a SpeechObjects Specification Version 1.0. Reference: W3C Note 14-November-2000, edited by Daniel C. Burnett. Document abstract: "This document describes SpeechObjects, a core set of reusable dialog components that are callable through a dialog markup language such as VoiceXML, to perform specific dialog tasks, for example, get a date or a credit card number, etc. The major goal of SpeechObjects is to complement the capabilities of the dialog markup language and to leverage best practices and reusable component technology in the development of speech applications." Description: "SpeechObjects are reusable software components that encapsulate discrete pieces of conversational dialog. SpeechObjects are based on an open architecture that can be deployed on any of the major server and IVR (interactive voice response) platforms. This paper describes a specification based on Nuance's Java implementation of SpeechObjects. Simply stated, a SpeechObject is a reusable software component that implements a dialog flow and is packaged with the audio prompts and recognition grammars that support that dialog. An implementation of the foundation set of SpeechObjects, including source code, is freely available to the SpeechObjects developer community as part of Nuance's Open Voice Framework initiative." The specification from Nuance is set against the backdrop of work conducted in the W3C Voice Browser Working Group, which "has determined requirements for several specifications including one for a Reusable Dialog Component Requirements." According to the W3C staff comment: "W3C is working to expand access to the Web to allow people to interact with Web sites via spoken commands, and listening to prerecorded speech, music and synthetic speech. 
The W3C Voice Browser Activity has produced a set of requirements for interactive voice response applications and is now developing a set of specifications that meet these requirements... The W3C Voice Browser Working Group plans to develop specifications for its Speech Interface Framework using SpeechObjects as a model for work on reusable dialog components. This work is already underway, following the publication of a requirements draft for reusable dialog components. A specification meeting these requirements is under development, with the goal of being used together with W3C's dialog markup language. It is recommended that the Nuance Communications SpeechObjects submission is carefully examined in the context of this work." See further: (1) the W3C Voice Browser Activity and (2) "VoiceXML."


    [November 14, 2000]   
    DOM Level 2 Published As a W3C Recommendation.    

    W3C has released the Document Object Model (DOM) Level 2 Core Specification Version 1.0 and its associated modules as a W3C Recommendation. Core Reference: W3C Recommendation 13-November-2000, edited by Arnaud Le Hors, Philippe Le Hégaret, Lauren Wood (WG Chair), Gavin Nicol, Jonathan Robie, Mike Champion, and Steve Byrne. Four other modules released with the Core include: (1) Document Object Model (DOM) Level 2 Views Specification; (2) Document Object Model (DOM) Level 2 Events Specification; (3) Document Object Model (DOM) Level 2 Style Specification; (4) Document Object Model (DOM) Level 2 Traversal and Range Specification. At the same time, a working draft has been issued for Document Object Model (DOM) Level 2 HTML Specification (to ensure backwards compatibility). Excerpts from the W3C press release: "Leading the Web to its full potential, the World Wide Web Consortium (W3C) today released the Document Object Model Level 2 specification as a W3C Recommendation. The specification reflects cross-industry agreement on a standard API (Applications Programming Interface) for manipulating documents and data through a programming language (such as Java or ECMAScript). A W3C Recommendation indicates that a specification is stable, contributes to Web interoperability, and has been reviewed by the W3C Membership, who favor its adoption by the industry. Created and developed by the W3C Document Object Model (DOM) Working Group, this specification extends the platform- and language-neutral interface to access and update dynamically a document's content, structure, and style first described by the DOM Level 1 Recommendation. The DOM Level 2 provides a standard set of objects for representing Extensible Markup Language (XML) documents and data, including namespace support, a style sheet platform which adds support for CSS 1 and 2, a standard model of how these objects may be combined, and a standard interface for accessing and manipulating them. 
DOM Level 1 was designed for HTML 4.0 and XML 1.0. With DOM Level 2, authors can take further advantage of the extensibility of XML. Simply put, anywhere you use XML, you can now use the DOM to manipulate it. The standard DOM interface makes it possible to write software (similar to plug-ins) for processing customized tag-sets in a language- and platform-independent way. A standard API makes it easier to develop modules that can be re-used in different applications. DOM Level 2 provides support for XML namespaces, extending and improving the XML platform. As more sites move to XML for content delivery, DOM Level 2 emerges as a critical tool for developing dynamic Web content. The DOM defines a standard API that allows authors to write programs that work without changes across tools and browsers from different vendors. But beyond this, it provides a uniform way to produce programs that work across a variety of different devices, so all may benefit from dynamically generated content. The DOM Level 2 Cascading Style Sheet (CSS) API makes it possible for a script author to access and manipulate style information associated with contents, while preserving accessibility. DOM Level 2 also includes an Events API to provide interactivity anywhere someone uses XML - in documents, in data, or in B2B applications..." For related references, see: (1) testimonials for the DOM Level 2 Recommendation, (2) the DOM Activity report, and (3) "W3C Document Object Model (DOM)."
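
[Illustration: the namespace support added in DOM Level 2 Core can be tried with Python's xml.dom.minidom, which provides the namespace-aware methods createElementNS, setAttributeNS, getAttributeNS, and getElementsByTagNameNS. The namespace URI, the "inv" prefix, and the element names below are assumptions chosen for the example, and minidom is only one partial implementation of the Recommendation.]

```python
from xml.dom import minidom

# Hypothetical namespace URI and element names, chosen for illustration.
INVOICE_NS = "http://example.org/invoice"


def build_invoice():
    """Build a small namespaced document with DOM Level 2 Core methods."""
    impl = minidom.getDOMImplementation()
    doc = impl.createDocument(INVOICE_NS, "inv:invoice", None)
    root = doc.documentElement

    # createElementNS and setAttributeNS take a namespace URI plus a
    # qualified name (prefix:localName).
    item = doc.createElementNS(INVOICE_NS, "inv:item")
    item.setAttributeNS(INVOICE_NS, "inv:sku", "A-100")
    item.appendChild(doc.createTextNode("Widget"))
    root.appendChild(item)
    return doc


def item_skus(doc):
    """Look items up by namespace URI and local name, ignoring the prefix."""
    items = doc.getElementsByTagNameNS(INVOICE_NS, "item")
    return [el.getAttributeNS(INVOICE_NS, "sku") for el in items]
```

Because lookups use the namespace URI and local name, the same code keeps working if another document binds the namespace to a different prefix.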


    [November 13, 2000]   
    Comprehensive Real Estate Transaction Markup Language (CRTML).    

    The Alliance for Advanced Real Estate Transaction Technology (AARTT) recently announced an initiative "to create open standards for data exchange within the real estate industry in order to streamline the online home-buying and selling process. The initiative is called CRTML (Comprehensive Real Estate Transaction Markup Language). Member companies include: 9keys, AppraisalHub, Bowstreet, Commission Advance, Deloitte & Touche, GHR Systems, Homeadvisor Technologies Inc., Homebid, iLumin, Inciscent, InfoStream, Instanet Forms, InteliTouch, Interealty, iProperty, MarketLinx, Property I.D., Supra Products, and VISTAinfo. The mission of AARTT is to promote and coordinate data interchange standards for the Real Estate industry, based on XML, that will significantly enhance and automate all key elements of Real Estate transactions allowing forging of strong alliances between Real Estate technology providers to foster end-to-end solutions for the industry, and by doing this, to facilitate the acceleration of migrating existing industry participants core business processes towards fully integrated and streamlined Real Estate transactions. The initial objectives of AARTT are: (1) to coordinate the development of standards between the various groups -- RETS, MISMO, LegalXML, etc., (2) to promote the development of standards in areas of the industry not covered by existing initiatives, and (3) to develop interoperability standards between segments of the industry. The results of this effort, in cooperation with the segment-specific standards bodies, will be what we call a Comprehensive Real Estate Transaction Markup Language (CRTML), which adds an interoperability standard so that each of the segment XML standards can talk with one another without friction. It is AARTT's goal to incorporate current schemas wherever practical and participate in an open dialog with all recognized XML workgroups currently active in the Real Estate sector. 
At the same time, AARTT will continue forging efficiently ahead to develop CRTML which will be designed to augment and fill the gaps in existing schemas while forming the agreed upon foundation for data-interchange between all parties in the Alliance. Each Alliance partner company will agree to incorporate the CRTML standard into its products as soon as possible after release of the specification. By analyzing the core data elements that are required by all participants in order to transact real estate, CRTML will be able to significantly speed up the process of delivering on the promise of seamless data interchange, and efficient, end-to-end, single-point of data entry systems." For references and related initiatives, see "Comprehensive Real E