
Chapter 2: 'XML primer'

This chapter examines the origins of XML and dives into issues like document- vs. data-centric XML, instances, namespaces, and document type definitions. We also get some insights into the best approach for XML schemas and processing best practices. With 80+ information-packed pages and ample code samples, this is a beefy overview recommended for anyone working with XML.


Excerpted from the book "Building Web Services with Java: Making Sense of XML, SOAP, WSDL, and UDDI, 2nd Edition," ISBN 0672326418, Copyright 2004. Written permission from SAMS Publishing is required for all other uses. Copyright © 2005 SAMS. All rights reserved.

Chapter Excerpt:

Since its introduction in 1998, Extensible Markup Language (XML) has revolutionized how we think about structuring, describing, and exchanging information. The ways in which XML is used in the software industry are many and growing. Certainly for Web services the importance of XML is paramount; all key Web service technologies are based on it.

One great thing about XML is that it's constantly changing and evolving. However, this can also be its downside. New problems require new approaches and uses of XML that drive aggressive technological innovation. The net result is a maelstrom of invention, a pace of change so rapid that it leaves most people confused. To say that you're using XML is meaningless. Are you using DTDs or XML Schema and, if so, whose? How about XML Namespaces, XML Encryption, XML Signature, XPointer, XLink, XPath, XSLT, XQuery, XKMS, RDF, SOAP, WSDL, UDDI, XAML, BPEL, WSIA, WSRP, or WS-Whatever? Does your software use SAX, DOM, JAXB, JAXP, JAXM, JAXR, or JAX-RPC? It's easy to get lost, to drown in the acronym soup. You're interested in Web services (you bought this book, remember?). How much do you really need to know about XML?

The truth is pleasantly surprising. First, many XML technologies you might have heard about aren't relevant to Web services. You can safely forget half the acronyms you wish you knew more about. Second, even with relevant technologies, you need to know only a few core concepts. (The 80/20 rule doesn't disappoint.) Third, this chapter is all you need to read and understand to be able to handle the rest of the book and make the most of it.

This chapter will develop a set of examples around SkatesTown's processes for submitting POs and generating invoices. The examples cover all the technologies we've listed here.

If you're an old hand at XML who understands the XML namespace mechanism and feels at home with schema extensibility and the use of xsi:type, you should go straight to Chapter 3, "The SOAP Protocol," and dive into Web services. If you can parse and process a significant portion of the previous sentence, you should skim this chapter to get a quick refresher of some core XML technologies. And if you're someone with more limited XML experience, don't worry: by the end of this chapter, you'll be able to hold your own.

XML is here to stay. The XML industry is experiencing a boom. XML has become the de facto standard for representing structured and semistructured information in textual form. Many specifications are built on top of XML to extend its capabilities and enable its use in a broader range of scenarios. One of the most exciting areas of use for XML is Web services. The rest of this chapter will introduce the set of XML technologies and standards that are the foundation of Web services:

  • XML instances—The rules for creating syntactically correct XML documents
  • XML Schema—A standard that enables detailed validation of XML documents as well as the specification of XML datatypes
  • XML Namespaces—Definitions of the mechanisms for combining XML from multiple sources in a single document
  • XML processing—The core architecture and mechanisms for creating, parsing, and manipulating XML documents from programming languages as well as mapping Java data structures to XML

Document- Versus Data-Centric XML

Generally speaking, there are two broad application areas of XML technologies. The first relates to document-centric applications, and the second to data-centric applications. Because XML can be used in so many different ways, it's important to understand the difference between these two categories.

Document-Centric XML

The following markup is a perfect example of XML used in a document-centric manner. The content is directed toward human consumption: it's part of the FastGlide skateboard user guide. The content is semistructured. The usage rules for tags such as <B>, <I>, and <LINK> are loosely defined; they could appear just about anywhere in the document:

<H1>Skateboard Usage Requirements</H1>
<P>In order to use the <B>FastGlide</B> skateboard you have to
<ITEM>A strong pair of legs.</ITEM>
<ITEM>A reasonably long stretch of smooth road surface.</ITEM>
<ITEM>The impulse to impress others.</ITEM>
<P>If you have all of the above, you can proceed to <LINK
HREF="Chapter2.xml">Getting on the Board</LINK>.</P>

Because of its SGML origins, in the early days of its existence, XML gained rapid adoption within publishing systems as a mechanism for representing semistructured documents such as technical manuals, legal documents, and product catalogs. The content in these documents is typically meant for human consumption, although it could be processed by any number of applications before it's presented to humans. The key element of these documents is semistructured marked-up text.
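To make "semistructured marked-up text" a little more concrete, here is a minimal sketch (not taken from the book's sample code) that parses the closing paragraph of the user-guide fragment with the standard JAXP DOM API. The class name MixedContentDemo and its helper methods are hypothetical, introduced only for illustration. The sketch shows that the paragraph's children are a mix of plain text and an inline element, and that an application can flatten the markup away before presenting the text to a reader.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class for illustration; not part of the book's code.
class MixedContentDemo {

    // The closing paragraph of the user-guide fragment as one well-formed element.
    static final String PARAGRAPH =
        "<P>If you have all of the above, you can proceed to "
      + "<LINK HREF=\"Chapter2.xml\">Getting on the Board</LINK>.</P>";

    // Parses the XML and returns its root element with adjacent text nodes merged.
    static Element parseRoot(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            root.normalize();
            return root;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Lists the root's direct children: "text" for character data,
    // the tag name for child elements.
    static List<String> childKinds(String xml) {
        List<String> kinds = new ArrayList<>();
        NodeList children = parseRoot(xml).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                kinds.add("text");
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                kinds.add(child.getNodeName());
            }
        }
        return kinds;
    }

    // Drops the markup and keeps only the text a human reader would see.
    static String plainText(String xml) {
        return parseRoot(xml).getTextContent();
    }

    public static void main(String[] args) {
        System.out.println(childKinds(PARAGRAPH)); // [text, LINK, text]
        System.out.println(plainText(PARAGRAPH));
    }
}
```

Text interleaved with inline elements in this way is what makes the content document-centric: the element order matters to the reader, but no rigid record structure is implied.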

Data-Centric XML

Consider the example in Listing 2.1. It's a purchase order (PO) from the Skateboard Warehouse, a retailer of skateboards to SkatesTown. The order is for 5 backpacks, 12 skateboards, and 1,000 SkatesTown promotional stickers (this is what the stock-keeping unit [SKU] 008-PR stands for).

Listing 2.1 Purchase Order in XML

<po id="43871" submitted="2004-01-05" customerId="73852">
        <company>The Skateboard Warehouse</company>
        <street>One Warehouse Park</street>
        <street>Building 17</street>
        <company>The Skateboard Warehouse</company>
        <street>One Warehouse Park</street>
        <street>Building 17</street>
        <item sku="318-BP" quantity="5">
            <description>Skateboard backpack; five pockets</description>
        </item>
        <item sku="947-TI" quantity="12">
            <description>Street-style titanium skateboard.</description>
        </item>
        <item sku="008-PR" quantity="1000">
        </item>
</po>

By contrast, data-centric XML is used to mark up highly structured information such as the textual representation of relational data from databases, financial transaction information, and programming language data structures. Data-centric XML is typically generated by machines and is meant for machine consumption. XML's natural ability to nest and repeat markup makes it the perfect choice for representing these types of data.
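As a rough illustration of machine consumption, the following sketch reads a purchase order with the standard JAXP DOM API and adds up the ordered quantities. It is an assumption-laden example rather than the book's own code: the class PoSummary and its methods are hypothetical, and the XML string is a trimmed variant of Listing 2.1 that keeps only the po element's attributes and the three item elements (the address elements are left out).

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class for illustration; not part of the book's code.
class PoSummary {

    // Trimmed variant of Listing 2.1 (assumption: address elements omitted).
    static final String PO_XML =
        "<po id=\"43871\" submitted=\"2004-01-05\" customerId=\"73852\">"
      + "<item sku=\"318-BP\" quantity=\"5\">"
      + "<description>Skateboard backpack; five pockets</description></item>"
      + "<item sku=\"947-TI\" quantity=\"12\">"
      + "<description>Street-style titanium skateboard.</description></item>"
      + "<item sku=\"008-PR\" quantity=\"1000\"></item>"
      + "</po>";

    // Maps each SKU to its ordered quantity, in document order.
    static Map<String, Integer> quantitiesBySku(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse purchase order", e);
        }
        Map<String, Integer> quantities = new LinkedHashMap<>();
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            quantities.put(item.getAttribute("sku"),
                           Integer.parseInt(item.getAttribute("quantity")));
        }
        return quantities;
    }

    // Sums the quantity attribute over all item elements.
    static int totalQuantity(String xml) {
        int total = 0;
        for (int quantity : quantitiesBySku(xml).values()) {
            total += quantity;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(quantitiesBySku(PO_XML)); // {318-BP=5, 947-TI=12, 008-PR=1000}
        System.out.println(totalQuantity(PO_XML));   // 1017
    }
}
```

Because every item carries the same attributes in the same positions, the program needs no knowledge of prose or presentation. That regularity is what separates this document from the user-guide fragment shown earlier.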
