I would like to understand the purpose of targetNamespace as used both in XML Schema and in WSDL. In fact, to keep things simple, let's limit this question to XML Schema.

I feel like I fully understand the notion of (simple) XML namespaces. By convention we use URI/URLs, but we could use any string, which we then assign to a prefix for reuse by XML nodes and attributes, or use simply as the default namespace for the scope at hand. So far, so good?
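
As a quick check of that understanding, here is a minimal Python sketch using only the standard library's xml.etree.ElementTree. It assumes three throwaway documents (the "folks" prefix is made up for illustration) and shows that the parser reports the same expanded name whichever prefix, or default declaration, carries the namespace.

```python
import xml.etree.ElementTree as ET

# Two documents that bind different prefixes to the same namespace name.
doc_a = '<p:Person xmlns:p="http://contoso.com/schemas/People"/>'
doc_b = '<folks:Person xmlns:folks="http://contoso.com/schemas/People"/>'
# A third document that declares the same namespace as the default one.
doc_c = '<Person xmlns="http://contoso.com/schemas/People"/>'

# ElementTree exposes each element name as "{namespace}local"; the prefix is gone.
tags = [ET.fromstring(doc).tag for doc in (doc_a, doc_b, doc_c)]
same_name = len(set(tags)) == 1

print(tags[0])
print(same_name)
```

The prefix is only a local alias; the namespace name itself is what identifies the element.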

Enter XML Schema. For some reason the inventors of XML Schema felt the notion of simple namespaces wasn't enough and they had to introduce the targetNamespace. My question is: what significant benefit does a targetNamespace introduce that couldn't be provided by a normal XML namespace? If an XML document references an xsd document, either by schemaLocation or with an import statement, I give the path to the actual xsd document being referenced in either case. This is what uniquely identifies the Schema I want to refer to. If, in addition, I want to bind this Schema to a particular namespace in my referencing document, why should I be obliged to replicate the precise targetNamespace already defined in the XML Schema I am referencing? Why couldn't I simply define this namespace however I want within the XML document that uses it to refer to that particular XML Schema document?

Update:

To give an example, if I have the following in an XML instance document:

<p:Person
   xmlns:p="http://contoso.com/schemas/People"
   xmlns:v="http://contoso.com/schemas/Vehicles"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation=
    "http://contoso.com/schemas/Vehicles
     http://contoso.com/schemas/vehicles.xsd
     http://contoso.com/schemas/People
     http://contoso.com/schemas/people.xsd">
   <name>John</name>
   <age>28</age>
   <height>59</height>
   <v:Vehicle>
      <color>Red</color>
      <wheels>4</wheels>
      <seats>2</seats>
   </v:Vehicle>
</p:Person>

Why does e.g. the people.xsd Schema need to define a targetNamespace of "http://contoso.com/schemas/People"? Why do we need the targetNamespace definition in the xsd document at all? It seems to me that everything you gain from the namespace part of the schemaLocation is already contained in the XML instance document. What is the benefit of requiring a targetNamespace with the same value in the xsd document?

Follow-up question to Paul's answer:

Can you give me a concrete example where such "clashes" between xsd element names become apparent and that would explain the need for targetNamespace?


Ok, here's an attempt to answer my own question. Let me know if it seems coherent to you. Looking at the examples on the page linked by Paul helped me.

If we take the XML instance example in the original question above, we have two references to the definition of the vehicle element. One is explicit and visible in the XML instance document itself, but we must also imagine that the people.xsd XML Schema references the same vehicle definition again as an allowed child element of person. If we used normal namespaces, where each document was allowed to define its own namespace for vehicle, how would we know that the XML instance is referencing the same XML Schema definition for vehicle as people.xsd is? The only way is to enforce a concept of namespace that is stricter than the original simple one and that must be written exactly the same way across multiple documents.

If I weren't writing this on a tablet I would provide a code example, but here I will just attempt to describe the example I have in mind.

Imagine that we have two different XML Schema definitions for a vehicle element. location1/vehicles.xsd would contain the definition that validates the example from the question of this post (containing color, wheels, and seats child elements), whereas location2/vehicles.xsd would contain an entirely different definition for a vehicle element (say, with child elements year, model, and volume). Now, if the XML instance document refers to the location1 Schema, as is the case in the example above, but people.xsd says that the person element can contain a vehicle child element of the type defined in the location2 Schema, then without the notion of a targetNamespace the XML instance would validate, even though it clearly doesn't have the right kind of vehicle as a child element of its person element.

Target namespaces then help us make sure that if two different documents reference the same third XML Schema, they are both indeed referencing the same Schema and not just a Schema that contains elements that are similar, but not identical, to one another...
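
For what it's worth, the lookup described above can be sketched in a few lines of Python (standard library only). This is an illustration under assumptions, not how any particular validator is implemented: the two schema strings stand in for location1/vehicles.xsd and location2/vehicles.xsd, the second namespace URI is invented, and declared_elements is a hypothetical helper. Because each declaration is filed under its target namespace plus local name, an instance element and a schema reference that use the same namespace necessarily resolve to the same declaration.

```python
import xml.etree.ElementTree as ET

XSD_ELEMENT = "{http://www.w3.org/2001/XMLSchema}element"

# Stand-in for location1/vehicles.xsd (color, wheels, seats).
VEHICLES_1 = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://contoso.com/schemas/Vehicles">
  <xs:element name="Vehicle">
    <xs:complexType><xs:sequence>
      <xs:element name="color" type="xs:string"/>
      <xs:element name="wheels" type="xs:integer"/>
      <xs:element name="seats" type="xs:integer"/>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>"""

# Stand-in for location2/vehicles.xsd; the namespace URI is an assumption.
VEHICLES_2 = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://example.org/other/Vehicles">
  <xs:element name="Vehicle">
    <xs:complexType><xs:sequence>
      <xs:element name="year" type="xs:integer"/>
      <xs:element name="model" type="xs:string"/>
      <xs:element name="volume" type="xs:integer"/>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>"""


def declared_elements(schema_text):
    """Map (targetNamespace, name) of each top-level element to its child names."""
    root = ET.fromstring(schema_text)
    tns = root.get("targetNamespace", "")
    result = {}
    for decl in root.findall(XSD_ELEMENT):
        # iter() also yields decl itself, so skip it and keep only nested elements.
        children = [e.get("name") for e in decl.iter(XSD_ELEMENT) if e is not decl]
        result[(tns, decl.get("name"))] = children
    return result


registry = {}
for text in (VEHICLES_1, VEHICLES_2):
    registry.update(declared_elements(text))

# An instance element is looked up by its namespace and local name, not its prefix.
vehicle = ET.fromstring('<v:Vehicle xmlns:v="http://contoso.com/schemas/Vehicles"/>')
ns, local = vehicle.tag[1:].split("}", 1)
print(registry[(ns, local)])
```

Both Vehicle declarations coexist in the registry without clashing, and the instance element can only ever match the one whose target namespace it names.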

Does that make any sense?

4 Answers

18
votes

You seem to be on the right track. I'll make a few points here that might help.

  • Within an instance document, you use XML namespaces to identify the namespace that an element or attribute is in.
  • Within a schema document, you declare elements and attributes that will appear in instances. What namespace are they declared to be in? This is what targetNamespace is for.
  • The schema document location and the namespace are not the same thing. It is quite common to have multiple .xsd documents with the same targetNamespace. (They may or may not include each other, but typically will.)
  • Instance documents do not always have an xsi:schemaLocation attribute to tell parsers where to locate the schemas. Various methods may be used to tell a parser where to locate relevant schema documents. An XSD may be located on local disk or at some web address, and this should not affect the namespace of the elements in it.
    • xsi:schemaLocation is a hint. Parsers may locate the schema for the given namespace elsewhere, which implies that they must be able to know what namespace a schema is for.
    • Tools, such as databinding tools, will precompile schemas and produce code that recognizes valid documents. These must be able to know the namespaces of the declared elements.
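
To make the last few points concrete, here is a small Python sketch (standard library only) of a catalog that indexes schema documents by their targetNamespace rather than by where they are stored. The locations and file names are hypothetical, and real parsers have their own resolution mechanisms; the point is only that the namespace has to be readable from the schema document itself.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical schema documents, keyed by wherever they happen to live.
SCHEMA_DOCS = {
    "file:///opt/schemas/people-core.xsd":
        '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
        'targetNamespace="http://contoso.com/schemas/People"/>',
    "http://contoso.com/schemas/people-extra.xsd":
        '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
        'targetNamespace="http://contoso.com/schemas/People"/>',
    "vehicles.xsd":
        '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
        'targetNamespace="http://contoso.com/schemas/Vehicles"/>',
}


def build_catalog(docs):
    """Group schema document locations by the targetNamespace each one declares."""
    catalog = defaultdict(list)
    for location, text in docs.items():
        tns = ET.fromstring(text).get("targetNamespace")
        catalog[tns].append(location)
    return dict(catalog)


catalog = build_catalog(SCHEMA_DOCS)
for namespace, locations in catalog.items():
    print(namespace, locations)
```

Two documents at unrelated locations end up under the same namespace, which would not be possible to determine if the namespace were assigned only by the instance document.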

I think what you were assuming is that the instance document could specify the namespace of the elements and attributes declared in some schema document, using xsi:schemaLocation. That doesn't work. For one thing, the parser may locate other schema documents than those listed, and it needs to know what namespace they are for. For another, it would make reasoning about schemas difficult or impossible: you wouldn't be able to look at a schema and know the namespaces that everything belonged in because that decision would be postponed until an instance was written.

14
votes

Q: "By convention we use URI/URLs, but we could use any string, which we then assign to a prefix for reuse by XML nodes and attributes, or use simply as the default namespace for the scope at hand."

A: Yes, exactly.

Q: "For some reason the inventors of XML Schema felt the notion of simple namespaces wasn't enough and they had to introduce the targetNamespace."

A: http://www.liquid-technologies.com/Tutorials/XmlSchemas/XsdTutorial_04.aspx

Breaking schemas into multiple files can have several advantages. You can create re-usable definitions that can be used across several projects. They make definitions easier to read and version as they break down the schema into smaller units that are simpler to manage.

...

This all works fine without namespaces, but if different teams start working on different files, then you have the possibility of name clashes, and it would not always be obvious where a definition had come from. The solution is to place the definitions for each schema file within a distinct namespace.

Clarification:

  • The primary purpose of XML Schemas is to declare "vocabularies".

  • These vocabularies can be identified by a namespace that is specified in the targetNamespace attribute.

  • The Schema (an XML document) can have a "namespace". The "vocabulary" the document describes can have a "targetNamespace".

  • Just as XML Schemas provide a higher level of abstraction than SGML DTDs (the original architects of XML thought DTDs were sufficient), XML Schema "targetNamespaces" provide a level of abstraction over "simple namespaces".

Hope that helps.

12
votes

I think it helps to look at both the instance document and the schema document at the same time to understand what targetNamespace does. Consider this (based on your instance document):

<p:Person
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:p="http://localhost:8080/scribble/xml/Person"
        xmlns:v="http://localhost:8080/scribble/xml/Vehicle"
        xsi:schemaLocation="
            http://localhost:8080/scribble/xml/Person
            http://localhost:8080/scribble/xml/person.xsd">
    <name>John</name>
    <age>28</age>
    <height>59</height>
    <v:Vehicle>
        <color>Red</color>
        <wheels>4</wheels>
        <seats>2</seats>
    </v:Vehicle>
</p:Person>

There's no default namespace specified for the document, but p:* and v:* are aliased to specific NS URIs. Now take a look at the schema document itself:

<?xml version="1.0" encoding="UTF-8"?>
<schema
    xmlns="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://localhost:8080/scribble/xml/Person"
    elementFormDefault="qualified"
    xmlns:v="http://localhost:8080/scribble/xml/Vehicle">

    <import
        namespace="http://localhost:8080/scribble/xml/Vehicle"
        schemaLocation="http://localhost:8080/scribble/xml/v.xsd"/>

    <element name="Person">
        <complexType>
            <sequence>
                <element name="name" form="unqualified" type="NCName"/>
                <element name="age" form="unqualified" type="integer"/>
                <element name="height" form="unqualified" type="integer"/>
                <element ref="v:Vehicle"/>
            </sequence>
        </complexType>
    </element>

</schema>

and

<?xml version="1.0" encoding="UTF-8"?>
<schema
    xmlns="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://localhost:8080/scribble/xml/Vehicle"
    elementFormDefault="qualified">

    <element name="Vehicle">
        <complexType>
            <sequence>
                <element name="color" form="unqualified" type="NCName"/>
                <element name="wheels" form="unqualified" type="integer"/>
                <element name="seats" form="unqualified" type="integer"/>
            </sequence>
        </complexType>
    </element>
</schema>

If you look at the attributes on the tags, the default namespace is "http://www.w3.org/2001/XMLSchema" for both the schema documents... but the targetNamespace is the one used as the aliased namespace in the instance document.

targetNamespace is the expected namespace of the instances regardless of the namespace of the schema documents and any other namespace specified in the instance document.
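
If it helps, the distinction can be checked mechanically. The following is a rough Python sketch (standard library only) that uses a trimmed-down copy of the Vehicle schema above; split_tag is a hypothetical helper, and no actual validation is performed. It only compares three namespace names: the one the schema document is written in, the one it targets, and the one the instance element carries.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the Vehicle schema shown above.
SCHEMA = """<schema xmlns="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://localhost:8080/scribble/xml/Vehicle"
    elementFormDefault="qualified">
  <element name="Vehicle"/>
</schema>"""

INSTANCE = '<v:Vehicle xmlns:v="http://localhost:8080/scribble/xml/Vehicle"/>'


def split_tag(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


schema_root = ET.fromstring(SCHEMA)
instance_root = ET.fromstring(INSTANCE)

# Namespace of the schema document's own markup (the XSD vocabulary).
schema_doc_ns, _ = split_tag(schema_root.tag)
# Namespace the schema declares its components to be in.
target_ns = schema_root.get("targetNamespace")
# Namespace the instance element actually carries.
instance_ns, instance_local = split_tag(instance_root.tag)

print(schema_doc_ns)
print(target_ns == instance_ns)
```

The schema document itself lives in the XML Schema namespace, while the instance element matches the targetNamespace.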

I find it kind of helpful to think of it like hosting a party where you have a guest list and guests wearing name-tags. Think of the targetNamespace in the schema documents like the names on the guest list. The xmlns, aliased or not, in the instance documents is like the name-tags on the guests. As long as you have the guest list (which miraculously includes a photocopy of their state-issued ID), whenever you encounter someone you can validate their identity. If you come across someone wearing a name-tag that doesn't match the attached parameters, you can freak out (i.e. throw an error).

With the schema/instances, you have:

Schemas:

targetNamespace="http://localhost:8080/scribble/xml/Person"
targetNamespace="http://localhost:8080/scribble/xml/Vehicle"

Instance:

xmlns:p="http://localhost:8080/scribble/xml/Person"
xmlns:v="http://localhost:8080/scribble/xml/Vehicle"

Or... any guest nicknamed "v" that you encounter anywhere in the party (barring special rules that say otherwise), on any floor of the house, in the backyard, or in the pool, had better match the description for a guest on the guest list named http://localhost:8080/scribble/xml/Vehicle, or they're an intruder.

Those special rules may say something like: V can only hang out if they're immediately next to P, or P can only hang out if V is present. In this case, P can only hang out when V is there, but V can go pretty much anywhere they want without P being there.

This way a schema can be incredibly flexible, defining pretty much any data structure desired and tracking what goes where just by matching the namespace (default or prefixed) of any given element back to the targetNamespace and its associated schema.

4
votes

It's not clear to me exactly what you are asking. Clearly a schema can contain definitions of components in many different namespaces, and there has to be some way of saying "This is a declaration of element E in namespace N". The designers of XSD chose to design the language so that all the declarations in one schema document belong to the same namespace, called the target namespace of the module. It could have been packaged differently, but the difference would be very superficial. What exactly do you think is wrong with the decision to align modules with namespaces?