3
votes

I do not understand how an XML validator (a "schema-aware processor", as the W3C calls it) knows where to find the schema in a typical external reference to an XSD from within an XML document.

Here's a typical declaration:

<root xmlns="www.example.org"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="www.example.org" "http://example.org/schemas/schema1.xsd">
  <foo>some data</foo>
</root>

  1. We declare the default namespace for the root element and all its children to be "www.example.org".

  2. We bind the prefix "xsi" to the namespace "http://www.w3.org/2001/XMLSchema-instance".

  3. If I understand correctly (which is evidently not the case!), it is the information in the actual resource that the xsi namespace refers to that tells the validator that schemaLocation (on the following line) is a legitimate attribute of the xsi ("http://www.w3.org/2001/XMLSchema-instance") namespace.

But a namespace name is an identifier, not a location (URL), so how does the parser know where to go to determine whether schemaLocation is in fact an attribute defined in the "http://www.w3.org/2001/XMLSchema-instance" namespace?

2
For "w3.org/2001/XMLSchema-instance" the answer is simple: namespace == schemaLocation :), so in the case of xsi you are not right! (See stackoverflow.com/q/17094247/592355.) You can also verify this by downloading the document from that URL. - xerx593
...and your syntax is invalid: it should be xsi:schemaLocation="www.example.org http://example.org/schemas/schema1.xsd", not xsi:schemaLocation="www.example.org" "http://example.org/schemas/schema1.xsd". - xerx593
Also valid: xsi:schemaLocation="www.example.org http://example.org/schemas/schema1.xsd http://www.w3.org/2001/XMLSchema-instance http://www.w3.org/2001/XMLSchema-instance" ;) - xerx593

2 Answers

0
votes

The validator has the schema for that namespace built in. Section 2.7 of the XML Schema Definition spec, "Schema-Related Markup in Documents Being Validated", says:

XML Schema Definition Language: Structures defines several attributes for direct use in any XML documents. These attributes are in the schema instance namespace (http://www.w3.org/2001/XMLSchema-instance) described in The Schema Instance Namespace (xsi) (§1.3.1.2) above. All schema processors must have appropriate attribute declarations for these attributes built in, see Attribute Declaration for the 'type' attribute (§3.2.7.1), Attribute Declaration for the 'nil' attribute (§3.2.7.2), Attribute Declaration for the 'schemaLocation' attribute (§3.2.7.3) and Attribute Declaration for the 'noNamespaceSchemaLocation' attribute (§3.2.7.4).

0
votes

how does the parser know where to go to determine whether schemaLocation is in fact an attribute defined in the "http://www.w3.org/2001/XMLSchema-instance" namespace?

The attribute is written using the name xsi:schemaLocation, and there is a namespace declaration that binds the prefix xsi to the URI http://www.w3.org/2001/XMLSchema-instance, so the XML parser knows that the expanded name of the attribute is (in Clark notation) {http://www.w3.org/2001/XMLSchema-instance}schemaLocation. This doesn't require any schema knowledge or any reference to an external resource.
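
To illustrate that name expansion needs no schema and no network access, here is a minimal sketch using Python's standard-library ElementTree parser, which happens to report names in Clark notation. The document is the one from the question, with the schemaLocation value written as a single attribute value; the variable names are just for illustration.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# The question's document, with schemaLocation as one quoted attribute value.
DOC = """<root xmlns="www.example.org"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="www.example.org http://example.org/schemas/schema1.xsd">
  <foo>some data</foo>
</root>"""

# Parsing happens entirely in memory; nothing is fetched from either URI.
root = ET.fromstring(DOC)

# The prefix "xsi" is resolved through the xmlns:xsi declaration alone.
expanded_name = "{%s}schemaLocation" % XSI_NS

print(root.tag)                      # element name in Clark notation
print(list(root.attrib))             # attribute names in Clark notation
print(expanded_name in root.attrib)  # True
```

The parser here is not schema-aware at all, yet it still arrives at the same expanded attribute name that a validator would look for.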

The knowledge of the permitted content of attributes in this namespace, where these attributes may appear, and what they mean is built into every schema processor.

Having found an attribute with the expanded name {http://www.w3.org/2001/XMLSchema-instance}schemaLocation, the schema validator therefore knows that its content should be a sequence of namespace/location URI pairs. This is something that schema validators just know; they don't need to refer to a schema to find it out. The validator therefore knows that the schema for namespace www.example.org can be found at http://example.org/schemas/schema1.xsd, and it can go and fetch the schema from that location.
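
As a rough sketch of that last step, the attribute value can be split on whitespace and read off in pairs. The function name parse_schema_location is hypothetical, not part of any library, and a real validator does more than this (for example, it may ignore the location and use a schema it already has).

```python
def parse_schema_location(value):
    """Split an xsi:schemaLocation value into (namespace, location) pairs."""
    tokens = value.split()  # the value is a whitespace-separated list of URIs
    if len(tokens) % 2 != 0:
        raise ValueError("schemaLocation must contain an even number of URIs")
    # Even positions are namespace names, odd positions are schema locations.
    return list(zip(tokens[0::2], tokens[1::2]))


pairs = parse_schema_location(
    "www.example.org http://example.org/schemas/schema1.xsd"
)
for namespace, location in pairs:
    print("schema for", namespace, "is at", location)
```

This also shows why the form in the question, with two separately quoted strings, cannot work: the pairs have to live inside one attribute value.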