8
votes

I have an application that has to load an XML document and output nodes selected by an XPath expression.

Suppose I start with a document like this:

<aaa>
  ...[many nodes here]...
  <bbb>text</bbb>
  ...[many nodes here]...
  <bbb>text</bbb>
  ...[many nodes here]...
</aaa>

With XPath //bbb

So far everything is nice.

And the selection doc.SelectNodes("//bbb"); returns the list of required nodes.

Then someone uploads a document with one node like <myfancynamespace:foo/> and extra namespace in the root tag, and everything breaks.

Why? //bbb does not give a damn about myfancynamespace, theoretically it should even be good with //myfancynamespace:foo, as there is no ambiguity, but the expression returns 0 results and that's it.

Is there a workaround for this behavior?

I do have a namespace manager for the document, and I am passing it to the XPath query. But the namespaces and the prefixes are unknown to me, so I can't add them before the query.

Do I have to pre-parse the document to fill the namespace manager before I do any selections? Why on earth does it behave like this? It just doesn't make sense.

EDIT:

I'm using: XmlDocument and XmlNamespaceManager

EDIT2:

XmlDocument doc = new XmlDocument();
doc.XmlResolver = null;
XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
//I wish I could:
//nsmgr.AddNamespace("magic", "http://magicnamespaceuri/");
//...
doc.LoadXml(usersuppliedxml);
XmlNodeList nodes = doc.SelectNodes(usersuppliedxpath, nsmgr); //usersuppliedxpath -> "//bbb"

//nodes.Count should be > 0, but with a namespaced document it is 0

EDIT3: Found an article that describes the actual scenario of the issue, with one workaround, though not a very pretty one: http://codeclimber.net.nz/archive/2008/01/09/How-to-query-a-XPath-doc-that-has-a-default.aspx

Almost seems that stripping the xmlns is the way to go...

4
Could you add the pertinent bits of code? (Instantiating XmlDocument, XPath, etc.) - Jeff Swensen
Ok, edited the post, see Edit2. - Coder
@Coder: You are saying that an unexpected input results in unexpected output for a given process. That's the use case for validation. - user357812
"and extra namespace in the root tag" - this is almost gibberish. I suppose you mean there is an extra namespace declaration in the start tag of the outermost element. Is it a default namespace declaration (xmlns="...")? or is it a declaration for myfancynamespace (xmlns:myfancynamespace="...")? Only the former would affect the namespace of <bbb>. You haven't shown us what the input XML looks like, nor described it clearly, which makes it hard to guess what the problem is. - LarsH
When I said 'You haven't shown us what the input XML looks like' I meant the one that caused the problem. - LarsH

4 Answers

13
votes

You're missing the whole point of XML namespaces.

But if you really need to perform XPath on documents that will use an unknown namespace, and you really don't care about it, you will need to strip it out and reload the document. XPath will not work in a namespace-agnostic way, unless you want to use the local-name() function at every point in your selectors.

private XmlDocument StripNamespace(XmlDocument doc)
{
    if (doc.DocumentElement.NamespaceURI.Length > 0)
    {
        doc.DocumentElement.SetAttribute("xmlns", "");
        // must serialize and reload for this to take effect
        XmlDocument newDoc = new XmlDocument();
        newDoc.LoadXml(doc.OuterXml);
        return newDoc;
    }
    else
    {
        return doc;
    }
}
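
To make the namespace point concrete, here is a minimal sketch of what probably happens with the failing document, assuming the uploaded file declares a default namespace on its root element. The namespace URI, the prefix "d", and the class name are made up for illustration and do not come from the question.

```csharp
using System;
using System.Xml;

class NamespaceDemo
{
    static void Main()
    {
        // Hypothetical input: the default namespace URI is invented for this example.
        string xml = "<aaa xmlns=\"http://example.com/ns\"><bbb>text</bbb><bbb>text</bbb></aaa>";

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // In XPath 1.0 an unprefixed name matches only elements in no namespace,
        // so this finds nothing: every <bbb> here is in http://example.com/ns.
        Console.WriteLine(doc.SelectNodes("//bbb").Count); // prints 0

        // Bind the root element's namespace to a prefix chosen for the query.
        // The prefix "d" is arbitrary; only the URI has to match.
        XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("d", doc.DocumentElement.NamespaceURI);
        Console.WriteLine(doc.SelectNodes("//d:bbb", nsmgr).Count); // prints 2
    }
}
```

The catch for user-supplied expressions is that the prefix has to appear in the XPath itself, which is why an unmodified //bbb cannot be made to match just by filling the namespace manager.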
6
votes

<myfancynamespace:foo/> is not necessarily the same as <foo/>.

Namespaces do matter. But I can understand your frustration, as they tend to break code because various implementations (C#, Java, ...) output them differently.

I suggest you change your XPath to allow for accepting all namespaces. For example instead of

//bbb 

Define it as

//*[local-name()='bbb']

That should take care of it.
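
A minimal sketch of that expression in use with XmlDocument follows. The sample document, its namespace URIs, and the prefix f are invented for illustration.

```csharp
using System;
using System.Xml;

class LocalNameDemo
{
    static void Main()
    {
        // Hypothetical input with a default namespace and one prefixed element.
        string xml =
            "<aaa xmlns=\"http://example.com/ns\" xmlns:f=\"http://example.com/fancy\">" +
            "<bbb>text</bbb><f:foo/><bbb>text</bbb></aaa>";

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // The expression contains no prefixes, so no XmlNamespaceManager is needed.
        XmlNodeList bbbNodes = doc.SelectNodes("//*[local-name()='bbb']");
        Console.WriteLine(bbbNodes.Count); // prints 2

        // The same pattern finds the prefixed element by its local name.
        XmlNodeList fooNodes = doc.SelectNodes("//*[local-name()='foo']");
        Console.WriteLine(fooNodes.Count); // prints 1
    }
}
```

Note that this matches bbb in any namespace, so two unrelated vocabularies that both define a bbb element would be mixed together in the result.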

0
votes

You should describe in a bit more detail what you want to do. The way you ask your question, it makes no sense at all. The namespace is just a part of the name. Nothing more, nothing less. So your question is the same as asking for an XPath query to get all tags ending with "x". That's not the idea behind XML, but if you have strange reasons to do so: Feel free to iterate over all nodes and implement it yourself. The same applies to the functionality you are requesting.

0
votes

You could use the LINQ to XML classes like XDocument. They greatly simplify working with namespaces.
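
As a rough sketch of what that could look like, assuming the goal is still to find every bbb element regardless of namespace (the sample document and its namespace URI are made up):

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class LinqDemo
{
    static void Main()
    {
        // Hypothetical input: the default namespace URI is invented for this example.
        string xml = "<aaa xmlns=\"http://example.com/ns\"><bbb>text</bbb><bbb>text</bbb></aaa>";
        XDocument doc = XDocument.Parse(xml);

        // Namespace-agnostic: compare only the local part of each element name.
        int byLocalName = doc.Descendants().Count(e => e.Name.LocalName == "bbb");
        Console.WriteLine(byLocalName); // prints 2

        // Namespace-aware: reuse whatever namespace the root element is in.
        XNamespace ns = doc.Root.Name.Namespace;
        int byQualifiedName = doc.Descendants(ns + "bbb").Count();
        Console.WriteLine(byQualifiedName); // prints 2
    }
}
```

This replaces the XPath string with LINQ queries, so it only helps if the selection does not have to come from a user-supplied XPath expression.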