3
votes

I have the following XML that I want to transform with XSLT. My main purpose is to have a list of urls only. It means any line that contains "http:// ".

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<urlset xmlns="http://www.google.com" xmlns:image="http://www.google1.com" xmlns:video="http://www.google2.com" xmlns:xhtml="http://www.google3.com">
  <url>
    <loc id="837">http://url1</loc>
  </url>
  <url>
    <loc id="2332">http://url2</loc>
    <image:image>
      <image:loc>http://url3</image:loc>
    </image:image>
    <image:image>
      <image:loc>http://url4</image:loc>
    </image:image>    
  </url>
</urlset>

I created the following XSLT:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml"
  doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
  doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>

    <xsl:template match="/">
        <html>
            <body>
                <h1>URLS</h1>
               <ul>                    
                 <xsl:for-each select="urlset/url">
                    <li>
                        <xsl:value-of select="loc"/>
                    </li>
                 </xsl:for-each>
                  <xsl:for-each select="urlset/url/image:image">
                    <li>
                        <xsl:value-of select="image:loc"/>
                    </li>
                 </xsl:for-each>
               </ul>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet> 

The first for-each does not return anything, and the second for-each throws an exception like:

SystemId Unknown; Line #15; Column #53; Prefix must resolve to a namespace: image

Could anybody explain why this XSLT fails?

1. Does your XML really contain 4 different namespace declarations with the same URI http://www.google.com? -- 2. "My main purpose is to have a list of urls only. It means any line that contains "http:// "." I don't see any lines containing "http:// " in your example. - michael.hor257k
1. Yes, I see that it is not necessary; I can do as kjhughes suggested. 2. I used plain text to stand in for some URLs. Imagine URLs starting with the http prefix as the tag values where I put url1, url2, and so on. - shamaleyte
Well, it's kind of confusing. If the input is using different URIs for different prefixes (as it well should), then image:loc will retrieve only some of the locations. - michael.hor257k
If you use a namespace prefix inside any XML, you must declare that namespace first. - Michael Pacheco

2 Answers

6
votes

To resolve the immediate error, add a namespace declaration for the image prefix to your stylesheet, bound to the same URI that the input XML uses for that prefix:

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:image="http://www.google.com">

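There are a number of other adjustments that will be required once you eliminate that error. In particular, urlset, url, and loc are in the input's default namespace, and an unprefixed name in an XPath 1.0 expression only matches elements in no namespace, which is why your first for-each selects nothing; those names need a prefix bound to the default namespace URI. It also makes no sense to define so many namespace prefixes in your XML for the same (dubious) namespace.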

Something like the following input XML would be more reasonable:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<urlset xmlns="http://www.example.com/urlset"
        xmlns:image="http://www.example.com/image">
  <url>
    <loc id="837">url1</loc>
  </url>
  <url>
    <loc id="2332">url2</loc>
    <image:image>
      <image:loc>url3</image:loc>
    </image:image>
    <image:image>
      <image:loc>url4</image:loc>
    </image:image>    
  </url>
</urlset>

Then, the following XSLT,

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
                xmlns:u="http://www.example.com/urlset"
                xmlns:image="http://www.example.com/image"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/u:urlset">
    <html>
      <body>
        <h1>URLS</h1>
        <ul>                    
          <xsl:for-each select="u:url">
            <li>
              <xsl:value-of select="u:loc"/>
            </li>
          </xsl:for-each>
          <xsl:for-each select="u:url/image:image">
            <li>
              <xsl:value-of select="image:loc"/>
            </li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet> 

will yield the URL list you seek:

<html xmlns:u="http://www.example.com/urlset" xmlns:image="http://www.example.com/image">
   <body>
      <h1>URLS</h1>
      <ul>
         <li>url1</li>
         <li>url2</li>
         <li>url3</li>
         <li>url4</li>
      </ul>
   </body>
</html>

You also might consider using a template-match-based organization of your XSLT rather than a loop-based organization as a matter of style. For a small example such as this, there's little difference in complexity, but for a larger problem, match-based organization is cleaner and clearer.
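
As a rough illustration of the match-based organization mentioned above, here is a minimal sketch written against the example.com input shown earlier in this answer. The `exclude-result-prefixes` attribute is an addition that is not in the loop-based stylesheet; it is assumed here only to keep the `u` and `image` declarations off the output `html` element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:u="http://www.example.com/urlset"
                xmlns:image="http://www.example.com/image"
                exclude-result-prefixes="u image">

  <!-- Build the HTML shell once, then hand the loc elements to the rule below. -->
  <xsl:template match="/u:urlset">
    <html>
      <body>
        <h1>URLS</h1>
        <ul>
          <!-- Same order as the two loops: page locations first, then image locations. -->
          <xsl:apply-templates select="u:url/u:loc"/>
          <xsl:apply-templates select="u:url/image:image/image:loc"/>
        </ul>
      </body>
    </html>
  </xsl:template>

  <!-- One rule covers both kinds of loc element. -->
  <xsl:template match="u:loc | image:loc">
    <li>
      <xsl:value-of select="."/>
    </li>
  </xsl:template>
</xsl:stylesheet>
```

For the input above this should list the same four entries as the loop-based version. The benefit shows up when more element kinds need their own formatting: each one gets its own template instead of another nested loop.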

2
votes

It seems that your XML can have three kinds of loc elements, each in a different namespace. To get all of them, you need to do something like this:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:g="http://www.google.com" 
xmlns:image="http://www.google1.com" 
xmlns:video="http://www.google2.com"
exclude-result-prefixes="g image video">

<xsl:template match="/g:urlset">
    <html>
        <body>
            <h1>URLS</h1>
            <ul>                    
                <xsl:for-each select="g:url/g:loc | g:url/image:image/image:loc | g:url/video:video/video:loc">
                    <li>
                        <xsl:value-of select="."/>
                    </li>
                 </xsl:for-each>
            </ul>
        </body>
    </html>
</xsl:template>

</xsl:stylesheet>

Test input

<urlset xmlns="http://www.google.com" xmlns:image="http://www.google1.com" xmlns:video="http://www.google2.com">
  <url>
    <loc id="123">http://urll</loc>
  </url>
  <url>
    <loc id="456">http://url2</loc>
    <image:image>
      <image:loc>http://url3</image:loc>
    </image:image>
  </url>
  <url>
    <loc id="789">http://url4</loc>
    <video:video>
      <video:loc>http://url5</video:loc>
    </video:video>
  </url>
</urlset>

Result

<html>
   <body>
      <h1>URLS</h1>
      <ul>
         <li>http://urll</li>
         <li>http://url2</li>
         <li>http://url3</li>
         <li>http://url4</li>
         <li>http://url5</li>
      </ul>
   </body>
</html>

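An easier but less efficient alternative, which scans every element in the document and picks up any loc element regardless of its namespace: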

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
    <html>
        <body>
            <h1>URLS</h1>
            <ul>                    
                <xsl:for-each select="//*[local-name()='loc']">
                    <li>
                        <xsl:value-of select="."/>
                    </li>
                 </xsl:for-each>
            </ul>
        </body>
    </html>
</xsl:template>

</xsl:stylesheet>
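
If the goal is literally "any value that looks like a URL", regardless of element name or namespace, a variation on the same idea is to filter on the text instead of the name. This is only a sketch under two assumptions that the question does not confirm: every URL is the entire text of a leaf element, and it starts with `http://` (so `https://` values and URLs stored in attributes would be missed).

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <html>
      <body>
        <h1>URLS</h1>
        <ul>
          <!-- not(*) keeps only leaf elements, so a parent whose
               concatenated text happens to start with a URL is not
               listed a second time. -->
          <xsl:for-each select="//*[not(*)][starts-with(normalize-space(.), 'http://')]">
            <li>
              <xsl:value-of select="normalize-space(.)"/>
            </li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Like the local-name() version, this needs no namespace declarations in the stylesheet, at the cost of visiting every element in the document.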