0
votes

For the (contrived) HTML example below:

<div>
  <p>lorem <a href="lorem.html" target="_blank">ipsum</a></p>
  <a href="foo.html" target="top">foo</a>
  <p><img src="foo.jpg" class="bar"/></p>
  <img src="bar.jpg" class="bar"/>
</div>

I'm trying to write an XSLT 1.0 transform which:

  • whitelists top-level <p>
  • whitelists href attribute for <a>
  • whitelists src attribute for <img>
  • wraps top-level <a> and <img> in <p>...</p>

Ideally this would be done in a way that allows adding more elements and attributes.

Expected output:

<div>
  <p>lorem <a href="lorem.html">ipsum</a></p>
  <p><a href="foo.html">foo</a></p>
  <p><img src="foo.jpg"/></p>
  <p><img src="bar.jpg"/></p>
</div>

The following XSLT 2.0 works thanks to <xsl:next-match>:

Fiddle: https://xsltfiddle.liberty-development.net/6r5Gh3p

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/div">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- whitelist <p> as top-level element -->

  <xsl:template match="/div/p">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- coerce top-level <img> and <a> to be children of <p> -->

  <xsl:template match="/div/img|/div/a">
    <p><xsl:next-match/></p>
  </xsl:template>

  <!-- whitelist href attribute for <a> -->

  <xsl:template match="a">
    <xsl:copy>
      <xsl:copy-of select="@href"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- whitelist src attribute for <img> -->

  <xsl:template match="img">
    <xsl:copy>
      <xsl:copy-of select="@src"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

In XSLT 1.0 there is no <xsl:next-match>, and with the template below only one template is applied per node, so <a> and <img> do get wrapped in <p> but their attributes are dropped instead of being whitelisted:

Fiddle: https://xsltfiddle.liberty-development.net/94rmq6r

  <xsl:template match="/div/img|/div/a">
    <p>
      <xsl:copy><xsl:apply-templates/></xsl:copy>
    </p>
  </xsl:template>

Output:

<div>
  <p>lorem <a href="lorem.html">ipsum</a></p>
  <p><a>foo</a></p>
  <p><img src="foo.jpg"/></p>
  <p><img/></p>
</div>

How can this be accomplished in XSLT 1.0?

2 Answers

1
votes

You could make use of xsl:import and xsl:apply-imports here.

You would start off by putting your "whitelist" templates in a separate XSLT file (call it "Whitelist.xslt"):

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- whitelist <p> as top-level element -->
  <xsl:template match="/div/p">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- whitelist href attribute for <a> -->
  <xsl:template match="a">
    <xsl:copy>
      <xsl:copy-of select="@href"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- whitelist src attribute for <img> -->
  <xsl:template match="img">
    <xsl:copy>
      <xsl:copy-of select="@src"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

Then your main XSLT could import this file and use xsl:apply-imports wherever you had used xsl:next-match:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:import href="Whitelist.xslt" />

  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/div">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- coerce top-level <img> and <a> to be children of <p> -->
  <xsl:template match="/div/img|/div/a">
    <p><xsl:apply-imports/></p>
  </xsl:template>
</xsl:stylesheet>

With an imported stylesheet, the templates inside it have lower import precedence than the ones in the main stylesheet, and import precedence is checked before priority, so a matching template in the main stylesheet will always be chosen first.
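
To illustrate that this holds regardless of priority values, here is a minimal sketch. The file names Imported.xslt and Main.xslt and the explicit priority="10" are made up for the example. Even though the imported template declares a high priority, the template in the importing stylesheet should still be selected first, because conflict resolution discards rules of lower import precedence before it looks at priority:

```xml
<!-- Imported.xslt (hypothetical file name) -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- high explicit priority, but lower import precedence -->
  <xsl:template match="a" priority="10">
    <xsl:copy>
      <xsl:copy-of select="@href"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

<!-- Main.xslt (hypothetical file name) -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:import href="Imported.xslt"/>

  <!-- default priority 0.5, but higher import precedence: chosen first -->
  <xsl:template match="/div/a">
    <!-- hands the same node on to the imported match="a" template -->
    <p><xsl:apply-imports/></p>
  </xsl:template>
</xsl:stylesheet>
```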

EDIT: As an aside, I know your example is contrived, but for this particular case you can rewrite it without xsl:next-match or xsl:apply-imports, like so:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/div|/div/p">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="/div/img|/div/a">
    <p>
      <xsl:copy>
        <xsl:apply-templates select="@*|node()" />
      </xsl:copy>
    </p>
  </xsl:template>

  <xsl:template match="a|img">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="a/@href|img/@src">
    <xsl:copy />
  </xsl:template>

  <xsl:template match="@*" />
</xsl:stylesheet>
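
Since the question asks for something that is easy to extend, here is a sketch of how the stylesheet above might grow. The <em> element and the alt attribute are assumptions added purely for illustration; they do not appear in the original input. The idea is that a new element is added to the element patterns, and a new attribute to the attribute pattern, without touching anything else:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- elements kept as-is: the root <div> and top-level <p> -->
  <xsl:template match="/div|/div/p">
    <xsl:copy><xsl:apply-templates/></xsl:copy>
  </xsl:template>

  <!-- top-level inline elements get wrapped in <p>; em is a hypothetical addition -->
  <xsl:template match="/div/img|/div/a|/div/em">
    <p>
      <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
    </p>
  </xsl:template>

  <!-- the same elements anywhere else are copied without a wrapper -->
  <xsl:template match="a|img|em">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>

  <!-- attribute whitelist; img/@alt is a hypothetical addition -->
  <xsl:template match="a/@href|img/@src|img/@alt">
    <xsl:copy/>
  </xsl:template>

  <!-- every other attribute is dropped -->
  <xsl:template match="@*"/>
</xsl:stylesheet>
```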
0
votes

I think the proper answer is that XSLT 1.0 has no exact equivalent of the XSLT 2.0 xsl:next-match instruction, which is somewhat to be expected. From the spec:

A template rule that is being used to override another template rule (see 6.4 Conflict Resolution for Template Rules) can use the xsl:apply-imports or xsl:next-match instruction to invoke the overridden template rule. The xsl:apply-imports instruction only considers template rules in imported stylesheet modules; the xsl:next-match instruction considers all other template rules of lower import precedence and/or priority. Both instructions will invoke the built-in template rule for the node (see 6.6 Built-in Template Rules) if no other template rule is found.

So the specification itself gives you the relation and the difference between the two instructions: you cannot know which template occurred last in declaration order among all those that are left after conflict resolution. It is also worth noting that importing stylesheet modules is not just a form of C-preprocessor-style inclusion. You might consider it an inheritance mechanism between transformations.
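
To make the difference concrete, here is a minimal single-module sketch (not taken from the question's fiddles) in which xsl:apply-imports stands in for xsl:next-match. Because the match="a" template lives in the same stylesheet module rather than in an imported one, xsl:apply-imports does not consider it and should fall back to the built-in element rule, so for a top-level <a> only its text content would end up inside the <p>:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes"/>

  <xsl:template match="/div">
    <xsl:copy><xsl:apply-templates/></xsl:copy>
  </xsl:template>

  <!-- higher default priority (0.5) than match="a" (0) -->
  <xsl:template match="/div/a">
    <!-- only imported template rules are considered here; this module
         imports nothing, so the built-in element rule is used -->
    <p><xsl:apply-imports/></p>
  </xsl:template>

  <!-- same module, lower priority: reachable via xsl:next-match in 2.0,
       but not via xsl:apply-imports -->
  <xsl:template match="a">
    <xsl:copy>
      <xsl:copy-of select="@href"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Moving the match="a" template into an imported module is what makes it reachable from xsl:apply-imports.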