I am searching for a library, a tool, or even some simple code that can parse the XPath expressions in our XSLT files and produce a Dictionary/List/Tree of all the XML nodes the XSLT expects to work on or find. Sadly, everything I am finding deals with using XSLT to parse XML rather than with parsing the XSLT itself. The really difficult part is how flexible XPath is.
For example, in the several XSLT files we work with, an entry may select on
nodeX/nodeY/nodeNeeded;
OR
../nodeNeeded;
OR
select nodeX
then select nodeY
then select nodeNeeded
;
and so forth.
What we would like is to parse the XSLT document and get a data structure that explicitly tells us the XSLT is looking for nodeNeeded under the path nodeX/nodeY, so that we can custom-build the XML data in a minimal fashion.
Thanks!
Here is a mocked-up subset of the data for visualization purposes:
<server_stats>
  <server name="fooServer">
    <uptime>24d52m</uptime>
    <userCount>123456</userCount>
    <loggedInUsers>
      <user name="AnnaBannana">
        <created>01.01.2012:00.00.00</created>
        <loggedIn>25</loggedIn>
        <posts>3</posts>
      </user>
    </loggedInUsers>
    <temperature>82F</temperature>
    <load>72</load>
    <mem_use>45</mem_use>
    <visitors>
      <current>42</current>
      <browsers name="mozilla" version="X.Y.Z">22</browsers>
      <popular_link name="index.html">39</popular_link>
      <history>
        <max_visitors>789</max_visitors>
        <average_visitors>42</average_visitors>
      </history>
    </visitors>
  </server>
</server_stats>
From this, one customer may want to create an admin HTML page that pulls the hardware stats out of the tree and perhaps runs some load calculations from the visitor count. Another customer may want only the visitor count information to display on their public site. To keep each customer's system load as small as possible, we would like to parse their stat-selecting XSLT and provide them with just the data it requests. The obvious issue is that one customer may select the visitor count node directly, while another may select the visitors node and then select each of the child nodes they want, and so on.
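To make the "just the data they need" step concrete, here is a minimal Python sketch of pruning the stats tree down to a set of wanted paths. Python and the function name `prune` are only assumptions for illustration, and the sketch assumes the wanted paths are written in full from the root element, using plain element names only.

```python
import xml.etree.ElementTree as ET

def prune(elem, wanted, path=None):
    """Drop every child that is neither a wanted node nor an ancestor of one."""
    path = path or elem.tag
    for child in list(elem):            # copy, because we remove while looping
        child_path = path + "/" + child.tag
        if child_path in wanted:
            continue                    # wanted node: keep its whole subtree
        if any(w.startswith(child_path + "/") for w in wanted):
            prune(child, wanted, child_path)   # on the way to a wanted node
        else:
            elem.remove(child)          # nobody asked for this branch
    return elem

doc = ET.fromstring(
    "<server_stats><server name='fooServer'>"
    "<uptime>24d52m</uptime><load>72</load>"
    "<visitors><current>42</current>"
    "<history><max_visitors>789</max_visitors></history></visitors>"
    "</server></server_stats>"
)
prune(doc, {"server_stats/server/visitors/current"})
print(ET.tostring(doc, encoding="unicode"))
```

After the call, only the server_stats/server/visitors/current chain should remain in `doc`. Attributes on kept elements are left alone.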
The two hypothetical customers looking for the "current" node in "visitors" might have XSLT that looks like:
<xsl:template match="server_stats/server/visitors">
  <xsl:value-of select="current"/>
</xsl:template>
OR
<xsl:template match="server_stats">
  <xsl:for-each select="server">
    <xsl:value-of select="visitors/current"/>
    <xsl:value-of select="visitors/popular_link"/>
  </xsl:for-each>
</xsl:template>
In this example both are trying to select the same node, but the way they do it is different. "current" on its own is not very specific, since several items could have a "current" node, so we also need the path they used to get there. That rules out simply searching their XSLT for "current", and because the way they build up the path can vary so much, we can't just search for the whole path either.
So the result we would like from parsing their XSLT is, say, a list of stats per customer:
Customer 1:
visitors/current
Customer 2:
visitors/current
visitors/popular_link
etc.
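A rough Python sketch of the idea, to show the kind of output we are after: walk the stylesheet, carry the current context path down through each template and for-each, and record the full path of every select. Everything here (`join`, `collect`, `wanted_paths`, `WRAP`) is a hypothetical name. The sketch only understands template match, for-each, value-of and copy-of with plain child steps and `..`, and it treats a match pattern as if it were rooted at the top of the document, which real XSLT does not guarantee. It reports full paths, so trimming them to something like visitors/current would be a separate step.

```python
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"

def join(context, xpath):
    """Append a simple XPath to a context path, collapsing '.' and '..'."""
    steps = [] if xpath.startswith("/") else [s for s in context.split("/") if s]
    for step in xpath.split("/"):
        if step == "..":
            if steps:
                steps.pop()
        elif step not in ("", "."):
            steps.append(step)
    return "/".join(steps)

def collect(node, context, found):
    for child in node:
        ctx = context
        if child.tag == XSL + "template":
            ctx = join("", child.get("match", ""))        # new context
        elif child.tag == XSL + "for-each":
            ctx = join(context, child.get("select", ""))  # deeper context
        elif child.tag in (XSL + "value-of", XSL + "copy-of"):
            found.append(join(context, child.get("select", "")))
        collect(child, ctx, found)
    return found

def wanted_paths(xslt_text):
    return collect(ET.fromstring(xslt_text), "", [])

WRAP = ('<xsl:stylesheet version="1.0" '
        'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">%s</xsl:stylesheet>')

customer1 = WRAP % """
<xsl:template match="server_stats/server/visitors">
  <xsl:value-of select="current"/>
</xsl:template>"""

customer2 = WRAP % """
<xsl:template match="server_stats">
  <xsl:for-each select="server">
    <xsl:value-of select="visitors/current"/>
    <xsl:value-of select="visitors/popular_link"/>
  </xsl:for-each>
</xsl:template>"""

print(wanted_paths(customer1))
# ['server_stats/server/visitors/current']
print(wanted_paths(customer2))
# ['server_stats/server/visitors/current', 'server_stats/server/visitors/popular_link']
```

Both customers end up with the same full path for "current" even though they reached it differently, which is the property we need.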
Some example selects that break the solution provided below, and which we will be working on solving:
<xsl:variable name="fcolor" select="'Black'"/> The string literal results in a /'Black' path entry.
<xsl:for-each select="server"> We get the entry for server, but the selects nested inside it no longer show it in their paths.
<xsl:value-of select="../../@name"/> This was more or less expected. We can try to figure out how to skip attribute-based selections, and the relative paths show up as I thought they would.
<xsl:when test="substring(someNode,1,2)=0 and substring(someNode,4,2)=0 and substring(someNode,7,2)>30"> This one is throwing me, because the whole test expression shows up as a path item. It is caused by the when check in the solution, but I don't see a nice fix, since the same basic statement could just as well have been testing a branching path. This may simply be one of those cases we need to post-process.
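One possible direction for the post-processing, sketched in Python under heavy assumptions: instead of treating a whole select or test attribute as one path, strip the string literals and keep only the tokens that look like element paths. The function name `node_paths` and the regular expressions are hypothetical, this is a tokenizer rather than a real XPath parser, and it does not handle predicates, axes or namespace prefixes.

```python
import re

LITERAL = re.compile(r"'[^']*'|\"[^\"]*\"")   # quoted string literals
TOKEN = re.compile(r"[$@\w./-]+\(?")          # names and paths, plus a trailing '('
OPERATORS = {"and", "or", "div", "mod"}

def node_paths(expr):
    """Return the element paths mentioned in a simple XPath expression."""
    expr = LITERAL.sub(" ", expr)
    paths = []
    for tok in TOKEN.findall(expr):
        if tok.endswith("("):
            # Function call or node test such as text(): keep only what
            # precedes it (nothing at all for a bare call like substring().
            tok = tok[:-1].rpartition("/")[0]
        # Drop attribute steps such as @name.
        tok = "/".join(s for s in tok.split("/") if not s.startswith("@"))
        if tok.startswith("$") or tok in OPERATORS:
            continue                           # variable reference or operator
        if not re.search(r"[A-Za-z_]", tok):
            continue                           # numbers, '.', '..', bare '-'
        if tok not in paths:
            paths.append(tok)
    return paths

print(node_paths("'Black'"))       # []
print(node_paths("server"))        # ['server']
print(node_paths("../../@name"))   # []
print(node_paths("substring(someNode,1,2)=0 and substring(someNode,4,2)=0 "
                 "and substring(someNode,7,2)>30"))   # ['someNode']
```

Each path returned this way would still need to be joined onto the context of the enclosing template or for-each to become a full path.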