I need to compare XML files from two folders and collect the XML elements that appear in only one of the two files.
The XML files in the two folders have the same file names. Below is a sample of what I want to do:
old/booklist1.xml
<books>
<book type="fiction">
<isn>12345678</isn>
<name>xxxx</name>
</book>
</books>
new/booklist1.xml
<books>
<book type="fiction">
<isn>12345678</isn>
<name>xxxx</name>
</book>
<book type="history">
<isn>23456789</isn>
<name>yyyyy</name>
</book>
</books>
The output I need for booklist1.xml is:
<books>
<book type="history">
<isn>23456789</isn>
<name>yyyyy</name>
</book>
</books>
I have the findDiff.xsl below, which works when I hard-code the XML file name:
<xsl:key name="book" match="book" use="." />
<xsl:template match="/books">
<xsl:copy>
<xsl:copy-of select="book[not(key('book', ., document('old_booklist1.xml')))]"/>
</xsl:copy>
</xsl:template>
findDiff.xsl is currently associated with new/booklist1.xml. I copied old/booklist1.xml into the same folder as new/booklist1.xml, renamed it old_booklist1.xml, and the XSL above works with that hard-coded URI. What I actually need is to loop through the XML files in the new folder and compare each one with the XML file of the same name in the old folder.
I am thinking of building the XML file URI as follows:
loop over the new folder and get each file's URI,
then build the URI of the matching XML file in the old folder:
<xsl:variable name="xmlPath" select="document-uri()"/>
<xsl:variable name="compareWithPath" select="replace($xmlPath, 'new', 'old')"/>
Then I pass compareWithPath to the template below:
<xsl:template match="/books">
<xsl:copy>
<xsl:copy-of select="book[not(key('book',., document($compareWithPath)))]"></xsl:copy-of>
</xsl:copy>
</xsl:template>
But I get this error: The system cannot find the file specified file:/C:/Users/phyllis/Documents/old/booklist1.xml
Michael Kay mentioned that we can convert the file name to a URI and use doc() or document() to load it. I build the file URI in exactly the same form as the one I get from document-uri(). What am I doing wrong here?
The converted file URI looks like this:
<compareWithPath>file:/C:/Users/phyllis/Documents/old/booklist1.xml</compareWithPath>
doc-available() returns false when I check the file URI above using:
<fileExist><xsl:value-of select="doc-available($compareWithPath)"/></fileExist>
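To narrow this down, the lookup can be guarded so that a missing file is reported instead of aborting the transformation. The following is only a sketch under assumptions: it anchors the replace() pattern to the last path segment (a plain 'new' would also rewrite any other occurrence of "new" in the path), and the xsl:message text is illustrative and not part of my original stylesheet.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Same whole-content key as in findDiff.xsl -->
  <xsl:key name="book" match="book" use="."/>

  <!-- URI of the source document being transformed (a file in "new") -->
  <xsl:variable name="xmlPath" select="document-uri(/)"/>

  <!-- Assumption: only the folder right before the file name is swapped -->
  <xsl:variable name="compareWithPath"
      select="replace($xmlPath, '/new/([^/]+)$', '/old/$1')"/>

  <xsl:template match="/books">
    <xsl:copy>
      <xsl:choose>
        <!-- Compare only when the old file can actually be loaded -->
        <xsl:when test="doc-available($compareWithPath)">
          <xsl:copy-of
              select="book[not(key('book', ., doc($compareWithPath)))]"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- Report the URI that failed so it can be checked on disk -->
          <xsl:message>Cannot load <xsl:value-of select="$compareWithPath"/></xsl:message>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

If the message appears, the reported URI can be compared against the actual location of the old folder on disk.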
Do you use uri-collection('old?select=*.xml'), or where/how exactly do you try to find and load the URIs? – Martin Honnen
Besides the collection function there is the uri-collection function, which only gives you the URIs of the files in the collection but doesn't pull them in all together. Additionally, in the commercial editions of Saxon you have a discard-document function to avoid memory problems, I think. – Martin Honnen
Regarding <xsl:key name="book" match="book" use="." />, given that book seems to have various child elements and whitespace, any key on the complete contents can easily break by a change in indentation or white-space stripping. Perhaps a composite key on the particular elements you need to use to identify a book is a better approach in XSLT 3. – Martin Honnen
If you use uri-collection() and want to process each document, use e.g. <xsl:apply-templates select="uri-collection() ! doc(.)"/> or e.g. <xsl:iterate select="uri-collection() ! doc(.)">...</xsl:iterate>, or for-each with the same select if wanted/needed. – Martin Honnen
<xsl:key name="book" match="book" composite="yes" use="*" /> and book[not(key('book', *, document($compareWithPath)))] is a cleaner approach. – Martin Honnen