I have an XML file with several translation units. The sources "Content a" and "Content b" each have a German and a French translation, and each translation is stored in its own unit, so every source appears twice in the file:
<unit>
<src lang="en">Content a</src>
<trg lang="de">Translation content a</trg>
</unit>
<unit>
<src lang="en">Content a</src>
<trg lang="fr">Translation content a</trg>
</unit>
<unit>
<src lang="en">Content b</src>
<trg lang="de">Translation content b</trg>
</unit>
<unit>
<src lang="en">Content b</src>
<trg lang="fr">Translation content b</trg>
</unit>
My aim is to avoid duplicate sources by merging units that share the same src, so this is my desired output:
<unit>
<src lang="en">Content a</src>
<trg lang="de">Translation content a</trg>
<trg lang="fr">Translation content a</trg>
</unit>
<unit>
<src lang="en">Content b</src>
<trg lang="de">Translation content b</trg>
<trg lang="fr">Translation content b</trg>
</unit>
My stylesheet so far:
<xsl:template match="unit">
  <xsl:copy>
    <xsl:copy-of select="src"/>
    <xsl:for-each-group select="current-group()/(* except src)" group-by="node-name(.)">
      <xsl:for-each-group select="current-group()" group-by=".">
        <xsl:copy-of select="."/>
      </xsl:for-each-group>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>
It produces the following output, where all translations end up in a single unit under "Content a":
<unit>
<src lang="en">Content a</src>
<trg lang="de">Translation content a</trg>
<trg lang="fr">Translation content a</trg>
<trg lang="de">Translation content b</trg>
<trg lang="fr">Translation content b</trg>
</unit>
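I suspect the grouping has to happen one level up, on the unit elements themselves. Here is a minimal sketch of that idea, assuming XSLT 2.0 and assuming the unit elements sit inside a wrapper element that I call `units` here (a hypothetical name; the excerpt above does not show the real parent):

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Assumption: "units" is a hypothetical wrapper around all unit elements -->
  <xsl:template match="units">
    <xsl:copy>
      <!-- One group per distinct source text -->
      <xsl:for-each-group select="unit" group-by="src">
        <unit>
          <!-- The context item is the first unit of the group: emit its src once -->
          <xsl:copy-of select="src"/>
          <!-- Collect the trg elements of every unit in the group;
               the key combines language and text so identical text
               in different languages is not merged -->
          <xsl:for-each-group select="current-group()/trg"
                              group-by="concat(@lang, '|', .)">
            <xsl:copy-of select="."/>
          </xsl:for-each-group>
        </unit>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The composite key `concat(@lang, '|', .)` is deliberate: grouping on the text alone would merge the German and French entries whenever their text is identical, as it is in my sample data.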
Thanks for any help.