The XML file consists of a completely homogeneous sequence of nodes. In terms of formal XML structure there are no parent-child relationships: all nodes are siblings on the same level. Every node is:

  • a single element with the same name,
  • carrying the same set of attributes

So the structure always looks like this:

<document ID-1="value" ID-2="value" ID-3="value" attr-4="value"/>
<document ID-1="value" ID-2="value" ID-3="value" attr-4="value"/>
<document ID-1="value" ID-2="value" ID-3="value" attr-4="value"/>
<document ID-1="value" ID-2="value" ID-3="value" attr-4="value"/>
...etc

Despite this homogeneity, the attribute values themselves carry information about a hierarchy, which needs to be made explicit. The virtual hierarchy of the model is:

  • parent
    • subparent
      • child

The connections are established according to the following scheme:

  • the child's ID-2 value is equal to the subparent's ID-1 value
  • the subparent's ID-2 value is equal to the parent's ID-1 value

    The complete visualization scheme in this PICTURE
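
The linking rule above can be made concrete outside XSLT. The following Python fragment is only an illustration: it assumes the nodes are wrapped in a hypothetical `<root>` element (a flat list of elements alone is not a well-formed document), the ID values are made up, and `parent_of` is a helper name introduced here.

```python
import xml.etree.ElementTree as ET

# Assumption: a <root> wrapper and illustrative ID values, not the real file.
SAMPLE = """<root>
<document ID-1="P1" ID-2="NULL" ID-3="a" attr-4="parent"/>
<document ID-1="S1" ID-2="P1" ID-3="b" attr-4="subparent"/>
<document ID-1="C1" ID-2="S1" ID-3="c" attr-4="child"/>
</root>"""

docs = ET.fromstring(SAMPLE).findall("document")

# Index every node by its ID-1 so an ID-2 reference resolves in one lookup.
by_id1 = {d.get("ID-1"): d for d in docs}

def parent_of(node):
    """Return the node whose ID-1 equals this node's ID-2, or None."""
    return by_id1.get(node.get("ID-2"))

child = by_id1["C1"]
subparent = parent_of(child)   # child's ID-2 == subparent's ID-1
parent = parent_of(subparent)  # subparent's ID-2 == parent's ID-1
print(subparent.get("attr-4"), parent.get("attr-4"), parent_of(parent))
# prints: subparent parent None
```

The top-level node has ID-2="NULL", which matches no ID-1, so the lookup naturally ends there.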

AIM: Restore the full hierarchy chain information within each node. Technically, every subordinate element (child, subparent) should receive additional attributes holding the values from all "overlying" elements. In the proposed model, this means copying the attr-4 value from the corresponding parent and/or subparent nodes. Simply put, each child element should receive two extra attr-4 values (one from the subparent and one from the parent).

1-SOURCE:

<document ID-1="SunID"   ID-2="NULL"  ID-3="value" attr-4="SUN"/>      <!-- this is parent's node -->
<document ID-1="EarthID" ID-2="SunID" ID-3="value" attr-4="EARTH" />   <!-- this is subparent -->

<document ID-1="value" ID-2="EarthID" ID-3="value" attr-4="Tokio"/>     <!-- child-1 -->
<document ID-1="value" ID-2="EarthID" ID-3="value" attr-4="London"/>    <!-- child-2 -->
<document ID-1="value" ID-2="EarthID" ID-3="value" attr-4="Rome"/>      <!-- child-3 -->
<document ID-1="value" ID-2="EarthID" ID-3="value" attr-4="Cairo"/>     <!-- child-4 -->

2-XSLT-solution

I assume an XSLT implementation of the algorithm would involve the following steps:

  • match a document node
  • copy the node itself
  • search the XML file for the node where (child's ID-2) = (subparent's ID-1)
  • search the XML file for the node where (subparent's ID-2) = (parent's ID-1)
  • once the whole ID chain of the hierarchy has been found, build the desired model for the node

(Note) Potentially useful information for those expressions: the ID-3 value is truly unique across the whole XML file.
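
To make the lookup logic of these steps concrete, here is a minimal sketch in Python rather than XSLT. It is not the requested solution, just an illustration under stated assumptions: a hypothetical `<root>` wrapper, placeholder ID-1 and ID-3 values (the post only shows "value"), every element carrying exactly four attributes so that the first free name is attr-5, and `enrich` as an invented function name.

```python
import xml.etree.ElementTree as ET

# Assumptions: <root> wrapper; child ID-1 values and all ID-3 values are placeholders.
SOURCE = """<root>
<document ID-1="SunID" ID-2="NULL" ID-3="1" attr-4="SUN"/>
<document ID-1="EarthID" ID-2="SunID" ID-3="2" attr-4="EARTH"/>
<document ID-1="c1" ID-2="EarthID" ID-3="3" attr-4="Tokio"/>
<document ID-1="c2" ID-2="EarthID" ID-3="4" attr-4="London"/>
</root>"""

def enrich(root):
    """Copy attr-4 of every ancestor onto each node as attr-5, attr-6, ..."""
    docs = root.findall("document")
    by_id1 = {d.get("ID-1"): d for d in docs}  # ID-1 -> node
    for doc in docs:
        pos = 5                # first free slot after attr-4 (assumed layout)
        seen = {id(doc)}       # guards against accidental reference cycles
        ancestor = by_id1.get(doc.get("ID-2"))
        while ancestor is not None and id(ancestor) not in seen:
            doc.set(f"attr-{pos}", ancestor.get("attr-4"))
            seen.add(id(ancestor))
            pos += 1
            ancestor = by_id1.get(ancestor.get("ID-2"))
    return root

root = enrich(ET.fromstring(SOURCE))
for d in root.findall("document"):
    print(d.attrib)
```

With this sample the Tokio node ends up with attr-5="EARTH" and attr-6="SUN", the EarthID node with attr-5="SUN", and the SunID node is unchanged.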

3-OUTPUT (expected model)

<document ID-1="SunID"   ID-2="NULL"  ID-3="value" attr-4="SUN"/>                <!-- this is parent's node -->
<document ID-1="EarthID" ID-2="SunID" ID-3="value" attr-4="EARTH" attr-5="SUN"/> <!-- this is subparent -->

<document ID-1="value" ID-2="EarthID" ID-3="value" attr-4="Tokio"  attr-5="EARTH" attr-6="SUN" />  <!-- child-1 -->
<document ID-1="value" ID-2="EarthID" ID-3="value" attr-4="London" attr-5="EARTH" attr-6="SUN" />  <!-- child-2 -->
<document ID-1="value" ID-2="EarthID" ID-3="value" attr-4="Rome"   attr-5="EARTH" attr-6="SUN" />  <!-- child-3 -->
<document ID-1="value" ID-2="EarthID" ID-3="value" attr-4="Cairo"  attr-5="EARTH" attr-6="SUN" />  <!-- child-4 -->

Main question: What might the XSLT code look like? [upd: clarification: in XSLT 1.0]

(Note) Of course, we don't know in advance exactly where the parent, subparent, and child nodes are located, nor the content of their attribute values. All these EARTH and SUN values must be computed dynamically.

MS SQL Server supports this use case via a recursive CTE. Check it out: sqlservertutorial.net/sql-server-basics/… - Yitzhak Khabinsky
How many attr-4 attributes do you think XML allows on the same element? Attribute names need to be unique on a single element. - Martin Honnen
Yes, I know about the T-SQL possibilities and how it can be done there. Actually, this task came from the SSMS/SQL side. But the point is to solve this at the preceding stage, before converting the data into SSMS tables. - Alex
@Martin the names of the additional "attr-4" attributes copied from the parent/subparent may be changed in the output file. (Only their values matter.) - Alex

1 Answer


Even in XSLT 1 you have keys to define and follow any references; with that, it is just a recursive apply-templates on the element(s) found by the key function:

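<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

<!-- Identity template: copies everything not handled by a more specific template. -->
<xsl:template match="@* | node()">
    <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
</xsl:template>

<!-- Index every document element by its ID-1 value. -->
<xsl:key name="ref" match="document" use="@ID-1"/>

<!-- Copy the element with its own attributes, then append the attributes
     derived from the element that its ID-2 refers to. -->
<xsl:template match="document">
    <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates select="key('ref', @ID-2)" mode="att"/>
    </xsl:copy>
</xsl:template>

<!-- Runs on a referenced (ancestor) element: emits its attr-4 value under the
     next free name (attr-5, attr-6, ...) and recurses to its own ancestor.
     The default for $pos is the ancestor's attribute count + 1, which is 5
     when every element carries four attributes. -->
<xsl:template match="document" mode="att">
    <xsl:param name="pos" select="count(@*) + 1"/>
    <xsl:attribute name="attr-{$pos}">
        <xsl:value-of select="@attr-4"/>
    </xsl:attribute>
    <xsl:apply-templates select="key('ref', @ID-2)" mode="att">
        <xsl:with-param name="pos" select="$pos + 1"/>
    </xsl:apply-templates>
</xsl:template>

</xsl:stylesheet>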

https://xsltfiddle.liberty-development.net/ncntCSJ/1