I am attempting to parse an XML document into a flat file. Of the many topics I have found on this subject on SO, these two both partially accomplish what I want.
Example XML
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
<env:Body>
<wd:Get_Schools_Response wd:version="v29.1" xmlns:wd="urn:com.workday/bsvc">
<wd:Response_Filter>
<wd:Page>1</wd:Page>
<wd:Count>50</wd:Count>
</wd:Response_Filter>
<wd:Response_Group>
<wd:Include_Reference>0</wd:Include_Reference>
</wd:Response_Group>
<wd:Response_Results>
<wd:Total_Results>19448</wd:Total_Results>
<wd:Total_Pages>389</wd:Total_Pages>
<wd:Page_Results>50</wd:Page_Results>
<wd:Page>1</wd:Page>
</wd:Response_Results>
<wd:Response_Data>
<wd:School>
<wd:School_Data>
<wd:ID>Chonnam_National_University_Yosu</wd:ID>
<wd:School_Name>Chonnam National University (Yosu)</wd:School_Name>
<wd:Country_Reference>
<wd:ID wd:type="WID">7a5a2aadf9d34086a2bfbfd408bc28da</wd:ID>
<wd:ID wd:type="ISO_3166-1_Alpha-2_Code">KR</wd:ID>
<wd:ID wd:type="ISO_3166-1_Alpha-3_Code">KOR</wd:ID>
<wd:ID wd:type="ISO_3166-1_Numeric-3_Code">410</wd:ID>
</wd:Country_Reference>
<wd:Inactive>0</wd:Inactive>
</wd:School_Data>
</wd:School>
<wd:School>
<wd:School_Data>
<wd:ID>Asian_University_Of_Science_Technology</wd:ID>
<wd:School_Name>Asian University of Science &amp; Technology</wd:School_Name>
<wd:Country_Reference>
<wd:ID wd:type="WID">873d0f604e3b458c990cb4d83a5c0f14</wd:ID>
<wd:ID wd:type="ISO_3166-1_Alpha-2_Code">TH</wd:ID>
<wd:ID wd:type="ISO_3166-1_Alpha-3_Code">THA</wd:ID>
<wd:ID wd:type="ISO_3166-1_Numeric-3_Code">764</wd:ID>
</wd:Country_Reference>
<wd:Inactive>0</wd:Inactive>
</wd:School_Data>
</wd:School>
<wd:School>
<wd:School_Data>
<wd:ID>Groep_T_Leuven</wd:ID>
<wd:School_Name>Groep T Leuven</wd:School_Name>
<wd:Country_Reference>
<wd:ID wd:type="WID">a04ea128f43a42e59b1e6a19e8f0b374</wd:ID>
<wd:ID wd:type="ISO_3166-1_Alpha-2_Code">BE</wd:ID>
<wd:ID wd:type="ISO_3166-1_Alpha-3_Code">BEL</wd:ID>
<wd:ID wd:type="ISO_3166-1_Numeric-3_Code">56</wd:ID>
</wd:Country_Reference>
<wd:Inactive>0</wd:Inactive>
</wd:School_Data>
</wd:School>
<wd:School>
<wd:School_Data>
<wd:ID>Tohono_O_Odham_Community_College</wd:ID>
<wd:School_Name>Tohono O'Odham Community College</wd:School_Name>
<wd:Country_Region_Reference>
<wd:ID wd:type="WID">c7b20b0d4bc04711a00900569e9afabd</wd:ID>
<wd:ID wd:type="Country_Region_ID">USA-AZ</wd:ID>
<wd:ID wd:type="ISO_3166-2_Code">AZ</wd:ID>
</wd:Country_Region_Reference>
<wd:Country_Reference>
<wd:ID wd:type="WID">bc33aa3152ec42d4995f4791a106ed09</wd:ID>
<wd:ID wd:type="ISO_3166-1_Alpha-2_Code">US</wd:ID>
<wd:ID wd:type="ISO_3166-1_Alpha-3_Code">USA</wd:ID>
<wd:ID wd:type="ISO_3166-1_Numeric-3_Code">840</wd:ID>
</wd:Country_Reference>
<wd:Inactive>0</wd:Inactive>
</wd:School_Data>
</wd:School>
</wd:Response_Data>
</wd:Get_Schools_Response>
</env:Body>
</env:Envelope>
In the case of the first link I get the following:
1|50|0|19448|389|50|1|Chonnam_National_University_Yosu|Chonnam National University (Yosu)|7a5a2aadf9d34086a2bfbfd408bc28da|KR|KOR|410|0|Asian_University_Of_Science_Technology|Asian University of Science & Technology|873d0f604e3b458c990cb4d83a5c0f14|TH|THA|764|0|Groep_T_Leuven|Groep T Leuven|a04ea128f43a42e59b1e6a19e8f0b374|BE|BEL|56|0|Tohono_O_Odham_Community_College|Tohono O'Odham Community College|c7b20b0d4bc04711a00900569e9afabd|USA-AZ|AZ|bc33aa3152ec42d4995f4791a106ed09|US|USA|840|0
This is a good solution because it drills down into each child node and puts in a separator, but it doesn't know about the child nodes that appeared under earlier records. In addition, I do not want the page/results/total_pages information to come over. I added the standard template override below, but it had no effect.
<xsl:template match="text()|@*">
<!--<xsl:value-of select="."/>
Do nothing -->
</xsl:template>
In the case of the second:
ID|School_Name|Country_Reference|Inactive|Country_Region_Reference
Chonnam_National_University_Yosu|Chonnam National University (Yosu)|7a5a2aadf9d34086a2bfbfd408bc28daKRKOR410|0|
Asian_University_Of_Science_Technology|Asian University of Science & Technology|873d0f604e3b458c990cb4d83a5c0f14THTHA764|0|
Groep_T_Leuven|Groep T Leuven|a04ea128f43a42e59b1e6a19e8f0b374BEBEL56|0|
Tohono_O_Odham_Community_College|Tohono O'Odham Community College|bc33aa3152ec42d4995f4791a106ed09USUSA840|0|c7b20b0d4bc04711a00900569e9afabdUSA-AZAZ
The second example is not dynamic enough: it doesn't add bars between the child values. I tried things like this:
<xsl:key name="field" match="/*/*/*/*/*/*/*/child::*" use="local-name()"/>
<!-- variable containing the first occurrence of each field -->
<xsl:variable name="allFields"
select="/*/*/*/*/*/*/*/child::*[generate-id()=generate-id(key('field', local-name())[1])]" />
Which produces something like:
ID
Chonnam_National_University_Yosu
Asian_University_Of_Science_Technology
Groep_T_Leuven
Tohono_O_Odham_Community_College
What I am hoping for is to dynamically drill into all children, grandchildren, and so on, and produce a flat file with a delimiter for every value, even if a previous node didn't have that value, finishing each line with a line feed. I also want to get rid of the leading 1|50|0|19448|389|50|1 from the first result:
Chonnam_National_University_Yosu|Chonnam National University (Yosu)|7a5a2aadf9d34086a2bfbfd408bc28da||||KR|KOR|410|0
Asian_University_Of_Science_Technology|Asian University of Science & Technology|873d0f604e3b458c990cb4d83a5c0f14||||TH|THA|764|0
Groep_T_Leuven|Groep T Leuven|a04ea128f43a42e59b1e6a19e8f0b374||||BE|BEL|56|0
Tohono_O_Odham_Community_College|Tohono O'Odham Community College|c7b20b0d4bc04711a00900569e9afabd|USA-AZ|AZ|bc33aa3152ec42d4995f4791a106ed09|US|USA|840|0
I am using XSLT but I am open to suggestions on other tools or methods.
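To make the target behaviour concrete, here is a rough sketch of it in Python rather than XSLT, using only the standard library's `xml.etree.ElementTree`. It is not taken from either linked answer. The function names (`local`, `leaf_fields`, `flatten`) and the column-key format are my own assumptions, and it assumes no leaf repeats with the same path and `wd:type` inside a single `wd:School_Data`. Columns are discovered across all records, and a column first seen in a later record is slotted in after the last column that record shares with the earlier ones.

```python
import xml.etree.ElementTree as ET

WD = "urn:com.workday/bsvc"  # namespace URI copied from the sample XML


def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def leaf_fields(school_data):
    """Yield (column_key, text) for every leaf under one wd:School_Data."""
    def walk(elem, path):
        children = list(elem)
        if not children:
            # wd:type tells the sibling wd:ID elements of a reference apart.
            kind = elem.get("{%s}type" % WD)
            key = path if kind is None else "%s[%s]" % (path, kind)
            yield key, (elem.text or "").strip()
        for child in children:
            yield from walk(child, path + "/" + local(child.tag))

    for child in school_data:
        yield from walk(child, local(child.tag))


def flatten(xml_text, sep="|"):
    """Return one delimited line per wd:School_Data, padded to all columns."""
    root = ET.fromstring(xml_text)
    rows, columns = [], []
    # Only School_Data is visited, so the paging header never reaches the output.
    for data in root.iter("{%s}School_Data" % WD):
        row = dict(leaf_fields(data))  # assumption: keys are unique per record
        rows.append(row)
        pos = 0
        for key in row:
            if key in columns:
                pos = columns.index(key) + 1
            else:
                # Unseen column: insert right after the last one placed.
                columns.insert(pos, key)
                pos += 1
    return "".join(sep.join(r.get(c, "") for c in columns) + "\n" for r in rows)
```

Passing the sample document to `flatten` as a string (with the ampersand in the school name escaped as `&amp;` so it parses) should give ten fields per line. One difference from my hand-written target above: the sketch keeps the country WID in its own column, so for the first three schools the three empty region fields come before the WID rather than after it.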