
I'm trying to process an XML file (snippet below) and have already extracted the attributes from the element nodes. I would also like to extract the header value and return it alongside the type attributes for that "group", but only when the group actually has a header. I can retrieve the header value, but I can't work out how to associate it only with the "group" it belongs to. I'm sure "group" is the wrong term; it's almost as if I want to get the header from the parent node, except that it isn't stored in the parent.

I've included example output to hopefully demonstrate what I've tried to explain.

$xml = [xml]@"
<document document="test">
  <elements>
    <element type="header">Header1</element>
    <element type="link" title="Title1" />
    <element type="link" title="Title2" />
    <element type="link" title="Title3" />
  </elements>
  <elements>
    <element type="link" title="Title200" />
  </elements>
  <elements>
    <element type="header">Header2</element>
    <element type="link" title="Title300" />
    <element type="link" title="Title301" />
  </elements>
</document>
"@

$objs = @()
$nodes = $xml.SelectNodes("//*[@type]")
foreach ($node in $nodes) {
    #$node.ParentNode.ToString()
    $type = $node.Attributes['type'].value
    if ($type -eq "header") {$header = $node.InnerText}
    $title = $node.Attributes['title'].value
    $obj = New-Object PSObject -Prop @{TYPE=$type;TITLE=$title;HEADER=$header}
    $objs += $obj
}
$header = ""
$objs

Output I'm currently getting:

TITLE    HEADER  TYPE
-----    ------  ----
         Header1 header
Title1   Header1 link
Title2   Header1 link
Title3   Header1 link
Title200 Header1 link
         Header2 header
Title300 Header2 link
Title301 Header2 link

Output I would like (no header shown for Title200):

TITLE    HEADER  TYPE
-----    ------  ----
         Header1 header
Title1   Header1 link
Title2   Header1 link
Title3   Header1 link
Title200         link
         Header2 header
Title300 Header2 link
Title301 Header2 link

2 Answers

2
votes

You are not resetting the $header variable at the beginning of each foreach pass, so the value from a previous iteration is carried over. Try this:

$objs = @()
$nodes = $xml.SelectNodes("//*[@type]")
foreach ($node in $nodes) {
    #$node.ParentNode.ToString()
    $header = ""
    $type = $node.Attributes['type'].value
    if ($type -eq "header") {$header = $node.InnerText}
    $title = $node.Attributes['title'].value
    $obj = New-Object PSObject -Prop @{TYPE=$type;TITLE=$title;HEADER=$header}
    $objs += $obj
}
$header = ""
$objs
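
Note that with the reset inside the loop, HEADER is populated only on the header rows themselves, so the link rows come back with an empty header. If the links should carry the header of their own group, as in the desired output, one option is to loop over each <elements> group and look the header up once per group. A minimal sketch, assuming PowerShell 3.0 or later (for [pscustomobject]) and at most one header per group; $group and $headerNode are illustrative names, not part of the original code:

```powershell
# Sample document copied from the question.
$xml = [xml]@"
<document document="test">
  <elements>
    <element type="header">Header1</element>
    <element type="link" title="Title1" />
    <element type="link" title="Title2" />
    <element type="link" title="Title3" />
  </elements>
  <elements>
    <element type="link" title="Title200" />
  </elements>
  <elements>
    <element type="header">Header2</element>
    <element type="link" title="Title300" />
    <element type="link" title="Title301" />
  </elements>
</document>
"@

$objs = foreach ($group in $xml.SelectNodes('//elements')) {
    # Look up this group's header once; SelectSingleNode returns $null if there is none.
    $headerNode = $group.SelectSingleNode("element[@type='header']")
    $header = if ($null -ne $headerNode) { $headerNode.InnerText } else { '' }

    # Emit one object per typed element in the group, header row included.
    foreach ($node in $group.SelectNodes('element[@type]')) {
        [pscustomobject]@{
            TITLE  = $node.GetAttribute('title')   # empty string when the attribute is absent
            HEADER = $header
            TYPE   = $node.GetAttribute('type')
        }
    }
}
$objs
```

Because the header is resolved per group rather than carried in a variable across iterations, a group without a header (Title200 here) cannot inherit one from the group before it.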
0
votes

I've finally worked this out - I can get the header for the "group" by referring to the parent node and then using SelectSingleNode to retrieve the element where the attribute type='header'.

$xml = [xml]@"
<document document="test">
  <elements>
    <element type="header">Header1</element>
    <element type="link" title="Title1" />
    <element type="link" title="Title2" />
    <element type="link" title="Title3" />
  </elements>
  <elements>
    <element type="link" title="Title200" />
  </elements>
  <elements>
    <element type="header">Header2</element>
    <element type="link" title="Title300" />
    <element type="link" title="Title301" />
  </elements>
</document>
"@

cls

$objs = @()
$nodes = $xml.SelectNodes("//*[@type]")
foreach ($node in $nodes) {
    $type = $node.Attributes['type'].value
    #using the ParentNode, retrieve the element where the attribute type='header'
    #(SelectSingleNode returns $null when the group has no header)
    $headerNode = $node.ParentNode.SelectSingleNode("element[@type='header']")
    #get the InnerText to get the actual value, or leave the header empty
    $header = if ($null -ne $headerNode) { $headerNode.InnerText } else { "" }
    $title = $node.Attributes['title'].value
    $obj = New-Object PSObject -Prop @{TYPE=$type;TITLE=$title;HEADER=$header}
    #skip the header elements themselves; only the links are returned
    if ($type -ne "header") {
        $objs += $obj
    }
}
$header = ""
$objs

This gives the output I'm looking for. There may be more efficient ways, but it does work and will hopefully help someone else.

TITLE    HEADER  TYPE
-----    ------  ----
Title1   Header1 link
Title2   Header1 link
Title3   Header1 link
Title200         link
Title300 Header2 link
Title301 Header2 link
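
As a possible variant of the parent lookup above, the XPath string() function can do the "empty if missing" handling itself: string() of an empty node-set is an empty string, so no property is read from a missing node. The sketch below assumes PowerShell 3.0 or later (for [pscustomobject]) and uses XPathNavigator.Evaluate, which is not in the original code; it selects only the link elements, matching the output above:

```powershell
# Sample document copied from the question.
$xml = [xml]@"
<document document="test">
  <elements>
    <element type="header">Header1</element>
    <element type="link" title="Title1" />
    <element type="link" title="Title2" />
    <element type="link" title="Title3" />
  </elements>
  <elements>
    <element type="link" title="Title200" />
  </elements>
  <elements>
    <element type="header">Header2</element>
    <element type="link" title="Title300" />
    <element type="link" title="Title301" />
  </elements>
</document>
"@

$objs = foreach ($node in $xml.SelectNodes("//element[@type='link']")) {
    # Evaluate relative to the current node: go up to the parent, find its header
    # element, and convert to a string ('' when the group has no header).
    $header = $node.CreateNavigator().Evaluate("string(../element[@type='header'])")
    [pscustomobject]@{
        TITLE  = $node.GetAttribute('title')
        HEADER = $header
        TYPE   = $node.GetAttribute('type')
    }
}
$objs
```

Collecting the loop output directly into $objs also avoids growing the array with += on every iteration.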