1
votes

I am working on an XML/XQuery project about the most recent Rugby World Cup. I have been working to design an XQuery that will return a list of stadiums that each team played in.

My XQuery looks like this:

for $a in doc("squads.xml")/tournament/squad,
$b in doc("match.xml")/tournament/match

where (($a/@country = $b/team1) or ($a/country = $b/team2))

return <team country ="{$a/@country}">
         <stadia>
           <stadium>{data($b/hostedBy)}</stadium>
         </stadia>
       </team>

and gives output:

<team country="Argentina">
  <stadia>
    <stadium>Twickenham</stadium>
  </stadia>
</team

and so on, where each stadium is listed under a separate team element.

Ideally, I would like the output to produce the stadiums as a list, like so:

<team country = "Argentina">
  <stadia>
    <stadium>Twikenham</stadium>
    <stadium>Leicester City Stadium</stadium>
    <stadium>Kingsholm Stadium</stadium>
  </stadia>
</team>

Is there an easy way to return $b/hostedBy as a like the list shown?

A sample of squads.xml:

<tournament>
  <squad country = "Argentina">
    <tier>1</tier>
    <result>4th</result>
    <games_played>7</games_played>
    ...
  </squad>
...
</tournament>

A sample of match.xml:

<tournament>
  <match>
      <matchNumber>8</matchNumber>
      <pool>C</pool>
      <team1>New Zealand</team1>
      <team2>Argentina</team2>
      <attendance>89,019</attendance>
      <hostedBy>Wembley Stadium</hostedBy>
  </match>
  ...
</tournament>
1

1 Answers

1
votes

That's what happens when you use for $x in ..., $y in ... - it's essentially multiplying the two sequences. This often trips people up coming to XQuery from SQL. What you want is grouping, and you can simply add a nested loop that joins on the team values from the outer loop:

for $squad in doc("squads.xml")/tournament/squad
return 
  <team country ="{$squad/@country}">
    <stadia>{
      for $match in doc("match.xml")/tournament/match
      where ($match/(team1 | team2) = $squad/@country)
      return <stadium>{ $match/hostedBy/string() }</stadium>
    }</stadia>
  </team>

If you need to de-duplicate stadiums, you can use distinct-values:

for $squad in doc("squads.xml")/tournament/squad
let $hosted-by := 
  for $match in doc("match.xml")/tournament/match
  where ($match/(team1 | team2) = $squad/@country)
  return $match/hostedBy/string()
return 
  <team country ="{$a/@country}">
    <stadia>{
      for $stadium in distinct-values($hosted-by)
      return <stadium>{ $stadium }</stadium>
    }</stadia>
  </team>