I have a fairly large set of large XML files, from each of which I want to extract some data. I'm using an evaluation version of Altova XMLSpy
, in which I managed to get an XPATH
to work. However, I need the data in CSV
or text format, so I can use it in R or Excel for further evaluation and I cannot copy the results of the XPATH to a file. I found that using XQUERY
I would be able to, but I fail to get the XQUERY
to work for at least one file.
The XML is structured like:
<d2LogicalModel xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datex2.eu/schema/2_0/2_0" modelBaseVersion="2.0" xsi:schemaLocation="http://datex2.eu/schema/2_0/2_0 D:\NDW\CSS\DataGenerator\DATEXIISchema_2_0_2_0.xsd">
<payloadPublication xmlns="http://datex2.eu/schema/2_0/2_0" xsi:type="MeasuredDataPublication" lang="nl">
<publicationTime>2011-04-21T05:58:34Z</publicationTime>
<publicationCreator>
<country>nl</country>
<nationalIdentifier>NDW-CNS</nationalIdentifier>
</publicationCreator>
<measurementSiteTableReference>NDW01_MT_321</measurementSiteTableReference>
<headerInformation>
<areaOfInterest>national</areaOfInterest>
<confidentiality>restrictedToAuthorities</confidentiality>
<informationStatus>real</informationStatus>
</headerInformation>
<siteMeasurements>
<measurementSiteReference>GRT01_MORO_1002_2</measurementSiteReference>
<measurementTimeDefault>2011-04-21T05:57:00Z</measurementTimeDefault>
<measuredValue index="1">
<basicDataValue xsi:type="TrafficSpeed"/>
</measuredValue>
<measuredValue index="2">
<basicDataValue xsi:type="TrafficSpeed"/>
</measuredValue>
<measuredValue index="3">
<basicDataValue xsi:type="TrafficSpeed"/>
</measuredValue>
<measuredValue index="4">
<basicDataValue xsi:type="TrafficSpeed"/>
</measuredValue>
<measuredValue index="5">
<basicDataValue xsi:type="TrafficSpeed"/>
</measuredValue>
<measuredValue index="6">
<basicDataValue xsi:type="TrafficSpeed"/>
</measuredValue>
</siteMeasurements>
<siteMeasurements>
<measurementSiteReference>RWS01_MONIBAS_0021hrr2131ra</measurementSiteReference>
<measurementTimeDefault>2011-04-21T05:57:00Z</measurementTimeDefault>
<measuredValue index="1">
<basicDataValue xsi:type="TrafficFlow">
<time>2011-04-21T05:56:00Z</time>
<vehicleFlow>900</vehicleFlow>
</basicDataValue>
</measuredValue>
<measuredValue index="2">
<basicDataValue xsi:type="TrafficSpeed">
<numberOfInputValuesUsed>60</numberOfInputValuesUsed>
<standardDeviation>0</standardDeviation>
<time>2011-04-21T05:56:00Z</time>
<averageVehicleSpeed>115</averageVehicleSpeed>
</basicDataValue>
</measuredValue>
<measuredValue index="3">
<basicDataValue xsi:type="TrafficFlow">
<time>2011-04-21T05:56:00Z</time>
<vehicleFlow>1020</vehicleFlow>
</basicDataValue>
</measuredValue>
<measuredValue index="4">
<basicDataValue xsi:type="TrafficSpeed">
<numberOfInputValuesUsed>60</numberOfInputValuesUsed>
<standardDeviation>0</standardDeviation>
<time>2011-04-21T05:56:00Z</time>
<averageVehicleSpeed>104</averageVehicleSpeed>
</basicDataValue>
</measuredValue>
</siteMeasurements>
I want to filter on a specific value for measurementSiteReference and get the results of all measuredValue
with a basicDataValue
for TrafficFlow
, preferably in the format of:
index, value, timestamp
1, 900, 05:56:00
3, 1020, 05:56:00
I have the following XPATH:
//text()[contains(.,"GEO01_Z_RWSTI1011")]/parent::*/parent::*/descendant::measuredValue[(@index)]/basicDataValue/vehicleFlow
This gives me the results for one file, but I cannot find a way to convert that XPATH
to an XQUERY
. The current XQUERY returns no results:
let $nl := " "
for $x in doc("TrafficSpeed 20110421 0800-1559\0800_trafficspeed")/d2LogicalModel/payloadPublication/siteMeasurements
where $x/measurementSiteReference/text()[contains(.,"GEO01_Z_RWSTI1011")]
return concat($x/measurementSiteReference/measuredValue,$nl)
How could I get the return I want using XQUERY?