
I have a webservice that returns the XML in the following format. I'm using XML::LibXML to parse the output.

<QueryResponse xmlns="http://www.exchangenetwork.net/schema/node/2" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
    <LastSet>true</LastSet>
    <Results>
        <SRS:SubstanceInformation xsi:schemaLocation="http://www.exchangenetwork.net/schema/SRS/3 http://www.exchangenetwork.net/schema/SRS/3" xmlns:SRS="http://www.exchangenetwork.net/schema/SRS/3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
            <SRS:ChemicalSubstance>
                <SRS:ChemicalSubstanceIdentification>
                    <SRS:EPAChemicalInternalNumber>76109</SRS:EPAChemicalInternalNumber>
                    <SRS:CASRegistryNumber>1000-82-4</SRS:CASRegistryNumber>
                    <SRS:ChemicalSubstanceSystematicName>Urea, N-(hydroxymethyl)-</SRS:ChemicalSubstanceSystematicName>
                    <SRS:EPAChemicalRegistryName>Methylolurea</SRS:EPAChemicalRegistryName>
                    <SRS:EPAChemicalIdentifier/>
                    <SRS:ChemicalSubstanceDefinitionText/>
                    <SRS:ChemicalSubstanceCommentText/>
                    <SRS:MolecularFormulaCode>C2H6N2O2</SRS:MolecularFormulaCode>
                    <SRS:ChemicalSubstanceFormulaWeightQuantity>90.08</SRS:ChemicalSubstanceFormulaWeightQuantity>
                    <SRS:ChemicalSubstanceLinearStructureCode>O=C(NCO)N</SRS:ChemicalSubstanceLinearStructureCode>
                    <SRS:InternationalChemicalIdentifier/>
                    <SRS:FormerCASRegistryNumberList/>
                    <SRS:IncorrectlyUsedCASRegistryNumberList>
                        <SRS:CASRegistryNumber>50-00-0</SRS:CASRegistryNumber>
                    </SRS:IncorrectlyUsedCASRegistryNumberList>
                    <SRS:ClassificationList/>
                    <SRS:TechnicalPointOfContact/>
                    <SRS:SubstanceRequestor/>
                    <SRS:SubstanceCreateDate>2006-10-13 14:30:12.0</SRS:SubstanceCreateDate>
                    <SRS:SubstanceLastUpdateDate>2010-01-20 12:29:21.0</SRS:SubstanceLastUpdateDate>
                    <SRS:SubstanceStatus>A</SRS:SubstanceStatus>
                </SRS:ChemicalSubstanceIdentification>
                <SRS:ChemicalSubstanceSynonymList>
                    <SRS:ChemicalSubstanceSynonym>
                        <SRS:ChemicalSubstanceSynonymName>Urea, (hydroxymethyl)-</SRS:ChemicalSubstanceSynonymName>
                        <SRS:ChemicalSynonymStatusName>Reviewed</SRS:ChemicalSynonymStatusName>
                        <SRS:ChemicalSynonymSourceName>Chemical Update System (CUS) 1986</SRS:ChemicalSynonymSourceName>
                        <SRS:RegulationReasonText/>
                        <SRS:CharacteristicList/>
                        <SRS:AlternateIdentifierList/>
                    </SRS:ChemicalSubstanceSynonym>
                </SRS:ChemicalSubstanceSynonymList>
            </SRS:ChemicalSubstance>
        </SRS:SubstanceInformation>
    </Results>
    <RowCount>1</RowCount>
    <RowId>0</RowId>
</QueryResponse>

I can't figure out how to get to the ChemicalSubstanceIdentification node in this XML. My code is:

my $parser = XML::LibXML->load_xml(location => 'output.xml');

my $doc = XML::LibXML::XPathContext->new($parser);
$doc->registerNs('SRS', 'http://www.exchangenetwork.net/schema/SRS/3');
my $chemIdent = $doc->findnodes('/QueryResponse/Results/SRS:SubstanceInformation/SRS:ChemicalSubstance/SRS:ChemicalSubstanceIdentification');

Is something wrong with what I'm doing? Any help is appreciated. Thanks!


1 Answer


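The first two elements in your XPath, QueryResponse and Results, are in the http://www.exchangenetwork.net/schema/node/2 namespace, which your XML document declares as its default namespace. XPath 1.0 has no notion of a default namespace: an unprefixed name in an expression matches only elements that are in no namespace at all. You'll have to register a prefix for that namespace and use it on QueryResponse and Results for your XPath to work.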
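A minimal sketch of that fix, assuming a trimmed-down copy of the response inlined as a string so the example is self-contained. The prefix `n` is an arbitrary choice made here for illustration; any prefix works as long as it is bound to the same namespace URI.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Trimmed-down copy of the response (assumption: inlined instead of output.xml).
my $xml = <<'XML';
<QueryResponse xmlns="http://www.exchangenetwork.net/schema/node/2">
  <Results>
    <SRS:SubstanceInformation xmlns:SRS="http://www.exchangenetwork.net/schema/SRS/3">
      <SRS:ChemicalSubstance>
        <SRS:ChemicalSubstanceIdentification>
          <SRS:CASRegistryNumber>1000-82-4</SRS:CASRegistryNumber>
        </SRS:ChemicalSubstanceIdentification>
      </SRS:ChemicalSubstance>
    </SRS:SubstanceInformation>
  </Results>
</QueryResponse>
XML

my $dom = XML::LibXML->load_xml(string => $xml);
my $xpc = XML::LibXML::XPathContext->new($dom);

# 'n' is a prefix picked for this sketch; it maps to the document's default namespace.
$xpc->registerNs('n',   'http://www.exchangenetwork.net/schema/node/2');
$xpc->registerNs('SRS', 'http://www.exchangenetwork.net/schema/SRS/3');

my @idents = $xpc->findnodes(
    '/n:QueryResponse/n:Results/SRS:SubstanceInformation'
  . '/SRS:ChemicalSubstance/SRS:ChemicalSubstanceIdentification'
);

for my $ident (@idents) {
    # Relative XPath, evaluated with the matched node as context.
    my $cas = $xpc->findvalue('SRS:CASRegistryNumber', $ident);
    print "CAS: $cas\n";
}
```

Going through the XPathContext for the relative lookup as well keeps the registered prefixes available when reading child elements.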

Alternatively, if there's only a single SRS:SubstanceInformation in Results anyway, you might just skip over QueryResponse and Results via //:

//SRS:SubstanceInformation/SRS:ChemicalSubstance/SRS:ChemicalSubstanceIdentification
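To see the difference between the two expressions, here is a small sketch that runs both against a cut-down document (an assumption made for brevity; the real response has more children). Only the SRS prefix is registered, as in the question.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Cut-down document with the same namespace layout as the real response.
my $dom = XML::LibXML->load_xml(string => <<'XML');
<QueryResponse xmlns="http://www.exchangenetwork.net/schema/node/2">
  <Results>
    <SRS:SubstanceInformation xmlns:SRS="http://www.exchangenetwork.net/schema/SRS/3">
      <SRS:ChemicalSubstance>
        <SRS:ChemicalSubstanceIdentification/>
      </SRS:ChemicalSubstance>
    </SRS:SubstanceInformation>
  </Results>
</QueryResponse>
XML

my $xpc = XML::LibXML::XPathContext->new($dom);
$xpc->registerNs('SRS', 'http://www.exchangenetwork.net/schema/SRS/3');

my $tail = 'SRS:SubstanceInformation/SRS:ChemicalSubstance/SRS:ChemicalSubstanceIdentification';

# Unprefixed QueryResponse/Results match only no-namespace elements, so this finds nothing.
my @absolute = $xpc->findnodes("/QueryResponse/Results/$tail");

# '//' starts the match at any depth, so the unprefixed steps are never needed.
my @descendant = $xpc->findnodes("//$tail");

printf "absolute: %d, descendant: %d\n", scalar @absolute, scalar @descendant;
# prints "absolute: 0, descendant: 1"
```

Note that `//` searches the whole document, so it would also pick up SRS:SubstanceInformation elements outside Results if the response ever contained any.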