2
votes

I need to get the child node data values of nodes with a given name in an XML file, using a Perl script. I am using XML::LibXML::Simple.

A code snippet is shown below:

my $booklist = XMLin(path);

  foreach my $book (@{$booklist->{detail}}) {
    print $book->{name} . "\n";
}

And the XML file looks like the following:

<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
<book>
<detail label='label1' status='active' type='none'>
<name>book1</name>
</detail >
<detail label='label2' status='active' type='none'>
<name>book2</name>
</detail >
</book>
</booklist>

When I use the above code, I got the following error message: "Not an ARRAY reference"

Can anyone please help me ?

5
are you like? book1, book2, textuser1811486
can you please explain what you which output?user1811486
Yes I want book1 and book2 text as outputuser2526936

5 Answers

2
votes

Below a solution for XML::Simple, which was used in the OP.

use strict;
use warnings;
use XML::Simple;

my $booklist = XMLin($ARGV[0], KeyAttr => [], ForceArray => qr/detail/);

foreach my $book (@{$booklist->{book}->{detail}}) {
    print $book->{name} . "\n";
}

The important piece here are the options given to XMLin, forcing the "detail" subnodes to be represented as an array.

A good quick start for XML::Simple is the documentation on CPAN: http://metacpan.org/pod/XML::Simple

1
votes

I think like this....

use strict;
use XML::Twig;

my $text = join '', <DATA>;
my $story_file = XML::Twig->new(
                twig_handlers =>{
                'name' => \&name,
                keep_atts_order => 1,
},
                pretty_print => 'indented',
);
$story_file->parse($text);

sub name {
        my ($stroy_file, $name) = @_;
    print $name->text, "\n";
}

__END__
<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
<book>
<detail label='label1' status='active' type='none'>
<name>book1</name>
</detail >
<detail label='label2' status='active' type='none'>
<name>book2</name>
</detail >
</book>
</booklist>
1
votes

From the XML::Simple docs:

The use of this module in new code is discouraged. Other modules are available which provide more straightforward and consistent interfaces. In particular, XML::LibXML is highly recommended.

The major problems with this module are the large number of options and the arbitrary ways in which these options interact - often with unexpected results.

Anyway.

In your code, you are skimming over the fact that the booklist contains books which contain details. The booklist has no immediate details. Here is a short solution using XML::LibXML:

use strict; use warnings; use 5.010; use XML::LibXML;

my $dom = XML::LibXML->load_xml(IO => \*DATA) or die "Can't load";

for my $detail ($dom->findnodes('/booklist/book/detail')) {
    say $detail->findvalue('./name');
}

__DATA__
<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
  <book>
    <detail label='label1' status='active' type='none'>
      <name>book1</name>
    </detail >
    <detail label='label2' status='active' type='none'>
      <name>book2</name>
    </detail >
  </book>
</booklist>

As you can see in the XPATH expression /booklist/book/detail, we first have to look into the book before finding the details. Of course, this could be shortened to //detail.

In general, if a data structure isn't that what it seems, you should dump it, e.g.

use Data::Dumper;
print Dumper $booklist;

This would output:

$VAR1 = {
  'book' => {
    'detail' => {
      'book2' => {
        'status' => 'active',
        'type' => 'none',
        'label' => 'label2'
      },
      'book1' => {
        'status' => 'active',
        'type' => 'none',
        'label' => 'label1'
      }
    }
  }
};

So for some fucked up reason, the book1 and book2 strings are now keys in a nested hash. Do yourself a favour, and stop using the most complicated XML module on CPAN, the “XML::Simple”.

1
votes

When you write:

@{ $booklist->{detail} }

...you are saying that $booklist->{detail} returns an array reference, and you want perl to dereference it into an array, i.e. the '@'.

Don't use <name> as a tag. XML::Simple parses that weirdly. Here is an example:

1)

<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
  <book>
      <bname>book1</bname>
  </book>
  <book>
      <bname>book2</bname>
  </book>
</booklist>

use strict;   
use warnings;   
use 5.016;  

use XML::Simple;
use Data::Dumper;



my $booklist = XMLin('xml.xml');
print Dumper($booklist);


--output:--

$VAR1 = {
          'book' => [
                    {
                      'bname' => 'book1'
                    },
                    {
                      'bname' => 'book2'
                    }
                  ]
        };

2) Now look at what happens when you use a <name> tag:

<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
  <book>
      <name>book1</bname>
  </book>
  <book>
      <name>book2</bname>
  </book>
</booklist>

--output:--
$VAR1 = {
          'book' => {
                    'book2' => {},
                    'book1' => {}
                  }
        };

So with your original example:

<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
  <book>

    <detail label='label1' status='active' type='none'>
      <bname>book1</bname>
    </detail>

    <detail label='label2' status='active' type='none'>
      <bname>book2</bname>
    </detail>

  </book>
</booklist>


--output:--
$VAR1 = {
          'book' => {
                    'detail' => [
                                {
                                  'bname' => 'book1',
                                  'status' => 'active',
                                  'label' => 'label1',
                                  'type' => 'none'
                                },
                                {
                                  'bname' => 'book2',
                                  'status' => 'active',
                                  'label' => 'label2',
                                  'type' => 'none'
                                }
                              ]
                  }
        };

And to get all the bname tags, you can do this:

use strict;   
use warnings;   
use 5.016;  

use XML::Simple;
use Data::Dumper;

my $booklist = XMLin('xml.xml');
my $aref = $booklist->{book}{detail};

for my $href (@$aref) {
    say $href->{bname};
}


--output:--
book1
book2
0
votes

Yet another way using XML::Rules (assuming the point is to get stuff in 'detail' rather than just print content of 'name'):

use XML::Rules;
my @rules = (
  detail => sub {
    print "$_[1]{name}\n";
    return;
  },
  name => 'content',
  _default => undef,
);

my $xr = XML::Rules->new(rules => \@rules);
$xr->parsefile("tmp.xml");