1
votes

I want to get a list of the top-level child nodes of an HTML string. Using PHP's DOMDocument, I tried the following:

$html = new DomDocument();
$html->loadHTML('<p>One</p><p>Two</p><p>Three</p>');
foreach( $html->childNodes as $node ) {
    echo $node->nodeName . ':' . $node->nodeValue. '<br>';
}

Unfortunately, the output I get is

html:
html:OneTwoThree

What I want is something like

paragraph: One
paragraph: Two
paragraph: Three

Am I missing something? The PHP documentation isn't much help. I tried this on PHPTester with several PHP versions and still got the same result.

2 Answers

1
votes

Remember that DOMDocument builds an entire DOM document, not just a fragment of one, so your p elements end up inside the body element. Iterate over the body's child nodes instead:

foreach( $html->getElementsByTagName('body')->item(0)->childNodes as $node ) {
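
For completeness, here is a minimal, self-contained sketch of that approach, assuming the same input string as in the question. The `$lines` array and the element-only filter are my additions for illustration; they skip any text or comment nodes that might sit between the elements.

```php
<?php
// Sketch: list only the top-level elements of an HTML snippet.
$doc = new DOMDocument();
$doc->loadHTML('<p>One</p><p>Two</p><p>Three</p>');

// loadHTML() wraps the snippet in <html><body>...</body></html>,
// so the snippet's top-level nodes are the children of <body>.
$body = $doc->getElementsByTagName('body')->item(0);

$lines = []; // hypothetical name, used here to collect the output
if ($body !== null) {
    foreach ($body->childNodes as $node) {
        // Keep element nodes only; skip text and comment nodes.
        if ($node->nodeType === XML_ELEMENT_NODE) {
            $lines[] = $node->nodeName . ':' . $node->nodeValue;
        }
    }
}

echo implode("\n", $lines), "\n";
// p:One
// p:Two
// p:Three
```

The null check matters only if the parser produces no body element, for example when the input is empty.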
1
votes

You can use the getElementsByTagName() method:

$html = new DomDocument();
$html->loadHTML('<html><p>One</p><p>Two</p><p>Three</p></html>');
$nodes = $html->getElementsByTagName('p');
foreach($nodes as $node) {
    echo $node->nodeName . ':' . $node->nodeValue. '<br>';
}

// The above results in:
// p:One
// p:Two
// p:Three

I hope that's equivalent for your purposes.
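
One caveat: getElementsByTagName() matches elements at any depth, not just the top level. If the input may contain nested paragraphs, a DOMXPath query restricted to the direct children of body is one possible alternative. The sketch below uses a made-up input string to show the difference, and it assumes loadHTML() has wrapped the snippet in html and body elements, as it does for fragments like these.

```php
<?php
// Sketch: compare "any depth" matching with "direct children of <body>".
$doc = new DOMDocument();
$doc->loadHTML('<div><p>Nested</p></div><p>Top</p>'); // illustrative input

// Matches every <p> in the document, including the one inside <div>.
$allParagraphs = $doc->getElementsByTagName('p');

// Matches only <p> elements that are direct children of <body>.
$xpath = new DOMXPath($doc);
$topLevel = $xpath->query('/html/body/p');

echo 'any depth: ', $allParagraphs->length, "\n";
echo 'top level: ', $topLevel->length, "\n";

foreach ($topLevel as $node) {
    echo $node->nodeName . ':' . $node->nodeValue . "\n";
}
```

Changing the query to `/html/body/*` would return every top-level element regardless of tag name.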