Html Agility Pack empty Nodes

Question

<?xml version="1.0" encoding="UTF-8"?>
<div class="contentLeft">

<h1>Hello</h1>

    <ul id="resultlist" class="stories">
        <li>
          
        </li>
    </ul>

</div>

I have the following XML file, and I would like to read the "li" entry as follows:

var doc = new HtmlDocument();
doc.Load(path);

var query = "//div[contains(@class,'contentLeft')]//ul";
var childNodes = doc.DocumentNode.SelectSingleNode(query).ChildNodes;

Now I should have one entry in the list, but I have three!

I only expect the "li" entry. Does anyone know where the two "#text" entries come from?

Here is the dotnetfiddle.net link to my problem:

DotNetfiddle.net

Hung Cao · Accepted Answer · 2018-01-03T19:56:51

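The two "#text" entries are the whitespace (line breaks and indentation) between the ul and li tags, which Html Agility Pack keeps as text nodes. There are several ways to skip them:

Modify your XPath so that it selects the li elements directly, for example by appending /li to your query and calling SelectNodes.
Use LINQ: nodes.ChildNodes.Where(_ => _.NodeType != HtmlNodeType.Text); or nodes.ChildNodes.Where(_ => _.Name.Equals("li")).

I don't remember exactly, but one of them should work.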
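To make the options above concrete, here is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced. It loads the markup from the question inline via LoadHtml instead of from a file, and the variable names (ul, viaXPath, viaLinq, viaElements) are only illustrative.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Same markup as in the question, inlined so the sketch is self-contained.
        var html = @"<div class=""contentLeft"">

<h1>Hello</h1>

    <ul id=""resultlist"" class=""stories"">
        <li>

        </li>
    </ul>

</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var ul = doc.DocumentNode.SelectSingleNode("//div[contains(@class,'contentLeft')]//ul");

        // ChildNodes also contains the whitespace around <li> as "#text" nodes,
        // which is where the extra entries in the question come from.
        foreach (var node in ul.ChildNodes)
            Console.WriteLine(node.Name);

        // Option 1: let XPath select only the li elements.
        // Note: SelectNodes returns null when nothing matches.
        var viaXPath = doc.DocumentNode.SelectNodes("//div[contains(@class,'contentLeft')]//ul/li");

        // Option 2: filter the child nodes with LINQ, as in the answer.
        var viaLinq = ul.ChildNodes
            .Where(n => n.NodeType != HtmlNodeType.Text)
            .ToList();

        // Option 3: ask the node for its child elements by name.
        var viaElements = ul.Elements("li").ToList();

        Console.WriteLine(viaXPath?.Count ?? 0);
        Console.WriteLine(viaLinq.Count);
        Console.WriteLine(viaElements.Count);
    }
}
```

Note that filtering on NodeType != HtmlNodeType.Text would still keep comment nodes; if the list may contain comments, comparing against HtmlNodeType.Element or filtering by name is the stricter choice.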
