
Hi, I want to parse a table, but I can't get all of the information out of it.

I used the following code, but it does not return the href links:

HtmlNode table = doc.DocumentNode.SelectSingleNode("//table[1]//tbody");
foreach (var cell in table.SelectNodes(".//tr/td"))
{
    string someVariable = cell.InnerText;
    Debug.WriteLine(someVariable);
}

I need to get the href as well. How can I do this? Here is the HTML:

<table>
    <tbody>
    <tr>
    <td class="a1">
    <a href="/subtitles/joker-2019/farsi_persian/2110062">
    <span class="l r positive-icon">
    Farsi/Persian
    </span>
    <span>
    Joker.2019.WEBRip.XviD.MP3-SHITBOX
    </span>
    </a>
    </td>
    <td class="a3">
    </td>
    <td class="a40">
    &nbsp;
    </td>
    <td class="a5">
    <a href="/u/695804">
    meisam_t72
    </a>
    </td>
    <td class="a6">
    <div>
    ►► زیرنویس از میثم ططری - ویرایش شده ◄◄ - meisam_t72 کانال تلگرام&nbsp; </div>
    </td>
    </tr>
    </tbody>
    </table>

1 Answer


InnerText returns only the text content of a node, so attribute values such as href never appear in it. Inside your foreach, check whether the cell contains an <a> tag. If it does, read the href attribute from that tag.

Something like this (untested):

foreach (var cell in table.SelectNodes(".//tr/td"))
{
    string someVariable = cell.InnerText;
    Debug.WriteLine(someVariable);

    // SelectNodes returns null when nothing matches, so check before iterating.
    var links = cell.SelectNodes(".//a");
    if (links == null)
    {
        continue;
    }

    foreach (var link in links)
    {
        // GetAttributeValue returns the fallback instead of throwing when href is missing.
        var href = link.GetAttributeValue("href", string.Empty);
        // do whatever you want with the link.
    }
}
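
If you only care about the links, another option is to select the <a> elements directly instead of walking every cell. The following is a minimal sketch, not tested against your real page: the class name LinkExtractor and the method ExtractLinks are made up for illustration, and it assumes the HtmlAgilityPack package is referenced. It returns each link's text together with its href.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

// Hypothetical helper, named here only for illustration.
public static class LinkExtractor
{
    // Returns (text, href) pairs for every <a href="..."> found in a table cell.
    public static List<(string Text, string Href)> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<(string Text, string Href)>();

        // The [@href] predicate skips anchors without an href attribute.
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//table[1]//td//a[@href]");
        if (anchors == null)
        {
            return results;
        }

        foreach (var a in anchors)
        {
            string href = a.GetAttributeValue("href", string.Empty);

            // Decode entities such as &nbsp; and collapse runs of whitespace,
            // because InnerText keeps the line breaks and indentation of the markup.
            string text = Regex.Replace(HtmlEntity.DeEntitize(a.InnerText), @"\s+", " ").Trim();

            results.Add((text, href));
        }

        return results;
    }
}
```

You would call it as LinkExtractor.ExtractLinks(doc.DocumentNode.OuterHtml), or adapt the body to work on your existing doc. Note that the hrefs in your sample are site-relative (they start with "/"), so you will probably need to combine them with the site's base address before requesting them.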