
I'm new to Python and BeautifulSoup. How can I search for tags whose children have certain attributes? For example:

<section ...>
<a href="URL" ...>
<h4 itemprop="name">ABC</h4>
<p class="open"></p>
</a>
</section>

I would like to get every name ('ABC') and URL ("URL") where the <p> child has class="open". I can get all sections with

soup.findAll(lambda tag: tag.name == "section")

But I don't know how to add other conditions since tag.children is a listiterator.


1 Answer


Because the condition you care about is an attribute on the <p> tags, I would search only for <p> tags with attrs={"class": "open"}, then step up to each one's parent (the <a> tag) and read the name and URL from there.

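from bs4 import BeautifulSoup

# `data` is the HTML source as a string
soup = BeautifulSoup(data, "html.parser")
# Find every <p class="open">, then read the name and URL from its parent <a>
items = soup.find_all("p", attrs={"class": "open"})
for item in items:
    name = item.parent.h4.text
    url = item.parent.attrs.get('href', None)
    print("{} : {}".format(name, url))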
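If you would rather keep the filter-function approach from the question, here is a rough sketch of that alternative. It does not need tag.children at all, because tag.find() searches all descendants. The sample markup, the second "closed" section, and the helper name has_open_child are made up for illustration, and the sketch assumes each <section> contains a single <a>.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup modelled on the question;
# the second section has no "open" marker and should be skipped.
html = """
<section>
  <a href="URL1"><h4 itemprop="name">ABC</h4><p class="open"></p></a>
</section>
<section>
  <a href="URL2"><h4 itemprop="name">DEF</h4><p class="closed"></p></a>
</section>
"""

def has_open_child(tag):
    # Match <section> tags that contain a <p class="open"> anywhere inside them.
    return tag.name == "section" and tag.find("p", class_="open") is not None

soup = BeautifulSoup(html, "html.parser")
results = []
for section in soup.find_all(has_open_child):
    link = section.find("a")  # assumes one <a> per section
    name = link.find("h4", attrs={"itemprop": "name"}).get_text(strip=True)
    results.append((name, link.get("href")))

print(results)
```

If a section can hold several links, starting from the <p> tags and walking up, as in the code above, is probably the safer choice, since it never pairs an "open" marker with the wrong <a>.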