I am trying to retrieve the contents of a list of bs4 tag elements.

The list looks, for example, like this:

list_titles = [tag, tag, tag]

Where each tag is described by the following structure:

list_titles[i] = <meta content="first title" name="title"/>

I need to retrieve the title stored in each tag's content attribute, so I tried the following:

content_list = []
for title in list_titles:
    content = title['content']
    content_list.append(content)

And also tried the following:

for i, title in enumerate(list_titles):
    test = list_titles[i]
    content = test['content']

Both give the error "'NoneType' object is not subscriptable". What is the correct way to get the content of each bs4 tag?

2 Answers

0
votes

Your method is correct to the best of my knowledge, particularly the first one, since you can access a tag's attributes by treating the element as a dictionary. The error means that at least one entry in your list is None rather than a tag. Most likely the element you are trying to retrieve does not exist, or you are not using the right selectors.
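As a rough illustration of how a None entry can end up in the list and how to guard against it, here is a minimal sketch. The HTML and the "author" lookup are made up for the example, and it assumes the list was built with find(), which returns None when nothing matches; your list may have been built differently.

```python
from bs4 import BeautifulSoup

# Hypothetical input: only one <meta> tag has name="title".
html = """
<meta content="first title" name="title"/>
<meta name="description"/>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns None when no tag matches, so the second entry is None.
# attrs={...} is used because "name" is also find()'s first parameter.
list_titles = [
    soup.find("meta", attrs={"name": "title"}),
    soup.find("meta", attrs={"name": "author"}),  # no such tag -> None
]

content_list = []
for title in list_titles:
    if title is None:
        # Subscripting None is what raises
        # "'NoneType' object is not subscriptable".
        continue
    # .get() returns None instead of raising KeyError if the attribute is missing.
    content = title.get("content")
    if content is not None:
        content_list.append(content)

print(content_list)
```

Printing the list itself before the loop (print(list_titles)) is a quick way to check whether any entry is None.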

0
votes

The content attribute is available on each meta by using the following syntax:

print(tag['content'])

Consider the following example snippet:

from bs4 import BeautifulSoup

html = """
<meta content="first title" name="title 1"/>
<meta content="second title" name="title 2"/>
"""

soup = BeautifulSoup(html, 'html.parser')

for meta in soup.find_all('meta'):
    print(meta['content'])

This will print

first title
second title
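If the page also has meta tags without a content attribute, meta['content'] raises a KeyError for those. One possible variation, sketched here with an extra made-up charset tag, is to ask find_all only for tags that actually carry the attribute and collect the values in a list:

```python
from bs4 import BeautifulSoup

# The third tag is a hypothetical addition with no content attribute.
html = """
<meta content="first title" name="title 1"/>
<meta content="second title" name="title 2"/>
<meta charset="utf-8"/>
"""

soup = BeautifulSoup(html, "html.parser")

# content=True matches only <meta> tags that have a content attribute,
# so the charset tag is skipped and the subscript cannot fail.
titles = [meta["content"] for meta in soup.find_all("meta", content=True)]
print(titles)
```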