4
votes

I've been trying to learn Python (currently requests and beautifulsoup4), and I found a tutorial online.

The issue is I keep getting the below error and cannot figure it out at all...

Any help would be appreciated!

Traceback (most recent call last):
  File "C:\Users\BillyBob\Desktop\Web Scrap.py", line 14, in <module>
    title = a.string.strip()
AttributeError: 'NoneType' object has no attribute 'strip'

Here is my code, in case I made a mistake:

import requests
from bs4 import BeautifulSoup

result = requests.get("http://www.oreilly.com/")

c = result.content

soup = BeautifulSoup(c, "html.parser")
samples = soup.find_all("a")
samples[0]

data = {}
for a in samples:
    title = a.string.strip()
    data[title] = a.attrs['href']
2
The string attribute of a is None. You need to look over the documentation for BeautifulSoup and see what .find_all() returns. - Christian Dean

2 Answers

5
votes

From BS4 documentation:

If a tag contains more than one thing, then it’s not clear what .string should refer to, so .string is defined to be None

I believe you can use .text to get what you want:

title = a.text.strip()
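
To make the quoted rule concrete, here is a minimal sketch comparing .string with .get_text() (the method behind .text). The HTML snippet is made up for illustration and is not taken from the O'Reilly page, so the real page may behave differently in detail.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; not copied from oreilly.com.
html = """
<a href="/one">Plain link</a>
<a href="/two"><span>Nested</span> link</a>
<a href="/three"><img src="logo.png"></a>
"""

soup = BeautifulSoup(html, "html.parser")
links = soup.find_all("a")

# .string is only set when the tag has exactly one text child.
strings = [a.string for a in links]

# .get_text() joins all nested text, and returns "" when there is none.
texts = [a.get_text().strip() for a in links]

print(strings)
print(texts)
```

Only the first link has a usable .string here; the other two give None, which is what triggers the AttributeError when .strip() is called on it. Note that .text never returns None, but it can return an empty string, so you may still want to skip links with no visible text.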
4
votes

The first member of samples does not contain a single string child, so a.string is None rather than a string. Calling strip() on None is what raises the AttributeError.

However, then you have another problem; it is not necessarily true that a has the href attribute. Instead, you should check explicitly for both, or else you will get errors (which is the problem with Yevhen's answer, which is otherwise correct).

One potential fix to your problem is to write:

for a in samples:
    # .string is None when the tag has no single text child
    if a.string is not None:
        title = a.string.strip()
        # not every <a> tag carries an href attribute
        if 'href' in a.attrs:
            data[title] = a.attrs['href']

This way, you explicitly check that each value exists before you use it.
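
If you also want to keep links whose text sits inside nested tags, a variation is to combine both checks with .get_text() and Tag.get(). This is only a sketch under assumptions: collect_links and sample_html are names I made up, and the markup is invented so the example runs without a network request.

```python
from bs4 import BeautifulSoup


def collect_links(html):
    """Map each link's visible text to its href, skipping incomplete links."""
    soup = BeautifulSoup(html, "html.parser")
    data = {}
    for a in soup.find_all("a"):
        title = a.get_text().strip()  # "" when the link has no text
        href = a.get("href")          # None when the attribute is missing
        if title and href is not None:
            data[title] = href
    return data


# Hypothetical markup for illustration only.
sample_html = """
<a href="/books">Books</a>
<a href="/video"><span>Video</span> courses</a>
<a name="top">Anchor without href</a>
<a href="/logo"><img src="logo.png"></a>
"""

links = collect_links(sample_html)
print(links)
```

The trade-off compared with checking a.string is that nested text such as "Video courses" is kept instead of skipped. Links with the same text will still overwrite each other in the dict, just as in the original code.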