
I am new to Python and am trying to build a web scraper, but I ran into the following error. This is not the complete code, but I cannot continue until this issue is resolved.

Any help will be highly appreciated!

AttributeError: 'NoneType' object has no attribute 'get_text'

import requests
from bs4 import BeautifulSoup

url = "https://www.amazon.co.uk/b?node=13978643031&pf_rd_r=7WY9X56GFTSX0ZTD0VQQ&pf_rd_p=7510143e-2d7f-4e64-a435-f4e242b0abc4"
headers = {
    "user-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36"}
price = 389


def getPrice():
    page = requests.get(url, headers=headers)
    soup = BeautifulSoup(page.content, 'html.parser')
    title = soup.find(id="productTitle").get_text().strip()
    print(title)


if __name__ == "__main__":
    getPrice()

Error:

Traceback (most recent call last):
  File "/Users/sumeet/vs_code_py/app.py", line 18, in <module>
    getPrice()
  File "/Users/sumeet/vs_code_py/app.py", line 13, in getPrice
    title = soup.find(id="productTitle").get_text().strip()
AttributeError: 'NoneType' object has no attribute 'get_text'

Does this answer your question? Web-scraping JavaScript page with Python - ggorlen
Do you know which line is raising the error? - Marco
Do you know how to use the Python debugger? That's the most important skill you have to learn right now! - Peter
@ggorlen Thanks for the suggestion but I'm a noob and it is quite hard to understand that solution. - Suumeet Singh
@Marco Yes, here's the info. Traceback (most recent call last): File "/Users/sumeet/vs_code_py/app.py", line 18, in <module> getPrice() File "/Users/sumeet/vs_code_py/app.py", line 13, in getPrice title = soup.find(id="productTitle").get_text().strip() AttributeError: 'NoneType' object has no attribute 'get_text' - Suumeet Singh

1 Answer


The likely cause is that soup.find() found no element with id="productTitle" and returned None, so the code is calling get_text() on None rather than on an element. You need to add some logic to accommodate this possibility, using either if/else or try/except. (This assumes the same code works on other pages. If it does not, perhaps you aren't searching for the right thing.)
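As a minimal sketch of the if/else approach, the lookup can be split from the get_text() call so that a missing element is handled explicitly. The helper name get_title and the two inline HTML strings are made up for illustration, so that the example runs without any network request:

```python
from typing import Optional

from bs4 import BeautifulSoup


def get_title(html: str) -> Optional[str]:
    """Return the stripped text of the element with id="productTitle", or None."""
    soup = BeautifulSoup(html, "html.parser")
    element = soup.find(id="productTitle")
    if element is None:
        # find() returns None when nothing matches, so stop before get_text().
        return None
    return element.get_text().strip()


# Hypothetical sample documents, not real Amazon markup.
with_title = '<span id="productTitle">  Example product  </span>'
without_title = "<p>No title here</p>"

print(get_title(with_title))     # Example product
print(get_title(without_title))  # None
```

In the original getPrice(), the same check would go between the soup.find(...) call and .get_text(), printing a message instead of the title when the element is missing.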

Also, as flagged in the comments, if the element you are searching for is created by JavaScript (which it looks like it is), then you aren't finding it because it doesn't exist yet in the HTML you are scraping. You need something like Selenium, which will actually execute the JavaScript.
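To see why a script-generated element is invisible to BeautifulSoup, here is a small illustration on a made-up page (STATIC_HTML is hypothetical, not the real Amazon response). The id only appears inside the script source, which BeautifulSoup parses as text but never runs:

```python
from bs4 import BeautifulSoup

# Hypothetical page: the element with id="productTitle" is only created
# when a browser executes the script below.
STATIC_HTML = """
<html><body>
  <div id="container"></div>
  <script>
    var span = document.createElement("span");
    span.id = "productTitle";
    span.textContent = "Example product";
    document.getElementById("container").appendChild(span);
  </script>
</body></html>
"""

soup = BeautifulSoup(STATIC_HTML, "html.parser")

# The script is never run, so no such element exists in the parsed tree.
found = soup.find(id="productTitle")
print(found)  # None

# The string is present in the raw source, but only as JavaScript text.
print("productTitle" in STATIC_HTML)  # True
```

A similar check on the real response (for example, searching page.text for the id) can help tell a JavaScript-rendered element apart from one that is simply not on that page.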