I'm using requests and BeautifulSoup to parse the response content of a URL.

But when I parse the response and look for the title using soup.find('title'), I get nothing back, not even an error.

The print statement above the title lookup is executed, but the one inside the if and the one after the if are not.

import requests, os
from bs4 import BeautifulSoup
lis=[
    'https://oxhp-member-elr.uhc.com/Member/MemberPortal/'
    ]
for element in lis:
    resp = requests.get(element)
    if resp.status_code == 200:
        cont = resp.content.decode('UTF-8')
        try:
            soup = BeautifulSoup(cont, "html.parser")
            print('Now')
            if soup.findAll('title')[0].get_text() is None:
                print('Hi')
            print('after if')
            print(element.ljust(element_length), resp.status_code, soup.find('title').text)
        except:
            pass

I also tried soup.find('title').text, but that didn't work either.

Can anyone let me know what's wrong with my code?

Remove the try - except and post the error message. - Daniel
Could also be that the request is invalid and it's returning something other than 200 - Jab

1 Answer

You are catching the exception with the try block and then doing nothing with it (just pass), which is why you never see an error message. If an error occurs outside a try block, the default behaviour is to stop the program and print a stack trace. If an error occurs inside a try block, execution jumps to the except block, and what happens next is up to you. No error message is printed automatically.
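As a minimal sketch of that difference, the two functions below trigger the same IndexError. The names silent and reported are made up for this illustration and are not part of your code:

```python
def silent():
    """Mimics the pattern in the question: the error is swallowed."""
    try:
        [][0]  # indexing an empty list raises IndexError
        return "reached"  # never executed
    except:
        pass  # nothing is printed, so the failure is invisible
    return "swallowed"


def reported():
    """Same failure, but the except block surfaces the error."""
    try:
        [][0]
    except Exception as error:
        return f"{type(error).__name__}: {error}"


print(silent())
print(reported())
```

In the first function the only sign that something went wrong is that the code after the failing line never runs, which matches what you are seeing.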

If you print the error, or print the soup object inside the loop, you can see what is going on:

    try:
        soup = BeautifulSoup(cont, "html.parser")
        print('Now')

        # Print the soup object
        print(soup)
        if soup.findAll('title')[0].get_text() is None:
            print('Hi')
        print('after if')
        #print(element.ljust(element_length), resp.status_code, soup.find('title').text)
    except Exception as error:
        # Handle the exception with some information.
        print(error)
        pass

This prints

Sorry, we are unable to process your request at this time.

for the print(soup) statement, and the error message is:

list index out of range

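Basically, the server is not returning the page you expect: the response body is just that message, with no title element in it. soup.findAll('title') therefore returns an empty list, and accessing it with [0] in your if statement raises the IndexError.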
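One way to guard against this is to check the result of find() before using it, since find() returns None when nothing matches. The following is only a sketch: get_title is a hypothetical helper, and the two HTML strings are made-up stand-ins for the server's responses (the "Member Portal" title in particular is invented):

```python
from bs4 import BeautifulSoup


def get_title(html):
    """Return the page title text, or None if there is no title element."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.find("title")  # None when the document has no <title>
    if title is None:
        return None
    return title.get_text(strip=True)


# Stand-in for the body the server actually returned.
error_page = "Sorry, we are unable to process your request at this time."
# Stand-in for a normal page; the title text is invented for this example.
normal_page = "<html><head><title>Member Portal</title></head><body></body></html>"

print(get_title(error_page))
print(get_title(normal_page))
```

With a check like this, the loop can report which URLs came back without a title instead of failing on them.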