1
votes

I'm trying to catch error 60 and continue the execution of my script. Here is what I am doing at the moment:

import urllib2
import csv
from bs4 import BeautifulSoup


matcher = csv.reader(open('matcher.csv', "rb" ))

for i in matcher:
    url = i[1]
    if len(list(url)) > 0:
        print url
        try:
            soup = BeautifulSoup(urllib2.urlopen(url,timeout=10))   

        except urllib2.URLError, e:
            print ("There was an error: %r" % e)

It returns this:

Traceback (most recent call last):
  File "debug.py", line 13, in <module>
    soup = BeautifulSoup(urllib2.urlopen(url,timeout=10))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 400, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 418, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1207, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1180, in do_open
    r = h.getresponse(buffering=True)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1030, in getresponse
    response.begin()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 447, in readline
    data = self._sock.recv(self._rbufsize)
socket.timeout: timed out

How would I catch this error and "continue"?

2
Take a look at this. (inspectorG4dget)

2 Answers

1
votes

You can use except Exception as e: to catch every error. Keep in mind that this also swallows errors you did not anticipate, so avoid it if you only want to handle specific ones.

Edit: you can check the exception type by doing:

import os
import sys
try:
    soup = BeautifulSoup(urllib2.urlopen(url, timeout=10))
except Exception as e:
    exc_type, exc_obj, exc_tb = sys.exc_info()
    fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
    print(exc_type, fname, exc_tb.tb_lineno)
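Once you know the exception class, you can name it in a narrower except clause. As a rough sketch of the same inspection using the traceback module (describe_exception is a made-up helper name for this example, not part of any library):

```python
import sys
import traceback


def describe_exception():
    """Return (class name, file name, line number) for the exception being handled.

    Hypothetical helper; call it only from inside an except block.
    """
    exc_type, exc_obj, exc_tb = sys.exc_info()
    # extract_tb lists frames from outermost to innermost; the last entry
    # is the Python frame closest to where the exception was raised.
    frame = traceback.extract_tb(exc_tb)[-1]
    filename, lineno = frame[0], frame[1]
    return exc_type.__name__, filename, lineno


try:
    int("not a number")
except Exception:
    name, filename, lineno = describe_exception()
    print("%s raised at %s:%d" % (name, filename, lineno))
```

Unlike exc_tb.tb_lineno, which reports the line inside the try block, this reports the innermost Python frame, which is usually closer to the real cause.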
4
votes

You could import the socket module and add an except clause for socket.timeout:

import socket

try:
    soup = BeautifulSoup(urllib2.urlopen(url, timeout=10))
except urllib2.URLError as e:
    print("There was an error: %r" % e)
except socket.timeout as e:  # <-------- this block here
    print("We timed out")
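To tie this back to the loop in the question, here is a minimal sketch of skipping a failed URL and moving on to the next one. The function name fetch_all and its opener parameter are assumptions made for this example (the opener is injectable only so the loop can be tried without network access), and the import fallback is there so the sketch also runs on Python 3:

```python
import socket

try:
    from urllib2 import URLError, urlopen  # Python 2, as in the question
except ImportError:
    from urllib.error import URLError  # Python 3 equivalents
    from urllib.request import urlopen


def fetch_all(urls, opener=urlopen, timeout=10):
    """Return {url: body} for every URL that could be read; skip the rest."""
    pages = {}
    for url in urls:
        if not url:
            continue  # skip empty cells from the CSV
        try:
            pages[url] = opener(url, timeout=timeout).read()
        except socket.timeout:
            print("Timed out: %s" % url)
            continue  # move on to the next URL
        except URLError as e:
            print("There was an error: %r" % e)
            continue
    return pages
```

The explicit continue statements are not strictly needed here, since nothing follows the try block inside the loop, but they make the intent obvious if more work is added later.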

Update: I learnt something new. URLError has a .reason property, so you can check whether the underlying cause was a timeout:

except urllib2.URLError as e:
    if isinstance(e.reason, socket.timeout):
        pass # ignore this one
    else:
        # do stuff re other errors if you can...
        raise # otherwise propagate the error
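Note that the traceback in the question shows socket.timeout escaping without a URLError wrapper, so a check on .reason alone would miss that case. As a sketch, a small helper can cover both forms (is_timeout is a made-up name for this example, and the import fallback is only there so it also runs on Python 3):

```python
import socket

try:
    from urllib2 import URLError  # Python 2, as in the question
except ImportError:
    from urllib.error import URLError  # Python 3 equivalent


def is_timeout(exc):
    """True if exc is a socket timeout, raised directly or wrapped in URLError."""
    if isinstance(exc, socket.timeout):
        return True
    # URLError stores the underlying cause (an exception or a string) in .reason
    return isinstance(exc, URLError) and isinstance(exc.reason, socket.timeout)
```

You could then write except (URLError, socket.timeout) as e: and call is_timeout(e) to decide whether to skip the URL or re-raise.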