3
votes

What is the best way to fetch a list of URLs in Kaggle Kernels?

I tried first testing with google.com.

First Method: Using urllib.request

import urllib.request
resp = urllib.request.urlopen('http://www.google.com')

This led to a gaierror and urlopen error [Errno -2] Name or service not known.

Second Method: Using requests

import requests
resp = requests.get('http://www.google.com')

This led to the errors gaierror: [Errno -3] Temporary failure in name resolution and Failed to establish a new connection: [Errno -3] Temporary failure in name resolution.

import urllib.error
import urllib.request

req = urllib.request.Request('http://www.google.com')
print(req)

try:
    response = urllib.request.urlopen(req)
    print(response)
except urllib.error.URLError as e:
    # Raised when the host name cannot be resolved or the connection fails
    print(e.reason)
    print("something wrong")

Output:

<urllib.request.Request object at 0x7fed1d00c518>
[Errno -2] Name or service not known
something wrong

I tried the DNS resolution fix suggested in a Stack Overflow answer.

How can I work around this error? Why do urlopen and requests not work in Kaggle Kernels?
I have seen many kernels with the same errors: kernel 1, kernel 2, kernel 3.


2 Answers

10
votes

The reason this isn't working for you is that Kaggle Kernels currently don't have internet access. As a result, there's no way for you to make API calls that require a network connection from within Kernels.

Edit August 2018: Just FYI, we have now added internet access to Kernels. :) You can enable it in the left-hand sidebar from within the editor.
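
To check whether a Kernel can currently reach the network, one option is to test name resolution directly, since the errors in the question are DNS failures (gaierror). This is only a minimal sketch: the function name can_resolve and its resolver parameter are hypothetical and exist just to make the check easy to try without a network.

```python
import socket


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname resolves, False on a DNS failure.

    `resolver` is a hypothetical hook for illustration; by default it is
    the standard library's socket.getaddrinfo.
    """
    try:
        resolver(hostname, 80)
        return True
    except socket.gaierror:
        # Same error family as "[Errno -2] Name or service not known"
        return False
```

Calling can_resolve('www.google.com') should return False in a Kernel without internet access and True once it has been enabled.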

2
votes

Caveat: To use internet access, you need to enable it for your Kernel in its Settings menu. And to be able to do that, you must register once via mobile with Kaggle.
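
Once internet access is enabled, fetching a list of URLs could look roughly like the sketch below. It is not a Kaggle-specific API: fetch_all, its timeout default, and the opener parameter are hypothetical names chosen for illustration, and the loop simply records failures per URL instead of stopping at the first error.

```python
import urllib.error
import urllib.request


def fetch_all(urls, timeout=10, opener=urllib.request.urlopen):
    """Fetch each URL and return (results, failures), both keyed by URL.

    `opener` is a hypothetical hook for illustration; by default it is
    urllib.request.urlopen.
    """
    results, failures = {}, {}
    for url in urls:
        try:
            with opener(url, timeout=timeout) as response:
                results[url] = response.read()
        except OSError as e:
            # URLError (including DNS failures) is a subclass of OSError;
            # keep its reason when available, otherwise the exception itself.
            failures[url] = getattr(e, "reason", e)
    return results, failures
```

If the Kernel has no internet access, every URL ends up in failures with a reason such as "[Errno -2] Name or service not known", which makes the problem easy to spot.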