python - Scraping: SSL: CERTIFICATE_VERIFY_FAILED error for http://en.wikipedia.org

167

votes

I'm practicing the code from 'Web Scraping with Python', and I keep having this certificate problem:

from urllib.request import urlopen 
from bs4 import BeautifulSoup 
import re

pages = set()
def getLinks(pageUrl):
    global pages
    html = urlopen("http://en.wikipedia.org"+pageUrl)
    bsObj = BeautifulSoup(html)
    for link in bsObj.findAll("a", href=re.compile("^(/wiki/)")):
        if 'href' in link.attrs:
            if link.attrs['href'] not in pages:
                #We have encountered a new page
                newPage = link.attrs['href'] 
                print(newPage) 
                pages.add(newPage) 
                getLinks(newPage)
getLinks("")

The error is:

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1319, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1049)>

Btw,I was also practicing scrapy, but kept getting the problem: command not found: scrapy (I tried all sorts of solutions online but none works... really frustrating)

pythonweb-scrapingbeautifulsoupscrapyssl-certificate

urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1049)> – Catherine4j

and... please tell me the reason behind this error, really want to know~~thanks!! – Catherine4j

There are 529 existing questions on SSL: CERTIFICATE_VERIFY_FAILED, please figure out which is your solution then close this as duplicate. – smci

For example: “SSL: certificate_verify_failed” python? – smci

And I was about to comment the obvious: did you access it with https instead of http? – smci

604

votes

Once upon a time I stumbled with this issue. If you're using macOS go to Macintosh HD > Applications > Python3.6 folder (or whatever version of python you're using) > double click on "Install Certificates.command" file. :D

75

votes

to use unverified ssl you can add this to your code:

import ssl
ssl._create_default_https_context = ssl._create_unverified_context

32

votes

To solve this:

All you need to do is to install Python certificates! A common issue on macOS.

Open these files:

Install Certificates.command
Update Shell Profile.command

Simply Run these two scripts and you wont have this issue any more.

Hope this helps!

24

votes

This terminal command:

open /Applications/Python\ 3.7/Install\ Certificates.command

Found here: https://stackoverflow.com/a/57614113/6207266

Resolved it for me. With my config

pip install --upgrade certifi

had no impact.

19

votes

For novice users, you can go in the Applications folder and expand the Python 3.7 folder. Now first run (or double click) the Install Certificates.command and then Update Shell Profile.command

7

votes

For anyone who is using anaconda, you would install the certifi package, see more at:

https://anaconda.org/anaconda/certifi

To install, type this line in your terminal:

conda install -c anaconda certifi

5

votes

Two steps worked for me : - going Macintosh HD > Applications > Python3.7 folder - click on "Install Certificates.command"

5

votes

I could find this solution and is working fine:

cd /Applications/Python\ 3.7/
./Install\ Certificates.command

4

votes

Take a look at this post, it seems like for later versions of Python, certificates are not pre installed which seems to cause this error. You should be able to run the following command to install the certifi package: /Applications/Python\ 3.6/Install\ Certificates.command

Post 1: urllib and "SSL: CERTIFICATE_VERIFY_FAILED" Error

Post 2: Airbrake error: urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate

2

votes

i didn't solve the problem, sadly. but managed to make to codes work (almost all of my codes have this probelm btw) the local issuer certificate problem happens under python3.7 so i changed back to python2.7 QAQ and all that needed to change including "from urllib2 import urlopen" instead of "from urllib.request import urlopen" so sad...

2

votes

If you're running on a Mac you could just search for Install Certificates.command on the spotlight and hit enter.

1

votes

I had the same error and solved the problem by running the program code below:

# install_certifi.py
#
# sample script to install or update a set of default Root Certificates
# for the ssl module.  Uses the certificates provided by the certifi package:
#       https://pypi.python.org/pypi/certifi

import os
import os.path
import ssl
import stat
import subprocess
import sys

STAT_0o775 = ( stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR
             | stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP
             | stat.S_IROTH |                stat.S_IXOTH )


def main():
    openssl_dir, openssl_cafile = os.path.split(
        ssl.get_default_verify_paths().openssl_cafile)

    print(" -- pip install --upgrade certifi")
    subprocess.check_call([sys.executable,
        "-E", "-s", "-m", "pip", "install", "--upgrade", "certifi"])

    import certifi

    # change working directory to the default SSL directory
    os.chdir(openssl_dir)
    relpath_to_certifi_cafile = os.path.relpath(certifi.where())
    print(" -- removing any existing file or link")
    try:
        os.remove(openssl_cafile)
    except FileNotFoundError:
        pass
    print(" -- creating symlink to certifi certificate bundle")
    os.symlink(relpath_to_certifi_cafile, openssl_cafile)
    print(" -- setting permissions")
    os.chmod(openssl_cafile, STAT_0o775)
    print(" -- update complete")

if __name__ == '__main__':
    main()

0

votes

I'm a relative novice compared to all the experts on Stack Overflow.

I have 2 versions of jupyter notebook running (one through a fresh Anaconda Navigator installation and one through ????). I think this is because Anaconda was installed as a local installation on my Mac (per Anaconda instructions).

I already had python 3.7 installed. After that, I used my terminal to open jupyter notebook and I think that it put another version globally onto my Mac.

However, I'm not sure because I'm just learning through trial and error!

I did the terminal command:

conda install -c anaconda certifi

(as directed above, but it didn't work.)

My python 3.7 is installed on OS Catalina10.15.3 in:

/Library/Python/3.7/site-packages AND
~/Library/Python/3.7/lib/python/site-packages

The certificate is at:

~/Library/Python/3.7/lib/python/site-packages/certifi-2019.11.28.dist-info

I tried to find the Install Certificate.command ... but couldn't find it through looking through the file structures...not in Applications...not in links above.

I finally installed it by finding it through Spotlight (as someone suggested above). And it double clicked automatically and installed ANOTHER certificate in the same folder as:

~/Library/Python/3.7/lib/python/site-packages/

NONE of the above solved anything for me...I still got the same error.

So, I solved the problem by:

closing my jupyter notebook.
opening Anaconda Navigator.
opening jupyter notebook through the Navigator GUI (instead of through Terminal).
opening my notebook and running the code.

I can't tell you why this worked. But it solved the problem for me.

I just want to save someone the hassle next time. If someone can tell my why it worked, that would be terrific.

I didn't try the other terminal commands because of the 2 versions of jupyter notebook that I knew were a problem. I just don't know how to fix that.

0

votes

For me the problem was that I was setting REQUESTS_CA_BUNDLE in my .bash_profile

/Users/westonagreene/.bash_profile:
...
export REQUESTS_CA_BUNDLE=/usr/local/etc/openssl/cert.pem
...

Once I set REQUESTS_CA_BUNDLE to blank (i.e. removed from .bash_profile), requests worked again.

export REQUESTS_CA_BUNDLE=""

The problem only exhibited when executing python requests via a CLI (Command Line Interface). If I ran requests.get(URL, CERT) it resolved just fine.

Mac OS Catalina (10.15.6). Pyenv of 3.6.11. Error message I was getting: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)

My answer elsewhere: https://stackoverflow.com/a/64151964/4420657

0

votes

I am using Debian 10 buster and try download a file with youtube-dl and get this error: sudo youtube-dl -k https://youtu.be/uscis0CnDjk

[youtube] uscis0CnDjk: Downloading webpage ERROR: Unable to download webpage: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)> (caused by URLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)')))

Certificates with python2 and python3.8 are installed correctly, but i persistent receive the same error. finally (which is not the best solution, but works for me was to eliminate the certificate check as it is given as an option in youtube-dl) whith this command sudo youtube-dl -k --no-check-certificate https://youtu.be/uscis0CnDjk

0

votes

I am seeing this issue on a Ubuntu 20.04 system and none of the "real fixes" (like this one) helped.

While Firefox was willing to open the site just fine neither GNOME Web (i.e. Epiphany) nor Python3 or wget were accepting the certificate. After some searching, I came across this answer on ServerFault which lists two common reasons:

The certificate is really signed by an unknown CA (for instance an internal CA).

The certificate is signed with an intermediate CA certificate from one of the well known CA's and the remote server is misconfigured in the regard that it doesn't include that intermediate CA certificate as a CA chain it's response.

You can use the Qualys SSL Labs website to check the site's certificates and if there are issues, contact the site's administrator to have it fixed.

If you really need to work around the issue right now, I'd recommend a temporary solution like Rambod's confined to the site(s) you're trying to access.

-1

votes

Use requests library. Try this solution, or just add https:// before the URL:

import requests
from bs4 import BeautifulSoup
import re

pages = set()
def getLinks(pageUrl):
    global pages
    html = requests.get("http://en.wikipedia.org"+pageUrl, verify=False).text
    bsObj = BeautifulSoup(html)
    for link in bsObj.findAll("a", href=re.compile("^(/wiki/)")):
        if 'href' in link.attrs:
            if link.attrs['href'] not in pages:
                #We have encountered a new page
                newPage = link.attrs['href']
                print(newPage)
                pages.add(newPage)
                getLinks(newPage)
getLinks("")

Check if this works for you

-1

votes

This will work. Set the environment variable PYTHONHTTPSVERIFY to 0.

By typing linux command:

export PYTHONHTTPSVERIFY = 0

OR

Using in python code:

import os
os.environ["PYTHONHTTPSVERIFY"] = "0"

python - Scraping: SSL: CERTIFICATE_VERIFY_FAILED error for http://en.wikipedia.org

18 Answers