
I get the error shown below when I run the following code in the Cloud9 IDE with its default version of Python (2.7.6):

import urllib
artistValue = "Sigur Rós"
artistValueUrl = urllib.quote(artistValue)

SyntaxError: Non-ASCII character '\xc3' in file /home/ubuntu/workspace/test.py on line 2, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

I read that changing the code as follows is a workaround:

import urllib
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
artistValue = "Sigur Rós"
artistValueUrl = urllib.quote(artistValue)

When I tried this, the editor showed a red x pop-up error that read:

Module 'sys' has no 'setdefaultencoding' member

and when I run the code I still get the SyntaxError.

Why is this happening and what should I do?

EDIT: I also tried the following from the selected answer:

import urllib
print urllib.quote(u"Sigur Rós")

When I ran it I received the following error:

SyntaxError: Non-ASCII character '\xc3' in file /home/ubuntu/workspace/test.py on line 2, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

Sorry, sys.setdefaultencoding('utf-8') won't work in the Cloud9 IDE; see the sys docs for details. And it's not a good idea anyway, see Dangers of sys.setdefaultencoding('utf-8') for info. Please post your code (in a code block) & a sample of the data you're trying to read (also in a code block) so we can help you fix your problem. Also mention what Python version you're using, since Python 2 & Python 3 handle Unicode differently. - PM 2Ring
Thank you for the feedback. I did my best to edit what you recommended above. Do you have any suggestions? - Jason Melo Hall

1 Answer


Ok, that's a bit weird. The Python interpreter should give a SyntaxError complaining about the non-ASCII character in your source code if you don't declare an encoding at the start of the script; OTOH, if you have declared an encoding (or Cloud9 does it automatically), then the Python interpreter ought to treat it as a UTF-8 encoded string.

I'm not familiar with Cloud9, so I can't guarantee that this will work, but it ought to. :)

Make your string a Unicode string (by using the u string prefix) and then explicitly encode it to UTF-8:

import urllib

artistValue = u"Sigur Rós"
artistValueUrl = urllib.quote(artistValue.encode('utf-8'))
print artistValueUrl

output

Sigur%20R%C3%B3s
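
For completeness, here is a minimal sketch that combines the encoding declaration the SyntaxError message asks for (PEP 263) with the explicit encode step. The helper name quote_artist is made up for illustration, and the try/except import is only there so the same file also runs on Python 3, where quote lives in urllib.parse. I haven't tested it on Cloud9, so treat it as a sketch rather than a guaranteed fix.

```python
# -*- coding: utf-8 -*-
# The declaration above has to sit on the first or second line of the file (PEP 263).
try:
    from urllib import quote        # Python 2
except ImportError:
    from urllib.parse import quote  # Python 3


def quote_artist(name):
    """Percent-encode a Unicode artist name (hypothetical helper)."""
    # Encode to UTF-8 bytes first so quote() never has to guess an encoding.
    return quote(name.encode('utf-8'))


print(quote_artist(u"Sigur Rós"))
```

This should print the same Sigur%20R%C3%B3s as above, provided the editor really saves the file as UTF-8.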

edit

What happens if you run this:

# -*- coding: utf-8 -*-
import urllib
print urllib.quote("Sigur Rós")

The following should work, because the source then contains only ASCII characters (\xc3\xb3 is the UTF-8 encoding of ó). Of course, this isn't a practical way to enter such strings into your script; I'm just trying to get a handle on what Cloud9 is doing.

import urllib
print urllib.quote("Sigur R\xc3\xb3s")
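
In case the \xc3\xb3 escapes look like magic, this small sketch shows where they come from: they are the two UTF-8 bytes of the character U+00F3. The variable names are just for illustration, and the file is deliberately pure ASCII so it needs no encoding declaration.

```python
# U+00F3 is "o with acute accent"; its UTF-8 encoding is the two bytes C3 B3.
encoded = u"\u00f3".encode("utf-8")
assert encoded == b"\xc3\xb3"

# The whole artist name, written using only ASCII characters in the source.
name_bytes = u"Sigur R\u00f3s".encode("utf-8")
assert name_bytes == b"Sigur R\xc3\xb3s"

print(repr(name_bytes))
```

The \u escape inside a Unicode literal is another way to keep the source ASCII-only while still naming the character rather than its bytes.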

And I guess you might as well also try this, just so we can see what error message it produces:

import urllib
print urllib.quote(u"Sigur Rós")
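
If you want to see what the interpreter is actually being handed, a rough check like the one below may help. It is a simplified sketch: the pattern is adapted from PEP 263, the function names are hypothetical, and it only looks at the first two lines without applying every rule the real tokenizer uses.

```python
import re

# Pattern for an encoding declaration, adapted from PEP 263.
CODING_RE = re.compile(br"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(source_bytes):
    """Return the encoding named on line 1 or 2 of the source, or None."""
    for line in source_bytes.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return None


def has_non_ascii(source_bytes):
    """Return True if any byte of the source is outside the ASCII range."""
    return any(byte > 127 for byte in bytearray(source_bytes))


# A stand-in for the raw bytes of test.py as saved in UTF-8 (assumed content).
sample = b'import urllib\nprint urllib.quote("Sigur R\xc3\xb3s")\n'

print(has_non_ascii(sample))
print(declared_encoding(sample))
print(declared_encoding(b"# -*- coding: utf-8 -*-\n" + sample))
```

To run it against the real file, read the bytes with open(path, 'rb').read() instead of using the sample. Non-ASCII bytes with no declared encoding is exactly the situation the Python 2 SyntaxError describes.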