I have a project in Python 2.6 and I'd like to write a UTF-8 message to stdout using the system encoding. However, it appears that no such function exists until Python 3.2:

PySys_FormatStdout

http://docs.python.org/dev/c-api/sys.html

Is there a way to do this from Python 2.6?

To clarify, I have a banner that needs to print after Py_Initialize() and before the main interpreter is run. The string is a C literal containing: "\n and Copyright \xC2\xA9"

where \xC2\xA9 is the UTF-8 encoding of the copyright symbol. I verified in gdb that the copyright symbol is encoded correctly.

Update: I just decided all this grief isn't necessary and I'm going to remove the offending character from the startup banner. There are just too many issues with this, and the documentation is lacking. My expectations were that this would be like Tcl, where:

  1. The embedded interpreter's C-API would make it easy to write Unicode to stdout in the system's encoding, rather than in some default ASCII encoding.
  2. An exception wouldn't be thrown if an offending character does not exist in the current encoding; instead, some default replacement character would be displayed.
  3. Additional modules (e.g. sys) would not need to be imported just to find out what the system encoding is.
1. bugs.python.org/issue4947 (encode by hand in Python < 2.7). 2. Use errors="replace" instead of errors="strict" if you must. 3. PyUnicode_GetDefaultEncoding(). - jfs
Thanks J.F. As of now I am just going to avoid using the character in my application's banner. - Juan

2 Answers

You could use PyFile_WriteObject():

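PyObject *f_stdout = PySys_GetObject("stdout");  /* borrowed reference, may be NULL */
PyObject *text = PyUnicode_DecodeUTF8((const char *)str, strlen(str), "strict");  /* new reference */
if (f_stdout != NULL && text != NULL)
    PyFile_WriteObject(text, f_stdout, Py_PRINT_RAW);
Py_XDECREF(text);  /* release the decoded string; f_stdout is not ours to release */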

If you know the final encoding then you could use PyUnicode_AsEncodedString().
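
To make the decode-then-encode idea concrete, here is a rough Python-level sketch of the same two steps that PyUnicode_DecodeUTF8() and PyUnicode_AsEncodedString() perform from C. It is only an illustration of the semantics, not the C code itself: encode_banner is a hypothetical helper name, and defaulting to the "replace" error handler is an assumption borrowed from the comment above.

```python
def encode_banner(utf8_bytes, encoding, errors="replace"):
    """Re-encode UTF-8 bytes into the target encoding.

    Hypothetical helper for illustration only; it mirrors
    PyUnicode_DecodeUTF8() followed by PyUnicode_AsEncodedString().
    """
    # Step 1: interpret the C literal's bytes as UTF-8 text.
    text = utf8_bytes.decode("utf-8", "strict")
    # Step 2: encode into the final encoding; with errors="replace",
    # characters the target encoding lacks become "?" instead of raising.
    return text.encode(encoding, errors)


banner = b"\n and Copyright \xC2\xA9"
print(repr(encode_banner(banner, "ascii")))    # copyright sign is replaced
print(repr(encode_banner(banner, "latin-1")))  # copyright sign is kept as one byte
```

With "ascii" the copyright sign comes out as "?", while with "latin-1" it becomes the single byte 0xA9. Passing errors="strict" instead raises UnicodeEncodeError for the ASCII case, which is the behaviour the question ran into.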