I am trying to analyze data in CSV files that have Chinese characters in their names (e.g. "粗1 25g"). I am using Tkinter to choose the files like so:
selectedFiles = askopenfilenames(filetypes=[("xlsx","*"),("xls","*")]) # Use the Tkinter dialog window to choose files
selectedFiles = master.tk.splitlist(selectedFiles) # Create list from files chosen
I have attempted to convert the filename to unicode in this way:
selectedFiles = [x.decode("utf-8") for x in selectedFiles]
Only to yield the error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb4 in position 0: ordinal not in range(128)
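For reference, here is a minimal sketch of a more defensive conversion. It only decodes values that are still byte strings, and it defaults to the filesystem encoding rather than UTF-8. Whether that is the right encoding for the names the dialog returns is an assumption, and to_text is a hypothetical helper name, not part of Tkinter:

```python
import sys

def to_text(name, encoding=None):
    """Return name as text, decoding only if it is still a byte string.

    Hypothetical helper: the default encoding is an assumption, taken
    from sys.getfilesystemencoding() when none is given.
    """
    if isinstance(name, bytes):
        return name.decode(encoding or sys.getfilesystemencoding())
    return name  # already text, so leave it untouched

# Text passes through unchanged; bytes are decoded with the given encoding.
already_text = to_text(u"\u7c971 25g.csv")
from_utf8_bytes = to_text(b"\xe7\xb2\x971 25g.csv", "utf-8")
```

Both calls should give the same text value, since the byte string is the UTF-8 encoding of the same name.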
I have also tried converting the filenames as the files are created with the following:
titles = [x.encode('utf-8') for x in titles]
Only to receive the error:
IOError: [Errno 22] invalid mode ('wb') or filename: 'C:\...\\data_division_files\\\xe7\xb2\x971 25g.csv'
I have also tried combinations of the above methods to no avail. What can I do to allow these files to be read in Python?
(This question, while related, has not solved my problem: Obtain File size with os.path.getsize() in Python 2.7.5)
codecs is for Unicode contents; it doesn't do any good for Unicode filenames. - abarnert
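To illustrate the distinction drawn in that comment, here is a rough sketch that keeps the filename as a text (unicode) value and applies an encoding only to the file contents, using io.open. The temporary directory, the sample data, and the helper name write_then_read are made up for the example; it has not been checked against the asker's Windows setup:

```python
# -*- coding: utf-8 -*-
import io
import os
import shutil
import tempfile

FILENAME = u"\u7c971 25g.csv"          # the example name from the question
CONTENT = u"label,value\n\u7c97,25\n"  # made-up sample data

def write_then_read(directory, filename, text):
    """Write text to directory/filename, then read it back (hypothetical helper)."""
    # The filename stays a text value; it is never encoded by hand.
    path = os.path.join(directory, filename)
    # The encoding argument applies to the file contents only.
    with io.open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(text)
    with io.open(path, "r", encoding="utf-8", newline="") as handle:
        return handle.read()

demo_dir = tempfile.mkdtemp()
try:
    round_trip = write_then_read(demo_dir, FILENAME, CONTENT)
    file_was_created = os.path.exists(os.path.join(demo_dir, FILENAME))
finally:
    shutil.rmtree(demo_dir)  # clean up the temporary directory
```

The point of the sketch is that the name is passed to the open call as text, while "utf-8" describes only how the data inside the file is stored.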