I am working with Twitter data. I collected tweets with the Streaming API, and the application's output is a JSON file. I wrote the tweet data to a text file, and now I see Unicode escape sequences instead of Turkish characters. I don't want to do find/replace by hand in Notepad++. Is there an automatic way in Python to open the text file, read all of its data, and replace the Unicode escape sequences with the corresponding Turkish characters?
Here are the Turkish characters and the Unicode escape sequences that I want to replace:
- ğ - \u011f
- Ğ - \u011e
- ı - \u0131
- İ - \u0130
- ö - \u00f6
- Ö - \u00d6
- ü - \u00fc
- Ü - \u00dc
- ş - \u015f
- Ş - \u015e
- ç - \u00e7
- Ç - \u00c7
I tried two different approaches:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
dosya = open('veri.txt', 'r')
for line in dosya:
    match = re.search(line, "\u011f")
    if (match):
        replace("\u011f", "ğ")
dosya.close()
and:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
f1 = open('veri.txt', 'r')
f2 = open('veri2.txt', 'w')
for line in f1:
    f2.write=(line.replace('\u011f', 'ğ'))
    f2.write=(line.replace('\u011e', 'Ğ'))
    f2.write=(line.replace('\u0131', 'ı'))
    f2.write=(line.replace('\u0130', 'İ'))
    f2.write=(line.replace('\u00f6', 'ö'))
    f2.write=(line.replace('\u00d6', 'Ö'))
    f2.write=(line.replace('\u00fc', 'ü'))
    f2.write=(line.replace('\u00dc', 'Ü'))
    f2.write=(line.replace('\u015f', 'ş'))
    f2.write=(line.replace('\u015e', 'Ş'))
    f2.write=(line.replace('\u00e7', 'ç'))
    f2.write=(line.replace('\u00c7', 'Ç'))
f1.close()
f2.close()
Neither of these worked. How can I make it work?
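Since the data started out as JSON, one possible route is to let the json module decode the escapes rather than replacing them one by one. The sketch below assumes that each non-empty line of veri.txt is one complete JSON document, which the question does not confirm; the function names reencode_json_line and convert are made up for illustration.

```python
import json


def reencode_json_line(line):
    """Parse one JSON document and serialize it again without \\uXXXX escapes."""
    # json.loads turns escapes such as \u011f into real characters;
    # ensure_ascii=False keeps them as real characters when dumping.
    return json.dumps(json.loads(line), ensure_ascii=False)


def convert(src, dst):
    """Rewrite src into dst, one JSON document per line (an assumption)."""
    with open(src, 'r', encoding='utf-8') as fin, \
            open(dst, 'w', encoding='utf-8') as fout:
        for line in fin:
            if line.strip():  # skip blank lines
                fout.write(reencode_json_line(line) + '\n')
```

Calling convert('veri.txt', 'veri2.txt') would then write the readable version to veri2.txt. If the file is not line-delimited JSON, this approach would need adjusting.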
Try '\u00c7' == 'Ç' in the Python interpreter. It will return True. More information here: docs.python.org/3/howto/… – Manuel Jacob
f2.write=(line.replace('\u00c7', 'Ç')) does not do what you want. It replaces the write method with a string instead of calling the method (which would be f2.write(...)). – Manuel Jacob
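Putting the two comments together, here is a minimal sketch of how the second attempt might be repaired. It assumes that veri.txt contains literal backslash-u sequences (the six characters \u011f rather than the letter ğ) with lowercase hex digits, as listed in the question. That is why the keys are raw strings: in a normal Python 3 string literal, '\u011f' already is 'ğ', so replacing one with the other changes nothing. The names TURKISH, replace_escapes and convert_file are hypothetical.

```python
# Literal escape text (raw strings) mapped to the character it stands for.
TURKISH = {
    r'\u011f': 'ğ', r'\u011e': 'Ğ',
    r'\u0131': 'ı', r'\u0130': 'İ',
    r'\u00f6': 'ö', r'\u00d6': 'Ö',
    r'\u00fc': 'ü', r'\u00dc': 'Ü',
    r'\u015f': 'ş', r'\u015e': 'Ş',
    r'\u00e7': 'ç', r'\u00c7': 'Ç',
}


def replace_escapes(text):
    """Apply every replacement to the same string, one after another."""
    for escape, char in TURKISH.items():
        text = text.replace(escape, char)
    return text


def convert_file(src, dst):
    """Copy src to dst with the twelve escapes replaced."""
    with open(src, 'r', encoding='utf-8') as f1, \
            open(dst, 'w', encoding='utf-8') as f2:
        for line in f1:
            # Call write(...); "f2.write = ..." would only rebind the name.
            f2.write(replace_escapes(line))
```

With this, convert_file('veri.txt', 'veri2.txt') writes each line exactly once, after all twelve replacements have been applied to it. Any other escapes in the file are left untouched.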