I have been working on this issue for a while. I am trying to remove non-ASCII characters from the DB_user column and replace them with spaces, but I keep getting errors. This is what my data frame looks like:
+-----------------------------------------------------------
| DB_user                                 source   count   |
+-----------------------------------------------------------
| ???/"Ò|Z?)?]??C %??J                    A        10      |
| ?D$ZGU ;@D??_???T(?)                    B        3       |
| ?Q`H??M'?Y??KTK$?Ù‹???ЩJL4??*?_??      C        2       |
+-----------------------------------------------------------
I was using this function, which I had come across while researching the problem on SO.
def filter_func(string):
    for i in range(0, len(string)):
        # stop at the first character outside the printable ASCII range 32..126
        if ord(string[i]) < 32 or ord(string[i]) > 126:
            break
    return ''
And then using the apply function:
df['DB_user'] = df.apply(filter_func,axis=1)
I keep getting the error:
'ord() expected a character, but string of length 66 found', u'occurred at index 2'
However, I thought the loop in filter_func handled this by passing one character at a time to ord(), so that the moment it hits a non-ASCII character, that character would be replaced by a space.
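To make the intended result concrete, here is a minimal sketch of the per-string behavior I expect. It assumes Python 3 and the same printable range (32 to 126) as my function; the name replace_non_ascii and the sample strings are hypothetical, made up for illustration and not taken from my real table.

```python
def replace_non_ascii(value):
    """Return value with every character outside 32..126 replaced by a space."""
    return ''.join(ch if 32 <= ord(ch) <= 126 else ' ' for ch in value)


# Hypothetical sample values, not rows from the real data frame.
for sample in ['abÒc', 'ЩJL4', 'plain']:
    print(repr(replace_non_ascii(sample)))
```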
Could somebody help me out?
Thanks!