74
votes

Consider:

dict = {
'Спорт':'Досуг',
'russianA':'englishA'
}

s = 'Спорт russianA'

I'd like to replace all dict keys with their respective dict values in s.

7
This might not be so straightforward. You probably need an explicit tokenizer: consider {'cat': 'russiancat'} applied to "caterpillar", or overlapping words such as {'car': 'russiancar', 'pet': 'russianpet'} applied to 'carpet'. - Joe
As an aside: I think dict is best avoided as a variable name, because a variable of this name would shadow the built-in type of the same name. - jochen
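
To make Joe's first point concrete, here is a minimal sketch. The mapping and the sample word come from the comment; the variable names are only illustrative. It contrasts a plain substring replacement with a lookup on whitespace-separated tokens:

```python
# Mapping taken from the comment above; names are illustrative only.
mapping = {'cat': 'russiancat'}
text = 'cat caterpillar'

# Plain substring replacement also rewrites the "cat" inside "caterpillar".
naive = text.replace('cat', mapping['cat'])

# Splitting on whitespace first replaces whole tokens only.
tokenized = ' '.join(mapping.get(word, word) for word in text.split())

print(naive)      # russiancat russiancaterpillar
print(tokenized)  # russiancat caterpillar
```

Splitting on whitespace is the simplest possible tokenizer; punctuation attached to a word would still defeat it.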

7 Answers

99
votes

Using re:

import re

s = 'Спорт not russianA'
d = {
'Спорт':'Досуг',
'russianA':'englishA'
}

# Escape each key so any regex metacharacters in it are matched literally.
pattern = re.compile(r'\b(' + '|'.join(map(re.escape, d)) + r')\b')
result = pattern.sub(lambda x: d[x.group()], s)
# Output: 'Досуг not englishA'

This will match whole words only. If you don't need that, use the pattern:

pattern = re.compile('|'.join(map(re.escape, d)))

Note that in this case you should sort the words descending by length if some of your dictionary entries are substrings of others.
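
A rough sketch of why the ordering matters, using a made-up mapping in which one key is a prefix of another (the keys, values, and variable names here are assumptions for illustration, not part of the question):

```python
import re

# Hypothetical mapping: 'car' is a prefix of 'carpet'.
d = {'car': 'auto', 'carpet': 'rug'}
s = 'carpet car'

# Alternation tries the alternatives left to right, so with 'car' first
# the longer key 'carpet' can never match.
unsorted_pattern = re.compile('car|carpet')
unsorted_result = unsorted_pattern.sub(lambda m: d[m.group()], s)

# Longest keys first; re.escape guards against regex metacharacters in keys.
keys = sorted(d, key=len, reverse=True)
sorted_pattern = re.compile('|'.join(re.escape(k) for k in keys))
sorted_result = sorted_pattern.sub(lambda m: d[m.group()], s)

print(unsorted_result)  # autopet auto
print(sorted_result)    # rug auto
```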

25
votes

You could use the reduce function:

from functools import reduce  # required on Python 3; reduce is a builtin on Python 2
reduce(lambda x, y: x.replace(y, dict[y]), dict, s)
17
votes

Solution found here (I like its simplicity):

def multipleReplace(text, wordDict):
    for key in wordDict:
        text = text.replace(key, wordDict[key])
    return text
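
One thing to watch with this loop: the replacements are applied one after another, so text produced by an earlier replacement can be replaced again by a later key. A small sketch with a made-up mapping, assuming Python 3.7+ where dicts keep insertion order:

```python
def multipleReplace(text, wordDict):
    # Same function as above, repeated so the snippet runs on its own.
    for key in wordDict:
        text = text.replace(key, wordDict[key])
    return text

# Hypothetical mapping in which one value is also a key.
chained = {'a': 'b', 'b': 'c'}

# Pass 1 turns 'a b' into 'b b'; pass 2 then turns every 'b' into 'c'.
result = multipleReplace('a b', chained)
print(result)  # c c
```

If a value can also appear as a key, a single-pass approach such as the re-based answer avoids this.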
5
votes

One way, without re:

d = {
'Спорт':'Досуг',
'russianA':'englishA'
}

s = 'Спорт russianA'.split()
for n,i in enumerate(s):
    if i in d:
        s[n]=d[i]
print(' '.join(s))
3
votes

Almost the same as ghostdog74's answer, though written independently. The one difference is that using d.get() instead of d[] handles items that are not in the dict.

>>> d = {'a':'b', 'c':'d'}
>>> s = "a c x"
>>> foo = s.split()
>>> ret = []
>>> for item in foo:
...   ret.append(d.get(item,item)) # Try to get from dict, otherwise keep value
... 
>>> " ".join(ret)
'b d x'
1
votes

I used this in a similar situation (my string was all in uppercase):

def translate(string, wdict):
    for key in wdict:
        string = string.replace(key, wdict[key].lower())
    return string.upper()

hope that helps in some way... :)

1
votes

With the warning that it fails if a key contains a space, this is a compressed solution similar to ghostdog74's and extaneon's answers:

d = {
'Спорт':'Досуг',
'russianA':'englishA'
}

s = 'Спорт russianA'

' '.join(d.get(i,i) for i in s.split())
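
A quick sketch of the limitation mentioned above, with a made-up mapping that contains a two-word key (the keys and values are assumptions for illustration):

```python
# Hypothetical mapping with one key that contains a space.
d = {'New York': 'NYC', 'city': 'town'}
s = 'New York city'

# split() yields 'New', 'York', 'city', so 'New York' is never looked up
# as a whole and stays untranslated.
result = ' '.join(d.get(word, word) for word in s.split())
print(result)  # New York town
```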