can you write a str.replace() using dictionary values in Python?

53

votes

I have to replace the north, south, etc with N S in address fields.

If I have

list = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}
address = "123 north anywhere street"

Can I for iterate over my dictionary values to replace my address field?

for dir in list[]:
   address.upper().replace(key,value)

I know i'm not even close!! But any input would be appreciated if you can use dictionary values like this.

pythondictionarystr-replace

this is pretty tricky if matches can overlap. See this question - georg

A BIG part of the problem is that the string replace() method returns a copy of string with occurrences replaced -- it doesn't do it in-place. - martineau

You can simply use str.translate. - Neel Patel

See stackoverflow.com/questions/2400504/… for the best solution - Ethan Bradford

48

votes

address = "123 north anywhere street"

for word, initial in {"NORTH":"N", "SOUTH":"S" }.items():
    address = address.replace(word.lower(), initial)
print address

nice and concise and readable too.

16

votes

you are close, actually:

dictionary = {"NORTH":"N", "SOUTH":"S" } 
for key in dictionary.iterkeys():
    address = address.upper().replace(key, dictionary[key])

Note: for Python 3 users, you should use .keys() instead of .iterkeys():

dictionary = {"NORTH":"N", "SOUTH":"S" } 
for key in dictionary.keys():
    address = address.upper().replace(key, dictionary[key])

12

votes

One option I don't think anyone has yet suggested is to build a regular expression containing all of the keys and then simply do one replace on the string:

>>> import re
>>> l = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}
>>> pattern = '|'.join(sorted(re.escape(k) for k in l))
>>> address = "123 north anywhere street"
>>> re.sub(pattern, lambda m: l.get(m.group(0).upper()), address, flags=re.IGNORECASE)
'123 N anywhere street'
>>>

This has the advantage that the regular expression can ignore the case of the input string without modifying it.

If you want to operate only on complete words then you can do that too with a simple modification of the pattern:

>>> pattern = r'\b({})\b'.format('|'.join(sorted(re.escape(k) for k in l)))
>>> address2 = "123 north anywhere southstreet"
>>> re.sub(pattern, lambda m: l.get(m.group(0).upper()), address2, flags=re.IGNORECASE)
'123 N anywhere southstreet'

8

votes

You are probably looking for iteritems():

d = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}
address = "123 north anywhere street"

for k,v in d.iteritems():
    address = address.upper().replace(k, v)

address is now '123 N ANYWHERE STREET'

Well, if you want to preserve case, whitespace and nested words (e.g. Southstreet should not converted to Sstreet), consider using this simple list comprehension:

import re

l = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}

address = "North 123 East Anywhere Southstreet    West"

new_address = ''.join(l[p.upper()] if p.upper() in l else p for p in re.split(r'(\W+)', address))

new_address is now

N 123 E Anywhere Southstreet    W

6

votes

"Translating" a string with a dictionary is a very common requirement. I propose a function that you might want to keep in your toolkit:

def translate(text, conversion_dict, before=None):
    """
    Translate words from a text using a conversion dictionary

    Arguments:
        text: the text to be translated
        conversion_dict: the conversion dictionary
        before: a function to transform the input
        (by default it will to a lowercase)
    """
    # if empty:
    if not text: return text
    # preliminary transformation:
    before = before or str.lower
    t = before(text)
    for key, value in conversion_dict.items():
        t = t.replace(key, value)
    return t

Then you can write:

>>> a = {'hello':'bonjour', 'world':'tout-le-monde'}
>>> translate('hello world', a)
'bonjour tout-le-monde'

4

votes

def replace_values_in_string(text, args_dict):
    for key in args_dict.keys():
        text = text.replace(key, str(args_dict[key]))
    return text

4

votes

I would suggest to use a regular expression instead of a simple replace. With a replace you have the risk that subparts of words are replaced which is maybe not what you want.

import json
import re

with open('filePath.txt') as f:
   data = f.read()

with open('filePath.json') as f:
   glossar = json.load(f)

for word, initial in glossar.items():
   data = re.sub(r'\b' + word + r'\b', initial, data)

print(data)

3

votes

Try,

import re
l = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}

address = "123 north anywhere street"

for k, v in l.iteritems():
    t = re.compile(re.escape(k), re.IGNORECASE)
    address = t.sub(v, address)
print(address)

2

votes

All of these answers are good, but you are missing python string substitution - it's simple and quick, but requires your string to be formatted correctly.

address = "123 %(direction)s anywhere street"
print(address % {"direction": "N"})

2

votes

Both using replace() and format() are not so precise:

data =  '{content} {address}'
for k,v in {"{content}":"some {address}", "{address}":"New York" }.items():
    data = data.replace(k,v)
# results: some New York New York

'{ {content} {address}'.format(**{'content':'str1', 'address':'str2'})
# results: ValueError: unexpected '{' in field name

It is better to translate with re.sub() if you need precise place:

import re
def translate(text, kw, ignore_case=False):
    search_keys = map(lambda x:re.escape(x), kw.keys())
    if ignore_case:
        kw = {k.lower():kw[k] for k in kw}
        regex = re.compile('|'.join(search_keys), re.IGNORECASE)
        res = regex.sub( lambda m:kw[m.group().lower()], text)
    else:
        regex = re.compile('|'.join(search_keys))
        res = regex.sub( lambda m:kw[m.group()], text)

    return res

#'score: 99.5% name:%(name)s' %{'name':'foo'}
res = translate( 'score: 99.5% name:{name}', {'{name}':'foo'})
print(res)

res = translate( 'score: 99.5% name:{NAME}', {'{name}':'foo'}, ignore_case=True)
print(res)

can you write a str.replace() using dictionary values in Python?

10 Answers