Python: anagram finder

Question

I have a base string and a dictionary with certain words. I want to find all the possible anagrams of the base string using the words from the dictionary.

for example:

base_string = 'Oscar Wilde'
words = {1: 'sidecar', 2: 'owl', 3: 'low', 4: 'acid', 5: 'bread', 6: 'slower'}

Now I want to see how many different anagrams I can make with the words from the dictionary. The desired output would be 'sidecar owl', 'sidecar low', 'acid slower'.

I transformed the string into a list, which looks like:

letters = ['o', 's', 'c', 'a', 'r', 'w', 'i', 'l', 'd', 'e']

What I hope my code does is check every combination of words from the dictionary. I have a counter that counts the number of tried combinations.

anagrams = []
counter = 0
for i in range(1, len(words)+1):
    anagram = ''
    for i in range(i+1, len(words)+1):
        if contain(letters, words[i]):  #if word is contained in the base string
            for i in words[i]:  #remove each letter of the word from the list of letters of the base string 
                letters.remove(i)
            anagram += words[i] + ' '
    if len(letters) >= 1:  #if all the letters are not used, it's not an anagram
        counter += 1
    if len(letters) == 0:  #if all the letters are used, it's an anagram
        anagrams.append(anagram)

print anagrams

def contain(list1, list2):
    counter1 = Counter(list1)
    counter2 = Counter(list2)
    for k in counter2:
        if counter2[k] != counter1.get(k):
            return False
    return True

findanagram()

I am getting KeyError for anagram += words[i] + ' '

I hope I've explained myself well enough.

Reti43 Reti43 · Accepted Answer · 2015-12-12T03:36:19

I would personally recommend hege's solution. It's simple, straightforward and to the point. However, if you plan on using a large dictionary and repeating this process a number of times, a faster approach may be of interest.

The idea is to associate each letter with a prime number, i.e., a = 2, b = 3, c = 5, etc. The only way to get the number 25 is by having the letter c twice in your word. By multiplying all the letters in a word you obtain its id number. Naturally, any anagrams of that word would also result to the same id.

So, all you have to is check the product of the ids for words A and B is equal to the id of the word you're interested in.

from itertools import combinations
from string import ascii_lowercase as alphabet

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43,
          47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
letter_id = dict(zip(alphabet, primes))

def get_word_id(word):
    product = 1
    for letter in word:
        product *= letter_id[letter]
    return product

words = ['sidecar', 'owl', 'low', 'acid', 'bread', 'slower']
dictionary = {}
for w in words:
    dictionary[w] = get_word_id(w)

base_string = 'Oscar Wilde'

for comb in combinations(words, 2):
    comb_id = 1
    for word in comb:
        comb_id *= dictionary[word]
    if get_word_id(base_string.replace(' ', '').lower()) == comb_id:
        print comb

As I have commented in hege's answer, if you're interested in more than pairs, you can generalise the combinations like this

for no_of_words in xrange(1, len(words)+1):
    for comb in combinations(words, no_of_words):
        ...

Python: anagram finder

2 Answers

Example implementation