0
votes

I have a base string and a dictionary with certain words. I want to find all the possible anagrams of the base string using the words from the dictionary.

for example:

base_string = 'Oscar Wilde'
words = {1: 'sidecar', 2: 'owl', 3: 'low', 4: 'acid', 5: 'bread', 6: 'slower'}

Now I want to see how many different anagrams I can make with the words from the dictionary. The desired output would be 'sidecar owl', 'sidecar low', 'acid slower'.

I transformed the string into a list, which looks like:

letters = ['o', 's', 'c', 'a', 'r', 'w', 'i', 'l', 'd', 'e']

What I hope my code does is check every combination of words from the dictionary. I have a counter that counts the number of tried combinations.

anagrams = []
counter = 0
for i in range(1, len(words)+1):
    anagram = ''
    for i in range(i+1, len(words)+1):
        if contain(letters, words[i]):  #if word is contained in the base string
            for i in words[i]:  #remove each letter of the word from the list of letters of the base string 
                letters.remove(i)
            anagram += words[i] + ' '
    if len(letters) >= 1:  #if all the letters are not used, it's not an anagram
        counter += 1
    if len(letters) == 0:  #if all the letters are used, it's an anagram
        anagrams.append(anagram)

print anagrams

def contain(list1, list2):
    counter1 = Counter(list1)
    counter2 = Counter(list2)
    for k in counter2:
        if counter2[k] != counter1.get(k):
            return False
    return True

findanagram()

I am getting KeyError for anagram += words[i] + ' '

I hope I've explained myself well enough.

2

2 Answers

0
votes

I would personally recommend hege's solution. It's simple, straightforward and to the point. However, if you plan on using a large dictionary and repeating this process a number of times, a faster approach may be of interest.

The idea is to associate each letter with a prime number, i.e., a = 2, b = 3, c = 5, etc. The only way to get the number 25 is by having the letter c twice in your word. By multiplying all the letters in a word you obtain its id number. Naturally, any anagrams of that word would also result to the same id.

So, all you have to is check the product of the ids for words A and B is equal to the id of the word you're interested in.

from itertools import combinations
from string import ascii_lowercase as alphabet

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43,
          47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
letter_id = dict(zip(alphabet, primes))

def get_word_id(word):
    product = 1
    for letter in word:
        product *= letter_id[letter]
    return product

words = ['sidecar', 'owl', 'low', 'acid', 'bread', 'slower']
dictionary = {}
for w in words:
    dictionary[w] = get_word_id(w)

base_string = 'Oscar Wilde'

for comb in combinations(words, 2):
    comb_id = 1
    for word in comb:
        comb_id *= dictionary[word]
    if get_word_id(base_string.replace(' ', '').lower()) == comb_id:
        print comb

As I have commented in hege's answer, if you're interested in more than pairs, you can generalise the combinations like this

for no_of_words in xrange(1, len(words)+1):
    for comb in combinations(words, no_of_words):
        ...
0
votes

Example implementation

The easiest, but far from most efficient way to do it is this. It will search for two word anagrams:

from itertools import combinations
from collections import Counter

name = 'Oscar Wilde'
words = ['sidecar', 'owl', 'low', 'acid', 'bread', 'slower']

letter_counter = Counter(name.replace(' ', '').lower())
for ws in combinations(words, 2):
    if Counter(''.join(ws)) == letter_counter:
        print(' '.join(ws))

# sidecar owl
# sidecar low
# acid slower

It basically does the same as yours intended, but in a more pythonic way.

There are some issues with your implementation:

  • Your contain function does not work properly. It would give false to contain('a', 'aa'), as it checks the occuring letters count by equality.
  • Your two for loops use the same i index variable.
  • You use 1-based indices (range(1, len(words) + 1)) on arrays, but python arrays are 0-based (range(0, len(words)))