I'm new to programming and I'm currently trying to learn Python. I'm doing am following a tutorial that consist in building a Huffman coding app through 4 steps; I've got the first 3 but I'm stuck in the fourth. First step is getting the frequency of a unique character in a string like this:
{'a': 8, 'b': 7, 'c': 6, 'd': 5, 'e': 4, 'f': 3, 'g': 2, 'h': 1}
Second step is putting the characters and frequency in tuples:
[(8, 'a'), (7, 'b'), (6, 'c'), (5, 'd'), (4, 'e'), (3, 'f'), (2, 'g'), (1, 'h')]
Third step is actually building the tree, which looks something like this:
[(36, (15, (7, 'b'), (8, 'a')), (21, (9, (4, 'e'), (5, 'd')), (12, (6, 'c'),
(6, (3, 'f'), (3, (1, 'h'), (2, 'g'))))))]
The fourth step is to "traverse" the tree while assigning each branch a code, 1 if it's on the left and 0 if it's on the right. I've thought of a few ways to do this, but I haven't had much success.
I see that each tier of the tree list has two elements (tree[1]
and tree[2]
) that represent the left and right branches, but I don't know what's the most efficient way to reiterate over the tree. I thought of using a loop that would keep going deeper into each branch if it detected the character belonged to it, but it didn't work and I'm not sure why.
def in_list(my_list, item):
try:
return any(item in sublist for sublist in my_list)
except:
return False
def get_codes(tree, tuples):
tree = tree[0]
for i in tuples:
check = True
while check is True:
if in_list(tree[1], i) is True:
print(i, 'is in 1')
node1 = True
else:
node1 = False
if in_list(tree[2], i) is True:
print(i, 'is in 2')
node2 = True
else:
node2 = False
tree = tree[2]
if node1 is False and node2 is False:
check = False
return 'test'
I'm not even sure this is the best way to approach this.
Here's my full code in case it's necessary:
def get_frequency(string):
freq = dict.fromkeys(string, 0) # fromKeys function takes every unique element of the given parameter.
for i in string: # If given a string, it take each unique character.
freq[i] += 1
print('Step 1: ', freq)
return freq
def get_nodes(dictionary):
node_list = []
list_keys = list(dictionary.values())
list_values = list(dictionary.keys())
for i in range(len(list_keys)):
node_list.append((list_keys[i], list_values[i]))
print('Step 2: ', node_list)
return node_list
def get_tree(node_set):
tree_list = node_set
for i in range(len(tree_list)-1):
# Sort nodes in ascending order. Lowest frequency first
tree_list.sort(key=lambda tup: tup[0])
# Defining the next node (f,l,r) based on the two tuples with the lowest frequency
l_freq = tree_list[0][0]
r_freq = tree_list[1][0]
f = l_freq + r_freq
l_tuple = tree_list[0]
r_tuple = tree_list[1]
# Append new node, delete old node
node = (f, l_tuple, r_tuple)
tree_list.remove(l_tuple)
tree_list.remove(r_tuple)
tree_list.append(node)
print('Step 3: ', tree_list)
return tree_list
def in_list(my_list, item):
try:
return any(item in sublist for sublist in my_list)
except:
return False
def get_codes(tree, tuples):
tree = tree[0]
for i in tuples:
check = True
while check is True:
if in_list(tree[1], i) is True:
print(i, 'is in 1')
node1 = True
else:
node1 = False
if in_list(tree[2], i) is True:
print(i, 'is in 2')
node2 = True
else:
node2 = False
tree = tree[2]
if node1 is False and node2 is False:
check = False
return 'test'
text = 'aaaaaaaabbbbbbbccccccdddddeeeefffggh'
frequency = get_frequency(text)
nodes = get_nodes(frequency)
huff_tree = get_tree(nodes)
codes = get_codes(huff_tree, get_nodes(frequency))