I am using Python 2.7 for a project that collects Twitter data and analyzes it. The idea is to collect tweets, find the most common hashtags in that collection, and then build a graph in which the hashtags are nodes. If two hashtags appear in the same tweet, there is an edge between them, and the weight of that edge is the number of times they co-occur. I am trying to build a dictionary of dictionaries using defaultdict(lambda: defaultdict(int)) and then create the graph using networkx.from_dict_of_dicts.

My code for creating the co-occurrence matrix is:
from collections import defaultdict

def coocurrence(common_entities):
    com = defaultdict(lambda: defaultdict(int))
    # Build co-occurrence matrix
    for i in range(len(common_entities) - 1):
        for j in range(i + 1, len(common_entities)):
            # Sort the pair so each edge is always stored under the same key order
            w1, w2 = sorted([common_entities[i], common_entities[j]])
            if w1 != w2:
                com[w1][w2] += 1
    return com
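
To show what this produces, here is a small sketch run on an invented list of hashtags from a single tweet. It restates the same counting with itertools.combinations so that it runs on its own; the function name cooccurrence_pairs and the sample hashtags are made up for this illustration and are not part of the project.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_pairs(common_entities):
    # Same counting as the nested loops above, restated with combinations().
    # The name cooccurrence_pairs is hypothetical, used only for this example.
    com = defaultdict(lambda: defaultdict(int))
    for a, b in combinations(common_entities, 2):
        w1, w2 = sorted([a, b])
        if w1 != w2:
            com[w1][w2] += 1
    return com


# Invented sample: hashtags from one tweet
sample = ['python', 'twitter', 'data']
counts = cooccurrence_pairs(sample)

# Convert the nested defaultdicts to plain dicts for easier inspection
plain = {k: dict(v) for k, v in counts.items()}
# plain == {'data': {'python': 1, 'twitter': 1}, 'python': {'twitter': 1}}
```

Each inner value here is a bare integer count, not a dictionary of edge attributes.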
But in order to use networkx.from_dict_of_dicts, I need the dictionary to be in this format, with each count wrapped in an attribute dictionary: com = {0: {1: {'weight': 1}}}
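
To make the target shape concrete, the following is a minimal sketch of one possible conversion, wrapping each count in a {'weight': n} dictionary. The input counts are made up and the helper name to_weighted is hypothetical; this is only one way the reshaping might be done, not a confirmed idiom.

```python
def to_weighted(com):
    # Wrap every count in an attribute dict: {u: {v: {'weight': n}}}
    # to_weighted is a hypothetical helper name used for illustration.
    return {u: {v: {'weight': n} for v, n in nbrs.items()}
            for u, nbrs in com.items()}


# Made-up counts in the shape the co-occurrence function returns
counts = {'data': {'python': 2, 'twitter': 1}, 'python': {'twitter': 3}}
weighted = to_weighted(counts)
# weighted['data']['python'] == {'weight': 2}
```

A dictionary shaped like weighted is what would then be passed to networkx.from_dict_of_dicts.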
Do you have any ideas on how I can solve this? Or is there a different way of creating a graph like this?