I am using the Leiden algorithm implementation in igraph and noticed that repeating the clustering with the same resolution parameter gives different results. Below is a small debugging snippet I wrote to demonstrate this. It tracks the number of clusters after each run.
# Do the initial clustering
clustering = do_leiden_clustering(G, resolution_parameter=initial_resolution, n_iterations=n_iterations)
best_num_clusters, best_modularity, best_membership = len(clustering), clustering.modularity, clustering.membership
print(f"best_num_clusters = {best_num_clusters}")
num_clusters = best_num_clusters
i = 0
# Continue until there is a difference between current and previous clustering results.
while num_clusters == best_num_clusters:
    clustering = do_leiden_clustering(G, resolution_parameter=initial_resolution, n_iterations=n_iterations)
    num_clusters, best_modularity, best_membership = len(clustering), clustering.modularity, clustering.membership
    print(f"{i} num_clusters = {num_clusters}")
    if num_clusters != best_num_clusters:
        # best_num_clusters = num_clusters
        print(f"{i} End because num_clusters = {num_clusters} != {best_num_clusters}!")
        break
    i += 1
# Results
best_num_clusters = 2
0 num_clusters = 2
1 num_clusters = 2
2 num_clusters = 2
3 num_clusters = 2
4 num_clusters = 2
5 num_clusters = 1
5 End because num_clusters = 1 != 2!
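The same check can also be written as a tally over a fixed number of runs, which shows how often each cluster count occurs rather than only when the first difference appears. This is just a rough sketch: `tally_cluster_counts` and `fake_leiden` are hypothetical names I made up for illustration, and `fake_leiden` is a random stand-in for `do_leiden_clustering` so that the snippet runs without igraph or my graph.

```python
import random
from collections import Counter
from typing import Callable


def tally_cluster_counts(run_clustering: Callable[[], int], n_runs: int) -> Counter:
    """Call run_clustering n_runs times and count how often each cluster count occurs."""
    return Counter(run_clustering() for _ in range(n_runs))


# Hypothetical stand-in for do_leiden_clustering (an assumption, not igraph code):
# it returns 2 clusters most of the time and occasionally 1.
rng = random.Random(0)


def fake_leiden() -> int:
    return 1 if rng.random() < 0.2 else 2


counts = tally_cluster_counts(fake_leiden, n_runs=50)
for n_clusters, n_times in sorted(counts.items()):
    print(f"num_clusters = {n_clusters}: {n_times} of 50 runs")
```

With the real graph, the callable would be something like `lambda: len(do_leiden_clustering(G, resolution_parameter=initial_resolution, n_iterations=n_iterations))`.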
Since num_clusters eventually changes even though the graph and parameters stay the same, the clustering results are evidently not deterministic.
Why is this the case?
Thanks! PC