I have a dataset stored as a tuple of tuples, like this:
data = ((tag1, tag2, correlation_value), (tag1, tag3, correlation_value), ..., (tag1, tag_n, correlation_value), (tag2, tag3, correlation_value), ..., (tag2, tag_n, correlation_value), ..., (tag_(n-1), tag_n, correlation_value))
I need to build a correlation matrix out of this. I already have the correlation values: they are the third element of each tuple above. However, I cannot find the right technique for this. Most previous questions are about calculating the correlation (Pearson etc.) from a dataframe or an array of raw data. Here I have already calculated the correlations with a separate algorithm, and I just want to arrange them in correlation matrix form using pandas so that I can then visualize them.
The correlation table should look something like this:
How can I achieve this? Converting directly to a pandas dataframe with pd.DataFrame() and then pivoting does not work: I am left with a lot of NaN values, because my tuple 'data' has no entries that pair a tag with itself. For example, there is no (tag1, tag1, correlation_value) entry.
It also does not contain both orderings of a pair, such as (tag1, tag2, correlation_value) AND (tag2, tag1, correlation_value). It only has (tag1, tag2, correlation_value).
So in the resulting dataframe, the entry at row tag2 and column tag1 is, again, a NaN value.
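To make the problem concrete, here is a minimal sketch with made-up tags and values (the names tag1..tag3, the column labels tag_a/tag_b/value, and the numbers are placeholders, not my real data). It reproduces the NaN pattern from a direct pivot, and then shows one possible way to fill the gaps, assuming the matrix is symmetric and that each tag's correlation with itself is 1.0:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: each pair appears once, and there are no (tag, tag) entries.
data = (
    ("tag1", "tag2", 0.8),
    ("tag1", "tag3", 0.3),
    ("tag2", "tag3", 0.5),
)

# Column names are illustrative only.
df = pd.DataFrame(data, columns=["tag_a", "tag_b", "value"])

# Direct pivot: only the pairs present in `data` are filled, the rest is NaN.
upper = df.pivot(index="tag_a", columns="tag_b", values="value")

# Reindex so every tag appears as both a row and a column.
tags = sorted(set(df["tag_a"]) | set(df["tag_b"]))
square = upper.reindex(index=tags, columns=tags)

# Assumption: correlation is symmetric, so fill the missing triangle
# from the transpose.
matrix = square.combine_first(square.T)

# Assumption: each tag correlates perfectly with itself.
values = matrix.to_numpy(copy=True)
np.fill_diagonal(values, 1.0)
matrix = pd.DataFrame(values, index=tags, columns=tags)

print(upper)
print(matrix)
```

If the self-correlation should not be 1.0 in my case, the diagonal step could be dropped or replaced with whatever value is appropriate.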
How do I solve this?
Thank you.