I have a DataFrame df_test as follows:

a   b   c
5   7   1
6   7   0
15  17  1
16  17  0

Question

I am trying to create a dictionary from this DataFrame with column b as the key. Please note that the values in column b are repeated. When I create the dictionary using the code given below, it only keeps the last row for each key. How can I create a dictionary that includes all the rows in the DataFrame?

Tested code

Following is the code:

df_test.set_index('b', inplace=True)
df_test.T.to_dict(orient="list")

Output

{7: [6, 0], 17: [16, 0]}

Desired output

The output should include all the rows corresponding to each of the keys, not just the last row. Something similar to, but not restricted to, the output shown below:

{7: [[5, 1],[6, 0]], 17: [[15, 1],[16, 0]]}

1 Answer

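Use DataFrame.set_index to move b into the index, so that only the remaining columns are processed. Then group by b and use GroupBy.apply with a lambda function that converts each group to nested lists, and finally convert the result to a dictionary: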

d = df_test.set_index('b').groupby('b').apply(lambda x : x.to_numpy().tolist()).to_dict()
print(d)
{7: [[5, 1], [6, 0]], 17: [[15, 1], [16, 0]]}
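
If you prefer to avoid GroupBy.apply, a dict comprehension over the groups should give the same mapping. This is only a minimal sketch, assuming df_test still has b as a regular column (that is, the inplace set_index from the question has not been run on it) and rebuilding the sample data from the question:

```python
import pandas as pd

# Sample data rebuilt from the question; b is still an ordinary column here.
df_test = pd.DataFrame({
    "a": [5, 6, 15, 16],
    "b": [7, 7, 17, 17],
    "c": [1, 0, 1, 0],
})

# Iterating over a GroupBy yields (key, sub-DataFrame) pairs.
# Drop the key column from each group and convert the rest to nested lists.
d = {
    key: group.drop(columns="b").to_numpy().tolist()
    for key, group in df_test.groupby("b")
}
print(d)
```

Rows inside each group keep their original order, so each key maps to its rows in the order they appear in df_test.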