8
votes

I have a pandas dataframe, with 4 rows and 4 columns - here is asimple version:

import pandas as pd
import numpy as np
rows = np.arange(1, 4, 1)
values = np.arange(1, 17).reshape(4,4)
df = pd.DataFrame(values, index=rows, columns=['A', 'B', 'C', 'D'])

what I am trying to do is to convert this to a 2 * 8 dataframe, with B, C and D alligng for each array - so it would look like this:

1  2 
1  3
1  4
5  6
5  7
5  8
9  10
9  11
9  12
13 14
13 15
13 16

reading on pandas documentation I tried this:

df1 = pd.pivot_table(df, rows = ['B', 'C', 'D'], cols = 'A')

but gives me an error that I cannot identify the source (ends with

DataError: No numeric types to aggregate

)

following that I want to split the dataframe based on A values, but I think the .groupby command is probably going to take care of it

1

1 Answers

14
votes

What you are looking for is the melt function

 pd.melt(df,id_vars=['A']) 

     A variable  value
0    1        B      2
1    5        B      6
2    9        B     10
3   13        B     14
4    1        C      3
5    5        C      7
6    9        C     11
7   13        C     15
8    1        D      4
9    5        D      8
10   9        D     12
11  13        D     16

    

A final sorting according to A is then necessary

 pd.melt(df,id_vars=['A']).sort('A')  

      A variable  value
 0    1        B      2
 4    1        C      3
 8    1        D      4
 1    5        B      6
 5    5        C      7
 9    5        D      8
 2    9        B     10
 6    9        C     11
 10   9        D     12
 3   13        B     14
 7   13        C     15
 11  13        D     16

Note: pd.DataFrame.sort has been deprecated in favour of pd.DataFrame.sort_values.