How to randomize (or permute) a dataframe rowwise and columnwise?

Question

I have a dataframe (df1) like this.

     f1   f2   f3   f4   f5
d1   1    0    1    1    1  
d2   1    0    0    1    0
d3   0    0    0    1    1
d4   0    1    0    0    1

The d1...d4 column is the rowname, the f1...f5 row is the columnname.

To do sample(df1), I get a new dataframe with count of 1 same as df1. So, the count of 1 is conserved for the whole dataframe but not for each row or each column.

Is it possible to do the randomization row-wise or column-wise?

I want to randomize the df1 column-wise for each column, i.e. the number of 1 in each column remains the same. and each column need to be changed by at least once. For example, I may have a randomized df2 like this: (Noted that the count of 1 in each column remains the same but the count of 1 in each row is different.

     f1   f2   f3   f4   f5
d1   1    0    0    0    1  
d2   0    1    0    1    1
d3   1    0    0    1    1
d4   0    0    1    1    0

Likewise, I also want to randomize the df1 row-wise for each row, i.e. the no. of 1 in each row remains the same, and each row need to be changed (but the no of changed entries could be different). For example, a randomized df3 could be something like this:

     f1   f2   f3   f4   f5
d1   0    1    1    1    1  <- two entries are different
d2   0    0    1    0    1  <- four entries are different
d3   1    0    0    0    1  <- two entries are different
d4   0    0    1    0    1  <- two entries are different

PS. Many thanks for the help from Gavin Simpson, Joris Meys and Chase for the previous answers to my previous question on randomizing two columns.

do you want to permute both the row and columns at the same time. Rereading this, it looks like the column constraint (same number of 1s in each column) didn't hold in your second example permuting rows. — Gavin Simpson
Please don't sign up for multiple accounts. I have asked the moderators to merge the account you used here with the one used on the previous Q. — Gavin Simpson

pms pms · Accepted Answer · 2012-07-16T11:35:33

Given the R data.frame:

Shuffle row-wise:

> df2 <- df1[sample(nrow(df1)),]
> df2
  a b c
3 0 1 0
4 0 0 0
2 1 0 0
1 1 1 0

By default sample() randomly reorders the elements passed as the first argument. This means that the default size is the size of the passed array. Passing parameter replace=FALSE (the default) to sample(...) ensures that sampling is done without replacement which accomplishes a row wise shuffle.

Shuffle column-wise:

> df3 <- df1[,sample(ncol(df1))]
> df3
  c a b
1 0 1 1
2 0 1 0
3 0 0 1
4 0 0 0

How to randomize (or permute) a dataframe rowwise and columnwise?

8 Answers