Sorting specific columns of a dataframe by their names in R

1

votes

df is a test data frame, and I need to sort its last three columns by name in ascending order (without hardcoding the order).

df <- data.frame(X = c(1, 2, 3, 4, 5),
                 Z = c(1, 2, 3, 4, 5),
                 Y = c(1, 2, 3, 4, 5),
                 A = c(1, 2, 3, 4, 5),
                 C = c(1, 2, 3, 4, 5),
                 B = c(1, 2, 3, 4, 5))

Desired output:

> df
  X Z Y A B C
1 1 1 1 1 1 1
2 2 2 2 2 2 2
3 3 3 3 3 3 3
4 4 4 4 4 4 4
5 5 5 5 5 5 5

I'm aware of the order() function but I can't seem to find the right way to implement it to get the desired output.

r dataframe

2

votes

In base R, select the leading columns as they are, then sort the names of the last 3:

df[, c(names(df)[1:(ncol(df)-3)], sort(names(df)[ncol(df)-2:0]))]
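
The same idea can be wrapped in a small helper so that the number of trailing columns is not fixed. This is only a sketch: sort_last_cols and its argument n are hypothetical names, not part of base R. It uses seq_len() instead of 1:(ncol(df) - n) so that the case where n equals ncol(df) does not run into 1:0.

```r
# Hypothetical helper: sort the last n columns of a data frame by name,
# leaving the leading columns in their original order.
sort_last_cols <- function(df, n = 3) {
  stopifnot(is.data.frame(df), n >= 0, n <= ncol(df))
  nms  <- names(df)
  keep <- nms[seq_len(ncol(df) - n)]   # leading column names, untouched
  last <- sort(tail(nms, n))           # trailing column names, sorted
  df[, c(keep, last), drop = FALSE]
}

df <- data.frame(X = 1:5, Z = 1:5, Y = 1:5, A = 1:5, C = 1:5, B = 1:5)
names(sort_last_cols(df))
#> [1] "X" "Z" "Y" "A" "B" "C"
```

Calling sort_last_cols(df, 6) would sort every column name, which is the degenerate case the seq_len() guard is there for.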

2

votes

Update:

Base R:

cbind(df[1:3],df[4:6][,order(colnames(df[4:6]))])

First answer:

We could use relocate from dplyr: https://dplyr.tidyverse.org/reference/relocate.html

It is designed to change column positions:

Here we relocate by index: we take the last column (index 6, which is B) and put it before the column at position 5, which is C.

library(dplyr)
df %>% 
  relocate(6, .before = 5)

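An alternative that avoids hardcoding the order. Sorting every name with select(order(colnames(df))) and relocating afterwards would also reorder the first three columns (X Y Z instead of X Z Y), so only the last three names are sorted here:

library(dplyr)
df %>% 
  select(1:3, all_of(sort(names(df)[4:6])))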

  X Z Y A B C
1 1 1 1 1 1 1
2 2 2 2 2 2 2
3 3 3 3 3 3 3
4 4 4 4 4 4 4
5 5 5 5 5 5 5

0

votes

We want to reorder the columns by their names, so names(df) is the natural input to a sorting function such as order() or sort().

The complicating factor is that order() returns a vector of positions rather than the names themselves, and those positions are relative to whatever vector was passed in. So if we want to reorder only a subset of the columns, we need an approach that keeps the first three columns in their original order.

We accomplish this by building a vector made of the first 3 column names followed by the remaining column names sorted with sort(), which returns the values rather than their locations in the vector. We then use this vector with the [ form of the extract operator.

df <- data.frame(X = c(1, 2, 3, 4, 5),
                 Z = c(1, 2, 3, 4, 5),
                 Y = c(1, 2, 3, 4, 5),
                 A = c(1, 2, 3, 4, 5),
                 C = c(1, 2, 3, 4, 5),
                 B = c(1, 2, 3, 4, 5))

df[,c(names(df[1:3]),sort(names(df[4:6])))]

...and the output:

> df[,c(names(df[1:3]),sort(names(df[4:6])))]
  X Z Y A B C
1 1 1 1 1 1 1
2 2 2 2 2 2 2
3 3 3 3 3 3 3
4 4 4 4 4 4 4
5 5 5 5 5 5 5
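
For comparison, order() can be made to work as well, provided the positions it returns are translated back into positions in the full data frame. A minimal sketch, assuming the same df as above (the variable names first, last and idx are illustrative only):

```r
df <- data.frame(X = 1:5, Z = 1:5, Y = 1:5, A = 1:5, C = 1:5, B = 1:5)

first <- 1:3   # columns to leave in place
last  <- 4:6   # columns to sort by name

# order() returns positions within the subset (1..3), so index `last`
# with them to get the matching positions in df.
idx <- c(first, last[order(names(df)[last])])
idx
#> [1] 1 2 3 4 6 5

names(df[idx])
#> [1] "X" "Z" "Y" "A" "B" "C"
```

Indexing by position this way also works when column names are not unique, whereas the name-based version relies on each name identifying exactly one column.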
