2
votes

I have a data.frame in R with many columns (more than 50). The column types are integer, factor, and character. Is there a fast way to select only the character columns of my data frame?

I tried something like the line below, but it didn't work. :/

Example: new_dataset <- class(old_dataset) %in% c("character") #only select characters

3
old_dataset[sapply(old_dataset, is.character)]? - jay.sf

3 Answers

3
votes

As of dplyr 1.0.0, dplyr::select_if() is superseded by dplyr::select(where(...)).

To select the columns of type character, use:

library(dplyr)
storms %>%
  select(where(is.character)) %>%
  glimpse()
Rows: 10,010
Columns: 2
$ name   <chr> "Amy", "Amy", "Amy", "Amy"...
$ status <chr> "tropical depression", "tropical depression"...
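
Since the question mentions factor columns as well, here is a minimal sketch of the same idea on a small data frame. The data frame `toy` and its column names are made up for illustration, and the sketch assumes dplyr 1.0.0 or later is installed. It shows that `where(is.character)` skips factor columns, and that two `where()` selections can be combined with `|` if you want both types.

```r
library(dplyr)

# Hypothetical data: one character, one integer, and one factor column.
toy <- data.frame(
  name   = c("Amy", "Bob", "Cal"),
  count  = 1:3,
  status = factor(c("a", "b", "a")),
  stringsAsFactors = FALSE
)

# Character columns only; the factor column `status` is not selected.
chr_only <- toy %>% select(where(is.character))

# Selections can be combined with | to keep factor columns as well.
chr_or_fct <- toy %>% select(where(is.character) | where(is.factor))

names(chr_only)
names(chr_or_fct)
```

If the factor columns should be treated as text, they would need to be converted (for example with as.character) before or after selecting.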

0
votes

dplyr::select_if can help you.

For instance, this gets you all columns of type character in table t:

> t %>% 
    select_if(is.character) %>% 
    glimpse()

Rows: 199,303
Columns: 6
$ user_type     <chr> "registered", "registered", "registered", "registered",…
$ location      <chr> "Massachusetts", "United States", "South Africa", "Loui…
$ website_url   <chr> "http://caerwyn.com", "http://www.berbs.us", "http://tu…
$ link          <chr> "https://stackoverflow.com/users/2406/caerwyn", "https:…
$ profile_image <chr> "https://www.gravatar.com/avatar/0983795ba79aaa48a86752…
$ display_name  <chr> "caerwyn", "berberich", "tumbleweed", "Nate", "Ryan", "…

Replacing is.character with is.numeric, is.logical, etc. gets you the columns of those types instead.
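
As a rough sketch of swapping the predicate, the example below uses a small made-up data frame (`users` and its columns are hypothetical stand-ins for the table above) and assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical stand-in for the table used above.
users <- data.frame(
  display_name = c("caerwyn", "berberich"),
  reputation   = c(101L, 57L),
  is_active    = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

# is.numeric matches integer and double columns, but not logical ones.
num_cols <- users %>% select_if(is.numeric)

# is.logical matches only TRUE/FALSE columns.
lgl_cols <- users %>% select_if(is.logical)

names(num_cols)
names(lgl_cols)
```

Any function that takes a column and returns a single TRUE or FALSE should work as the predicate here.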

0
votes

Data:

df <- data.frame(
  char = c("hi there", "how're you", "what's up"),
  int = 1:3,
  fac = c("A", "B", "C"),
  stringsAsFactors = FALSE
)
str(df)
'data.frame':   3 obs. of  3 variables:
 $ char: chr  "hi there" "how're you" "what's up"
 $ int : int  1 2 3
 $ fac : chr  "A" "B" "C"

You can select columns by data type by subsetting the data frame with a logical vector:

df[sapply(df, is.character)]
        char fac
1   hi there   A
2 how're you   B
3  what's up   C

Here, sapply applies the function is.character to each of the columns in df. The function itself runs a test - Is the column of type 'character'? - and returns TRUE or FALSE accordingly. The resulting logical vector is then used inside [ ] to keep only the columns marked TRUE.
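
To make the intermediate step visible, here is a small sketch using the same df as above. It also shows why the attempt in the question returns nothing useful: class() on the whole data frame describes the container, not its columns. The variable names (`whole`, `is_chr`, and so on) are only for illustration, and vapply is offered as an optional stricter variant, not as part of the original answer.

```r
df <- data.frame(
  char = c("hi there", "how're you", "what's up"),
  int = 1:3,
  fac = c("A", "B", "C"),
  stringsAsFactors = FALSE
)

# class(df) is "data.frame", so this comparison is FALSE for the whole object.
whole <- class(df) %in% "character"

# One TRUE/FALSE per column, named after the columns.
is_chr <- sapply(df, is.character)

# vapply() does the same but errors unless each result is a single logical value.
is_chr_strict <- vapply(df, is.character, logical(1))

# Use the logical vector to keep only the character columns.
chr_cols <- df[is_chr]

print(is_chr)
print(names(chr_cols))
```

The per-column test is the key difference from the attempt in the question: the check has to run once for each column rather than once for the data frame.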