33
votes

I have a data frame with a number of columns, and would like to output a separate column for each with the length of each row in it.

I am trying to iterate through the column names, and for each column output a corresponding column with '_length' attached.

For example col1 | col2 would go to col1 | col2 | col1_length | col2_length

The code I am using is:

df <- data.frame(col1 = c("abc","abcd","a","abcdefg"),col2 = c("adf qqwe","d","e","f"))

for(i in names(df)){
  df$paste(i,'length',sep="_") <- str_length(df$i)
 }

However this throws and error:

invalid function in complex assignment.

Am I able to use loops in this way in R?

4

4 Answers

74
votes

You need to use [[, the programmatic equivalent of $. Otherwise, for example, when i is col1, R will look for df$i instead of df$col1.

for(i in names(df)){
  df[[paste(i, 'length', sep="_")]] <- str_length(df[[i]])
}
10
votes

You can use lapply to pass each column to str_length, then cbind it to your original data.frame...

library(stringr)

out <- lapply( df , str_length )    
df <- cbind( df , out )

#     col1     col2 col1 col2
#1     abc adf qqwe    3    8
#2    abcd        d    4    1
#3       a        e    1    1
#4 abcdefg        f    7    1
7
votes

With dplyr and stringr you can use mutate_all:

> df %>% mutate_all(funs(length = str_length(.)))

     col1     col2 col1_length col2_length
1     abc adf qqwe           3           8
2    abcd        d           4           1
3       a        e           1           1
4 abcdefg        f           7           1
4
votes

For the sake of completeness, there is also a data.table solution:

library(data.table)
result <- setDT(df)[, paste0(names(df), "_length") := lapply(.SD, stringr::str_length)]
result
#      col1     col2 col1_length col2_length
#1:     abc adf qqwe           3           8
#2:    abcd        d           4           1
#3:       a        e           1           1
#4: abcdefg        f           7           1