2
votes

I am trying to split a column into two separate ones when the divider is a dot.

Following this example, I have tried first:

library(tidyverse)

df1 <- data.frame(Sequence.Name= c(2.1, 2.1, 2.1),
                  var1 = c(1, 1, 0))
df1$Sequence.Name %>% 
  as.character %>%
  str_split_fixed(".",2) %>%
  head
#>      [,1] [,2]
#> [1,] ""   ".1"
#> [2,] ""   ".1"
#> [3,] ""   ".1"

Created on 2021-04-05 by the reprex package (v0.3.0)

But this is not what I want: the first column is empty, and the second one still has the dot.

Following the comments in the post I linked above, I tried adding fixed="." or fixed=TRUE, but neither works:

library(tidyverse)

df1 <- data.frame(Sequence.Name= c(2.1, 2.1, 2.1),
                  var1 = c(1, 1, 0))
df1$Sequence.Name %>% 
  as.character %>%
  str_split_fixed(".",fixed=".",2) %>%
  head
#> Error in str_split_fixed(., ".", fixed = ".", 2): unused argument (fixed = ".")


4 Answers

3
votes

Something like this?

df1 %>% separate(Sequence.Name, into = c("Col1", "Col2"))

  Col1 Col2 var1
1    2    1    1
2    2    1    1
3    2    1    0
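
The call above relies on the default `sep` of `separate()`, which splits on any run of non-alphanumeric characters. A minimal sketch that names the dot explicitly, assuming the same `df1` as in the question (`res` is just an illustrative name); `sep` is a regular expression here, so the dot is escaped:

```r
library(tidyr)

df1 <- data.frame(Sequence.Name = c(2.1, 2.1, 2.1),
                  var1 = c(1, 1, 0))

# sep is treated as a regular expression, so "\\." matches a literal dot;
# convert = TRUE turns the resulting character columns into integers
res <- separate(df1, Sequence.Name, into = c("Col1", "Col2"),
                sep = "\\.", convert = TRUE)
res
```

Without `convert = TRUE` the two new columns stay character.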

3
votes

It is also possible to do this in base R with read.table:

cbind(read.table(text = as.character(df1$Sequence.Name), sep=".", 
         header = FALSE, col.names = c("Col1", "Col2")), df1['var1'])
#  Col1 Col2 var1
#1    2    1    1
#2    2    1    1
#3    2    1    0
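
Another base R route is `strsplit()` with `fixed = TRUE`, which treats the dot as a literal character rather than a regex. This is only a sketch: it assumes every value contains exactly one dot (a value such as `2.0` prints as `"2"` and would break the layout), and `parts`, `mat` and `out` are illustrative names.

```r
df1 <- data.frame(Sequence.Name = c(2.1, 2.1, 2.1),
                  var1 = c(1, 1, 0))

# fixed = TRUE makes strsplit() match "." literally instead of as a regex
parts <- strsplit(as.character(df1$Sequence.Name), ".", fixed = TRUE)

# Bind the pieces row-wise into a two-column character matrix
mat <- do.call(rbind, parts)

out <- data.frame(Col1 = as.numeric(mat[, 1]),
                  Col2 = as.numeric(mat[, 2]),
                  var1 = df1$var1)
out
```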

2
votes

Here is a data.table option using tstrsplit:

library(data.table)
setDT(df1)[, c(lapply(tstrsplit(Sequence.Name, "\\."), as.numeric), .(var1))]
   V1 V2 V3
1:  2  1  1
2:  2  1  1
3:  2  1  0
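
If named columns are preferred over `V1`, `V2`, `V3`, a possible variant assigns the pieces by reference with `:=`. This is a sketch, not the answer's original code: `dt`, `Col1` and `Col2` are illustrative names, `fixed = TRUE` is passed through to `strsplit()`, and `type.convert = TRUE` asks `tstrsplit()` to convert the pieces from character.

```r
library(data.table)

dt <- data.table(Sequence.Name = c(2.1, 2.1, 2.1),
                 var1 = c(1, 1, 0))

# Split on a literal dot and add the two pieces as new columns by reference
dt[, c("Col1", "Col2") := tstrsplit(Sequence.Name, ".", fixed = TRUE,
                                    type.convert = TRUE)]
dt
```

The original `Sequence.Name` column is kept; it can be dropped afterwards with `dt[, Sequence.Name := NULL]`.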

2
votes

This one also helps:

library(tidyr)

df1 %>%
  extract(col = Sequence.Name, into = c("Sequence", "Name"), regex = "(\\d+)\\.(\\d+)")

  Sequence Name var1
1        2    1    1
2        2    1    1
3        2    1    0
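
As for the `str_split_fixed()` attempt in the question: stringr interprets the pattern as a regular expression, where an unescaped `.` matches any character, so the split happens at the very first character. stringr has no `fixed =` argument; literal matching is requested by wrapping the pattern in `fixed()`, or alternatively by escaping the dot as `"\\."`. A minimal sketch, with `x` and `m` as illustrative names:

```r
library(stringr)

x <- as.character(c(2.1, 2.1, 2.1))

# fixed(".") matches a literal dot; a bare "." would match any character
m <- str_split_fixed(x, fixed("."), 2)
m
```

The result is a character matrix with one column per piece, so the values still need `as.numeric()` if numbers are wanted.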