I have a character vector that I want to transform into a data frame.
It's mostly clean, but I can't figure out how to finish the cleaning. The real data are a Date column formatted as yyyy-mm-dd and a Variable column holding a number (four digits in this case, but not always), separated by a comma.
class(myvec)
[1] "character"
myvec
[1] " \"2016-01-01,8631n\" " " \"2016-01-02,8577n\" "
[3] " \"2016-01-03,8476n\" " " \"2016-01-04,8365n\" "
[5] " \"2016-01-05,8331n\" " " \"2016-01-06,8801n\" "
[7] " \"2016-01-07,5020n\""
The leading space and escaped quote (' \"') should be removed, and so should the trailing n\". The expected output should be a data frame like this:
Date Variable
[1,] "2016-01-01" "8631"
[2,] "2016-01-02" "8577"
[3,] "2016-01-03" "8476"
[4,] "2016-01-04" "8365"
[5,] "2016-01-05" "8331"
[6,] "2016-01-06" "8801"
[7,] "2016-01-07" "5020"
Once the vector is clean, I think this does the job:
do.call(rbind,strsplit(clean_vector,","))
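As a minimal sketch of that split step, assuming the vector has already been cleaned: the sample `clean_vector` values below are copied from the question's data, and the names `parts` and `df` are only illustrative.

```r
# Hypothetical already-cleaned input, mirroring the first rows of the question's data
clean_vector <- c("2016-01-01,8631", "2016-01-02,8577", "2016-01-03,8476")

# Split each element on the comma, then stack the pieces row-wise into a matrix
parts <- do.call(rbind, strsplit(clean_vector, ",", fixed = TRUE))

# Name the columns and convert the character matrix to a data frame
colnames(parts) <- c("Date", "Variable")
df <- as.data.frame(parts, stringsAsFactors = FALSE)
df
```

Note that `do.call(rbind, ...)` on its own returns a character matrix, so the `as.data.frame` call is what actually yields a data frame.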
I think I can convert the dates with lubridate and the variable to numeric with as.numeric on my own; the question is about getting the character vector clean and in the correct format.
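For completeness, the type conversion might look like the sketch below. It assumes a data frame of character columns named `Date` and `Variable` (built here by hand for illustration) and uses base `as.Date` rather than lubridate so that it runs without extra packages.

```r
# Hypothetical data frame of character columns, as produced by the split step
df <- data.frame(
  Date = c("2016-01-01", "2016-01-02"),
  Variable = c("8631", "8577"),
  stringsAsFactors = FALSE
)

# Base R parses ISO yyyy-mm-dd dates directly; lubridate::ymd(df$Date) is an alternative
df$Date <- as.Date(df$Date, format = "%Y-%m-%d")
df$Variable <- as.numeric(df$Variable)

str(df)
```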
gsub("[n \"]","",x) # "2016-01-01,8631"
works fine for the first one. You could also just use substr
since all your objects seem to be fixed-width. - Frank
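Both suggestions can be applied to the whole vector at once, since `gsub` and `substr` are vectorised. The sketch below uses three elements copied from the question; the names `clean_gsub` and `clean_substr` are illustrative, and the `substr` positions assume every value has exactly four digits.

```r
# Sample of the raw vector from the question (the last element has no trailing space)
myvec <- c(" \"2016-01-01,8631n\" ",
           " \"2016-01-02,8577n\" ",
           " \"2016-01-07,5020n\"")

# Drop every space, double quote, and letter n from each element in one pass
clean_gsub <- gsub("[n \"]", "", myvec)

# Fixed-width alternative: characters 3 to 17 hold "yyyy-mm-dd,NNNN".
# This assumes a four-digit value, which the question says is not always the case.
clean_substr <- substr(myvec, 3, 17)

clean_gsub
```

If the number of digits varies, the `gsub` approach is probably the safer of the two, because it does not depend on character positions.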