0
votes

I have discovered that some strings within my data frame contain hidden line break characters, though I can't tell exactly which (when loaded into gVim they simply show up as line breaks). The following code:

gsub("[\r\n]", "", x)

successfully removes the line breaks from within the strings. However, it also removes the line breaks separating the cells, making my data frame atomic instead of recursive. How can I target only the line breaks within the strings while keeping my data frame intact?

Here's a sample of the data:

[image: sample data frame]

1
Share some sample data. - Gregor Thomas
Without the sample data I can't test it, but you could try something like dataframe$string_column <- sapply(dataframe$string_column, function(x) { gsub("[\r\n]", "", x) }). That way it gets applied to the elements of the column rather than the entire dataframe. - Punintended
Thank you both so much for taking the time to respond! Punintended, your suggested solution worked perfectly. - Beth Snyder

1 Answer

4
votes

Copying the comments above to close the question:

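# stringsAsFactors = TRUE reproduces the factor output below (the default before R 4.0.0)
dataframe <- data.frame(ID = 1:2, Name = 'XX',
  string_column = c('Hi \r\nyou\r\n', 'Always \r\nshare\r\n some \r\nsample\r\n data!'),
  stringsAsFactors = TRUE)
dataframe$string_column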
#> [1] Hi \r\nyou\r\n                                
#> [2] Always \r\nshare\r\n some \r\nsample\r\n data!
#> Levels: Always \r\nshare\r\n some \r\nsample\r\n data! Hi \r\nyou\r\n

dataframe$string_column <- sapply(dataframe$string_column,
                                    function(x) { gsub("[\r\n]", "", x) })
dataframe$string_column
#> [1] "Hi you"                         "Always share some sample data!"
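
As a side note, here is a minimal sketch of why calling gsub() on the whole data frame flattens it, and of a column-wise alternative that does not need sapply(). The data frame df and the helper vector is_chr are made up for illustration, and the sketch assumes the text columns are stored as character rather than factor.

```r
# Hypothetical sample data; stringsAsFactors = FALSE keeps strings as character.
df <- data.frame(ID = 1:2, Name = "XX",
                 string_column = c("Hi \r\nyou\r\n", "Always \r\nshare\r\n"),
                 stringsAsFactors = FALSE)

# gsub() coerces its input with as.character(), so passing the whole data
# frame returns a plain character vector (one element per column).
flattened <- gsub("[\r\n]", "", df)
is.data.frame(flattened)

# gsub() is vectorised over a character vector, so a single column can be
# cleaned directly.
df$string_column <- gsub("[\r\n]", "", df$string_column)

# To clean every character column at once, assign back into df[...] so the
# data frame structure is kept.
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], function(col) gsub("[\r\n]", "", col))
```

Assigning the cleaned values back into a column (or into df[is_chr]) is what keeps the object a data frame; only the strings inside the cells change.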