1
votes

I can't get fread in the data.table package to handle new lines (\n) as intended. They come out as a literal "\n" rather than a new line (head shows "\\n" instead of "\n"). From the post below, I understood that fread should be able to handle this situation: fread and a quoted multi-line column value

I have tried quoting ("string") the values column with the same result. Is there a simple solution or parameter that I have missed? Should they be escaped somehow? Here is an example illustrating the problem, as well as my implementation:

[Edit:] Some clarification, so you don't need to read the code to follow. The content of strings.txt is shown in the code comment below, under # strings.txt. The file is a tab-separated text file with four columns and three rows plus a header row. The first entry in the file, strMsg1, is identical to strAsIntended. However, fread appears to add an additional backslash to \n when reading from the file, which turns the new line character into a literal \n. How can this be avoided? I just need to be able to code new lines into my strings. Hope that was understandable.

[Edit2:] The result is shown in this figure. [Screenshot: image not preserved]

library(data.table)
library(gWidgets2)

# strings.txt
# scope order   key value
# test_gui  1   strMsg1 Text with new line characters:\n1) The first point and the\n2) second point should be on separate lines\n\nThen perhaps some text below, separated by an empty line.
# test_gui  2   strMsg2 Some text does not contain new line characters.
# test_gui  3   strMsg3 Expand window to see text and button widgets

strAsIntended <- "Text with new line characters:\n1) The first point and the\n2) second point should be on separate lines\n\nThen perhaps some text below, separated by an empty line."
filePath <- "C:\\path\\to\\strings.txt"

# Read file.
dt <- fread(file = filePath, sep = "\t", encoding = "UTF-8")
head(dt) # \n has become \\n

# Set key column.
setkey(dt, key = "key")

# Get strings for the specific function.
dt <- dt[dt$scope == "test_gui", ]

# Get strings.
strText <- dt["strMsg1"]$value
strButton <- dt["strMsg2"]$value
strWinTitle <- dt["strMsg3"]$value

# Construct gui.
w <- gwindow(title = strWinTitle)
g <- ggroup(horizontal = FALSE, container = w, expand = TRUE, fill = "both")
gtext(text = strText, container = g)
gtext(text = strAsIntended, container = g)
gbutton(text = strButton, container = g)

[Edit3:] @user2554330 Thanks for the explanation and solution. That was indeed not how I thought it worked.

Here is an updated working code example with screenshot below:

library(data.table)
library(gWidgets2)

# strings.txt
# scope order   key value
# test_gui  1   strMsg1 Text with new line characters:\n1) The first point and the\n2) second point should be on separate lines\n\nThen perhaps some text below, separated by an empty line.
# test_gui  2   strMsg2 Some text does not contain new line characters.
# test_gui  3   strMsg3 Expand window to see text and button widgets

strAsIntended <- "Text with new line characters:\n1) The first point and the\n2) second point should be on separate lines\n\nThen perhaps some text below, separated by an empty line."
filePath <- "C:\\Users\\oskar\\OneDrive\\Dokument\\R\\win-library\\3.6\\strvalidator\\extdata\\languages\\strings.txt"

# Read file.
dt <- fread(file = filePath, sep = "\t", encoding = "UTF-8")

# Check data. Not identical.
print(dt[1]$value) # print adds backslash \\n
print(strAsIntended) # prints \n
cat(dt[1]$value) # cat prints as is \n
cat(strAsIntended) # prints with new line

# Set key column.
setkey(dt, key = "key")

# Get strings for the specific function.
dt <- dt[dt$scope == "test_gui", ]

# Fix new line character.
dt[ , value:=gsub("\\n", "\n", value, fixed = TRUE)]

# Check data. Now identical and prints \n
print(dt[1]$value)
print(strAsIntended) 
# Now identical and prints with a new line.
cat(dt[1]$value)
cat(strAsIntended)

# Get strings.
strText <- dt["strMsg1"]$value
strButton <- dt["strMsg2"]$value
strWinTitle <- dt["strMsg3"]$value

# Construct gui.
w <- gwindow(title = strWinTitle)
g <- ggroup(horizontal = FALSE, container = w, expand = TRUE, fill = "both")
gtext(text = strText, container = g)
gtext(text = strAsIntended, container = g)
gbutton(text = strButton, container = g)

[Screenshot: image not preserved]

Running:

R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

1
We don't know what is in your strings.txt file. - user2554330
It's in the comment in the code example, below # strings.txt. - Oskar Hansson

1 Answer

0
votes

You are misinterpreting what fread is doing. Your input file contains a backslash followed by n, and that's what the string from fread contains. However, when you print a string containing a backslash, it is doubled. (Use cat() to print it if you don't want this.) Your strAsIntended variable doesn't contain a backslash, it contains a single newline character, which is displayed as \n when printed.
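
To make the distinction concrete, here is a minimal base R sketch. The variable names fromFile and asIntended and the short sample strings are made up for illustration; they stand in for the value read by fread and for strAsIntended.

```r
# Two characters: a backslash followed by the letter n (what the file holds).
fromFile <- "a\\nb"
# One character: a real newline (what the R string literal "\n" produces).
asIntended <- "a\nb"

nchar(fromFile)    # 4: a, backslash, n, b
nchar(asIntended)  # 3: a, newline, b

print(fromFile)    # print() escapes the backslash, so it shows "a\\nb"
print(asIntended)  # print() escapes the newline, so it shows "a\nb"

cat(fromFile, "\n")    # cat() writes the raw characters: a\nb
cat(asIntended, "\n")  # a and b end up on separate lines

identical(fromFile, asIntended)  # FALSE
```

Comparing nchar() on both strings is a quick way to check which of the two a column actually holds, independent of how it is printed.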

If you want to convert the \n in your input file into a newline character, use gsub or another substitution function. For example,

# Update the value column (the fourth column in strings.txt) by reference.
dt[, value := gsub("\\n", "\n", value, fixed = TRUE)]
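
The fixed = TRUE argument matters here. The following base R sketch, using a made-up sample string x, shows what happens with and without it. It assumes the default regex engine's documented handling of \n as a line feed.

```r
# A string as it would arrive from the file: backslash + n, not a newline.
x <- "first\\nsecond"

# fixed = TRUE: the pattern is the literal two-character sequence backslash-n.
a <- gsub("\\n", "\n", x, fixed = TRUE)

# Regex mode: the pattern \n means "newline", which x does not contain,
# so nothing is replaced.
b <- gsub("\\n", "\n", x)

# Regex mode with the backslash itself escaped: matches backslash-n again.
d <- gsub("\\\\n", "\n", x)

identical(a, "first\nsecond")  # TRUE
identical(b, x)                # TRUE (unchanged)
identical(d, a)                # TRUE
```

Either the fixed = TRUE form or the doubly escaped regex form should work; the fixed form is arguably easier to read.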