It looks like you just need to skip a few lines. I used readLines(textConnection(examdata)) to find where the actual data table begins; it starts on line 32. We can therefore use the skip argument of read.csv to skip the first 31 lines. I also used the strip.white argument because there seems to be some stray whitespace in the table.
(df <- read.csv(text = examdata, skip = 31L, strip.white = TRUE))
# Type Cash Check Credit Debit Electronic Other Total
# 1 Average Number of Purchases 23.7 3.9 10.1 14.4 4.4 2.3 58.7
# 2 Average Transaction Value $21 $168 $56 $44 $216 $69 $59
# 3 Value of Payments in % 14 19 16 18 27 5 100
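If you'd rather not count lines by eye, the header row can be located programmatically. The following is only a sketch: sample_text is a made-up stand-in for examdata (two preamble lines followed by a cut-down table), and it assumes the header row is the first line starting with "Type,".

```r
# Hypothetical stand-in for examdata: two preamble lines, then the table.
sample_text <- paste(
  "Some report title",
  "Generated for illustration only",
  "Type,Cash,Check",
  "Average Number of Purchases,23.7,3.9",
  "Average Transaction Value,$21,$168",
  sep = "\n"
)

# Read the text line by line and find the first line that looks like the header.
con <- textConnection(sample_text)
lines <- readLines(con)
close(con)
header_line <- grep("^Type,", lines)[1]

# Skip everything before the header row.
df_sample <- read.csv(text = sample_text, skip = header_line - 1L, strip.white = TRUE)
```

With the real data, the same grep() call on readLines(textConnection(examdata)) should point at the header line, provided the header really does begin with "Type,".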
Since you'll probably want those values to be numeric, remove the $ sign and convert the columns to numeric so you can use them in later calculations.
df[-1] <- lapply(df[-1], function(x) as.numeric(sub("[$]", "", x)))
df
# Type Cash Check Credit Debit Electronic Other Total
# 1 Average Number of Purchases 23.7 3.9 10.1 14.4 4.4 2.3 58.7
# 2 Average Transaction Value 21.0 168.0 56.0 44.0 216.0 69.0 59.0
# 3 Value of Payments in % 14.0 19.0 16.0 18.0 27.0 5.0 100.0
Now all the columns except the first are numeric.
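A quick way to confirm the conversion is to test each column with is.numeric. This is a small self-contained sketch; toy is a hypothetical two-row data frame that mimics the shape of df, not the real data.

```r
# Hypothetical miniature of df, with the value columns still stored as text.
toy <- data.frame(
  Type = c("Average Number of Purchases", "Average Transaction Value"),
  Cash = c("23.7", "$21"),
  Check = c("3.9", "$168"),
  stringsAsFactors = FALSE
)

# Same conversion as above: strip the $ sign, then coerce to numeric.
toy[-1] <- lapply(toy[-1], function(x) as.numeric(sub("[$]", "", x)))

# TRUE only if every column except the first is now numeric.
all_numeric <- all(sapply(toy[-1], is.numeric))
```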
examdata2 <- read.table(textConnection(examdata), sep = ",", header = TRUE, skip=31, stringsAsFactors=FALSE)
– akrun