0
votes
examdata <- RCurl::getURL("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt")

examdata2 <- read.table(textConnection(examdata), sep = ",", header = T)

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 1 did not have 2 elements

3
Try examdata2 <- read.table(textConnection(examdata), sep = ",", header = TRUE, skip = 31, stringsAsFactors = FALSE) - akrun
Why did you delete the URL? It's important for answering the question. - Rich Scriven

3 Answers

7
votes

Looks like you just need to skip a few lines. I used readLines(textConnection(examdata)) to find where the actual data table begins. It starts on the 32nd line, so we can use the skip argument of read.csv to skip the first 31 lines. I also set strip.white = TRUE because there seems to be some stray whitespace in the table.

(df <- read.csv(text = examdata, skip = 31L, strip.white = TRUE))
#                          Type Cash Check Credit Debit Electronic Other Total
# 1 Average Number of Purchases 23.7   3.9   10.1  14.4        4.4   2.3  58.7
# 2   Average Transaction Value  $21  $168    $56   $44       $216   $69   $59
# 3      Value of Payments in %   14    19     16    18         27     5   100
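
The header search described above can be sketched as follows. This is only an illustration: the `lines` vector is a made-up stand-in for what readLines(textConnection(examdata)) returns, and it assumes the table's header row is the first line starting with "Type," (as in the output shown above).

```r
# Made-up stand-in for readLines(textConnection(examdata)); not the real file
lines <- c(
  "Spending report",
  "Some notes, with a comma",
  "",
  "Type,Cash,Check",
  "Average Number of Purchases, 23.7, 3.9",
  "Average Transaction Value, $21, $168"
)

# Assumption: the header row is the first line that starts with "Type,"
header_line <- grep("^Type,", lines)[1]

# Skip everything before the header row instead of hard-coding the count
df <- read.csv(text = lines, skip = header_line - 1L, strip.white = TRUE)
df
```

Computing skip from the header position avoids counting lines by hand and keeps working if the preamble grows or shrinks, as long as the header row itself stays recognizable.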

Since you'll probably want those values to be numeric, remove the $ signs and convert the columns with as.numeric so you can use them in any calculations later.

df[-1] <- lapply(df[-1], function(x) as.numeric(sub("[$]", "", x)))
df
#                          Type Cash Check Credit Debit Electronic Other Total
# 1 Average Number of Purchases 23.7   3.9   10.1  14.4        4.4   2.3  58.7
# 2   Average Transaction Value 21.0 168.0   56.0  44.0      216.0  69.0  59.0
# 3      Value of Payments in % 14.0  19.0   16.0  18.0       27.0   5.0 100.0

Now all the columns except the first are numeric.
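
If other files also use thousands separators (for example "$1,168"), the same idea can be wrapped in a small helper. The sketch below is an assumption for illustration, not part of the answer above: `to_number` is a hypothetical name and the data frame holds made-up values.

```r
# Hypothetical helper (not from the answer): strip "$" and "," before converting
to_number <- function(x) as.numeric(gsub("[$,]", "", as.character(x)))

# Made-up values for illustration only
df <- data.frame(
  Type  = c("Average Number of Purchases", "Average Transaction Value"),
  Cash  = c("23.7", "$21"),
  Check = c("3.9", "$1,168"),
  stringsAsFactors = FALSE
)

# Convert every column except the first, then inspect the resulting classes
df[-1] <- lapply(df[-1], to_number)
sapply(df, class)
```

The as.character call is there so the helper also works when the columns were read as factors, which was the read.csv default before R 4.0.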

0
votes

read.table and read.csv will take a URL as a path and handle the connection for you, so you don't really need RCurl:

read.csv("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt", 
         skip = 31)

##                          Type Cash Check Credit Debit Electronic Other Total
## 1 Average Number of Purchases 23.7   3.9   10.1  14.4        4.4   2.3  58.7
## 2   Average Transaction Value  $21  $168    $56   $44       $216   $69   $59

Further, if you use readr::read_csv, you can tell it to parse columns to numbers, stripping out $ characters as it reads:

library(readr)

read_csv("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt", 
         skip = 31, 
         col_types = cols(Type = 'c', .default = 'n'))    # c = character, n = number

## # A tibble: 2 × 8
##                          Type  Cash Check Credit Debit Electronic Other Total
##                         <chr> <dbl> <dbl>  <dbl> <dbl>      <dbl> <dbl> <dbl>
## 1 Average Number of Purchases  23.7   3.9   10.1  14.4        4.4   2.3  58.7
## 2   Average Transaction Value  21.0 168.0   56.0  44.0      216.0  69.0  59.0

0
votes

Try:

df <- read.csv("x.csv", ..., quote = "", fill = TRUE)
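
For context, here is a rough sketch of what those two arguments do, using made-up inline data rather than the file from the question. It uses read.table, as the question does, because read.csv already defaults to fill = TRUE.

```r
# Made-up data: an apostrophe in one field and a row that is one field short
ragged <- c(
  "name,comment,score",
  "a,it's fine,1",
  "b,short row"
)

# quote = "" treats quote characters as ordinary text;
# fill = TRUE pads rows that have too few fields
df <- read.table(text = ragged, sep = ",", header = TRUE,
                 quote = "", fill = TRUE, stringsAsFactors = FALSE)
df
```

Note that fill = TRUE only pads short rows. It does not remove preamble lines before the table, so for a file like the one in the question those lines would still be read as data unless they are skipped.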