0
votes

I want to read a CSV file in which one column is wrapped in quotes, and the quoted string itself contains commas.

CSV Header:

id,name,promo,categories,price,unit_price

CSV example row:

142251, TULI,,"Men√∫,Limpieza,Limpiadores,Pisos,Curadores",$ 73.65,$81.83 x Lt

I want to get a data frame like this:

id=142251

name=TULI

promo=

categories=Men√∫,Limpieza,Limpiadores,Pisos,Curadores

price=$ 73.65

unit_price=$81.83 x Lt

I've tried data <- read.csv(file="data.csv", sep=",", quote="")

but I get the error "Error in read.table(file = file, header = header, sep = sep, quote = quote, : more columns than column names".

I suspect the solution is very simple, but I cannot find it. Thanks in advance.

2
Welcome to Stack Overflow! Please read the info about how to ask a good question and how to give a reproducible example. This will make it much easier for others to help you. - Sotos

2 Answers

1
votes

You can also use readr from the tidyverse for easy CSV reading:


library(readr)
txt <- 'id,name,promo,categories,price,unit_price
142251, TULI,,"Men√∫,Limpieza,Limpiadores,Pisos,Curadores",$ 73.65,$81.83 x Lt'


df <- read_csv(txt)
df
#> # A tibble: 1 x 6
#>       id name  promo categories                               price   unit_price
#>    <dbl> <chr> <lgl> <chr>                                    <chr>   <chr>     
#> 1 142251 TULI  NA    Men√∫,Limpieza,Limpiadores,Pisos,Curado… $ 73.65 $81.83 x …

# Bonus
library(dplyr)
library(tidyr)

df2 <- df %>% 
  mutate(categories = strsplit(categories, ","))
df2
#> # A tibble: 1 x 6
#>       id name  promo categories price   unit_price 
#>    <dbl> <chr> <lgl> <list>     <chr>   <chr>      
#> 1 142251 TULI  NA    <chr [5]>  $ 73.65 $81.83 x Lt

df2 %>% unnest(categories)
#> # A tibble: 5 x 6
#>       id name  promo categories  price   unit_price 
#>    <dbl> <chr> <lgl> <chr>       <chr>   <chr>      
#> 1 142251 TULI  NA    Men√∫       $ 73.65 $81.83 x Lt
#> 2 142251 TULI  NA    Limpieza    $ 73.65 $81.83 x Lt
#> 3 142251 TULI  NA    Limpiadores $ 73.65 $81.83 x Lt
#> 4 142251 TULI  NA    Pisos       $ 73.65 $81.83 x Lt
#> 5 142251 TULI  NA    Curadores   $ 73.65 $81.83 x Lt

Created on 2019-11-29 by the reprex package (v0.3.0)

Of course, you can also read straight from the file with df <- read_csv("data.csv").
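One detail in the output above is that promo was guessed as <lgl> because the only value is empty. If you would rather keep it as text, a minimal sketch (assuming readr is installed, and using a temporary file as a stand-in for data.csv) is to pass col_types explicitly:

```r
library(readr)

# Write the sample from the question to a temporary file.
# `path` is a hypothetical stand-in for "data.csv".
path <- tempfile(fileext = ".csv")
writeLines(c(
  'id,name,promo,categories,price,unit_price',
  '142251, TULI,,"Men√∫,Limpieza,Limpiadores,Pisos,Curadores",$ 73.65,$81.83 x Lt'
), path)

df <- read_csv(
  path,
  col_types = cols(
    id = col_double(),
    .default = col_character()  # every other column, including promo, stays <chr>
  )
)

df
```

The quoted categories field is still read as a single column; only the guessed column types change.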

0
votes

data <- read.csv(file="data.csv", sep=",", quote="\"")

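This should work. The help file says: "To disable quoting altogether, use quote = """. Passing quote="" therefore turned quoting off, so the commas inside the quoted field were treated as separators and the row had more fields than the header. To keep quoting on, pass the double-quote character itself, escaped with a backslash: quote="\"". That is also the default for read.csv, so you can simply leave the quote argument out.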
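To see the difference without needing the file, here is a small base R sketch that feeds the header and row from the question through the text argument. The names csv_text, err and data are just illustrative, and strip.white = TRUE is an optional extra (not part of the answer above) that drops the leading space in " TULI":

```r
# Header and example row from the question, supplied inline.
csv_text <- 'id,name,promo,categories,price,unit_price
142251, TULI,,"Men√∫,Limpieza,Limpiadores,Pisos,Curadores",$ 73.65,$81.83 x Lt'

# With quoting disabled, every comma splits a field, so the data row has
# more fields than the header and read.csv stops with an error.
err <- tryCatch(
  read.csv(text = csv_text, quote = ""),
  error = function(e) conditionMessage(e)
)
print(err)

# With the double quote as the quoting character (the read.csv default),
# the commas inside the quoted field are kept as part of one value.
data <- read.csv(
  text = csv_text,
  quote = "\"",
  strip.white = TRUE,        # trim leading/trailing spaces in unquoted fields
  stringsAsFactors = FALSE   # keep text columns as character
)
str(data)
```

Note that base R reads the empty promo field as NA rather than as an empty string.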