I have a CSV file with two columns order_id and product_id. The file has about 140k rows.

Here is some sample data from the file:

"order_id","product_id"
"801135853641","1410535456841"
"778925670473","120742871041"
"889236947017","54238412801"
"774614614089","1410073886793"
"810056155209","1293186957385"

I want to run apriori on this data, so I am reading it in as a transactions object using read.transactions. The code is:

library(arules)
trans = read.transactions(file_location,
                          format = "single",
                          sep = ",",
                          cols = c("order_id", "product_id"))

When I run this, I get the error

Error in validObject(.Object) : invalid class “ngTMatrix” object: all row indices (slot 'i') must be between 0 and nrow-1 in a TsparseMatrix

I tried a couple of searches but couldn't find any solutions. Any help would be appreciated.

It seems to work for me with the sample data. Make sure you have the latest version of arules. Please open an issue at github.com/mhahsler/arules/issues with the complete data. - Michael Hahsler
I gave up on debugging this and used as(...,"transactions") instead. That said, it was likely just a version issue, because I upgraded R soon after posting this. - Moon_Watcher

1 Answer

I had a similar issue, and it was solved after removing the records with null values. :D
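
For anyone who wants to try the same thing, here is a rough sketch of dropping incomplete rows before building the transactions. It combines the cleanup described above with the as(..., "transactions") route mentioned in the comments. The inline csv_text sample and the is_missing helper are made up for illustration, and treating NA, empty strings and the literal text "NULL" as missing is an assumption about what "null" looks like in your file. This is not confirmed to be the cause of the original error.

```r
library(arules)

# Hypothetical sample standing in for the real file;
# the third data row has an empty product_id.
csv_text <- '"order_id","product_id"
"801135853641","1410535456841"
"801135853641","120742871041"
"778925670473",""
"889236947017","54238412801"'

# Read both columns as character so the long IDs are not turned into doubles.
orders <- read.csv(text = csv_text, colClasses = "character")

# Hypothetical helper: TRUE for NA, empty, or literal "NULL" values (assumption).
is_missing <- function(x) {
  x <- trimws(x)
  is.na(x) | x == "" | x == "NULL"
}

# Keep only rows where both the order ID and the product ID are present.
orders <- orders[!(is_missing(orders$order_id) | is_missing(orders$product_id)), ]

# Group product IDs by order, drop repeated products within an order,
# and coerce the resulting list to a transactions object.
baskets <- lapply(split(orders$product_id, orders$order_id), unique)
trans <- as(baskets, "transactions")

summary(trans)
```

For the real data, replace the text = csv_text argument with the path to the file. Comparing nrow(orders) before and after the filter shows how many records were dropped.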