0
votes

How can I convert a character vector of shopping items into a "transactions" object for arules? The original data looks something like this:

shopping_items <- c("apple banana", "orange", "tea orange beef")

Each element of the vector represents the items bought in a single transaction, with the items separated by a space " ". For example, transaction 1 contains two items, apple and banana. How can I convert this into the "transactions" class so that I can work with it in arules?

Thank you in advance!

2 Answers

2
votes

This is the short version:

library(arules)
shopping_items <- c("apple banana", "orange", "tea orange beef")    

trans <- as(strsplit(shopping_items, " "), "transactions")

inspect(trans)
    items            
[1] {apple,banana}   
[2] {orange}         
[3] {beef,orange,tea}
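
If the real data is messier than the example, a slightly more defensive variant may help. This is only a sketch, assuming whitespace is the only separator and that repeated items within one transaction should count once; the messy sample vector below is made up for illustration and is not from the question.

```r
library(arules)

# Hypothetical messy input: double spaces, a leading space, a repeated item
shopping_items <- c("apple  banana", " orange", "tea orange beef tea")

# Trim the ends, then split on runs of whitespace instead of a single space
item_list <- strsplit(trimws(shopping_items), "\\s+")

# Drop repeated items inside each transaction before coercing
item_list <- lapply(item_list, unique)

trans <- as(item_list, "transactions")
inspect(trans)
```

Splitting on "\\s+" avoids empty-string items when there are consecutive spaces, and calling unique() first keeps each basket free of duplicates.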
2
votes

This implementation is probably not optimal, but it can serve as a starting point to improve on.

library(stringi)
library(arules)
library(purrr)

shopping_items <- c("apple banana", "orange", "tea orange beef")

str <- paste(shopping_items,collapse = ' ')

# unique items
str_un <- unique(unlist(stri_split_fixed(str,' ')))

# create a dataframe with dimensions:
# length(shopping_items) x length(str_un)
df <- as.data.frame(matrix(rep(0,length(str_un)*length(shopping_items )),ncol=length(str_un)))
names(df) <- str_un

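# positions of 1's in each column; items are matched exactly against the
# split transactions, because grep() would also match substrings
# (e.g. "tea" inside "steak")
item_list <- stri_split_fixed(shopping_items, ' ')
vecs <- map(str_un, function(item) which(map_lgl(item_list, function(x) item %in% x)))

for (x in seq_along(str_un)) df[vecs[[x]], x] <- 1

# Generate a transactions dataset from the 0/1 incidence matrix.
# (Factor columns would instead produce items such as "apple=0" and "apple=1".)
tr <- as(as.matrix(df) > 0, "transactions")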

# Generate the association rules.
# rules <- apriori(tr, ...