Problem:

I am unable to generate association rules with apriori() from the arules package on my PC. When I run the following call:

rules <- apriori(df, parameter = list(supp = 0.01, conf = 0.5))

RStudio throws back the following error:

Error in asMethod(object) : column(s) 1, 2, 3, 4, 5 not logical or a factor. Discretize the columns first.

Suspected Solution:

I am almost certain the dataset must be formatted to conform to apriori's expected input.

Dataset:

df

Code:

#Load and install packages
#install.packages("arules")
library(arules)

#Assign to dataframe
df <- read.csv("C:/Titanic.csv", header = TRUE, stringsAsFactors = FALSE)

#generate rules
rules <- apriori(df, parameter = list(supp = 0.01, conf = 0.5))

Attempted Solutions:

#One solution on SO was to convert the columns to factors
df<- sapply(df, as.factor)
#failed.


#What if I discretize the columns?
df$Passenger <- discretize(df$Passenger)
#After discretizing this column and running apriori, still get an error.
df$Class <- discretize(df$Class)
#discretize does not work on column Class


#could column 1 be a problem? Try dropping it.
df$Passenger <- NULL
#this did not work!

1 Answer

It seems to me that your logic is correct; only some fine-tuning is needed.

First of all, the character columns need to be read as factors, which means that stringsAsFactors should be set to TRUE when reading your data:

df <- read.csv("C:/Titanic.csv", header = TRUE, stringsAsFactors = TRUE)
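
To see what this flag changes, here is a small sketch that reads the same inline CSV text twice. The column names and values are made up for illustration and are not taken from the actual Titanic.csv.

```r
# Hypothetical stand-in for a few rows of the CSV (values are illustrative only)
csv_text <- "Passenger,Class,Sex,Age,Survived
1,3rd,Male,Child,No
2,1st,Female,Adult,Yes
3,Crew,Male,Adult,No"

# Read the same text with the flag off and on
as_chr <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)
as_fct <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = TRUE)

# Compare the column classes under each setting
chr_classes <- sapply(as_chr, class)
fct_classes <- sapply(as_fct, class)
print(chr_classes)
print(fct_classes)
```

In this sketch the text columns become factors only in the second read, while the numeric Passenger column stays an integer under both settings.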

Then the problem should be with the first column only. If you want to drop the first column, you can do so directly in the argument to apriori():

rules <- apriori(df[ , -1], parameter = list(supp = 0.01, conf = 0.5))

If you prefer to treat the first column as a factor, you can do it like this:

df$Passenger <- as.factor(df$Passenger)

Then your initial statement rules <- apriori(df, parameter = list(supp = 0.01, conf = 0.5)) works perfectly.
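
As a rough end-to-end sketch, the snippet below applies the same idea to a toy data frame. The column names follow the question, but the values are invented. It also suggests why the sapply(df, as.factor) attempt probably failed: sapply() simplifies its result to a matrix, so the output is no longer a data frame of factors, whereas assigning the result of lapply() into df[] keeps the data frame. The apriori() call is guarded so the snippet still runs if arules is not installed.

```r
# Toy data; column names follow the question, values are invented
df <- data.frame(
  Passenger = 1:6,
  Class     = c("1st", "3rd", "Crew", "3rd", "1st", "Crew"),
  Sex       = c("Male", "Female", "Male", "Male", "Female", "Male"),
  Survived  = c("Yes", "Yes", "No", "No", "Yes", "No"),
  stringsAsFactors = FALSE
)

# sapply() collapses the result into a matrix, not a data frame of factors
m <- sapply(df, as.factor)
print(is.matrix(m))

# Assigning into df[] converts every column while keeping the data frame
df[] <- lapply(df, as.factor)
all_factors <- all(sapply(df, is.factor))
print(all_factors)

# Mine rules only if arules is available; the ID column is dropped first
if (requireNamespace("arules", quietly = TRUE)) {
  rules <- arules::apriori(df[, -1], parameter = list(supp = 0.01, conf = 0.5))
}
```

Dropping Passenger before mining is a choice made for this sketch: as an ID column, each of its values would otherwise become its own item.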