
I am trying to implement a simple multi-layer feed-forward neural network for the "iris" dataset, using the "neuralnet" package in R.

The code I am using is as follows:

library(neuralnet)
data(iris)

D <- data.frame(iris, stringsAsFactors=TRUE)

# create formula-
f <- as.formula(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width)

# convert qualitative variables to dummy (binary) variables-
m <- model.matrix(f, data = D)

# create neural network-
iris_nn <- neuralnet(f, data = m, hidden = 4, learningrate = 0.3)

I have two questions at this point:

1.) How do I use the "hidden" parameter? According to the manual page:

hidden: a vector of integers specifying the number of hidden neurons (vertices) in each layer

How should I supply this vector of integers? Say I wanted 1 hidden layer of 4 neurons/perceptrons, or 3 hidden layers of 5 neurons each.

2.) The last line of code gives me the error-

Error in eval(predvars, data, env) : object 'Species' not found

If I remove the "hidden" parameter, this error still persists.

What am I doing wrong here?

Edit: after adding the line-

m <- model.matrix(f, data = D)

The matrix 'm' no longer contains "Species" variable/attribute which I am trying to predict.

Output of str(D):

'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

I have coded this with "nnet" successfully. Posting my code for reference-

data(iris)
library(nnet)

# create formula-
f <- as.formula(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width)

# create a NN with hidden layer having 4 neurons/node and
# maximum number of iterations = 3
iris_nn <- nnet(f, data = iris, size = 4, maxit = 3)

# create a test data-
new_obs <- data.frame(Sepal.Length = 5.5, Sepal.Width = 3.1, Petal.Length = 1.4, Petal.Width = 0.4)

# make prediction-
predict(iris_nn, new_obs)   # gives percentage of which class it may belong
predict(iris_nn, new_obs, type = "class")   # gives the class instead of percentages of which 'class' this data type may belong to


# create a 'confusion matrix' to measure accuracy of model-
# rows are actual values and columns are predicted values-
# table(iris$Species, predict(iris_nn, iris[, 1:4], type = "class"))
cat("\n\nConfusion Matrix for # of iters = 3\n")
print(table(iris$Species, predict(iris_nn, iris[, 1:4], type = "class")))
cat("\n\n")

rm(iris_nn)

# setting 'maxit' to 1000 lets the model converge-
iris_nn <- nnet(f, data = iris, size = 4, maxit = 1000)

# create a new confusion matrix to check model accuracy again-
cat("\n\nConfusion Matrix for # of iters = 1000\n")
print(table(iris$Species, predict(iris_nn, iris[, 1:4], type = "class")))
# table(iris$Species, predict(iris_nn, iris[, 1:4], type = "class"))


# to plot 'iris_nn' trained NN-
# library("NeuralNetTools")
# plotnet(iris_nn)

Thanks!!

@SamFlynn I have edited my post to include the matrix 'm'. But now "Species", the variable I am trying to predict, is gone! Hence the last line of code gives the error that "Species" could not be found. Any ideas? - Arun
I tried that too and couldn't figure it out; I kept getting some error. Add the output of str(d) to the question. What I did was change all factor columns manually to dummy variables, and it worked. - SamFlynn
Will normalization of the attributes help? - Arun

1 Answer


I have no clue how the NN runs or what the best way to run it is, and I don't know much about the iris dataset either.

Just pointing out why it's not running: the column Species.

str(d)
'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

Species is a factor, and the NN doesn't take factors.

Convert it to dummy variables:

library(neuralnet)
d <- iris

# one 0/1 indicator column per class; virginica is the case where both are 0
d$set <- 0
d$set[d$Species == "setosa"] <- 1

d$versi <- 0
d$versi[d$Species == "versicolor"] <- 1

f <- as.formula(set + versi ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width)

iris_nn <- neuralnet(f, data = d, hidden = 4, learningrate = 0.3)
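
The manual dummy columns could also be built for all three species in one step. This is only a sketch in base R: it calls model.matrix() on a one-sided formula (~ Species - 1) so that every level gets its own 0/1 column, then binds those columns to the four numeric predictors. The names resp and d_all are made up for illustration.

```r
data(iris)

# One indicator column per level of Species; "- 1" drops the intercept so
# that no level is absorbed into a baseline.
resp <- model.matrix(~ Species - 1, data = iris)

# model.matrix() prefixes the column names with "Species"; rename them to
# the plain level names (assumed naming, any valid names would do).
colnames(resp) <- levels(iris$Species)

# Predictors plus the three 0/1 response columns (d_all is a hypothetical name).
d_all <- cbind(iris[, 1:4], resp)

str(d_all)
```

A formula such as setosa + versicolor + virginica ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width can then be passed to neuralnet() with data = d_all.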

EDIT:

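So when you say hidden = c(5, 3), the network has your input nodes, then a hidden layer of 5 nodes, then a second hidden layer of 3 nodes, then the output node(s). Each element of the vector is one hidden layer, so hidden = 4 gives 1 hidden layer of 4 neurons and hidden = c(5, 5, 5) gives 3 hidden layers of 5 neurons each.

I have no clue how these choices impact accuracy.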
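
As a rough illustration of how the hidden vector maps to layers, the sketch below fits two networks on a single setosa indicator and counts their weight matrices. It assumes the fits converge with the default settings. The seed, the single response column and the names nn_one and nn_three are arbitrary choices for this example only.

```r
library(neuralnet)

set.seed(1)  # arbitrary seed, only so that reruns start from the same weights
data(iris)

d <- iris[, 1:4]
d$set <- as.numeric(iris$Species == "setosa")  # single 0/1 response

f <- set ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width

# 1 hidden layer of 4 neurons
nn_one <- neuralnet(f, data = d, hidden = 4, linear.output = FALSE)

# 3 hidden layers of 5 neurons each
nn_three <- neuralnet(f, data = d, hidden = c(5, 5, 5), linear.output = FALSE)

# There is one weight matrix per layer-to-layer connection, so the count
# is the number of hidden layers plus one.
length(nn_one$weights[[1]])
length(nn_three$weights[[1]])
```

Calling plot(nn_three) draws the layers side by side, which is a quick way to check that the architecture is the one intended.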

For neuralnet, compute() plays the role that predict() plays for most other machine learning models.

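library(neuralnet)
library(caret)  # provides confusionMatrix()
# compute() is called with the neuralnet:: prefix because the plain call was producing
# an error (possibly another attached package masking compute())
nnans <- neuralnet::compute(NN, test)
# compute() returns a list; net.result holds one column of scores per output node
scores <- nnans$net.result
# assumption: one output node per level of test_labels, in the same order
pred <- factor(levels(test_labels)[max.col(scores)], levels = levels(test_labels))
confusionMatrix(pred, test_labels)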
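
Putting the pieces together, an end-to-end run on iris might look like the sketch below. It is not a tuned setup: the 100/50 split, the seed, the threshold and stepmax values and the variable names are all assumptions for this example, and base table() is used instead of caret so that nothing beyond neuralnet is needed.

```r
library(neuralnet)

set.seed(42)  # arbitrary seed
data(iris)

# Predictors plus one 0/1 column per species, named after the factor levels.
d <- cbind(iris[, 1:4], model.matrix(~ Species - 1, data = iris))
names(d)[5:7] <- levels(iris$Species)

# Hypothetical split: 100 rows for training, the remaining 50 for testing.
idx   <- sample(nrow(d), 100)
train <- d[idx, ]
test  <- d[-idx, ]

f <- setosa + versicolor + virginica ~
  Sepal.Length + Sepal.Width + Petal.Length + Petal.Width

# threshold and stepmax are loosened here (assumed values) to make
# convergence more likely; the defaults are stricter.
nn <- neuralnet(f, data = train, hidden = 4, linear.output = FALSE,
                threshold = 0.1, stepmax = 1e6)

# compute() expects only the predictor columns and returns a list whose
# net.result element has one column of scores per output node.
out <- neuralnet::compute(nn, test[, 1:4])

# Pick the highest-scoring output node in each row and map it to a label.
pred <- factor(levels(iris$Species)[max.col(out$net.result)],
               levels = levels(iris$Species))

# Rows are actual classes, columns are predicted classes.
print(table(actual = iris$Species[-idx], predicted = pred))
```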