I've just written a knn model in R. However, I don't know how to use the output to predict new data.

# split into train (treino) and test (teste)
treino_index <- sample(seq_len(nrow(iris)), size = round(0.75*nrow(iris)))
treino <- iris[treino_index, ]
teste <- iris[-treino_index, ]

# take a look at the sample
head(treino)
head(teste)

# save the species labels for later
treino_especie = treino$Species
teste_especie = teste$Species

# exclude species from train and test dataset
treino = treino[-5]
teste = teste[-5]

# run knn
library(class)
iris_teste_knn <- knn(train = treino, test = teste, cl = treino_especie, k = 3, prob = TRUE)


# model performance using cross table
install.packages('gmodels')
library('gmodels')
CrossTable(x=teste_especie, y=iris_teste_knn, prop.chisq=FALSE)

How do I apply this to new data? Suppose I have a flower with the following measurements: Sepal.Length = 5.0, Sepal.Width = 3.3, Petal.Length = 1.3, Petal.Width = 0.1. How do I find out which species it belongs to?

1 Answer

kNN is a lazy classifier. Unlike classifiers such as logistic regression or tree-based algorithms, it doesn't build a fitted model that you can use for prediction later; it fits and predicts in a single call. Once you have finished tuning the parameters (such as k), call knn again with the tuned parameters, the training data, and the new cases as the test set. Use:

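  # new case, in the same column order as treino:
  # Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
  x <- c(5.0, 3.3, 1.3, 0.1)
  # treino must be the four-column version (Species already dropped)
  knn(train = treino, test = x, cl = treino_especie, k = 3, prob = TRUE)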
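
To put the whole workflow in one place, here is a minimal sketch that picks k and then classifies the new flower. It assumes the same 75/25 split as in the question. The fixed seed, the leave-one-out tuning with `knn.cv()` (also from the `class` package), and the names `candidate_k`, `accuracy`, `best_k`, `new_obs` and `pred` are illustrative choices of mine, not part of the original code.

```r
library(class)  # provides knn() and knn.cv()

set.seed(42)  # assumption: fixed seed so the split is reproducible

# same 75/25 split as in the question, with Species (column 5) dropped
treino_index <- sample(seq_len(nrow(iris)), size = round(0.75 * nrow(iris)))
treino <- iris[treino_index, -5]
treino_especie <- iris$Species[treino_index]

# pick k by leave-one-out cross-validation on the training set
# (candidate_k, accuracy and best_k are hypothetical helper names)
candidate_k <- 1:10
accuracy <- sapply(candidate_k, function(k) {
  mean(knn.cv(train = treino, cl = treino_especie, k = k) == treino_especie)
})
best_k <- candidate_k[which.max(accuracy)]

# the new observation as a one-row data frame with the same columns as treino
new_obs <- data.frame(Sepal.Length = 5.0, Sepal.Width = 3.3,
                      Petal.Length = 1.3, Petal.Width = 0.1)

# "predicting" is just another knn() call with the new case as the test set
pred <- knn(train = treino, test = new_obs, cl = treino_especie,
            k = best_k, prob = TRUE)

print(pred)                # predicted species
print(attr(pred, "prob"))  # share of the k nearest neighbours that voted for it
```

Passing the new case as a named data frame rather than a bare vector makes the column order explicit, which matters because knn matches columns by position, not by name. The `"prob"` attribute gives a rough sense of how unanimous the neighbours were.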