I am starting out with Keras in R and want to build a model for text classification. However, I am stuck on an error that is most likely due to my limited understanding of deep learning and Keras. Any help would be great. The code is below; the data in the snippet is deliberately small so that the problem can be reproduced quickly.

library(keras)
library(tm)

data <- data.frame("Id" = 1:10, "Text" = c("the cat was mewing","the cat was black in color","the dog jumped over the wall","cat cat cat everywhere","dog dog cat play style","cat is white yet it is nice","dog is barking","cat sweet","angry dog","cat is nice nice nice"), "Label" = c(1,1,2,1,2,1,2,1,2,1))
corpus <- VCorpus(VectorSource(data$Text))
tdm <- DocumentTermMatrix(corpus, list(removePunctuation = TRUE, stopwords = TRUE,removeNumbers = TRUE))
data_t <- as.matrix(tdm)
data <- cbind(data_t,data$Label) 
dimnames(data) = NULL
#Normalize data
data[,1:(ncol(data)-1)] = normalize(data[,1:(ncol(data)-1)])
data[,ncol(data)] = as.numeric(data[,ncol(data)]) - 1
set.seed(123)
ind = sample(2,nrow(data),replace = T,prob = c(0.8,0.2))
training = data[ind==1,1:(ncol(data)-1)]
test = data[ind==2,1:(ncol(data)-1)]
traintarget = data[ind==1,ncol(data)]
testtarget = data[ind==2,ncol(data)]
# One hot encoding
trainLabels = to_categorical(traintarget)
testLabels = to_categorical(testtarget)
print(testLabels)
#Create sequential model
model = keras_model_sequential()
model %>% 
  layer_dense(units=8,activation='relu',input_shape=c(16)) 
summary(model)
model %>%
compile(loss='categorical_crossentropy',optimizer='adam',metrics='accuracy')
history = model %>%
  fit(training,
      trainLabels,
      epoch=200,
      batch_size=2,
      validation_split=0.2)

One-hot encoding may be unnecessary in this example, and there may be several other places where I have gone wrong. However, the last statement (the call to fit) throws a shape error. I set input_shape to 16 because my data has 16 columns.

The error that I am getting is

Error in py_call_impl(callable, dots$args, dots$keywords) : ValueError: Error when checking target: expected dense_32 to have shape (None, 8) but got array with shape (7, 2)

Any guidance would be really helpful.

1 Answer

This happens because your first layer is also your output layer. The output layer should have the same number of units as the number of classes you are trying to predict. Here it has 8 units while you have only 2 classes (trainLabels has two columns), which is why Keras expects a target of shape (None, 8) but receives an array of shape (7, 2). In your case you could edit your model like this:

model %>% 
  layer_dense(units = 8, activation = 'relu', input_shape = 16) %>%
  layer_dense(units = 2, activation = 'softmax')
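
To see where the (7, 2) target shape comes from, here is a minimal base R sketch. It assumes a hypothetical set of seven zero-based training labels (the real values depend on your sample() split) and uses diag() indexing as a stand-in for to_categorical so that it runs without Keras:

```r
# Hypothetical zero-based labels for 7 training rows
# (an assumption; the actual values depend on the sample() split)
traintarget <- c(0, 0, 1, 0, 1, 0, 1)

# Number of classes, taken as the largest label plus one
n_classes <- max(traintarget) + 1

# One-hot encode with base R: row k of the identity matrix encodes class k - 1
one_hot <- diag(n_classes)[traintarget + 1, , drop = FALSE]

dim(one_hot)   # 7 rows, 2 columns
ncol(one_hot)  # 2, the number of units the last layer needs
```

The number of columns in the encoded labels is what the units argument of the final layer_dense has to match.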