I'm using a CNN for short text classification (classifying product titles). The code is from http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/
The accuracy on the training set, test set, and validation set is shown below.
The losses also differ: the validation loss is roughly double the training and test losses. (I can't upload more than 2 pictures, sorry!)
The training and test sets were crawled from the web and then split 7:3. The validation set comes from real app messages and was labeled manually.
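For reference, the 7:3 split I use looks roughly like this; a minimal sketch with made-up data (the titles and labels here are hypothetical, not my real crawl):

```python
import random

# Hypothetical crawled examples: (title, label) pairs.
data = [("product title %d" % i, i % 3) for i in range(100)]

random.seed(0)
random.shuffle(data)

# 7:3 split of the crawled data, as described above.
split = int(len(data) * 0.7)
train_set, test_set = data[:split], data[split:]

print(len(train_set), len(test_set))  # 70 30
```

Note that both halves come from the same crawl, so they share one distribution; the validation set does not go through this split.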
I have tried almost every hyper-parameter:
- up-sampling, down-sampling, and no sampling
- batch sizes of 1024, 2048, 5096
- dropout of 0.3, 0.5, 0.7
- embedding_size of 30, 50, 75
But none of these worked!
Now I use the parameters below:

- batch_size: 2048
- embedding_size: 30
- sentence_length: 15
- filter_sizes: 3, 4, 5
- dropout_prob: 0.5
- l2_lambda: 0.005
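The WildML code configures these values via flags; as a minimal sketch, my current settings expressed as a plain dict (the key names are my own, loosely following the tutorial's flag names, not the exact identifiers in the code):

```python
# Current hyper-parameters (key names are assumptions for illustration).
config = {
    "batch_size": 2048,
    "embedding_size": 30,
    "sentence_length": 15,
    "filter_sizes": [3, 4, 5],
    "dropout_keep_prob": 0.5,  # the tutorial uses a *keep* probability
    "l2_lambda": 0.005,
}
print(sorted(config))
```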
At first I thought it was overfitting, but the model performs as well on the test set as on the training set, so I'm confused!
Is it that the distribution of the validation set is very different from that of the training/test sets?
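One rough way to check this suspicion is a vocabulary-overlap test: if many validation tokens never appear in the crawled training data, the embeddings for them are essentially untrained. A minimal sketch with made-up sentences (all text below is hypothetical, not my real data):

```python
# Hypothetical corpora: crawled training titles vs. real app messages.
train_texts = ["cheap wireless mouse", "usb c charging cable", "wireless keyboard combo"]
valid_texts = ["where is my order", "wireless mouse not working"]

def vocab(texts):
    """Set of whitespace tokens across a list of texts."""
    return {tok for t in texts for tok in t.split()}

train_vocab, valid_vocab = vocab(train_texts), vocab(valid_texts)

# Fraction of validation tokens never seen in training: a crude
# proxy for how far apart the two distributions are.
oov = 1 - len(valid_vocab & train_vocab) / len(valid_vocab)
print("OOV rate: %.2f" % oov)  # OOV rate: 0.75
```

A high OOV rate here would support the distribution-mismatch explanation.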
How can I improve the performance on the validation set?