I am quite new to Caffe and deep learning, so please bear with my inexperience and naive questions. Now to the point: I want to train GoogleNet on the FER2013 dataset (it consists of faces, and its purpose is to classify each face into one of 7 emotion categories). However, the data isn't in image format; instead, each sample is an array of 48x48 = 2304 values, with each value between 0 and 255. So, in order to create the LMDB files needed to feed Caffe, I wrote the following Python script to turn the arrays into real images.
import numpy as np
from PIL import Image
import csv
import itertools

with open('fer2013.csv', 'rb') as f:
    mycsv = csv.reader(f)
    next(mycsv)  # skip the header row (emotion,pixels,Usage)
    i = 0
    for row in itertools.islice(mycsv, 340):
        # row[0] is the label (0-6), row[1] is a space-separated string of 2304 pixel values
        data = np.array(map(int, row[1].split()), dtype='uint8')
        # the pixel values are already in 0-255, so no scaling is needed
        im = Image.fromarray(data.reshape((48, 48)))
        directory = 'imagestotest/'  # must already exist
        path = "image" + str(i) + ".jpg"
        im.save(directory + path)
        i = i + 1
        with open("testset.txt", "a") as myfile:
            myfile.write(path + " " + row[0] + "\n")
I then prepare my LMDB files with the following command:
GLOG_logtostderr=1 ./deep-learning/caffe/build/tools/convert_imageset --resize_height=256 --resize_width=256 --shuffle /home/panos/Desktop/images/ /home/panos/Desktop/trainingset.txt /home/panos/Desktop/train_lmdb
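For completeness, the test LMDB is built the same way (imagestotest/ and testset.txt come from the script above; the test_lmdb output directory name is my own choice):

GLOG_logtostderr=1 ./deep-learning/caffe/build/tools/convert_imageset --resize_height=256 --resize_width=256 --shuffle /home/panos/Desktop/imagestotest/ /home/panos/Desktop/testset.txt /home/panos/Desktop/test_lmdb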
Finally, I compute the image mean, and in train_val.prototxt I set the loss1, loss2 and loss3 classifier layers to num_output: 7 (as I have 7 classes, 0-6).
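Concretely, the mean is computed with Caffe's compute_image_mean tool (the output filename is my own choice):

./deep-learning/caffe/build/tools/compute_image_mean /home/panos/Desktop/train_lmdb /home/panos/Desktop/mean.binaryproto

and each of the three classifier layers in train_val.prototxt ends up looking roughly like this (layer names as in the BVLC GoogleNet definition; only num_output changes):

layer {
  name: "loss3/classifier"
  type: "InnerProduct"
  # bottom/top and param blocks kept as in the original file
  inner_product_param {
    num_output: 7  # was 1000 for the ImageNet classes
  }
}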
I run my model (training set: 5000 images, test set: 340) and the accuracy is rather disappointing: close to 23% top-1 and 88.8% top-5.
Could this be a hyperparameter configuration problem, or are my input files not created correctly? (I fear my Python cooking skills.)
If it helps, my main hyperparameters are: test_iter: 7, test_interval: 40, base_lr: 0.001, momentum: 0.9, weight_decay: 0.0002.
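Put together, the relevant part of my solver.prototxt is (the net path is mine; remaining settings such as lr_policy and max_iter are left at the values from the GoogleNet example):

net: "train_val.prototxt"
test_iter: 7        # test_iter x test batch size should cover the 340 test images
test_interval: 40
base_lr: 0.001
momentum: 0.9
weight_decay: 0.0002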
Thanks in advance!
You resize 48 X 48 images into 256 X 256. This might be a problem. Did you use the Googlenet pretrained model for weight initialization?
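If not, a typical finetuning run initializes from the model-zoo weights, something like this (assuming bvlc_googlenet.caffemodel has been downloaded from the Caffe model zoo):

./deep-learning/caffe/build/tools/caffe train --solver=solver.prototxt --weights=bvlc_googlenet.caffemodel

– lnman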