Perhaps too general a question, but can anyone explain what would cause a Convolutional Neural Network to diverge?
Specifics:
I am using TensorFlow's iris_training model with some of my own data and keep getting:
ERROR:tensorflow:Model diverged with loss = NaN.
Traceback...
tensorflow.contrib.learn.python.learn.monitors.NanLossDuringTrainingError: NaN loss during training.
Traceback originated with line:
tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                               hidden_units=[300, 300, 300],
                               #optimizer=tf.train.ProximalAdagradOptimizer(learning_rate=0.001, l1_regularization_strength=0.00001),
                               n_classes=11,
                               model_dir="/tmp/iris_model")
I've tried adjusting the optimizer, setting the learning rate to zero, and using no optimizer. Any insights into network layers, data size, etc. would be appreciated.
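Since the loss is NaN even with a learning rate of zero, the input data itself is worth ruling out. Two commonly reported causes are NaN or infinite feature values, and labels that fall outside the range [0, n_classes). Below is a minimal sketch of such a check. The helper name find_data_problems is hypothetical and not part of TensorFlow, and it assumes the features are plain nested lists of floats and the labels are plain integers.

```python
import math

def find_data_problems(features, labels, n_classes):
    """Return a list of human-readable problems found in the training data.

    Hypothetical helper (not a TensorFlow API). Assumes `features` is a list
    of rows of floats and `labels` is a list of integers.
    """
    problems = []
    # Any NaN or infinite feature value can propagate into a NaN loss.
    for row_idx, row in enumerate(features):
        for col_idx, value in enumerate(row):
            if math.isnan(value) or math.isinf(value):
                problems.append("feature[%d][%d] is %r" % (row_idx, col_idx, value))
    # A classifier with n_classes outputs expects labels in 0 .. n_classes - 1.
    for row_idx, label in enumerate(labels):
        if not 0 <= label < n_classes:
            problems.append("label[%d] = %r is outside [0, %d)" % (row_idx, label, n_classes))
    return problems

features = [[5.1, 3.5], [4.9, float("nan")]]
labels = [0, 11]  # 11 is out of range when n_classes=11
for problem in find_data_problems(features, labels, n_classes=11):
    print(problem)
```

With n_classes=11 as in the snippet above, valid labels would be 0 through 10, so a dataset labelled 1 through 11 would be flagged by this check.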
tf.losses.sparse_softmax_cross_entropy(y, logits)
instead of my own implementation of Safe Softmax using tf.nn.softmax
– Eduardo Reis
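To illustrate why computing the loss directly from logits can help, here is a rough pure-Python sketch for a single example. It is not TensorFlow's implementation, and both function names are made up for illustration. The first version takes a softmax and then a log, which breaks when a probability underflows to zero. The second uses the log-sum-exp trick and never forms that probability.

```python
import math

def naive_cross_entropy(label, logits):
    """Softmax followed by log. Illustrative only, numerically fragile."""
    exps = [math.exp(z) for z in logits]   # very negative logits underflow to 0.0
    total = sum(exps)
    probs = [e / total for e in exps]
    return -math.log(probs[label])         # log(0.0) is undefined

def stable_cross_entropy(label, logits):
    """Same quantity computed from logits via log-sum-exp."""
    m = max(logits)                        # shift so the largest exponent is 0
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

logits = [0.0, -800.0]
print(stable_cross_entropy(1, logits))     # finite loss

try:
    naive_cross_entropy(1, logits)
except ValueError as err:
    # Pure Python raises here; array libraries typically yield inf or NaN instead.
    print("naive version failed:", err)
```

The argument order (label first, then logits) mirrors the call quoted in the comment above. In a framework that works on floating-point arrays, the naive path would usually not raise but would instead silently produce inf or NaN, which matches the reported error.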