The optimizer will not change the kind of output the layers produce; that is determined by the activation functions.
The provided example uses ReLU for the hidden layers, which works well enough for classification but is not suited to modelling a probability; a sigmoid activation is a better choice here.
The sigmoid function squashes its input into the range (0, 1), so its output can be interpreted as a probability, whereas ReLU produces an unbounded non-negative real number.
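To see the difference quickly, here is a minimal TF1-style snippet in the same spirit as the code below (the sample values are just for illustration):

import tensorflow as tf

# Compare the two activations on a few sample values.
vals = tf.constant([-3.0, -1.0, 0.0, 1.0, 3.0])
with tf.Session() as sess:
    print(sess.run(tf.sigmoid(vals)))  # squashed into (0, 1): ~[0.047, 0.269, 0.5, 0.731, 0.953]
    print(sess.run(tf.nn.relu(vals)))  # negatives clipped to 0: [0., 0., 0., 1., 3.]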
In order to make it work for the provided example, change the multilayer_perceptron function to:
def multilayer_perceptron(_X, _weights, _biases):
    # Hidden layer with sigmoid activation
    layer_1 = tf.sigmoid(tf.add(tf.matmul(_X, _weights['h1']), _biases['b1']), name="sigmoid_l1")
    # Hidden layer with sigmoid activation
    layer_2 = tf.sigmoid(tf.add(tf.matmul(layer_1, _weights['h2']), _biases['b2']), name="sigmoid_l2")
    # Output layer with linear activation (raw logits)
    return tf.matmul(layer_2, _weights['out'], name="matmul_lout") + _biases['out']
This simply replaces the ReLU activations with sigmoid ones; the output layer still returns raw logits.
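For context, here is a minimal sketch of the setup this function expects; the layer sizes and random-normal initialization are assumptions for illustration, so adjust them to your own model:

import tensorflow as tf

# Hypothetical dimensions; replace with the sizes used in your own data/model.
n_input, n_hidden_1, n_hidden_2, n_classes = 10, 32, 32, 2

x = tf.placeholder(tf.float32, [None, n_input])
weights = {
    'h1':  tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'h2':  tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes])),
}
biases = {
    'b1':  tf.Variable(tf.random_normal([n_hidden_1])),
    'b2':  tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_classes])),
}

pred = multilayer_perceptron(x, weights, biases)  # raw logits, one column per class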
Then, for the evaluation, use softmax as follows:
output1 = tf.nn.softmax(multilayer_perceptron(x, weights, biases), name="output")
avd = sess.run(output1, feed_dict={x: features_t})
This gives you a value between 0 and 1 for each class, with the values for each example summing to 1. You will probably also have to increase the number of epochs for this to work well.
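As a quick sanity check (assuming avd from the snippet above and NumPy imported as np), every row of the softmax output should sum to roughly 1:

import numpy as np

print(avd[:5])              # first few rows: one value in [0, 1] per class
print(np.sum(avd, axis=1))  # each row should sum to ~1.0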