I've got a simple program that is supposed to train a logistic regression model on some data.
There is one output class y (0 = false, 1 = true).
There are 25 features.
I'm struggling to define the shapes of my variables and placeholders correctly.
Here's the code:
#!/usr/bin/env python3
import tensorflow as tf
import numpy as np
import pandas as pd
from sklearn import preprocessing
from sklearn import model_selection
import matplotlib.pyplot as plt
import seaborn as sns
import sys
sns.set(style='white')
sns.set(style='whitegrid',color_codes=True)
bank_data = pd.read_csv('data/bank.csv',header=0,delimiter = ';')
bank_data = bank_data.dropna()
bank_data.drop(bank_data.columns[[0,3,8,9,10,11,12,13]],axis=1,inplace=True)
data_set = pd.get_dummies(bank_data,columns = ['job','marital','default','housing','loan','poutcome'])
data_set.drop(data_set.columns[[14,27]],axis=1,inplace=True)
data_set_y = data_set['y']
data_set_y.replace(('yes','no'),(1.0,0.0),inplace=True)
data_set_X = data_set.drop(['y'],axis=1)
num_samples = data_set.shape[0]
num_features = data_set_X.shape[1]
print ('num_features = ', num_features)
X = tf.placeholder('float',[None,num_features])
y = tf.placeholder('float',[None,1])
W = tf.Variable(tf.zeros([num_features,1]),dtype=tf.float32)
b = tf.Variable(tf.zeros([1]),dtype=tf.float32)
train_X,test_X,train_y,test_y = model_selection.train_test_split(data_set_X,data_set_y,random_state=0)
print (train_y.head())
print (train_X.head())
prediction = tf.add(tf.matmul(X,W),b)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
num_epochs = 1000
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(num_epochs):
        _,l = sess.run([optimizer,cost],feed_dict = {X: train_X, y: train_y})
        if epoch % 50 == 0:
            print ('loss = %f' % (l))
The current error I'm getting is: ValueError: Cannot feed value of shape (3390,) for Tensor 'Placeholder_1:0', which has shape '(?, 1)'
train_y is a pandas Series that simply contains either a 0 or a 1. Do I need to reshape train_y into one-hot vectors with two columns and change the dimensions of the y placeholder accordingly?
Here is the head output for the y training data:
4384    0.0
2560    0.0
1470    0.0
1771    0.0
2604    0.0
Having to deal with shaping my tensors is becoming a serious nightmare. Any help appreciated.
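For the shape error specifically, one possible fix is to give the labels a trailing axis so they match the (?, 1) placeholder, rather than switching to one-hot vectors. A minimal NumPy sketch, where labels_1d is a made-up stand-in for train_y (the values are illustrative, not taken from bank.csv):

```python
import numpy as np

# Hypothetical stand-in for train_y: a 1-D array of 0/1 labels with shape (N,).
labels_1d = np.array([0.0, 0.0, 1.0, 0.0, 1.0], dtype=np.float32)

# reshape(-1, 1) adds a trailing axis, turning shape (N,) into (N, 1),
# which is the layout a [None, 1] placeholder expects.
labels_2d = labels_1d.reshape(-1, 1)

print(labels_1d.shape)
print(labels_2d.shape)
```

Assuming train_y is a pandas Series as described, the equivalent in the feed would be something like train_y.values.reshape(-1, 1).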
prediction is one scalar for every element in the batch. What do you think the result of applying softmax to a scalar is? – Vladimir Bystricky
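To make the point in the comment concrete, here is a small NumPy sketch (not TensorFlow itself) of what happens when softmax is applied to a single logit per example: the probability is always 1, so the cross-entropy is always 0 and nothing can be learned. It also shows a sigmoid-based cross-entropy, which is one common choice for a single 0/1 output. The helper names and the logit and label values are assumptions made up for illustration.

```python
import numpy as np

def softmax(logits):
    # Softmax over the last axis, shifted by the row max for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sigmoid_cross_entropy(logits, labels):
    # Numerically stable binary cross-entropy on logits:
    # max(x, 0) - x * z + log(1 + exp(-|x|))
    return (np.maximum(logits, 0) - logits * labels
            + np.log1p(np.exp(-np.abs(logits))))

# Hypothetical logits of shape (3, 1): one scalar per example, as in the question.
logits = np.array([[-2.0], [0.0], [3.0]])
labels = np.array([[0.0], [1.0], [1.0]])

# Each row has exactly one entry, so softmax normalizes it to 1.0.
probs = softmax(logits)

# log(1) = 0, so the softmax cross-entropy is 0 regardless of the labels.
softmax_loss = -(labels * np.log(probs)).sum(axis=-1)

# The sigmoid version depends on both the logit and the label.
sigmoid_loss = sigmoid_cross_entropy(logits, labels)

print(probs.ravel())
print(softmax_loss)
print(sigmoid_loss.ravel())
```

In TensorFlow 1.x, the corresponding built-in is tf.nn.sigmoid_cross_entropy_with_logits(labels=..., logits=...), which could be swapped in for the softmax call while keeping the [None, 1] placeholder.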