0 votes

So, I have been learning about neural networks, have tried coding them from scratch, and have been successful in some instances. So I thought of fitting a simple single-hidden-layer neural network to a sine wave. I know I could use Keras, but I want to learn the internal workings. My input x is generated using NumPy from values ranging from 0 to 10 with a 0.1 step, and y = sin(x). I initialised the weights and biases for the network and also coded the backpropagation. But after fitting the data, when I try to predict, I get a straight line. I changed the activations of the layers from sigmoid to tanh, along with their respective gradients, but the output still doesn't predict a sine wave. Going through forums, it keeps coming up that for such periodic functions an RNN is used.

import numpy as np
from matplotlib import pyplot as plt
from tqdm import tqdm


def init_weight_and_bias_NN(M1, M2):
    W = np.random.randn(M1, M2) / np.sqrt(M1 + M2)
    b = np.zeros(M2)
    return W.astype(np.float32), b.astype(np.float32)


def out(x, w, b):
    return np.add(np.dot(x, w), b)


def softmax(A):
    expA = np.exp(A)
    return expA / expA.sum(axis=1, keepdims=True)


def relu(x):
    return x * (x > 0)


def start(x, y):
    alpha = 0.01
    reg = 0.3
    epochs = 1
    hiddennodes = 3
    M, D = x.shape
    w1, b1 = init_weight_and_bias_NN(D, hiddennodes)
    w2, b2 = init_weight_and_bias_NN(hiddennodes, 1)
    with tqdm(total=epochs, desc="Training") as prog:
        for i in range(epochs):
            hidden = relu(out(x, w1, b1))
            output = softmax(out(hidden, w2, b2))
            w2 -= alpha * (hidden.T @ (output - y) + reg * w2)
            b2 -= alpha * np.sum(output - y)
            hiddenError = (output - y) @ w2.T
            w1 -= alpha * (x.T @ hiddenError + reg * w1)
            b1 -= alpha * np.sum(hiddenError)
            prog.update(1)
    return w1, b1, w2, b2


def predict(w1, b1, w2, b2, x):
    y = []
    for val in x:
        hidden = relu(out(val, w1, b1))
        y.append(softmax(out(hidden, w2, b2)).tolist().pop().pop())
    return np.array(y)


if __name__ == '__main__':
    x = np.arange(0, 10, 0.1)
    x1 = x.reshape((1, x.shape[0]))
    y = np.sin(x)
    w1, b1, w2, b2 = start(x1, y)
    x2 = np.arange(10, 20, 0.1)
    ynew = predict(w1, b1, w2, b2, x2)
    plt.plot(x, y, c='r')
    plt.plot(x, ynew, c='b')
    plt.title("Original vs machine produced")
    plt.legend(["Original", "Machine"])
    plt.show()

Final plot: this is the result I get. I know I shouldn't have used softmax in the final layer, but I have tried everything and this is my latest code. For the different activations I also tried many epochs and many hidden nodes, with different values for alpha (learning rate) and reg (lambda regulariser). What am I doing wrong? Should I try an RNN here? I saw somewhere that Keras was used with a Sequential model and leaky ReLU as the activation function. I have not tried that activation. Is that something I should try?

Have a look at phased LSTM - OverLordGoldDragon
It is a regression problem, not a classification problem. So one of the things to be fixed is the output activation, e.g. output = relu(out(hidden, w2, b2)) - razimbres

1 Answer

0 votes

When training, you are showing your neural net x values between 0 and 10. During prediction, you are using x values between 10 and 20. Those values are larger than anything the net has ever seen before, so it will not be able to give you a nice sine wave.

Neural nets cannot extrapolate well to values outside the range they were trained on. If you want to teach the neural net what a sine wave looks like between 0 and 10, and have it predict points it has not seen during training, you can take 1000 random points between 0 and 10 along with their sin(x) values, use 80% of those to train the net, and the remaining 20% to test the prediction. I think you will get good results then.
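This random-points idea can be sketched in the same NumPy-only style as the question, but with a linear output layer and an MSE gradient instead of softmax. The layer size, learning rate, iteration count, and initialisation below are illustrative choices of mine, not taken from the question:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 random points in [0, 10), scaled to [-1, 1] for the tanh layer
x_raw = rng.uniform(0, 10, size=(1000, 1))
x = x_raw / 5.0 - 1.0
y = np.sin(x_raw)

# 80/20 train/test split over the randomly placed points
idx = rng.permutation(1000)
train, test = idx[:800], idx[800:]

# 1 -> 50 tanh -> 1 linear; spread-out init so the hidden units
# bend at different places along the input range
h, alpha = 50, 0.03
w1 = rng.normal(0.0, 2.0, (1, h))
b1 = rng.uniform(-2.0, 2.0, h)
w2 = rng.normal(0.0, 1.0, (h, 1)) / np.sqrt(h)
b2 = np.zeros(1)

for _ in range(10000):
    a1 = np.tanh(x[train] @ w1 + b1)
    out = a1 @ w2 + b2                     # linear output for regression
    err = out - y[train]                   # gradient of 0.5*MSE wrt out
    g_w2 = a1.T @ err / len(train)
    g_b2 = err.mean(axis=0)
    d1 = (err @ w2.T) * (1.0 - a1 ** 2)    # tanh'(z) = 1 - tanh(z)^2
    g_w1 = x[train].T @ d1 / len(train)
    g_b1 = d1.mean(axis=0)
    w1 -= alpha * g_w1
    b1 -= alpha * g_b1
    w2 -= alpha * g_w2
    b2 -= alpha * g_b2

pred = np.tanh(x[test] @ w1 + b1) @ w2 + b2
mse = float(np.mean((pred - y[test]) ** 2))
print(mse)
```

Because the held-out 20% lies interleaved with the training points, this is interpolation rather than extrapolation, and the test error ends up far below the ~0.5 MSE you would get from predicting a constant.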

However, when you want to predict how the sine wave continues beyond 10, you should train your model on tuples of, for example, 4 points, with the 5th point as the label. When predicting, you give the net 4 sequential y values and let it predict the fifth. For the next prediction you use the 3 most recent points plus your previous prediction, and then again predict the next point. Don't use any x values for training.
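The windowing and the feed-back-the-prediction rollout can be sketched as follows. Since the answer doesn't pin down a model, a least-squares map stands in for the trained net (an assumption of mine, which happens to work here because a sampled sine obeys a linear recurrence), but the (4 points, 5th as label) dataset and the autoregressive loop are exactly as described:

```python
import numpy as np

y = np.sin(np.arange(0, 10, 0.1))          # the part of the wave we know

# build (window, label) pairs: 4 consecutive y values -> the 5th
win = 4
X = np.stack([y[i:i + win] for i in range(len(y) - win)])
t = y[win:]

# stand-in for a trained net: least-squares map from window to next value
coef, *_ = np.linalg.lstsq(X, t, rcond=None)

# autoregressive rollout past x = 10: feed each prediction back in
window = y[-win:].copy()
future = []
for _ in range(100):                       # 100 steps of 0.1 -> x up to 20
    nxt = window @ coef
    future.append(nxt)
    window = np.append(window[1:], nxt)
future = np.array(future)
```

Note that no x values enter the model at all: the predictor only ever sees windows of y values, which is why this scheme can keep generating the wave past the training range.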

Let me know if this is not completely clear. If not, I will make a drawing.