Andrew Ng's Machine Learning course on Coursera uses binary input and scalar output for each neural network, e.g.: [101011] -> [2]

Why are binary digits used for the training data rather than scalars? Is it related to the fact that the Theta values for each layer are initially randomised between 0 and 1?

Can you give more context on this? I've taken the course, and I know that his inputs are not universally restricted to binary. If you can describe an example (features, training method, etc.) I should be able to explain the rationale. - Prune
@Prune you're right, exercises 3 and 4 use vectors of grayscale inputs. The question then is perhaps when to use binary for input representation? - blue-sky
Can you describe where he is using the binary representation? I don't remember the class materials well enough to recall where and why he used binary. The most common case is when we're splitting an N-choice feature (list your choices) into N binary features (one for each possible choice). - Prune
@Prune this representation is explained in the lectures but I don't think in great detail. I'll think about this some more and post a new question. Thanks. - blue-sky

1 Answer

I can't remember the context, but I was once told the answer to this question.

Basically, the network is easier to train this way than with a single scalar. Each bit has its own semantic meaning, so each bit should have its own dedicated input neuron; a single scalar would force one neuron to carry all of those meanings at once. Data representation is always messy with neural networks, I know!
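To make the "one neuron per bit" idea concrete, here is a minimal sketch in Python. It is not taken from the course: the category list `CHOICES` and the helpers `one_hot` and `as_scalar` are hypothetical names made up for illustration. It shows the same 3-choice feature encoded as three binary inputs and as one scalar.

```python
# Hypothetical example, not from the course material:
# a 3-choice feature encoded as 3 binary inputs versus 1 scalar.
CHOICES = ["red", "green", "blue"]  # assumed category list, for illustration only


def one_hot(value, choices=CHOICES):
    """Return a list with 1 at the position of `value` and 0 elsewhere."""
    if value not in choices:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in choices]


def as_scalar(value, choices=CHOICES):
    """Scalar alternative: the category's index in `choices`."""
    return choices.index(value)


print(one_hot("green"))    # [0, 1, 0]
print(as_scalar("green"))  # 1
```

With the scalar encoding, "blue" (2) looks numerically like twice "green" (1), an ordering the categories probably don't have. With the binary encoding each choice feeds its own input neuron, so the network can weight each one independently.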

As for Theta, the answer is no. As you said, Theta is initially sampled from [0, 1], but its values can grow larger (or become negative) after some rounds of training. This is common behaviour.
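A small sketch of that last point, under assumptions of my own: a single logistic unit with two weights, a made-up two-example dataset, and plain gradient descent on the cross-entropy loss (the `train` function, the learning rate and the step count are all hypothetical, not the course's code). Both weights start in [0, 1), yet training pushes one above 1 and the other below 0.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(steps=200, lr=1.0, seed=0):
    """Train one logistic unit on a toy dataset; return the final weights."""
    rng = random.Random(seed)
    w = [rng.random(), rng.random()]  # initial weights sampled from [0, 1)
    # Toy data (an assumption for illustration): the first input signals
    # class 1, the second input signals class 0.
    data = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]
    for _ in range(steps):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            # Gradient of the cross-entropy loss w.r.t. w[i] is (p - y) * x[i].
            for i in range(len(w)):
                w[i] -= lr * (p - y) * x[i]
    return w


w = train()
print(w)  # first weight ends above 1, second weight ends below 0
```

The initial range only sets the starting point; nothing in the update rule keeps the weights inside it.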