2
votes

I am totally new in machine learning and I have a problem, which I want to solve using some AI or so. I would appreciate, if you will recommend me some concrete algorithms, neural network architectures or some related reading.

I am doing research about predicting users intent based on mouse movement. Currently I am in a phase of analysis without concrete dataset. The goal is predict target of users's intention (e.g. button, where user will click) by predicting mouse trajectory.

Let me introduce the problem

I have a lot of sequences. The length of each sequence may vary. As an input I will pass some smaller sequence, for which I want to predict next x values. So I want to know next possible sequence (or more possible sequences). The length of output sequence (x) could also be variable. Maybe sequence ends here? Prediction should be done in “real time”.

So what are those sequences?

Sequence represents directions of movement in 2-dimensional space after some preprocessing. Each value is integer of the interval <0,8>. Algorithm should be capable of increasing upper limit of the interval (16, 32, ...). Actually, the value is interpolated angle.

Example how sequences may look like but will be much bigger.

Three example sequences. Real sequences will be much bigger.

How do I imagine the solution?

Sequences will be clustered based on some similarities. When a dataset of sequences is made, some neural network will be trained to retrieve sequences, which contain the input sequence as a subsequence, as quick as possible.

Clustering

Matching subsequence should have some tolerance. Sequence [3, 3, 3, 3, 2] is similar to [3, 3, 4, 3, 2] = deviation tolerance*. Or sequence [4, 3, 3, 2] is also similar to [4, 3, 3, 3, 3, 2] = tolerance on values repeated continuously.

*I can tell the difference between two values as relative number - 0% the same direction => 100% opposite direction.

matching example

If input is [ 1,2,2,2 ] - red - the output should be [ 4,3,2,2 ].

If input is [ 3,3,3,2 ] - blue - the output should be [ 2 ].

Neural network

After some research I found the Hopfield network, which should give the most similar sequence. But then I realized that my sequences lengths are variable and Hopfield network architecture expects binary values.

I could somehow create the binary representation of sequence but I have no idea how to manage the lengths which may vary.

Let’s make it to another level

What if every value in sequence is not a scalar but velocity vector (d, s), where d is direction and s is speed?

Related questions

  1. Can neural networks be trained “online”? So no need to know previous train dataset, just give new dataset.
  2. Can neural networks be trained on server side (e.g. python) but used for prediction on client side (javascript)?
  3. Can neural networks have some kind of “short term memory” - prediction will be affected by 2-3 previous predictions?
  4. Most important - should I use neural networks or some another approach?

Thanks to everyone.

Feel free to correct my English.

1

1 Answers

2
votes

Can neural networks be trained “online”? So no need to know previous train dataset, just give new dataset.

Typically, you dont train an ANN continuously. You train it until your error is within tolerance, then use that model going forward to make predictions. You could store the information off and retrain the network every night if you want to periodically adjust the model, but odds are that's not going to offer much improvement, and runs the risk of prolonged bad data of skewing your model.

Can neural networks be trained on server side (e.g. python) but used for prediction on client side (javascript)?

It depends. Do you intend to use a trained model for the client prediction, or do you intend for the user actions to live-train the model which are immediately used for prediction? If the model is already trained, you can use it for prediction of user events. If the model is not trained, you run the risk of bad data corrupting the model. Live-training like that would also require a constant update of the model settings on the client side with the new model generated by the server.

Can neural networks have some kind of “short term memory” - prediction will be affected by 2-3 previous predictions?

Using previous predictions as input is not recommended. It introduces entropy to the system which can allow the model to drastically deviate from reliable predictions if it makes a few bad predictions in a row. You can try it, in which case you'll need n*k additional nodes on your input layer, where n is the number of previous predictions you want to use, and k is the number of output values in a prediction.

Most important - should I use neural networks or some another approach?

ANNs are very useful for predicting things. The biggest problem is defining the scope, and relevant reliable data necessary to make a prediction. I've made ANNs which predict market volatility in video games, with thousands of input values, but predicting mouse movements is going to be a challenge. Nothing is stopping a user from moving the mouse in a circle for hours, or leaving the cursor in one spot. Each time you sample such an action, its going to make your model more likely to predict that type of behavior. Good training data, and a controlled environment is essential. Video games would make for a bad environment for predicting mouse movement, as user behavior is dependent on more than previous mouse movements. Websites would be a favorable environment though, as during a session, a user navigates in predictable ways through a finite space.