4
votes

I want to apply machine learning to a classification problem in a parallel environment. Several independent nodes, each with multiple on/off sensors, can communicate their sensor data with the goal of classifying an event as defined by a heuristic, training data or both.

Each peer will measure the same event from its own perspective and attempt to classify the result, taking into account that any neighbouring node (or its sensors, or just the connection to that node) could be faulty. Nodes should function as equal peers and determine the most likely classification by communicating their results.

Ultimately, each node should make a decision based on its own sensor data and its peers' data. If it matters, false positives are acceptable (albeit undesirable) for certain classifications, but false negatives would be totally unacceptable.

Given that each final classification will receive good or bad feedback, what would be an appropriate machine learning algorithm to approach this problem with if the nodes could communicate with each other to determine the most likely classification?


2 Answers

1
votes

If the sensor data in each individual node is generally sufficient to make a reasonable decision, the nodes could simply exchange their results and take a majority vote. If a majority vote is not appropriate, you could train an additional classifier that uses the outputs of the nodes as its feature vector.
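A minimal sketch of the majority-vote idea, assuming each node holds its own label plus whatever labels its peers managed to report. The function name `majority_vote`, the use of `None` for an unreachable peer, and the tie-break rule are all assumptions for illustration, not part of the question.

```python
from collections import Counter


def majority_vote(own_label, peer_labels):
    """Return the most common label among this node and its reachable peers.

    peer_labels maps a peer id to its reported label, or to None when the
    peer did not answer (for example because the link to it is faulty).
    """
    votes = [own_label]
    votes.extend(label for label in peer_labels.values() if label is not None)
    counts = Counter(votes)
    best = max(counts.values())
    winners = [label for label, n in counts.items() if n == best]
    # Assumed tie-break: trust the node's own reading if it is among the winners.
    return own_label if own_label in winners else winners[0]
```

For example, `majority_vote("event", {"a": "none", "b": "none", "c": None})` returns `"none"`, because the silent peer `c` is simply left out of the count. A plain majority treats both error types equally, so on its own it does nothing to avoid false negatives.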

Since you want online supervised learning with feedback, you could use a neural network trained with backpropagation, or an incremental support vector machine that adds its misclassified examples to the training set. Look into classifier biasing to deal with the false-positive/false-negative trade-off.
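To make the feedback loop and the biasing concrete, here is a rough sketch that swaps in a single logistic unit (the simplest possible "network") rather than a full multilayer network or an incremental SVM. The class name `OnlineLogisticUnit` and the values for `positive_weight` and `threshold` are assumptions chosen for illustration. Missed events are penalised more heavily during training, and the decision threshold sits below 0.5, so the unit leans towards reporting an event.

```python
import math


class OnlineLogisticUnit:
    """One logistic unit updated by a gradient step after each feedback."""

    def __init__(self, n_inputs, learning_rate=0.1,
                 positive_weight=5.0, threshold=0.3):
        self.weights = [0.0] * n_inputs
        self.bias = 0.0
        self.learning_rate = learning_rate
        # Assumed bias settings: errors on true events cost 5x as much,
        # and anything above 30% probability is reported as an event.
        self.positive_weight = positive_weight
        self.threshold = threshold

    def probability(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # keep exp() well inside float range
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, x):
        return self.probability(x) >= self.threshold

    def feedback(self, x, event_occurred):
        """Update the weights once the true outcome for x is known."""
        target = 1.0 if event_occurred else 0.0
        cost = self.positive_weight if event_occurred else 1.0
        step = self.learning_rate * cost * (target - self.probability(x))
        self.weights = [w + step * xi for w, xi in zip(self.weights, x)]
        self.bias += step
```

Each node would call `predict` on its current feature vector and later call `feedback` with the same vector once the good/bad verdict arrives. Raising `positive_weight` or lowering `threshold` trades more false positives for fewer false negatives.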

1
votes

In this instance, a neural network could be very appropriate. The inputs to the network would be the readings of each sensor on board the node, along with those of its neighbors. The weights would then be adjusted based on your feedback.
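One way the input vector might be assembled, as a sketch only: the function `build_input` and the extra per-neighbor "reachable" flag are assumptions added here so the network can tell a silent neighbor apart from one whose sensors are all off.

```python
def build_input(own_sensors, neighbor_readings, neighbor_ids, sensors_per_node):
    """Concatenate own on/off readings with those of each known neighbor.

    neighbor_readings maps a neighbor id to its list of readings; a missing
    id (or a None value) means nothing was received from that neighbor.
    """
    features = [float(s) for s in own_sensors]
    for neighbor in neighbor_ids:  # fixed order keeps input positions stable
        readings = neighbor_readings.get(neighbor)
        if readings is None:
            features.extend([0.0] * sensors_per_node)
            features.append(0.0)  # flag: neighbor unreachable
        else:
            features.extend(float(s) for s in readings)
            features.append(1.0)  # flag: neighbor reported
    return features
```

With two sensors per node, `build_input([1, 0], {"a": [0, 1]}, ["a", "b"], 2)` gives `[1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0]`: the node's own two readings, then neighbor `a` with its flag set, then zero padding for the unreachable neighbor `b`.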

Another option, which is simpler but can also achieve good results, is a gossip algorithm. You would have to look into how to incorporate the feedback, though.
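As a rough illustration of one gossip variant (randomized pairwise averaging), the sketch below lets two linked nodes repeatedly replace their local scores with the pair's mean, so every node drifts towards the network-wide average without any coordinator. The function `gossip_average`, the score encoding (1.0 for "event seen", 0.0 otherwise), and the simulation in a single process are assumptions; it does not cover feedback or faulty nodes.

```python
import random


def gossip_average(scores, edges, rounds=200, seed=0):
    """Simulate pairwise-averaging gossip over the given links.

    scores maps a node id to its local score; edges is a list of
    (node, node) pairs that are able to talk to each other.
    """
    rng = random.Random(seed)
    values = dict(scores)
    for _ in range(rounds):
        a, b = rng.choice(edges)  # one random link wakes up per round
        mean = (values[a] + values[b]) / 2.0
        values[a] = values[b] = mean  # the sum over all nodes is preserved
    return values
```

On a ring of four nodes where three of them start at 1.0 and one at 0.0, every value ends up close to 0.75. Each node could then compare its value against a deliberately low threshold, so that a minority of nodes seeing the event is already enough to raise it.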