I have a very large corpus, with each element consisting of a large amount of high-dimensional data, and elements are constantly being added. Potentially, only a portion of the corpus needs to be considered for each interaction. Elements are labeled, potentially with multiple labels, each carrying a weight indicating the strength of that label. As far as I understand, the data is not sparse.
Each input is a vector of roughly 10 to 1000 parameters, each in the range -1 to 1. This may be somewhat flexible depending on which machine learning method is most appropriate.
I am targeting high-end smartphones. Ideally the processing could be done on the device itself, but I'm open to the possibility of offloading it to a modest server.
What would be an appropriate machine learning approach for this kind of situation?
I've been reading about random forests, restricted Boltzmann machines, deep Boltzmann machines, and so on, but I could really use the advice of an experienced hand to point me towards a few approaches worth researching given these conditions.
If my description seems wonky, please let me know, as I am still getting to grips with the ideas and may be fundamentally misunderstanding some aspect.