
I am trying to implement a recommender system using the Mahout framework. As I do not have a Linux machine, I cannot use Hadoop to deal with large matrices. (The tutorial for installing Hadoop on Windows does not work for me.)

My users have three types of features, and each type contains 5 to 9 features. I wonder whether I can build all of these features into one FileDataModel, or whether I can process each group of features separately and combine the results.

If the latter works, I would need the user IDs from the previous processing result to create a new FileDataModel for the next group of features. Is that doable?
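To make the single-file option concrete, here is a minimal sketch in plain Java (no Mahout dependency) of how the features could be written out as lines in the `userID,itemID,value` layout that FileDataModel reads. Treating each feature as a pseudo-item is only an assumption for illustration, and the class name `FeatureCsvSketch`, the helper names, and the "group * 100 + feature" ID scheme are all hypothetical, not anything Mahout prescribes.

```java
import java.util.ArrayList;
import java.util.List;

public class FeatureCsvSketch {

    // Hypothetical ID scheme: feature f of group g becomes item ID g * 100 + f,
    // so the groups cannot collide as long as each has fewer than 100 features.
    static long itemId(int group, int feature) {
        return group * 100L + feature;
    }

    // Turns one user's values for one feature group into "userID,itemID,value" lines.
    static List<String> toLines(long userId, int group, double[] values) {
        List<String> lines = new ArrayList<>();
        for (int f = 0; f < values.length; f++) {
            lines.add(userId + "," + itemId(group, f) + "," + values[f]);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        // User 1: two features in group 0, one feature in group 1 (made-up values).
        lines.addAll(toLines(1L, 0, new double[] {0.5, 1.0}));
        lines.addAll(toLines(1L, 1, new double[] {3.0}));
        lines.forEach(System.out::println);
    }
}
```

Because the user ID is carried on every line, the same IDs could be reused when writing a separate file per feature group, which is what the second option would need.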

I still have questions I have not found answers to, and I hope someone can help: how many features can Mahout actually handle in a Windows environment without Hadoop? And how often does a system need to be re-optimized after an algorithm has been put into production? Does that happen automatically? Thanks.


1 Answer


Well, Mahout can run without Hadoop if it doesn't find the HADOOP_HOME environment variable set. On the other hand, I am not sure whether Mahout can run on Windows right away, as I didn't find any .bat file or configuration for Windows. You need to install Cygwin for both Hadoop and Mahout if you want to run them on Windows.

This is very similar to how Pig works when HADOOP_HOME is not set.