After loading a saved MatrixFactorizationModel I get these warnings:

MatrixFactorizationModelWrapper: Product factor does not have a partitioner. Prediction on individual records could be slow.
MatrixFactorizationModelWrapper: Product factor is not cached. Prediction could be slow.
Indeed, the computation is slow and will not scale well.
How do I set a partitioner and cache the product factor?
Here is code that demonstrates the problem:
from pyspark import SparkContext
import sys
sc = SparkContext("spark://hadoop-m:7077", "recommend")
from pyspark.mllib.recommendation import ALS, MatrixFactorizationModel, Rating
model = MatrixFactorizationModel.load(sc, "model")
model.productFeatures.cache()
I get:
Traceback (most recent call last):
  File "/home/me/recommend.py", line 7, in <module>
    model.productFeatures.cache()
AttributeError: 'function' object has no attribute 'cache'
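For reference, the AttributeError itself appears to come from productFeatures being a method on the PySpark MatrixFactorizationModel rather than an attribute, so it has to be called before .cache() can be applied to the RDD it returns. Below is a minimal sketch of that. The helper name cache_factors is hypothetical and not part of PySpark. This only addresses the traceback: the warnings are logged by the JVM-side MatrixFactorizationModelWrapper, so caching the Python-side RDDs may well not silence them or speed up prediction.

```python
def cache_factors(model):
    """Mark both factor RDDs of a loaded model as cached and return them.

    `model` is assumed to expose userFeatures() and productFeatures() as
    methods returning RDDs, as pyspark.mllib's MatrixFactorizationModel does.
    `cache_factors` itself is a hypothetical helper, not a PySpark API.
    """
    # Note the parentheses: productFeatures is a method, so it must be
    # called to obtain the RDD before .cache() can be invoked on it.
    user_rdd = model.userFeatures().cache()
    product_rdd = model.productFeatures().cache()
    return user_rdd, product_rdd
```

With the model from the snippet above, this would be used as user_rdd, product_rdd = cache_factors(model). Whether the cached RDDs are the ones actually used during prediction is not confirmed here.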