0
votes

When using ml-engine for online prediction, we send a request and get the prediction results. That works, but the request is usually different from the input the model expects, for example:

  • A categorical variable may appear in the request, but the model expects the integer mapped to that category
  • A single feature in the request may need to be expanded into multiple features, for example by splitting a text field into two or more features
  • Some features in the request may need to be excluded, such as a constant feature that is useless to the model
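The three transformations above can be sketched as a single preprocessing function. This is a minimal sketch, not the asker's actual schema: the feature names (`color`, `title`, `constant_id`), the category mapping, and the derived features are all hypothetical.

```python
# Hypothetical mapping from category strings to the integers the model
# was trained on.
CATEGORY_MAP = {"red": 0, "green": 1, "blue": 2}

def preprocess(request):
    """Turn a raw request dict into the flat feature list the model expects."""
    features = []
    # 1. Categorical variable -> integer the model was trained on.
    features.append(CATEGORY_MAP[request["color"]])
    # 2. One request field -> multiple derived features
    #    (here: word count and character count of a text field).
    features.append(len(request["title"].split()))
    features.append(len(request["title"]))
    # 3. Useless/constant fields (e.g. request["constant_id"]) are
    #    excluded simply by never copying them into the feature list.
    return features

print(preprocess({"color": "green", "title": "hello world", "constant_id": 7}))
# → [1, 2, 11]
```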

How do you handle this process? My solution is to receive the request with an App Engine app, send it to Pub/Sub, process it in Dataflow, save the result to GCS, and trigger a Cloud Function that sends the processed request to the ml-engine endpoint and returns the prediction. This may be over-engineering and I want to avoid that. Any advice regarding XGBoost models would be appreciated.


1 Answer

0
votes

We are testing out a feature that allows a user to provide some Python code to be run server-side. This will allow you to do the types of transformations you are trying to do, either as a scikit-learn pipeline or as a Python function. If you'd like to test it out, please contact [email protected].
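The "Python function" option described above would have roughly this shape: a callable that receives the raw JSON instances from the request and returns model-ready rows. This is only a sketch of the idea under assumed inputs; the interface name, the `color`/`text` fields, and the category mapping are not from the source.

```python
# Hypothetical mapping, as would be fixed at training time.
CATEGORY_MAP = {"red": 0, "green": 1, "blue": 2}

def preprocess(instances):
    """Transform a batch of raw request instances into numeric feature rows,
    which would then be fed to the deployed XGBoost model."""
    rows = []
    for inst in instances:
        rows.append([
            CATEGORY_MAP[inst["color"]],   # categorical -> trained integer
            len(inst["text"].split()),     # derived feature: word count
            len(inst["text"]),             # derived feature: char count
        ])
    return rows

print(preprocess([{"color": "blue", "text": "a b"}]))
# → [[2, 2, 3]]
```

The advantage over the App Engine / Pub/Sub / Dataflow chain in the question is that the transformation travels with the model, so the client can send raw requests directly to the prediction endpoint.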