I have done a few tests with apache-beam using both auto-scale workers and 1 worker, and each time I see a startup time of around 2 minutes. Is it possible to reduce that time, and if so, what are the suggested best practices for reducing the startup time?
1 Answers
3
votes
IMHO: Two minutes is very fast for a product like Cloud Dataflow. Remember, Google is launching a powerful Big Data service for you that autoscales.
Compare that time to the other cloud vendors. I have seen some clusters (Hadoop) take 15 minutes to come live. In any event, you do not control the initialization process for Dataflow so there is nothing for you to improve.