We are evaluating Java libraries and frameworks for faster batch processing that can also be deployed to the cloud. We came across Spring Batch, Spring XD, and Spring Cloud Data Flow. From a quick review of the documentation at http://cloud.spring.io/spring-cloud-dataflow/, we could not tell whether Spring Cloud Data Flow also provides all the batch processing features that Spring Batch offers. For example, here is what the Spring Batch documentation (http://projects.spring.io/spring-batch/) says: "Spring Batch provides reusable functions that are essential in processing large volumes of records, including logging/tracing, transaction management, job processing statistics, job restart, skip, and resource management."

If anyone knows about the batch processing capabilities of Spring Cloud Data Flow, could you please post here? Many thanks!

1 Answer

Please review the Spring Cloud Task project. It offers the framework and programming model for developing "short-lived" microservice applications.

At a high level, a Task may be any process that does not run indefinitely, including Spring Batch jobs. This gives you the flexibility to develop Spring Batch jobs using all of Spring Batch's core capabilities and to run them as standalone Spring Boot applications. There are some samples here.
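
To make this concrete, here is a minimal sketch of a Spring Batch job packaged as a Spring Cloud Task. It is an illustration only, not taken from the official samples. It assumes Spring Boot with `spring-boot-starter-batch` and `spring-cloud-starter-task` on the classpath, and the Spring Batch 4.x style of configuration (`JobBuilderFactory` and `StepBuilderFactory`, which newer Spring Batch versions replace with other builders). The package, class, job, and step names are hypothetical.

```java
package com.example.batchtask; // hypothetical package name

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.task.configuration.EnableTask;
import org.springframework.context.annotation.Bean;

// Assumes Spring Batch 4.x and Spring Cloud Task on the classpath.
@SpringBootApplication
@EnableTask            // records the start, end, and exit code of this short-lived app
@EnableBatchProcessing // sets up the Spring Batch infrastructure (job repository, launcher)
public class BatchTaskApplication {

    // A single tasklet step; a real job would typically use a
    // chunk-oriented step with a reader, processor, and writer.
    @Bean
    public Step helloStep(StepBuilderFactory steps) {
        return steps.get("helloStep")
                .tasklet((contribution, chunkContext) -> {
                    System.out.println("Batch step executed");
                    return RepeatStatus.FINISHED;
                })
                .build();
    }

    // The job is an ordinary Spring Batch job, so restart, skip, and the
    // other Spring Batch features remain available to it.
    @Bean
    public Job helloJob(JobBuilderFactory jobs, Step helloStep) {
        return jobs.get("helloJob")
                .start(helloStep)
                .build();
    }

    public static void main(String[] args) {
        // Spring Boot runs the job at startup; the application exits when it finishes.
        SpringApplication.run(BatchTaskApplication.class, args);
    }
}
```

Because the batch logic stays in plain Spring Batch, the same application can be run on its own with `java -jar` or registered as a task application for orchestration.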

Spring Cloud Data Flow builds upon Spring Cloud Task to provide orchestration capability for batch data pipelines. A wide range of options, including the Shell, DSL, Admin UI, and Flo UI, is available to orchestrate batch workloads. You can use these utility Task applications in Spring Cloud Data Flow, and the list is growing.