Difference between DSMS, Storm and Flink

Question

DSMS stands for Data Stream Management System. These systems allow users to submit queries that are executed continuously until the user removes them.
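To make "continuously executed until removed" concrete, here is a minimal sketch in plain Python. It is only an illustration: `TinyDSMS`, `submit`, `remove`, and `ingest` are hypothetical names, not the API of any real DSMS, and a real system would typically accept SQL-like queries rather than Python predicates.

```python
from typing import Callable, Dict, List


class TinyDSMS:
    """Hypothetical toy engine: keeps standing queries and runs each on every new tuple."""

    def __init__(self) -> None:
        self._queries: Dict[str, Callable[[dict], bool]] = {}
        self.results: Dict[str, List[dict]] = {}

    def submit(self, name: str, predicate: Callable[[dict], bool]) -> None:
        # Register a standing query; it stays active until remove() is called.
        self._queries[name] = predicate
        self.results[name] = []

    def remove(self, name: str) -> None:
        # Deregister the query; results collected so far are kept.
        self._queries.pop(name, None)

    def ingest(self, record: dict) -> None:
        # Every arriving tuple is evaluated against all currently active queries.
        for name, predicate in self._queries.items():
            if predicate(record):
                self.results[name].append(record)


engine = TinyDSMS()
engine.submit("hot", lambda r: r["temp"] > 30)
engine.ingest({"sensor": "a", "temp": 35})
engine.ingest({"sensor": "b", "temp": 20})
engine.remove("hot")
engine.ingest({"sensor": "c", "temp": 40})  # not matched: the query was removed
print(engine.results["hot"])
```

The point is the inversion compared with a classic database: the query is stored and the data flows past it, rather than the other way around.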

Can systems such as Storm and Flink be seen as DSMS or are they something more generic?

Thanks

Matthias J. Sax · Accepted Answer · 2016-11-20T23:44:42

The two types of systems are largely orthogonal to each other because they target different use cases. Thus, neither subsumes the other, and neither is a generalization of the other.

DSMS are usually:

end-to-end solutions that provide storage and computation in one system
require external data to be imported into the system first
often SQL-oriented, which makes them easy to use but often less expressive
usually can handle only structured data (schema-based tuple format)
often do not scale

Stream Processing Frameworks (Flink, Storm, Spark):

only provide a computation layer and consume data from other storage systems
most offer a language-embedded DSL (some also offer SQL to some extent)
can handle any type of data (flat tuples, JSON, XML, flat files, text)
built to scale to large clusters (many hundreds of nodes)
good for data crunching, machine learning
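To illustrate what a language-embedded DSL over arbitrary data looks like, here is a minimal sketch in plain Python. The `Stream` class and the `parse` helper are hypothetical and only mimic the chained map/filter style; this is not the actual API of Flink, Storm, or Spark, and it runs in a single process with no distribution.

```python
import json
from typing import Callable, Iterable, Iterator


class Stream:
    """Hypothetical fluent wrapper over an iterable; evaluation is lazy."""

    def __init__(self, source: Iterable) -> None:
        self._source = source

    def map(self, fn: Callable) -> "Stream":
        # Apply fn to each element as it flows through.
        return Stream(fn(x) for x in self._source)

    def filter(self, pred: Callable) -> "Stream":
        # Keep only the elements for which pred returns True.
        return Stream(x for x in self._source if pred(x))

    def __iter__(self) -> Iterator:
        return iter(self._source)


def parse(raw: str) -> dict:
    # Accept either a JSON object or a plain "user, amount" text line.
    if raw.lstrip().startswith("{"):
        return json.loads(raw)
    user, amount = raw.split(",")
    return {"user": user.strip(), "amount": float(amount)}


raw_events = ['{"user": "ann", "amount": 12.5}', "bob, 3.0", '{"user": "cy", "amount": 40}']

large = list(
    Stream(raw_events)
    .map(parse)
    .filter(lambda e: e["amount"] > 10)
    .map(lambda e: e["user"])
)
print(large)
```

Because the operators are ordinary functions of the host language, the input does not need a fixed schema: the user code decides how to interpret each record.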

Streaming Platform (Kafka):

provides a storage layer and computation
can handle any type of data, as long as it is imported into the system (flat tuples, JSON, XML, flat files, text)
scalable and elastic
no SQL, only a Java DSL (Confluent Platform, which is based on Kafka, offers KSQL as a developer preview)
very good for building microservices
