The two types of systems are largely orthogonal to each other, since they target different use cases. Neither subsumes the other, and neither is a generalization of the other.
DSMS usually:
- are end-to-end solutions that provide storage and computation in one system
- require external data to be imported into the system first
- are often SQL-oriented, which makes them easy to use but often less expressive
- can handle only structured data (schema-based tuple format)
- often do not scale
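
To make the "import first, then query with SQL" workflow concrete, here is a minimal sketch. It uses SQLite from the Python standard library purely as a stand-in: SQLite is not a DSMS and does not run continuous queries. The table name, columns, and sample rows are made up for illustration.

```python
import sqlite3

# Stand-in only: SQLite is NOT a stream engine. It just illustrates the
# "import into the system, then query a fixed schema with SQL" workflow.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, ts INTEGER, temp REAL)")

# Hypothetical external data; it has to be loaded before it can be queried.
external_rows = [
    ("a", 1, 20.0),
    ("a", 2, 22.0),
    ("b", 1, 30.0),
    ("b", 2, 34.0),
]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", external_rows)

# Declarative query: short to write, but limited to what SQL can express.
avg_by_sensor = conn.execute(
    "SELECT sensor, AVG(temp) FROM readings GROUP BY sensor ORDER BY sensor"
).fetchall()
print(avg_by_sensor)
```

Every record has to fit the declared schema, which is the trade-off behind the "structured data only" point above.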
Stream Processing Frameworks (Flink, Storm, Spark):
- provide only a computation layer and consume data from other storage systems
- mostly offer a language-embedded DSL (some also offer SQL to some extent)
- can handle any type of data (flat tuples, JSON, XML, flat files, text)
- are built to scale to large clusters (many hundreds of nodes)
- are good for data crunching and machine learning
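
As a rough illustration of what a language-embedded pipeline over mixed-format data looks like, here is a sketch in plain Python. It does not use the Flink, Storm, or Spark APIs. The functions `parse` and `running_max` and the sample records are hypothetical, and a Python list stands in for an external source.

```python
import json
import xml.etree.ElementTree as ET
from collections import defaultdict


def parse(record):
    """Turn a JSON, XML, or plain-text record into a (sensor, temp) pair."""
    if record.startswith("{"):
        obj = json.loads(record)
        return obj["sensor"], float(obj["temp"])
    if record.startswith("<"):
        node = ET.fromstring(record)
        return node.get("sensor"), float(node.get("temp"))
    sensor, temp = record.split(",")
    return sensor.strip(), float(temp)


def running_max(pairs):
    """Emit the per-sensor maximum seen so far after each incoming record."""
    best = defaultdict(lambda: float("-inf"))
    for sensor, temp in pairs:
        best[sensor] = max(best[sensor], temp)
        yield sensor, best[sensor]


# Hypothetical input; a real framework would read this from external storage.
source = [
    '{"sensor": "a", "temp": 20.0}',
    '<reading sensor="b" temp="30.5"/>',
    "a, 22.0",
    '{"sensor": "b", "temp": 29.0}',
]

# The pipeline is ordinary host-language code: arbitrary parsing and state.
results = list(running_max(map(parse, source)))
print(results)
```

Because the operators are ordinary functions, any parsing or stateful logic the host language allows can go into the pipeline, which is where the extra expressiveness over SQL comes from.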
Streaming Platform (Kafka):
- provides a storage layer and computation
- can handle any type of data, as long as it is imported into the system (flat tuples, JSON, XML, flat files, text)
- is scalable and elastic
- offers no SQL, only a Java DSL (Confluent Platform, which is based on Kafka, offers KSQL as a developer preview)
- is very good for building microservices
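
The combination of a storage layer and computation can be pictured with a toy model: an append-only log that keeps records, plus a consumer that processes them from a remembered offset. This is not the Kafka API. `Log` and `count_words` are hypothetical names, and the whole sketch runs in memory in a single process.

```python
class Log:
    """Append-only in-memory log; a toy stand-in for one topic partition."""

    def __init__(self):
        self._records = []

    def append(self, value):
        """Store a record and return its offset."""
        self._records.append(value)
        return len(self._records) - 1

    def read(self, offset):
        """Return all records from the given offset onwards."""
        return self._records[offset:]


def count_words(log, offset=0, counts=None):
    """Process records from `offset`; return updated counts and next offset."""
    counts = dict(counts or {})
    records = log.read(offset)
    for line in records:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts, offset + len(records)


log = Log()
log.append("hello kafka")
log.append("hello streams")

# First pass processes everything stored so far.
counts, next_offset = count_words(log)

# Later records are picked up by resuming from the remembered offset.
log.append("kafka again")
counts, next_offset = count_words(log, next_offset, counts)
print(counts, next_offset)
```

Since the records stay in the log, a consumer can resume from its last offset or reprocess from the beginning, which is one reason this model suits loosely coupled services.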