MemSQL Spark-Kafka Transform Failure

Question

We have a Spark cluster running under MemSQL with several pipelines. The ETL setup is as follows.

Extract: Spark reads messages from the Kafka cluster (using MemSQL Kafka-Zookeeper)
Transform: We have a custom jar deployed for this step
Load: Data from the Transform stage is loaded into a columnstore

I have the following questions:

What happens to a message polled from Kafka if the job fails in the Transform stage? Does MemSQL take care of loading that message again, or is the data lost?

If the data is lost, how can I solve this problem? Are there any configuration changes that need to be made?

eklhad · Accepted Answer · 2016-02-02T17:45:04

As it stands, at-least-once semantics are not available in MemSQL Ops. This is on the roadmap and will be present in a future release of Ops.
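
For readers unfamiliar with the term, here is a rough sketch of what at-least-once processing means in general. It is not MemSQL Ops or Kafka code: the list of messages, the offset counter, and the names run_pipeline and flaky_transform are all hypothetical in-memory stand-ins. The idea is that the read position is advanced only after the transform and load steps both succeed, so a message that fails in the transform is read again on the next run.

```python
# Hypothetical in-memory stand-ins; none of this is a MemSQL Ops or Kafka API.

def run_pipeline(messages, committed_offset, transform, load):
    """Process messages from committed_offset and return the new committed offset.

    The offset advances only after transform and load both succeed,
    so a failed message is picked up again on the next run.
    """
    offset = committed_offset
    while offset < len(messages):
        try:
            row = transform(messages[offset])
            load(row)
        except Exception:
            break  # stop without committing; this message is retried next run
        offset += 1  # "commit" only after a successful load
    return offset


messages = ["a", "b", "c"]
table = []           # stands in for the target table
failures = {"b": 1}  # make the transform fail once on message "b"


def flaky_transform(msg):
    if failures.get(msg, 0) > 0:
        failures[msg] -= 1
        raise RuntimeError("transform failed")
    return msg.upper()


offset = run_pipeline(messages, 0, flaky_transform, table.append)       # stops at "b"
offset = run_pipeline(messages, offset, flaky_transform, table.append)  # retries "b"
```

In this scheme a message can be loaded twice if the load succeeds but the process dies before the offset is recorded, which is why it is called at-least-once rather than exactly-once.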
