
I am working on a POC to implement real-time analytics, where we have the following components.

  1. Confluent Kafka: receives events from third-party services in Avro format (an event can contain up to 40 fields). We are also using the Kafka Schema Registry to handle the different event formats (see the consumer sketch below).
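
For reference, a minimal consumer sketch in Python, assuming the confluent-kafka client library, a local broker and Schema Registry, and a topic named events (all placeholders for your environment):

    from confluent_kafka import DeserializingConsumer
    from confluent_kafka.schema_registry import SchemaRegistryClient
    from confluent_kafka.schema_registry.avro import AvroDeserializer

    # Assumed endpoints; adjust to your environment.
    schema_registry = SchemaRegistryClient({"url": "http://localhost:8081"})
    avro_deserializer = AvroDeserializer(schema_registry)

    consumer = DeserializingConsumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "analytics-poc",
        "auto.offset.reset": "earliest",
        "value.deserializer": avro_deserializer,
    })
    consumer.subscribe(["events"])

    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        # The deserializer resolves the writer schema from the registry,
        # so msg.value() is a dict with up to 40 fields.
        event = msg.value()
        print(event)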

I am trying to use MemSQL for analytics, which requires pushing the events into a MemSQL table in a specific format.
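
Since MemSQL speaks the MySQL wire protocol, a plain client library can write the transformed rows directly. A minimal sketch, assuming pymysql, a database named analytics, and a table events with columns id, source, and ts (all hypothetical):

    import pymysql  # MemSQL is MySQL wire-protocol compatible

    # Assumed connection details; adjust to your environment.
    conn = pymysql.connect(host="127.0.0.1", port=3306, user="root",
                           password="", database="analytics")

    def write_event(event):
        # Flatten the deserialized Avro record into the assumed
        # column layout of the analytics table.
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO events (id, source, ts) VALUES (%s, %s, %s)",
                (event["id"], event["source"], event["ts"]),
            )
        conn.commit()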

I have gone through the MemSQL website, blogs, etc., but most of them suggest using the Spark MemSQL connector, in which you can transform the data coming from Confluent Kafka.

I have a few questions:

  1. Can I use a simple Java/Go application in place of Spark?
  2. Is there any utility provided by Confluent Kafka or MemSQL for this?

Thanks.


1 Answer


I recommend using MemSQL Pipelines: https://docs.memsql.com/memsql-pipelines/v6.0/kafka-pipeline-quickstart/. In current versions of MemSQL, you'll need to set up a transform, which is a small Golang or Python script that reads in the Avro records and outputs TSV. Instructions on how to do that are here: https://docs.memsql.com/memsql-pipelines/v6.0/transforms/, but the tl;dr is that you need a script which does:

    while True:
        record_size = read_an_8_byte_int_from_stdin()
        avro_record = stdin.read(record_size)
        stdout.write(AvroToTSV(avro_record))
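
For illustration, a runnable version of that loop in Python. This sketch assumes the fastavro library, a single known writer schema (in practice you would resolve it from the Schema Registry using the 4-byte schema ID embedded in each message), Confluent's wire format (1 magic byte plus a 4-byte schema ID before the Avro body), and an 8-byte little-endian length prefix (check the transform docs linked above for the exact framing):

    import io
    import struct
    import sys

    import fastavro  # assumption: available where the transform runs

    # Assumed writer schema; a production transform would fetch it from
    # the Schema Registry using the schema ID in each message.
    SCHEMA = fastavro.parse_schema({
        "type": "record",
        "name": "Event",
        "fields": [
            {"name": "id", "type": "long"},
            {"name": "source", "type": "string"},
            {"name": "ts", "type": "long"},
        ],
    })
    FIELD_ORDER = ["id", "source", "ts"]

    stdin = sys.stdin.buffer
    stdout = sys.stdout.buffer

    while True:
        header = stdin.read(8)
        if len(header) < 8:
            break  # end of the batch
        (size,) = struct.unpack("<Q", header)  # assumed little-endian prefix
        message = stdin.read(size)
        # Confluent wire format: skip magic byte + 4-byte schema ID.
        payload = io.BytesIO(message[5:])
        record = fastavro.schemaless_reader(payload, SCHEMA)
        row = "\t".join(str(record[f]) for f in FIELD_ORDER) + "\n"
        stdout.write(row.encode("utf-8"))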

Stay tuned for native Avro support in MemSQL.