2
votes

I am developing a real-time streaming application that needs to send data to AWS Kinesis Streams and from there to AWS Redshift. Based on my reading of the documentation, these are the options for pushing data from Kinesis Streams to Redshift:

  1. Kinesis Streams -> Lambda Function -> Redshift
  2. Kinesis Streams -> Lambda Function -> Kinesis Firehose -> Redshift
  3. Kinesis Streams -> Kinesis Connector Library -> Redshift (https://github.com/awslabs/amazon-kinesis-connectors)

I found the Kinesis Connector Library to be the best option for moving data from Streams to Redshift. However, I can't work out where this library is deployed and how it runs. Does it need to run as a Lambda function or as a Java application on an EC2 instance? I couldn't find that information in the README. If anyone has worked with the connectors successfully, I would very much appreciate the insight.


1 Answer

6
votes

If you're using the Kinesis Connector Library, you deploy it on an EC2 instance. In my opinion, though, using a Lambda function without the Connector Library is a lot easier and better. Lambda handles batching, scaling up your function invocations, and retries. Dead Letter Queues are probably coming soon for Lambda + Kinesis too.

Basically it's a lot easier to scale and deal with failures in Lambda.
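
To make the Lambda route a bit more concrete, here is a rough sketch of option 2 from the question (Kinesis Streams -> Lambda -> Kinesis Firehose -> Redshift). Treat it as an illustration under assumptions, not a tested production setup: the delivery stream name `my-redshift-delivery-stream` is made up, the helper names are mine, and it assumes the Firehose delivery stream has already been configured separately to load into Redshift. The `firehose` parameter on the handler is only there so the logic can be exercised without AWS credentials.

```python
import base64

# Hypothetical name; the delivery stream itself (with its Redshift
# destination) is assumed to have been created separately.
DELIVERY_STREAM = "my-redshift-delivery-stream"

# PutRecordBatch accepts at most 500 records per call.
FIREHOSE_MAX_BATCH = 500


def decode_records(event):
    """Return the raw payload bytes of each record in a Kinesis-triggered event."""
    # Lambda delivers Kinesis record payloads base64-encoded under kinesis.data.
    return [base64.b64decode(r["kinesis"]["data"]) for r in event.get("Records", [])]


def to_firehose_batches(payloads, batch_size=FIREHOSE_MAX_BATCH):
    """Wrap payloads as Firehose records and split them into batch-sized lists."""
    # Newline-terminate each payload so rows stay separated in the staged files.
    records = [{"Data": p if p.endswith(b"\n") else p + b"\n"} for p in payloads]
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


def handler(event, context, firehose=None):
    """Lambda entry point: forward every Kinesis record to Firehose."""
    if firehose is None:
        import boto3  # imported lazily so the helpers above work without AWS
        firehose = boto3.client("firehose")

    forwarded = 0
    for batch in to_firehose_batches(decode_records(event)):
        response = firehose.put_record_batch(
            DeliveryStreamName=DELIVERY_STREAM, Records=batch
        )
        failed = response.get("FailedPutCount", 0)
        if failed:
            # Raising makes Lambda retry the whole Kinesis batch, so records
            # that already went through may be delivered again.
            raise RuntimeError(f"{failed} records were rejected by Firehose")
        forwarded += len(batch)
    return {"forwarded": forwarded}
```

Failing the whole invocation on a partial error is the simplest choice because it leans on Lambda's built-in retries, at the cost of possible duplicates downstream. If that matters for your tables, you would want to resend only the rejected records or deduplicate in Redshift.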