5
votes

When I read about AWS Data Pipeline, the idea immediately struck me: produce statistics to Kinesis and create a Data Pipeline job that consumes the data from Kinesis and COPYs it to Redshift every hour, all in one go.

But it seems there is no Data Pipeline node that can consume from Kinesis, so now I have two possible plans of action:

  1. Create an instance where the Kinesis data is consumed and sent to S3, split by hour. Data Pipeline then copies from there to Redshift.
  2. Consume from Kinesis and issue COPYs directly to Redshift on the spot.
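The core of plan 1 is just a naming convention: write each consumed record under an S3 prefix derived from its hour, so the scheduled COPY can load exactly one hour's worth of data. A minimal sketch of that key scheme (the stream name, bucket layout, and table name are placeholders, not anything AWS prescribes):

```python
from datetime import datetime, timezone

def hourly_s3_key(stream_name, ts, seq):
    """Build an S3 key that partitions records by hour,
    so a scheduled COPY can load one hour at a time."""
    return "{}/{:%Y/%m/%d/%H}/{}.json".format(stream_name, ts, seq)

# A record consumed at 2015-06-01 13:05 UTC lands under the 13:00 prefix:
key = hourly_s3_key("clickstream", datetime(2015, 6, 1, 13, 5, tzinfo=timezone.utc), "0001")
print(key)  # clickstream/2015/06/01/13/0001.json

# The hourly Data Pipeline job would then run something like:
# COPY events FROM 's3://my-bucket/clickstream/2015/06/01/13/'
#   CREDENTIALS '...' JSON 'auto';
```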

What should I do? Is there no way to connect Kinesis to Redshift using AWS services only, without custom code?


3 Answers

5
votes

It is now possible to do this without user code via a new managed service called Kinesis Firehose. It manages the desired buffering intervals, temporary uploads to S3, the upload to Redshift, error handling, and automatic throughput management.

2
votes

That is already done for you! If you use the Kinesis Connector Library, there is a built-in connector to Redshift:

https://github.com/awslabs/amazon-kinesis-connectors

Depending on the logic you need to apply, the connector can be really easy to implement.
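Conceptually, the connector pipeline boils down to: transform each Kinesis record, buffer until a count/size threshold, then emit one batch (which the Redshift connector turns into a COPY). The real library is Java; this is a pure-Python sketch of that control flow with illustrative names, not the library's actual API:

```python
class BufferedEmitter:
    """Toy model of the connector's transform -> buffer -> emit flow."""

    def __init__(self, flush_count, emit):
        self.flush_count = flush_count
        self.emit = emit      # called with a full batch, e.g. to run a COPY
        self.buffer = []

    def handle(self, record):
        self.buffer.append(record.upper())  # stand-in for the "transform" step
        if len(self.buffer) >= self.flush_count:
            self.emit(list(self.buffer))
            self.buffer.clear()

flushed = []
emitter = BufferedEmitter(3, flushed.append)
for r in ["a", "b", "c", "d"]:
    emitter.handle(r)
print(flushed)         # [['A', 'B', 'C']]
print(emitter.buffer)  # ['D'] -- still waiting for the next flush
```

The custom logic you plug in is mostly the transform step and the flush thresholds; the library handles checkpointing and the Redshift emit for you.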

0
votes

You can create and orchestrate a complete pipeline with InstantStack to read data from Kinesis, transform it, and push it into Redshift or S3.