4 votes

We consume all our analytics feeds via API Gateway > Kinesis Streams > Lambda > Firehose > Redshift tables.

AWS Lambda is our transformation layer: it accepts Kinesis stream records, modifies each analytics event based on context, and drops it to Firehose, which saves it to Redshift.
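As a sketch of that transformation step, a minimal handler might look like the following. The event schema and field names (`user`, `source`, the context-based enrichment) are hypothetical, and the actual delivery to Firehose (e.g. via boto3's `put_record_batch`) is omitted so the transform logic stands alone:

```python
import base64
import json


def transform_event(raw_record: dict) -> dict:
    """Decode one Kinesis record and enrich the analytics event.

    Kinesis delivers the payload base64-encoded under
    record["kinesis"]["data"]; the enrichment below (tagging the
    event with its source stream ARN) is just an illustrative example.
    """
    payload = base64.b64decode(raw_record["kinesis"]["data"])
    event = json.loads(payload)
    event["source"] = raw_record.get("eventSourceARN", "unknown")
    return event


def handler(event, context):
    """Lambda entry point for a Kinesis event source.

    In the real pipeline the transformed events would be forwarded
    to the Firehose delivery stream here; that call is omitted.
    """
    transformed = [transform_event(r) for r in event["Records"]]
    return {"transformed": len(transformed)}
```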

In this pipeline we want to update Redshift records under certain conditions, basically an UPSERT (insert or update). Is there anything in Firehose that makes it possible to avoid duplicate records in Redshift?


1 Answer

1 vote

Out of the box, no.

If the table that you want to upsert to is T1, then what you can do is:

  • Let Firehose dump records to another table T2.
  • Run a job (using cron or similar) that periodically upserts from T2 to T1. Use transactions, so that the data doesn't go bad while both this job and the Firehose-to-Redshift COPY queries are running.
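Since Redshift has no native UPSERT statement, the periodic job can use the delete-then-insert pattern inside a transaction. A minimal sketch, assuming `T1` is the target, `T2` is the staging table Firehose loads, and `id` is a hypothetical unique key:

```sql
BEGIN TRANSACTION;

-- Replace rows in T1 that have a newer version staged in T2.
DELETE FROM T1
USING T2
WHERE T1.id = T2.id;

-- Move all staged rows into the target table.
INSERT INTO T1
SELECT * FROM T2;

-- Clear the staging table with DELETE rather than TRUNCATE,
-- since TRUNCATE would implicitly commit mid-transaction in Redshift.
DELETE FROM T2;

END TRANSACTION;
```

Keeping all three statements in one transaction is what protects you from the concurrent Firehose COPY loads mentioned above.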