1
votes

We are planning to link Redshift and our PostgreSQL RDS database for our machine learning workload, so that our ML server can query and join the data in a single place.

As far as I know, there are two solutions:

  • Option 1: Dump all the RDS data into Redshift and sync it every day
  • Option 2: Create another RDS instance and use dblink to create a view that joins the two databases

For option 1, what is the best AWS service to use (we prefer to stay within AWS services)?

For option 2, how would it perform? Our current Redshift volume is 80 GB and our PostgreSQL database is 7 GB.

Are there any other solutions?

1

1 Answer

2
votes

From "Amazon Redshift introduces support for federated querying (preview)":

The in-preview Amazon Redshift Federated Query feature allows you to query and analyze data across operational databases, data warehouses, and data lakes. With Federated Query, you can now integrate queries on live data in Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL with queries across your Amazon Redshift and Amazon S3 environments.

Federated Query allows you to incorporate live data as part of your business intelligence (BI) and reporting applications. The intelligent optimizer in Redshift pushes down and distributes a portion of the computation directly into the remote operational databases to speed up performance by reducing data moved over the network. Redshift complements query execution, as needed, with its own massively parallel processing capabilities.
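
To make this concrete for the setup in the question, here is a minimal sketch of what the Redshift side might look like. It only builds the SQL text in Python; you would run the statements with whatever Redshift client you already use. The `CREATE EXTERNAL SCHEMA ... FROM POSTGRES` form follows the Redshift documentation as I understand it, but every name below (the `rds_pg` schema, the `events` and `users` tables, the host, and both ARNs) is a hypothetical placeholder, not something from your environment.

```python
def create_external_schema_sql(local_schema, rds_database, rds_schema,
                               rds_host, iam_role_arn, secret_arn, port=5432):
    """Build the statement that exposes a PostgreSQL schema inside Redshift.

    The credentials for the RDS database are assumed to live in an
    AWS Secrets Manager secret that the given IAM role is allowed to read.
    """
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {local_schema}\n"
        f"FROM POSTGRES\n"
        f"DATABASE '{rds_database}' SCHEMA '{rds_schema}'\n"
        f"URI '{rds_host}' PORT {port}\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"SECRET_ARN '{secret_arn}';"
    )


def join_query_sql(local_schema):
    """Build a query joining a Redshift table to a live RDS table.

    `public.events` (Redshift) and `users` (RDS) are made-up table names.
    """
    return (
        "SELECT u.id, u.email, COUNT(*) AS event_count\n"
        "FROM public.events AS e\n"
        f"JOIN {local_schema}.users AS u ON u.id = e.user_id\n"
        "GROUP BY u.id, u.email;"
    )


if __name__ == "__main__":
    # All values here are placeholders for illustration only.
    print(create_external_schema_sql(
        local_schema="rds_pg",
        rds_database="appdb",
        rds_schema="public",
        rds_host="my-rds-instance.example.us-east-1.rds.amazonaws.com",
        iam_role_arn="arn:aws:iam::123456789012:role/example-redshift-role",
        secret_arn="arn:aws:secretsmanager:us-east-1:123456789012:secret:example",
    ))
    print(join_query_sql("rds_pg"))
```

With this approach there is no daily dump to maintain: the ML server connects to Redshift only, and the join against the RDS tables happens at query time.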