I have a producer application that writes to a Kinesis stream at a rate of 600 records per second. I have written an Apache Flink application to read, process, and aggregate this streaming data and write the aggregated output to AWS Redshift.
The average size of each record is 2 KB. The application will run 24/7.
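To put numbers on the load, here is a rough back-of-the-envelope sketch of the ingest volume. It assumes 1 KB = 1024 bytes and per-shard Kinesis write limits of 1 MiB/s and 1,000 records/s. The helper name `estimate_ingest` is hypothetical and only for illustration.

```python
import math

# Figures taken from the question.
RECORDS_PER_SEC = 600
AVG_RECORD_BYTES = 2 * 1024  # assumption: 1 KB = 1024 bytes

# Assumed per-shard write limits for Kinesis Data Streams.
SHARD_WRITE_BYTES_PER_SEC = 1024 * 1024
SHARD_WRITE_RECORDS_PER_SEC = 1000

SECONDS_PER_DAY = 86_400


def estimate_ingest(records_per_sec, avg_record_bytes):
    """Hypothetical helper: return (bytes/s, bytes/day, minimum shard count)."""
    bytes_per_sec = records_per_sec * avg_record_bytes
    bytes_per_day = bytes_per_sec * SECONDS_PER_DAY
    # A stream needs enough shards to satisfy both the byte and record limits.
    shards_by_bytes = math.ceil(bytes_per_sec / SHARD_WRITE_BYTES_PER_SEC)
    shards_by_records = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    return bytes_per_sec, bytes_per_day, max(shards_by_bytes, shards_by_records)


bps, bpd, shards = estimate_ingest(RECORDS_PER_SEC, AVG_RECORD_BYTES)
print(f"{bps / 2**20:.2f} MiB/s, {bpd / 2**30:.1f} GiB/day, at least {shards} shards")
```

Under these assumptions the stream carries a little over 1 MiB/s and needs at least two shards. This only describes the input rate. It does not account for the state size or window length of the aggregation, which also affect cluster sizing.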
I would like to know how my AWS EMR cluster should be configured. How many nodes do I need? Which EC2 instance type (R3 or C3) should I use?
Apart from the performance aspect, cost is also important for us.