Thanks in advance for your time and response.
I have an AWS Step Functions state machine with the following activities:
- Pull first available data file from an external FTP server
- Process the data (processing time can vary)
- Upload the processed data to another FTP server
I have a Java application running on an EC2 instance. It has 3 threads that poll the activities using the code shown below, and it invokes the appropriate workers to do the actual work for each of the three steps. The important point is that all 3 activities of an execution must run on the same server, because the steps write to and read from a file location on that server.
I have hundreds of files to process on the FTP server, so I have 5 EC2 servers running copies of the Java application.
Now I start 5 executions of the state machine. This should allow me to distribute the file processing across the 5 servers.
However, my problem is this:
How can I ensure that the activities from a given state machine execution are handled by the same EC2 instance?
I don't want a given execution's activities to be handled by different EC2 instances. In the code below (from https://github.com/goosefraba/aws-step-function-activity-example/blob/master/src/main/java/at/goosefraba/ActivityProcessor.java), I don't see any way to call getActivityTask for a particular execution.
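    import com.amazonaws.ClientConfiguration;
    import com.amazonaws.services.stepfunctions.AWSStepFunctions;
    import com.amazonaws.services.stepfunctions.AWSStepFunctionsClientBuilder;
    import com.amazonaws.services.stepfunctions.model.GetActivityTaskRequest;
    import com.amazonaws.services.stepfunctions.model.GetActivityTaskResult;
    import java.util.concurrent.TimeUnit;

    // getActivityTask long-polls for up to 60 seconds, so the socket timeout must be longer.
    final ClientConfiguration clientConfiguration = new ClientConfiguration();
    clientConfiguration.setSocketTimeout((int) TimeUnit.SECONDS.toMillis(70));
    final AWSStepFunctions client = AWSStepFunctionsClientBuilder
            .standard()
            .withClientConfiguration(clientConfiguration)
            .build();

    while (true) {
        // The request identifies only the activity, not the execution.
        GetActivityTaskResult getActivityTaskResult =
                client.getActivityTask(
                        new GetActivityTaskRequest().withActivityArn(getArn()));
        // A null task token means the poll timed out with no task available.
        if (getActivityTaskResult.getTaskToken() != null) {
            // Do work
        }
    }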
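
To make the concern concrete, here is a minimal, self-contained simulation of what I think happens. It makes no AWS calls. It assumes each activity behaves like a single task queue shared by every server polling that activity, and that servers receive tasks round-robin (a deterministic stand-in for whatever order the service really uses). The class name, activity names, and server names are all made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class SharedActivityDemo {
    // Hypothetical names standing in for the three activity ARNs.
    static final List<String> ACTIVITIES = List.of("pull", "process", "upload");

    /** Returns, for each execution, the servers that handled its three steps in order. */
    static Map<String, List<String>> simulate(List<String> servers, List<String> executions) {
        // One task queue per activity, shared by all servers (like polling one activity ARN).
        Map<String, Queue<String>> queues = new LinkedHashMap<>();
        for (String activity : ACTIVITIES) {
            queues.put(activity, new ArrayDeque<>());
        }
        Map<String, List<String>> handledBy = new LinkedHashMap<>();
        for (String execution : executions) {
            handledBy.put(execution, new ArrayList<>());
            queues.get(ACTIVITIES.get(0)).add(execution); // every execution starts at "pull"
        }

        int remaining = executions.size() * ACTIVITIES.size();
        for (int turn = 0; remaining > 0; turn++) {
            // Assumption: servers receive tasks round-robin; real delivery order is arbitrary.
            String server = servers.get(turn % servers.size());
            for (int i = 0; i < ACTIVITIES.size(); i++) {
                String execution = queues.get(ACTIVITIES.get(i)).poll();
                if (execution == null) {
                    continue; // nothing scheduled for this activity
                }
                handledBy.get(execution).add(server);
                if (i + 1 < ACTIVITIES.size()) {
                    // Completing a step schedules the execution's next activity.
                    queues.get(ACTIVITIES.get(i + 1)).add(execution);
                }
                remaining--;
                break; // one task per turn
            }
        }
        return handledBy;
    }

    public static void main(String[] args) {
        System.out.println(simulate(List.of("server-A", "server-B"), List.of("execution-1")));
    }
}
```

With two servers and one execution this prints {execution-1=[server-A, server-B, server-A]}: the file pulled on server-A would be processed on server-B, which has no copy of it. That is the situation I need to avoid.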