2 votes

I have 3 S3 buckets:

  1. input-files
  2. in-progress
  3. processed-files

The "input-files" bucket contains CSV files (filename format: filename-timestamp). I want to take each input file from the bucket one at a time, move it to the "in-progress" bucket, and, when the workflow completes, move it to the "processed-files" bucket. On error, all file processing must stop.

In my flow I can get the content of the CSV file, but there is no reference to the file name, so I'm not sure how to implement the above: I can't specify which file needs to be moved.

How can I implement the processing steps outlined above?

XML flow:

<?xml version="1.0" encoding="UTF-8"?>

<mule xmlns="http://www.mulesoft.org/schema/mule/core" xmlns:doc="http://www.mulesoft.org/schema/mule/documentation"
    xmlns:s3="http://www.mulesoft.org/schema/mule/s3"
    xmlns:spring="http://www.springframework.org/schema/beans" version="EE-3.8.1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-current.xsd
http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/current/mule.xsd
http://www.mulesoft.org/schema/mule/s3 http://www.mulesoft.org/schema/mule/s3/current/mule-s3.xsd">

    <flow name="CsvToMongo">
        <poll doc:name="Poll">
            <s3:get-object-content config-ref="Amazon_S3__Configuration" bucketName="test-file-bucket" key="input-files/TestData.csv" mimeType="application/csv" doc:name="Amazon S3"/>
        </poll>
        <object-to-string-transformer encoding="UTF-8" mimeType="application/csv" doc:name="Object to String"/>
        <logger message="#['**** Start writing CSV to database...']" level="INFO" doc:name="Logger: Start Process"/>
    </flow>
</mule>
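For reference, S3 has no single "move" operation: a move is a copy followed by a delete, both keyed by the object name. A minimal sketch of that step, assuming the Mule 3 S3 connector's `copy-object` and `delete-object` operations and a flow variable `fileKey` (an illustrative name) already holding the object key:

```xml
<!-- Sketch: "move" an object = copy it to the destination bucket, then delete the source.
     Assumes flowVars.fileKey holds the object key, e.g. "filename-20170101.csv". -->
<s3:copy-object config-ref="Amazon_S3__Configuration"
    sourceBucketName="input-files" sourceKey="#[flowVars.fileKey]"
    destinationBucketName="in-progress" destinationKey="#[flowVars.fileKey]"
    doc:name="Copy to in-progress"/>
<s3:delete-object config-ref="Amazon_S3__Configuration"
    bucketName="input-files" key="#[flowVars.fileKey]"
    doc:name="Delete from input-files"/>
```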

Software being used: Anypoint Studio 6.2, Mule 3.8.1

Thanks


1 Answer

1 vote

An approach I used recently was to configure the S3 bucket to publish its event notifications to an Amazon Simple Queue Service (SQS) queue.

Then in my Mule flow, my input source is an SQS poller.

The structure of the S3 event is well documented by AWS. It is a JSON string (convert it to a JSON object to use it) and contains all the information needed to identify the actual file name.
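For example, the object key (the file name) can be pulled out after parsing the JSON. A sketch using the standard Mule 3 JSON transformer; the path `Records[0].s3.object.key` follows the documented AWS event format, and `fileKey` is just an illustrative variable name:

```xml
<!-- The SQS message body is the S3 event notification JSON, roughly:
     { "Records": [ { "s3": { "bucket": { "name": "..." },
                              "object": { "key": "input-files/filename-20170101.csv" } } } ] }
     Note: AWS URL-encodes the key in event notifications (e.g. spaces become "+"). -->
<json:json-to-object-transformer returnClass="java.util.Map" doc:name="JSON to Map"/>
<set-variable variableName="fileKey"
    value="#[payload.Records[0].s3.object.key]" doc:name="Extract object key"/>
```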

It's working quite nicely.
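Putting it together, the flow skeleton might look like the following. This is a sketch, not a drop-in config: the exact SQS source element (`sqs:receive-messages` here) and operation names depend on the connector versions installed, and the flow/variable names are assumptions:

```xml
<flow name="CsvToMongoViaSqs">
    <!-- Source: poll the SQS queue that receives the bucket's event notifications -->
    <sqs:receive-messages config-ref="Amazon_SQS__Configuration" doc:name="SQS"/>
    <!-- Parse the S3 event and capture the object key (the file name) -->
    <json:json-to-object-transformer returnClass="java.util.Map" doc:name="JSON to Map"/>
    <set-variable variableName="fileKey"
        value="#[payload.Records[0].s3.object.key]" doc:name="File key"/>
    <!-- "Move" to in-progress: s3:copy-object then s3:delete-object, both using flowVars.fileKey -->
    <s3:get-object-content config-ref="Amazon_S3__Configuration" bucketName="in-progress"
        key="#[flowVars.fileKey]" mimeType="application/csv" doc:name="Get CSV"/>
    <object-to-string-transformer encoding="UTF-8" mimeType="application/csv"
        doc:name="Object to String"/>
    <!-- ... CSV-to-database processing here ... -->
    <!-- On success: copy-object/delete-object again to "move" the file to processed-files.
         An unhandled exception stops this flow, so no further steps run on error. -->
</flow>
```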