The requirement is that we receive historical daily files in the source bucket. The files have the following format:
Source bucket:
s3://sourcebucket/abc/111111_abc_1180301000014_1-3_1180301042833.txt
s3://sourcebucket/abc/111111_cde_1180302000042_2-3_1180302042723.txt
These are sample values, as I can't post the exact file names:
111111_abc_1180301000014_1-3_1180301042833.txt
where 1180301000014 is the date and time: 180301 is the date (March 1st 2018) and 000014 is hours, minutes, and seconds (hhmmss).
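For clarity, the timestamp token can be decoded with a small helper. This is only an illustrative sketch; `parse_stamp` is a hypothetical name, and it assumes the token is always a leading `1` followed by `yymmdd` and `hhmmss`:

```python
from datetime import datetime

def parse_stamp(token):
    # Hypothetical helper: skip the leading "1", then parse
    # yymmdd (date) followed by hhmmss (time).
    # e.g. "1180301000014" -> 2018-03-01 00:00:14
    return datetime.strptime(token[1:], "%y%m%d%H%M%S")

print(parse_stamp("1180301000014"))  # 2018-03-01 00:00:14
```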
Once we receive all the hourly files for March 1st, we need to copy them to another bucket and then do further processing. Currently, the copy part works fine: it copies all the files present in the source bucket to the destination. But I am not sure how to apply a filter so that it first picks only March 1st's files and copies them to the other bucket, and then picks the remaining files in sequential order.
Python script:
import boto3
import json

s3 = boto3.resource('s3')

def lambda_handler(event, context):
    bucket = s3.Bucket('<source-bucket>')
    dest_bucket = s3.Bucket('<destination-bucket>')
    for obj in bucket.objects.filter(Prefix='abc/', Delimiter='/'):
        dest_key = obj.key
        print(dest_key)
        s3.Object(dest_bucket.name, dest_key).copy_from(
            CopySource={'Bucket': obj.bucket_name, 'Key': obj.key})
I am not that well versed in Python; in fact, this is my first Python script. Any guidance is appreciated.
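One possible approach, sketched below: since S3's `objects.filter` can only filter by key prefix, the date-based selection has to happen client-side by parsing the file name. The function name `group_keys_by_day` and the assumption that the date always sits in the third underscore-separated token are mine, not from the original script:

```python
from collections import defaultdict

def group_keys_by_day(keys):
    """Group S3 object keys by the yymmdd date embedded in the
    third underscore-separated token of the file name."""
    by_day = defaultdict(list)
    for key in keys:
        fname = key.rsplit('/', 1)[-1]   # drop the 'abc/' prefix
        parts = fname.split('_')
        if len(parts) < 3:
            continue                     # skip keys that don't match the pattern
        day = parts[2][1:7]              # e.g. '180301' for March 1st 2018
        by_day[day].append(key)
    # Days in chronological order, keys sorted within each day.
    return {day: sorted(by_day[day]) for day in sorted(by_day)}
```

In `lambda_handler`, one could then collect `obj.key` for every object under the prefix, call `group_keys_by_day`, and run the existing `copy_from` call day by day: the earliest day ('180301') is copied first, and the remaining days follow in order.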