Since I posted my previous answer in May last year, many of you have contacted me asking for a pipeline sample to achieve the incremental file copy scenario using the getMetadata-ForEach-getMetadata-If-Copy pattern. This feedback made it clear that incremental file copy is a common scenario we want to optimize further.
Today I would like to post an updated answer: we recently released a new feature that provides a much easier and more scalable approach to achieve the same goal.
You can now set modifiedDatetimeStart and modifiedDatetimeEnd on the SFTP dataset to specify a time range filter, so that only files created or modified during that period are extracted. This enables you to achieve incremental file copy with a single Copy activity:
https://docs.microsoft.com/en-us/azure/data-factory/connector-sftp#dataset-properties
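
For illustration, here is a minimal sketch of such a dataset definition. The dataset name, linked service name, and folder path are placeholders I made up; the authoritative schema is in the doc linked above, so treat this as a sketch rather than a copy-paste answer:

```json
{
    "name": "SftpSourceDataset",
    "properties": {
        "type": "FileShare",
        "linkedServiceName": {
            "referenceName": "MySftpLinkedService",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "folderPath": "incoming/daily/",
            "modifiedDatetimeStart": "2019-02-01T00:00:00Z",
            "modifiedDatetimeEnd": "2019-02-02T00:00:00Z",
            "format": {
                "type": "TextFormat",
                "columnDelimiter": ",",
                "rowDelimiter": "\n"
            }
        }
    }
}
```

If I recall the docs correctly, files whose last-modified time is at or after modifiedDatetimeStart and before modifiedDatetimeEnd are selected, and either property can be left NULL to make the range open-ended on that side.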
This feature is enabled for these file-based connectors in ADF: Amazon S3, Azure Blob Storage, FTP, SFTP, ADLS Gen1, ADLS Gen2, and on-premises file system. Support for HDFS is coming soon.
Further, to make it even easier to author an incremental copy pipeline, we have now released common pipeline patterns as solution templates. You can select one of the templates, fill in the linked service and dataset info, and click Deploy. It is that simple!
https://docs.microsoft.com/en-us/azure/data-factory/solution-templates-introduction
You should be able to find the incremental file copy solution template in the gallery:
https://docs.microsoft.com/en-us/azure/data-factory/solution-template-copy-new-files-lastmodifieddate
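
To run the copy incrementally on a schedule, the idea behind that template is to parameterize the two dataset properties and bind them to a tumbling window trigger's window start and end times. Here is a hedged sketch of the trigger side, assuming a pipeline named IncrementalCopyPipeline with parameters windowStart and windowEnd (those names are mine, not necessarily what the template uses):

```json
{
    "name": "IncrementalCopyTrigger",
    "properties": {
        "type": "TumblingWindowTrigger",
        "typeProperties": {
            "frequency": "Hour",
            "interval": 1,
            "startTime": "2019-02-01T00:00:00Z",
            "maxConcurrency": 1
        },
        "pipeline": {
            "pipelineReference": {
                "referenceName": "IncrementalCopyPipeline",
                "type": "PipelineReference"
            },
            "parameters": {
                "windowStart": "@trigger().outputs.windowStartTime",
                "windowEnd": "@trigger().outputs.windowEndTime"
            }
        }
    }
}
```

Inside the pipeline, pass these parameters down to the dataset and reference them as @dataset().windowStart and @dataset().windowEnd in modifiedDatetimeStart and modifiedDatetimeEnd, so that each trigger run copies exactly the files that arrived during its window.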
Once again, thank you for using ADF, and happy data integration!