Any ideas for the other half of the problem: moving the files from the Source blob to an Archive blob after processing them?
As I known, there is no in-build task for you to achieve this purpose. Based on my test, I assume that you could leverage Script Task and write code (VB or C#) to handle blobs directly. Here are my detailed steps, you could refer to them:
1) Use Azure Blob Source and OLE DB Destination under Data Flow for loading a CSV file from Azure Blob into Azure SQL database.
2) After successfully load the CSV data into SQL table, use a Script Task for moving the source blob to an archive blob.
I would invoke Blob Service REST API copy Blob and Delete Blob with Container SAS token, you could leverage Microsoft Azure Storage Explorer and follow this official tutorial to generate the SAS token for your blob container.
Assuming the source blob and destination blob are under the same container, then I add three variables (SourceBlobUrl
,ContainerSasToken
,ArchiveBlobUrl
) as follows and add them as ReadOnlyVariables in Script Task Editor, you could refer to this tutorial for using Variables in the Script Task.
Click Edit Script button under Script Task Editor to launch the VSTA development environment in which you write your custom script. Here is the Main method under ScriptMain.cs
as follows:
public async void Main()
{
// TODO: Add your code here
string sasToken = Dts.Variables["ContainerSasToken"].Value.ToString();
string sourceBlobUrl = Dts.Variables["SourceBlobUrl"].Value.ToString();
string archiveBlobUrl = Dts.Variables["ArchiveBlobUrl"].Value.ToString();
try
{
HttpClient client = new HttpClient();
client.DefaultRequestHeaders.Add("x-ms-copy-source", sourceBlobUrl + sasToken);
//copy source blob to archive blob
Dts.Log($"start copying blob from [{sourceBlobUrl}] to [{archiveBlobUrl}]...", 0, new byte[0]);
HttpResponseMessage response = await client.PutAsync(archiveBlobUrl + sasToken, null);
if (response.StatusCode == HttpStatusCode.Accepted || response.StatusCode == HttpStatusCode.Created)
{
client.DefaultRequestHeaders.Clear();
Dts.Log($"start deleting blob [{sourceBlobUrl}]...", 0, new byte[0]);
//delete source blob
HttpResponseMessage result = await client.DeleteAsync(sourceBlobUrl + sasToken);
if (result.StatusCode == HttpStatusCode.Accepted || result.StatusCode == HttpStatusCode.Created)
{
Dts.TaskResult = (int)ScriptResults.Success;
return;
}
}
Dts.TaskResult = (int)ScriptResults.Failure;
}
catch (Exception ex)
{
Dts.Events.FireError(-1, "Script Task - Move source blob to an archive blob", ex.Message + "\r" + ex.StackTrace, String.Empty, 0);
Dts.TaskResult = (int)ScriptResults.Failure;
}
}
Result
Additionally, you could also leverage Microsoft Azure Storage Client Library for .NET to access storage blob, at this point, you need to load the assembly in a SSIS script task that is not in the GAC, for more details you could refer to this official blog.