2
votes

I have a file system that contains a variable number of folders These folders may or may not contain a file destined for further processing There are certain folders I wish to exclude from processing The list of excluded folders is comma delimited variable The number of excluded folders contained in the delimited variable may change.
For example, currently the delimited variable is "processed,excluded,unknown"
At a later date, the delimited variable may look like "processed,excluded,unknown,download,"

I found this link "Skipping Items In A Foreach Loop" which works as long as the excluded folders are hard-coded https://www.timmitchell.net/post/2009/09/16/skipping-items-in-a-foreach-loop/

For example, the precedence constraint is currently hard-coded as this: FINDSTRING( @[User::FullFileName], "unknown",1) == 0

The question I have are these: 1) The optimal way to split a delimited string into the discrete values 2) The optimal way to evaluate the foreach variable (in my example, called FullFileName) against the list of excluded folders

3

3 Answers

1
votes

Similar logic to Tim's fine post (I don't have SSIS installed but I did have BimlStudio so please accept different icons for SSIS objects).

enter image description here

3 Variables

  • CurrentFilename string - this will contain the current file name, which may or may not contain the exclusion value
  • ExclusionList string - a delimited string of folders you'd like to exclude
  • IncludeCurrent boolean - False, and we'll override per loop

Inside the Foreach File Loop, I'd use a Script Task as it's going to be the most effective method for splitting the ExclusionList as well as performing the file system tests. The result of running the Script Task is that we will set the value of @[User::IncludeCurrent] to true or false.

The Precedent Constraint logic between the Script Task and the Dataflow Task is going to be simplified from the reference post to just @[User::IncludeCurrent].

If the condition is met, we execute the DFT task. Otherwise, we skip it.

Test for inclusion

You will need to pass the CurrentFilename and ExclusionList as read only variables into the Script task. IncludeCurrent will be passed as a read/write variable.

// use the .net native split method to split on commas
var fileList = Dts.Variables["ExclusionList"].Value.Split(new Char[] {','});

// We could inline this for the next operation but you might have need for this elsewhere
string currentFilename = Dts.Variables["CurrentFilename"].Value.ToString();

// https://stackguides.com/questions/500925/check-if-a-string-contains-an-element-from-a-list-of-strings
bool isFound = fileList.Any(s=>currentFileName.Contains(s));

// Assign the results back to our SSIS scoped variable
Dts.Variables["IncludeCurrent"].Value = isFound;

Notes on Raj More's proposal

SPLIT_STRING assumes SQL Server 2016+. As a consultant, I wish I could count on the newest features available but that often isn't the case.

I believe that a challenge you might experience with the double for each loop approach is that inner recordset (the query's split list) is marked as exhausted after the first pass so you'd need to modify this approach as FELC All files -> OLE DB Query -> FELC Filter

If you have more complex testing that pure string matching, the script approach will provide the maximum flexibility.

0
votes

I would connect to the database, do a SPLIT_STRING, put the result back into an object, and use a for-each loop on each folder name.. so you would have two for-each loops, the outer one for finding matches, and the inner one for ignoring subfolders.

Also, remember to account for case sensitivity in string comparison in SSIS.

-1
votes

Thanks to all that answered my question. Here's where I landed. I'm sure this can be improved. I definitely could not have done this without you all!!!! Thanks!

Create project variables EFTFileSystem_FoldersToBeExcludedFromPolling defined as string. Comma delimited. Example "processed,error" EFTFileSystem_NewFileName - populated from SSIS for each loop container single value. Example "c:\effy\bozo\SideShowBob.txt" EFTFileSystem_IsValidNewFile - script task will return full path if file is in valid folder. Example "c:\effy\bozo\KrustyGood.txt"

Add and configure "Foreach Loop Container" In Collection tab set "Retrieve file name" to fully qualified Select traverse subfolders In Variable Mappings tab Set variable to User::EFTFileSystem_NewFileName

Within "Foreach Loop Container" Create Script task ReadOnlyVariables = User::EFTFileSystem_FoldersToBeExcludedFromPolling,User::EFTFileSystem_NewFileName ReadWriteVariables = User::EFTFileSystem_IsValidNewFile

Edit script task Add to namespace: using System.Linq;

    public void Main()
    {
  // TODO: Add your code here

  // use the .net native split method to split on commas
  string Bob = Dts.Variables["EFTFileSystem_FoldersToBeExcludedFromPolling"].Value.ToString();
  var fileList = Bob.Split(new Char[] { ',' });

  // We could inline this for the next operation but you might have need for this elsewhere
  string currentFilename = Dts.Variables["EFTFileSystem_NewFileName"].Value.ToString();

  // https://stackguides.com/questions/500925/check-if-a-string-contains-an-element-from-a-list-of-strings
  bool isFound = fileList.Any(s => currentFilename.Contains(s));


  // Assign the results back to our SSIS scoped variable
  Dts.Variables["EFTFileSystem_IsValidNewFile"].Value = !isFound;

  Dts.TaskResult = (int)ScriptResults.Success;

}