I have thousands of csv files containing spanning from January 2016 until today. I want to load all files from 25th of November 2016 until 02nd of January 2017.
I know I can use virtual path as below, but wont this load all my data from disk? I only want data from the period above. Will adding the @result query (modifying to my time period) ensure that only the files I am interested in are loaded into memory?
DECLARE @file_set_path2 string = @dir + "{date:yyyy}/{date:MM}/{date:dd}/{date:MM}{date:dd}{date:yyyy}.csv";
@data =
EXTRACT vala int,
valb long,
valc DateTime,
date DateTime // virtual file set column
FROM @file_set_path2
USING Extractors.Csv();
@result =
SELECT *
FROM @data
WHERE date > DateTime.Parse("2016-11-24")
AND date < DateTime.Parse("2017-01-03");