1
votes

I have a input folder in ADLS in the format year/month/date eg: 2017/07/11. I want to pass this input folder as a parameter to my usql script. I am not using ADF. I dont want to generate current date from within Usql script as i am not sure if the input folder is of the current date. How to do it effectively?

One way I thought of was uploading a "done" file after all my input folder is uploaded to ADLS account and that "done" file will contain the date. But i am not able to use that date to form my input data path. Please help.

1
U-SQL stored proc?wBob

1 Answers

1
votes

Let's assume you have several csv files in your folder structure (structured as yyyy/MM/dd) and you want to extract all the files in a folder of a specific date. You can do it in two ways (depending in whether you need to have exact datetime semantics or if you are fine with path concat).

First the path concat example:

DECLARE EXTERNAL @folder = "2017/07/11"; // Script parameter with default value. 
                                         // You can specify the value also with constant-foldable expression on Datetime.Now.

DECLARE @path = "/constantpath/"+@folder+"/{*.csv}";

@data = EXTRACT I int, s string // or whatever your schema is...
        FROM @path
        USING Extractors.Csv();
...

And here is the example with a file set virtual column:

DECLARE EXTERNAL @date = "2017/07/11"; // Script parameter with default value. 
                                         // You can specify the value also with constant-foldable expression on Datetime.Now and string serialization (I am not sure if the ADF parameter model supports DateTime values).

DECLARE @path = "/constantpath/{date:yyyy}/{date:MM}/{date:dd}/{*.csv}";

@data = EXTRACT I int, s string // or whatever your schema is...
              , date DateTime // virtual column for the date pattern
        FROM @path
        USING Extractors.Csv();

// Now apply the requested filter to reduce the files to the requested set
@data = SELECT * FROM @data WHERE date == DateTime.Parse(@date);
...

In both cases, you pass the parameter via the ADF parameterization model and you can decide to wrap the code into a U-SQL stored procedure or TVF as suggested by Bob.