I don't believe parallelism for external data sources is supported yet in U-SQL (although I'm happy to be corrected). If you feel this is an important missing feature, you can create a request and vote for it here:
https://feedback.azure.com/forums/327234-data-lake
As a workaround, you could manually parallelise your queries, depending on the columns available in your data source, e.g. by date:
// External query working
USE DATABASE yourADLADB;

// Create the external query for year 2016
@results2016 =
    SELECT *
    FROM EXTERNAL yourSQLDBDataSource EXECUTE
        @"SELECT * FROM dbo.yourBigTable WITH (NOLOCK) WHERE yourDateCol BETWEEN '1 Jan 2016' AND '31 Dec 2016'";

// Create the external query for year 2017
@results2017 =
    SELECT *
    FROM EXTERNAL yourSQLDBDataSource EXECUTE
        @"SELECT * FROM dbo.yourBigTable WITH (NOLOCK) WHERE yourDateCol BETWEEN '1 Jan 2017' AND '31 Dec 2017'";

// Output 2016 results
OUTPUT @results2016
TO "/output/bigTable/results2016.csv"
USING Outputters.Csv();

// Output 2017 results
OUTPUT @results2017
TO "/output/bigTable/results2017.csv"
USING Outputters.Csv();
Admittedly, this creates a different issue by breaking the data up into multiple files. However, you could then read these back using file sets, which will also parallelise, e.g.:
@input =
    EXTRACT
        ..., // your column list
        year int // virtual column from the {year} pattern; it needs to be declared in the EXTRACT schema
    FROM "/output/bigTable/results{year}.csv"
    USING Extractors.Csv();
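As a side benefit, constant filters on the {year} virtual column let U-SQL restrict the input to just the matching files. A minimal sketch, assuming a hypothetical two-column schema (id and yourDateCol) in the output files:

@input =
    EXTRACT id int,
            yourDateCol DateTime,
            year int // virtual column taken from the file name
    FROM "/output/bigTable/results{year}.csv"
    USING Extractors.Csv();

// A constant predicate on the virtual column means only the matching
// files (here, results2017.csv) are actually read.
@latest =
    SELECT id,
           yourDateCol
    FROM @input
    WHERE year == 2017;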
I would ask why you are choosing to move such a large file into your lake, given ADLA and U-SQL offer you the ability to query data where it lives. Can you explain further?
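For example (a minimal sketch, assuming yourSQLDBDataSource already exists and the remote column/alias names are hypothetical), you can push the heavy lifting down to the source database and only land the much smaller result in your lake:

USE DATABASE yourADLADB;

// The aggregation runs on the remote SQL database; only the
// per-year summary rows come back to the lake.
@summary =
    SELECT *
    FROM EXTERNAL yourSQLDBDataSource EXECUTE
        @"SELECT YEAR(yourDateCol) AS yr, COUNT(*) AS rowsPerYear FROM dbo.yourBigTable WITH (NOLOCK) GROUP BY YEAR(yourDateCol)";

OUTPUT @summary
TO "/output/bigTable/summaryByYear.csv"
USING Outputters.Csv();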