I've set up a pipeline in Azure Data Factory to 1) copy files from Storage to the Lake, 2) run a U-SQL job to merge and process the copied files and output a single file, and 3) open and process this merged file (insert into a DB).

Whatever I try permissions-wise, step 3 fails. All the demos and tutorials for Azure Data Lake stop at producing the output file and declare success, job done, etc.

I'm finding the docs.microsoft material on this quite convoluted (possibly because of the Gen1/Gen2 Lake split?). Surely what I'm trying to do is a common scenario: take some data files, merge them, output the result, then process that output.

It seems that the file created by the U-SQL job has a different owner from the other files, so the most common error is a 403. When setting up the pipeline in ADF, I can browse to the folders on Lake storage to configure it, but I can't open the file without setting all permissions on that file in Lake storage. When I debug or run the pipeline in ADF, the newly created output file doesn't have these permissions, so the pipeline step that processes the output file fails.
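To illustrate why a freshly written file can be unreadable even though the folder is browsable, here is a minimal toy model in Python. It assumes the Lake is Gen1 with POSIX-style ACLs, where a new file is owned by whoever created it and receives its access ACL from the parent folder's default ACL. This is only a sketch of that behaviour, not an Azure API: `Node`, `create_file`, `can_read`, and the identity names are all hypothetical, and whether this is the actual cause in the pipeline above is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """A file or folder in a toy POSIX-style ACL model (not a real Azure API)."""
    owner: str
    # Access ACL: principal -> permission string such as "r-x".
    access_acl: Dict[str, str] = field(default_factory=dict)
    # Default ACL: only meaningful on folders; template for new children.
    default_acl: Dict[str, str] = field(default_factory=dict)


def create_file(parent: Node, creator: str) -> Node:
    """New file: owned by its creator, access ACL copied from the parent's default ACL."""
    return Node(owner=creator, access_acl=dict(parent.default_acl))


def can_read(node: Node, principal: str) -> bool:
    """Simplification: the owner can always read; others need an 'r' entry."""
    if principal == node.owner:
        return True
    return "r" in node.access_acl.get(principal, "")


# Hypothetical identities: one for the ADF pipeline, one for the U-SQL job.
ADF = "adf-identity"
ADLA = "usql-job-identity"

# The folder grants ADF an access entry (so browsing works) but no default entry.
folder = Node(owner="admin", access_acl={ADF: "r-x"})

# The U-SQL job writes the merged file; it inherits nothing for ADF.
old_file = create_file(folder, creator=ADLA)
before_fix = can_read(old_file, ADF)  # False: this is the 403 case

# Adding a default ACL entry on the folder affects files created afterwards.
folder.default_acl[ADF] = "r--"
new_file = create_file(folder, creator=ADLA)
after_fix = can_read(new_file, ADF)  # True

# Default ACLs are not applied retroactively to files that already exist.
old_still_blocked = not can_read(old_file, ADF)  # True
```

Under these assumptions, granting permissions on each output file by hand never sticks, because every pipeline run creates a new file. The entry has to exist as a default ACL on the output folder before the U-SQL job writes to it.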

All of these resources are set up in the same Azure subscription.

1 Answer

I've sorted this permissions issue now. If anyone is interested, there is some information and guidance here: https://www.sqlservercentral.com/stairways/stairway-to-u-sql

Also, this course is quite a good introduction:

https://app.pluralsight.com/library/courses/u-sql-azure-data-lake/