1
votes

Without spinning up a VM instance, is it possible to add PGP encryption to data already in Azure Data Lake Store? In theory, this should be possible with a registered C# assembly (DLL) in U-SQL, but it would require treating files as blobs (or as text), and I'm not sure how one would do that from U-SQL.

The use case is to take data from the lake, encrypt it as PGP/GPG using a public key, and then land the data into an ADLS location for pickup by an external team (subsequent egress from ADLS).

Any ideas?

1
Possibly relates to: stackoverflow.com/questions/4192296 (C#+PGP) – aaronsteers

1 Answer

2
votes

You can write a custom extractor and outputter that can then do the decryption/encryption. This would most likely look something like this (at the abstract level):

  • Extractor:

    AtomicFileProcessing=true
    d = decrypt(input.baseStream)
    for each row in d.Split do outputrow end // or whatever the right processing is
    
  • Outputter:

    AtomicFileProcessing=true
    serialize rows into outputstream
    encrypt outputstream and write to output
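
As a rough illustration of the outputter half, here is a minimal C# sketch. It assumes the standard U-SQL `IOutputter` base class from `Microsoft.Analytics.Interfaces`. The class name `PgpCsvOutputter` and the helper `WrapWithPgpEncryption` are hypothetical: the helper stands in for whatever PGP library you register alongside the assembly and is deliberately left unimplemented. The CSV serialization is naive (no quoting or escaping) and only there to show the shape.

```csharp
using System;
using System.IO;
using System.Text;
using Microsoft.Analytics.Interfaces;

// Hypothetical outputter: writes rows as simple CSV into an encrypting stream
// that wraps the output file's base stream.
[SqlUserDefinedOutputter(AtomicFileProcessing = true)]
public class PgpCsvOutputter : IOutputter
{
    private Stream _encrypted;    // encrypting wrapper around output.BaseStream
    private StreamWriter _writer; // text writer on top of the wrapper

    public override void Output(IRow row, IUnstructuredWriter output)
    {
        if (_writer == null)
        {
            // Set up once, on the first row. With AtomicFileProcessing = true
            // the whole file is handled as a single unit.
            _encrypted = WrapWithPgpEncryption(output.BaseStream);
            _writer = new StreamWriter(_encrypted, new UTF8Encoding(false));
        }

        // Naive CSV: no quoting or escaping (a simplification for this sketch).
        bool first = true;
        foreach (IColumn col in row.Schema)
        {
            if (!first) _writer.Write(',');
            _writer.Write(row.Get<object>(col.Name));
            first = false;
        }
        _writer.Write('\n');
    }

    public override void Close()
    {
        if (_writer != null)
        {
            _writer.Flush();
            // Assumption: disposing the wrapper is what makes it emit its
            // final packets. Check whether your library also closes the
            // underlying stream and whether that is acceptable here.
            _encrypted.Dispose();
        }
    }

    // Placeholder, NOT a real API: return a write-only stream that
    // PGP-encrypts everything written to it using the recipient's public key.
    private static Stream WrapWithPgpEncryption(Stream target)
    {
        throw new NotImplementedException("Plug in a PGP library here.");
    }
}
```

From U-SQL this would be used along the lines of `OUTPUT @rows TO "/outbound/data.csv.pgp" USING new PgpCsvOutputter();` after registering the assembly. The output path here is made up for illustration. An extractor for the decryption direction would mirror this: wrap `input.BaseStream` in a decrypting stream and emit rows as they are read.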
    

Note that the examples section of our U-SQL GitHub page has some samples that show how to operate on data at the BaseStream level.

If you can, though, avoid loading more than 500 MB of data into main memory. Ideally the encryption/decryption should be done in a streaming way.
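
To make "streaming" concrete, the following sketch copies data through a wrapping stream one buffer at a time, so memory use stays at the buffer size regardless of file size. It uses only the .NET base class library. The names `StreamingTransform` and `Pump` are hypothetical, and the `wrap` delegate is where an encrypting (or decrypting) stream from your PGP library would be supplied.

```csharp
using System;
using System.IO;

public static class StreamingTransform
{
    // Reads `input` in fixed-size chunks and writes each chunk to the stream
    // returned by `wrap(output)`. Only one buffer is held in memory at a time.
    // Returns the number of plaintext bytes read from `input`.
    public static long Pump(Stream input, Stream output,
                            Func<Stream, Stream> wrap, int bufferSize = 81920)
    {
        long total = 0;
        var buffer = new byte[bufferSize];

        // Disposing the wrapper lets it flush any trailing data it buffers.
        using (Stream sink = wrap(output))
        {
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                sink.Write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

As a stand-in that runs without a PGP library, `StreamingTransform.Pump(src, dst, s => new System.IO.Compression.GZipStream(s, System.IO.Compression.CompressionMode.Compress))` exercises the same pattern with compression instead of encryption. Inside an outputter or extractor, the same loop would run against `output.BaseStream` or `input.BaseStream`.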