In the documentation for the AdlCopy.exe utility, which moves data from Azure blob storage into Azure Data Lake Store, all the examples use the swebhdfs: URI prefix.

For example:

AdlCopy /dest swebhdfs://mydatalakestore.azuredatalakestore.net/myfolder/ ...

https://azure.microsoft.com/en-us/documentation/articles/data-lake-store-copy-data-azure-storage-blob/

However, the Azure Portal page for the Data Lake Store account lists two different "addresses": a "URL" with the https: prefix and an "ADL URI" with the adl: prefix.

For example:

URL

https://mydatalakestore.azuredatalakestore.net

ADL URI

adl://mydatalakestore.azuredatalakestore.net

Are all these different "addresses" equivalent and substitutable for each other, particularly for use with the /dest parameter of the AdlCopy.exe utility?

1 Answer

swebhdfs is the secure WebHDFS URI scheme, which provides WebHDFS-compliant semantics. adl is the (also secure) Azure Data Lake URI scheme, which extends WebHDFS with some additional performance improvements and capabilities. You can also use https, since the service has a REST interface. Currently, all three URI schemes are interchangeable in the AdlCopy tool. Going forward, please use only the URI scheme published in the portal for best performance.
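
Since the three addresses differ only in their scheme prefix, here is a minimal Python sketch of rewriting one form into another, for example to normalize a script onto the scheme the portal shows. The helper name with_scheme and the SCHEMES tuple are my own illustrations, not part of AdlCopy or any Azure SDK, and the sketch only manipulates strings; it does not contact the service.

```python
from urllib.parse import urlsplit

# The three schemes named in the answer; this tuple is illustrative only.
SCHEMES = ("swebhdfs", "adl", "https")


def with_scheme(uri: str, scheme: str) -> str:
    """Return `uri` with its scheme swapped for `scheme` (hypothetical helper).

    Host and path are kept unchanged; only the prefix before "://" differs.
    """
    if scheme not in SCHEMES:
        raise ValueError(f"unsupported target scheme: {scheme!r}")
    parts = urlsplit(uri)
    if parts.scheme not in SCHEMES:
        raise ValueError(f"unsupported source scheme: {parts.scheme!r}")
    return f"{scheme}://{parts.netloc}{parts.path}"


if __name__ == "__main__":
    documented = "swebhdfs://mydatalakestore.azuredatalakestore.net/myfolder/"
    # Print the same destination under each of the three schemes.
    for target in SCHEMES:
        print(with_scheme(documented, target))
```

The account name and folder are the placeholder values from the question. Whichever form results, it is passed to /dest the same way as in the documented example.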

Now don't get me started on why the Hadoop ecosystem is misusing the URI scheme to define operational semantics (I guess someone confused the HTTP protocol with the http URL scheme).