I am using Azure Databricks with Databricks Runtime 5.2 and Spark 2.4.0. I have set up external Hive tables in two different ways:
- a Databricks Delta table whose data is stored in Azure Data Lake Storage (ADLS) Gen 2; the table was created with a LOCATION setting pointing to a mounted directory in ADLS Gen 2
- a regular DataFrame saved as a table to ADLS Gen 2, this time not using the mount but instead the OAuth2 credentials I have set at the cluster level via spark.sparkContext.hadoopConfiguration
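Roughly like this (a minimal sketch; the table names, container, account, and paths are placeholders, and `df` stands for the DataFrame being saved):

```scala
// 1) Delta table created with a LOCATION pointing at the mounted directory
spark.sql("""
  CREATE TABLE delta_table
  USING DELTA
  LOCATION '/mnt/xxx/yyy/zzz'
""")

// 2) Regular DataFrame saved as an external table directly against ADLS Gen 2,
//    relying on the cluster-level OAuth2 credentials instead of the mount
df.write
  .option("path", "abfss://mycontainer@myaccount.dfs.core.windows.net/yyy/zzz")
  .saveAsTable("plain_table")
```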
Both the mount point and the direct access (hadoopConfiguration) have been configured with OAuth2 credentials and an Azure AD Service Principal that has the necessary access rights to the Data Lake.
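The configuration looks roughly like this (a sketch; the service principal values and the storage account are placeholders for my actual values):

```scala
// OAuth2 credentials for the Azure AD Service Principal (placeholders)
val configs = Map(
  "fs.azure.account.auth.type" -> "OAuth",
  "fs.azure.account.oauth.provider.type" ->
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azure.account.oauth2.client.id" -> "<application-id>",
  "fs.azure.account.oauth2.client.secret" -> "<client-secret>",
  "fs.azure.account.oauth2.client.endpoint" ->
    "https://login.microsoftonline.com/<tenant-id>/oauth2/token"
)

// The mount point used by the first table
dbutils.fs.mount(
  source = "abfss://mycontainer@myaccount.dfs.core.windows.net/",
  mountPoint = "/mnt/xxx",
  extraConfigs = configs
)

// The direct access used by the second table: the same settings applied
// at the cluster level
configs.foreach { case (key, value) =>
  spark.sparkContext.hadoopConfiguration.set(key, value)
}
```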
Both tables show up correctly in the Databricks UI and can be queried.
Both tables are also visible in a BI tool (Looker), where I have successfully configured a JDBC connection to my Databricks instance (its rough shape is sketched after the list below). After this the differences begin:
1) The table configured using the mount point does not allow me to run a DESCRIBE operation in the BI tool, let alone a query. Everything fails with the error "com.databricks.backend.daemon.data.common.InvalidMountException: Error while using path /mnt/xxx/yyy/zzz for resolving path '/yyy/zzz' within mount at '/mnt/xxx'."
2) The table configured without the mount point allows me to run a DESCRIBE operation, but a query fails with the error "java.util.concurrent.ExecutionException: java.io.IOException: There is no primary group for UGI (Basic token) (auth:SIMPLE)".
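For reference, the JDBC connection mentioned above has roughly this shape (a sketch assuming the Simba Spark JDBC driver that Databricks provides; the host, workspace id, cluster id, and token are placeholders, and the table names match the sketch above):

```scala
import java.sql.DriverManager

val url = "jdbc:spark://<region>.azuredatabricks.net:443/default;" +
  "transportMode=http;ssl=1;" +
  "httpPath=sql/protocolv1/o/<workspace-id>/<cluster-id>;" +
  "AuthMech=3;UID=token;PWD=<personal-access-token>"

val conn = DriverManager.getConnection(url)
val stmt = conn.createStatement()

stmt.executeQuery("DESCRIBE delta_table")      // fails as in 1) above
stmt.executeQuery("SELECT * FROM plain_table") // fails as in 2) above
```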
A JDBC connection and queries from the BI tool against a managed table in Databricks work fine.
As far as I know, there isn't anything I could configure differently when creating the external tables, the mount point, or the OAuth2 credentials. It seems to me that when going through JDBC, the mount is not visible at all, so the request to the underlying data source (ADLS Gen 2) cannot succeed. The second scenario (number 2 above) is more puzzling: it looks like something failing deep under the hood, and I have no idea what to do about it.
One peculiar thing is my username, which shows up in scenario 2. I don't know where it comes from, as the username is not involved anywhere when setting up the ADLS Gen 2 access with the Service Principal.