
As per the Databricks docs (https://docs.databricks.com/data/data-sources/azure/cosmosdb-connector.html), I've downloaded the latest azure-cosmosdb-spark library (azure-cosmosdb-spark_2.4.0_2.11-2.1.2-uber.jar) and placed it in the libraries location on DBFS.

When trying to write data from a DataFrame to a Cosmos DB container, I'm getting the error below. Any help would be much appreciated.

My Databricks Runtime version is 7.0 (includes Apache Spark 3.0.0, Scala 2.12).

Imports and code from the notebook:

import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}
import org.apache.spark.sql.types._
import org.apache.spark.sql.functions._
import com.microsoft.azure.cosmosdb.spark.schema._
import com.microsoft.azure.cosmosdb.spark.CosmosDBSpark
import com.microsoft.azure.cosmosdb.spark.config._

val dtcCANWrite = Config(Map(
  "Endpoint" -> "NOT DISPLAYED",
  "Masterkey" -> "NOT DISPLAYED",
  "Database" -> "NOT DISPLAYED",
  "Collection" -> "NOT DISPLAYED",
  "preferredRegions" -> "NOT DISPLAYED",
  "Upsert" -> "true"
))

distinctCANDF.write.mode(SaveMode.Append).cosmosDB(dtcCANWrite)

Error:

    at com.microsoft.azure.cosmosdb.spark.config.CosmosDBConfigBuilder.<init>(CosmosDBConfigBuilder.scala:31)
    at com.microsoft.azure.cosmosdb.spark.config.Config$.apply(Config.scala:259)
    at com.microsoft.azure.cosmosdb.spark.config.Config$.apply(Config.scala:240)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:7)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:69)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:71)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:73)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:75)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:77)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:79)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:81)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:83)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:85)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:87)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:89)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:91)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw.<init>(command-3649834446724317:93)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw.<init>(command-3649834446724317:95)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw.<init>(command-3649834446724317:97)
    at line6d80624d7a774601af6eb962eb59453253.$read.<init>(command-3649834446724317:99)
    at line6d80624d7a774601af6eb962eb59453253.$read$.<init>(command-3649834446724317:103)
    at line6d80624d7a774601af6eb962eb59453253.$read$.<clinit>(command-3649834446724317)
    at line6d80624d7a774601af6eb962eb59453253.$eval$.$print$lzycompute(<notebook>:7)
    at line6d80624d7a774601af6eb962eb59453253.$eval$.$print(<notebook>:6)
    at line6d80624d7a774601af6eb962eb59453253.$eval.$print(<notebook>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:745)
    at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1021)
    at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:574)
    at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:41)
    at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:37)
    at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
    at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:600)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:570)
    at com.databricks.backend.daemon.driver.DriverILoop.execute(DriverILoop.scala:215)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.$anonfun$repl$1(ScalaDriverLocal.scala:202)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:714)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:667)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:202)
    at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$10(DriverLocal.scala:396)
    at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:238)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:233)
    at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:230)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:49)
    at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:275)
    at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:268)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:49)
    at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:373)
    at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:653)
    at scala.util.Try$.apply(Try.scala:213)
    at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:645)
    at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:486)
    at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:598)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:391)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:337)
    at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:219)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: scala.Product$class
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader.loadClass(ClassLoaders.scala:151)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    at com.microsoft.azure.cosmosdb.spark.config.CosmosDBConfigBuilder.<init>(CosmosDBConfigBuilder.scala:31)
    at com.microsoft.azure.cosmosdb.spark.config.Config$.apply(Config.scala:259)
    at com.microsoft.azure.cosmosdb.spark.config.Config$.apply(Config.scala:240)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:7)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:69)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:71)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:73)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:75)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:77)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:79)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:81)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:83)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:85)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:87)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:89)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw$$iw.<init>(command-3649834446724317:91)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw$$iw.<init>(command-3649834446724317:93)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw$$iw.<init>(command-3649834446724317:95)
    at line6d80624d7a774601af6eb962eb59453253.$read$$iw.<init>(command-3649834446724317:97)
    at line6d80624d7a774601af6eb962eb59453253.$read.<init>(command-3649834446724317:99)
    at line6d80624d7a774601af6eb962eb59453253.$read$.<init>(command-3649834446724317:103)
    at line6d80624d7a774601af6eb962eb59453253.$read$.<clinit>(command-3649834446724317)
    at line6d80624d7a774601af6eb962eb59453253.$eval$.$print$lzycompute(<notebook>:7)
    at line6d80624d7a774601af6eb962eb59453253.$eval$.$print(<notebook>:6)
    at line6d80624d7a774601af6eb962eb59453253.$eval.$print(<notebook>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:745)
    at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1021)
    at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:574)
    at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:41)
    at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:37)
    at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
    at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:600)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:570)
    at com.databricks.backend.daemon.driver.DriverILoop.execute(DriverILoop.scala:215)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.$anonfun$repl$1(ScalaDriverLocal.scala:202)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:714)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:667)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:202)
    at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$10(DriverLocal.scala:396)
    at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:238)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:233)
    at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:230)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:49)
    at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:275)
    at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:268)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:49)
    at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:373)
    at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:653)
    at scala.util.Try$.apply(Try.scala:213)
    at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:645)
    at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:486)
    at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:598)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:391)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:337)
    at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:219)
    at java.lang.Thread.run(Thread.java:748)
The doc page you linked to states, explicitly: "You cannot access this data source from a cluster running Databricks Runtime 7.0 or above because an Azure Cosmos DB connector that supports Apache Spark 3.0 is not available." In your question, you mention that you are using Runtime v7.0. Seems that would be your issue, no? - David Makogon
Also: the code you wrote doesn't match the example in the doc you pointed to: you have no call to CosmosDBSpark.save(). Are you sure your syntax is correct? I don't see any equivalent syntax in the docs. - David Makogon
You have a version conflict here: the azure-cosmosdb-spark_2.4.0_2.11-2.1.2-uber.jar library you're using is built for Scala 2.11, but Databricks is running Scala 2.12. You'll have to either update the library version or downgrade the Spark/Scala version on Databricks. - Rayan Ral
Hello @DavidMakogon, Rayan, thanks for the suggestions. I've downgraded from Spark 3.0.0 with Scala 2.12 to Spark 2.4.4 with Scala 2.11, and everything works fine now. - chaitra k
Hi, I have updated the solution to this question in my answer. Can you mark it as the answer to close out this question? That may help others who run into a similar issue. You can also post your own answer and mark it instead; if so, let me know and I will delete mine. :) - Cindy Pau

1 Answer


Thanks to Rayan Ral's suggestion, the problem turned out to be a version conflict: the connector JAR is built for Scala 2.11, while Databricks Runtime 7.0 runs Scala 2.12, which is why class loading fails with ClassNotFoundException: scala.Product$class.
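The mismatch is visible in the JAR name itself: azure-cosmosdb-spark artifacts encode both the Spark and Scala versions they were built for. A small illustrative sketch of reading those fields off the file name (the parsing here is just for demonstration, not part of the connector):

```scala
// azure-cosmosdb-spark artifact names follow the pattern
//   azure-cosmosdb-spark_<sparkVersion>_<scalaVersion>-<connectorVersion>-uber.jar
val jar = "azure-cosmosdb-spark_2.4.0_2.11-2.1.2-uber.jar"

// Split off the fixed suffix, then the underscore-separated fields
val Array(_, sparkVersion, rest) = jar.stripSuffix("-uber.jar").split("_")
// rest is "<scalaVersion>-<connectorVersion>"; split on the first '-' only
val Array(scalaVersion, connectorVersion) = rest.split("-", 2)

println(s"built for Spark $sparkVersion, Scala $scalaVersion (connector $connectorVersion)")
// built for Spark 2.4.0, Scala 2.11 (connector 2.1.2)
```

The `_2.11` field must match the cluster's Scala version; on Runtime 7.0 (Scala 2.12) it does not, hence the error.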

The solution is to downgrade the version. In this case, downgrade the cluster from Spark 3.0.0 with Scala 2.12 to Spark 2.4.4 with Scala 2.11, which matches the Scala 2.11 build of the connector.
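To catch this kind of mismatch up front, you can check which Scala version the notebook is actually running before attaching a connector JAR (a minimal sketch; the commented `spark.version` line assumes an active SparkSession, as in a Databricks notebook):

```scala
// Scala version of the running REPL/notebook, e.g. "version 2.12.10".
// The connector JAR's _2.11 / _2.12 suffix must match this.
val runtimeScala = scala.util.Properties.versionString
println(runtimeScala)

// On a Databricks cluster with an active SparkSession:
// println(spark.version)  // e.g. "3.0.0" — the JAR's Spark version must match too
```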