I am using Apache Spark to write a few (<10) JSON documents into Cosmos DB as a proof of concept.

But I am getting the error below. Does anyone know how to resolve it?

Database: ProductRepo; collection: Products; partition key (shard key): productid. The documents are:

{"productName": "adipisicing mollit","productid": "39269afd-8139-42b8-ax2a-b46bd711392b","image": "https://picsum.photos/100/100/?random","category": "Shirts","brand": "Silica","styleId": 108897,"age": "0-24M"}
{"productName": "zerwtfsfsfs mollit","productid": "39269afd-8139-42b8-aa2a-b46bc711392b","image": "https://picsum.photos/100/100/?random","category": "Shirts","brand": "Blue","styleId": 108899,"age": "0-24M"}
{"productName": "sasasasasas 23iddt","productid": "39269afd-8139-43b8-aa2a-b46bc711392b","image": "https://picsum.photos/100/100/?random","category": "Shirts","brand": "Blue","styleId": 108899,"age": "0-24M"}

The exception thrown by the job is:

com.mongodb.MongoCommandException: Command failed with error 2: 'Shared throughput collection should have a partition key
ActivityId: cafefab3-0000-0000-0000-000000000000, Microsoft.Azure.Documents.Common/2.7.0' on server cdb-ms-prod-southcentralus1-fd10.documents.azure.com:10255. The full response is {"_t": "OKMongoResponse", "ok": 0, "code": 2, "errmsg": "Shared throughput collection should have a partition key\r\nActivityId: cafefab3-0000-0000-0000-000000000000, Microsoft.Azure.Documents.Common/2.7.0", "$err": "Shared throughput collection should have a partition key\r\nActivityId: cafefab3-0000-0000-0000-000000000000, Microsoft.Azure.Documents.Common/2.7.0"}
    at com.mongodb.internal.connection.ProtocolHelper.getCommandFailureException(ProtocolHelper.java:175)
    at com.mongodb.internal.connection.InternalStreamConnection.receiveCommandMessageResponse(InternalStreamConnection.java:303)
    at com.mongodb.internal.connection.InternalStreamConnection.sendAndReceive(InternalStreamConnection.java:259)
    at com.mongodb.internal.connection.UsageTrackingInternalConnection.sendAndReceive(UsageTrackingInternalConnection.java:99)
    at com.mongodb.internal.connection.DefaultConnectionPool$PooledConnection.sendAndReceive(DefaultConnectionPool.java:450)
    at com.mongodb.internal.connection.CommandProtocolImpl.execute(CommandProtocolImpl.java:72)
    at com.mongodb.internal.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:218)
    at com.mongodb.internal.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:269)
    at com.mongodb.internal.connection.DefaultServerConnection.command(DefaultServerConnection.java:131)
    at com.mongodb.operation.MixedBulkWriteOperation.executeCommand(MixedBulkWriteOperation.java:435)
    at com.mongodb.operation.MixedBulkWriteOperation.executeBulkWriteBatch(MixedBulkWriteOperation.java:261)
    at com.mongodb.operation.MixedBulkWriteOperation.access$700(MixedBulkWriteOperation.java:72)
    at com.mongodb.operation.MixedBulkWriteOperation$1.call(MixedBulkWriteOperation.java:205)
    at com.mongodb.operation.MixedBulkWriteOperation$1.call(MixedBulkWriteOperation.java:196)
    at com.mongodb.operation.OperationHelper.withReleasableConnection(OperationHelper.java:501)
    at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:196)
    at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:71)
    at com.mongodb.client.internal.MongoClientDelegate$DelegateOperationExecutor.execute(MongoClientDelegate.java:206)
    at com.mongodb.client.internal.MongoCollectionImpl.executeInsertMany(MongoCollectionImpl.java:524)
    at com.mongodb.client.internal.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:508)
    at com.mongodb.spark.MongoSpark$$anonfun$save$1$$anonfun$apply$1$$anonfun$apply$2.apply(MongoSpark.scala:119)
    at com.mongodb.spark.MongoSpark$$anonfun$save$1$$anonfun$apply$1$$anonfun$apply$2.apply(MongoSpark.scala:119)
    at scala.collection.Iterator$class.foreach(Iterator.scala:891)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
    at com.mongodb.spark.MongoSpark$$anonfun$save$1$$anonfun$apply$1.apply(MongoSpark.scala:119)
    at com.mongodb.spark.MongoSpark$$anonfun$save$1$$anonfun$apply$1.apply(MongoSpark.scala:118)
    at com.mongodb.spark.MongoConnector$$anonfun$withCollectionDo$1.apply(MongoConnector.scala:189)
    at com.mongodb.spark.MongoConnector$$anonfun$withCollectionDo$1.apply(MongoConnector.scala:187)
    at com.mongodb.spark.MongoConnector$$anonfun$withDatabaseDo$1.apply(MongoConnector.scala:174)
    at com.mongodb.spark.MongoConnector$$anonfun$withDatabaseDo$1.apply(MongoConnector.scala:174)
    at com.mongodb.spark.MongoConnector.withMongoClientDo(MongoConnector.scala:157)
    at com.mongodb.spark.MongoConnector.withDatabaseDo(MongoConnector.scala:174)
    at com.mongodb.spark.MongoConnector.withCollectionDo(MongoConnector.scala:187)
    at com.mongodb.spark.MongoSpark$$anonfun$save$1.apply(MongoSpark.scala:118)
    at com.mongodb.spark.MongoSpark$$anonfun$save$1.apply(MongoSpark.scala:117)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:123)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

2 Answers


It seems you have not specified a shard key, hence the error. I was facing the same issue. After some head-scratching, I checked the Azure portal and tried to create the collection manually; there I noticed that a "Shard key" field was required. Try creating a new collection with a shard key.
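
If you would rather create it from code than from the portal, here is a hedged sketch using the MongoDB Java driver and the Cosmos DB API for MongoDB "CreateCollection" extension command, assuming productid is the intended shard key (the connection string is a placeholder):

    import com.mongodb.client.MongoClients
    import org.bson.Document

    object CreateShardedCollection {
      def main(args: Array[String]): Unit = {
        // Cosmos DB (MongoDB API) connection string -- placeholder values
        val client = MongoClients.create(
          "mongodb://<account>:<key>@<account>.documents.azure.com:10255/?ssl=true&replicaSet=globaldb")
        val db = client.getDatabase("ProductRepo")

        // Cosmos DB extension command: create the collection with "productid"
        // as its shard (partition) key before any documents are inserted.
        val command = new Document("customAction", "CreateCollection")
          .append("collection", "Products")
          .append("shardKey", "productid")
        println(db.runCommand(command).toJson)

        client.close()
      }
    }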


I believe the error message is fairly straightforward: you need to define a partition key for your collection. If you did not set a partition key when you created the collection, you will need to create a new collection that has one and migrate the data into it.
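
A hedged sketch of that migration with the MongoDB Spark connector, assuming a new collection named ProductsSharded has already been created with productid as its shard key (the connection string is a placeholder):

    import com.mongodb.spark.MongoSpark
    import org.apache.spark.sql.SparkSession

    object MigrateProducts {
      def main(args: Array[String]): Unit = {
        // Placeholder Cosmos DB (MongoDB API) connection string
        val uri = "mongodb://<account>:<key>@<account>.documents.azure.com:10255/?ssl=true&replicaSet=globaldb"

        val spark = SparkSession.builder()
          .appName("migrate-products")
          .config("spark.mongodb.input.uri", uri)
          .config("spark.mongodb.input.database", "ProductRepo")
          .config("spark.mongodb.input.collection", "Products")         // old collection without a partition key
          .config("spark.mongodb.output.uri", uri)
          .config("spark.mongodb.output.database", "ProductRepo")
          .config("spark.mongodb.output.collection", "ProductsSharded") // new collection created with a shard key
          .getOrCreate()

        // Copy every document from the old collection into the partitioned one
        val products = MongoSpark.load(spark)
        MongoSpark.save(products)
      }
    }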