There is no out-of-the-box solution for cloning data from one Azure Event Hub to another. What are the possible options to achieve this?
1 Answer
One simple option for duplicating an Azure Event Hub stream is to write a clone job in PySpark. Read the stream from your source Event Hub, select the body (and, if relevant for your scenario, the properties) from the source streaming DataFrame, and write that stream to your target Event Hub:
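# ehSource / ehTarget: option dicts for the source and target Event Hubs
# checkploc: path where Spark stores the streaming checkpoint
# start() returns a StreamingQuery handle, not a DataFrame
query = spark \
    .readStream \
    .format("eventhubs") \
    .options(**ehSource) \
    .load() \
    .select("properties", "body") \
    .writeStream \
    .format("eventhubs") \
    .options(**ehTarget) \
    .option("checkpointLocation", checkploc) \
    .start()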
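The snippet above leaves `ehSource`, `ehTarget` and `checkploc` undefined. Here is a minimal sketch of how they might be built, assuming the open-source azure-event-hubs-spark connector and its `eventhubs.connectionString` and `eventhubs.startingPosition` option keys. The helper `build_eh_options`, the placeholder connection strings and the checkpoint path are all hypothetical and only illustrate the shape of the configuration:

```python
import json


def build_eh_options(connection_string, encrypt=None, from_start=False):
    """Build an options dict for the "eventhubs" Spark source or sink.

    Hypothetical helper: `encrypt` is an optional callable applied to the
    connection string, `from_start` asks the source to read from the
    beginning of the stream.
    """
    conn = encrypt(connection_string) if encrypt else connection_string
    options = {"eventhubs.connectionString": conn}
    if from_start:
        # Assumption: offset "-1" denotes the start of the stream.
        position = {
            "offset": "-1",
            "seqNo": -1,
            "enqueuedTime": None,
            "isInclusive": True,
        }
        options["eventhubs.startingPosition"] = json.dumps(position)
    return options


# Placeholder values, not real namespaces or credentials.
SOURCE_CONN = (
    "Endpoint=sb://SOURCE_NAMESPACE.servicebus.windows.net/;"
    "SharedAccessKeyName=KEY_NAME;SharedAccessKey=KEY;EntityPath=SOURCE_HUB"
)
TARGET_CONN = (
    "Endpoint=sb://TARGET_NAMESPACE.servicebus.windows.net/;"
    "SharedAccessKeyName=KEY_NAME;SharedAccessKey=KEY;EntityPath=TARGET_HUB"
)

ehSource = build_eh_options(SOURCE_CONN, from_start=True)
ehTarget = build_eh_options(TARGET_CONN)
checkploc = "/tmp/eventhub-clone/checkpoint"  # hypothetical checkpoint path
```

Recent connector versions reportedly expect the connection string to be encrypted first. On a cluster with the connector installed, that would typically mean passing `encrypt=sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt`, but check this against the connector version you are using.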