0
votes

I am trying to ingest some data into ADX but I don't see any data appearing:

  • 40 parquet files (ranging from 1 MB to 550 MB, 8 GB in total)
  • From blob storage using Event Grid
  • Running on D11 V2 cluster tier with auto-scale
  • Ingestion utilization stays at 100% for 2 days, then drops to 0%
  • Ingestion latency rises to a maximum of 24h, then drops
  • Count of rows is always 0, database size does not increase
  • Operations log shows a lot of failures: "Operation": DataIngestPull, "The admin command execution timed out at '2020-09-08T06:39:18.1115065Z'" etc.
  • Diagnostic logging also shows failures: FailedIngestion, Blob has exceeded the '2.00:00:00' retry period or '10' retry attempts, BadRequest_MessageExhausted
  • When I ingest one small file it works and the data shows up

The worst part is that I cannot cancel the ingestion and have to wait 2 days. Is there a way to cancel?

How can I successfully ingest this data? Is it supposed to take this long?

1 Answer

1
votes

It looks like the ingestion is timing out because each ingestion batch is too big. The best way to resolve this would be to add the raw data size of the blob (an approximate size is fine) to the blob metadata, as explained here. Alternatively, you can try reducing the batch size in the database/table batching policy, as explained here (you can start by reducing from 1GB to 500MB, and reduce further if that is not sufficient).
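
As a rough illustration of the two suggestions above, here is a minimal Python sketch. It only builds the values you would send; it does not talk to Azure. The `rawSizeBytes` metadata key and the `ingestionbatching` policy fields come from the ADX documentation as far as I know, but the helper names, the table name `MyTable`, the expansion ratio, and the item/timespan values are assumptions made up for this example, so adjust them to your setup.

```python
import json


def estimate_raw_size(blob_size_bytes: int, expansion_ratio: float) -> int:
    """Approximate the uncompressed size of a parquet blob.

    expansion_ratio is a guess you supply yourself (hypothetical value,
    not something ADX provides); an approximate size is good enough.
    """
    if blob_size_bytes <= 0 or expansion_ratio <= 0:
        raise ValueError("sizes and ratio must be positive")
    return int(blob_size_bytes * expansion_ratio)


def raw_size_metadata(raw_size_bytes: int) -> dict:
    """Blob metadata entry that tells ADX the raw data size.

    Blob metadata values are strings, so the number is stringified.
    """
    return {"rawSizeBytes": str(raw_size_bytes)}


def batching_policy_command(table: str, max_raw_mb: int,
                            max_items: int, max_timespan: str) -> str:
    """Build a control command that changes the table's batching policy."""
    policy = {
        "MaximumBatchingTimeSpan": max_timespan,
        "MaximumNumberOfItems": max_items,
        "MaximumRawDataSizeMB": max_raw_mb,
    }
    return f".alter table {table} policy ingestionbatching @'{json.dumps(policy)}'"


# Example: a 550 MB parquet file, assuming it expands roughly 4x when decoded.
metadata = raw_size_metadata(estimate_raw_size(550 * 1024 * 1024, 4.0))
print(metadata)

# Example: lower the batch size from 1GB to 500MB (other values are placeholders).
print(batching_policy_command("MyTable", 500, 500, "00:05:00"))
```

The metadata would most likely need to be set when the blob is uploaded, since Event Grid ingestion is triggered by the blob creation event. The generated command can be run in the ADX query window against the target database.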