I'm running a Kafka Connect distributed setup.
I was testing with a single machine/process setup (still in distributed mode), which worked fine. Now I'm running 3 nodes (with 3 Connect processes). The logs contain no errors, but when I submit an S3 connector request through the REST API, it returns:

{"error_code":409,"message":"Cannot complete request because of a conflicting operation (e.g. worker rebalance)"}
When I stop the Kafka Connect process on one of the nodes, I can submit the connector and everything runs fine.
I have 3 brokers in my cluster, and the topic has 32 partitions.
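For diagnosis, it can help to confirm that every worker's REST endpoint is reachable from the others, since workers forward REST requests among themselves. A minimal sketch (the hostnames `connect-1`/`connect-2`/`connect-3` and port 8083 are placeholders, not from the question):

```shell
# Hypothetical worker hostnames; substitute the real ones.
WORKERS="connect-1 connect-2 connect-3"
status=""
for w in $WORKERS; do
  # The root endpoint of a Connect worker returns its version info; a
  # worker that cannot be reached here is a likely rebalance culprit.
  if curl -s --max-time 5 "http://$w:8083/" > /dev/null; then
    status="$status$w: reachable
"
  else
    status="$status$w: UNREACHABLE
"
  fi
done
printf "%s" "$status"
```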
This is the connector I'm trying to launch:
{
  "name": "s3-sink-new-2",
  "config": {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "tasks.max": "32",
    "topics": "rawEventsWithoutAttribution5",
    "s3.region": "us-east-1",
    "s3.bucket.name": "dy-raw-collection",
    "s3.part.size": "64000000",
    "flush.size": "10000",
    "storage.class": "io.confluent.connect.s3.storage.S3Storage",
    "format.class": "io.confluent.connect.s3.format.avro.AvroFormat",
    "schema.generator.class": "io.confluent.connect.storage.hive.schema.DefaultSchemaGenerator",
    "partitioner.class": "io.confluent.connect.storage.partitioner.TimeBasedPartitioner",
    "partition.duration.ms": "60000",
    "path.format": "'year'=YYYY/'month'=MM/'day'=dd/'hour'=HH",
    "locale": "US",
    "timezone": "GMT",
    "timestamp.extractor": "RecordField",
    "timestamp.field": "procTimestamp",
    "name": "s3-sink-new-2"
  }
}
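For reference, a config like the one above can be syntax-checked locally and then submitted to any one worker's REST endpoint along these lines (the config is abbreviated here, and the worker hostname `connect-worker-1` and port 8083 are placeholders):

```shell
# Write the connector request to a file (abbreviated; use the full
# config from the question in practice).
cat > /tmp/s3-sink-new-2.json <<'EOF'
{
  "name": "s3-sink-new-2",
  "config": {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "tasks.max": "32",
    "topics": "rawEventsWithoutAttribution5"
  }
}
EOF

# Catch JSON syntax errors (e.g. bad quote escaping) before the REST call.
python3 -m json.tool /tmp/s3-sink-new-2.json > /dev/null && echo "JSON OK"

# POST to any one worker; a 409 response means the group is mid-rebalance.
# curl -s -X POST -H "Content-Type: application/json" \
#      --data @/tmp/s3-sink-new-2.json http://connect-worker-1:8083/connectors
```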
Nothing in the logs indicates a problem, and I'm really lost here.
Comments:

– OneCricketeer: Sounds like rest.advertised.host.name cannot be resolved over each worker. Check rest.port as well.

– OneCricketeer: But the rebalancing and REST requests communicate between the workers directly, not just through Kafka, from what I've noticed. If this isn't the issue, the consumer group is literally rebalancing and there might be other instabilities in the cluster, not just Connect.
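Following up on the comment, a sketch of the worker-side REST settings to check in each worker's connect-distributed.properties (the hostname value is an example, not from the question):

```properties
# Port the worker's embedded REST server listens on (default 8083).
rest.port=8083
# Hostname this worker advertises to its peers; it must resolve from
# every other worker, or forwarded REST requests fail with errors
# like the 409 "conflicting operation" above during rebalances.
rest.advertised.host.name=connect-worker-1
rest.advertised.port=8083
```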