1
votes

I'm trying to push around 2797 partitions on 467 topics to snowflake via the kafka connector, and the (2 task) kafka connector bombs out with:

    [2020-03-08 10:13:10,884] ERROR WorkerSinkTask{id=snowflake-1} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
    at java.base/java.lang.Thread.start0(Native Method)
    at java.base/java.lang.Thread.start(Thread.java:803)
    at java.base/java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:937)
    at java.base/java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1583)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:346)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.scheduleAtFixedRate(ScheduledThreadPoolExecutor.java:632)
    at net.snowflake.ingest.connection.SecurityManager.<init>(SecurityManager.java:105)
    at net.snowflake.ingest.connection.SecurityManager.<init>(SecurityManager.java:120)
    at net.snowflake.ingest.connection.RequestBuilder.<init>(RequestBuilder.java:216)
    at net.snowflake.ingest.SimpleIngestManager.<init>(SimpleIngestManager.java:353)
    at com.snowflake.kafka.connector.internal.SnowflakeIngestionServiceV1.<init>(SnowflakeIngestionServiceV1.java:41)
    at com.snowflake.kafka.connector.internal.SnowflakeIngestionServiceFactory$SnowflakeIngestionServiceBuilder.<init>(SnowflakeIngestionServiceFactory.java:35)
    at com.snowflake.kafka.connector.internal.SnowflakeIngestionServiceFactory$SnowflakeIngestionServiceBuilder.<init>(SnowflakeIngestionServiceFactory.java:25)
    at com.snowflake.kafka.connector.internal.SnowflakeIngestionServiceFactory.builder(SnowflakeIngestionServiceFactory.java:18)
    at com.snowflake.kafka.connector.internal.SnowflakeConnectionServiceV1.buildIngestService(SnowflakeConnectionServiceV1.java:681)
    at com.snowflake.kafka.connector.internal.SnowflakeSinkServiceV1$ServiceContext.<init>(SnowflakeSinkServiceV1.java:271)
    at com.snowflake.kafka.connector.internal.SnowflakeSinkServiceV1$ServiceContext.<init>(SnowflakeSinkServiceV1.java:232)
    at com.snowflake.kafka.connector.internal.SnowflakeSinkServiceV1.startTask(SnowflakeSinkServiceV1.java:70)
    at com.snowflake.kafka.connector.SnowflakeSinkTask.lambda$open$2(SnowflakeSinkTask.java:168)
    at java.base/java.lang.Iterable.forEach(Iterable.java:75)
    at com.snowflake.kafka.connector.SnowflakeSinkTask.open(SnowflakeSinkTask.java:168)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.openPartitions(WorkerSinkTask.java:586)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.access$1100(WorkerSinkTask.java:67)
    at org.apache.kafka.connect.runtime.WorkerSinkTask$HandleRebalance.onPartitionsAssigned(WorkerSinkTask.java:651)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:285)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:424)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:358)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:353)
    at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1251)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1216)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1201)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.pollConsumer(WorkerSinkTask.java:443)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:316)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:224)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:192)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
[2020-03-08 10:13:10,885] ERROR WorkerSinkTask{id=snowflake-1} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask)

My VM seems ok resourcewise, hence I suspect an interal snowpipe limit spitting at me; any other ideas?

1
Welcome to stackoverflow! Are you sure your VM is ok resourcewise? The error message java.lang.OutOfMemoryError implies the machine is low on RAM.Hami

1 Answers

0
votes

Kafka connect defaults to 2G of heap space.

If you want to use one VM for so many topics and partitions, that seems like a bad idea.

Create a Connect cluster. Distribute the workload