0
votes

I have set up NiFi (1.11.4) & Kafka(2.5) via docker (docker-compose file below, actual NiFi flow definition https://github.com/geoHeil/streaming-reference). When trying to follow up on basic getting started tutorials (such as https://towardsdatascience.com/big-data-managing-the-flow-of-data-with-apache-nifi-and-apache-kafka-af674cd8f926) which combine processors such as:

  • generate flowfile (CSV)
  • update attribute
  • PublishKafka2.0

I run into issues of timeoutException:

nifi_1            | 2020-06-10 11:15:47,311 ERROR [kafka-producer-network-thread | producer-2] o.a.n.p.k.pubsub.PublishKafkaRecord_2_0 PublishKafkaRecord_2_0[id=959f0e64-0172-1000-0000-0000650181a4] Failed to send StandardFlowFileRecord[uuid=944464e4-94ea-48dc-89fa-d19c34f163e7,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1591771086044-1, container=default, section=1], offset=227962, length=422],offset=0,name=c8dd1dd2-0ffe-4875-9d45-902ea331c210,size=422] to Kafka: org.apache.kafka.common.errors.TimeoutException: Expiring 64 record(s) for test-0: 30029 ms has passed since batch creation plus linger time
nifi_1            | org.apache.kafka.common.errors.TimeoutException: Expiring 64 record(s) for test-0: 30029 ms has passed since batch creation plus linger time
nifi_1            | 2020-06-10 11:15:47,311 ERROR [kafka-producer-network-thread | producer-2] o.a.n.p.k.pubsub.PublishKafkaRecord_2_0 PublishKafkaRecord_2_0[id=959f0e64-0172-1000-0000-0000650181a4] Failed to send StandardFlowFileRecord[uuid=944464e4-94ea-48dc-89fa-d19c34f163e7,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1591771086044-1, container=default, section=1], offset=227962, length=422],offset=0,name=c8dd1dd2-0ffe-4875-9d45-902ea331c210,size=422] to Kafka: org.apache.kafka.common.errors.TimeoutException: Expiring 64 record(s) for test-0: 30029 ms has passed since batch creation plus linger time
nifi_1            | org.apache.kafka.common.errors.TimeoutException: Expiring 64 record(s) for test-0: 30029 ms has passed since batch creation plus linger time

However, a:

kafkacat -C -b localhost:9092 -t test #starts listener
kafkacat -P -b localhost:9092 -t test #starts producer

pipes events just fine through the kafka instance.

The docker-compose file looks like:

version: "3"
services:
  nifi:
    image: apache/nifi:1.11.4
    ports:
      - 8080:8080 # Unsecured HTTP Web Port
    environment:
      - NIFI_WEB_HTTP_PORT=8080
      - NIFI_CLUSTER_IS_NODE=true
      - NIFI_CLUSTER_NODE_PROTOCOL_PORT=8082
      - NIFI_ZK_CONNECT_STRING=zookeeper:2181
      - NIFI_ELECTION_MAX_WAIT=1 min
    links:
      - broker
      - zookeeper
    volumes:
      - ./for_nifi/conf:/opt/nifi/nifi-current/conf
  zookeeper: #https://github.com/confluentinc/cp-all-in-one/blob/5.5.0-post/cp-all-in-one/docker-compose.yml#L5
    image: confluentinc/cp-zookeeper:5.5.0
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  broker:
    image: confluentinc/cp-kafka:5.5.0
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "29092:29092"
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
1
What host&port are you using in NiFi to connect to Kafka? - Robin Moffatt
I use the broker:9092, i.e. the DNS of docker - Georg Heiler

1 Answers

2
votes

You're using the wrong port to connect to the broker. By connecting to 9092 you connect to the listener that advertises localhost:9092 to the client for subsequent connections. That's why it works when you use kafkacat from your local machine (because 9092 is exposed to your local machine)

If you use broker:29092 then the broker will give the client the correct address for the connection (i.e. broker:29092).

To understand more about advertised listeners see this blog