
Setup

We have a 3-node Kafka cluster processing messages that come in through nginx. nginx hands each message off to a PHP script, which in turn forks a Python process and calls KafkaClient, SimpleProducer & send_messages.

ZooKeeper is running on the same hosts as Kafka; nginx is on a separate host. Ports 2181, 2888, 3888 and 9092 are all open. No errors are seen when starting ZooKeeper or Kafka. All of this is set up on AWS, in the same VPC.
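A quick way to rule out basic connectivity problems from the nginx host (a minimal sketch; `port_open` is a hypothetical helper written for illustration, and the IPs/ports are the ones from this setup):

```python
import socket

def port_open(host, port, timeout=3.0):
    # True if a TCP connection to host:port succeeds within `timeout` seconds.
    try:
        sock = socket.create_connection((host, port), timeout)
        sock.close()
        return True
    except (socket.error, socket.timeout):
        return False

# e.g. from the nginx host:
#   port_open("172.31.41.78", 2181)  # zookeeper
#   port_open("172.31.41.78", 9092)  # kafka broker
```

Note this only proves the security groups and listeners are right; it says nothing about per-process restrictions on the web-server user, which do not apply when testing from an interactive shell.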

Kafka & ZooKeeper run as the kafka user, nginx runs as nginx, and php-fpm runs as apache.

Versions

Kafka: 0.8.2
Python: 2.7.5

Relevant snippets from property files.

zookeeper.properties

dataDir=/data/zookeeper
clientPort=2181
maxClientCnxns=100
tickTime=2000
initLimit=5
syncLimit=2
server.1=172.31.41.78:2888:3888
server.2=172.31.45.245:2888:3888
server.3=172.31.23.101:2888:3888

producer.properties

metadata.broker.list=172.31.41.78:9092,172.31.45.245:9092,172.31.23.101:9092

consumer.properties

zookeeper.connect=172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181
zookeeper.connection.timeout.ms=6000

server.properties (setup with appropriate IP on other machines)

port=9092
advertised.host.name=172.31.41.78
host.name=172.31.41.78
zookeeper.connect=172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181
zookeeper.connection.timeout.ms=60000

php code


function sendDataToKafka($_data, $_srcType) {
    try {
        $pyKafka = "/usr/bin/python /etc/nginx/html/postMon.py " . $_srcType;
        $dspec = array(
            0 => array("pipe", "r"),              // child stdin
            1 => array("pipe", "w"),              // child stdout
            2 => array("file", "/dev/null", "a")  // child stderr is discarded
        );

        $process = proc_open($pyKafka, $dspec, $pipes);
        if (is_resource($process)) {
            if (fwrite($pipes[0], $_data) == true) {
                fclose($pipes[0]);
                echo stream_get_contents($pipes[1]);
                fclose($pipes[1]);
                proc_close($process);
                echo "Process completed";
            }
        }
    } catch (Exception $e) {
        echo $e->getMessage();
    }
}

python code


import sys, json, time, ConfigParser
import traceback
sys.path.append("/etc/nginx/html/kafka_python-0.9.4-py2.7.egg")
from kafka import KafkaClient, SimpleProducer

try:
    srcMap = {
        'Alert': 'alerts'
    }
    topic = srcMap.get(sys.argv[1], 'events')

    data = 'Testing static Kafka message'
    print 'Host: 172.31.23.101:9092'

    kafka = KafkaClient("172.31.23.101:9092")
    producer = SimpleProducer(kafka, random_start=True)
    producer.send_messages(topic, data)
except Exception as e:  # most generic exception you can catch
    print str(e)
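One debugging aid worth adding here: the PHP wrapper sends this script's stderr to /dev/null, so any exception text printed there is lost. A hedged sketch (the log path `/tmp/postMon.err` and the `log_failure` helper are made up for illustration) that uses the already-imported `traceback` module to persist the full traceback:

```python
import traceback

def log_failure(path="/tmp/postMon.err"):
    # Append the current exception's traceback to `path`, so it survives
    # even when the parent process discards stderr.
    with open(path, "a") as f:
        traceback.print_exc(file=f)

try:
    raise RuntimeError("stand-in for a failing KafkaClient(...) call")
except Exception:
    log_failure()
```

Calling `log_failure()` inside the script's except block would show exactly where the Kafka call dies when run under php-fpm.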

Scenarios

Scenario 1:

Running a console producer in one shell:

bin/kafka-console-producer.sh --zookeeper 172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181 --topic alerts

and a console consumer in another:

./kafka-console-consumer.sh --zookeeper 172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181 --topic alerts

we are able to view messages.

Scenario 2:

Running the Python code from the command line (on the nginx host), we are able to view messages from the consumer.

Scenario 3:

Running the PHP code from the command line (on the nginx host), we are able to view messages from the consumer.

Scenario 4:

Calling the REST URL from a REST client (such as Postman) or curl, we get the following response:

<html>
<body>
    Host: 172.31.23.101:9092
    All servers failed to process request
    <pre>Process completed</pre></body>
</html>

This shows that traffic reaches nginx and that nginx executes the PHP & Python scripts, but the request errors out at the first call to Kafka (the KafkaClient constructor). Somehow the Python process is unable to reach Kafka.

We don't know whether this is a user-permission issue or a silly config mistake.

Also:

  1. We have a similar working setup in another vpc
  2. The security groups, config files, codebase properties etc. are consistent
  3. Upgrade options are not a possibility in near term

Any pointers/help/fresh pair of eyes would really help us get going.

Thanks !


1 Answer


Finally figured out that the apache user did not have the "right" SELinux permissions.

Running selinuxconlist apache helped us fix the issue.
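For anyone landing on the same symptom (works from an interactive shell, fails only under php-fpm/nginx), a common SELinux check is whether httpd-domain processes may open outbound sockets. A hedged sketch of the commands (assumes an SELinux-enabled RHEL/CentOS-style host; the boolean name shown is the common one, so verify it against your policy):

```shell
# Print the current SELinux mode, with a fallback on hosts without the tooling:
command -v getenforce >/dev/null 2>&1 && getenforce || echo "SELinux tools not found"

# May httpd-family processes (php-fpm runs in this domain) open outbound sockets?
#   getsebool httpd_can_network_connect
#
# If it reports "off", the usual persistent fix is:
#   setsebool -P httpd_can_network_connect 1
#
# Recent denials, if any, appear in the audit log:
#   ausearch -m avc -ts recent
```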