Setup
We have a 3 node kafka cluster, processing messages coming in through nginx. The nginx hands it off to a php which in turn forks a python process and calls the KafkaClient, SimpleProducer & Send_Message
The zookeeper is running on the same host as kafka, nginx is on a separate host. The ports 2181, 2182, 3888, 9092 are all open. No errors seen in starting zookeeper, kafka. All this setup is on AWS in the same vpc.
Kafka & Zookeeper is running as kafka user, Nginx is running as nginx, php-fpm running as apache
Versions
Kafka: 0.8.2 Python: 2.7.5
Relevant snippets from property files.
zookeeper.properties
dataDir=/data/zookeeper
clientPort=2181
maxClientCnxns=100
tickTime=2000
initLimit=5
syncLimit=2
server.1=172.31.41.78:2888:3888
server.2=172.31.45.245:2888:3888
server.3=172.31.23.101:2888:3888
producer.properties
metadata.broker.list=172.31.41.78:9092,172.31.45.245:9092,172.31.23.101:9092
consumer.properties
zookeeper.connect=172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181
zookeeper.connection.timeout.ms=6000
server.properties (setup with appropriate IP on other machines)
port=9092
advertised.host.name=172.31.41.78
host.name=172.31.41.78
zookeeper.connect=172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181
zookeeper.connection.timeout.ms=60000
php code
function sendDataToKafka($_data,$_srcType) {
try{
$pyKafka = "/usr/bin/python /etc/nginx/html/postMon.py ".$_srcType;
$dspec = array(
0 => array("pipe","r"),
1 => array("pipe","w"),
2 => array("file","/dev/null", "a")
);
$process = proc_open($pyKafka,$dspec,$pipes);
if (is_resource($process)) {
if(fwrite($pipes[0],$_data) == true){
fclose($pipes[0]);
echo stream_get_contents($pipes[1]);
fclose($pipes[1]);
proc_close($process);
echo "Process completed";
python code
import sys,json,time,ConfigParser
import traceback
sys.path.append("/etc/nginx/html/kafka_python-0.9.4-py2.7.egg")
from kafka import KafkaClient,SimpleProducer
try:
srcMap = {
'Alert' : 'alerts'
}
topic = srcMap.get(sys.argv[1],'events')
data = ''
data = 'Testing static Kafka message'
print 'Host: 172.31.23.101:9092'
kafka = KafkaClient("172.31.23.101:9092")
producer = SimpleProducer(kafka,random_start=True)
producer.send_messages(topic,data);
except Exception as e: # most generic exception you can catch
print str(e)
Scenarios
Scenario 1:
Running a
bin/kafka-console-producer.sh --zookeeper 172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181 --topic alerts
on 1 shell
and
running
./kafka-console-consumer.sh --zookeeper 172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181 --topic alerts
we are able to view messages
Scenario 2:
Running the python code command line (from the nginx host), able to view messages from the consumer
Scenario 3:
Running the php code command line (from nginx host), able to view messages from the consumer
Scenario 4:
Running from a REST client (as POSTMAN)/ CURL using the REST URL, get the following message:
<html>
<body>
Host: 172.31.23.101:9092
All servers failed to process request
<pre>Process completed</pre></body>
<html>
This shows, traffic going to nginx, nginx executing the php & python scripts, but erroring out when the first call to Kafka is made - KafkaClient happens. Somehow the python is unable to access Kafka.
Don't know if this is user permission/ silly config mistake.
Also......
- We have a similar working setup in another vpc
- The security groups, config files, codebase properties etc. are consistent
- Upgrade options are not a possibility in near term
Any pointers/help/fresh pair of eyes would really help us going.
Thanks !