So here's the big picture: my objective is to index large amounts of (.txt) data using the ELK stack + filebeat.
Basically, my problem is that filebeat seems to be unable to send logs to logstash. My guess is, some docker networking config is off...
The code for my project is available at https://github.com/mhyousefi/elk-docker.
THE ELK CONTAINER
To do so, I have one docker-compose.yml
to run a container from the image sebp/elk
, which looks like this:
version: '2'
services:
elk:
container_name: elk
image: sebp/elk
ports:
- "5601:5601"
- "9200:9200"
- "5045:5044"
volumes:
- /path/to/volumed-folder:/logstash
networks:
- elk_net
networks:
elk_net:
driver: bridge
Once the container is created, I go to the container bash terminal and run the command:
/opt/logstash/bin/logstash --path.data /tmp/logstash/data -f /logstash/config/filebeat-config.conf
Running this command, I get the following logs and it will then just start waiting without printing any further logs:
$ /opt/logstash/bin/logstash --path.data /tmp/logstash/data -f /logstash/config/filebeat-config.conf
Sending Logstash's logs to /opt/logstash/logs which is now configured via log4j2.properties
[2018-08-14T11:51:11,693][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.queue", :path=>"/tmp/logstash/data/queue"}
[2018-08-14T11:51:11,701][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.dead_letter_queue", :path=>"/tmp/logstash/data/dead_letter_queue"}
[2018-08-14T11:51:12,194][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2018-08-14T11:51:12,410][INFO ][logstash.agent ] No persistent UUID file found. Generating new UUID {:uuid=>"3646b6e4-d540-4c9c-a38d-2769aef5a05e", :path=>"/tmp/logstash/data/uuid"}
[2018-08-14T11:51:13,089][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.3.2"}
[2018-08-14T11:51:15,554][INFO ][logstash.pipeline ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>6, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2018-08-14T11:51:16,088][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2018-08-14T11:51:16,101][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[2018-08-14T11:51:16,291][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2018-08-14T11:51:16,391][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
[2018-08-14T11:51:16,398][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>6}
[2018-08-14T11:51:16,460][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost:9200"]}
[2018-08-14T11:51:16,515][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2018-08-14T11:51:16,559][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2018-08-14T11:51:16,688][INFO ][logstash.outputs.elasticsearch] Installing elasticsearch template to _template/logstash
[2018-08-14T11:51:16,899][INFO ][logstash.inputs.beats ] Beats inputs: Starting input listener {:address=>"0.0.0.0:5045"}
[2018-08-14T11:51:16,925][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x54ab986e run>"}
[2018-08-14T11:51:17,170][INFO ][org.logstash.beats.Server] Starting server on port: 5045
[2018-08-14T11:51:17,187][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2018-08-14T11:51:17,637][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9601}
Now, this is what the filebeat-config.conf
looks like:
input {
beats {
port => "5044"
}
}
output {
elasticsearch {
hosts => [ "localhost:9200" ]
index => "%{[@metadata][beat]}"
}
}
THE FILEBEAT CONTAINER
My filebeat
container is created using the bellow docker-compose.yml
file:
version: "2"
services:
filebeat:
container_name: filebeat
hostname: filebeat
image: docker.elastic.co/beats/filebeat:6.3.0
user: root
# command: ./filebeat -c /usr/share/filebeat-volume/config/filebeat.yml -E name=mybeat
volumes:
# "volumed-folder" lies under ${PROJECT_DIR}/filebeat or could be anywhere else you wish
- /path/to/volumed-folder:/usr/share/filebeat/filebeat-volume:ro
networks:
- filebeat_net
networks:
filebeat_net:
external: true
Once the container is created, I go to the container bash terminal, replace the existing filebeat.yml
under /usr/share/filebeat
with the one I have volumed, and run the command:
./filebeat -e -c ./filebeat.yml -E name="mybeat"
The terminal immediately displays the following logs:
root@filebeat filebeat]# ./filebeat -e -c ./filebeat.yml -E name="mybeat"
2018-08-14T12:13:16.325Z INFO instance/beat.go:492 Home path: [/usr/share/filebeat] Config path: [/usr/share/filebeat] Data path: [/usr/share/filebeat/data] Logs path: [/usr/share/filebeat/logs]
2018-08-14T12:13:16.325Z INFO instance/beat.go:499 Beat UUID: 3b4b3897-ef77-43ad-b982-89e8f690a96e
2018-08-14T12:13:16.325Z INFO [beat] instance/beat.go:716 Beat info {"system_info": {"beat": {"path": {"config": "/usr/share/filebeat", "data": "/usr/share/filebeat/data", "home": "/usr/share/filebeat", "logs": "/usr/share/filebeat/logs"}, "type": "filebeat", "uuid": "3b4b3897-ef77-43ad-b982-89e8f690a96e"}}}
2018-08-14T12:13:16.325Z INFO [beat] instance/beat.go:725 Build info {"system_info": {"build": {"commit": "a04cb664d5fbd4b1aab485d1766f3979c138fd38", "libbeat": "6.3.0", "time": "2018-06-11T22:34:44.000Z", "version": "6.3.0"}}}
2018-08-14T12:13:16.325Z INFO [beat] instance/beat.go:728 Go runtime info {"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":6,"version":"go1.9.4"}}}
2018-08-14T12:13:16.327Z INFO [beat] instance/beat.go:732 Host info {"system_info": {"host": {"architecture":"x86_64","boot_time":"2018-08-04T17:34:15Z","containerized":true,"hostname":"filebeat","ips":["127.0.0.1/8","172.28.0.2/16"],"kernel_version":"4.4.0-116-generic","mac_addresses":["02:42:ac:1c:00:02"],"os":{"family":"redhat","platform":"centos","name":"CentOS Linux","version":"7 (Core)","major":7,"minor":5,"patch":1804,"codename":"Core"},"timezone":"UTC","timezone_offset_sec":0}}}
2018-08-14T12:13:16.328Z INFO [beat] instance/beat.go:761 Process info {"system_info": {"process": {"capabilities": {"inheritable":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"permitted":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"effective":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"bounding":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"ambient":null}, "cwd": "/usr/share/filebeat", "exe": "/usr/share/filebeat/filebeat", "name": "filebeat", "pid": 93, "ppid": 28, "seccomp": {"mode":"filter"}, "start_time": "2018-08-14T12:13:15.530Z"}}}
2018-08-14T12:13:16.328Z INFO instance/beat.go:225 Setup Beat: filebeat; Version: 6.3.0
2018-08-14T12:13:16.329Z INFO pipeline/module.go:81 Beat name: mybeat
2018-08-14T12:13:16.329Z WARN [cfgwarn] beater/filebeat.go:61 DEPRECATED: prospectors are deprecated, Use `inputs` instead. Will be removed in version: 7.0.0
2018-08-14T12:13:16.330Z INFO [monitoring] log/log.go:97 Starting metrics logging every 30s
2018-08-14T12:13:16.330Z INFO instance/beat.go:315 filebeat start running.
2018-08-14T12:13:16.330Z INFO registrar/registrar.go:112 Loading registrar data from /usr/share/filebeat/data/registry
2018-08-14T12:13:16.330Z INFO registrar/registrar.go:123 States Loaded from registrar: 0
2018-08-14T12:13:16.331Z WARN beater/filebeat.go:354 Filebeat is unable to load the Ingest Node pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the Ingest Node pipelines or are using Logstash pipelines, you can ignore this warning.
2018-08-14T12:13:16.331Z INFO crawler/crawler.go:48 Loading Inputs: 1
2018-08-14T12:13:16.331Z INFO log/input.go:111 Configured paths: [/usr/share/filebeat-volume/data/Shakespeare.txt]
2018-08-14T12:13:16.331Z INFO input/input.go:87 Starting input of type: log; ID: 1899165251698784346
2018-08-14T12:13:16.331Z INFO crawler/crawler.go:82 Loading and starting Inputs completed. Enabled inputs: 1
And the every 30 seconds, it displays the following:
2018-08-14T12:13:46.334Z INFO [monitoring] log/log.go:124 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":20,"time":{"ms":24}},"total":{"ticks":30,"time":{"ms":36},"value":30},"user":{"ticks":10,"time":{"ms":12}}},"info":{"ephemeral_id":"16c484f0-0cf8-4c10-838d-b39755284af9","uptime":{"ms":30017}},"memstats":{"gc_next":4473924,"memory_alloc":3040104,"memory_total":3040104,"rss":21061632}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0}},"output":{"type":"logstash"},"pipeline":{"clients":1,"events":{"active":0}}},"registrar":{"states":{"current":0}},"system":{"cpu":{"cores":6},"load":{"1":1.46,"15":1.52,"5":1.66,"norm":{"1":0.2433,"15":0.2533,"5":0.2767}}}}}}
And no index-patterns are created in Kibana.
This is what my filebeat.yml
looks like:
filebeat.inputs:
- type: log
paths:
- /path/to/a/log/file
output.logstash:
hosts: ["elk:5044"]
setup.kibana:
host: "localhost:5601"
I have used this stackoverflow question to define the networks
section of my docker-compose
files, so that my containers can talk to each other using their container_name
s.
So, when I do
output.logstash:
hosts: ["elk:5044"]
I expect filebeat to send logs to port 5044 of the elk container, where logstash is listening for incoming messages.
After I run filebeat inside its terminal, I actually do see the following logs in the terminal in which I did docker-compose up elk
:
elk |
elk | ==> /var/log/elasticsearch/elasticsearch.log <==
elk | [2018-08-14T11:51:16,974][INFO ][o.e.c.m.MetaDataIndexTemplateService] [fZr_LDR] adding template [logstash] for index patterns [logstash-*]
which I am assuming some sort of communication has been made between logstash and filebeat.
However, on the other hand, despite following the mentioned stackoverflow response, I cannot do ping elk
in my filebeat container. The hostname is not resolved.
I appreciate any help!
UPDATE (Aug 15, 2018)
I think I don't even need to open a port for my ELK
container. What happens is that Logstash
is listening on port 5044 inside the container. As long as the filebeat.yml
file inside theFilebeat
container can resolve the ELK
host and then send its logs over to 5044 port there ("elk:5044"), they should all work fine.
That's why I deleted the "5045:5044"
line, and fixed the networks
section inside the docker-compose.yml
file for my Filebeat
container to include the following:
networks:
filebeat_net:
external:
name: elk_elk_net
And it seems to work, since when I do ping elk
, I am getting a connection.
While the networking issue is resolved (I can ping!), the connection between Logstash
and Filebeat
remains troublesome, and keep getting the following message every 30 seconds.
2018-08-14T12:13:46.334Z INFO [monitoring] log/log.go:124 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":20,"time":{"ms":24}},"total":{"ticks":30,"time":{"ms":36},"value":30},"user":{"ticks":10,"time":{"ms":12}}},"info":{"ephemeral_id":"16c484f0-0cf8-4c10-838d-b39755284af9","uptime":{"ms":30017}},"memstats":{"gc_next":4473924,"memory_alloc":3040104,"memory_total":3040104,"rss":21061632}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0}},"output":{"type":"logstash"},"pipeline":{"clients":1,"events":{"active":0}}},"registrar":{"states":{"current":0}},"system":{"cpu":{"cores":6},"load":{"1":1.46,"15":1.52,"5":1.66,"norm":{"1":0.2433,"15":0.2533,"5":0.2767}}}}}}
In the terminal of my filebeat container, I am also getting the following logs periodically when running the filebeat command in verbose mode:
2018-08-15T16:26:41.986Z DEBUG [input] input/input.go:124 Run input
2018-08-15T16:26:41.986Z DEBUG [input] log/input.go:147 Start next scan
2018-08-15T16:26:41.986Z DEBUG [input] log/input.go:168 input states cleaned up. Before: 0, After: 0, Pending: 0