Collect, parse and visualize your logs with LumberMill, Elasticsearch and Kibana on CentOS, Part I

With the arrival of Lucene-based search platforms for all kinds of data, admins around the world started to put their log data into these datastores.
Log shippers were created that received raw log data, parsed it and then stored it in a Lucene-driven search platform. Solr was quite popular for this kind of job, but had a major drawback: real-time indexing was not as easy as one could wish for. Setting up a cluster of redundant nodes for replication was also not for the faint-hearted, although this got easier with subsequent releases of Solr. But then Elasticsearch came to the rescue: fast real-time indexing, easy clustering and many more nice features. Together with the incredibly powerful log manager Logstash, indexing large amounts of log events in near real time became a reality. And with Kibana as visualization frontend, analysing log data almost became fun.

In this how-to I use LumberMill as an alternative to Logstash, mostly because I am the one developing LumberMill ;)
Since I’m more fluent in Python than in Ruby and we already had a simple shipper to a Solr backend written in Python, I just added some functionality to it. Logstash is still way more powerful, but if the features LumberMill provides meet your needs, feel free to read on ;)

Installing elasticsearch

The box you are running Elasticsearch on should have at least 1 GB of RAM.
For compatibility reasons, install the Oracle JRE for i586:
[bash]wget -O jre-7u45-linux-i586.rpm --no-cookies --no-check-certificate --header "Cookie:" ""
rpm -i jre-7u45-linux-i586.rpm[/bash]
or for x86_64:
[bash]wget -O jre-7u45-linux-x64.rpm --no-cookies --no-check-certificate --header "Cookie:" ""
rpm -i jre-7u45-linux-x64.rpm[/bash]
Activate the JRE via alternatives:
[bash]alternatives --install /usr/bin/java java /usr/java/latest/bin/java 2000
alternatives --set java /usr/java/latest/bin/java[/bash]
Now install Elasticsearch. Friendly as those people are, they provide an RPM for our convenience ;)
[bash]rpm --import
cat > /etc/yum.repos.d/elasticsearch.repo << EOF
[elasticsearch-1.3]
name=Elasticsearch repository for 1.3.x packages
baseurl=""
gpgcheck=1
enabled=1
EOF
yum -y install elasticsearch[/bash]
Next, we install the es head plugin.
Elasticsearch-head is a web frontend for browsing and interacting with an Elasticsearch cluster.
[bash]/usr/share/elasticsearch/bin/plugin -install mobz/elasticsearch-head[/bash]
We need to restart Elasticsearch to load the new plugin:
[bash]/etc/init.d/elasticsearch restart[/bash]
If Elasticsearch complains that it "Can't start up: not enough memory", edit /etc/sysconfig/elasticsearch and adjust ES_HEAP_SIZE. The default is 256m; increase it to a value that lets Elasticsearch start successfully.
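The heap setting can be changed with a quick sed one-liner; the 512m value below is only an example, pick whatever fits your box:

```bash
# back up the sysconfig file, then raise the elasticsearch heap
# (512m is an arbitrary example value, adjust it to your machine)
cp /etc/sysconfig/elasticsearch /etc/sysconfig/elasticsearch.bak
sed -i 's/^ES_HEAP_SIZE=.*/ES_HEAP_SIZE=512m/' /etc/sysconfig/elasticsearch
/etc/init.d/elasticsearch restart
```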

Open up the head plugin in your favorite browser by visiting http://your_server:9200/_plugin/head/

If you have trouble connecting, check your iptables rulebase.
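On a stock CentOS iptables setup, a minimal sketch for opening the port looks like this (rule position and save mechanism are assumptions, check your own rulebase first):

```bash
# allow inbound TCP to elasticsearch's HTTP port, then persist the rule
iptables -I INPUT -p tcp --dport 9200 -j ACCEPT
service iptables save
```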

Installing pypy (optional)

I heartily recommend running LumberMill with PyPy. The performance boost is more than worth the little effort involved in installing PyPy.
For CentOS you can follow the simple steps described here.

Installing LumberMill

via pip

[bash]pip install LumberMill[/bash]


If easy_install is not present on your system, install it via:
[bash]yum install python-setuptools[/bash]
Install the city version of the MaxMind geolocation database:
[bash]mkdir /usr/share/GeoIP
cd /usr/share/GeoIP
wget ""
gunzip GeoLiteCity.dat.gz[/bash]

Clone the GitHub repository to /opt/LumberMill (or any other location that suits you better :):
[bash]git clone /opt/LumberMill[/bash]

Install the dependencies with pip:
[bash]cd /opt/LumberMill
python install[/bash]
and for PyPy:
[bash]pypy install[/bash]

Now you can give LumberMill a testdrive with:
[bash]python /opt/LumberMill/lumbermill/ -c /opt/LumberMill/conf/example-tcp.conf[/bash]
or for pypy:
[bash]pypy /opt/LumberMill/lumbermill/ -c /opt/LumberMill/conf/example-tcp.conf[/bash]
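If you are curious what such a configuration contains: LumberMill configs are simple YAML lists of modules that events flow through. The sketch below is only illustrative; the module names and options are assumptions from memory, and the bundled example-tcp.conf is the authoritative reference:

```yaml
# hypothetical sketch of a TCP-to-elasticsearch pipeline; check the
# shipped example configs for the real module names and options
- TcpServer:
    port: 5151
- ElasticSearchSink:
    nodes: ["localhost:9200"]
```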

Again, if you have connection problems, check your iptables rulebase. By default, LumberMill listens on port 5151.
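You can verify from the shell that LumberMill is actually bound to its port before blaming the firewall (netstat flags below are the Linux ones; the iptables rule is the same sketch as for elasticsearch):

```bash
# is anything listening on 5151?
netstat -tln | grep ':5151'
# if the listener is there but unreachable, open the port
iptables -I INPUT -p tcp --dport 5151 -j ACCEPT
```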

To check whether indexing works without problems, send some log data to LumberMill.
Just open up another shell and execute:
[bash]python /opt/LumberMill/scripts/ -c 100 localhost 5151[/bash]
LumberMill should now show the incoming events in its statistics output.
If you open up the elasticsearch-head plugin, you should see that a new
lumbermill index was created.
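If you prefer not to use the bundled spam script, any TCP client will do. A sketch using netcat (flag support varies between netcat variants), followed by a quick index listing via elasticsearch's _cat API:

```bash
# generate 100 sample log lines and pipe them into LumberMill's TCP input
seq 1 100 | sed 's/^/test event /' | nc -w 1 localhost 5151
# ask elasticsearch which indices exist; a lumbermill index should show up
curl -s 'http://localhost:9200/_cat/indices?v'
```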

Installing Kibana

For this how-to I chose a very simple setup for Kibana.

First we need a webserver. I chose nginx, since it is lightweight and fast.
For i586:
[bash]rpm -i[/bash]
For x86_64:
[bash]rpm -i[/bash]

Get kibana:
[bash]mkdir -p /var/www/html/kibana
mkdir -p /var/www/log/
touch /var/www/log/kibana-error.log
chown nginx:nginx -R /var/www/
git clone /var/www/html/kibana[/bash]

Configure nginx:
[bash]rm -f /etc/nginx/conf.d/default.conf
echo "server {
listen 80;
root /var/www/html/kibana/src;
index index.html index.htm;
error_log /var/www/log/kibana-error.log error;
}
" > /etc/nginx/conf.d/kibana.conf[/bash]

Restart nginx:
[bash]/etc/init.d/nginx restart[/bash]

Now just open a browser with your server's IP address as URL. You should see the Kibana welcome page.
Clicking on the "Sample Dashboard" link will take you to a preconfigured dashboard showing the sample data you sent during the spam_tcp test.

Well, that's about it for today. In the next how-to, I'd like to show how syslog-ng can be configured to send data to LumberMill and how to load-balance LumberMill instances via HAProxy.

This entry was posted in /dev/administration. Bookmark the permalink.
