Collect, parse and visualize your logs with LumberMill, Elasticsearch and Kibana on CentOS, Part I

With the arrival of Lucene as a search platform for all kinds of data, admins around the world started to put their log data into these datastores.
Log shippers were created that received raw log data, parsed it and stored it in a Lucene-driven search platform. Solr was quite popular for this kind of job, but had a major drawback: real-time indexing was not as easy as one could wish for. Setting up a cluster of redundant nodes for replication was also not for the faint-hearted, although this got easier with subsequent releases of Solr. But then Elasticsearch came to the rescue: fast real-time indexing, easy clustering and lots more nice features. Together with the incredibly powerful log manager Logstash, indexing large amounts of log events in near-real time became a reality. And with Kibana as the visualization frontend, analysing log data nearly became a fun task.

In this how-to I use LumberMill as an alternative to Logstash, mostly because I am the one developing LumberMill ;)
Since I’m more fluent in Python than in Ruby and we already had a simple shipper to a Solr backend written in Python, I just added some functionality to it. Still, Logstash is way more powerful. But if the features LumberMill provides meet your needs, feel free to read on ;)

Installing Elasticsearch

The box you are running Elasticsearch on should have at least 1 GB of RAM.
For compatibility reasons, install the Oracle JRE. For i586:

wget -O jre-7u45-linux-i586.rpm --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F" "http://download.oracle.com/otn-pub/java/jdk/7u45-b18/jre-7u45-linux-i586.rpm"
rpm -i jre-7u45-linux-i586.rpm

or for x86_64:

wget -O jre-7u45-linux-x64.rpm --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F" "http://download.oracle.com/otn-pub/java/jdk/7u45-b18/jre-7u45-linux-x64.rpm"
rpm -i jre-7u45-linux-x64.rpm

Activate the JRE via alternatives:

alternatives --install /usr/bin/java java /usr/java/latest/bin/java 2000
alternatives --set java /usr/java/latest/bin/java

Now install Elasticsearch. Friendly as those people are, they provide an RPM for our convenience ;)

rpm --import http://packages.elasticsearch.org/GPG-KEY-elasticsearch
cat > /etc/yum.repos.d/elasticsearch.repo << EOF
[elasticsearch-1.3]
name=Elasticsearch repository for 1.3.x packages
baseurl=http://packages.elasticsearch.org/elasticsearch/1.3/centos
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1
EOF
yum -y install elasticsearch

Next, we install the elasticsearch-head plugin.
elasticsearch-head is a web frontend for browsing and interacting with an Elasticsearch cluster.

/usr/share/elasticsearch/bin/plugin -install mobz/elasticsearch-head

We need to restart Elasticsearch to load the new plugin:

/etc/init.d/elasticsearch restart

If Elasticsearch complains that it "Can't start up: not enough memory", edit /etc/sysconfig/elasticsearch and adjust ES_HEAP_SIZE. The default is 256m; increase this to a value that lets Elasticsearch start successfully.
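For example, to give Elasticsearch a 512 MB heap (the exact value is only a suggestion, pick whatever your box can spare), set in /etc/sysconfig/elasticsearch:

```shell
# /etc/sysconfig/elasticsearch
ES_HEAP_SIZE=512m
```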

Open up the head plugin in your favorite browser by visiting http://your_server:9200/_plugin/head/

If you have trouble connecting, check your iptables rulebase.
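On a stock CentOS 6 box, a rule along these lines opens the Elasticsearch HTTP port. This is only a sketch; tighten the source addresses to fit your own rulebase:

```shell
iptables -I INPUT -p tcp --dport 9200 -j ACCEPT
service iptables save
```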

Installing PyPy (optional)

I heartily recommend running LumberMill with PyPy. The performance boost is more than worth the little effort involved in installing PyPy.
For CentOS you can follow the simple steps described here.

Installing LumberMill

via pip

pip install LumberMill

manually

If setuptools is not present on your system, install it via:

yum install python-setuptools

Install the city version of the MaxMind GeoIP databases:

mkdir /usr/share/GeoIP
cd /usr/share/GeoIP
wget "http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz"
gunzip GeoLiteCity.dat.gz

Clone the GitHub repository to /opt/LumberMill (or any other location that suits you better :):

git clone https://github.com/dstore-dbap/LumberMill.git /opt/LumberMill

Install LumberMill and its dependencies:

cd /opt/LumberMill
python setup.py install

and for pypy:

pypy setup.py install

Now you can give LumberMill a test drive with:

python /opt/LumberMill/lumbermill/LumberMill.py -c /opt/LumberMill/conf/example-tcp.conf

or for pypy:

pypy /opt/LumberMill/lumbermill/LumberMill.py -c /opt/LumberMill/conf/example-tcp.conf
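The config file passed via -c defines the pipeline LumberMill runs. I have not reproduced the shipped example-tcp.conf here; as a rough sketch, a LumberMill configuration is a YAML list of modules wired from input to output. The module names and options below are illustrative only, check the files in the repository's conf/ directory for the real syntax:

```yaml
# Illustrative sketch only -- see conf/example-tcp.conf in the repository.
- TcpServer:
    port: 5151
- ElasticSearchSink:
    nodes: ["localhost:9200"]
```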

Again, if you have connection problems, check your iptables rulebase. By default, LumberMill will listen on port 5151.

To check that indexing works without problems, send some log data to LumberMill.
Just open another shell and execute:

python /opt/LumberMill/scripts/spam_tcp.py -c 100 localhost 5151

LumberMill should now show the incoming events in its statistics output.
If you open up the elasticsearch-head plugin, you should see that a new
lumbermill index was created.
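spam_tcp.py is just a convenience script; the transport itself is plain newline-terminated events over a TCP connection, so you can also craft test traffic yourself. The sketch below demonstrates that wire format against a throwaway local receiver standing in for LumberMill's TCP input; the JSON field names are made up for illustration, and for a real test you would point send_events at your LumberMill host and port 5151 instead:

```python
import json
import socket
import threading

def send_events(host, port, count):
    """Ship `count` newline-terminated JSON log events over one TCP connection."""
    with socket.create_connection((host, port)) as sock:
        for i in range(count):
            event = {"message": "test event %d" % i, "severity": "info"}
            sock.sendall((json.dumps(event) + "\n").encode("utf-8"))

def _demo_receiver(server, received):
    # Stand-in for LumberMill's TCP input: read lines until the client disconnects.
    conn, _ = server.accept()
    with conn, conn.makefile("r", encoding="utf-8") as lines:
        for line in lines:
            received.append(json.loads(line))

# Demo against a local stand-in server instead of a running LumberMill instance.
server = socket.create_server(("127.0.0.1", 0))
received = []
t = threading.Thread(target=_demo_receiver, args=(server, received))
t.start()
send_events("127.0.0.1", server.getsockname()[1], 100)
t.join()
server.close()
print(len(received))  # 100
```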

Installing Kibana

For this how-to I chose a very simple setup for Kibana.

First we need a webserver. I chose nginx, since it is lightweight and fast.
For i586:

rpm -i http://nginx.org/packages/rhel/6/i386/RPMS/nginx-1.4.4-1.el6.ngx.i386.rpm

For x86_64:

rpm -i http://nginx.org/packages/rhel/6/x86_64/RPMS/nginx-1.4.4-1.el6.ngx.x86_64.rpm

Get Kibana:

mkdir -p /var/www/html/kibana
mkdir -p /var/www/log/
touch /var/www/log/kibana-error.log
chown nginx:nginx -R /var/www/
git clone https://github.com/elasticsearch/kibana.git /var/www/html/kibana

Configure nginx:

rm -f /etc/nginx/conf.d/default.conf
echo "server {
   listen 80;
   root /var/www/html/kibana/src;
   index index.html index.htm;
   error_log  /var/www/log/kibana-error.log error;
}
" > /etc/nginx/conf.d/kibana.conf

Check the configuration and restart nginx:

nginx -t
/etc/init.d/nginx restart

Now just open a browser with your server's IP address as the URL. You should see the Kibana welcome page.
Clicking on the "Sample Dashboard" link will take you to a preconfigured dashboard showing the sample data you sent during the spam_tcp test.

Well, that’s about it for today. In the next how-to, I’d like to show how syslog-ng can be configured to send data to LumberMill and how to load-balance LumberMill instances via HAProxy.
