Updating elasticsearch to 2.0 and de-dotting es indexes [FIX]

Updating elasticsearch to a newer version has been quite a breeze in the past. But with the arrival of the new major es-release 2.0, I thought a bit more thorough testing was in order. A good read for breaking changes can be found here. A very useful es-plugin to check your existing indexes for compatibility issues can be found here.

One of the major changes that did hit us is that field names may no longer contain dots. We use the ELK stack mainly for webserver logs. For some log types we also parse the URL GET params into param_name => param_value pairs. Sadly, some of those param_names contain a dot, e.g. "document.x=123&document.y=345". So, to migrate the existing indexes we need to get rid of the dots in those fields.
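To make the problem concrete, here is a small Python sketch of how parsing such a query string produces dotted field names, and what the names look like once the dots are replaced. It is an illustration only, not part of our logging setup: the helper name `parse_params` and its `replace_dots` flag are made up for this example.

```python
from urllib.parse import parse_qsl


def parse_params(query_string, replace_dots=False):
    """Parse URL GET params into a dict of param_name => param_value pairs.

    Hypothetical helper for illustration only, not LumberMill code.
    """
    params = {}
    for name, value in parse_qsl(query_string):
        if replace_dots:
            # Dots in field names are what es 2.0 rejects.
            name = name.replace(".", "_")
        params[name] = value
    return params


query = "document.x=123&document.y=345"
print(parse_params(query))
# {'document.x': '123', 'document.y': '345'}
print(parse_params(query, replace_dots=True))
# {'document_x': '123', 'document_y': '345'}
```

Replacing the dots at parse time only helps for new events; documents already in an index still carry the dotted names and have to be reindexed.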

Here is the LumberMill configuration that will reindex and replace dots with underscores:
# Sets number of parallel LumberMill processes.
- Global:
   workers: 3

- ElasticSearch:
   nodes: ['ELASTICSEARCH_HOST:9200']
   search_type: scan
   index_name: 'INDEX_NAME'

# Recursively replace dots with underscores in all fieldnames below field "params".
- ModifyFields:
   action: rename_replace
   source_field: params
   recursive: True
   old: .
   new: _

# Copy old event type to new event.
- ModifyFields:
   action: insert
   target_field: lumbermill.event_type
   value: $(_type)

# Drop internal es fields prior to reindex.
- ModifyFields:
   action: delete
   source_fields: ['_uid', '_id', '_type', '_source', '_all', '_parent', '_field_names', '_routing', '_index', '_size', '_timestamp', '_ttl', '_score']

- SimpleStats

- ElasticSearchSink:
   nodes: ['ELASTICSEARCH_HOST:9200']
   index_name: 'NEW_INDEX_NAME'
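
For readers who do not use LumberMill, the effect of the recursive rename step can be sketched in a few lines of plain Python. This is only my approximation of what the options above express, not LumberMill's actual implementation; the function name `rename_keys` and the sample event are invented for the example, and all keys are assumed to be strings.

```python
def rename_keys(value, old=".", new="_"):
    """Recursively replace `old` with `new` in every dict key below `value`.

    Illustrative sketch only; assumes all dict keys are strings.
    """
    if isinstance(value, dict):
        return {
            key.replace(old, new): rename_keys(item, old, new)
            for key, item in value.items()
        }
    if isinstance(value, list):
        # Lists may hold nested dicts, so descend into them as well.
        return [rename_keys(item, old, new) for item in value]
    # Plain values (strings, numbers, ...) are left untouched.
    return value


# Made-up sample event; only the "params" subtree is rewritten.
event = {
    "host": "web1",
    "params": {"document.x": "123", "nested": {"a.b": [{"c.d": 1}]}},
}
event["params"] = rename_keys(event["params"])
print(event["params"])
```

Only keys are rewritten, so a value such as "1.5" keeps its dot. Note that two keys like "a.b" and "a_b" in the same object would collide after renaming, and the later one would win.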
