ElasticSearch

Drew Raines
draines.com
@drewr

Principal Engineer
Sonian, Boston, MA
sonian.com

NoSQL?

{
    "facets": {
        "city": {
            "terms": {
                "field": "city.raw",
                "order": "term",
                "size": 1000
            }
        }
    },
    "fields": [
        "id"
    ],
    "query": {
        "filtered": {
            "filter": {
                "and": [
                    {"term": {"state": "ga"}},
                    {"term": {"functional_area": "Sales"}},
                    {"not": {
                        "filter": {
                            "or": [
                                {"query": {"text": {"title": "START YOUR JOURNEY"}}},
                                {"query": {"text": {"title": "IT ALL STARTS HERE"}}}
                            ]
                        }
                    }}
                ]
            },
            "query": {
                "match_all": {}
            }
        }
    }
}
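
Hand-writing a body this deeply nested is error-prone, so one option is to build it in code and serialize it. The sketch below reproduces the request above as a Python dict; the helper name `sales_jobs_query` and its parameters are made up for illustration and are not part of any client library.

```python
import json

def sales_jobs_query(state, area, excluded_titles, facet_size=1000):
    """Build the request body shown above as a plain dict (hypothetical helper)."""
    # One {"query": {"text": ...}} clause per title that should be excluded.
    excluded = [{"query": {"text": {"title": t}}} for t in excluded_titles]
    return {
        "facets": {
            "city": {
                "terms": {"field": "city.raw", "order": "term", "size": facet_size}
            }
        },
        "fields": ["id"],
        "query": {
            "filtered": {
                "filter": {
                    "and": [
                        {"term": {"state": state}},
                        {"term": {"functional_area": area}},
                        {"not": {"filter": {"or": excluded}}},
                    ]
                },
                "query": {"match_all": {}},
            }
        },
    }

body = sales_jobs_query("ga", "Sales", ["START YOUR JOURNEY", "IT ALL STARTS HERE"])
print(json.dumps(body, indent=4, sort_keys=True))
```

The printed JSON can be passed to curl with -d, exactly like the hand-written version.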

NoSQL?

{"facets": {"city": {"terms": {"field": "city.raw","order":
"term", "size": 1000}}},"fields": ["id"],"query": {"filtered":
{"filter": {"and": [{"term": {"state": "ga"}},{"term":
{"functional_area": "Sales"}},{"not": {"filter": {"or": [{"query":
{"text": {"title": "START YOUR JOURNEY"}}},{"query": {"text":
{"title": "IT ALL STARTS HERE"}}}]}}}]},"query": {"match_all": {}}}}
}

Shard count is static (fixed when the index is created)
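
One way to see why the shard count cannot change later: a document is placed on a shard by hashing its id and taking the result modulo the number of shards, so a different count would send most existing ids to a different shard. The sketch below is a toy model of that idea. `shard_for` is a hypothetical name, and CRC32 is only a stand-in for whatever hash ElasticSearch uses internally.

```python
import zlib

def shard_for(doc_id, number_of_shards):
    # Stand-in hash (CRC32), NOT ElasticSearch's real hash function.
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards

ids = [str(i) for i in range(1000)]

# Count ids whose shard would change if the index went from 2 to 3 shards.
moved = sum(1 for i in ids if shard_for(i, 2) != shard_for(i, 3))
print(f"{moved} of {len(ids)} ids would land on a different shard")
```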

Replicas do not speed up indexing

Replicas can greatly speed up searching

Shards balance across nodes by count, not by size
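
Balancing by count can leave nodes with very different amounts of data. The toy allocator below illustrates this with made-up shard names and sizes. It is a simplified model for illustration, not ElasticSearch's actual allocation algorithm.

```python
def allocate(shards, nodes):
    """Toy allocator: place each shard on the node holding the fewest shards."""
    placement = {n: [] for n in nodes}
    for shard in shards:
        # Size is deliberately ignored; only the shard count per node matters.
        target = min(nodes, key=lambda n: len(placement[n]))
        placement[target].append(shard)
    return placement

# Hypothetical shard sizes in GB.
sizes = {"big-0": 50, "small-0": 1, "big-1": 50, "small-1": 1}
placement = allocate(list(sizes), ["node-a", "node-b"])

for node, shards in placement.items():
    print(node, shards, sum(sizes[s] for s in shards), "GB")
```

Both nodes end up with two shards each, yet one holds 100 GB and the other 2 GB.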

Create an index

% curl -XPUT localhost:9200/foo -d '
{
    "settings": {
        "number_of_replicas": 0, 
        "number_of_shards": 2
    }
}
'
{"ok":true,"acknowledged":true}
%
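
The same call can be made from a script. This is a minimal sketch using only the Python standard library; `create_index_request` is a hypothetical helper, and the Content-Type header is an addition that the curl call above does not send.

```python
import json
import urllib.request

def create_index_request(index, shards, replicas, host="localhost:9200"):
    """Build (but do not send) the PUT request that creates an index."""
    body = {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        }
    }
    return urllib.request.Request(
        f"http://{host}/{index}",
        data=json.dumps(body).encode("utf-8"),
        # Not in the original curl call; added as a precaution.
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

req = create_index_request("foo", shards=2, replicas=0)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To actually send it (needs a running node):
# urllib.request.urlopen(req)
```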

Add replicas!

% curl -XPUT localhost:9200/foo/_settings -d '
{
    "number_of_replicas": 1
}
'
{"ok":true}
%
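
A quick check of what the setting change means for the cluster: each primary shard gets number_of_replicas extra copies, so the total number of shard copies is primaries times (1 + replicas). The function name below is made up for illustration.

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    # Each primary shard has number_of_replicas additional copies.
    return number_of_shards * (1 + number_of_replicas)

print(total_shard_copies(2, 0))  # index "foo" as created: 2
print(total_shard_copies(2, 1))  # after adding one replica: 4
```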

Add a document!

% curl -XPUT localhost:9200/foo/t/1 -d '{
    "what": "hack day", 
    "when": "2012-08-11", 
    "where": "centresource"
}
'
{"ok":true,"_index":"foo","_type":"t","_id":"1","_version":6}
% 
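
The "_version":6 in the response suggests this was not the first PUT to id 1: indexing a document under an existing id replaces it and bumps its version. The in-memory class below is a toy model of that behavior. `ToyIndex` is hypothetical and does not talk to ElasticSearch.

```python
class ToyIndex:
    """Toy model of versioned puts; not an ElasticSearch client."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def put(self, doc_id, source):
        # A new id starts at version 1; re-putting an id increments it.
        version = self._docs.get(doc_id, (0, None))[0] + 1
        self._docs[doc_id] = (version, source)
        return {"ok": True, "_id": doc_id, "_version": version}

index = ToyIndex()
doc = {"what": "hack day", "when": "2012-08-11", "where": "centresource"}

for _ in range(6):
    last = index.put("1", doc)

print(last)
```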

Get it back

% curl localhost:9200/foo/t/1
{
    "_id": "1", 
    "_index": "foo", 
    "_source": {
        "what": "hack day", 
        "when": "2012-08-11", 
        "where": "centresource"
    }, 
    "_type": "t", 
    "_version": 6, 
    "exists": true
}
% 

What about schemas?

% curl localhost:9200/foo/_mapping
{
    "foo": {
        "t": {
            "properties": {
                "what": {
                    "type": "string"
                }, 
                "when": {
                    "format": "dateOptionalTime", 
                    "ignore_malformed": false, 
                    "type": "date"
                }, 
                "where": {
                    "type": "string"
                }
            }
        }
    }
}
% 
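
No mapping was declared when the index was created, so the one above appears to have been inferred from the first document. The sketch below is a rough approximation of that inference for string values only. `guess_type` is a hypothetical helper, and the real dateOptionalTime format accepts more inputs than Python's `date.fromisoformat`.

```python
from datetime import date

def guess_type(value):
    """Rough guess at a field type from a sample value (simplified)."""
    if isinstance(value, str):
        try:
            date.fromisoformat(value)  # e.g. "2012-08-11"
            return "date"
        except ValueError:
            return "string"
    return "unknown"  # numbers, booleans, objects are out of scope here

doc = {"what": "hack day", "when": "2012-08-11", "where": "centresource"}
mapping = {field: {"type": guess_type(value)} for field, value in doc.items()}
print(mapping)
```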

So much more!

Thank you!

ElasticSearch: www.elasticsearch.org

elasticsearch-jetty: github.com/sonian/elasticsearch-jetty

Clojure client: github.com/drewr/esperanto

Emacs tips: goo.gl/1w2RZ