Small elasticsearch Notes

From PaskvilWiki
Revision as of 11:06, 18 January 2013 by Admin

I was really surprised by elasticsearch (ES from here on): by the simplicity of its setup and configuration, and by its power and options.

Installation

Download, unpack, and run es/bin/elasticsearch. Yes, that's it. Amazing, isn't it?

What You Get

After the above 30-second setup, you have a search engine running on http://localhost:9200/, with automatic sharding (unlike many other systems, ES is always sharded, even on a single machine), replication, and much more.

A few highlights:

  • ES sports a neat RESTful API that communicates (almost) entirely in JSON,
  • ES is schemaless, unless you want it to be,
  • you can hint ES on many tasks - e.g. hint what shards to search in, etc.
  • indices are created on the fly, no need to precreate (yes, might be tougher to find a bug, but installation of a new system is a breeze),
  • you can specify which indices and which document types to search: just one, a selected group, or all of them,
  • documents are versioned: indexing a document under an existing ID replaces the stored document and increments its _version; older versions are not kept - this might or might not be what you want.

Indexing

Example

Let's start with an add-get example:

# let's add a document of type _user_ to the _twitter_ index, with ID _kimchy_
$ curl -XPUT 'http://localhost:9200/twitter/user/kimchy' -d '{ "name" : "Shay Banon" }'
> {"ok":true,"_index":"twitter","_type":"user","_id":"kimchy","_version":1}

$ curl -XGET 'http://localhost:9200/twitter/user/kimchy?pretty=true'
> {
>   "_index" : "twitter",
>   "_type" : "user",
>   "_id" : "kimchy",
>   "_version" : 1,
>   "exists" : true, "_source" : { "name" : "Shay Banon" }
> }

# let's index another _user_ into _twitter_ under the same ID
$ curl -XPUT 'http://localhost:9200/twitter/user/kimchy' -d '{ "name" : "Shay Baror" }'
> {"ok":true,"_index":"twitter","_type":"user","_id":"kimchy","_version":2}

# note the increase in version number
$ curl -XGET 'http://localhost:9200/twitter/user/kimchy?pretty=true'
> {
>   "_index" : "twitter",
>   "_type" : "user",
>   "_id" : "kimchy",
>   "_version" : 2,
>   "exists" : true, "_source" : { "name" : "Shay Baror" }
> }

# now, let's search for "shay" users
$ curl -XGET 'http://localhost:9200/twitter/user/_search?q=name:shay&pretty=true'
> {
>   "took" : 491,
>   "timed_out" : false,
>   "_shards" : {
>     "total" : 5,
>     "successful" : 5,
>     "failed" : 0
>   },
>   "hits" : {
>     "total" : 1,
>     "max_score" : 0.625,
>     "hits" : [ {
>       "_index" : "twitter",
>       "_type" : "user",
>       "_id" : "kimchy",
>       "_score" : 0.625, "_source" : { "name" : "Shay Baror" }
>     } ]
>   }
> }

# to search across all types in the _twitter_ index
$ curl -XGET 'http://localhost:9200/twitter/_search?q=name:shay'

# finally, you may search all indices
$ curl -XGET 'http://localhost:9200/_search?q=name:shay'

# or just selected indices - _twitter_ and _facebook_
$ curl -XGET 'http://localhost:9200/twitter,facebook/_search?q=name:shay'

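# or on all indices starting with _t_, excluding _twitter_
# (assuming the +/- wildcard syntax for index names)
$ curl -XGET 'http://localhost:9200/+t*,-twitter/_search?q=name:shay'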

Note that the responses carry the useful information and very little that is superfluous. Of course, without the pretty=true parameter, you get the "normal", more compact version of the JSON.
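The URL patterns used in the searches above follow one scheme: an optional comma-separated index list, an optional type, then _search. As a rough sketch, the hypothetical helper below (es_search_url and ES_HOST are names made up for this note, not part of ES) builds such URLs without contacting the server. It assumes the query string needs no URL-encoding, and it uses the _all index placeholder when only a type is given.

```shell
#!/bin/sh
# Assumed location of the node, as in the examples above.
ES_HOST="http://localhost:9200"

# Hypothetical helper: print the search URL for the given scope.
#   $1  index list, e.g. "twitter" or "twitter,facebook" ("" = all indices)
#   $2  document type, e.g. "user" ("" = all types)
#   $3  query, e.g. "name:shay" (assumed to need no URL-encoding)
es_search_url() {
    indices=$1
    doc_type=$2
    query=$3
    path=""
    if [ -n "$indices" ]; then
        path="/$indices"
    elif [ -n "$doc_type" ]; then
        # a type without an index needs the _all placeholder
        path="/_all"
    fi
    if [ -n "$doc_type" ]; then
        path="$path/$doc_type"
    fi
    printf '%s%s/_search?q=%s\n' "$ES_HOST" "$path" "$query"
}

es_search_url twitter user name:shay          # one type in one index
es_search_url twitter "" name:shay            # all types in one index
es_search_url twitter,facebook "" name:shay   # selected indices
es_search_url "" "" name:shay                 # everything
```

To actually run a search, the printed URL can be passed to curl, e.g. curl -XGET "$(es_search_url twitter user name:shay)".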

Creating Documents

You create (index) a document by PUTting it to /index/type/docid, where index is the index name, type is the document type, and docid is the document ID:

$ curl -XPUT 'http://localhost:9200/index/type/docid' -d '{"content":"trying out Elastic Search"}'

Note that PUTting to the same ID more than once replaces the stored document and increments its version number, as in the example above; the earlier versions cannot be retrieved.
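To watch the version counter from a script, the _version field can be pulled out of the JSON response. The sketch below is one way to do that with sed; es_version is a hypothetical helper name, and the sample response is copied from the example earlier in these notes rather than fetched from a live node.

```shell
#!/bin/sh
# Hypothetical helper: read an ES response on stdin and print the
# number stored in its "_version" field (nothing if the field is absent).
es_version() {
    sed -n 's/.*"_version"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Sample response taken from the second PUT in the example above.
echo '{"ok":true,"_index":"twitter","_type":"user","_id":"kimchy","_version":2}' | es_version
```

Against a running node, the same helper could be fed by curl, e.g. curl -s -XPUT 'http://localhost:9200/index/type/docid' -d '{"content":"again"}' | es_version, which should print a larger number each time the same docid is PUT.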

Indices and Types

An index is created automatically if it does not exist. The type mapping is also created/updated automatically.

Indices can also be created "manually", as well as type mappings.
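As a sketch of manual creation: an index is created by PUTting to its name, optionally with a settings body. The helper below only prints the curl command instead of running it, so it can be tried without a node; es_create_index_cmd and ES_HOST are hypothetical names, and the shard/replica numbers are arbitrary illustration values.

```shell
#!/bin/sh
# Assumed location of the node, as elsewhere in these notes.
ES_HOST="http://localhost:9200"

# Hypothetical helper: print (not run) the command that would create
# index $1 with the JSON settings body $2.
es_create_index_cmd() {
    printf "curl -XPUT '%s/%s/' -d '%s'\n" "$ES_HOST" "$1" "$2"
}

# Example: a _twitter_ index with explicit shard and replica counts.
es_create_index_cmd twitter '{"settings":{"number_of_shards":3,"number_of_replicas":1}}'
```

Copying the printed line into a shell issues the actual request.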

If you set action.auto_create_index to false in the configuration, indices must be created manually before use. The same goes for type mappings, via index.mapper.dynamic.

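You can also whitelist/blacklist, by name pattern, which indices may be created automatically and which must be created manually, by setting action.auto_create_index to a pattern list such as +aaa*,-bbb*,+ccc*,-* (+ allows automatic creation, - disallows it).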
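To get a feel for how such a pattern list reads, here is a small sketch that evaluates it in plain shell. It is not ES code. It assumes the patterns are tried left to right and the first match decides, that a pattern without a sign means "allow", and that a name matching nothing is refused; auto_create_allowed is a hypothetical name.

```shell
#!/bin/sh
# Sketch: may index $1 be auto-created under pattern list $2?
# Returns 0 (allowed) or 1 (must be created manually).
auto_create_allowed() {
    name=$1
    rest=$2
    while [ -n "$rest" ]; do
        entry=${rest%%,*}                   # first pattern in the list
        case "$rest" in
            *,*) rest=${rest#*,} ;;         # drop it from the remainder
            *)   rest="" ;;
        esac
        case "$entry" in
            +*) status=0; glob=${entry#+} ;;
            -*) status=1; glob=${entry#-} ;;
            *)  status=0; glob=$entry ;;    # assumed: bare pattern allows
        esac
        # $glob is left unquoted so that * works as a wildcard here.
        case "$name" in
            $glob) return $status ;;        # assumed: first match decides
        esac
    done
    return 1                                # assumed: no match, refuse
}

for idx in aaa1 bbb1 ccc-logs other; do
    if auto_create_allowed "$idx" '+aaa*,-bbb*,+ccc*,-*'; then
        echo "$idx: auto-create allowed"
    else
        echo "$idx: must be created manually"
    fi
done
```

Under these assumptions the trailing -* acts as a catch-all, so only names starting with aaa or ccc are created on the fly.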

Routing