Small elasticsearch Notes
From PaskvilWiki
I was really surprised by elasticsearch (ES from here on): by the simplicity of its setup and configuration, and by its power and options.
Installation
Download, unpack, and run es/bin/elasticsearch. Yes, that's it. Amazing, isn't it?
What You Get
After the above 30-sec setup, you have a search engine running on http://localhost:9200/, with automatic sharding (unlike many other systems, ES is always sharded, even on a single machine), replication, and much more.
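The shard and replica counts are picked for you, but they can also be set when an index is created. A minimal sketch, assuming the index-creation settings body (number_of_shards, number_of_replicas) accepted by ES releases of this era; the helper name index_settings and the counts 3 and 1 are made up for illustration:

```shell
# index_settings SHARDS REPLICAS
# Prints a JSON settings body for creating an index with explicit counts.
# (Hypothetical helper; it only builds the body, it does not talk to ES.)
index_settings() {
  printf '{ "settings" : { "number_of_shards" : %d, "number_of_replicas" : %d } }' "$1" "$2"
}

# Usage against a running node (not executed here):
#   curl -XPUT 'http://localhost:9200/twitter/' -d "$(index_settings 3 1)"
```

The index has to be created this way before the first document is added; once ES has auto-created it with the defaults, the shard count is fixed.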
A few highlights:
- ES sports a neat RESTful API that communicates (almost) entirely in JSON,
- ES is schemaless, unless you want it to be,
- you can give ES hints for many tasks, e.g. which shards to search,
- indices are created on the fly, no need to precreate (yes, might be tougher to find a bug, but installation of a new system is a breeze),
- you can specify which indices or document types to search: just one, a group, or all of them,
- documents are versioned; adding a document with an existing ID replaces the old document and increments its version (see the example below) - this might or might not be what you want.
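The "one, a group, or all" choice is expressed in the URL path. A rough sketch of the patterns, assuming the comma-separated index/type lists and the _all placeholder of the typed-URL ES releases these notes describe; the function name search_url, the ES_URL variable, and the facebook index are hypothetical:

```shell
ES_URL="${ES_URL:-http://localhost:9200}"

# search_url INDICES TYPES QUERY
# INDICES and TYPES are comma-separated lists; pass "" to leave one out.
search_url() {
  indices="$1"; types="$2"; query="$3"
  path=""
  if [ -n "$indices" ]; then
    path="/$indices"
  fi
  if [ -n "$types" ]; then
    # A type filter needs an index segment; _all stands for every index.
    if [ -z "$indices" ]; then
      path="/_all"
    fi
    path="$path/$types"
  fi
  printf '%s%s/_search?q=%s\n' "$ES_URL" "$path" "$query"
}

# Usage against a running node (not executed here):
#   curl -XGET "$(search_url twitter user name:shay)"            # one index, one type
#   curl -XGET "$(search_url twitter,facebook '' name:shay)"     # a group of indices
#   curl -XGET "$(search_url '' '' name:shay)"                   # everything
#   curl -XGET "$(search_url twitter user name:shay)&routing=kimchy"  # shard hint
```

The routing parameter in the last line is my guess at what "hinting the shards" refers to; it narrows the search to the shard(s) that the given routing value maps to.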
Indexing
Example
Let's start with an add-get example:
# let's add (type) _user_ to _twitter_ index, with ID _kimchy_
$ curl -XPUT 'http://localhost:9200/twitter/user/kimchy' -d '{ "name" : "Shay Banon" }'
> {"ok":true,"_index":"twitter","_type":"user","_id":"kimchy","_version":1}

$ curl -XGET 'http://localhost:9200/twitter/user/kimchy?pretty=true'
> {
>   "_index" : "twitter",
>   "_type" : "user",
>   "_id" : "kimchy",
>   "_version" : 1,
>   "exists" : true, "_source" : { "name" : "Shay Banon" }
> }

# let's add one more _user_ to _twitter_ with the same ID
$ curl -XPUT 'http://localhost:9200/twitter/user/kimchy' -d '{ "name" : "Shay Baror" }'
> {"ok":true,"_index":"twitter","_type":"user","_id":"kimchy","_version":2}

# note the increase in version number
$ curl -XGET 'http://localhost:9200/twitter/user/kimchy?pretty=true'
> {
>   "_index" : "twitter",
>   "_type" : "user",
>   "_id" : "kimchy",
>   "_version" : 2,
>   "exists" : true, "_source" : { "name" : "Shay Baror" }
> }

# now, let's search for "shay"
$ curl -XGET 'http://localhost:9200/twitter/user/_search?q=name:shay&pretty=true'
> {
>   "took" : 491,
>   "timed_out" : false,
>   "_shards" : {
>     "total" : 5,
>     "successful" : 5,
>     "failed" : 0
>   },
>   "hits" : {
>     "total" : 1,
>     "max_score" : 0.625,
>     "hits" : [ {
>       "_index" : "twitter",
>       "_type" : "user",
>       "_id" : "kimchy",
>       "_score" : 0.625, "_source" : { "name" : "Shay Baror" }
>     } ]
>   }
> }
Note that the responses carry most of the useful information and very little that is superfluous. Of course, without the pretty=true parameter, you get the "normal", more compact single-line JSON.
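The version numbers in the example can also serve as a guard against overwriting somebody else's change. A minimal sketch, assuming the version query parameter that ES releases of this era accept on index requests (as far as I know, a write with a stale version is rejected with a conflict error rather than applied); the function name doc_url and the ES_URL variable are hypothetical:

```shell
ES_URL="${ES_URL:-http://localhost:9200}"

# doc_url INDEX TYPE ID [VERSION]
# Prints the document URL; with VERSION, the write is meant to succeed only
# if the stored document is currently at that version.
doc_url() {
  url="$ES_URL/$1/$2/$3"
  if [ -n "${4:-}" ]; then
    url="$url?version=$4"
  fi
  printf '%s\n' "$url"
}

# Usage against a running node (not executed here):
#   curl -XPUT "$(doc_url twitter user kimchy 2)" -d '{ "name" : "Shay Banon" }'
```

Without the fourth argument you get the plain URL used in the example above, i.e. an unconditional overwrite.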