Small elasticsearch Notes

From PaskvilWiki

Revision as of 10:40, 18 January 2013

I was really surprised by elasticsearch (ES from here on): by the simplicity of its setup and configuration, and by its power and options.

Installation

Download, unpack, and run es/bin/elasticsearch. Yes, that's it. Amazing, isn't it?

What You Get

After the above 30-second setup, you have a search engine running on http://localhost:9200/, with automatic sharding (unlike many other systems, ES always shards its indices, even on a single machine), replication, and much more.
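To check that the node really is up, you can poll the HTTP port until it answers. This is only a rough sketch: the function name es_wait and the ES_URL variable are made up for these notes, and the URL is simply the default one mentioned above.

```shell
# Base URL of the node; ES_URL is a hypothetical override, not an ES setting.
ES_URL="${ES_URL:-http://localhost:9200}"

# es_wait [ATTEMPTS]
# Polls the root endpoint, pausing one second after each failed attempt.
# Returns 0 as soon as the node answers with a successful HTTP status,
# 1 if all ATTEMPTS polls fail.
es_wait() {
  attempts="${1:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s: quiet, -f: treat HTTP errors as failure, --max-time: do not hang
    if curl -s -f --max-time 2 -o /dev/null "$ES_URL/"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Usage (needs a running node):
#   es_wait 10 && curl -XGET "$ES_URL/?pretty=true"
```

The root endpoint should answer with a small JSON document describing the node, so it is a cheap request to poll.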

A few highlights:

  • ES sports a neat RESTful API that communicates (almost) entirely in JSON,
  • ES is schemaless, unless you want it to be,
  • you can hint ES on many tasks - e.g. hint what shards to search in, etc.
  • indices are created on the fly, no need to precreate (yes, might be tougher to find a bug, but installation of a new system is a breeze),
  • you can specify which indices and which document types to search: just one, a group, or all of them,
  • documents are versioned; also, adding a document with an existing ID does not fail, it overwrites the old document and increments its version - this might or might not be what you want.
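The choice of indices and types is made in the URL path itself. The helper below is a small sketch of how the search URLs are put together; es_search_url and ES_URL are names invented for these notes, "facebook" is a hypothetical second index, and it assumes an ES release from around the time of writing, where document types are still part of the path.

```shell
# Base URL of the node; ES_URL is a hypothetical override, not an ES setting.
ES_URL="${ES_URL:-http://localhost:9200}"

# es_search_url [INDICES] [TYPES]
# INDICES and TYPES are comma-separated lists.
# With no INDICES, the special name _all (every index) is used.
es_search_url() {
  indices="${1:-_all}"
  types="${2:-}"
  if [ -n "$types" ]; then
    printf '%s/%s/%s/_search\n' "$ES_URL" "$indices" "$types"
  else
    printf '%s/%s/_search\n' "$ES_URL" "$indices"
  fi
}

# Usage (needs a running node):
#   one type in one index:   curl -XGET "$(es_search_url twitter user)?q=name:shay"
#   a group of indices:      curl -XGET "$(es_search_url twitter,facebook)?q=name:shay"
#   everything:              curl -XGET "$(es_search_url)?q=name:shay"
```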

Indexing

Example

Let's start with an add-get example:

# let's add a document of type _user_ to the _twitter_ index, with ID _kimchy_
$ curl -XPUT 'http://localhost:9200/twitter/user/kimchy' -d '{ "name" : "Shay Banon" }'
> {"ok":true,"_index":"twitter","_type":"user","_id":"kimchy","_version":1}

$ curl -XGET 'http://localhost:9200/twitter/user/kimchy?pretty=true'
> {
>   "_index" : "twitter",
>   "_type" : "user",
>   "_id" : "kimchy",
>   "_version" : 1,
>   "exists" : true, "_source" : { "name" : "Shay Banon" }
> }

# let's add one more _user_ to _twitter_ with the same ID
$ curl -XPUT 'http://localhost:9200/twitter/user/kimchy' -d '{ "name" : "Shay Baror" }'
> {"ok":true,"_index":"twitter","_type":"user","_id":"kimchy","_version":2}

# note the increase in version number
$ curl -XGET 'http://localhost:9200/twitter/user/kimchy?pretty=true'
> {
>   "_index" : "twitter",
>   "_type" : "user",
>   "_id" : "kimchy",
>   "_version" : 2,
>   "exists" : true, "_source" : { "name" : "Shay Baror" }
> }

# now, let's search for "shay"
$ curl -XGET 'http://localhost:9200/twitter/user/_search?q=name:shay&pretty=true'
> {
>   "took" : 491,
>   "timed_out" : false,
>   "_shards" : {
>     "total" : 5,
>     "successful" : 5,
>     "failed" : 0
>   },
>   "hits" : {
>     "total" : 1,
>     "max_score" : 0.625,
>     "hits" : [ {
>       "_index" : "twitter",
>       "_type" : "user",
>       "_id" : "kimchy",
>       "_score" : 0.625, "_source" : { "name" : "Shay Baror" }
>     } ]
>   }
> }

Note that the response contains most of the useful information and very little that is superfluous. Without the pretty=true parameter, you get the same JSON in compact form, without line breaks or indentation.
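If silently overwriting a document with the same ID is not what you want, the index request can carry a condition in its query string. The sketch below only builds the URLs; es_doc_url and ES_URL are names invented for these notes, and it assumes an ES release from around the time of writing, where the version and op_type=create parameters are accepted on a PUT.

```shell
# Base URL of the node; ES_URL is a hypothetical override, not an ES setting.
ES_URL="${ES_URL:-http://localhost:9200}"

# es_doc_url INDEX TYPE ID [QUERY]
# Prints the document URL, with an optional query string appended.
es_doc_url() {
  url="$ES_URL/$1/$2/$3"
  if [ -n "${4:-}" ]; then
    url="$url?$4"
  fi
  printf '%s\n' "$url"
}

# Usage (needs a running node):
#   write only if the stored document is still at version 2:
#     curl -XPUT "$(es_doc_url twitter user kimchy version=2)" -d '{ "name" : "Shay Banon" }'
#   write only if no document with this ID exists yet:
#     curl -XPUT "$(es_doc_url twitter user kimchy op_type=create)" -d '{ "name" : "Shay Banon" }'
```

When the condition does not hold, ES should reject the write with an error instead of replacing the document, so a stale client cannot clobber a newer version.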