ElasticSearch Query DSL
Revision as of 12:45, 23 January 2013
ES's Query DSL is a language for specifying queries in JSON.
This is far from exhaustive documentation; it covers only the things I use the most. See the official documentation for more. In particular, the boosting and scoring functionality is not covered here in any depth.
Queries
match, multi_match
The match queries accept text, numeric, or date values, analyze them, and construct a query from the result. The match family of queries does not go through a "query parsing" process: it does not support field name prefixes, wildcard characters, or other "advanced" features.
Here, message is the name of the field to match in (it can also be _all):
{
"match" : {
"message" : "this is a test"
}
}
By default, terms are OR'ed; to AND them:
{
"match" : {
"message" : {
"query" : "this is a test",
"operator" : "and"
}
}
}
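To make the OR/AND difference concrete, here is a toy model in Python. It is only a sketch: the function name toy_match is made up for illustration, and it assumes plain whitespace tokenization with lower-casing, which is much simpler than what a real analyzer does.

```python
def toy_match(field_text, query, operator="or"):
    """Toy model of a match query (hypothetical helper, not ES code).

    Assumes terms are whitespace-separated and lower-cased.
    """
    doc_terms = set(field_text.lower().split())
    query_terms = query.lower().split()
    hits = [term in doc_terms for term in query_terms]
    # "or": any query term may match; "and": every query term must match.
    return all(hits) if operator == "and" else any(hits)


doc = "This is only a drill"
print(toy_match(doc, "this is a test"))         # OR: some terms are present
print(toy_match(doc, "this is a test", "and"))  # AND: "test" is missing
```

With the default operator, the document matches because it shares some terms with the query; with "and" it does not, since "test" is absent.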
To match a phrase:
{
"match_phrase" : {
"message" : "this is a test"
}
}
Or, using the last word as a prefix (for "as you type" search):
{
"match_phrase_prefix" : {
"message" : "this is a test"
}
}
To match in multiple fields, with optional boosting, use:
{
"multi_match" : {
"query" : "this is a test",
"fields" : [ "subject^2", "message" ]
}
}
where matches in subject are "twice as important" as matches in message.
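If the query is assembled in code, the field^boost strings can be generated from a mapping of field names to boosts. A minimal sketch in Python; the helper name multi_match and the convention that a boost of 1 means "no suffix" are my own assumptions, only the resulting JSON shape comes from the example above.

```python
import json


def multi_match(query, boosts):
    """Build a multi_match body from {field: boost} (hypothetical helper).

    Fields with boost 1 are emitted without the ^boost suffix.
    """
    fields = [
        name if boost == 1 else f"{name}^{boost:g}"
        for name, boost in boosts.items()
    ]
    return {"multi_match": {"query": query, "fields": fields}}


body = multi_match("this is a test", {"subject": 2, "message": 1})
print(json.dumps(body, indent=2))
```

The printed JSON has the same content as the multi_match example above.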
bool
The bool query provides a Boolean combination of queries with typed occurrences:
- must - clause must appear in matching documents,
- should - clause should appear; if no must clause is provided, at least one should clause must be matched; you can also specify the minimum_number_should_match parameter,
- must_not - clause must not appear in matching documents.
{
"bool" : {
"must" : {
"term" : { "user" : "kimchy" }
},
"must_not" : {
"range" : {
"age" : { "from" : 10, "to" : 20 }
}
},
"should" : [
{
"term" : { "tag" : "wow" }
},
{
"term" : { "tag" : "elasticsearch" }
}
],
"minimum_number_should_match" : 1,
"boost" : 1.0
}
}
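The occurrence rules can be modelled in a few lines of Python. This is a toy sketch of the rules as stated above, not how ES or Lucene evaluates the query: the function bool_matches is hypothetical, and each clause is represented simply as True/False ("did this clause match the document?").

```python
def bool_matches(must=(), should=(), must_not=(),
                 minimum_number_should_match=None):
    """Toy model of bool query matching (hypothetical, ignores scoring)."""
    # Every must clause has to match; no must_not clause may match.
    if not all(must) or any(must_not):
        return False
    if minimum_number_should_match is None:
        # Without a must clause, at least one should clause is required.
        minimum_number_should_match = 1 if should and not must else 0
    return sum(bool(s) for s in should) >= minimum_number_should_match


# user is kimchy, age outside 10..20, tagged "wow" only:
print(bool_matches(must=[True], must_not=[False], should=[True, False],
                   minimum_number_should_match=1))
# same document, but age inside the excluded range:
print(bool_matches(must=[True], must_not=[True], should=[True, False],
                   minimum_number_should_match=1))
```

The first call reports a match; the second does not, because the must_not clause matched.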
boosting
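The boosting query can be used to demote certain search results: documents must match the positive query, and the score of those that also match the negative query is multiplied by negative_boost: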
{
"boosting" : {
"positive" : {
"term" : {
"field1" : "value1"
}
},
"negative" : {
"term" : {
"field2" : "value2"
}
},
"negative_boost" : 0.2
}
}
ids
Match by ID:
{
"ids" : {
"type" : "my_type",
"values" : ["1", "4", "100"]
}
}
Note: the type field is optional and may contain an array of values.
field
Queries only the specified field (equivalent to query_string with default_field):
{
"field" : {
"name.first" : "+something -else"
}
}
filtered
Filters the results of a query; filtering may be much faster than querying because no scoring is done, and filter results may be cached:
{
"filtered" : {
"query" : {
"term" : { "tag" : "wow" }
},
"filter" : {
"range" : {
"age" : { "from" : 10, "to" : 20 }
}
}
}
}
query_string
Uses a query parser to parse its content.
{
"query_string" : {
"default_field" : "content",
"query" : "this AND that OR thus"
}
}
Parameters
- query - actual query to be parsed,
- default_field - default field for query terms (if no prefix field is specified); defaults to the index.query.default_field setting, which itself defaults to _all,
- fields - run query against multiple fields (provided as array):
  - "fields" : ["content", "name"],
  - optionally with boosting: "fields" : ["content", "name^5"],
  - wildcards may be used for fields: "fields" : ["city.*"] if the document contains an object city,
  - to check for existence or nonexistence of fields, use: _exists_:field1 and _missing_:field,
- default_operator - default operator used (if none explicitly specified); e.g. with default operator OR, the query "capital of Hungary" is translated to "capital OR of OR Hungary"; default is OR,
- allow_leading_wildcard - are * or ? allowed as the first character? default true,
- lowercase_expanded_terms - should terms of wildcard, prefix, fuzzy, and range queries be automatically lower-cased? (since they are not analyzed); default true,
- boost - boost value of the query; default 1.0,
- minimum_should_match - percent value ("20%") controlling how many "should" clauses in the resulting boolean query should match,
- lenient - if true, format-based failures (such as providing text to a numeric field) are ignored.
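A sketch of how these parameters fit together when the query body is built in Python. Both helpers (query_string and apply_default_operator) are hypothetical names of mine; the second one only mimics the "capital OR of OR Hungary" translation described above for a plain list of words and is not the real query parser.

```python
import json


def query_string(query, **params):
    """Build a query_string body; params are passed through unchanged."""
    body = {"query": query}
    body.update(params)
    return {"query_string": body}


def apply_default_operator(text, operator="OR"):
    """Toy illustration: join bare terms with the default operator."""
    return f" {operator} ".join(text.split())


body = query_string(
    "capital of Hungary",
    fields=["content", "name^5"],  # name matches count five times as much
    default_operator="AND",
    lenient=True,
)
print(json.dumps(body, indent=2))
print(apply_default_operator("capital of Hungary"))
```

The last line prints the OR-joined form of the three terms, which is what the default operator OR amounts to for this input.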
range
Matches documents whose field value falls within the provided range. For string fields, the TermRangeQuery is used, while for number/date fields, the query is a NumericRangeQuery.
{
"range" : {
"age" : {
"from" : 10,
"to" : 20,
"include_lower" : true,
"include_upper": false,
"boost" : 2.0
}
}
}
You can also use the following abbreviations:
- gt = from + include_lower=false,
- gte = from + include_lower=true,
- lt = to + include_upper=false,
- lte = to + include_upper=true.
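The mapping between the abbreviations and the long form can be written down as a small Python function. This is just a sketch of the table above; the name expand_range is made up, and ES itself accepts the abbreviations directly, so no such conversion is needed in practice.

```python
def expand_range(bounds):
    """Rewrite gt/gte/lt/lte into from/to + include_* (hypothetical helper)."""
    shorthand = {
        "gt": ("from", "include_lower", False),
        "gte": ("from", "include_lower", True),
        "lt": ("to", "include_upper", False),
        "lte": ("to", "include_upper", True),
    }
    expanded = {}
    for key, value in bounds.items():
        if key in shorthand:
            bound, flag, inclusive = shorthand[key]
            expanded[bound] = value
            expanded[flag] = inclusive
        else:
            # Other keys (e.g. boost) are copied through unchanged.
            expanded[key] = value
    return expanded


query = {"range": {"age": expand_range({"gte": 10, "lt": 20, "boost": 2.0})}}
print(query)
```

For this input the result has the same content as the long-form range example above.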