ES高级查询
- 2020 年 1 月 21 日
- 筆記
高级查询
范围查询
{ "query": { "range": { "amount": { "gte" :1, "lte":100 } } } }
相当于 amount>=1 and amount<=100
短语查询
{ "query": { "match_phrase": { "desc": "收入" } } }
script查询
script查询 可以对查询出的字段进行再次计算。
GET order/_search { "query": { "match_all": {} }, "script_fields": { "test1": { "script": { "lang": "painless", "source": "doc['amount'].value * 2" } }, "test2": { "script": { "lang": "painless", "params": { "factor": 2 }, "source": "doc['amount'].value + params.factor" } } } }
过滤和查询
区别:
Filter:在查询过程中,Filter只判断该文档是否满足条件,只有YES或者NO。 ES会对它的结果进行缓存,所以相较于Query而言Filter的速度会更快一些。
Query: 除了问YES或NO,还会问匹配的程度。
过滤查询已被弃用,并在ES 5.0中删除。现在使用bool查询代替。
bool 查询是一个组合查询,返回一个bool值。 包含must,should,filter等查询
must:相当于and,必须满足
should:相当于or,代表或者意思
filter:是一个bool表达式。
{ "query": { "bool": { "must": { "match":{"desc": "收入"} }, "should": { "match":{"name": "user"} }, "filter":{ "range":{ "amount":{ "gte" :10, "lte":50 } } } } } }
相当于mysql中的 1=1 and ((desc like ‘%收入%’ and amount>=10 and amount<=50>) or name =’user’)
聚合
在mysql中,聚合用group by,对于聚合后的计算用sum,avg等聚合函数计算,在es中,groupby 后面的字段称为桶,sum等聚合函数称为指标。 如:
select sex,sum(age) from user group by sex
上面的sql中,sex和sum都是查询的指标,sex是桶。
聚合的写法:
{ "size": 0, //不显示原来数据 "aggs": { "buckets": { "terms": { "field": "orderId", //需要聚合的字段,桶 "order":{"sumAmount" : "asc" }//按查询结果排序 }, "aggs": { //聚合后需要怎么处理字段,成为指标 "sumAmount": { // 字段 "sum": { "field": "amount" } } } } } }
对于聚合来说,es中的聚合函数
有,sum
,avg
,stats
,max
,min
等,聚合方式可以归为以下几类:
单字段单指标聚合
单字段多指标聚合
多字段单指标聚合
聚合后筛选
单字段单指标聚合
{ "size": 0, "aggs": { "buckets": { "terms": { "field": "orderId", "order":{"sumAmount.avg" : "asc" } }, "aggs": { "sumAmount" : { "stats" : { "field" : "amount" } } } } } }
单字段多指标聚合
{ "size": 0, "aggs": { "bucket1": { "terms": { "field": "orderId", "order":{"sumAmount" : "asc" } }, "aggs": { "sumAmount": { "sum": { "field": "amount" } }, "avgAmount":{ "avg": { "field": "amount" } } } } } }
多字段单指标聚合
对索引中的两个字段一起聚合
,相当于group by xxx,yyy
{ "size": 0, "aggs": { "bulket1": { "terms": { "field": "orderId" }, "aggs": { "bulket2": { "terms": { "field": "name" }, "aggs": { "sumAmount": { "sum": { "field": "amount" } } } } } } } }
聚合后的筛选:
{ "size": 0, "aggs": { "groupUserId": { "terms": { "field": "shortTime" }, "aggs": { "sumAmount": { "sum": { "field": "amount" } }, "having": { "bucket_selector": { "buckets_path": { "orderCount": "_count", "sumAmount": "sumAmount" }, "script": { "source": "params.sumAmount >= 100 && params.orderCount >=2" } } } } } } }
ES中有一个区别于传统DB的聚合方式,对索引中的两个字段分别聚合
,相当于mysql中group by 'xxx'
, group by 'yyy'
,统计后的结果分布在各个桶里面
{ "size": 0, "aggs": { "bulket1": { "terms": { "field": "shortTime" } }, "bulket2": { "terms": { "field": "name" } } } }
"aggregations" : { "bulket2" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "user", "doc_count" : 4 }, { "key" : "user1", "doc_count" : 2 } ] }, "bulket1" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : 1546905600000, "key_as_string" : "2019-01-08T00:00:00.000Z", "doc_count" : 3 }, { "key" : 1546646400000, "key_as_string" : "2019-01-05T00:00:00.000Z", "doc_count" : 2 }, { "key" : 1546992000000, "key_as_string" : "2019-01-09T00:00:00.000Z", "doc_count" : 1 } ] } }
也可以多个字段各自统计
{ "size": 0, "aggs": { "bucket1": { "terms": { "field": "orderId", "order":{"sumAmount" : "asc" } }, "aggs": { "sumAmount": { "sum": { "field": "amount" } } } }, "bucket2": { "terms": { "field": "name", "order":{"avgAmount" : "asc" } }, "aggs": { "avgAmount": { "sum": { "field": "amount" } } } } } }
聚合后结果如下:
"aggregations" : { "bucket2" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "user1", "doc_count" : 2, "avgAmount" : { "value" : 137.07 } }, { "key" : "user", "doc_count" : 4, "avgAmount" : { "value" : 246.18 } } ] }, "bucket1" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "10000", "doc_count" : 3, "sumAmount" : { "value" : 51.16 } }, { "key" : "10001", "doc_count" : 3, "sumAmount" : { "value" : 332.09000000000003 } } ] } }