ES高级查询

  • 2020 年 1 月 21 日
  • 筆記

高级查询

范围查询  

{      "query": {          "range": {               "amount": {                    "gte" :1,                    "lte":100                }          }      }  }

相当于 amount>=1 and amount<=100

短语查询 

{      "query": {          "match_phrase": {              "desc": "收入"          }      }  }

script查询

script查询 可以对查询出的字段进行再次计算。

GET order/_search  {    "query": {      "match_all": {}    },    "script_fields": {      "test1": {        "script": {          "lang": "painless",          "source": "doc['amount'].value * 2"        }      },      "test2": {        "script": {          "lang": "painless",          "params": {            "factor": 2          },          "source": "doc['amount'].value + params.factor"        }      }    }  }

过滤和查询


区别:

Filter:在查询过程中,Filter只判断该文档是否满足条件,只有YES或者NO。 ES会对它的结果进行缓存,所以相较于Query而言Filter的速度会更快一些。

Query: 除了问YES或NO,还会问匹配的程度。

过滤查询已被弃用,并在ES 5.0中删除。现在使用bool查询代替。

bool 查询是一个组合查询,返回一个bool值。 包含must,should,filter等查询

must:相当于and,必须满足

should:相当于or,代表或者意思  

filter:是一个bool表达式。    

{    "query": {      "bool": {        "must": {          "match":{"desc": "收入"}        },         "should": {          "match":{"name": "user"}        },        "filter":{           "range":{             "amount":{                "gte" :10,                "lte":50             }           }        }      }    }  }

相当于mysql中的 1=1 and ((desc like ‘%收入%’ and amount>=10 and amount<=50>) or name =’user’)

聚合  

在mysql中,聚合用group by,对于聚合后的计算用sum,avg等聚合函数计算,在es中,groupby 后面的字段称为桶,sum等聚合函数称为指标。 如:

select sex,sum(age) from user group by sex 

上面的sql中,sex和sum都是查询的指标,sex是桶。

聚合的写法:

{      "size": 0, //不显示原来数据      "aggs": {          "buckets": {              "terms": {                  "field": "orderId", //需要聚合的字段,桶                  "order":{"sumAmount" : "asc" }//按查询结果排序              },              "aggs": { //聚合后需要怎么处理字段,成为指标                  "sumAmount": { // 字段                      "sum": {                          "field": "amount"                      }                  }              }          }      }  }

对于聚合来说,es中的聚合函数有,sum,avg,stats,max,min等,聚合方式可以归为以下几类:

  1. 单字段单指标聚合
  2. 单字段多指标聚合
  3. 多字段单指标聚合
  4. 聚合后筛选

单字段单指标聚合

{      "size": 0,      "aggs": {          "buckets": {              "terms": {                  "field": "orderId",                  "order":{"sumAmount.avg" : "asc" }              },              "aggs": {                  "sumAmount" : {                    "stats" : { "field" : "amount" }                    }              }          }      }  }

单字段多指标聚合

{      "size": 0,      "aggs": {          "bucket1": {              "terms": {                  "field": "orderId",                  "order":{"sumAmount" : "asc" }              },              "aggs": {                  "sumAmount": {                      "sum": {                          "field": "amount"                      }                  },                  "avgAmount":{                  	 "avg": {                          "field": "amount"                      }                  }              }          }      }  }

多字段单指标聚合

对索引中的两个字段一起聚合,相当于group by xxx,yyy

{    "size": 0,    "aggs": {      "bulket1": {        "terms": {          "field": "orderId"        },        "aggs": {          "bulket2": {            "terms": {              "field": "name"            },            "aggs": {              "sumAmount": {                "sum": {                  "field": "amount"                }              }            }          }        }      }    }  }

聚合后的筛选:

{    "size": 0,    "aggs": {      "groupUserId": {        "terms": {          "field": "shortTime"        },        "aggs": {          "sumAmount": {            "sum": {              "field": "amount"            }          },          "having": {            "bucket_selector": {              "buckets_path": {                "orderCount": "_count",                "sumAmount": "sumAmount"              },              "script": {                "source": "params.sumAmount >= 100 && params.orderCount >=2"              }            }          }        }      }    }  }

ES中有一个区别于传统DB的聚合方式,对索引中的两个字段分别聚合,相当于mysql中group by 'xxx', group by 'yyy',统计后的结果分布在各个桶里面

{      "size": 0,      "aggs": {          "bulket1": {              "terms": {                  "field": "shortTime"              }          },           "bulket2": {              "terms": {                  "field": "name"              }          }      }  }
"aggregations" : {      "bulket2" : {        "doc_count_error_upper_bound" : 0,        "sum_other_doc_count" : 0,        "buckets" : [          {            "key" : "user",            "doc_count" : 4          },          {            "key" : "user1",            "doc_count" : 2          }        ]      },      "bulket1" : {        "doc_count_error_upper_bound" : 0,        "sum_other_doc_count" : 0,        "buckets" : [          {            "key" : 1546905600000,            "key_as_string" : "2019-01-08T00:00:00.000Z",            "doc_count" : 3          },          {            "key" : 1546646400000,            "key_as_string" : "2019-01-05T00:00:00.000Z",            "doc_count" : 2          },          {            "key" : 1546992000000,            "key_as_string" : "2019-01-09T00:00:00.000Z",            "doc_count" : 1          }        ]      }    }

也可以多个字段各自统计

{      "size": 0,      "aggs": {          "bucket1": {              "terms": {                  "field": "orderId",                  "order":{"sumAmount" : "asc" }              },              "aggs": {                  "sumAmount": {                      "sum": {                          "field": "amount"                      }                  }              }          },          "bucket2": {              "terms": {                  "field": "name",                  "order":{"avgAmount" : "asc" }              },              "aggs": {                  "avgAmount": {                      "sum": {                          "field": "amount"                      }                  }              }          }      }  }

聚合后结果如下:

"aggregations" : {      "bucket2" : {        "doc_count_error_upper_bound" : 0,        "sum_other_doc_count" : 0,        "buckets" : [          {            "key" : "user1",            "doc_count" : 2,            "avgAmount" : {              "value" : 137.07            }          },          {            "key" : "user",            "doc_count" : 4,            "avgAmount" : {              "value" : 246.18            }          }        ]      },      "bucket1" : {        "doc_count_error_upper_bound" : 0,        "sum_other_doc_count" : 0,        "buckets" : [          {            "key" : "10000",            "doc_count" : 3,            "sumAmount" : {              "value" : 51.16            }          },          {            "key" : "10001",            "doc_count" : 3,            "sumAmount" : {              "value" : 332.09000000000003            }          }        ]      }    }