ES 中join的使用
- 2020 年 1 月 21 日
- 筆記
引出问题
在mysql中,可以使用join
来实现表与表之间的数据连接,在es中如何实现这个问题?
相对于mysql来说,es有几个不同的地方
- 不支持跨index的join
- 一个index只能包含一个类型
- 分布式的存储方式,对于数据的搜寻造成障碍
对于上面的几个问题,es的解决方案是**在一个索引下,借助父子关系,实现类似Mysql中多表关联的操作**
定义类型和join索引
PUT myorder { "mappings": { "_doc": { "properties": { "order_join": { "type": "join", "relations": { "order": "suborder" } } } } } }
定义join关系为order_join
,其中order
是父文档,suborder
是子文档。
put myorder/_mapping/_doc { "properties": { "orderId": { "type": "keyword" }, "shortTime": { "type": "date" }, "name": { "type": "keyword" }, "amount": { "type": "double" }, "desc": { "type": "text" } } }
插入主单数据
PUT myorder/_doc/10001 { "shortTime": "2019-01-05", "orderId": "10001", "name": "user2", "amount": 123.09, "desc": "其他收入", "order_join": "order" }
order_join
定义为order
类型
插入子单数据
使用自定义ID用PUT方法
POST myorder/_doc?routing=1 { "shortTime": "2019-01-05", "orderId": "10001", "name": "user2", "amount": 12.09, "desc": "收入", "order_join": { "name": "suborder", "parent":"10001" } }
POST myorder/_doc?routing=1 { "shortTime": "2019-01-05", "orderId": "10002", "name": "user2", "amount": 122.09, "desc": "收入", "order_join": { "name": "suborder", "parent":"10001" } }
建立父子关系索引,
routing
参数是必须的,因为父子文档必须在同一个分片上
查询主单
GET myorder/_search { "query": { "has_child" : { "type" : "suborder", "query" : { "match_all" : { } } } } }
查询子单
GET myorder/_search { "query": { "has_parent" : { "parent_type" : "order", "query" : { "match_all" : { } } } } }
聚合查询
- 主单聚合
GET myorder/_search { "query": { "parent_id": { "type": "suborder", "id": "10001" } }, "aggs": { "parents12312": { "terms": { "field": "order_join#order" }, "aggs": { "sumAmount": { "stats": { "field": "amount" } } } } } }
- 子单聚合
GET myorder/_search { "size": 0, "aggs": { "parent": { "children": { "type": "suborder" }, "aggs": { "sumAmount": { "stats": { "field": "amount" } } } } } }
- 聚合加筛选:
GET myorder/_search { "query": { "has_child" : { "type" : "suborder", "query" : { "match_all" : { } } } }, "aggs": { "parent": { "children": { "type": "suborder" }, "aggs": { "fields": { "terms": { "field": "orderId" }, "aggs": { "sumAmount": { "sum": { "field": "amount" } }, "having": { "bucket_selector": { "buckets_path": { "orderCount": "_count", "sumAmount": "sumAmount" }, "script": { "source": "params.sumAmount >= 100 && params.orderCount >=0" } } } } } } } } }
定义一对多的索引
一对一的索引模型很难满足日常业务的数据处理,es也支持一对多的join
PUT myorder { "mappings": { "_doc": { "properties": { "order_join": { "type": "join", "relations": { "order": ["suborder1", "suborder2"], "suborder2":"suborder3" } } } } } }
上面的索引的关联的关系如下:
order / suborder1 suborder2 suborder3