es的查询、排序查询、分页查询、布尔查询、查询结果过滤、高亮查询、聚合函数、python操作es

  • 2022 年 5 月 27 日
  • 筆記

今日内容概要

  • es的查询
  • Elasticsearch之排序查询
  • Elasticsearch之分页查询
  • Elasticsearch之布尔查询
  • Elasticsearch之查询结果过滤
  • Elasticsearch之高亮查询
  • Elasticsearch之聚合函数
  • Python操作es

内容详细

1、es的查询

1.1 准备数据

# 准备数据
PUT lqz/_doc/1
{
  "name":"顾老二",
  "age":30,
  "from": "gu",
  "desc": "皮肤黑、武器长、性格直",
  "tags": ["黑", "长", "直"]
}

PUT lqz/_doc/2
{
  "name":"大娘子",
  "age":18,
  "from":"sheng",
  "desc":"肤白貌美,娇憨可爱",
  "tags":["白", "富","美"]
}

PUT lqz/_doc/3
{
  "name":"龙套偏房",
  "age":22,
  "from":"gu",
  "desc":"mmp,没怎么看,不知道怎么形容",
  "tags":["造数据", "真","难"]
}

PUT lqz/_doc/4
{
  "name":"石头",
  "age":29,
  "from":"gu",
  "desc":"粗中有细,狐假虎威",
  "tags":["粗", "大","猛"]
}

PUT lqz/_doc/5
{
  "name":"魏行首",
  "age":25,
  "from":"广云台",
  "desc":"仿佛兮若轻云之蔽月,飘飘兮若流风之回雪,mmp,最后竟然没有嫁给顾老二!",
  "tags":["闭月","羞花"]
}

1.2 match和term

match 和 term 中必须要加条件,但是我们有时候需要查询所有,不带条件,需要用到 match_all

1.3 match_all

# 查询所有
GET lqz/_search
{
  "query": {
    "match_all": {}
  }
}

1.4 前缀查询match_phrase_prefix

# 查英文   beautiful   --->be开头的---》能查到

GET lqz/_search
{
  "query": {
    "match_phrase_prefix": {
      "name": "顾"
    }
  }
}

1.5 match_phrase

# 会分词,分词完成后,如果写了slop,会按分词之间间隔是slop数字去抽

GET t1/doc/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "中国世界",
        "slop": 2
      }
    }
  }
}

1.6 多条件查询,或的关系

# 只要name或者desc中带龙套就查出来

GET lqz/_search  
{
  "query": {
    "multi_match": {
      "query": "龙套",
      "fields": ["name", "desc"]
    }
  }
}

2、Elasticsearch之排序查询

# 结构化查询 
GET lqz/_search
{
  "query": {
    "match": {
      "name": ""
    }
  }
}


# 排序查询---》可以按多个排序条件-->sort的列表中继续加
GET lqz/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "sort": [
    {
      "age": {
        "order": "asc"
      }
    }
  ]
}


# 不是所有字段都可以排序--->只能是数字字段

3、Elasticsearch之分页查询

# from 和 size

GET lqz/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ], 
  "from": 3,
  "size": 2
}

4、Elasticsearch之布尔查询

#   must(and)  should(or)  must_not(not)    filter

# must条件  and条件
GET lqz/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "from": "sheng"
          }
        },
        {
          "match": {
            "age": 18
          }
        }
      ]
    }
  }
}


"""
# 咱们认为的and查询,但是实际不行
GET lqz/_search
{
  "query": {
    "match": {
      "from": "sheng",
      "age":18
    }
  }
}

# 查询课程标题或者课程介绍中带python的数据
"""


# shoud   or 条件---》搜索框,就是用它
GET lqz/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "from": "sheng"
          }
        },
        {
          "match": {
            "age": 22
          }
        }
      ]
    }
  }
}



# must_not  既不是也不是
GET lqz/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "from": "gu"
          }
        },
        {
          "match": {
            "tags": "可爱"
          }
        },
        {
          "match": {
            "age": 18
          }
        }
      ]
    }
  }
}



# 查询 from为gu,age大于25的数据怎么查 filter  /gt> lt< lte=
GET lqz/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "from": "gu"
          }
        }
      ],
      "filter": {
        "range": {
          "age": {
            "lt": 25
          }
        }
      }
    }
  }
}

5、Elasticsearch之查询结果过滤

# 只查某几个字段

GET lqz/_search
{
  "query": {
    "match": {
      "name": "顾老二"
    }
  },
  "_source": ["name", "age"]
}

6、Elasticsearch之高亮查询

# 默认高亮
GET lqz/_search
{
  "query": {
    "match": {
      "name": "石头"
    }
  },
  "highlight": {
    "fields": {
      "name": {}
    }
  }
}


# 自定义高亮
GET lqz/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "highlight": {
    "pre_tags": "<b class='key' style='color:red'>",
    "post_tags": "</b>",
    "fields": {
      "from": {}
    }
  }
}

7、Elasticsearch之聚合函数

# avg   max   min   sum

# avg 求平均年龄
GET lqz/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "aggs": {
    "my_avg": {
      "avg": {
        "field": "age"
      }
    }
  },
  "_source": ["name", "age"]
}

8、Python操作es

#  官方提供了python操作es包,基础包--》建议你用---》类似于pymysql
	pip install elasticsearch


'''
# 如果官方没提供,使用requests模块,发送http请求即可
# 例如:
PUT lqz/_doc/2
{
  "name":"大娘子",
  "age":18,
  "from":"sheng",
  "desc":"肤白貌美,娇憨可爱",
  "tags":["白", "富","美"]
}

# 用python操作
import requests

data = {
    "name": "大娘子",
    "age": 18,
    "from": "sheng",
    "desc": "肤白貌美,娇憨可爱",
    "tags": ["白", "富", "美"]
}
res=requests.put('//127.0.0.1:9200/lqz/_doc/6', json=data)
print(res.text)
'''

### 使用官方的包:
from elasticsearch import Elasticsearch

# Instantiate a client instance
client = Elasticsearch("//localhost:9200")

# Call an API, in this example `info()`
# resp = client.info()
# print(resp)



# 创建索引(Index)
# result = client.indices.create(index='user')
# print(result)
# 删除索引
# result = client.indices.delete(index='user')


# 插入数据
# data = {'userid': '1', 'username': 'lqz','password':'123'}
# result = client.create(index='news', doc_type='_doc', id=1, body=data)
# print(result)


# 更新数据
'''
不用doc包裹会报错
ActionRequestValidationException[Validation Failed: 1: script or doc is missing
'''
# data ={'doc':{'userid': '1', 'username': 'lqz','password':'123ee','test':'test'}}
# result = client.update(index='news', doc_type='_doc', body=data, id=1)
# print(result)


# 删除数据
# result = client.delete(index='news',id=1)


# 查询
# 查找所有文档
# query = {'query': {'match_all': {}}}

#  查找名字叫做jack的所有文档
# query = {'query': {'term': {'username': 'lqz'}}}

# 查找年龄大于11的所有文档
query = {'query': {'range': {'age': {'gt': 28}}}}

allDoc = client.search(index='lqz',  body=query)
# print(allDoc['hits']['hits'][0]['_source'])
print(allDoc)


# python操作es的包---》类似于django的orm包

#  用的最多的是查询

# 写个脚本,把课程表的数据,同步到es中(建立索引---》插入数据)
# High level Python client for Elasticsearch
# //github.com/elastic/elasticsearch-dsl-py
from datetime import datetime
from elasticsearch_dsl import Document, Date, Nested, Boolean,analyzer, InnerDoc, Completion, Keyword, Text,Integer

from elasticsearch_dsl.connections import connections

connections.create_connection(hosts=["localhost"])


class Article(Document):
    title = Text(fields={'title': Keyword()})
    author = Text()

    class Index:
        name = 'myindex'  # 索引名

if __name__ == '__main__':
    # Article.init()  # 创建映射
    # 保存数据
    # article = Article()
    # article.title = "测试测试"
    # article.author = "刘清政"
    # article.save()  # 数据就保存了

    # 查询数据
    # s=Article.search()
    # s = s.filter('match', title="测试")
    # results = s.execute()
    # print(results)

    # 删除数据
    # s = Article.search()
    # s = s.filter('match', title="测试").delete()
    
    # 修改数据
    s = Article().search()
    s = s.filter('match', title="测试")
    results = s.execute()
    print(results[0])
    results[0].title="xxx"
    results[0].save()

# es数据和mysql数据同步
	-方案一:第三方同步脚本,实时同步mysql和es数据,软件运行,配置一下就可以了,后台一直运行
	-//github.com/go-mysql-org/go-mysql-elasticsearch
	-编译成可执行文件,启动就行----》linux上编译,go sdk---》跨平台编译
    
	-方案二:(简历里说---》es和mysql的同步工具)
	-自己用python写---》三个表,pymysql---》es中---》后台一直运行
	-三个表,删除数据呢?公司里的数据都不删,都是软删除
	-三个表增加---》记录每个表同步到的位置id号---》
	-pymysql  es
		-免费课表(id,name,price),实战课表(id,name,price,teacher)
		id_1=0
		id_2=0
	pymyql打开免费课,查询id如果大于id_1,把大于id_1的取出来组装成字典,存入es
	pymyql打开实战课,查询id如果大于id_2,把大于id_2的取出来,存入es
	-同步到es中---》es中只存id,name,price
  
	-方案三:使用celery,只要mysql相应的表里插入一条数据,就使用celery的异步,把这条记录插入到es中去
    
	-方案四:信号---》监控到哪个表发生了变化---》只要xx表发生变化,就插入es