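Creating an index
Creating an index without a mapping
No mapping is defined when the index is created.
# Create an index without a mapping
# Configure the settings block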
PUT /employee
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
}
}
PUT /employee/_doc/1
{
"name": "夹克",
"age": 30
}
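If no document with the given id exists, PUT creates one;
if one exists, PUT replaces it.
PUT is a full replacement: send every field, changed or not, otherwise any field left out is lost.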
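The full-replacement behavior can be illustrated with a small in-memory model. This is only a sketch: `put_document` and `partial_update` are hypothetical helper names operating on a plain dict, not part of any Elasticsearch client, and the merge function is included only to contrast with the `_update` endpoint covered later in this section.

```python
# Toy in-memory model of the two write styles; not the Elasticsearch client.
def put_document(store, doc_id, body):
    """Full replacement: the stored document becomes exactly `body`."""
    created = doc_id not in store
    store[doc_id] = dict(body)
    return "created" if created else "updated"


def partial_update(store, doc_id, partial):
    """Merge-style update: only the listed fields change."""
    store[doc_id] = {**store[doc_id], **partial}
    return "updated"


store = {}
first = put_document(store, "1", {"name": "夹克", "age": 30})
second = put_document(store, "1", {"name": "夹克1"})  # age is omitted here
after_put = dict(store["1"])  # age has been dropped by the replacement

store["2"] = {"name": "肉丝", "age": 30}
partial_update(store, "2", {"name": "肉丝1"})
after_update = dict(store["2"])  # age survives the merge
print(first, second, after_put, after_update)
```

The second `put_document` call drops `age` because the whole document is replaced, whereas the merge keeps it.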
Result:

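Since ES 7 the _type attribute is deprecated; _doc is used as a placeholder.
As the figure below shows, ES infers the mapping automatically from the inserted values.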

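The _create endpoint forces a create: if the document already exists, the request fails instead of updating it.
# Force a create; fails if the document already exists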
POST /employee/_create/1
{
"name": "123",
"age": 30
}
Result:

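Creating an index with an explicit mapping
# Create an index with an explicit mapping
# (if the employee index from the previous example still exists, delete it first, or this request fails)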
PUT /employee
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"properties": {
"name": { "type": "text" },
"age": { "type": "integer" }
}
}
}
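Result:

The mapping of employee now matches the one we designed.

If an inserted value does not conflict with the existing mapping, ES infers mappings for any new fields; otherwise the request is rejected, for example with a number_format_exception when a non-numeric string is written to the integer field age.
Updating documents
Updating specific fields
# Update specific fields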
POST /employee/_update/1
{
"doc": {
"name": "夹克1"
}
}
Result:

Deleting an index
# Delete the index
DELETE /employee
# Delete a single document
DELETE /employee/_doc/1
Simple queries
Fetching a single document
# Fetch a single document
GET /employee/_doc/1
Result:

Fetching all documents
# Fetch all documents
GET /employee/_search
# Match all documents without any condition
GET /employee/_search
{
"query": {
"match_all": {}
}
}
Result:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "employee",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "肉丝",
"age" : 30
}
},
{
"_index" : "employee",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "夹克",
"age" : 30
}
}
]
}
}
Paginated queries
# Paginated query
GET /employee/_search
{
"query": {
"match_all": {}
},
"from": 0,
"size": 1
}
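from is the offset of the first hit to return (0 starts at the first hit); size is the number of hits per page. For page p (counting from 1), use from = (p - 1) * size.
By default hits are sorted by _score in descending order, not by id; with match_all every document scores 1.0, so the order among them is not guaranteed.
Result: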
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "employee",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "肉丝",
"age" : 30
}
}
]
}
}
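The from/size arithmetic can be sketched in a few lines. This is an illustrative model on a plain list, assuming 1-based page numbers; `page_to_from` and `paginate` are hypothetical helper names, not Elasticsearch APIs.

```python
def page_to_from(page, size):
    """Convert a 1-based page number into the `from` offset."""
    if page < 1 or size < 0:
        raise ValueError("page must be >= 1 and size must be >= 0")
    return (page - 1) * size


def paginate(hits, from_, size):
    """Return the slice of hits that a from/size window would select."""
    return hits[from_:from_ + size]


# Document ids in the order returned by the match_all result above.
hits = ["2", "1"]
page1 = paginate(hits, page_to_from(1, 1), 1)
page2 = paginate(hits, page_to_from(2, 1), 1)
print(page1, page2)
```

With size 1, page 1 maps to from 0 and selects the document with id 2, which agrees with the response shown above.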
Complex queries
Querying by keyword
# Query by keyword
GET /employee/_search
{
"query": {
"match": {
"name": "肉丝"
}
}
}
match: the query text is analyzed (tokenized) before matching
term: the query text is not analyzed
Result:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.3862944,
"hits" : [
{
"_index" : "employee",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.3862944,
"_source" : {
"name" : "肉丝",
"age" : 30
}
}
]
}
}
The default ES analyzer emits every Chinese character as its own token, so any query containing "肉" or "丝" matches the "肉丝" document.
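The difference between match and term on Chinese text can be sketched as follows. This is a deliberately simplified stand-in that assumes CJK-only input and one token per character; `tokenize`, `match`, and `term` are hypothetical helpers that ignore scoring and are not the real analyzer.

```python
def tokenize(text):
    """Simplified stand-in for default analysis of CJK-only text: one token per character."""
    return [ch for ch in text if not ch.isspace()]


def match(query, field_value):
    """match: analyze the query, then hit if any query token is indexed (default or)."""
    return bool(set(tokenize(query)) & set(tokenize(field_value)))


def term(query, field_value):
    """term: the query is not analyzed; it must equal one indexed token as-is."""
    return query in tokenize(field_value)


print(match("肉", "肉丝"), match("夹克", "肉丝"))
print(term("肉", "肉丝"), term("肉丝", "肉丝"))
```

Under this model a term query for the whole word "肉丝" finds nothing, because the index only holds the single-character tokens "肉" and "丝".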
Sorting
# With sorting
GET /employee/_search
{
"query": {
"match": { "name": "夹" }
},
"sort": [
{"age": { "order": "desc" }}
]
}
Result:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "employee",
"_type" : "_doc",
"_id" : "3",
"_score" : null,
"_source" : {
"name" : "夹子",
"age" : 31
},
"sort" : [
31
]
},
{
"_index" : "employee",
"_type" : "_doc",
"_id" : "1",
"_score" : null,
"_source" : {
"name" : "夹克",
"age" : 30
},
"sort" : [
30
]
}
]
}
}
Note: _score becomes null because the sort clause replaces relevance ordering with a custom order, so scores are not computed.
Filtering
# With a filter
GET /employee/_search
{
"query": {
"bool": {
"filter": [
{ "term": {"age": 30}}
]
}
}
}
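filter: matching documents are not scored
term: the query value is not analyzed and is looked up in the index directly
match: the query text is analyzed with the analyzer defined on the field, then looked up in the index
Result: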
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "employee",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.0,
"_source" : {
"name" : "夹克",
"age" : 30
}
}
]
}
}
Aggregations
# With an aggregation
GET /employee/_search
{
"query": {
"match": {
"name": "夹"
}
},
"sort": [
{
"age": {
"order": "desc"
}
}
],
"aggs": {
"goroup_by_age": {
"terms": {
"field": "age"
}
}
}
}
goroup_by_age: a user-defined name for the aggregation
Result:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "employee",
"_type" : "_doc",
"_id" : "3",
"_score" : null,
"_source" : {
"name" : "夹子",
"age" : 31
},
"sort" : [
31
]
},
{
"_index" : "employee",
"_type" : "_doc",
"_id" : "1",
"_score" : null,
"_source" : {
"name" : "夹克",
"age" : 30
},
"sort" : [
30
]
},
{
"_index" : "employee",
"_type" : "_doc",
"_id" : "4",
"_score" : null,
"_source" : {
"name" : "夹克2",
"age" : 30
},
"sort" : [
30
]
}
]
},
"aggregations" : {
"goroup_by_age" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 30,
"doc_count" : 2
},
{
"key" : 31,
"doc_count" : 1
}
]
}
}
}
The response to an aggregation request gains an aggregations field.
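What the terms aggregation computes can be approximated with a counter over the matched documents. This is a sketch under assumptions: `terms_aggregation` is a hypothetical helper, and the bucket ordering (document count descending, then key ascending) is chosen here to mirror the response above.

```python
from collections import Counter


def terms_aggregation(docs, field):
    """Group documents by the value of `field` and count each group."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    # Assumed ordering: larger buckets first, ties broken by ascending key.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered]


# The three documents from the hits section of the response above.
docs = [
    {"name": "夹子", "age": 31},
    {"name": "夹克", "age": 30},
    {"name": "夹克2", "age": 30},
]
buckets = terms_aggregation(docs, "age")
print(buckets)
```

The buckets are computed over the documents selected by the query, which is why the counts add up to the three hits.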
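Advanced query syntax
Let us start with a question: why does a search for eat fail to match the document below?
# Index a document (the movie index is created automatically)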
PUT /movie/_doc/1
{
"name": "Eating an apple a day & keeps the doctor away"
}
GET /movie/_search
{
"query": {
"match": {
"name": "eat"
}
}
}
Result:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
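hits is empty: nothing matched.
To see why, we first need to understand how ES works. As the figure below shows, ES matches results against the tokens stored in the index.

This means the indexed tokens do not include eat. We can confirm this with the analyze API.
# Inspect the tokens with the analyze API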
GET /movie/_analyze
{
"field": "name",
"text": "Eating an apple a day & keeps the doctor away"
}
Result:
{
"tokens" : [
{
"token" : "eating",
"start_offset" : 0,
"end_offset" : 6,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "an",
"start_offset" : 7,
"end_offset" : 9,
"type" : "<ALPHANUM>",
"position" : 1
},
{
"token" : "apple",
"start_offset" : 10,
"end_offset" : 15,
"type" : "<ALPHANUM>",
"position" : 2
},
{
"token" : "a",
"start_offset" : 16,
"end_offset" : 17,
"type" : "<ALPHANUM>",
"position" : 3
},
{
"token" : "day",
"start_offset" : 18,
"end_offset" : 21,
"type" : "<ALPHANUM>",
"position" : 4
},
{
"token" : "keeps",
"start_offset" : 24,
"end_offset" : 29,
"type" : "<ALPHANUM>",
"position" : 5
},
{
"token" : "the",
"start_offset" : 30,
"end_offset" : 33,
"type" : "<ALPHANUM>",
"position" : 6
},
{
"token" : "doctor",
"start_offset" : 34,
"end_offset" : 40,
"type" : "<ALPHANUM>",
"position" : 7
},
{
"token" : "away",
"start_offset" : 41,
"end_offset" : 45,
"type" : "<ALPHANUM>",
"position" : 8
}
]
}
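Indeed, eat is not among the tokens. When the default analysis does not meet our needs, we have to configure the analyzer ourselves.
For English text, ES ships with a built-in english analyzer, which can be assigned to a field when the index is created. Because the movie index already exists, delete it first and index the document again after recreating it.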
PUT /movie
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"properties": {
"name": { "type": "text", "analyzer": "english"}
}
}
}
# Inspect the tokens with the analyze API
GET /movie/_analyze
{
"field": "name",
"text": "Eating an apple a day & keeps the doctor away"
}
Result:
{
"tokens" : [
{
"token" : "eat",
"start_offset" : 0,
"end_offset" : 6,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "appl",
"start_offset" : 10,
"end_offset" : 15,
"type" : "<ALPHANUM>",
"position" : 2
},
{
"token" : "dai",
"start_offset" : 18,
"end_offset" : 21,
"type" : "<ALPHANUM>",
"position" : 4
},
{
"token" : "keep",
"start_offset" : 24,
"end_offset" : 29,
"type" : "<ALPHANUM>",
"position" : 5
},
{
"token" : "doctor",
"start_offset" : 34,
"end_offset" : 40,
"type" : "<ALPHANUM>",
"position" : 7
},
{
"token" : "awai",
"start_offset" : 41,
"end_offset" : 45,
"type" : "<ALPHANUM>",
"position" : 8
}
]
}
Searching for eat now matches the document.
GET /movie/_search
{
"query": {
"match": {
"name": "eat"
}
}
}
Result:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "movie",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.2876821,
"_source" : {
"name" : "Eating an apple a day & keeps the doctor away"
}
}
]
}
}
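Custom analyzers
analyze = the process of turning text into tokens
Analysis steps:
- Character filters: preprocess the raw text before it is tokenized
- Tokenizer: the standard tokenizer splits on whitespace and punctuation, which is why "&" did not become a token in the example above
- Token filters: post-process the tokens, for example lowercasing them
To be continued...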
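The three stages listed above can be sketched as a small pipeline. This is a rough approximation for plain ASCII text, not the real analyzer: the function names are hypothetical, the regular expression only imitates splitting on whitespace and punctuation, and the "&" mapping is an invented character filter used purely for illustration.

```python
import re


def char_filter(text):
    """Hypothetical character filter: rewrite '&' before tokenization."""
    return text.replace("&", " and ")


def tokenizer(text):
    """Split on anything that is not a letter or digit (approximation)."""
    return re.findall(r"[A-Za-z0-9]+", text)


def lowercase_filter(tokens):
    """Token filter: lowercase every token."""
    return [token.lower() for token in tokens]


def analyze(text, use_char_filter=False):
    """Run the stages in order: character filter, tokenizer, token filter."""
    if use_char_filter:
        text = char_filter(text)
    return lowercase_filter(tokenizer(text))


sentence = "Eating an apple a day & keeps the doctor away"
tokens = analyze(sentence)
tokens_with_filter = analyze(sentence, use_char_filter=True)
print(tokens)
print(tokens_with_filter)
```

Without the character filter the output matches the token list from the first analyze call above; with it, "&" survives as the token and.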
Combining tokens with and / or
# The match query combines tokens with or by default
GET /movie/_search
{
"query": {
"match": {
"title": "basketball with cartoom aliens"
}
}
}
# With "operator": "and" the tokens are combined with and: every token must match
GET /movie/_search
{
"query": {
"match": {
"title": {
"query": "basketball with cartoom aliens",
"operator": "and"
}
}
}
}
Minimum should match
"minimum_should_match": 2
means that at least two of the tokens must match
# Minimum should match
GET /movie/_search
{
"query": {
"match": {
"title": {
"query": "basketball with cartoom aliens",
"operator": "or",
"minimum_should_match": 2
}
}
}
}
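The three matching rules (or, and, minimum_should_match) can be compared with a small boolean model. This is a sketch that ignores analysis and scoring; the `match` helper and the sample title are hypothetical and only decide whether a document would hit.

```python
def match(query_tokens, doc_tokens, operator="or", minimum_should_match=1):
    """Decide whether a document hits, based on how many query tokens it contains."""
    unique_query = set(query_tokens)
    doc_set = set(doc_tokens)
    matched = sum(1 for token in unique_query if token in doc_set)
    if operator == "and":
        # Every query token must be present in the document.
        return matched == len(unique_query)
    # or: at least `minimum_should_match` query tokens must be present.
    return matched >= minimum_should_match


query = "basketball with cartoom aliens".split()
title = "space jam basketball with aliens".split()  # hypothetical document title

hit_or = match(query, title)
hit_and = match(query, title, operator="and")
hit_min2 = match(query, title, minimum_should_match=2)
hit_min4 = match(query, title, minimum_should_match=4)
print(hit_or, hit_and, hit_min2, hit_min4)
```

Three of the four query tokens occur in the sample title, so the or query and a minimum of 2 both hit, while and and a minimum of 4 do not.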
Phrase queries
# Phrase query
GET /movie/_search
{
"query": {
"match_phrase": {
"title": "steve zissou"
}
}
}
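A phrase query differs from a plain match in that the tokens must appear next to each other and in the same order. A minimal sketch of that check, assuming already-analyzed token lists and no slop; `match_phrase` here is a hypothetical helper and the sample title is made up.

```python
def match_phrase(phrase_tokens, doc_tokens):
    """Hit only if the phrase tokens occur consecutively and in order."""
    n = len(phrase_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


title = "the life aquatic with steve zissou".split()  # hypothetical document title
in_order = match_phrase(["steve", "zissou"], title)
reversed_order = match_phrase(["zissou", "steve"], title)
not_adjacent = match_phrase(["life", "steve"], title)
print(in_order, reversed_order, not_adjacent)
```

Only the first call hits; a plain match query with the default or operator would accept all three token pairs.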
Multi-field queries
# Multi-field query
GET /movie/_search
{
"query": {
"multi_match": {
"query": "basketball with cartoom aliens",
"fields": ["title","overview"]
}
}
}