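Common Elasticsearch (ES) commands
All commands in this article are written for the Kibana console; so far, every command tried has run successfully.
CRUD in the Kibana console
PUT is similar to INSERT in SQL
DELETE is similar to DELETE in SQL
POST is similar to UPDATE in SQL
GET is similar to SELECT in SQL
Basic commands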
Check cluster health
GET _cat/health
List all indices in ES
GET /_cat/indices?v
GET _all
Delete the index named eg_index
DELETE /eg_index
Some ES settings
Set the maximum number of records ES can return (the upper bound on from + size)
PUT /ecommerce/_settings
{"index":{"max_result_window":"50000000"}}
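The index.max_result_window setting caps from + size for a search request (the Elasticsearch default is 10000). As a rough illustration, the same check can be mirrored on the client side before a request is sent. The helper name fits_result_window is hypothetical and not part of any Elasticsearch client.

```python
def fits_result_window(from_: int, size: int, max_result_window: int = 10000) -> bool:
    """Return True if a from/size page stays inside the result window.

    Assumption: mirrors the server-side rule from + size <= max_result_window.
    """
    if from_ < 0 or size < 0:
        raise ValueError("from and size must be non-negative")
    return from_ + size <= max_result_window


print(fits_result_window(0, 10))                 # first page of 10 hits
print(fits_result_window(9990, 20))              # crosses the default limit
print(fits_result_window(9990, 20, 50_000_000))  # limit raised as in the PUT above
```

Raising the window makes deep pages possible, but each deep page still has to be collected and sorted by the cluster, so a large value is best treated as an escape hatch.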
View an index's mapping
GET ecommerce/_mapping
CRUD operations in ES
Insert data
Use the form
PUT /index/type/id
PUT /ecommerce/product/1
{"name":"zhangsan",
"customer_full_name":{"firstname":"zhang","lastname":"san"},
"gender":"man"}
PUT /ecommerce/product/2
{"name":"隔壁老王",
"customer_full_name":{"firstname":"wang","lastname":"wu"},
"gender":"man"}# 也可以使用 POST
POST /ecommerce/product/1
{"name":"张三",
"gender":"男"}# 如果使用 POST+update 方式,则只会更改对应的字段,其它字段不变,是局部更新;否则使用put或post方式将导致其它数据变化,数据全局更新。
POST /ecommerce/product/1/_update
{"doc":{"name":"张三"}}
Note: when inserting data, if the index and type named in the request do not yet exist in ES, they are created automatically by default.
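To make the difference between a full replacement (PUT or POST on the document) and a partial update (POST .../_update) concrete, here is a small local model that works on plain Python dicts instead of a real cluster. The function names are hypothetical; the recursive merge is a simplified stand-in for how a partial document is merged into the stored one.

```python
import copy


def full_replace(stored: dict, new_doc: dict) -> dict:
    """Model PUT /index/type/id: the new body replaces the stored document."""
    return copy.deepcopy(new_doc)


def partial_update(stored: dict, partial: dict) -> dict:
    """Model POST /index/type/id/_update with {"doc": partial}.

    Fields in `partial` overwrite or extend the stored document; nested
    objects are merged recursively and all other fields are kept.
    """
    merged = copy.deepcopy(stored)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = partial_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


stored = {
    "name": "zhangsan",
    "customer_full_name": {"firstname": "zhang", "lastname": "san"},
    "gender": "man",
}

replaced = full_replace(stored, {"name": "张三", "gender": "男"})
updated = partial_update(stored, {"name": "张三"})

print(sorted(replaced))  # customer_full_name is gone after a full replacement
print(sorted(updated))   # all three fields survive a partial update
```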
Delete data
DELETE /ecommerce/product/1
# Result
{"_index":"ecommerce",
"_type":"product",
"_id":"1",
"_version":11,
"result":"deleted",
"_shards":{"total":2,
"successful":1,
"failed":0},
"_seq_no":11,
"_primary_term":2}# 发现version不是1,这就说明跟hbase是类似的,不会立刻删除,会在合适的时机进行删除。
查看所有数据
GET /ecommerce/product/_search
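The query DSL
ES is used mainly for search and analytics, so the DSL matters a great deal.
The requests below are all written in RESTful style.
Query DSL: DSL stands for Domain Specific Language, a language designed for one particular domain.
Before running any queries, let's insert some data: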
POST /ecommerce/product/13
{"base_price" : 24.99,"discount_percentage" : 0,"quantity" : 1,"manufacturer" : "Champion Arts","tax_amount" : 0,"product_id" : 11238,"category" : "Women's Clothing","sku" : "ZO0489604896","taxless_price" : 24.99,"unit_discount_amount" : 0,"min_price" : 11.75,"discount_amount" : 0,"created_on" : "2016-12-25T21:59:02+00:00","product_name" : "Denim dress - black denim","price" : 24.99,"taxful_price" : 24.99,"base_unit_price" : 24.99
}
PUT /ecommerce/product/14
{"base_price" : 11.99,"discount_percentage" : 0,"quantity" : 1,"manufacturer" : "Elitelligence","tax_amount" : 0,"product_id" : 6283,"category" : "Men's Clothing","sku" : "ZO0549605496","taxless_price" : 11.99,"unit_discount_amount" : 0,"min_price" : 6.35,"discount_amount" : 0,"created_on" : "2016-12-26T09:28:48+00:00","product_name" : "Basic T-shirt - dark blue/white","price" : 11.99,"taxful_price" : 11.99,"base_unit_price" : 11.99
}
PUT /ecommerce/product/17
{"base_price" : 24.99,"discount_percentage" : 0,"quantity" : 1,"manufacturer" : "Champion Arts","tax_amount" : 0,"product_id" : 11238,"category" : "Women's Clothing","sku" : "ZO0489604896","taxless_price" : 24.99,"unit_discount_amount" : 0,"min_price" : 11.75,"discount_amount" : 0,"created_on" : "2016-12-25T21:59:02+00:00","product_name" : "Denim dress - black denim","price" : 24.99,"taxful_price" : 24.99,"base_unit_price" : 24.99
}
PUT /ecommerce/product/15
{"base_price" : 99.99,"discount_percentage" : 0,"quantity" : 1,"manufacturer" : "Low Tide Media","tax_amount" : 0,"product_id" : 22794,"category" : "Women's Shoes","sku" : "ZO0374603746","taxless_price" : 99.99,"unit_discount_amount" : 0,"min_price" : 46.01,"discount_amount" : 0,"created_on" : "2016-12-25T22:32:10+00:00","product_name" : "Boots - Midnight Blue","price" : 99.99,"taxful_price" : 99.99,"base_unit_price" : 99.99
}
PUT /ecommerce/product/16
{"base_price" : 74.99,"discount_percentage" : 0,"quantity" : 1,"manufacturer" : "Primemaster","tax_amount" : 0,"product_id" : 12304,"category" : "Women's Shoes","sku" : "ZO0360303603","taxless_price" : 74.99,"unit_discount_amount" : 0,"min_price" : 34.5,"discount_amount" : 0,"created_on" : "2016-12-25T22:58:05+00:00","product_name" : "High heeled sandals - argento","price" : 74.99,"taxful_price" : 74.99,"base_unit_price" : 74.99
}
match_all
match_all matches every document; it is the default query when no query condition is given.
GET /ecommerce/product/_search
{"query":{"match_all": {}}}
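match (full-text search)
The match query is the standard query: whether you need a full-text search or an exact-value lookup, you will usually reach for it.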
GET /ecommerce/product/_search
{"query":{"match":{"category":"Clothing"}}}
match_phrase (exact phrase matching)
GET /ecommerce/product/_search
{"query":{"match_phrase":{"category":"Women's Clothing"}}}
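With match, if the search text consists of several words, it is split into terms (analyzed), each term is searched independently, and the results are combined as a union. E.g.:
"match": {"category":"Women's Clothing"}
# Analyzed first: Women's and Clothing are searched as two separate terms, and the union of the results is returned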
sort
Let's sort by price. Because sort is not part of the query clause, it is written after it as a sibling key, separated by a comma.
GET /ecommerce/product/_search
{"query":{"match":{"category":"Clothing"}},
"sort":[{"price":{"order":"desc"}}]}#解析"order":"desc"#按价格倒序排序
size 分页
GET /ecommerce/product/_search
{"query":{"match":{"category":"Clothing"}},
"sort":[{"price":{"order":"desc"}}],
"from":0,
"size":2
}
# Explanation:
# "from":0  offset of the first hit to return
# "size":2  number of hits per page
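Since from is an offset rather than a page number, a page number has to be converted before the request is built. A minimal sketch, assuming 1-based page numbers; page_body is a hypothetical helper that only assembles the request body and does not talk to a cluster.

```python
import json


def page_body(query: dict, page: int, page_size: int) -> dict:
    """Build a search body for a 1-based page number (hypothetical helper)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {
        "query": query,
        "sort": [{"price": {"order": "desc"}}],
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,               # number of hits to return
    }


query = {"match": {"category": "Clothing"}}
print(json.dumps(page_body(query, page=1, page_size=2)))
print(json.dumps(page_body(query, page=3, page_size=2)))
```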
_source (return only selected fields)
Often we do not need the whole document; a few fields are enough.
GET /ecommerce/product/_search
{"query":{"match_all":{}},
"sort":[{"price":{"order":"desc"}}],
"_source":["price","min_price","base_price"]}#解析"_source":["price","min_price","base_price"]:#只显示价格相关数据
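Conditional queries
E.g.: find products whose name contains Women's Clothing and whose price is greater than 20 and less than 50.
This is equivalent to: select * from product where name like '%Women''s Clothing%' and price > 20 and price < 50;
Because there are two conditions, we need the query form shown below.
To combine several query conditions, use bool.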
bool
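The bool query combines the results of multiple query clauses with Boolean logic. It provides the following operators:
must: every clause must match; equivalent to and.
must_not: no clause may match; equivalent to not.
should: at least one clause must match; equivalent to or.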
GET /ecommerce/product/_search
{"query":{"bool":{"must":[{"match":{"category":"Women's Shoes"}}],
"filter":[{"range":{"price":{"gte":20,
"lte":50}}}]}}}#先查询(条件must),再过滤
Highlighting results
# Highlighting
GET /ecommerce/product/_search
{"query":{"match_phrase":{"category":"Women's Clothing"}},
"highlight":{"fields":{"category":{}}}}"hits":[{"_index":"ecommerce",
"_type":"product",
"_id":"3",
"_score":0.73748296,
"_source":{"base_price":24.99,
"discount_percentage":0,
"quantity":1,
"manufacturer":"Champion Arts",
"tax_amount":0,
"product_id":11238,
"category":"Women's Clothing",
"sku":"ZO0489604896",
"taxless_price":24.99,
"unit_discount_amount":0,
"min_price":11.75,
"discount_amount":0,
"created_on":"2016-12-25T21:59:02+00:00",
"product_name":"Denim dress - black denim",
"price":24.99,
"taxful_price":24.99,
"base_unit_price":24.99},
"highlight":{"category":["<em>Women's</em> <em>Clothing</em>"]}}]#输出结果带:"highlight":{"category":["<em>Women's</em> <em>Clothing</em>"]}<em>Women's</em> <em>Clothing</em> 这个标签是默认的标签,是可以自定义的进行替换的,比如我们可以替换成
<span style="color:red">Women's Clothing</span>,把这个输出到网页上,自然而然就是红色的了。
#注:高亮展示好像只能展示match里的关键词,比如此处只能高亮 Women's Clothing
Women’s Clothing
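Custom tags are configured with the pre_tags and post_tags options of the highlight section. The sketch below only builds and prints such a request body; the red span is taken from the example above, and each matched term would be wrapped in its own pair of tags.

```python
import json

# Search body with custom highlight tags instead of the default <em>...</em>.
body = {
    "query": {"match_phrase": {"category": "Women's Clothing"}},
    "highlight": {
        "pre_tags": ['<span style="color:red">'],
        "post_tags": ["</span>"],
        "fields": {"category": {}},
    },
}

print(json.dumps(body, indent=2))
```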
Aggregations
Count the number of products in each category
GET /ecommerce/_search
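{"aggs":{"group_by_category":{"terms":{"field":"category.keyword"}}}}
# Group by category and count the products in each category
group_by_category: the name of this aggregation; pick any name you like
field: be sure to append `.keyword`. category is a text field; text fields are analyzed and have fielddata disabled by default, so they cannot be aggregated on directly. The keyword sub-field keeps the original, unanalyzed value, so `category.keyword` can be used for aggregation.
# The aggregation result appears at the bottom of the search response
"aggregations":{"group_by_category":{"doc_count_error_upper_bound":0,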
"sum_other_doc_count":0,
"buckets":[{"key":"Women's Clothing",
"doc_count":11},
{"key":"Women's Shoes",
"doc_count":4},
{"key":"Men's Clothing",
"doc_count":1},
{"key":"Men's Shoes",
"doc_count":1}]}}# 这个结果一搬是符合我们的业务预期的#如果要想直接对 category 进行聚合,也可以将 "fielddata" 设置为 true (一般不推荐,因为设置之后,category如果又几个单词组成,也会被分词)#设置方法为:
PUT /ecommerce/_mapping
{"properties":{"category" :{"type":"text",
"fielddata":true
}}}
# Once this is set, the grouped aggregation can be run
GET /ecommerce/product/_search
{"aggs":{"group_by_category":{"terms":{"field":"category"}}}}
# Result:
"aggregations":{"group_by_category":{"doc_count_error_upper_bound":0,
"sum_other_doc_count":0,
"buckets":[{"key":"women's",
"doc_count":15},
{"key":"clothing",
"doc_count":12},
{"key":"shoes",
"doc_count":5},
{"key":"men's",
"doc_count":2}]}}# 很明显,不符合我们一般的业务需求了
Count the products in each category and compute each category's average price
GET /ecommerce/_search
{"aggs":{"group_by_category":{"terms":{"field":"category.keyword"},
"aggs":{"avg_price":{"avg":{"field":"price"}}}}}}
range
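range lets us select a batch of documents by specified ranges.
E.g.: from ecommerce, take the documents whose category matches Women's Clothing, group them into the given price ranges, group each range by category, compute the average price of each group, and sort the groups by that average in descending order.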
GET /ecommerce/product/_search
{
  "query": {
    "match": {
      "category": "Women's Clothing"
    }
  },
  "aggs": {
    "range_in_price": {
      "range": {
        "field": "price",
        "ranges": [
          {"from": 0, "to": 30},
          {"from": 30, "to": 50},
          {"from": 50, "to": 100}
        ]
      },
      "aggs": {
        "group_in_category": {
          "terms": {
            "field": "category.keyword",
            "order": {
              "avg_price": "desc"
            }
          },
          "aggs": {
            "avg_price": {
              "avg": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}
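One detail of the range aggregation is easy to miss: each range includes its from value and excludes its to value, so a price of exactly 30 lands in the 30 to 50 bucket. A small local emulation, assuming made-up prices (the value 30 is added only to show the boundary) and a hypothetical helper name:

```python
def range_buckets(prices: list, ranges: list) -> list:
    """Bucket prices into half-open ranges: from <= price < to."""
    buckets = []
    for lower, upper in ranges:
        members = [p for p in prices if lower <= p < upper]
        buckets.append({
            "key": f"{float(lower)}-{float(upper)}",
            "from": lower,
            "to": upper,
            "doc_count": len(members),
        })
    return buckets


prices = [24.99, 24.99, 30, 74.99, 99.99]  # 30 is a hypothetical boundary case
for bucket in range_buckets(prices, [(0, 30), (30, 50), (50, 100)]):
    print(bucket["key"], bucket["doc_count"])
```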
Understanding ES search results
GET /ecommerce/product/_search
{"took":0,
"timed_out": false,
"_shards":{"total":1,
"successful":1,
"skipped":0,
"failed":0},
"hits":{"total":{"value":1,
"relation":"eq"},
"max_score":1.0,
"hits":[{"_index":"ecommerce",
"_type":"product",
"_id":"1",
"_score":1.0,
"_source":{"name":"zhangsan",
"customer_full_name":{"firstname":"zhang",
"lastname":"san"},
"gender":"man"}}]}}
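took: the time Elasticsearch spent executing the search, in milliseconds.
timed_out: indicates whether the search timed out.
_shards: how many shards were searched, with counts of the shards that succeeded and failed.
hits: the actual search result set.
hits.total: an object holding information about the total number of documents that match the search criteria.
hits.total.value: the total hit count (it must be interpreted in the context of hits.total.relation).
hits.total.relation: by default, hits.total.value is not guaranteed to be exact. When hits.total.relation is eq, hits.total.value is the exact count; when it is gte, hits.total.value is only a lower bound on the true count.
hits.hits: the array holding the actual search results (the first 10 documents by default).
hits.sort: the sort key of each result (if the request does not specify a sort, results are sorted by score by default).
hits.total explained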
If we add track_total_hits to the request and set it to true, the response reports the exact number of documents that match the query.
# Request body:
body ={"track_total_hits":"true",
"query":{}}
# Response
"total":{"value":1,
"relation":"eq"},
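How to set track_total_hits
So how should track_total_hits be set? That depends on the business requirements and the use case. Three guidelines apply:
- Keep the default of 10000. This is enough for most business needs: even on large e-commerce sites such as Taobao or JD.com, at 40 results per page, 10000 results fill 250 pages. Hardly any user will browse past page 250; most only look at the first 10 pages.
- If you need the exact number of matching documents, set track_total_hits to true. Be aware that when the number of matching documents is large, this hurts query performance and consumes a lot of memory, with a risk of out-of-memory errors.
- If you are certain that you do not need the hit count, set track_total_hits to false, which improves query performance.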
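The three guidelines map onto three variants of the same request body. The helper below is a hypothetical sketch that only builds the body as a Python dict; the mode names are made up for this example.

```python
def with_total_tracking(body: dict, mode: str = "default") -> dict:
    """Return a copy of a search body with track_total_hits set per guideline."""
    result = dict(body)
    if mode == "exact":
        result["track_total_hits"] = True    # count every hit (can be costly)
    elif mode == "none":
        result["track_total_hits"] = False   # skip counting entirely
    elif mode != "default":
        raise ValueError(f"unknown mode: {mode}")
    return result                            # "default": keep the 10000 cap


body = {"query": {"match_all": {}}}
print(with_total_tracking(body))
print(with_total_tracking(body, "exact"))
print(with_total_tracking(body, "none"))
print(10000 // 40)  # pages of 40 results covered by the default cap: 250
```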
Copyright belongs to the original author, IT小鸟鸟. In case of infringement, please contact us and the content will be removed.