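This article is a primer. Starting from common operations such as creating an index, defining a mapping, inserting, deleting, updating, and querying documents, pagination, aggregations, and nested queries, it introduces how Elasticsearch is used and gives a rough feel for working with ES.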
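Contents
I. Creating an index
II. Defining the mapping
For the available field mapping types, see the official mapping types reference.
Example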
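As a minimal sketch, a mapping for the goods_for_test_use index used in this article could look like the following. The field types are assumptions inferred from how the fields are used later (goodsId in a terms aggregation, buIds as an array of long, manager as object, managers as nested), not the author's actual mapping, and the helper name create_index_request is hypothetical. The body uses the typeless format of Elasticsearch 7 and later; older versions nest "properties" under a type name.

```python
import json

# Hypothetical mapping: every type below is an assumption made for illustration.
GOODS_MAPPING = {
    "mappings": {
        "properties": {
            "goodsId": {"type": "keyword"},   # exact-value field, usable in terms aggregations
            "goodsName": {"type": "text"},    # analyzed full-text field
            "buIds": {"type": "long"},        # any field may hold an array of its type
            "gmv": {"type": "long"},
            "manager": {                      # plain object: flattened internally
                "properties": {
                    "firstName": {"type": "text"},
                    "secondName": {"type": "text"},
                }
            },
            "managers": {                     # nested: each element kept as its own object
                "type": "nested",
                "properties": {
                    "firstName": {"type": "text"},
                    "secondName": {"type": "text"},
                    "age": {"type": "long"},
                },
            },
        }
    }
}


def create_index_request(index_name, body):
    """Render a console-style request: the PUT line followed by the JSON body."""
    return f"PUT {index_name}\n{json.dumps(body, indent=2)}"


if __name__ == "__main__":
    print(create_index_request("goods_for_test_use", GOODS_MAPPING))
```

The printed text can be pasted into the Kibana console; no array type is declared for buIds because Elasticsearch has no dedicated array type.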
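III. Inserting data
IV. Querying data
The mapping above covers four commonly used data types. The queries for each type are analyzed below.
1. Simple data types, such as goodsId and goodsName in the example above.
2. Querying a single value in a field of type long that is stored as an array, such as buIds in the example above. This mainly corresponds to a List buIds field in a Java object.
3. Queries on the object and nested types (manager and managers in the example above).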
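The first two cases can be sketched as request bodies built in Python. This is only an illustration: the helper names are hypothetical, and it assumes goodsId is an exact-value field while goodsName is analyzed text. A term query on an array field such as buIds matches a document when any element of the array equals the value, so no special array syntax is needed.

```python
import json


def term_query(field, value):
    """Exact-value lookup; on an array field it matches if any element equals value."""
    return {"query": {"term": {field: value}}}


def terms_query(field, values):
    """Matches documents whose field contains at least one of the given values."""
    return {"query": {"terms": {field: list(values)}}}


def match_query(field, text):
    """Full-text lookup on an analyzed field."""
    return {"query": {"match": {field: text}}}


if __name__ == "__main__":
    # Each body would be sent as: GET goods_for_test_use/_search
    for body in (
        term_query("goodsId", "222"),
        match_query("goodsName", "测试商品名称2"),
        term_query("buIds", 2),          # matches documents whose buIds array contains 2
        terms_query("buIds", [2, 9]),    # matches documents containing 2 or 9
    ):
        print(json.dumps(body, ensure_ascii=False))
```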
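V. Updating the index
1. Re-index the document with PUT: specify the document id and write out every field, including the ones that do not change, so that the document is indexed again. A field that is new to the index is added if dynamic mapping is enabled (which is the default). Fields present in the request are updated, and fields left out of the request are dropped from the stored document, because the new body replaces the old one entirely.
2. Update with an update-by-query statement.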
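The second approach might look like the sketch below, which builds the body for POST goods_for_test_use/_update_by_query. The helper name and the choice of a term filter are assumptions made for illustration; the script uses Painless, where ctx._source gives access to the stored document and params carries the new value.

```python
import json


def update_by_query_body(filter_field, filter_value, target_field, new_value):
    """Build a body that sets target_field on every document matching the filter."""
    # The field name is spliced into the script text, so restrict it to a plain identifier.
    if not target_field.isidentifier():
        raise ValueError(f"unsupported field name: {target_field!r}")
    return {
        "query": {"term": {filter_field: filter_value}},
        "script": {
            "lang": "painless",
            "source": f"ctx._source.{target_field} = params.new_value",
            "params": {"new_value": new_value},
        },
    }


if __name__ == "__main__":
    body = update_by_query_body("goodsId", "222", "gmv", 400)
    print("POST goods_for_test_use/_update_by_query")
    print(json.dumps(body, indent=2))
```

Unlike re-indexing with PUT, this only touches the field named in the script and leaves the rest of each document unchanged.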
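VI. Deleting
Deletion is done with a delete-by-query statement. Always include a condition; otherwise every document in the index is deleted. Use it with caution.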
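One way to enforce the "always include a condition" rule is a small guard in the code that builds the request. The sketch below is an assumption about how such a guard could be written (the function name is hypothetical); it builds the body for POST goods_for_test_use/_delete_by_query and refuses an empty or match_all query.

```python
def delete_by_query_body(query):
    """Build a delete-by-query body, rejecting queries that would match everything."""
    if not query:
        raise ValueError("refusing to delete without a condition")
    if "match_all" in query:
        raise ValueError("refusing to delete with match_all")
    return {"query": query}


if __name__ == "__main__":
    # Deletes only the documents whose goodsId is 222.
    print(delete_by_query_body({"term": {"goodsId": "222"}}))
```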
VII. Pagination
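Basic pagination in Elasticsearch uses the from and size parameters of the search body. The sketch below, with a hypothetical helper name, converts a 1-based page number into those two values. It also rejects pages beyond 10000 hits, which is the default value of index.max_result_window; for deeper paging, search_after is the usual alternative.

```python
DEFAULT_MAX_RESULT_WINDOW = 10000  # default of the index.max_result_window setting


def page_request(page, page_size, query=None):
    """Build a search body for the given 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    offset = (page - 1) * page_size
    if offset + page_size > DEFAULT_MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the default result window")
    return {
        "from": offset,
        "size": page_size,
        "query": query if query is not None else {"match_all": {}},
    }


if __name__ == "__main__":
    # Third page of ten hits: skips the first 20 documents.
    print(page_request(3, 10))
```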
VIII. Aggregations
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html
To insert data, assemble a JSON document whose fields follow the data types defined in the mapping, as in the sample code.
## Insert the first document. The 1 in the path is the id of the document within the index; choose it according to your business semantics, for example the goods id.
PUT goods_for_test_use/_doc/1
{
  "goodsId": "222",
  "goodsName": "测试商品名称2",
  "buIds": [1, 2, 3],
  "gmv": 333,
  "manager": [
    { "firstName": "lei", "secondName": "teng" },
    { "firstName": "zhang", "secondName": "san" }
  ],
  "managers": [
    { "firstName": "lei", "secondName": "teng", "age": 30 },
    { "firstName": "zhang", "secondName": "san", "age": 18 },
    { "firstName": "lei", "secondName": "san", "age": 18 }
  ]
}
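Now suppose we want to find the goods records that have a manager with firstName="lei" and secondName="san". If the manager property is of type object, the query matches the record even though no single manager in that field is named lei san.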
## Query on the object-type field
GET goods_for_test_use/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "manager.firstName": "lei" } },
        { "match": { "manager.secondName": "san" } }
      ]
    }
  }
}

## Query result
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.3862944,
    "hits" : [
      {
        "_index" : "goods_for_test_use",
        "_type" : "goods_for_test",
        "_id" : "2",
        "_score" : 1.3862944,
        "_source" : {
          "goodsId" : "goodsId222",
          "goodsName" : "测试商品名称2",
          "buIds" : [ 1, 2, 3 ],
          "manager" : [
            { "firstName" : "lei", "secondName" : "teng" },
            { "fistName" : "zhang", "secondName" : "san" }
          ],
          "managers" : [
            { "firstName" : "lei", "secondName" : "teng", "age" : 30 },
            { "fistName" : "zhang", "secondName" : "san", "age" : 18 },
            { "fistName" : "lei", "secondName" : "san", "age" : 18 }
          ]
        }
      }
    ]
  }
}

This happens because a field of type object is stored inside Elasticsearch roughly as follows:
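{
  "goodsId" : "goodsId222",
  "goodsName" : "测试商品名称2",
  "manager.firstName" : [ "lei", "zhang" ],
  "manager.secondName" : [ "teng", "san" ]
}

Because manager is of type object, its internal structure is flattened into multi-valued fields, much like buIds in the example. The pairing of lei with teng and of zhang with san is lost, so the query finds a manager "lei san" who does not exist, which violates the intended query semantics. To keep each object independent, use the approach taken for managers and define the type as nested. Each element is then stored as an independent object and can be queried on its own. The query is as follows:

GET goods_for_test_use/_search
{
  "query": {
    "nested": {
      "path": "managers",
      "query": {
        "bool": {
          "must": [
            { "match": { "managers.firstName": "lei" } },
            { "match": { "managers.secondName": "san" } }
          ]
        }
      }
    }
  }
}

Running the nested query above returns the record, because managers really does contain an entry whose name is lei san.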
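The difference between the two storage models can be reproduced with a small in-memory simulation. This is a simplified sketch, not how Elasticsearch is implemented: it uses exact string equality instead of analyzed full-text matching, and the function names are hypothetical.

```python
def flatten(objects):
    """Mimic object-type storage: merge all objects into multi-valued fields."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(key, []).append(value)
    return flat


def object_match(objects, conditions):
    """Each condition is checked against the flattened values, independently of the others."""
    flat = flatten(objects)
    return all(value in flat.get(key, []) for key, value in conditions.items())


def nested_match(objects, conditions):
    """All conditions must hold inside one and the same object."""
    return any(
        all(obj.get(key) == value for key, value in conditions.items())
        for obj in objects
    )


manager = [
    {"firstName": "lei", "secondName": "teng"},
    {"firstName": "zhang", "secondName": "san"},
]
managers = manager + [{"firstName": "lei", "secondName": "san"}]
wanted = {"firstName": "lei", "secondName": "san"}

if __name__ == "__main__":
    print(flatten(manager))
    print(object_match(manager, wanted))   # cross-object match on flattened values
    print(nested_match(manager, wanted))   # no single object satisfies both conditions
    print(nested_match(managers, wanted))  # the third object satisfies both
```

With the flattened model the two conditions are satisfied by values taken from different managers, which is exactly the false positive described above; the per-object check only succeeds when one manager carries both values.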
PUT goods_for_test_use/_doc/1
{
  "goodsId": "111",
  "goodsName": "测试商品名称1",
  "buIds": [1, 2, 3],
  "manager": [
    { "firstName": "heh", "secondName": "teng" },
    { "firstName": "zhang", "secondName": "san" }
  ],
  "managers": [
    { "firstName": "lei", "secondName": "teng", "age": 30 },
    { "firstName": "zhang", "secondName": "san", "age": 18 },
    { "firstName": "lei", "secondName": "san", "age": 18 }
  ]
}
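1. A terms aggregation groups documents into buckets, similar to GROUP BY in SQL. The metric aggregations are similar to SUM, AVG, MAX, MIN, and so on in SQL. The two kinds can be combined.
An example follows: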
## Query: total GMV for each goods item
GET goods_for_test_use/_search
{
  "aggs": {
    "商品id": {
      "terms": {
        "field": "goodsId",
        "size": 10
      },
      "aggs": {
        "总的gmv": {
          "sum": {
            "field": "gmv"
          }
        }
      }
    }
  },
  "size": 0
}

## Query result
{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 6,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "商品id" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "111",
          "doc_count" : 3,
          "总的gmv" : {
            "value" : 765.0
          }
        },
        {
          "key" : "222",
          "doc_count" : 3,
          "总的gmv" : {
            "value" : 999.0
          }
        }
      ]
    }
  }
}
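To make the GROUP BY analogy concrete, the sketch below reproduces the terms plus sum combination over an in-memory list. It is only an illustration of the semantics, with a hypothetical function name. The sample documents are assumptions: the gmv of 333 for goods 222 comes from the insert example, while the value 255 for goods 111 is invented so that the totals line up with the result shown above.

```python
from collections import defaultdict


def terms_sum(docs, group_field, sum_field):
    """Bucket docs by group_field, then count and sum sum_field within each bucket."""
    buckets = defaultdict(lambda: {"doc_count": 0, "sum": 0.0})
    for doc in docs:
        bucket = buckets[doc[group_field]]
        bucket["doc_count"] += 1
        bucket["sum"] += doc.get(sum_field, 0)
    return dict(buckets)


# Hypothetical sample data: three documents per goods id.
docs = (
    [{"goodsId": "222", "gmv": 333} for _ in range(3)]
    + [{"goodsId": "111", "gmv": 255} for _ in range(3)]
)

if __name__ == "__main__":
    for key, bucket in sorted(terms_sum(docs, "goodsId", "gmv").items()):
        print(key, bucket["doc_count"], bucket["sum"])
```

The outer loop plays the role of the terms aggregation (one bucket per goodsId), and the running total inside each bucket plays the role of the nested sum aggregation.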