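days04 - Hands-On Practice with the ES Distributed Search Engine

Following the previous article's first look at ES, this one works through some hands-on operations to deepen that understanding.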

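1. Index Operations

1.1 Mapping properties

A mapping is a set of constraints on the documents in an index. Common mapping properties include:

type: the field's data type. Common simple types are:
- String: text (analyzed, tokenized text), keyword (exact value, e.g. brand, country, IP address)
- Numeric: long, integer, short, byte, double, float
- Boolean: boolean
- Date: date
- Object: object
index: whether to index the field; defaults to true
analyzer: which analyzer (tokenizer) to use
properties: the sub-fields of this field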

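1.2 Index CRUD (create, read, update, delete)

1.2.1 Creating an index

In ES, indices and documents are manipulated through RESTful requests whose bodies are written in DSL. To create an index together with its mapping, send a PUT request to /<index name> whose JSON body contains a "mappings" object, which in turn contains "properties" keyed by field name; section 3.1 shows a complete example for the hotel index.

Next, let's try this out in Kibana:

If you are continuing from the previous article and the virtual machine has since been shut down, start it again and run the following commands: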

systemctl restart docker  # restart the Docker service
docker start es           # start the es container
docker start kibana       # start the kibana container
docker ps                 # list the running containers

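Then open http://<your-ip>:5601/ in a browser to reach Kibana, and use Dev Tools for the index and document CRUD operations that follow:

1.2.2 Viewing and deleting an index

Syntax for viewing an index:

# GET /<index name>
# Example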
GET /heima

Syntax for deleting an index:

# DELETE /<index name>
# Example
DELETE /heima

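1.2.3 Modifying an index

Once an index and its mapping are created, existing fields cannot be modified; only new fields can be added.

Note: when an index is created, ES builds its inverted index from the mapping. Changing an existing field would invalidate that inverted index, so it is not allowed. This differs from a relational database, although in production it is also best to avoid altering database tables, because doing so directly affects performance.

The syntax for adding a new field is:

PUT /<index name>/_mapping
{
  "properties": {
    "newFieldName":{ # must not duplicate an existing field name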
      "type": "integer"
    }
  }
}

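2. Document Operations

2.1 Adding a document

The DSL syntax for adding a document is:

POST /<index name>/_doc/<document id>   # if the id is omitted, ES generates one
{
    "field1": "value1",
    "field2": "value2",
    "field3": {
        "subField1": "value3",
        "subField2": "value4"
    },
    // ...
}
# Example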
POST /heima/_doc/1
{
    "info": "黑马程序员Java讲师",
    "email": "zy@itcast.cn",
    "name": {
        "firstName": "云",
        "lastName": "赵"
    }
}

2.2 Querying and deleting a document

Syntax for querying a document:

# GET /<index name>/_doc/<document id>
# Example
GET /heima/_doc/1

Syntax for deleting a document:

# DELETE /<index name>/_doc/<document id>
# Example
DELETE /heima/_doc/1

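2.3 Updating a document

Option 1: full replacement, which deletes the old document and adds the new one.

Note: if the id already exists, the request is an update; if it does not exist, the request creates a new document.

PUT /<index name>/_doc/<document id>
{
    "field1": "value1",
    "field2": "value2",
    // ... omitted
}
# Example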
PUT /heima/_doc/1
{
    "info": "黑马程序员高级Java讲师",
    "email": "zy@itcast.cn",
    "name": {
        "firstName": "云",
        "lastName": "赵"
    }
}

Option 2: partial update, which changes only the specified fields.

POST /<index name>/_update/<document id>
{
    "doc": {
         "fieldName": "new value"
    }
}
# Example
POST /heima/_update/1
{
  "doc": {
    "email": "ZhaoYun@itcast.cn"
  }
}
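
The difference between the two options can be illustrated with a small in-memory model. This is only a sketch of the semantics, not ES code: the class UpdateSemanticsSketch and its methods are hypothetical names, it handles top-level fields only, and it assumes that a partial update of a missing document is an error.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the two update styles; it does not talk to ES.
final class UpdateSemanticsSketch {
    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    // Models PUT /index/_doc/id: the stored document is replaced as a whole,
    // and it is created if the id does not exist yet.
    void put(String id, Map<String, Object> doc) {
        docs.put(id, new HashMap<>(doc));
    }

    // Models POST /index/_update/id: only the given top-level fields change,
    // and the document must already exist.
    void update(String id, Map<String, Object> partial) {
        Map<String, Object> existing = docs.get(id);
        if (existing == null) {
            throw new IllegalStateException("document missing: " + id);
        }
        existing.putAll(partial);
    }

    Map<String, Object> get(String id) {
        return docs.get(id);
    }

    public static void main(String[] args) {
        UpdateSemanticsSketch store = new UpdateSemanticsSketch();

        Map<String, Object> doc = new HashMap<>();
        doc.put("info", "Java instructor");
        doc.put("email", "zy@itcast.cn");
        store.put("1", doc);

        // Partial update: "info" is kept, "email" is overwritten.
        Map<String, Object> partial = new HashMap<>();
        partial.put("email", "ZhaoYun@itcast.cn");
        store.update("1", partial);
        System.out.println(store.get("1"));

        // Full replacement: fields missing from the new document are gone.
        Map<String, Object> replacement = new HashMap<>();
        replacement.put("info", "Senior Java instructor");
        store.put("1", replacement);
        System.out.println(store.get("1"));
    }
}
```

The practical consequence is that a full replacement must resend every field that should survive, while a partial update only needs the fields that change.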

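As Java programmers we need to drive ES from Java code rather than only through the Kibana UI; the following sections use RestClient to do so.

3. Index Operations with RestClient

ES officially provides clients for many languages. In essence, these clients assemble DSL statements and send them to ES over HTTP. Official documentation: https://www.elastic.co/guide/en/elasticsearch/client/index.html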
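
To make "assemble DSL and send it over HTTP" concrete, here is a minimal sketch that builds such a request with the JDK's own java.net.http API (Java 11 or later) instead of an ES client. The class CreateIndexRequestSketch, the address http://localhost:9200, and the one-field mapping are assumptions for illustration; the request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of any ES client library.
final class CreateIndexRequestSketch {

    // Builds (but does not send) the HTTP request that would create an index:
    // a PUT to /<indexName> with the mapping DSL as the JSON body.
    static HttpRequest build(String baseUrl, String indexName, String mappingJson) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + indexName))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(mappingJson))
                .build();
    }

    public static void main(String[] args) {
        // Assumed sample DSL: an index with a single keyword field.
        String dsl = "{\"mappings\":{\"properties\":{\"email\":{\"type\":\"keyword\"}}}}";
        HttpRequest request = build("http://localhost:9200", "heima", dsl);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would take one more call on a java.net.http.HttpClient; the official clients add connection handling, typed request objects, and response parsing on top of this basic idea.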

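Next, a worked example uses RestClient to operate on an index.

3.1 Creating an index

Step 1: Import the provided database data tb_hotel.sql, then import the provided project hotel-demo.

Step 2: Write the DSL that creates the index, based on the columns of the database table:

# Mapping for the hotel index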
PUT /hotel
{
  "mappings": {
    "properties": {
      "id":{
        "type": "keyword"
      },
      "name":{
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "address":{
        "type": "keyword",
        "index": false
      },
      "price":{
        "type": "integer"
      },
      "score":{
        "type": "integer"
      },
      "brand":{
        "type": "keyword"
        
      },
      "city":{
        "type": "keyword"
      },
       "starName":{
        "type": "keyword"
      },
       "business":{
        "type": "keyword"
      },
       "location":{
        "type": "geo_point"
      },
       "pic":{
         "type": "keyword", 
         "index": false
       }
      }
 
  }
}

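Note that the latitude and longitude columns in the SQL table map to a single ES field of a dedicated type, geo_point (the location field above).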
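
One of the formats geo_point accepts is a string of the form "lat,lon", so the two SQL columns can be merged into one value before indexing. The following is a minimal sketch of that conversion; the class GeoPointSketch, the method name, and the sample coordinates are hypothetical, and it assumes the columns arrive as strings.

```java
// Hypothetical helper for turning two SQL columns into one geo_point value.
final class GeoPointSketch {

    // Joins latitude and longitude into the "lat,lon" string form of geo_point.
    // Note the order: latitude first, then longitude.
    static String toGeoPoint(String latitude, String longitude) {
        return latitude + "," + longitude;
    }

    public static void main(String[] args) {
        // Assumed sample coordinates.
        System.out.println(toGeoPoint("31.21", "121.5"));
    }
}
```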

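Users will want to search across several fields, for example by hotel name or by business district. As with a database, searching a single field is the most efficient, so how can ES search several fields at once and still stay efficient?

The answer is to copy every field that should take part in search into one combined field, named "all" here, by using copy_to.

The mapping above can then be changed to:

# Mapping for the hotel index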
PUT /hotel
{
  "mappings": {
    "properties": {
      "id":{
        "type": "keyword"
      },
      "name":{
        "type": "text",
        "analyzer": "ik_max_word",
        "copy_to": "all"
      },
      "address":{
        "type": "keyword",
        "index": false
      },
      "price":{
        "type": "integer"
      },
      "score":{
        "type": "integer"
      },
      "brand":{
        "type": "keyword",
        "copy_to": "all"
      },
      "city":{
        "type": "keyword"
      },
       "starName":{
        "type": "keyword"
      },
       "business":{
        "type": "keyword",
        "copy_to": "all"
      },
       "location":{
        "type": "geo_point"
      },
       "pic":{
         "type": "keyword", 
         "index": false
       },
       "all":{
        "type": "text",
        "analyzer": "ik_max_word"
      } 
    }
 
  }
}

Step 3: Initialize the Java RestClient

1. Add the RestHighLevelClient dependency:

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
</dependency>

2. The Spring Boot version used in this project defaults to ES 7.6.2, so we override the default ES version:

<properties>
    <java.version>1.8</java.version>
    <elasticsearch.version>7.12.1</elasticsearch.version> 
</properties>

3. Initialize RestHighLevelClient:

Create a new test class with the following code:

package cn.itcast.hotel;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import java.io.IOException;

public class HotelIndexTest {
    private RestHighLevelClient client;

    @Test
    void testInit() {
        System.out.println(client);
    }

    // Runs before each test: create the client
    @BeforeEach
    void setUp() {
        this.client = new RestHighLevelClient(RestClient.builder(
                // Replace with the address of your own ES instance
                HttpHost.create("http://192.168.229.101:9200")
        ));
    }

    // Runs after each test: release the client's resources
    @AfterEach
    void tearDown() throws IOException {
        this.client.close();
    }
}

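Step 4: Create the index

Add a test method createHotelIndex() to the test class:

    // Imports: org.elasticsearch.client.indices.CreateIndexRequest,
    //          org.elasticsearch.client.RequestOptions,
    //          org.elasticsearch.common.xcontent.XContentType
    @Test
    void createHotelIndex() throws IOException {
        // 1. Create the Request object
        CreateIndexRequest request = new CreateIndexRequest("hotel");
        // 2. Prepare the request body: the DSL statement.
        //    HotelSource is a String constant holding the mapping written above
        //    (the JSON body only, without the "PUT /hotel" line);
        //    the JSON can also be pasted here directly.
        request.source(HotelSource, XContentType.JSON);
        // 3. Send the request
        client.indices().create(request, RequestOptions.DEFAULT);
    }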
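
The article does not show where HotelSource is defined, so here is one possible sketch: a constants class with a static String field that the test class can import statically. The class name HotelIndexConstants is hypothetical, and the mapping is abbreviated to a few of the fields from the DSL above.

```java
// Hypothetical constants class; only the field name HotelSource comes from the test code.
final class HotelIndexConstants {

    // Abbreviated mapping DSL for the hotel index: the JSON body only,
    // without the "PUT /hotel" request line used in Kibana.
    static final String HotelSource =
            "{\n" +
            "  \"mappings\": {\n" +
            "    \"properties\": {\n" +
            "      \"id\": {\"type\": \"keyword\"},\n" +
            "      \"name\": {\"type\": \"text\", \"analyzer\": \"ik_max_word\", \"copy_to\": \"all\"},\n" +
            "      \"location\": {\"type\": \"geo_point\"},\n" +
            "      \"all\": {\"type\": \"text\", \"analyzer\": \"ik_max_word\"}\n" +
            "    }\n" +
            "  }\n" +
            "}";

    private HotelIndexConstants() {
        // constants holder, not meant to be instantiated
    }

    public static void main(String[] args) {
        System.out.println(HotelSource);
    }
}
```

With a static import of HotelIndexConstants.HotelSource, the call request.source(HotelSource, XContentType.JSON) in the test compiles as written.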

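Note: make sure the imports are the right ones, otherwise errors occur. In particular, the CreateIndexRequest needed here is org.elasticsearch.client.indices.CreateIndexRequest, not the similarly named legacy class in org.elasticsearch.action.admin.indices.create.

Finally, run the following in Kibana:

GET /hotel

If it returns the index definition, the index was created successfully.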

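3.2 Deleting an index

The code is similar to creating an index; only the method called differs.

    // Import: org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest
    @Test
    void deleteHotelIndex() throws IOException {
        // 1. Create the Request object
        DeleteIndexRequest deleteIndexRequest = new DeleteIndexRequest("hotel");
        // 2. Send the request
        client.indices().delete(deleteIndexRequest, RequestOptions.DEFAULT);
    }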

3.3 Checking whether an index exists

    // Import: org.elasticsearch.client.indices.GetIndexRequest
    @Test
    void existHotelIndex() throws IOException {
        // 1. Create the Request object
        GetIndexRequest getIndexRequest = new GetIndexRequest("hotel");
        // 2. Send the request
        boolean exists = client.indices().exists(getIndexRequest, RequestOptions.DEFAULT);
        // 3. Print whether the index exists
        System.out.println(exists);
    }

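4. Document Operations with RestClient

Case study: query hotel data from the database and import it into the hotel index, implementing CRUD on hotel data.

Initialize the Java RestClient (same initialization code as for the index operations above; it can be copied directly)

Add hotel data with the Java RestClient

Query hotel data by id with the Java RestClient

Delete hotel data with the Java RestClient

Update hotel data with the Java RestClient

4.1 Adding a document

    // Injection requires the test class to be annotated with @SpringBootTest
    @Autowired
    private IHotelService iHotelService;

    @Test
    void testAddDocument() throws IOException {
        // Query the hotel by id
        Hotel hotel = iHotelService.getById(61083L);
        // Convert it to the document type
        HotelDoc hotelDoc = new HotelDoc(hotel);
        // 1. Prepare the Request object
        IndexRequest request = new IndexRequest("hotel").id(hotel.getId().toString());
        // 2. Prepare the JSON document
        request.source(JSON.toJSONString(hotelDoc), XContentType.JSON);
        // 3. Send the request
        client.index(request, RequestOptions.DEFAULT);
    }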

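If the document with id=61083 can then be queried, the document was added successfully.

4.2 Querying a document

    @Test
    void testGetDocumentById() throws IOException {
        // 1. Prepare the Request
        GetRequest request = new GetRequest("hotel", "61083");
        // 2. Send the request and get the response
        GetResponse response = client.get(request, RequestOptions.DEFAULT);
        // 3. Parse the response: this returns the document's _source as a JSON string,
        //    which can then be deserialized, e.g. JSON.parseObject(source, HotelDoc.class)
        String source = response.getSourceAsString();
        System.out.println(source);
    }

If the hotel document's JSON is printed, the query succeeded.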

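4.3 Deleting a document

    @Test
    void testDeleteDocumentById() throws IOException {
        // 1. Prepare the Request
        DeleteRequest request = new DeleteRequest("hotel", "61083");
        // 2. Send the request
        client.delete(request, RequestOptions.DEFAULT);
    }

The deletion can be confirmed by querying id=61083 again: the hotel document should no longer be returned.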

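4.4 Updating a document

There are two ways to update a document:

Option 1: full update. Writing a document with the same id again deletes the old document and adds the new one (the code is the same as for adding a document).
Option 2: partial update. Only some fields are updated; this is the option demonstrated here.

    @Test
    void testUpdateDocumentById() throws IOException {
        // 1. Create the Request object
        UpdateRequest request = new UpdateRequest("hotel", "61083");
        // 2. Prepare the parameters; each pair of arguments is one key and its value
        request.doc(
                "price", "999",
                "starName", "四钻"
        );
        // 3. Update the document
        client.update(request, RequestOptions.DEFAULT);
    }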

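4.5 Bulk-importing documents

Requirement: query hotel data in bulk and import it into the index in bulk.

Approach:

Query the hotel data with mybatis-plus
Convert each hotel record (Hotel) into the document type (HotelDoc)
Use Bulk processing in the Java RestClient to add the documents in one batch; example code follows: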

    // Bulk processing
    @Test
    void testBulkRequest() throws IOException {
        // 1. Create the Request
        BulkRequest request = new BulkRequest();
        // Query all hotel data
        List<Hotel> hotels = iHotelService.list();
        // Convert each record to the document type HotelDoc
        for (Hotel hotel : hotels) {
            HotelDoc hotelDoc = new HotelDoc(hotel);
            // 2. Prepare the parameters: add one IndexRequest per document
            request.add(new IndexRequest("hotel")
                    .id(hotelDoc.getId().toString())
                    .source(JSON.toJSONString(hotelDoc), XContentType.JSON));
        }
        // 3. Send the request
        client.bulk(request, RequestOptions.DEFAULT);
    }
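
For intuition about what a bulk request carries, the ES _bulk endpoint takes newline-delimited JSON: for each document, one action line followed by one source line, with a newline ending the body. The sketch below builds such a body with the JDK only. The class BulkBodySketch and the sample documents are hypothetical, and it assumes the index name and ids need no JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that shows the shape of a _bulk request body.
final class BulkBodySketch {

    // Builds the newline-delimited body: an "index" action line, then the
    // document's JSON source, for every entry; each line ends with '\n'.
    static String build(String index, Map<String, String> jsonById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : jsonById.entrySet()) {
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_id\":\"").append(entry.getKey()).append("\"}}\n");
            body.append(entry.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Assumed sample documents; LinkedHashMap keeps insertion order.
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "{\"name\":\"Hotel A\"}");
        docs.put("2", "{\"name\":\"Hotel B\"}");
        System.out.print(build("hotel", docs));
    }
}
```

Because all documents travel in a single request, a bulk import avoids one HTTP round trip per document, which is why it suits loading a whole table.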

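That concludes this simple hands-on tour of ES, covering CRUD on indices and documents and, finally, bulk import. If anything is lacking, corrections are welcome; let's keep improving together!

Tags: elasticsearch, big data, search engine

Reposted from: https://blog.csdn.net/m0_62025000/article/details/126865310
Copyright belongs to the original author, 在甲蛙海中掘金. In case of infringement, please contact us for removal.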