[Spring Boot] 12: Implementing Tokenized Search with ElasticSearch

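I. Introduction

When building a search feature, we sometimes want broader results. For example, a search for "领导力" (leadership) should return everything that matches any of its tokens: 领导力, 领导, 领, 导, and 力.
To get this behavior, we combine ElasticSearch with a tokenizer plugin (the IK analyzer).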
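
To make the idea of fine-grained tokenization concrete, here is a minimal sketch in plain Java. It is not the IK algorithm: the class name ToyMaxWordTokenizer and the tiny dictionary passed in by the caller are assumptions for illustration only. It emits every dictionary word found in the input, followed by each single character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; the real IK analyzer works differently.
final class ToyMaxWordTokenizer {

    // Returns every dictionary word contained in the text (longest first per
    // start position), followed by each single character of the text.
    static List<String> tokenize(String text, Set<String> dictionary) {
        List<String> tokens = new ArrayList<>();
        // Pass 1: multi-character dictionary words
        for (int start = 0; start < text.length(); start++) {
            for (int end = text.length(); end >= start + 2; end--) {
                String candidate = text.substring(start, end);
                if (dictionary.contains(candidate)) {
                    tokens.add(candidate);
                }
            }
        }
        // Pass 2: single characters
        for (int i = 0; i < text.length(); i++) {
            tokens.add(String.valueOf(text.charAt(i)));
        }
        return tokens;
    }
}
```

With a dictionary containing 领导力 and 领导, calling `ToyMaxWordTokenizer.tokenize("领导力", dictionary)` yields 领导力, 领导, 领, 导, 力, which is the set of terms the search is expected to match.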

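II. Search Requirements

Suppose an app needs keyword search over its courses.
Each course has a title, subtitle, instructor, description, tags, cover image, number of lessons, and so on.
Requirement: given a keyword, tokenize it and match it against the course's title, subtitle, instructor, and tags.
The prototype of the search results page looks like this:
[Figure: prototype of the search results page]
Matches are highlighted in red. They cover the title, subtitle, tags, and instructor name; matching is token-based, and the matched keywords are highlighted.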
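
As a rough picture of what keyword highlighting produces, the sketch below wraps every matched term in `<em>` tags using plain Java. It is a simplification, not how the ElasticSearch highlighter works internally; the class name HighlightSketch and the sample strings are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only.
final class HighlightSketch {

    // Marks every character covered by any term, then wraps each
    // contiguous marked run in <em>...</em>.
    static String highlight(String text, List<String> terms) {
        boolean[] hit = new boolean[text.length()];
        for (String term : terms) {
            if (term.isEmpty()) {
                continue;
            }
            int from = text.indexOf(term);
            while (from >= 0) {
                for (int i = from; i < from + term.length(); i++) {
                    hit[i] = true;
                }
                from = text.indexOf(term, from + 1);
            }
        }
        StringBuilder out = new StringBuilder();
        boolean open = false;
        for (int i = 0; i < text.length(); i++) {
            if (hit[i] && !open) {
                out.append("<em>");
                open = true;
            } else if (!hit[i] && open) {
                out.append("</em>");
                open = false;
            }
            out.append(text.charAt(i));
        }
        if (open) {
            out.append("</em>");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: team <em>leadership</em> basics
        System.out.println(highlight("team leadership basics", Arrays.asList("leader", "leadership")));
    }
}
```

The front end can then style the `<em>` element in red to match the prototype.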

III. Development

1. Install ElasticSearch and the IK analyzer on the server

Reference: [Linux Software Installation Series] 05: Installing ElasticSearch and the IK Analyzer

2. Implementation

1) Add the dependency to pom.xml:
<!-- elasticsearch -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
2) Add the configuration to the yml file
# Connection details of your ElasticSearch installation
spring:
  elasticsearch:
    rest:
      uris: http://ip:9200
      username: test
      password: 123456
3) Configuration class ElasticsearchConfig

ElasticsearchConfig:

package com.test.api.config;

import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.elasticsearch.client.ClientConfiguration;
import org.springframework.data.elasticsearch.client.RestClients;
import org.springframework.data.elasticsearch.repository.config.EnableElasticsearchRepositories;

/**
 * Elasticsearch configuration
 *
 * @author /
 */
// basePackages must cover the package that holds the repository interfaces
@EnableElasticsearchRepositories(basePackages = {"com.test.bi.*.repository"})
@Configuration
public class ElasticsearchConfig {

    @Value("${spring.elasticsearch.rest.uris}")
    private String hostAndPort;

    @Value("${spring.elasticsearch.rest.username}")
    private String username;

    @Value("${spring.elasticsearch.rest.password}")
    private String password;

    @Bean
    public RestHighLevelClient elasticsearchClient() {
        // connectedTo expects "host:port", so drop the scheme if the property has one
        String endpoint = hostAndPort.replaceFirst("^https?://", "");
        ClientConfiguration clientConfiguration = ClientConfiguration.builder()
                .connectedTo(endpoint)
                .withBasicAuth(username, password)
                .build();
        return RestClients.create(clientConfiguration).rest();
    }
}
4) Utility class ElasticsearchUtil

ElasticsearchUtil:

package com.test.common.util;

import cn.hutool.core.collection.CollectionUtil;
import cn.hutool.json.JSONUtil;
import com.github.pagehelper.PageInfo;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.SneakyThrows;
import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.springframework.stereotype.Component;
import org.springframework.util.ReflectionUtils;

import javax.annotation.Resource;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Elasticsearch utility class
 *
 * @author /
 */
@Slf4j
@Component
public class ElasticsearchUtil {

    @Resource
    private RestHighLevelClient restHighLevelClient;

    /**
     * Paged query
     *
     * @param index               index name
     * @param searchSourceBuilder query builder
     * @param resultClass         class of the result objects
     * @param currentPage         current page number (1-based, not the ES "from" offset)
     * @param size                number of records per page
     * @param highFields          fields to highlight (may be empty)
     */
    public <T> PageInfo<T> page(String index, SearchSourceBuilder searchSourceBuilder, Class<T> resultClass,
                                int currentPage, int size, List<String> highFields) {
        SearchRequest request = new SearchRequest(index);
        // Configure highlighting
        if (CollectionUtil.isNotEmpty(highFields)) {
            buildHighLight(searchSourceBuilder, highFields);
        }
        // Paging: convert the page number to an offset
        int num = (currentPage - 1) * size;
        searchSourceBuilder.from(num).size(size);
        request.source(searchSourceBuilder);
        SearchResponse response;
        try {
            response = restHighLevelClient.search(request, RequestOptions.DEFAULT);
        } catch (IOException e) {
            // Fail loudly instead of continuing with a null response
            throw new UncheckedIOException("es search failed, index=" + index, e);
        }
        return analysisResponse(response, resultClass, currentPage, size, highFields);
    }

    /**
     * Parse the ES search response
     *
     * @param response    search response
     * @param resultClass class to convert each hit into
     */
    private <T> PageInfo<T> analysisResponse(SearchResponse response, Class<T> resultClass, int currentPage,
                                             int size, List<String> highFields) {
        SearchHit[] searchHits = response.getHits().getHits();
        List<T> retList = new ArrayList<>(searchHits.length);
        for (SearchHit searchHit : searchHits) {
            String strJson = searchHit.getSourceAsString();
            T t = JSONUtil.toBean(strJson, resultClass);
            try {
                setId(resultClass, t, String.valueOf(searchHit.getId()));
            } catch (Exception e) {
                log.warn("failed to set the id on an es search result", e);
            }
            // Copy highlighted text into the result object; this requires the ES
            // field names to match the Java property names
            if (CollectionUtil.isNotEmpty(highFields)) {
                Map<String, HighlightField> highlightFieldMap = searchHit.getHighlightFields();
                for (String field : highFields) {
                    HighlightField highlightField = highlightFieldMap.get(field);
                    if (highlightField != null) {
                        // Highlight fragments of this field
                        Text[] fragments = highlightField.getFragments();
                        // Join the fragments into one highlighted value
                        StringBuilder builder = new StringBuilder();
                        for (Text text : fragments) {
                            builder.append(text);
                        }
                        // Set it on the entity
                        setValue(resultClass, t, builder.toString(), field);
                    }
                }
            }
            retList.add(t);
        }
        // ES 6.x client: getTotalHits() returns long (on 7.x use getTotalHits().value)
        long totalNum = response.getHits().getTotalHits();
        PageInfo<T> pageVo = new PageInfo<>();
        pageVo.setPageNum(currentPage);
        pageVo.setPageSize(size);
        pageVo.setTotal(totalNum);
        pageVo.setList(retList);
        return pageVo;
    }

    /**
     * Set the id when the object's id is null
     *
     * @param resultClass class
     * @param t           object
     * @param id          value of the primary key
     */
    @SneakyThrows
    private <T> void setId(Class<T> resultClass, T t, Object id) {
        Field field = ReflectionUtils.findField(resultClass, "id");
        if (null != field) {
            field.setAccessible(true);
            Object object = ReflectionUtils.getField(field, t);
            if (object == null) {
                Method method = resultClass.getMethod("setId", String.class);
                ReflectionUtils.invokeMethod(method, t, id);
            }
        }
    }

    /**
     * Set a String field through its setter
     *
     * @param resultClass class
     * @param t           object
     * @param fieldValue  field value
     * @param fieldName   field name
     */
    @SneakyThrows
    private <T> void setValue(Class<T> resultClass, T t, Object fieldValue, String fieldName) {
        Field field = ReflectionUtils.findField(resultClass, fieldName);
        if (null != field) {
            field.setAccessible(true);
            String methodName = "set".concat(captureName(fieldName));
            Method method = resultClass.getMethod(methodName, String.class);
            ReflectionUtils.invokeMethod(method, t, fieldValue);
        }
    }

    /**
     * Capitalize the first letter by shifting its ASCII code, which is cheaper
     * than substring-based conversion
     *
     * @param str /
     */
    private String captureName(String str) {
        char[] cs = str.toCharArray();
        // The shift is only valid for lowercase ASCII letters
        if (cs.length > 0 && cs[0] >= 'a' && cs[0] <= 'z') {
            cs[0] -= 32;
        }
        return String.valueOf(cs);
    }

    /**
     * Configure highlighting
     *
     * @param searchSourceBuilder /
     * @param fields              fields to highlight
     */
    private void buildHighLight(SearchSourceBuilder searchSourceBuilder, List<String> fields) {
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        fields.forEach(highlightBuilder::field);
        highlightBuilder.preTags("<em>");
        highlightBuilder.postTags("</em>");
        // Attach the highlighter to the request
        searchSourceBuilder.highlighter(highlightBuilder);
    }

    /**
     * Bean for ES scroll (deep paging)
     */
    @AllArgsConstructor
    @Data
    public static class ScrollPageBean<T> {
        private String scrollId;
        private PageInfo<T> scrollPage;
    }
}
5) Object that wraps the returned data
package com.test.bi.bo.course;

import io.swagger.annotations.ApiModelProperty;
import lombok.Data;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

import java.io.Serializable;

/**
 * Course search document
 *
 * @author /
 */
@Document(indexName = "course_es", type = "_doc", replicas = 0)
@Data
public class CourseEsDTO implements Serializable {

    @ApiModelProperty(value = "Course ID")
    @Id
    private String id;

    // ik_max_word is the fine-grained analyzer provided by the IK plugin
    @ApiModelProperty(value = "Title")
    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String name;

    @ApiModelProperty(value = "Instructor name")
    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String teacherName;

    @ApiModelProperty(value = "Subtitle")
    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String subtitle;

    @ApiModelProperty(value = "Tag names")
    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String labelName;

    @ApiModelProperty(value = "Cover image")
    @Field(type = FieldType.Text)
    private String pic;

    @ApiModelProperty(value = "Number of lessons")
    @Field(type = FieldType.Integer)
    private Integer count;
}
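6) Save data to ElasticSearch

There are two common ways to save data to ElasticSearch: through the ElasticsearchRepository interface, or through the ElasticsearchTemplate class. Here we use ElasticsearchRepository.
Define the interface CourseEsRepository; it can live in the mapper package.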

package com.test.bi.mapper;

import com.test.bi.bo.course.CourseEsDTO;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;
import org.springframework.stereotype.Repository;

/**
 * Course search repository
 *
 * @author /
 */
@Repository
public interface CourseEsRepository extends ElasticsearchRepository<CourseEsDTO, String> {
}

Saving the data:

@Resource
private CourseEsRepository courseEsRepository;

@Override
public void saveCourseEs() {
    try {
        // Full rebuild: clear the index, then reload everything
        courseEsRepository.deleteAll();
        // Load the data from the database
        List<CourseEsDTO> list = courseMapper.getCourseEsList();
        if (IterUtil.isNotEmpty(list)) {
            courseEsRepository.saveAll(list);
        }
    } catch (Exception e) {
        // Pass the exception itself so the stack trace is logged
        log.error("es error", e);
    }
}
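
When the course table is large, saving everything in a single call may produce a very large bulk request. One possible refinement, which the original post does not use, is to split the list into fixed-size batches and save each batch separately. Here is a minimal plain-Java sketch; the class name Batches and the batch size are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only.
final class Batches {

    // Splits the list into consecutive batches of at most batchSize items.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // prints [[1, 2], [3, 4], [5]]
        System.out.println(partition(Arrays.asList(1, 2, 3, 4, 5), 2));
    }
}
```

Each batch could then be passed to `courseEsRepository.saveAll(batch)` in a loop.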
7) Search by keyword and return paged results
@Override
public PageInfo<CourseEsDTO> getCourseEs(String keywords, Integer pageNum, Integer pageSize) {
    // multi_match query in best_fields mode (the default type) across several fields
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    String[] queryFields = {"teacherName", "subtitle", "labelName"};
    QueryBuilder queryBuilder = QueryBuilders.multiMatchQuery(keywords, queryFields)
            // The title gets a boost of 2, so title matches rank higher
            .field("name", 2F)
            .tieBreaker(0.3F);
    searchSourceBuilder.query(queryBuilder);
    // Highlight all queried fields, including the title
    List<String> highFields = ListUtil.toList(queryFields);
    highFields.add("name");
    // Return one page of results
    return elasticsearchUtil.page("course_es", searchSourceBuilder, CourseEsDTO.class, pageNum,
            pageSize, highFields);
}
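
In best_fields mode, the score of a document is the score of its best-matching field plus tie_breaker times the scores of the other matching fields. The sketch below reproduces only that arithmetic in plain Java, using made-up per-field scores; the class name BestFieldsScore and all numbers are assumptions for illustration, and real scores come from ElasticSearch's relevance model.

```java
// Hypothetical helper for illustration only.
final class BestFieldsScore {

    // best + tieBreaker * (sum of all other field scores)
    static double combine(double[] fieldScores, double tieBreaker) {
        double best = 0.0;
        double sum = 0.0;
        for (double score : fieldScores) {
            best = Math.max(best, score);
            sum += score;
        }
        return best + tieBreaker * (sum - best);
    }

    public static void main(String[] args) {
        // Assumed raw scores: name 1.5 (boosted x2 = 3.0), teacherName 1.0,
        // subtitle 0.5, labelName 0.0 (no match)
        double[] scores = {1.5 * 2, 1.0, 0.5, 0.0};
        // 3.0 + 0.3 * (1.0 + 0.5 + 0.0), approximately 3.45
        System.out.println(combine(scores, 0.3));
    }
}
```

A tie_breaker of 0 would rank purely by the best field, while 1 would sum all fields; 0.3 keeps the best field dominant but still rewards documents that match in several fields.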
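8) Summary

This small example implements tokenized search with keyword highlighting.
I won't go through every detail of ElasticSearch and the IK analyzer here; see the official documentation for that. This post only demonstrates a simple way to build the search feature, and I hope it helps.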


Reposted from: https://blog.csdn.net/joinclear/article/details/128897784
Copyright belongs to the original author, joinclear. In case of infringement, please contact us for removal.
