Elasticsearch(ES)

Getting Started
Forward and inverted indexes
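As a rough illustration of the difference between the two index types, the sketch below builds an inverted index (term to document IDs) from a forward index (document ID to text). The class name `InvertedIndexSketch` and the whitespace tokenizer are assumptions made for this example; Elasticsearch uses a real analyzer (such as IK) and far more compact data structures.

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

class InvertedIndexSketch {
    // Build term -> sorted document IDs from a forward index (document ID -> text).
    static Map<String, SortedSet<Integer>> build(Map<Integer, String> forwardIndex) {
        Map<String, SortedSet<Integer>> inverted = new TreeMap<>();
        for (Map.Entry<Integer, String> doc : forwardIndex.entrySet()) {
            // Whitespace splitting stands in for a real analyzer (assumption).
            for (String term : doc.getValue().toLowerCase().split("\\s+")) {
                if (term.isEmpty()) {
                    continue;
                }
                inverted.computeIfAbsent(term, k -> new TreeSet<>()).add(doc.getKey());
            }
        }
        return inverted;
    }

    // Look up a term directly instead of scanning every document.
    static SortedSet<Integer> search(Map<String, SortedSet<Integer>> inverted, String term) {
        return inverted.getOrDefault(term.toLowerCase(), new TreeSet<>());
    }
}
```

With a forward index, finding every document that contains a term means scanning all documents; with the inverted index, it is a single map lookup.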


Documents

Index

Comparison with MySQL

Architecture

Installation
1. Elasticsearch
Create a Docker network:
docker network create es-net
Pull the image:
docker pull elasticsearch:8.9.0
Start a single-node container:
docker run -d \
  --name es \
  --network es-net \
  -p 9200:9200 \
  -p 9300:9300 \
  -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=true" \
  -v es-data:/usr/share/elasticsearch/data \
  -v es-plugins:/usr/share/elasticsearch/plugins \
  --privileged \
  elasticsearch:8.9.0
Option reference (some are optional and not used in the command above):
-e "cluster.name=es-docker-cluster": sets the cluster name
-e "http.host=0.0.0.0": the address to listen on; allows access from external networks
-e "ES_JAVA_OPTS=-Xms512m -Xmx512m": JVM heap size
-e "discovery.type=single-node": single-node (non-cluster) mode
-v es-data:/usr/share/elasticsearch/data: mounts a named volume on the ES data directory
-v es-logs:/usr/share/elasticsearch/logs: mounts a named volume on the ES log directory
-v es-plugins:/usr/share/elasticsearch/plugins: mounts a named volume on the ES plugin directory (create it yourself)
--privileged: runs the container with extended privileges so that it can access the volumes
--network es-net: joins the network named es-net
-p 9200:9200: port mapping
- Visit http://<IP>:9200 to see the Elasticsearch response, a JSON string.
Open a shell in the container:
docker exec -it es /bin/bash
Reset the password of the built-in elastic user:
bin/elasticsearch-reset-password -u elastic
Add a user for Kibana with the kibana_system role:
elasticsearch-users useradd mykibana -p esSales.. -r kibana_system
2. Kibana (web UI for ES)
Kibana 8.9.0 | Elastic
Pull the image; the Kibana version must match the Elasticsearch version:
docker pull kibana:8.9.0
Start the container:
docker run -d \
  --name kibana \
  -e ELASTICSEARCH_HOSTS=http://es:9200 \
  -e ELASTICSEARCH_USERNAME=mykibana \
  -e ELASTICSEARCH_PASSWORD=esSales.. \
  --network=es-net \
  -p 5601:5601 \
  kibana:8.9.0
Option reference:
--network es-net: joins the network named es-net, the same network as Elasticsearch
-e ELASTICSEARCH_HOSTS=http://es:9200: the Elasticsearch address; because Kibana shares a network with Elasticsearch, it can reach it by container name
-p 5601:5601: port mapping
Visit http://<IP>:5601
The same settings in kibana.yml form:
elasticsearch.hosts: ["http://URL:9200"]
elasticsearch.username: "<service account>"
elasticsearch.password: "<password>"
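3. Install the IK analyzer
Find the host directory of the plugin volume:
docker volume inspect es-plugins
{
  "CreatedAt": "2023-12-08T17:23:15+08:00",
  "Driver": "local",
  "Labels": null,
  "Mountpoint": "/var/lib/docker/volumes/es-plugins/_data",
  "Name": "es-plugins",
  "Options": null,
  "Scope": "local"
}
Unzip the plugin archive into the Mountpoint directory:
unzip <archive>
Restart the container and watch the logs:
docker restart es
docker logs -f es
Test
Open Kibana at http://<IP>:5601 and run an analyze request.
The IK analyzer has two modes:
ik_smart: coarsest segmentation (fewest tokens)
ik_max_word: finest-grained segmentation (most tokens)
GET /_analyze
{
  "text": "辰呀正在学习java,太棒了吧",
  "analyzer": "ik_smart"
}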

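4. Install the pinyin analyzer
- Download the Pinyin plugin
- Build it for your ES version: set the Elasticsearch dependency to the matching version, then run the Maven lifecycle package/install
If the pinyin analyzer does not split the text into words, tokenize with IK first and then apply the pinyin filter.

POST /_analyze
{
  "text": ["如家如家酒店"],
  "analyzer": "pinyin"
}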
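Index Operations

- text: the field is analyzed (tokenized)
- keyword: the field is not analyzed; the whole value is indexed as one term
- index: whether the field is indexed and can therefore be searched
Create
PUT /chen
{
  "mappings": {
    "properties": {
      "info": { "type": "text", "analyzer": "ik_smart" },
      "email": { "type": "keyword", "index": false },
      "name": {
        "type": "object",
        "properties": {
          "firstName": { "type": "keyword" },
          "lastName": { "type": "keyword" }
        }
      }
    }
  }
}
Delete
Update

Query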
Document Operations
Create

POST /chen/_doc/1
{
  "info": "辰在学习Java",
  "email": "3160@qq.com",
  "name": {
    "firstName": "陈",
    "lastName": "李"
  }
}
Delete

Get

Update
Full replacement (the document is created if the ID does not exist):
PUT /chen/_doc/2
{
  "info": "辰在学习Java",
  "email": "chenCCCCCCC@qq.com",
  "name": {
    "firstName": "陈",
    "lastName": "李"
  }
}
Partial update (only the listed fields change):
POST /chen/_update/2
{
  "doc": {
    "email": "PC@qq.com"
  }
}
DSL Query Syntax

Full-text queries

# match_all query
GET /hotel/_search
{
  "query": {
    "match_all": {}
  }
}
# match query
GET /hotel/_search
{
  "query": {
    "match": {
      "all": "如家"
    }
  }
}
- multi_match: queries several fields; the more fields involved, the worse the query performance
# multi_match query
GET /hotel/_search
{
  "query": {
    "multi_match": {
      "query": "如家",
      "fields": ["brand", "name"]
    }
  }
}
Exact queries
- term: exact match on a term; typically used on keyword, numeric, boolean, and date fields
GET /hotel/_search
{
  "query": {
    "term": {
      "city": {
        "value": "北京"
      }
    }
  }
}
- range: matches values within a range
GET /hotel/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 1100,
        "lte": 3000
      }
    }
  }
}
- Geo distance query: distance is the radius, location is the center point
GET /hotel/_search
{
  "query": {
    "geo_distance": {
      "distance": "2km",
      "location": "31.21,121.5"
    }
  }
}
# sort by distance from a point
GET /hotel/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "_geo_distance": {
        "location": { "lat": 31.375558, "lon": 121.533514 },
        "order": "asc",
        "unit": "km"
      }
    }
  ]
}
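Compound Queries
Scoring rules
function_score takes an ordinary query and adds extra score to the documents that match a filter.
The default relevance algorithm is BM25.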
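To give a feel for what BM25 computes, here is a minimal sketch of the textbook per-term BM25 score. The class name `Bm25Sketch` and its method names are made up for this example; the constants k1 = 1.2 and b = 0.75 are the Elasticsearch defaults. Lucene's actual implementation differs in details (for instance, recent versions drop the constant `(k1 + 1)` factor), so treat this as an approximation of the idea rather than the exact score ES returns.

```java
class Bm25Sketch {
    static final double K1 = 1.2;  // term-frequency saturation
    static final double B = 0.75;  // strength of document-length normalization

    // Rarer terms (small docFreq relative to docCount) get a larger weight.
    static double idf(long docCount, long docFreq) {
        return Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    // Score of one term in one document (textbook BM25 form).
    static double score(double termFreq, long docCount, long docFreq,
                        double docLen, double avgDocLen) {
        double norm = K1 * (1 - B + B * docLen / avgDocLen);
        return idf(docCount, docFreq) * termFreq * (K1 + 1) / (termFreq + norm);
    }
}
```

Two properties are worth noting: the score grows with term frequency but levels off, and longer-than-average documents are penalized.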


GET /hotel/_search
{
  "query": {
    "function_score": {
      "query": {
        "match": { "all": "外滩" }
      },
      "functions": [
        {
          "filter": {
            "term": { "brand": "如家" }
          },
          "weight": 10
        }
      ],
      "boost_mode": "sum"
    }
  }
}
Boolean query


GET /hotel/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "如家" } }
      ],
      "must_not": [
        { "range": { "price": { "gte": 400 } } }
      ],
      "filter": [
        {
          "geo_distance": {
            "distance": "10km",
            "location": { "lat": 31.21, "lon": 121.5 }
          }
        }
      ]
    }
  }
}
Search Result Processing
- Match all documents
- Sorting: results are ordered by the conditions listed in sort. Several conditions may be given; a later condition is used only to break ties on the earlier ones.
GET /hotel/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    { "score": "desc" },
    { "price": "asc" }
  ]
}
GET /hotel/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "_geo_distance": {
        "location": { "lat": 31.375558, "lon": 121.533514 },
        "order": "asc",
        "unit": "km"
      }
    }
  ]
}
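The multi-condition ordering used in the first sort example (score descending, then price ascending) behaves like a chained comparator. The sketch below shows that tie-breaking logic in plain Java; the `Hotel` class here is a stand-in invented for this example and is not the entity used elsewhere in these notes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class MultiKeySortSketch {
    // Minimal stand-in for a hotel document (hypothetical fields).
    static class Hotel {
        final String name;
        final int score;
        final int price;

        Hotel(String name, int score, int price) {
            this.name = name;
            this.score = score;
            this.price = price;
        }
    }

    // Order by score descending; price ascending is consulted only on equal scores.
    static List<Hotel> sort(List<Hotel> hotels) {
        List<Hotel> sorted = new ArrayList<>(hotels);
        sorted.sort(Comparator.comparingInt((Hotel h) -> h.score).reversed()
                .thenComparingInt(h -> h.price));
        return sorted;
    }
}
```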
Pagination
GET /hotel/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    { "price": "desc" }
  ],
  "from": 0,
  "size": 2
}
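When paging from application code, `from` is usually derived from a 1-based page number. The helper below is a small sketch of that arithmetic; the class name `PagingSketch` is hypothetical, and the 10,000 limit reflects the default value of the `index.max_result_window` setting, which caps `from + size`.

```java
class PagingSketch {
    // Default index.max_result_window: from + size may not exceed this value.
    static final int MAX_RESULT_WINDOW = 10_000;

    // Convert a 1-based page number and a page size into the "from" offset.
    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        long from = (long) (page - 1) * size;
        if (from + size > MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException("from + size exceeds the result window");
        }
        return (int) from;
    }
}
```

For example, page 1 with size 2 maps to `from = 0`, which matches the request above.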

Highlighting

GET /hotel/_search
{
  "query": {
    "match": {
      "all": "如家"
    }
  },
  "highlight": {
    "fields": {
      "name": {
        "require_field_match": "false"
      }
    }
  }
}
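DSL Aggregations


Aggregations

A bucket aggregation counts the documents in each bucket as _count and, by default, sorts the buckets by _count in descending order.
GET /hotel/_search
{
  "size": 0,
  "aggs": {
    "brandAgg": {
      "terms": {
        "field": "brand",
        "size": 20,
        "order": { "_count": "desc" }
      }
    }
  }
}
Adding a query restricts the documents that are aggregated (here, hotels priced at 200 or less):
GET /hotel/_search
{
  "query": {
    "range": {
      "price": { "lte": 200 }
    }
  },
  "size": 0,
  "aggs": {
    "brandAgg": {
      "terms": {
        "field": "brand",
        "size": 20
      }
    }
  }
}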
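Conceptually, a terms bucket aggregation groups documents by a field value, counts each group, and returns the largest groups first. The sketch below reproduces that logic over an in-memory list of brand values. The class name `TermsAggSketch` is invented for this example, and breaking ties by key is an assumption made only to keep the output deterministic.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TermsAggSketch {
    // Group by brand, count each bucket, sort by count descending, keep the top "size".
    static Map<String, Long> termsAgg(List<String> brands, int size) {
        Map<String, Long> counts = brands.stream()
                .collect(Collectors.groupingBy(b -> b, Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(size)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }
}
```

The returned map preserves bucket order, so iterating it yields brands from the most to the least frequent.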

Nested aggregations
- Get the min, max, and avg of the user score for each brand, ordered by average score

POST /hotel/_search
{
  "size": 0,
  "aggs": {
    "brandAgg": {
      "terms": {
        "field": "brand",
        "size": 20,
        "order": { "scoreAgg.avg": "desc" }
      },
      "aggs": {
        "scoreAgg": {
          "stats": { "field": "score" }
        }
      }
    }
  }
}
Autocomplete
Usage
If the pinyin analyzer does not split the text into words, tokenize with IK first and then apply the pinyin filter.
- This must be configured when the index is created.


PUT /test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "ik_max_word",
          "filter": "py"
        }
      },
      "filter": {
        "py": {
          "type": "pinyin",
          "keep_full_pinyin": false,
          "keep_joined_full_pinyin": true,
          "keep_original": true,
          "limit_first_letter_length": 16,
          "remove_duplicated_term": true,
          "none_chinese_pinyin_tokenize": false
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "my_analyzer",
        "search_analyzer": "ik_smart"
      }
    }
  }
}
POST /test/_analyze
{
  "text": ["如家如家酒店"],
  "analyzer": "my_analyzer"
}
Summary:

Syntax
- The field must be declared with type completion when the index is created

PUT test
{
  "mappings": {
    "properties": {
      "title": {
        "type": "completion"
      }
    }
  }
}

POST test/_doc
{ "title": ["Sony", "WH-1000XM3"] }
POST test/_doc
{ "title": ["SK-II", "PITERA"] }
POST test/_doc
{ "title": ["Nintendo", "switch"] }
POST /test/_search
{
  "suggest": {
    "title_suggest": {
      "text": "s",
      "completion": {
        "field": "title",
        "skip_duplicates": true,
        "size": 10
      }
    }
  }
}
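The suggest request above returns the inputs that start with the given prefix. A rough in-memory model of that behavior is sketched below with a sorted map. `CompletionSketch` is a name invented for this example; the real completion suggester uses a specialized in-memory structure and ranks by weight, so the ordering here (alphabetical) is a simplification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class CompletionSketch {
    // Lower-cased input -> original text; sorted keys keep equal prefixes adjacent.
    private final TreeMap<String, String> inputs = new TreeMap<>();

    // Register completion inputs; putIfAbsent plays the role of skip_duplicates.
    void add(String... titles) {
        for (String title : titles) {
            inputs.putIfAbsent(title.toLowerCase(Locale.ROOT), title);
        }
    }

    // Return up to "size" inputs that start with the prefix (case-insensitive).
    List<String> suggest(String prefix, int size) {
        String p = prefix.toLowerCase(Locale.ROOT);
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : inputs.tailMap(p, true).entrySet()) {
            if (!e.getKey().startsWith(p) || result.size() >= size) {
                break; // past the prefix range, or enough suggestions collected
            }
            result.add(e.getValue());
        }
        return result;
    }
}
```

Loading the six titles indexed above and asking for the prefix "s" yields the three inputs that begin with "s" or "S".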
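RestClient (ES)
Elastic provides official clients in many languages for operating ES. In essence, these clients assemble DSL statements and send them to ES as HTTP requests. Official documentation: https://www.elastic.co/guide/en/elasticsearch/client/index.html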
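To make the "assemble DSL, send over HTTP" idea concrete, the sketch below builds a match query by hand and wraps it in a request using only the JDK's `java.net.http` API (Java 11+). The class and method names are assumptions for this example, and the JSON is built without escaping, so it only suits trusted, plain input. It is not how the official client is implemented internally.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class SearchRequestSketch {
    // Assemble the DSL for a match query (no JSON escaping: trusted input only).
    static String matchQuery(String field, String text) {
        return "{\"query\":{\"match\":{\"" + field + "\":\"" + text + "\"}}}";
    }

    // Wrap the DSL in an HTTP request to the index's _search endpoint.
    // POST is used because the JDK builder's GET() cannot carry a body.
    static HttpRequest build(String baseUrl, String index, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}
```

Sending the request would be a call such as `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`; with security enabled, an authorization header would also be needed.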
1. Dependencies
<properties>
    <elasticsearch.version>7.12.1</elasticsearch.version>
</properties>
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.12.1</version>
</dependency>
2. Initialize RestHighLevelClient
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(
        HttpHost.create("http://192.168.238.3:9200")
));
Index Operations

Create
- Request: CreateIndexRequest
- Method: create
CreateIndexRequest request = new CreateIndexRequest("hotel");
// MAPPING_TEMPLATE is a constant holding the index mapping as a JSON string
request.source(MAPPING_TEMPLATE, XContentType.JSON);
client.indices().create(request, RequestOptions.DEFAULT);
Delete
- Request: DeleteIndexRequest
- Method: delete
DeleteIndexRequest request = new DeleteIndexRequest("hotel");
client.indices().delete(request, RequestOptions.DEFAULT);
Query (check existence)
- Request: GetIndexRequest
- Method: exists
GetIndexRequest request = new GetIndexRequest("hotel");
boolean exists = client.indices().exists(request, RequestOptions.DEFAULT);
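Document Operations
Create
// Load the entity from the database and convert it to the document type
Hotel hotel = hotelService.getById(395434);
HotelDoc hotelDoc = new HotelDoc(hotel);
// Index the document as JSON under the hotel's ID
IndexRequest request = new IndexRequest("hotel").id(hotel.getId().toString());
request.source(JSON.toJSONString(hotelDoc), XContentType.JSON);
client.index(request, RequestOptions.DEFAULT);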
GET /hotel/_doc/395434
Get

GetRequest request = new GetRequest("hotel", "395434");
GetResponse response = client.get(request, RequestOptions.DEFAULT);
// Deserialize the _source JSON into the document type
String json = response.getSourceAsString();
HotelDoc hotelDoc = JSON.parseObject(json, HotelDoc.class);
Update

UpdateRequest request = new UpdateRequest("hotel", "395434");
// Partial update: alternating field names and values
request.doc(
        "price", "992",
        "starName", "满天星级"
);
client.update(request, RequestOptions.DEFAULT);
Delete
DeleteRequest request = new DeleteRequest("hotel", "395434");
client.delete(request, RequestOptions.DEFAULT);
Bulk import
@Test
void testBulkRequest() throws IOException {
    // Load all hotels from the database
    List<Hotel> hotels = hotelService.list();
    BulkRequest request = new BulkRequest();
    // Add one index request per hotel to the bulk request
    for (Hotel hotel : hotels) {
        HotelDoc hotelDoc = new HotelDoc(hotel);
        request.add(new IndexRequest("hotel")
                .id(hotelDoc.getId().toString())
                .source(JSON.toJSONString(hotelDoc), XContentType.JSON));
    }
    client.bulk(request, RequestOptions.DEFAULT);
}
Document Queries and Result Handling
QueryBuilders
Setup:
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(
        HttpHost.create("http://192.168.238.3:9200")
));
SearchRequest request = new SearchRequest("hotel");
Queries


// Each statement below is an alternative: a later query() call replaces the earlier one
// match_all
request.source().query(QueryBuilders.matchAllQuery());
// match
request.source().query(QueryBuilders.matchQuery("all", "如家"));
// multi_match
request.source().query(QueryBuilders.multiMatchQuery("如家", "name", "business"));
// term
request.source().query(QueryBuilders.termQuery("city", "上海"));
// range
request.source().query(QueryBuilders.rangeQuery("price").lte(1500).gte(100));
// bool
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
boolQuery
        .must(QueryBuilders.termQuery("city", "上海"))
        .mustNot(QueryBuilders.rangeQuery("price").lte(100))
        .filter(QueryBuilders.rangeQuery("price").gte(1000));
request.source().query(boolQuery);
Highlighting
request.source()
        .query(QueryBuilders.matchQuery("all", "如家"))
        .highlighter(new HighlightBuilder().field("name").requireFieldMatch(false));

// Parsing: replace the name with the highlighted fragment when one exists
Map<String, HighlightField> highlightFields = hit.getHighlightFields();
if (!highlightFields.isEmpty()) {
    HighlightField highlightField = highlightFields.get("name");
    if (highlightField != null) {
        String name = highlightField.getFragments()[0].toString();
        hotelDoc.setName(name);
    }
}
Distance sorting

// location is a "lat,lon" string taken from the request parameters
String location = params.getLocation();
if (location != null && !location.equals("")) {
    request.source().sort(SortBuilders
            .geoDistanceSort("location", new GeoPoint(location))
            .order(SortOrder.ASC)
            .unit(DistanceUnit.KILOMETERS)
    );
}
// Parsing: the sort values of each hit hold the computed distance
Object[] sortValues = hit.getSortValues();
Score control

// Boost documents flagged as ads by a weight of 10 on top of boolQuery
FunctionScoreQueryBuilder functionScoreQuery = QueryBuilders.functionScoreQuery(
        boolQuery,
        new FunctionScoreQueryBuilder.FilterFunctionBuilder[]{
                new FunctionScoreQueryBuilder.FilterFunctionBuilder(
                        QueryBuilders.termQuery("isAd", true),
                        ScoreFunctionBuilders.weightFactorFunction(10)
                )
        }
);
request.source().query(functionScoreQuery);
Results and parsing
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
SearchHits searchHits = response.getHits();
// Total number of matching documents
long value = searchHits.getTotalHits().value;
SearchHit[] hits = searchHits.getHits();
for (SearchHit hit : hits) {
    // Deserialize the _source JSON
    String json = hit.getSourceAsString();
    HotelDoc hotelDoc = JSON.parseObject(json, HotelDoc.class);
    // Replace the name with the highlighted fragment when one exists
    Map<String, HighlightField> highlightFields = hit.getHighlightFields();
    if (!highlightFields.isEmpty()) {
        HighlightField highlightField = highlightFields.get("name");
        if (highlightField != null) {
            String name = highlightField.getFragments()[0].toString();
            hotelDoc.setName(name);
        }
    }
    // Read the first sort value (for example, the distance) when sorting is used
    Object[] sortValues = hit.getSortValues();
    if (sortValues.length > 0) {
        Object sortValue = sortValues[0];
    }
    System.out.println("hotelDoc" + hotelDoc);
}
Aggregations


SearchRequest request = new SearchRequest("hotel");
// size 0: return only aggregation results, no documents
request.source().size(0);
request.source().aggregation(AggregationBuilders
        .terms("brandAgg")
        .field("brand")
        .size(10)
);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
Aggregations aggregations = response.getAggregations();
// Look up the aggregation by the name given in the request
Terms brandTerms = aggregations.get("brandAgg");
List<? extends Terms.Bucket> buckets = brandTerms.getBuckets();
for (Terms.Bucket bucket : buckets) {
    String key = bucket.getKeyAsString();
    long count = bucket.getDocCount();
}
Autocomplete

SearchRequest request = new SearchRequest("hotel");
request.source().suggest(new SuggestBuilder().addSuggestion(
        "suggestions",
        SuggestBuilders.completionSuggestion("suggestion")
                .prefix("pg")
                .skipDuplicates(true)
                .size(10)
));
SearchResponse response = client.search(request, RequestOptions.DEFAULT);

// Parsing: look up the suggestion by the name given in the request
Suggest suggest = response.getSuggest();
CompletionSuggestion suggestions = suggest.getSuggestion("suggestions");
List<CompletionSuggestion.Entry.Option> options = suggestions.getOptions();
for (CompletionSuggestion.Entry.Option option : options) {
    System.out.println(option.getText().string());
}
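MQ & ES Data Synchronization
Approach 1
Pros: simple and direct to implement
Cons: tight coupling between services

Approach 2 (listening to the database binlog)
Pros: services are completely decoupled
Cons: enabling the binlog adds load to the database; high implementation complexity

Approach 3 (notification through MQ)
Pros: loose coupling; moderate implementation difficulty
Cons: depends on the reliability of the MQ

Practice
- For database CRUD, the MQ consumer performs only two operations on ES: insert or delete. On insert, if a document with the same ID already exists it is overwritten (an update); otherwise a new document is created.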
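The "insert or delete only" rule can be modeled with two handlers over plain maps. In the sketch below, `db` stands in for the database and `index` for the ES index; both, along with the class name `SyncSketch`, are assumptions made for illustration. The point is that an insert message always writes the latest row under its ID, so creation and update share one code path.

```java
import java.util.HashMap;
import java.util.Map;

class SyncSketch {
    final Map<Long, String> db = new HashMap<>();       // stand-in for the database
    final Map<String, String> index = new HashMap<>();  // stand-in for the ES index: _id -> JSON

    // Handles both "created" and "updated" messages: put overwrites an existing _id.
    void onInsertOrUpdate(Long id) {
        String row = db.get(id);
        if (row == null) {
            return; // the row was removed before the message was consumed
        }
        index.put(id.toString(), row);
    }

    // Handles "deleted" messages; removing a missing _id is a no-op.
    void onDelete(Long id) {
        index.remove(id.toString());
    }
}
```

Because the message carries only the ID and the handler reloads the current row, replaying the same message leaves the index in the same state.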

Constants class
[Shared by consumer and publisher]
public class MqConstants {
    // Exchange
    public final static String HOTEL_EXCHANGE = "hotel.topic";
    // Queues for insert/update and delete messages
    public final static String HOTEL_INSERT_QUEUE = "hotel.insert.queue";
    public final static String HOTEL_DELETE_QUEUE = "hotel.delete.queue";
    // Routing keys for insert/update and delete messages
    public final static String HOTEL_INSERT_KEY = "hotel.insert";
    public final static String HOTEL_DELETE_KEY = "hotel.delete";
}
Publishing
spring:
  rabbitmq:
    host: 192.168.238.3
    port: 5672
    username: itcast
    password: 123321
    virtual-host: /
@Autowired
private RabbitTemplate rabbitTemplate;
// Send only the hotel ID; the consumer loads the current row itself
rabbitTemplate.convertAndSend(MqConstants.HOTEL_EXCHANGE, MqConstants.HOTEL_INSERT_KEY, hotel.getId());
Receiving
@Configuration
public class MqConfig {
    // Durable topic exchange that is not auto-deleted
    @Bean
    public TopicExchange topicExchange() {
        return new TopicExchange(MqConstants.HOTEL_EXCHANGE, true, false);
    }

    @Bean
    public Queue insertQueue() {
        return new Queue(MqConstants.HOTEL_INSERT_QUEUE, true);
    }

    @Bean
    public Queue deleteQueue() {
        return new Queue(MqConstants.HOTEL_DELETE_QUEUE, true);
    }

    // Bind each queue to the exchange with its routing key
    @Bean
    public Binding insertQueueBinding() {
        return BindingBuilder.bind(insertQueue()).to(topicExchange()).with(MqConstants.HOTEL_INSERT_KEY);
    }

    @Bean
    public Binding deleteQueueBinding() {
        return BindingBuilder.bind(deleteQueue()).to(topicExchange()).with(MqConstants.HOTEL_DELETE_KEY);
    }
}
@Component
public class HotelListener {
    @Autowired
    private IHotelService hotelService;
    // Assumes a RestHighLevelClient bean is registered in the Spring context
    @Autowired
    private RestHighLevelClient client;

    // Insert or update: index the current database row under the same ID
    @RabbitListener(queues = MqConstants.HOTEL_INSERT_QUEUE)
    public void listenHotelInsertOrUpdate(Long id) throws IOException {
        Hotel hotel = hotelService.getById(id);
        HotelDoc hotelDoc = new HotelDoc(hotel);
        IndexRequest request = new IndexRequest("hotel").id(hotel.getId().toString());
        request.source(JSON.toJSONString(hotelDoc), XContentType.JSON);
        client.index(request, RequestOptions.DEFAULT);
    }

    // Delete: remove the document with the given ID from the index
    @RabbitListener(queues = MqConstants.HOTEL_DELETE_QUEUE)
    public void listenHotelDelete(Long id) throws IOException {
        DeleteRequest request = new DeleteRequest("hotel", id.toString());
        client.delete(request, RequestOptions.DEFAULT);
    }
}
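ES Cluster
- The index of a single-node ES is split into primary and replica shards, and a primary shard and its replica are not stored on the same node.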
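The placement rule above can be illustrated with a deliberately naive round-robin scheme: shard i's primary goes to node i mod n and its replica to the next node. This is only a toy model under that assumption, with a made-up class name; Elasticsearch's real allocator also weighs disk usage, shard balance, and allocation settings.

```java
class ShardAllocationSketch {
    // Node index that holds the primary copy of a shard (toy round-robin rule).
    static int primaryNode(int shard, int nodes) {
        return shard % nodes;
    }

    // Node index that holds the replica; always differs from the primary's node.
    static int replicaNode(int shard, int nodes) {
        if (nodes < 2) {
            throw new IllegalArgumentException("a replica needs a second node");
        }
        return (shard + 1) % nodes;
    }
}
```

With three nodes and three shards, every node holds one primary and the replica of a different shard, so losing any single node still leaves a full copy of the data.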

Cluster setup
https://chen-1317386995.cos.ap-guangzhou.myqcloud.com/Java/Utils/%E5%AE%89%E8%A3%85elasticsearch.md
Cluster roles



Cluster data synchronization




Failover

