
A Detailed Tutorial on ES Full-Text Search

        


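ES (Elasticsearch) queries fall into two groups: statement queries and aggregation queries.

Statement queries include term-level queries, match queries, and compound queries.

Aggregation queries include metrics (statistics) and bucketing (grouping).

Structure of the ES Java API

Term-level queries

In a term-level query, ES does not analyze (tokenize) the query input. A document is returned only when one of its indexed terms exactly matches the query string.

Exact-value query - term

An exact-value query selects every record whose field equals a specific value.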

SQL:

select * from person where name = '张无忌';

The equivalent ES query looks quite different (note the .keyword suffix on the field name):

GET /person/_search
{
  "query": {
    "term": {
      "name.keyword": {
        "value": "张无忌",
        "boost": 1.0
      }
    }
  }
}

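Starting with Elasticsearch 5.0, the string type was removed and split into two new data types: text, which is analyzed and used for full-text search, and keyword, which is indexed as a single unanalyzed term and used for exact-value search. With the default dynamic mapping, a string field such as name is indexed as text with a name.keyword sub-field, which is why the term query targets name.keyword.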
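
The difference between text and keyword can be illustrated with a small plain-Java model. This is only a sketch, not Elasticsearch code: TermVsTextSketch and its methods are hypothetical names, and analyzeAsText is a simplified stand-in for the default analyzer (it lowercases Latin words and emits each Han character as its own token). The name is written in pinyin so that the source stays ASCII-only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of how text and keyword fields are indexed; not the ES API.
final class TermVsTextSketch {

    // Simplified stand-in for the default analyzer (an assumption):
    // letters and digits are grouped into lowercased words,
    // and each Han character becomes a token of its own.
    static List<String> analyzeAsText(String value) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        int i = 0;
        while (i < value.length()) {
            int cp = value.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN) {
                flush(word, tokens);
                tokens.add(new String(Character.toChars(cp)));
            } else if (Character.isLetterOrDigit(cp)) {
                word.appendCodePoint(Character.toLowerCase(cp));
            } else {
                flush(word, tokens);
            }
        }
        flush(word, tokens);
        return tokens;
    }

    // Emits the pending word, if any, as a token.
    private static void flush(StringBuilder word, List<String> tokens) {
        if (word.length() > 0) {
            tokens.add(word.toString());
            word.setLength(0);
        }
    }

    // A keyword field keeps the whole value as one term.
    static List<String> indexAsKeyword(String value) {
        return Collections.singletonList(value);
    }

    // A term query does not analyze its input; it looks for one exact indexed term.
    static boolean termQuery(List<String> indexedTerms, String term) {
        return indexedTerms.contains(term);
    }

    public static void main(String[] args) {
        String name = "Zhang Wuji";
        System.out.println(analyzeAsText(name));                    // [zhang, wuji]
        System.out.println(termQuery(analyzeAsText(name), name));   // false
        System.out.println(termQuery(indexAsKeyword(name), name));  // true
    }
}
```

In this model the term query fails on the text field because no single token equals the full input, while the keyword field holds the value unchanged.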

Multi-value query - terms

A terms query matches any of several values, similar to an IN clause in MySQL. For example:

SQL:

select * from persons where sect in('明教','武当派');

ES query:

GET /person/_search
{
  "query": {
    "terms": {
      "sect.keyword": [
        "明教",
        "武当派"
      ],
      "boost": 1.0
    }
  }
}

Java implementation:

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
searchSourceBuilder.query(QueryBuilders.termsQuery("sect.keyword", Arrays.asList("明教", "武当派")));

Range query - range

SQL:

select * from persons where age between 18 and 22;

ES query:

GET /person/_search
{
  "query": {
    "range": {
      "age": {
        "from": 18,
        "to": 22,
        "include_lower": true,
        "include_upper": true,
        "boost": 1.0
      }
    }
  }
}

Java query construction:

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query (inclusive bounds, matching the SQL between 18 and 22)
searchSourceBuilder.query(QueryBuilders.rangeQuery("age").gte(18).lte(22));
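
The inclusive bounds of the range query can be checked against a small plain-Java model. This is a sketch under assumptions, not ES code: RangeSketch and inRange are hypothetical names that mimic the from/to bounds and the include_lower/include_upper flags of the query body. With both flags set to true, the check behaves like SQL BETWEEN and like gte/lte on the Java builder.

```java
// Hypothetical model of the range check; not the ES API.
final class RangeSketch {

    // Returns true when value lies between from and to, honoring the inclusive flags.
    static boolean inRange(int value, int from, int to,
                           boolean includeLower, boolean includeUpper) {
        boolean aboveLower = includeLower ? value >= from : value > from;
        boolean belowUpper = includeUpper ? value <= to : value < to;
        return aboveLower && belowUpper;
    }

    public static void main(String[] args) {
        // Both flags true: same result as "age between 18 and 22".
        System.out.println(inRange(18, 18, 22, true, true));   // true
        // Excluding the upper bound drops age 22.
        System.out.println(inRange(22, 18, 22, true, false));  // false
    }
}
```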

Prefix query - prefix

SQL:

select * from persons where sect like '武当%';

ES query:

{
  "query": {
    "prefix": {
      "sect.keyword": {
        "value": "武当",
        "boost": 1.0
      }
    }
  }
}

Java query construction:

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
searchSourceBuilder.query(QueryBuilders.prefixQuery("sect.keyword","武当"));

Wildcard query - wildcard

SQL:

select * from persons where name like '张%忌';

ES query:

{
  "query": {
    "wildcard": {
      "name.keyword": {
        "value": "张*忌",
        "boost": 1.0
      }
    }
  }
}

Java query construction:

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query (on the name field, matching the SQL)
searchSourceBuilder.query(QueryBuilders.wildcardQuery("name.keyword","张*忌"));
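
The SQL LIKE pattern and the ES wildcard pattern differ mainly in their wildcard characters, so the translation can be sketched in plain Java. LikeToWildcard is a hypothetical helper, not part of any ES client. It assumes the ES wildcard syntax (* for any run of characters, ? for exactly one character, backslash as the escape character) and does not handle SQL ESCAPE clauses.

```java
// Hypothetical helper that rewrites a SQL LIKE pattern as an ES wildcard pattern.
final class LikeToWildcard {

    static String convert(String likePattern) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < likePattern.length(); i++) {
            char c = likePattern.charAt(i);
            switch (c) {
                case '%':               // LIKE: any run of characters
                    out.append('*');
                    break;
                case '_':               // LIKE: exactly one character
                    out.append('?');
                    break;
                case '*':
                case '?':
                case '\\':              // literal in LIKE, special in a wildcard query
                    out.append('\\').append(c);
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // ASCII stand-in for the name pattern used in the SQL example.
        System.out.println(convert("Zhang%ji"));  // Zhang*ji
    }
}
```

The result would be passed as the pattern argument of QueryBuilders.wildcardQuery. Patterns that start with a wildcard can be slow on large indices, so they are best used sparingly.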

Compound queries

SQL:

select * from persons where sex = '女' and sect = '明教';

ES query:

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "sex": {
              "value": "女",
              "boost": 1.0
            }
          }
        },
        {
          "term": {
            "sect.keyword": {
              "value": "明教",
              "boost": 1.0
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  }
}

Java query construction:

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
searchSourceBuilder.query(QueryBuilders.boolQuery()
        .must(QueryBuilders.termQuery("sex", "女"))
        .must(QueryBuilders.termQuery("sect.keyword", "明教"))
);

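Boolean query

  • must: every clause must match; multiple clauses combine like AND.
  • must_not: no clause may match; equivalent to != or NOT IN.
  • should: at least n clauses must match, like OR; n is set with minimum_should_match.

Java query construction: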

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
searchSourceBuilder.query(QueryBuilders.boolQuery()
        .must(QueryBuilders.termQuery("sex", "女"))
        .should(QueryBuilders.termQuery("address.keyword", "峨眉山"))
        .should(QueryBuilders.termQuery("sect.keyword", "明教"))
        .minimumShouldMatch(1)
);

Now a more complex example that combines the bool clauses:

SQL:

select *
from persons
where sex = '女'
  and age between 30 and 40
  and sect != '明教'
  and (address = '峨眉山' OR skill = '暗器');

Java construction of this query:

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
        .must(QueryBuilders.termQuery("sex", "女"))
        .must(QueryBuilders.rangeQuery("age").gte(30).lte(40))
        .mustNot(QueryBuilders.termQuery("sect.keyword", "明教"))
        .should(QueryBuilders.termQuery("address.keyword", "峨眉山"))
        .should(QueryBuilders.termQuery("skill.keyword", "暗器"))
        .minimumShouldMatch(1);  // minimum number of should clauses that must match
// Attach the BoolQueryBuilder to the SearchSourceBuilder
searchSourceBuilder.query(boolQueryBuilder);
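
How the bool clauses combine can be modeled in plain Java. The sketch below is not ES code: BoolQuerySketch, term, range, and doc are hypothetical names, documents are plain maps, and no scoring is done. The string values are ASCII stand-ins for the values in the SQL above, and minimumShouldMatch must be set explicitly in this model.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical in-memory model of bool query matching; not the ES API, no scoring.
final class BoolQuerySketch {
    private final List<Predicate<Map<String, Object>>> must = new ArrayList<>();
    private final List<Predicate<Map<String, Object>>> mustNot = new ArrayList<>();
    private final List<Predicate<Map<String, Object>>> should = new ArrayList<>();
    private int minimumShouldMatch = 0;

    BoolQuerySketch must(Predicate<Map<String, Object>> p) { must.add(p); return this; }
    BoolQuerySketch mustNot(Predicate<Map<String, Object>> p) { mustNot.add(p); return this; }
    BoolQuerySketch should(Predicate<Map<String, Object>> p) { should.add(p); return this; }
    BoolQuerySketch minimumShouldMatch(int n) { minimumShouldMatch = n; return this; }

    boolean matches(Map<String, Object> doc) {
        for (Predicate<Map<String, Object>> p : must) {
            if (!p.test(doc)) return false;      // every must clause is required (AND)
        }
        for (Predicate<Map<String, Object>> p : mustNot) {
            if (p.test(doc)) return false;       // any must_not hit excludes the document
        }
        long matched = should.stream().filter(p -> p.test(doc)).count();
        return matched >= minimumShouldMatch;    // at least n should clauses (OR when n = 1)
    }

    // Exact-value check, like a term query on a keyword field.
    static Predicate<Map<String, Object>> term(String field, Object value) {
        return doc -> value.equals(doc.get(field));
    }

    // Inclusive numeric range, like rangeQuery(field).gte(...).lte(...).
    static Predicate<Map<String, Object>> range(String field, int gte, int lte) {
        return doc -> {
            Object v = doc.get(field);
            return v instanceof Integer && (Integer) v >= gte && (Integer) v <= lte;
        };
    }

    // The combined query, with ASCII stand-ins for the values in the SQL.
    static BoolQuerySketch example() {
        return new BoolQuerySketch()
                .must(term("sex", "female"))
                .must(range("age", 30, 40))
                .mustNot(term("sect", "Mingjiao"))
                .should(term("address", "Emei"))
                .should(term("skill", "darts"))
                .minimumShouldMatch(1);
    }

    // Builds a sample document as a field-to-value map.
    static Map<String, Object> doc(String sex, int age, String sect, String address, String skill) {
        Map<String, Object> d = new HashMap<>();
        d.put("sex", sex);
        d.put("age", age);
        d.put("sect", sect);
        d.put("address", address);
        d.put("skill", skill);
        return d;
    }
}
```

For instance, BoolQuerySketch.example().matches(BoolQuerySketch.doc("female", 35, "Wudang", "Emei", "sword")) is true, while the same document with sect "Mingjiao" is rejected by the must_not clause.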

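Filter queries

Difference between query and filter: in query context, ES checks whether each document matches and also computes a relevance score before returning results. In filter context, ES only decides whether a document matches. No score is computed, and the results of frequently used filters can be cached, so filters are usually faster.

Filters can be used in several ways; the examples below show two of them.

Option 1, filter on its own: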

{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "sex": {
              "value": "男",
              "boost": 1.0
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  }
}

Java query construction:

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query
searchSourceBuilder.query(QueryBuilders.boolQuery()
        .filter(QueryBuilders.termQuery("sex", "男"))
);

Option 2, alongside must and must_not, which works like filtering the result of a subquery:

SQL:

select * from (select * from persons where sect = '明教') a where sex = '女';

Java query construction:

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Build the query: must is scored, filter only narrows the result
searchSourceBuilder.query(QueryBuilders.boolQuery()
        .must(QueryBuilders.termQuery("sect.keyword", "明教"))
        .filter(QueryBuilders.termQuery("sex", "女"))
);

Aggregation queries

Metrics: maximum, minimum, and average age

SQL:

select max(age) from persons;

Java query:

@Autowired
private RestHighLevelClient client;

@Test
public void maxQueryTest() throws IOException {
    // Define the aggregation
    AggregationBuilder aggBuilder = AggregationBuilders.max("max_age").field("age");
    SearchRequest searchRequest = new SearchRequest("person");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Attach the aggregation to the SearchSourceBuilder
    searchSourceBuilder.aggregation(aggBuilder);
    System.out.println("searchSourceBuilder----->" + searchSourceBuilder);
    searchRequest.source(searchSourceBuilder);
    // Execute the search and get the SearchResponse
    SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
    System.out.println(JSONObject.toJSON(response));
}

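By default the response contains only 10 documents (hits). The number of returned documents can be controlled with size.

In Java, add the following statement: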

searchSourceBuilder.size(20);

Distinct count

SQL:

select count(distinct sect) from persons;

Java query:

@Test
public void cardinalityQueryTest() throws IOException {
    // Create a request for the target index
    SearchRequest searchRequest = new SearchRequest("person");
    // Search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Define the aggregation
    AggregationBuilder aggBuilder = AggregationBuilders.cardinality("sect_count").field("sect.keyword");
    // Return no hits, only the aggregation result
    searchSourceBuilder.size(0);
    // Attach the aggregation to the search source
    searchSourceBuilder.aggregation(aggBuilder);
    System.out.println("searchSourceBuilder----->" + searchSourceBuilder);
    searchRequest.source(searchSourceBuilder);
    // Execute the search and get the result
    SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
    System.out.println(JSONObject.toJSON(response));
}

Bucket (group-by) aggregations

Grouping by a single field

SQL:

select sect,count(id) from mytest.persons group by sect;

Java query:

SearchRequest searchRequest = new SearchRequest("person");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.size(0);
// Group by sect
AggregationBuilder aggBuilder = AggregationBuilders.terms("sect_count").field("sect.keyword");
searchSourceBuilder.aggregation(aggBuilder);

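Filtered aggregations

All the aggregation examples so far omitted the query, so the whole request was just an aggregation and it ran over every document. In practice we often aggregate over a specific subset of the data, which is done by adding a query to the same request.

SQL:

select max(age) from mytest.persons where sect = '明教';

Java query: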

SearchRequest searchRequest = new SearchRequest("person");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Define the aggregation
AggregationBuilder maxBuilder = AggregationBuilders.max("max_age").field("age");
// Exact-value query that limits the documents being aggregated
searchSourceBuilder.query(QueryBuilders.termQuery("sect.keyword", "明教"));
searchSourceBuilder.aggregation(maxBuilder);
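
What the aggregations in this section compute can be reproduced on an in-memory list with Java streams. This is a sketch, not ES code: AggregationSketch, Person, and the sample data (ASCII stand-ins for the sect names, with made-up ages) are hypothetical. The distinct count here is exact, whereas the ES cardinality aggregation returns an approximate count on large data sets.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.OptionalInt;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical in-memory counterpart of the max, cardinality, and terms aggregations.
final class AggregationSketch {

    static final class Person {
        final String sect;
        final int age;

        Person(String sect, int age) {
            this.sect = sect;
            this.age = age;
        }
    }

    // select max(age) from persons
    static OptionalInt maxAge(List<Person> persons) {
        return persons.stream().mapToInt(p -> p.age).max();
    }

    // select count(distinct sect) from persons (exact count in this model)
    static long distinctSects(List<Person> persons) {
        return persons.stream().map(p -> p.sect).distinct().count();
    }

    // select sect, count(id) from persons group by sect
    static Map<String, Long> countBySect(List<Person> persons) {
        return persons.stream()
                .collect(Collectors.groupingBy(p -> p.sect, TreeMap::new, Collectors.counting()));
    }

    // select max(age) from persons where sect = ?
    static OptionalInt maxAgeInSect(List<Person> persons, String sect) {
        return persons.stream()
                .filter(p -> p.sect.equals(sect))
                .mapToInt(p -> p.age)
                .max();
    }

    // Made-up sample data for illustration only.
    static List<Person> sample() {
        return Arrays.asList(
                new Person("Mingjiao", 25),
                new Person("Mingjiao", 40),
                new Person("Wudang", 60));
    }

    public static void main(String[] args) {
        List<Person> persons = sample();
        System.out.println(maxAge(persons).getAsInt());                    // 60
        System.out.println(distinctSects(persons));                        // 2
        System.out.println(countBySect(persons));                          // {Mingjiao=2, Wudang=1}
        System.out.println(maxAgeInSect(persons, "Mingjiao").getAsInt());  // 40
    }
}
```

The filtered variant applies the condition before aggregating, which mirrors adding a query next to the aggregation in the same search request.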