
Learning Elasticsearch: the terms set query

2023-09-12 18:04:00

What is the terms set query?

The terms set query returns documents that contain at least a minimum number of the exact terms supplied for a given field.

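How does the terms set query differ from the terms query?

The only difference between the terms set query and the terms query is that you can specify the minimum number of terms that must match for a document to be retrieved.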

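What is the minimum_should_match_field parameter?

The name of a numeric field in the document whose value is used as the minimum number of terms that must match for the document to be returned.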
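The rule can be sketched in plain Java, without any Elasticsearch dependency. This is only an illustration of the matching condition, not Elasticsearch's implementation: the class name `TermsSetFieldSketch`, its methods, and the colour terms are hypothetical, and scoring is ignored.

```java
import java.util.List;
import java.util.Set;

// Illustrative only: mimics the matching rule, not Elasticsearch's implementation or scoring.
class TermsSetFieldSketch {

    // Counts how many of the query terms occur among the document's field values.
    static long countMatches(Set<String> docTerms, List<String> queryTerms) {
        return queryTerms.stream().filter(docTerms::contains).count();
    }

    // A document is returned when it matches at least as many terms
    // as the number stored in its own numeric field (requiredMatches).
    static boolean matches(Set<String> docTerms, long requiredMatches, List<String> queryTerms) {
        return countMatches(docTerms, queryTerms) >= requiredMatches;
    }

    public static void main(String[] args) {
        Set<String> docTerms = Set.of("red", "green");              // hypothetical field values
        List<String> queryTerms = List.of("red", "blue", "green");  // hypothetical query terms
        System.out.println(matches(docTerms, 2, queryTerms)); // prints true  (2 matched, 2 required)
        System.out.println(matches(docTerms, 3, queryTerms)); // prints false (2 matched, 3 required)
    }
}
```

The required count travels with each document, so two documents can face different thresholds for the same query.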

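What is the minimum_should_match_script parameter?

A custom script that determines the minimum number of terms that must match for a document to be returned. It is useful when the required number of matches has to be set dynamically.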
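To make the idea of a dynamic threshold concrete, here is a small plain-Java sketch. It assumes the script can be modelled as a function of two numbers: the count of query terms and a numeric value read from the document. The class `TermsSetScriptSketch` and the sample terms are hypothetical, and this is not how Elasticsearch evaluates Painless scripts.

```java
import java.util.List;
import java.util.Set;
import java.util.function.LongBinaryOperator;

// Illustrative only: the "script" is modelled as a Java function (numTerms, docFieldValue) -> required.
class TermsSetScriptSketch {

    static boolean matches(Set<String> docTerms, long docFieldValue,
                           List<String> queryTerms, LongBinaryOperator script) {
        long numTerms = queryTerms.size();                            // number of terms in the query
        long required = script.applyAsLong(numTerms, docFieldValue);  // threshold computed per document
        long matched = queryTerms.stream().filter(docTerms::contains).count();
        return matched >= required;
    }

    public static void main(String[] args) {
        Set<String> docTerms = Set.of("red", "green", "blue");  // hypothetical field values
        List<String> queryTerms = List.of("red", "green");      // hypothetical query terms

        // Threshold = number of query terms (2): 2 matched, so the document qualifies.
        System.out.println(matches(docTerms, 3, queryTerms, (n, v) -> n)); // prints true
        // Threshold = the document's own value (3): only 2 matched, so it does not.
        System.out.println(matches(docTerms, 3, queryTerms, (n, v) -> v)); // prints false
        // Threshold = the smaller of the two (2): the document qualifies again.
        System.out.println(matches(docTerms, 3, queryTerms, Math::min));   // prints true
    }
}
```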

Example

Let's first create the index:

`
 PUT product
 {
   "mappings": {
     "properties": {
       "name": {
         "type": "keyword"
       },
       "tags": {
         "type": "keyword"
        },
        "tags_count": {
          "type": "long"
        }
      }
    }
  }
`

Let's index some sample documents:

`
  POST product/_doc/prod1
  {
    "name":"Iphone 13",
    "tags":["apple","iphone","mobile"],
    "tags_count":3
  }
  POST product/_doc/prod2
  {
    "name":"Iphone 12",
    "tags":["apple","iphone"],
    "tags_count":2
  }
  POST product/_doc/prod3
  {
    "name":"Iphone 11",
    "tags":["apple","mobile"],
    "tags_count":2
  }
`

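Querying with the minimum_should_match_field parameter:

Use case 1: the query below returns all 3 documents. The minimum number of matching terms (tags_count) is 3 for prod1, 2 for prod2 and 2 for prod3, and the query passes 3 terms in total ("apple", "iphone", "mobile"), so each document matches at least as many terms as its tags_count requires.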

 POST product/_search
 {
   "query": {
     "terms_set": {
       "tags": {
         "terms": [ "apple", "iphone", "mobile" ],
         "minimum_should_match_field": "tags_count"
       }
     }
    }
  }

The result of the query above is:

 `    "hits": {
     "total": {
       "value": 3,
       "relation": "eq"
     },
     "max_score": 1.4010588,
     "hits": [
       {
         "_index": "product",
          "_id": "prod1",
          "_score": 1.4010588,
          "_source": {
            "name": "Iphone 13",
            "tags": [
              "apple",
              "iphone",
              "mobile"
            ],
            "tags_count": 3
          }
        },
        {
          "_index": "product",
          "_id": "prod2",
          "_score": 0.7876643,
          "_source": {
            "name": "Iphone 12",
            "tags": [
              "apple",
              "iphone"
            ],
            "tags_count": 2
          }
        },
        {
          "_index": "product",
          "_id": "prod3",
          "_score": 0.7876643,
          "_source": {
            "name": "Iphone 11",
            "tags": [
              "apple",
              "mobile"
            ],
            "tags_count": 2
          }
        }
      ]
    }`

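Use case 2: the query below returns only one document, because only 2 terms are passed in the query and prod3 is the only document that matches enough of them. prod1 is not returned because its tags_count is 3 while the query passes only 2 terms. prod2 is not returned because its tags_count is 2 but it matches only "apple".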

 POST product/_search
 {
   "query": {
     "terms_set": {
       "tags": {
         "terms": [ "apple", "mobile" ],
         "minimum_should_match_field": "tags_count"
       }
     }
    }
  }

The result of the query above is:

 `    "hits": {
     "total": {
       "value": 1,
       "relation": "eq"
     },
     "max_score": 0.5007585,
     "hits": [
       {
         "_index": "product",
          "_id": "prod3",
          "_score": 0.5007585,
          "_source": {
            "name": "Iphone 11",
            "tags": [
              "apple",
              "mobile"
            ],
            "tags_count": 2
          }
        }
      ]
    }`
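Both use cases can be checked by hand with a short plain-Java sketch over the three sample documents. It assumes, as in the sample data, that tags_count equals the number of tags in each document. The class `ProductUseCases` is hypothetical, and hits are listed in insertion order rather than by relevance score.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: replays use cases 1 and 2 in memory, ignoring scoring.
class ProductUseCases {

    // Tags of the three sample documents, keyed by document id.
    static final Map<String, Set<String>> TAGS = new LinkedHashMap<>();
    static {
        TAGS.put("prod1", Set.of("apple", "iphone", "mobile"));
        TAGS.put("prod2", Set.of("apple", "iphone"));
        TAGS.put("prod3", Set.of("apple", "mobile"));
    }

    // Returns the ids of documents that match at least tags_count of the query terms.
    static List<String> search(List<String> queryTerms) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Set<String>> doc : TAGS.entrySet()) {
            Set<String> tags = doc.getValue();
            long tagsCount = tags.size(); // same value as the indexed tags_count
            long matched = queryTerms.stream().filter(tags::contains).count();
            if (matched >= tagsCount) {
                hits.add(doc.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(search(List.of("apple", "iphone", "mobile"))); // prints [prod1, prod2, prod3]
        System.out.println(search(List.of("apple", "mobile")));           // prints [prod3]
    }
}
```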

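minimum_should_match_script example:

Now let's see how to retrieve the same indexed data using a dynamic value for the minimum number of matches.

In the example below, the number of terms supplied in the query is used as the minimum-should-match value. The script reads that number from params.num_terms. The required number of matching terms cannot exceed params.num_terms, which is the number of terms given in the terms field.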

 POST product/_search
 {
   "query": {
     "terms_set": {
       "tags": {
         "terms": ["apple","iphone"],
         "minimum_should_match_script": {
           "source": "params.num_terms"
         }
        }
      }
    }
  }

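It returns prod1 and prod2: the minimum_should_match value is set to 2 because the query passes only 2 terms, and both documents contain "apple" and "iphone". The response to the request above is: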

 `      "hits": [
       {
         "_index": "product",
         "_id": "prod2",
         "_score": 0.5007585,
         "_source": {
           "name": "Iphone 12",
           "tags": [
             "apple",
              "iphone"
            ],
            "tags_count": 2
          }
        },
        {
          "_index": "product",
          "_id": "prod1",
          "_score": 0.5007585,
          "_source": {
            "name": "Iphone 13",
            "tags": [
              "apple",
              "iphone",
              "mobile"
            ],
            "tags_count": 3
          }
        }
      ]
    }`

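Now consider a scenario where the required number of matches should be the smaller of tags_count and the number of terms in the query. The following query handles that case: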

 POST product/_search
 {
   "query": {
     "terms_set": {
       "tags": {
         "terms": ["apple","iphone"],
         "minimum_should_match_script": {
           "source": "Math.min(params.num_terms, doc['tags_count'].value)"
         }
        }
      }
    }
  }

The result of the query above is:

 `      "hits": [
       {
         "_index": "product",
         "_id": "prod2",
         "_score": 0.61233616,
         "_source": {
           "name": "Iphone 12",
           "tags": [
             "apple",
              "iphone"
            ],
            "tags_count": 2
          }
        },
        {
          "_index": "product",
          "_id": "prod1",
          "_score": 0.61233616,
          "_source": {
            "name": "Iphone 13",
            "tags": [
              "apple",
              "iphone",
              "mobile"
            ],
            "tags_count": 3
          }
        }
      ]
    }`

Terms set query with the Elasticsearch Java clients

The code below shows how to build a terms set query with the Elasticsearch Java clients.

Using new Java API Client (8.x)

`
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import co.elastic.clients.elasticsearch._types.query_dsl.BoolQuery;
 import co.elastic.clients.elasticsearch._types.query_dsl.Query;
 import co.elastic.clients.elasticsearch._types.query_dsl.TermsSetQuery;
 import co.elastic.clients.json.JsonData;

 List<String> tags = List.of("apple", "iphone");
 // Using the minimum_should_match_field param
 Query query1 = Query.of(q -> q.bool(BoolQuery.of(bq -> bq.must(ts -> ts.termsSet(
 		TermsSetQuery.of(tq -> tq.field("tags").minimumShouldMatchField("tags_count").terms(tags)))))));
 // Using the minimum_should_match_script param (named query2: query1 is already declared above)
 // The inline(...) script builder is kept from the original; its shape may differ between client releases.
 Map<String, JsonData> param = new HashMap<>();
 Query query2 = Query
 		.of(q -> q.bool(BoolQuery.of(bq -> bq.must(ts -> ts.termsSet(TermsSetQuery.of(tq -> tq.field("tags")
 				.minimumShouldMatchScript(sc -> sc.inline(in -> in.lang("painless").source("params.num_terms").params(param)))
 				.terms(tags)))))));
`

Using the Java High Level REST Client (deprecated)

`
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import org.elasticsearch.index.query.QueryBuilder;
 import org.elasticsearch.index.query.QueryBuilders;
 import org.elasticsearch.index.query.TermsSetQueryBuilder;
 import org.elasticsearch.script.Script;
 import org.elasticsearch.script.ScriptType;

 List<String> tags = List.of("apple", "iphone");
 // Using minimum_should_match_field
 QueryBuilder fieldQuery = QueryBuilders.boolQuery()
 		.must(new TermsSetQueryBuilder("tags", tags).setMinimumShouldMatchField("tags_count"));
 // Using minimum_should_match_script (param and script are declared once; the original declared them twice)
 Map<String, Object> param = new HashMap<>();
 Script script = new Script(ScriptType.INLINE, "painless", "params.num_terms", param);
 QueryBuilder scriptQuery = QueryBuilders.boolQuery()
 		.must(new TermsSetQueryBuilder("tags", tags).setMinimumShouldMatchScript(script));
`
