23 个例子让你彻底搞定 Elasticsearch 的搜索查询语法

例子
最基础的 match query
提高相关度
布尔查询
模糊查询
通配符查询
正则表达式查询
短句查询
Query String
简化的 Query String
Term/Terms 查询
Term 查询 - 排序
范围查询
筛选布尔查询
相关度函数： Field Value Factor
相关度函数: 衰变函数
相关度函数：自定义函数

原文 https://dzone.com/articles/23-useful-elasticsearch-example-queries

为了说明 Elasticsearch 里不同的搜索类型，我们将会搜索一张名为 book 的表，并且拥有以下几个字段: title, authors, summary, release, data 和 number of reviews

首先我们将使用bulk API来准备一些测试数据

PUT /bookdb_index
    { "settings": { "number_of_shards": 1 }}


POST /bookdb_index/book/_bulk
    { "index": { "_id": 1 }}
    { "title": "Elasticsearch: The Definitive Guide", "authors": ["clinton gormley", "zachary tong"], "summary" : "A distibuted real-time search and analytics engine", "publish_date" : "2015-02-07", "num_reviews": 20, "publisher": "oreilly" }
    { "index": { "_id": 2 }}
    { "title": "Taming Text: How to Find, Organize, and Manipulate It", "authors": ["grant ingersoll", "thomas morton", "drew farris"], "summary" : "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization", "publish_date" : "2013-01-24", "num_reviews": 12, "publisher": "manning" }
    { "index": { "_id": 3 }}
    { "title": "Elasticsearch in Action", "authors": ["radu gheorge", "matthew lee hinman", "roy russo"], "summary" : "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms", "publish_date" : "2015-12-03", "num_reviews": 18, "publisher": "manning" }
    { "index": { "_id": 4 }}

例子

最基础的 match query

有两种方法可以执行一个基础的 full-text 查询：使用轻量的 Search Api，这种方式通过在 Url 中加入参数来进行简单的查询，或者通过 JSON 格式的 request body，这种方式可以使用完整的 Elasticsearch 搜索 DSL

这是一个基础的查询，搜索所有字段，匹配字符 “guide”

GET /bookdb_index/book/_search?q=guide
[Results]
"hits": [
  {
    "_index": "bookdb_index",
    "_type": "book",
    "_id": "4",
    "_score": 1.3278645,
    "_source": {
      "title": "Solr in Action",
      "authors": [
        "trey grainger",
        "timothy potter"
      ],
      "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
      "publish_date": "2014-04-05",
      "num_reviews": 23,
      "publisher": "manning"
    }
  },
  {
    "_index": "bookdb_index",
    "_type": "book",
    "_id": "1",
    "_score": 1.2871116,
    "_source": {
      "title": "Elasticsearch: The Definitive Guide",
      "authors": [
        "clinton gormley",
        "zachary tong"
      ],
      "summary": "A distibuted real-time search and analytics engine",
      "publish_date": "2015-02-07",
      "num_reviews": 20,
      "publisher": "oreilly"
    }
  }
]

下面是使用 DSL 的版本，和上面的效果相同

GET /bookdb_index/book/_search

{
    "query": {
        "multi_match" : {
            "query" : "guide",
            "fields" : ["title", "authors", "summary", "publish_date", "num_reviews", "publisher"]
        }
    }
}

multi_match关键字的作用是可以使多个字段同时匹配一个关键字，fields属性声明需要查询哪些字段，在这个例子中，我们搜索了这张表的所有字段

备注：Elasticsearch6 之前的版本可以使用"_all"，来代替你声明所有字段，"_all"字段的原理是串联所有的字段到一个大的字段，并使用空格分割，然后再分析并索引字段, 从 Elasticsearch6 开始，这个功能将被弃用，Elasticsearch6 提供了"copy_to"参数，你可以利用这个创建自定义的"_all"字段，查看Guide获取更多信息

Url 的搜索方式也允许你声明你想要搜索的字段，比如，搜索 title 为 “in Action” 的 books

GET /bookdb_index/book/_search?q=title:in action
[Results]
"hits": [
  {
    "_index": "bookdb_index",
    "_type": "book",
    "_id": "3",
    "_score": 1.6323128,
    "_source": {
      "title": "Elasticsearch in Action",
      "authors": [
        "radu gheorge",
        "matthew lee hinman",
        "roy russo"
      ],
      "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
      "publish_date": "2015-12-03",
      "num_reviews": 18,
      "publisher": "manning"
    }
  },
  {
    "_index": "bookdb_index",
    "_type": "book",
    "_id": "4",
    "_score": 1.6323128,
    "_source": {
      "title": "Solr in Action",
      "authors": [
        "trey grainger",
        "timothy potter"
      ],
      "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
      "publish_date": "2014-04-05",
      "num_reviews": 23,
      "publisher": "manning"
    }
  }
]

然而，使用完整的 DSL 搜索可以创建更加复杂的查询（我们等会会看到）并且你可以声明你想要的返回的结果，在下面的例子中，我们声明了我们想要返回多少个结果，偏移量是多少（这可以用来做分页），返回哪些字段以及配置高亮，注意我们使用了 “match”，代替了 “multi_match”，因为我们只想要搜索 title 这一个字段

POST /bookdb_index/book/_search
{
    "query": {
        "match" : {
            "title" : "in action"
        }
    },
    "size": 2,
    "from": 0,
    "_source": [ "title", "summary", "publish_date" ],
    "highlight": {
        "fields" : {
            "title" : {}
        }
    }
}
[Results]
"hits": {
  "total": 2,
  "max_score": 1.6323128,
  "hits": [
    {
      "_index": "bookdb_index",
      "_type": "book",
      "_id": "3",
      "_score": 1.6323128,
      "_source": {
        "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
        "title": "Elasticsearch in Action",
        "publish_date": "2015-12-03"
      },
      "highlight": {
        "title": [
          "Elasticsearch <em>in</em> <em>Action</em>"
        ]
      }
    },
    {
      "_index": "bookdb_index",
      "_type": "book",
      "_id": "4",
      "_score": 1.6323128,
      "_source": {
        "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
        "title": "Solr in Action",
        "publish_date": "2014-04-05"
      },
      "highlight": {
        "title": [
          "Solr <em>in</em> <em>Action</em>"
        ]
      }
    }
  ]

注意：对于多个单词的搜索，match 查询支持使用 “operator” 参数来声明多个单词之间的关系，可以使用 and 来代替默认的 or，你还能通过声明 “minimum_should_match” 参数去调整搜索结果的相关性，更多细节，查看Elasticsearch guide

提高相关度

在我们搜索多个字段的时候，我们也许想要提高某个字段的分数（相关度），在下面的例子里，我们将 summary 这个字段的分数提高到了 3，增加这个字段的重要性，反过来说，这增加了_id 为 4 的这条数据的相关性

POST /bookdb_index/book/_search
{
    "query": {
        "multi_match" : {
            "query" : "elasticsearch guide",
            "fields": ["title", "summary^3"]
        }
    },
    "_source": ["title", "summary", "publish_date"]
}
[Results]
"hits": {
  "total": 3,
  "max_score": 3.9835935,
  "hits": [
    {
      "_index": "bookdb_index",
      "_type": "book",
      "_id": "4",
      "_score": 3.9835935,
      "_source": {
        "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
        "title": "Solr in Action",
        "publish_date": "2014-04-05"
      }
    },
    {
      "_index": "bookdb_index",
      "_type": "book",
      "_id": "3",
      "_score": 3.1001682,
      "_source": {
        "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
        "title": "Elasticsearch in Action",
        "publish_date": "2015-12-03"
      }
    },
    {
      "_index": "bookdb_index",
      "_type": "book",
      "_id": "1",
      "_score": 2.0281231,
      "_source": {
        "summary": "A distibuted real-time search and analytics engine",
        "title": "Elasticsearch: The Definitive Guide",
        "publish_date": "2015-02-07"
      }
    }
  ]

备注：提升相关度并不仅仅意味着计算得分乘以提升因子。应用的实际增强值通过标准化和一些内部优化。有关原理以及更多信息 Elasticsearch guide

布尔查询

AND/OR/NOT 操作符可以用来微调我们的搜索，使得搜索结果更加的符合预期，在 search Api 中，这被称之为布尔查询，布尔查询可以使用参数 must（等同于 AND），可以使用参数must_not（等同于 NOT）和参数should（等同于 OR），举例来说，如果我想要搜索 title 为 “Elasticsearch” 或者 “Solr”，并且 authored 为 “clinton gormley” 但又不等于 “radu gheorge” 的 books

POST /bookdb_index/book/_search
{
  "query": {
    "bool": {
      "must": {
        "bool" : { 
          "should": [
            { "match": { "title": "Elasticsearch" }},
            { "match": { "title": "Solr" }} 
          ],
          "must": { "match": { "authors": "clinton gormely" }} 
        }
      },
      "must_not": { "match": {"authors": "radu gheorge" }}
    }
  }
}
[Results]
"hits": [
  {
    "_index": "bookdb_index",
    "_type": "book",
    "_id": "1",
    "_score": 2.0749094,
    "_source": {
      "title": "Elasticsearch: The Definitive Guide",
      "authors": [
        "clinton gormley",
        "zachary tong"
      ],
      "summary": "A distibuted real-time search and analytics engine",
      "publish_date": "2015-02-07",
      "num_reviews": 20,
      "publisher": "oreilly"
    }
  }
]

备注：可以看到，一个布尔查询（bool）可以嵌套另一个布尔查询，并且可以支持无限层级嵌套

模糊查询

模糊查询可以在Match和Multi-Match查询里使用，来解决用户拼写错误，模糊查询的程度声明基于原单词的莱温斯坦距离，即：需要对一个字符串进行一个字符更改的数量，以使其与另一个字符串相同。

POST /bookdb_index/book/_search
{
    "query": {
        "multi_match" : {
            "query" : "comprihensiv guide",
            "fields": ["title", "summary"],
            "fuzziness": "AUTO"
        }
    },
    "_source": ["title", "summary", "publish_date"],
    "size": 1
}
[Results]
"hits": [
  {
    "_index": "bookdb_index",
    "_type": "book",
    "_id": "4",
    "_score": 2.4344182,
    "_source": {
      "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
      "title": "Solr in Action",
      "publish_date": "2014-04-05"
    }
  }
]

备注：除了 “AUTO”，还可以指定数字 0，1 或者 2 去声明，可以匹配到一个单词的最大字符数，使用 “AUTO” 的好处是它考虑了字符串的长度，对于一个只有三个字符的单词，指定模糊匹配的字符数量为 2，显然对于性能是有问题的，因此，大部分情况还是建议使用 “AUTO”

通配符查询

通配符查询允许你使用一个通配符来代替完整匹配。?可匹配任意字符，*可以匹配 0 或者更多其他字符，比如，我们要查找一条 authors 是"t"开头的数据

POST /bookdb_index/book/_search
{
    "query": {
        "wildcard" : {
            "authors" : "t*"
        }
    },
    "_source": ["title", "authors"],
    "highlight": {
        "fields" : {
            "authors" : {}
        }
    }
}
[Results]
"hits": [
  {
    "_index": "bookdb_index",
    "_type": "book",
    "_id": "1",
    "_score": 1,
    "_source": {
      "title": "Elasticsearch: The Definitive Guide",
      "authors": [
        "clinton gormley",
        "zachary tong"
      ]
    },
    "highlight": {
      "authors": [
        "zachary <em>tong</em>"
      ]
    }
  },
  {
    "_index": "bookdb_index",
    "_type": "book",
    "_id": "2",
    "_score": 1,
    "_source": {
      "title": "Taming Text: How to Find, Organize, and Manipulate It",
      "authors": [
        "grant ingersoll",
        "thomas morton",
        "drew farris"
      ]
    },
    "highlight": {
      "authors": [
        "<em>thomas</em> morton"
      ]
    }
  },
  {
    "_index": "bookdb_index",
    "_type": "book",
    "_id": "4",
    "_score": 1,
    "_source": {
      "title": "Solr in Action",
      "authors": [
        "trey grainger",
        "timothy potter"
      ]
    },
    "highlight": {
      "authors": [
        "<em>trey</em> grainger",
        "<em>timothy</em> potter"
      ]
    }
  }
]

正则表达式查询

正则表达式查询允许你声明比通配查询更加复杂的查询

POST /bookdb_index/book/_search
{
    "query": {
        "regexp" : {
            "authors" : "t[a-z]*y"
        }
    },
    "_source": ["title", "authors"],
    "highlight": {
        "fields" : {
            "authors" : {}
        }
    }
}
[Results]
"hits": [
  {
    "_index": "bookdb_index",
    "_type": "book",
    "_id": "4",
    "_score": 1,
    "_source": {
      "title": "Solr in Action",
      "authors": [
        "trey grainger",
        "timothy potter"
      ]
    },
    "highlight": {
      "authors": [
        "<em>trey</em> grainger",
        "<em>timothy</em> potter"
      ]
    }
  }
]

短句查询

短句查询需要匹配所有单词才能被搜索出来，按照指定的顺序并且是连续的，默认情况下单词之间必须是连续的，但是你可以通过改写 “slop” 的值来声明即便两个单词之间间隔多少个单词，该文档仍然能被匹配出来

POST /bookdb_index/book/_search
{
    "query": {
        "match_phrase_prefix" : {
            "summary": {
                "query": "search en",
                "slop": 3,
                "max_expansions": 10
            }
        }
    },
    "_source": [ "title", "summary", "publish_date" ]
}
[Results]
"hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 0.5161346,
        "_source": {
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.37248808,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      }
    ]

备注：Query-time search-as-you-type 有一定的性能成本，更好的方案是 index-time search-as-you-type，查看Completion Suggester API或者使用 Edge-Ngram查看更多信息

Query String

query_string查询提供了一种将 “multi_match 查询”，“布尔查询”，“boosting”，“模糊查询”，“通配查询”，“范围查询” 整合在一个短句里的方法，下面的查询，匹配 “grant ingersoll” 或者 “tom morton.” 且含有短句 “grant ingersoll”，然后还将 summary 字段的分数提到了 2

POST /bookdb_index/book/_search
{
    "query": {
        "query_string" : {
            "query": "(saerch~1 algorithm~1) AND (grant ingersoll)  OR (tom morton)",
            "fields": ["title", "authors" , "summary^2"]
        }
    },
    "_source": [ "title", "summary", "authors" ],
    "highlight": {
        "fields" : {
            "summary" : {}
        }
    }
}
[Results]
"hits": [
  {
    "_index": "bookdb_index",
    "_type": "book",
    "_id": "2",
    "_score": 3.571021,
    "_source": {
      "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",
      "title": "Taming Text: How to Find, Organize, and Manipulate It",
      "authors": [
        "grant ingersoll",
        "thomas morton",
        "drew farris"
      ]
    },
    "highlight": {
      "summary": [
        "organize text using approaches such as full-text <em>search</em>, proper name recognition, clustering, tagging"
      ]
    }
  }
]

简化的 Query String

simple_query_string是query_string的一个版本，更加适合用在一个单独的搜索框里暴露给用户使用，因为它将 AND/OR/NOT 分别变成了 +/|/-，并且这种查询将不会直接将查询的语法错误直接抛出

POST /bookdb_index/book/_search
{
    "query": {
        "simple_query_string" : {
            "query": "(saerch~1 algorithm~1) + (grant ingersoll)  | (tom morton)",
            "fields": ["title", "authors" , "summary^2"]
        }
    },
    "_source": [ "title", "summary", "authors" ],
    "highlight": {
        "fields" : {
            "summary" : {}
        }
    }
}

Term/Terms 查询

以上的例子展示了full-text搜索，有时候我们也会想要一个精确的匹配，term以及terms查询可以帮助我们做到这一点，下面的例子中，我们搜索了所有发布者为 “Manning” 发布的 books

POST /bookdb_index/book/_search
{
    "query": {
        "term" : {
            "publisher": "manning"
        }
    },
    "_source" : ["title","publish_date","publisher"]
}
[Results]
"hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "2",
        "_score": 1.2231436,
        "_source": {
          "publisher": "manning",
          "title": "Taming Text: How to Find, Organize, and Manipulate It",
          "publish_date": "2013-01-24"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 1.2231436,
        "_source": {
          "publisher": "manning",
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 1.2231436,
        "_source": {
          "publisher": "manning",
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      }
    ]

多个terms可以用关键字terms来代替，并且可以传入一个数组

{
    "query": {
        "terms" : {
            "publisher": ["oreilly", "packt"]
        }
    }
}

Term 查询 - 排序

Term 查询的结果（也包括其他种类的查询）可以很轻松的进行排序，多级排序也是被允许的

POST /bookdb_index/book/_search
{
    "query": {
        "term" : {
            "publisher": "manning"
        }
    },
    "_source" : ["title","publish_date","publisher"],
    "sort": [
        { "publish_date": {"order":"desc"}}
    ]
}
[Results]
"hits": [
  {
    "_index": "bookdb_index",
    "_type": "book",
    "_id": "3",
    "_score": null,
    "_source": {
      "publisher": "manning",
      "title": "Elasticsearch in Action",
      "publish_date": "2015-12-03"
    },
    "sort": [
      1449100800000
    ]
  },
  {
    "_index": "bookdb_index",
    "_type": "book",
    "_id": "4",
    "_score": null,
    "_source": {
      "publisher": "manning",
      "title": "Solr in Action",
      "publish_date": "2014-04-05"
    },
    "sort": [
      1396656000000
    ]
  },
  {
    "_index": "bookdb_index",
    "_type": "book",
    "_id": "2",
    "_score": null,
    "_source": {
      "publisher": "manning",
      "title": "Taming Text: How to Find, Organize, and Manipulate It",
      "publish_date": "2013-01-24"
    },
    "sort": [
      1358985600000
    ]
  }
]

备注：在 6 之后的版本，使用 text 类型的字段排序或者分组，比如 title，你需要为那个字段指定fielddata，更多细节查看ElasticSearch Guide.

范围查询

还有另一种查询的结构我们称之为范围查询，在下面的例子里，我们搜索了所有在 2015 年发布的 books

POST /bookdb_index/book/_search
{
    "query": {
        "range" : {
            "publish_date": {
                "gte": "2015-01-01",
                "lte": "2015-12-31"
            }
        }
    },
    "_source" : ["title","publish_date","publisher"]
}
[Results]
"hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 1,
        "_source": {
          "publisher": "oreilly",
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 1,
        "_source": {
          "publisher": "manning",
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        }
      }
    ]

备注：范围查询支持的字段类型有：日期，数字和字符串

筛选布尔查询

当我们使用布尔查询的时候，可以使用filter参数去筛掉一些结果，下面的例子中，我们查询了 title 或者 summary 字段含有 “Elasticsearch”,与此同时我们还需要 reviews 的数量大于等于 20

POST /bookdb_index/book/_search
{
    "query": {
        "filtered": {
            "query" : {
                "multi_match": {
                    "query": "elasticsearch",
                    "fields": ["title","summary"]
                }
            },
            "filter": {
                "range" : {
                    "num_reviews": {
                        "gte": 20
                    }
                }
            }
        }
    },
    "_source" : ["title","summary","publisher", "num_reviews"]
}
[Results]
"hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.5955761,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "publisher": "oreilly",
          "num_reviews": 20,
          "title": "Elasticsearch: The Definitive Guide"
        }
      }
    ]

多个 filters 可以再一个bool查询中结合起来使用，下一个例子中，filter 规定了结果必须含有最少 20 个 reviews，发布时间不能早于 2015 年，并且发布人应该是 Reilly

POST /bookdb_index/book/_search
{
    "query": {
        "filtered": {
            "query" : {
                "multi_match": {
                    "query": "elasticsearch",
                    "fields": ["title","summary"]
                }
            },
            "filter": {
                "bool": {
                    "must": {
                        "range" : { "num_reviews": { "gte": 20 } }
                    },
                    "must_not": {
                        "range" : { "publish_date": { "lte": "2014-12-31" } }
                    },
                    "should": {
                        "term": { "publisher": "oreilly" }
                    }
                }
            }
        }
    },
    "_source" : ["title","summary","publisher", "num_reviews", "publish_date"]
}
[Results]
"hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.5955761,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "publisher": "oreilly",
          "num_reviews": 20,
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      }
    ]

菜单

分享

23 个例子让你彻底搞定 Elasticsearch 的搜索查询语法

例子

最基础的 match query

提高相关度

布尔查询

模糊查询

通配符查询

正则表达式查询

短句查询

Query String

简化的 Query String

Term/Terms 查询

Term 查询 - 排序

范围查询

筛选布尔查询

相关度函数： Field Value Factor

相关度函数: 衰变函数

相关度函数：自定义函数

评论

第四章：实现 Servlet 容器的基本功能-MiniTomcat系列

MiniTomcat 项目大纲

打造属于你的 MiniTomcat：深入理解 Web 容器核心架构与实现之路

Myers 差异算法：高效比较序列的利器

身份证信息可视化工具：轻松制作身份证样本，助力开发与教学

TensorFlow 实现手写数字识别：多层感知器与随机梯度下降解析

Java算法题类型及解法

回溯算法详解

分治法详解

html selector 介绍

分享

23 个例子让你彻底搞定 Elasticsearch 的搜索查询语法

例子

最基础的 match query

提高相关度

布尔查询

模糊查询

通配符查询

正则表达式查询

短句查询

Query String

简化的 Query String

Term/Terms 查询

Term 查询 - 排序

范围查询

筛选布尔查询

相关度函数： Field Value Factor

相关度函数: 衰变函数

相关度函数： 自定义函数

评论

相关度函数：自定义函数