Elasticsearch in Depth: normalizer - 白红宇

Published: 2019-03-24

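The normalizer property of a keyword field in Elasticsearch works like an analyzer, except that it guarantees the analysis chain produces a single token. It is applied before the keyword is indexed and again at search time, so indexed values and query values are normalized in the same way. The example below shows how a normalizer behaves when creating an index, indexing documents, searching, and aggregating.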

### PUT index

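A custom normalizer can be defined in the index settings and attached to a specific field. The request body below defines my_normalizer and assigns it to the keyword field foo, so that values of foo are normalized at both index time and query time.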

{
  "settings": {
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "char_filter": [],
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "foo": {
          "type": "keyword",
          "normalizer": "my_normalizer"
        }
      }
    }
  }
}

PUT index/_doc/{id}

When a document is indexed, the value of foo passes through the same normalizer. For example, index the following document:

{
  "foo": "BÀR"
}

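During indexing, BÀR is processed by the lowercase and asciifolding filters and is indexed as the single term bar, while _source keeps the original value BÀR. The responses later in this article also include a document with id 2 whose foo is bar and a third document whose foo is baz, indexed in the same way.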
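To make the effect of the two filters concrete, here is a minimal Python sketch that approximates what my_normalizer does to a value. It is not Elasticsearch code: the function name `normalize_keyword` is hypothetical, and the accent stripping via `unicodedata` only stands in for asciifolding on accented Latin letters (the real filter covers more characters).

```python
import unicodedata


def normalize_keyword(value: str) -> str:
    """Rough stand-in for my_normalizer: lowercase, then strip accents."""
    lowered = value.lower()  # models the "lowercase" filter
    # Models "asciifolding" for accented Latin letters only (an assumption):
    # decompose each character, then drop the combining marks split off it.
    decomposed = unicodedata.normalize("NFKD", lowered)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    for raw in ["BÀR", "bar", "Baz"]:
        print(raw, "->", normalize_keyword(raw))
```

The filters run in the order they are listed in the settings, which is why the sketch lowercases first and folds accents second.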

POST index/_refresh

The refresh request above makes the newly indexed documents visible to search.

GET index/_search

At search time, the query value is processed by the same normalizer that was used at index time. For example:

{
  "query": {
    "term": {
      "foo": "BAR"
    }
  }
}

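In this query, BAR is normalized to bar, so it matches every document whose indexed term is bar, including the documents originally indexed as bar and BÀR.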

The term query above returns the following response:

{
  "took": 123,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "index",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.2876821,
        "_source": {
          "foo": "bar"
        }
      },
      {
        "_index": "index",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "foo": "BÀR"
        }
      }
    ]
  }
}
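The matching behavior in this response can be modeled with a small in-memory index. The sketch below is only an illustration under stated assumptions: `normalize_keyword`, `build_index`, and `term_query` are hypothetical helpers, the accent stripping approximates asciifolding, and the id "3" for the baz document is assumed because the article never shows it.

```python
import unicodedata
from collections import defaultdict


def normalize_keyword(value: str) -> str:
    """Rough stand-in for my_normalizer: lowercase, then strip accents."""
    decomposed = unicodedata.normalize("NFKD", value.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Original field values, keyed by document id (id "3" is an assumption).
DOCS = {"1": "BÀR", "2": "bar", "3": "baz"}


def build_index(docs):
    """Map each normalized term to the ids of the documents that produced it."""
    index = defaultdict(list)
    for doc_id, value in docs.items():
        index[normalize_keyword(value)].append(doc_id)
    return index


def term_query(index, docs, value):
    """Normalize the query value, then return the original value of each hit."""
    hits = index.get(normalize_keyword(value), [])
    return {doc_id: docs[doc_id] for doc_id in hits}


if __name__ == "__main__":
    index = build_index(DOCS)
    print(term_query(index, DOCS, "BAR"))
```

As in the response, the hits carry the original values (BÀR and bar): normalization changes the indexed term that is matched, not the stored source.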

GET index/_search

Because keyword values are normalized before they are indexed, aggregations also return normalized values. For example:

{
  "size": 0,
  "aggs": {
    "foo_terms": {
      "terms": {
        "field": "foo"
      }
    }
  }
}

In the aggregation result, the bucket keys for foo are the normalized values bar and baz.

### Aggregation results

{
  "took": 43,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "foo_terms": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "bar",
          "doc_count": 2
        },
        {
          "key": "baz",
          "doc_count": 1
        }
      ]
    }
  }
}
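The bucket counts above follow directly from counting normalized values. The following Python sketch mimics that counting for the three documents in this example; `normalize_keyword` and `terms_aggregation` are hypothetical names, the accent stripping approximates asciifolding, and the tie-breaking by key is an illustrative choice rather than a statement about Elasticsearch internals.

```python
import unicodedata
from collections import Counter


def normalize_keyword(value: str) -> str:
    """Rough stand-in for my_normalizer: lowercase, then strip accents."""
    decomposed = unicodedata.normalize("NFKD", value.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def terms_aggregation(values):
    """Count documents per normalized value, largest bucket first."""
    counts = Counter(normalize_keyword(v) for v in values)
    # Sort by descending doc_count; equal counts are ordered by key.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered]


if __name__ == "__main__":
    print(terms_aggregation(["BÀR", "bar", "baz"]))
```

BÀR and bar fall into the same bucket because both are indexed as bar, which is why the bar bucket has a doc_count of 2.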

Reprinted from: http://flnkk.baihongyu.com/
