Elasticsearch: my personal notes

2 years ago

宇, 华

This is my first time using Elasticsearch, so I am taking notes as I go.

Terminology

Terminology first. Mapped onto MySQL, the concepts line up roughly like this.

MySQL       Elasticsearch
database    index
table       type
record      document

Mapping: similar to a table definition in a relational database.

Creating a mapping

$ curl -XPUT 'localhost:9200/<INDEX_NAME>' -d '
{
    "mappings" : {
      "<TYPE_NAME>" : {
        "properties" : {
          "author" : {
            "type" : "string"
          },
          "contents" : {
            "type" : "string",
            "analyzer": "japanese"
          },
          "enabled" : {
            "type" : "boolean"
          },
          "pub_date" : {
            "type" : "date",
            "format" : "dateOptionalTime"
          },
          "read_ratio" : {
            "type" : "double"
          },
          "reads" : {
            "type" : "long"
          },
          "subtitle" : {
            "type" : "string",
            "analyzer": "japanese"
          },
          "title" : {
            "type" : "string",
            "analyzer": "japanese"
          },
          "views" : {
            "type" : "long"
          }
        }
      }
    }
}'
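
Hand-written JSON inside a shell command is easy to unbalance, so here is a minimal sketch of building the same kind of body in Python and checking that it serialises cleanly. The helper name `build_mapping` and the type name `my_type` are made up for illustration, and only a subset of the fields from the curl example is shown.

```python
import json


def build_mapping(type_name, properties):
    """Return an index-creation body containing a single mapping type."""
    return {"mappings": {type_name: {"properties": properties}}}


# A subset of the field definitions from the curl example.
properties = {
    "author": {"type": "string"},
    "title": {"type": "string", "analyzer": "japanese"},
    "pub_date": {"type": "date", "format": "dateOptionalTime"},
    "views": {"type": "long"},
}

# "my_type" is a placeholder standing in for <TYPE_NAME>.
body = build_mapping("my_type", properties)

# Serialising and parsing again confirms the structure is valid JSON.
payload = json.dumps(body, indent=2)
print(payload)
```

The printed text can be saved to a file and sent with `curl -d @file`, in the same way as the mapping.json example.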

To create the index from a JSON definition file (mapping.json) instead:

$ curl -XPOST 192.168.33.10:9200/<INDEX_NAME> -d @mapping.json

In mapping.json, the tokenizer is set to nGram.

{
  "settings": {
    "analysis": {
      "analyzer": {
        "ngram_analyzer": {
          "tokenizer": "ngram_tokenizer"
        }
      },
      "tokenizer": {
        "ngram_tokenizer": {
          "type": "nGram",
          "min_gram": "2",
          "max_gram": "3",
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    }
  },
  "mappings": {
    "items": {
      "properties": {
        "item_seq": {
          "type": "string"
        },
        "item_name": {
          "type": "string",
          "analyzer": "ngram_analyzer"
        },
        "group_seq": {
          "type": "string"
        }
      }
    }
  }
}
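
To get a feel for what these settings produce, the following is a rough pure-Python approximation of an n-gram tokenizer with `min_gram` 2, `max_gram` 3, and `token_chars` limited to letters and digits. It is not Elasticsearch's implementation: the function names are hypothetical, Python's `isalpha`/`isdigit` only approximate the `letter` and `digit` character classes, and the order of emitted tokens may differ from what Elasticsearch returns.

```python
def _grams(word, min_gram, max_gram):
    """Yield every substring of word whose length is in [min_gram, max_gram]."""
    for start in range(len(word)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(word):
                yield word[start:start + size]


def ngram_tokens(text, min_gram=2, max_gram=3):
    """Split text on non-letter/non-digit characters, then emit n-grams."""
    tokens = []
    word = []
    # The trailing space is a sentinel that flushes the last run of characters.
    for ch in text + " ":
        if ch.isalpha() or ch.isdigit():
            word.append(ch)
        else:
            tokens.extend(_grams("".join(word), min_gram, max_gram))
            word = []
    return tokens


print(ngram_tokens("ab12 c"))
```

Runs shorter than `min_gram` (the lone "c" here) produce no tokens at all, which is worth keeping in mind when searching for single characters in `item_name`.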

Listing indices

$ curl -XGET 'localhost:9200/_aliases?pretty'

Deleting an index

$ curl -X DELETE 'localhost:9200/<INDEX_NAME>'

Listing plugins

$ curl -X GET 'http://192.168.33.10:9200/_nodes/plugins?pretty'
{
  "cluster_name" : "elasticsearch",
  "nodes" : {
    "42fIIy3lQaGDGbnWL2Cydg" : {
      "name" : "Gwen Stacy",
      "transport_address" : "192.168.33.10:9300",
      "host" : "192.168.33.10",
      "ip" : "192.168.33.10",
      "version" : "2.1.1",
      "build" : "40e2c53",
      "http_address" : "192.168.33.10:9200",
      "plugins" : [ {
        "name" : "analysis-kuromoji",
        "version" : "2.1.1",
        "description" : "The Japanese (kuromoji) Analysis plugin integrates Lucene kuromoji analysis module into elasticsearch.",
        "jvm" : true,
        "classname" : "org.elasticsearch.plugin.analysis.kuromoji.AnalysisKuromojiPlugin",
        "isolated" : true,
        "site" : false
      } ]
    }
  }
}
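
When there are several nodes, the full response gets long. Below is a small sketch that pulls just the plugin names per node out of a response shaped like the one above. The helper `plugin_names` is hypothetical, and the sample is a trimmed copy of that output rather than a live call.

```python
import json


def plugin_names(nodes_response):
    """Map each node's name to the list of plugin names it reports."""
    result = {}
    for node in nodes_response.get("nodes", {}).values():
        result[node.get("name")] = [p["name"] for p in node.get("plugins", [])]
    return result


# Trimmed copy of the _nodes/plugins output shown above.
sample = json.loads("""
{
  "cluster_name": "elasticsearch",
  "nodes": {
    "42fIIy3lQaGDGbnWL2Cydg": {
      "name": "Gwen Stacy",
      "version": "2.1.1",
      "plugins": [
        {"name": "analysis-kuromoji", "version": "2.1.1", "jvm": true, "site": false}
      ]
    }
  }
}
""")

print(plugin_names(sample))
```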

Searching

$ curl -XGET 'http://192.168.33.10:9200/<INDEX_NAME>/<TYPE_NAME>/_search?pretty=true' -d '
{
  "query" : {
    "simple_query_string" : {
      "query": "ほげ",
      "fields": ["_all"],
      "default_operator": "and"
    }
  }
}
'
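
If the search text comes from user input, it may be safer to build the request body programmatically than to splice it into a shell string. Here is a minimal sketch; the helper name `simple_query` is an assumption for illustration, and it produces the same body as the curl example.

```python
import json


def simple_query(text, fields=("_all",), operator="and"):
    """Build a simple_query_string request body."""
    return {
        "query": {
            "simple_query_string": {
                "query": text,
                "fields": list(fields),
                "default_operator": operator,
            }
        }
    }


body = simple_query("ほげ")
# ensure_ascii=False keeps the Japanese text readable in the output.
print(json.dumps(body, ensure_ascii=False, indent=2))
```

With the Python client shown later in these notes, a body like this would presumably be passed as the `body` argument of `es.search`, though I have not verified that against every client version.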

Bulk indexing (PHP)

How to bulk load with PHP
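
Whatever the client language, a bulk request body is newline-delimited JSON: an action line followed by the document source, with a newline after every line including the last. The sketch below builds such a body in Python rather than PHP, purely to show the format. The helper `bulk_body`, the index name `my_index`, and the sample documents are all made up; the type and field names are borrowed from mapping.json.

```python
import json


def bulk_body(index, doc_type, docs):
    """Build a _bulk request body from (doc_id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        # Action line: tells Elasticsearch where to index the next line.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(source, ensure_ascii=False))
    # Every line, including the final one, must end with a newline.
    return "".join(line + "\n" for line in lines)


# Hypothetical documents using the fields defined in mapping.json.
docs = [
    ("1", {"item_seq": "1", "item_name": "foo", "group_seq": "10"}),
    ("2", {"item_seq": "2", "item_name": "bar", "group_seq": "10"}),
]

payload = bulk_body("my_index", "items", docs)
print(payload, end="")
```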

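AWS authentication (PHP)

When using IAM, I could not work out how to combine it with the elasticsearch library.

While puzzling over this, I found the following.

Signing an Amazon Elasticsearch Service search request (AWS SDK for PHP documentation)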

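AWS authentication (Python)

AWS ElasticSearch service · Issue #280 · elastic/elasticsearch-py

It was discussed in this issue and has since been implemented.

Python Elasticsearch Client — Elasticsearch 2.2.0 documentation

The documentation says it can be used like this.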

from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

host = 'YOURHOST.us-east-1.es.amazonaws.com'
awsauth = AWS4Auth(YOUR_ACCESS_KEY, YOUR_SECRET_KEY, REGION, 'es')

es = Elasticsearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection
)
print(es.info())

AWS authentication (bonus)

There is also a proxy project for this.
https://github.com/coreos/aws-auth-proxy