Elasticsearch, Part 1: Getting Started

Source: Internet
Author: User
Tags: create index

Introduction

Elasticsearch is an efficient, scalable full-text search engine.

Basic concepts

Near Realtime (NRT): ES is a near-real-time search platform: the delay between storing a document and being able to search it is very small, usually within 1 second.
Cluster: ES is a distributed, scalable platform in which one or more servers form a cluster. Nodes join the same cluster through the configured cluster.name (default: elasticsearch).
Node: Typically one ES node is deployed per server as a member of the cluster, providing data storage and search. Within a cluster, nodes are distinguished by node.name; by default a random name is generated when the node starts, and it can be configured.
Index: Similar to a database in a relational database system; it groups functionally similar data. A cluster can define any number of indexes. An index name must be lowercase, and it is part of a document's identity when data is indexed, updated, searched, or deleted.
Type: Similar to a table in a relational database. Multiple types can be defined in an index; in principle a type is a collection of documents that share the same attributes.
Document: Similar to a record in a relational database; the most basic unit of stored data, expressed in JSON. A document is physically stored in an index but logically assigned to a specific type within it.
Shards & Replicas:
An index may hold more data than the hardware of a single node can store or index. To get past single-node storage limits and improve concurrency, ES physically splits each index into multiple shards, which scales storage horizontally and raises throughput (several shards can be indexed and searched at the same time).
To keep data available after a storage unit fails, ES can replicate shards. When a primary shard fails, a replica shard is promoted to primary and takes over indexing, which makes the index highly available. Replica shards also serve searches, so they improve search concurrency as well.
The number of shards and replicas can be set when an index is created. By default each index gets 5 shards and 1 replica: the index is stored in 5 logical storage units, each with one copy for disaster recovery. Note that the number of shards can only be set when the index is created, because it determines which shard a document is stored on (typically computed as hash(document _id) % number of shards).
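The routing rule hash(document _id) % number of shards can be sketched in a few lines of Python. This is only an illustration: the function name shard_for is hypothetical, and zlib.crc32 is a stand-in hash chosen because it is deterministic, not the hash function ES itself uses.

```python
import zlib


def shard_for(doc_id: str, num_primary_shards: int) -> int:
    """Map a document ID to a shard number with hash(_id) % shard count."""
    # zlib.crc32 is a stand-in hash used for illustration only;
    # it is NOT the hash function Elasticsearch uses internally.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


ids = ["1", "2", "some-generated-id"]
placement_5 = {doc_id: shard_for(doc_id, 5) for doc_id in ids}
placement_6 = {doc_id: shard_for(doc_id, 6) for doc_id in ids}

# The same ID always lands on the same shard for a fixed shard count.
print(placement_5)
# With a different shard count the mapping is recomputed, which is why
# the number of primary shards is fixed once the index has been created.
print(placement_6)
```

Because the shard number depends on the divisor, changing the shard count after documents are stored would send lookups for existing IDs to the wrong shard.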
ES places a primary shard and its replicas on different nodes.

Installation

Elasticsearch is implemented in Java, so a Java Virtual Machine must be installed before use (for ES 1.6 and 1.7, Java 1.8 is a suitable choice).
Download the package and extract it to the installation directory: C:\Program Files\elasticsearch
Run: cd "C:\Program Files\elasticsearch\bin" && elasticsearch.bat
Install as a service: service install elasticsearch
Start the service: net start elasticsearch
Stop the service: net stop elasticsearch

Test
Access address: http://localhost:9200
Access result:

{
  "status": 200,
  "name": "Smart Alec",
  "cluster_name": "elasticsearch",
  "version": {
    "number": "1.6.0",
    "build_hash": "cdd3ac4dde4f69524ec0a14de3828cb95bbb86d0",
    "build_timestamp": "2015-06-09T13:36:34Z",
    "build_snapshot": false,
    "lucene_version": "4.10.4"
  },
  "tagline": "You Know, for Search"
}

Interface

ES exposes a standard REST API through which every cluster operation is performed: viewing the status and statistics of the cluster, nodes, and indexes; managing clusters, nodes, indexes, and types; performing CRUD operations (create, read, update, delete) and indexing; and running advanced searches with sorting, paging, filtering, aggregation, script execution, and so on.

Format: curl -X<REST verb> <Node>:<Port>/<Index>/<Type>/<ID>

Install the Marvel plugin:
cd "C:\Program Files\elasticsearch\bin" && plugin -i elasticsearch/marvel/latest

Marvel provides Sense, a console for calling the ES REST API. The operations below can be practiced in Sense or with curl on the Linux command line.

Status queries

Query the cluster status
Input: GET /_cat/health?v
Output:

epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks
1442227489 18:44:49  elasticsearch yellow          1         1    0    0             0

Description
status: the health of the cluster; possible values are green, yellow, and red. Green means all primary shards and their replicas are allocated; yellow means all primary shards are allocated but at least one replica is not; red means at least one primary shard is not allocated.
node.total: the number of nodes in the cluster

Query node status
Input: GET /_cat/nodes?v
Output:

host      ip             heap.percent ram.percent load node.role master name
silence   192.168.1.111          Wuyi      d         *      Thunderbird

Query all Indexes

Input: GET /_cat/indices?v
Output:

health status index              pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .marvel-2015.09.02   1   1      93564            0     78.4mb         78.4mb
yellow open   .marvel-2015.09.01   1   1      39581            0     45.9mb         45.9mb
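
When the _cat endpoints are called from a script rather than from Sense, the column output has to be split into fields. The following is a minimal sketch, assuming every column holds a value (whitespace splitting would misalign rows with empty cells); parse_cat_table is a hypothetical helper, not part of any ES client.

```python
def parse_cat_table(text: str) -> list[dict[str, str]]:
    """Parse whitespace-delimited _cat output (with ?v header) into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    # Assumes each row has exactly one value per header column.
    return [dict(zip(header, line.split())) for line in lines[1:]]


sample = """\
health status index              pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .marvel-2015.09.02   1   1      93564            0     78.4mb         78.4mb
yellow open   .marvel-2015.09.01   1   1      39581            0     45.9mb         45.9mb
"""

rows = parse_cat_table(sample)
for row in rows:
    print(row["index"], row["health"], row["docs.count"])
```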

Create an index

Input: PUT /test1?pretty
Output:

{
  "acknowledged": true
}

Query all indexes:

health status index              pri rep docs.count docs.deleted store.size pri.store.size
yellow open   test1                5   1          0            0       575b           575b

Description
health: Because only one node is running and a replica cannot live on the same node as its primary shard, the replicas are unassigned, so the index status is yellow
index: the index name
pri: the number of primary shards
rep: the number of replicas per shard
docs.count: the number of documents in the index

Index, read, and delete documents

Index a document

Method 1:
Input:

PUT /test1/user/1?pretty
{"name": "Silence1"}

Output:

{
  "_index": "test1",
  "_type": "user",
  "_id": "1",
  "_version": 1,
  "created": true
}

Method 2:
Input:

POST /test1/user/2?pretty
{"name": "Silence2"}

Output:

{
  "_index": "test1",
  "_type": "user",
  "_id": "2",
  "_version": 1,
  "created": true
}

Method 3:
Input:

POST /test1/user?pretty
{"name": "Silence3"}

Output:

{
  "_index": "test1",
  "_type": "user",
  "_id": "Au_mdqoxryihsis7ugbq",
  "_version": 1,
  "created": true
}

Note: To choose the document ID yourself when indexing, submit the data with PUT or POST and state the ID explicitly in the URL. To have ES generate the ID automatically, submit the data with POST and omit the ID.
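
The rule in this note can be captured as a small helper that picks the verb and path for an index request. It is a sketch only: index_request is a hypothetical name, and it always chooses PUT for explicit IDs even though POST is accepted there as well.

```python
from typing import Optional, Tuple


def index_request(index: str, doc_type: str,
                  doc_id: Optional[str] = None) -> Tuple[str, str]:
    """Return (HTTP verb, path) for indexing a single document."""
    if doc_id is None:
        # No ID supplied: POST to the type and let ES generate the ID.
        return "POST", f"/{index}/{doc_type}"
    # Explicit ID: PUT to the full document path (POST would also work).
    return "PUT", f"/{index}/{doc_type}/{doc_id}"


print(index_request("test1", "user", "1"))
print(index_request("test1", "user"))
```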

Read a document:
Input: GET /test1/user/1?pretty
Output:

{
  "_index": "test1",
  "_type": "user",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {"name": "Silence1"}
}

Description
_index, _type: the index and type in which the document is stored
_id: the identifier of the document
_version: the document version number, used mainly for optimistic locking so that concurrent writes do not produce dirty data
found: whether the requested document exists
_source: the document content, in JSON format

Note: We never created the user type beforehand; it was created automatically when the document was indexed. ES can create indexes and types implicitly, either with default parameters or with settings inferred from the submitted data, but relying on this is not recommended. When you are not sure what the defaults might lead to, create the index and type explicitly and set their parameters yourself.
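
As a sketch of what explicit creation might look like, the snippet below builds a request body for PUT /test1 with the shard and replica counts spelled out and a mapping for the user type. The field types ("string", "integer") are assumptions made for this example and are not taken from the article; create_index_body is a hypothetical helper.

```python
import json


def create_index_body(shards: int = 5, replicas: int = 1) -> dict:
    """Build a body for creating an index with explicit settings and mapping."""
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        # Assumed mapping for the "user" type used in the examples;
        # the field types are illustrative guesses, not from the article.
        "mappings": {
            "user": {
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                }
            }
        },
    }


print("PUT /test1")
print(json.dumps(create_index_body(), indent=2))
```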

Delete a document:
Input: DELETE /test1/user/1?pretty
Output:

{
  "found": true,
  "_index": "test1",
  "_type": "user",
  "_id": "1",
  "_version": 2
}

Read the document again. Output:

{
  "_index": "test1",
  "_type": "user",
  "_id": "1",
  "found": false
}

Delete Index

Input: DELETE /test1?pretty
Output:

{
  "acknowledged": true
}

Modify Document

Initialize the document. Input:

PUT /test1/user/1?pretty
{"name": "Silence2", "age": 28}

Modify the document. Input:

PUT /test1/user/1?pretty
{"name": "Silence1"}

Read the document. Output:

{
  "_index": "test1",
  "_type": "user",
  "_id": "1",
  "_version": 2,
  "found": true,
  "_source": {"name": "Silence1"}
}

Update document

Update the document. Input:

POST /test1/user/1/_update?pretty
{"doc": {"name": "Silence3", "age": 28}}

Read the document. Output:

{
  "_index": "test1",
  "_type": "user",
  "_id": "1",
  "_version": 3,
  "found": true,
  "_source": {"name": "Silence3", "age": 28}
}

Update the document with a script. Input:

POST /test1/user/1/_update?pretty
{"script": "ctx._source.age += 1"}

Read the document. Output:

{
  "_index": "test1",
  "_type": "user",
  "_id": "1",
  "_version": 4,
  "found": true,
  "_source": {"name": "Silence3", "age": 29}
}
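
The modify (PUT) and update (POST + _update) examples above behave differently, and the difference can be modeled with plain dictionaries. This is a simplified sketch with hypothetical function names; it merges only top-level fields and does not reproduce how ES handles nested objects.

```python
def replace_document(stored: dict, submitted: dict) -> dict:
    """Model PUT: the submitted document replaces the stored one entirely."""
    return dict(submitted)


def partial_update(stored: dict, doc: dict) -> dict:
    """Model POST _update with "doc": submitted fields overwrite stored ones."""
    merged = dict(stored)
    merged.update(doc)  # shallow merge only; nested objects are not handled
    return merged


stored = {"name": "Silence2", "age": 28}

after_put = replace_document(stored, {"name": "Silence1"})
print(after_put)  # the age field is gone because the whole document was replaced

after_update = partial_update(stored, {"name": "Silence3"})
print(after_update)  # the age field is kept because only name was overwritten
```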

Note: Using a script in a POST update requires script.groovy.sandbox.enabled: true to be set in elasticsearch/config/elasticsearch.yml.
The difference between modify (PUT) and update (POST + _update): modify replaces the document stored in ES with the submitted document, while update overwrites only the submitted fields in the stored document.

Delete documents by query

Input:

DELETE /test1/user/_query?pretty
{"query": {"match": {"name": "Silence3"}}}

Output:

{
  "_indices": {
    "test1": {
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      }
    }
  }
}

Get the number of documents

Input: GET /test1/user/_count?pretty
Output:

{
  "count": 0,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  }
}

Bulk operations

Input:

POST /test1/user/_bulk?pretty
{"index": {"_id": 1}}
{"name": "Silence1"}
{"index": {"_id": 2}}
{"name": "Silence2"}
{"index": {}}
{"name": "Silence3"}
{"index": {}}
{"name": "Silence4"}

Input:

POST /test1/user/_bulk?pretty
{"update": {"_id": 1}}
{"doc": {"age": }}
{"delete": {"_id": 2}}
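
The bulk bodies above are newline-delimited JSON: one action line, followed by a source line for actions that carry data (delete has none), with a newline after the last line. A minimal sketch of building such a body follows; bulk_body and the tuple layout are assumptions made for this example.

```python
import json


def bulk_body(actions) -> str:
    """Build a newline-delimited bulk body.

    Each item is (action, metadata, source); source is None for actions
    such as delete that have no data line.
    """
    lines = []
    for action, meta, source in actions:
        lines.append(json.dumps({action: meta}))
        if source is not None:
            lines.append(json.dumps(source))
    # The body must end with a newline after the final line.
    return "\n".join(lines) + "\n"


body = bulk_body([
    ("index", {"_id": 1}, {"name": "Silence1"}),
    ("index", {}, {"name": "Silence3"}),
    ("delete", {"_id": 2}, None),
])
print(body, end="")
```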

Import data from a file: curl -XPOST "localhost:9200/test1/account/_bulk?pretty" --data-binary @accounts.json

Query

Queries can be submitted in two ways: as parameters in the URL query string, or as a request body sent through the REST API.

Get all documents
Input: GET /test1/user/_search?q=*&pretty
or, with a request body:

POST /test1/user/_search?pretty
{
  "query": {"match_all": {}}
}

Output:

{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 3,
      "max_score": 1,
      "hits": [
         {
            "_index": "test1",
            "_type": "user",
            "_id": "1",
            "_score": 1,
            "_source": {
               "name": "Silence1",
               "age":
            }
         },
         {
            "_index": "test1",
            "_type": "user",
            "_id": "AU_M2ZGWLNDQVGQQS3MP",
            "_score": 1,
            "_source": {
               "name": "Silence3"
            }
         },
         {
            "_index": "test1",
            "_type": "user",
            "_id": "Au_m2zgwlndqvgqqs3mq",
            "_score": 1,
            "_source": {
               "name": "Silence4"
            }
         }
      ]
   }
}

Description
took: the time taken to execute the query (in milliseconds)
timed_out: whether the query timed out
_shards: how many shards took part in the query, and how many of them succeeded or failed
hits: the query results
hits.total: the total number of matching documents
_score, max_score: how well a document matches the query DSL, and the highest such score in the results
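
A client usually needs only a few of these fields. The sketch below pulls the total and the per-hit ID, score, and source out of a response shaped like the one above; summarize_hits is a hypothetical helper, and the embedded response is a trimmed copy used purely as test data.

```python
import json


def summarize_hits(response: dict):
    """Return (total, [(id, score, source), ...]) from a search response."""
    hits = response["hits"]
    rows = [(h["_id"], h["_score"], h.get("_source", {})) for h in hits["hits"]]
    return hits["total"], rows


# Trimmed sample in the same shape as the output shown above.
raw = """
{
  "took": 2,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {"_index": "test1", "_type": "user", "_id": "1", "_score": 1,
       "_source": {"name": "Silence1"}}
    ]
  }
}
"""

total, rows = summarize_hits(json.loads(raw))
print(total)
for doc_id, score, source in rows:
    print(doc_id, score, source.get("name"))
```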

Input:

POST /test1/account/_search?pretty
{
  "query": {"match_all": {}},
  "size": 2,
  "from": 6,
  "sort": {
    "age": {"order": "asc"}
  }
}

Description
query: defines the conditions of the query
match_all: matches all documents
size: the number of documents to return; defaults to 10 if not set
from: the starting offset; ES counts from 0. It is usually combined with size for paged queries and defaults to 0 if not set
sort: sets the sort fields and order

Use _source to choose which document fields the query results return
Input:

POST /test1/account/_search?pretty
{
  "query": {
    "match_all": {}
  },
  "_source": ["firstname", "lastname", "age"]
}

Output:

{
   "took": 5,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 1000,
      "max_score": 1,
      "hits": [
         {
            "_index": "test1",
            "_type": "account",
            "_id": "4",
            "_score": 1,
            "_source": {
               "firstname": "Rodriquez",
               "age": ,
               "lastname": "Flores"
            }
         },
         {
            "_index": "test1",
            "_type": "account",
            "_id": "9",
            "_score": 1,
            "_source": {
               "firstname": "Opal",
               "age": ,
               "lastname": "Meadows"
            }
         }
      ]
   }
}
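
The request-body parameters covered so far (query, size, from, sort, _source) can be assembled programmatically, with from derived from a zero-based page number. This is a sketch under that assumption; search_body and its parameter names are hypothetical.

```python
from typing import Optional, Sequence


def search_body(page: int, page_size: int = 10,
                sort_field: Optional[str] = None, order: str = "asc",
                fields: Optional[Sequence[str]] = None) -> dict:
    """Build a match_all search body for a zero-based page number."""
    body = {
        "query": {"match_all": {}},
        "from": page * page_size,  # offset of the first document on the page
        "size": page_size,
    }
    if sort_field is not None:
        body["sort"] = {sort_field: {"order": order}}
    if fields is not None:
        body["_source"] = list(fields)
    return body


# Page 3 with 2 documents per page gives "from": 6, "size": 2.
print(search_body(page=3, page_size=2, sort_field="age"))
print(search_body(page=0, fields=["firstname", "lastname", "age"]))
```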

Use match to set the value the query should match
Input:

POST /test1/account/_search?pretty
{
  "query": {
    "match": {"address": "986 Wyckoff Avenue"}
  },
  "size": 2
}

Output:

{
   "took": 1,
   "timed_out": false,
   "_shards": {"total": 5, "successful": 5, "failed": 0},
   "hits": {
      "total": 216,
      "max_score": 4.1231737,
      "hits": [
         {
            "_index": "test1",
            "_type": "account",
            "_id": "4",
            "_score": 4.1231737,
            "_source": {"account_number": 4, "balance": 27658, "firstname": "Rodriquez", "lastname": "Flores", "age": , "gender": "F", "address": "986 Wyckoff Avenue", "employer": "Tourmania", "email": "rodriquezflores@tourmania.com", "city": "Eastvale", "state": "HI"}
         },
         {
            "_index": "test1",
            "_type": "account",
            "_id": "34",
            "_score": 0.59278774,
            "_source": {"account_number": 34, "balance": 35379, "firstname": "Ellison", "lastname": "Kim", "age": 30, "gender": "F", "address": "986 Revere Place", "employer": "Signity", "email": "ellisonkim@signity.com", "city": "Sehili", "state": "IL"}
         }
      ]
   }
}

Note: The results include not only the document whose address contains "986 Wyckoff Avenue" but every document whose address contains any one of the three terms 986, Wyckoff, or Avenue. This is the effect of ES text analysis (tokenization).
The results are ordered by _score (how well each document matches the query), from highest to lowest.
What if you only want documents whose address contains the exact phrase "986 Wyckoff Avenue"? Use match_phrase.
Input:

POST /test1/account/_search?pretty
{
  "query": {
    "match_phrase": {"address": "986 Wyckoff Avenue"}
  }
}
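
The difference between match and match_phrase can be illustrated with a toy model in Python. It assumes a very simple analyzer (lowercase, split on whitespace) and ignores scoring entirely, so it is only an approximation of what ES does; all function names here are hypothetical.

```python
def tokenize(text: str) -> list[str]:
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match_any(field_value: str, query: str) -> bool:
    """Model match: true if the field shares at least one term with the query."""
    return bool(set(tokenize(field_value)) & set(tokenize(query)))


def match_phrase(field_value: str, query: str) -> bool:
    """Model match_phrase: the query terms must appear consecutively, in order."""
    tokens, phrase = tokenize(field_value), tokenize(query)
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


query = "986 Wyckoff Avenue"
for address in ["986 Wyckoff Avenue", "986 Revere Place"]:
    print(address, match_any(address, query), match_phrase(address, query))
```

Both addresses satisfy the match model because they share the term 986, but only the first one contains the full phrase.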

As you may have noticed, each query above has only one condition. To combine multiple conditions, use a bool query.
Input:

POST /test1/account/_search?pretty
{
  "query": {
    "bool": {
      "must": [
        {"match_phrase": {"address": "986 Wyckoff Avenue"}},
        {"match": {"age": }}
      ]
    }
  }
}

Description: returns the documents that satisfy all of the conditions

Input:

POST /test1/account/_search
{
  "query": {
    "bool": {
      "should": [
        {"match_phrase": {"address": "986 Wyckoff Avenue"}},
        {"match_phrase": {"address": "963 Neptune Avenue"}}
      ]
    }
  }
}

Description: returns the documents that satisfy at least one of the conditions
Input:

POST /test1/account/_search
{
  "query": {
    "bool": {
      "must_not": [
        {"match": {"city": "Eastvale"}},
        {"match": {"city": "Olney"}}
      ]
    }
  }
}

Description: returns the documents that satisfy none of the conditions

must, must_not, and should can be combined in the same query DSL
Input:

POST /test1/account/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"age": }}
      ],
      "must_not": [
        {"match": {"city": "Steinhatchee"}}
      ]
    }
  }
}
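
How the three bool clauses combine can be modeled with plain predicates over a document. This sketch ignores scoring and assumes that should clauses are required only when no must clause is present; bool_matches and the sample documents are hypothetical.

```python
def bool_matches(doc: dict, must=(), should=(), must_not=()) -> bool:
    """Toy model of bool query matching (no scoring)."""
    if not all(pred(doc) for pred in must):
        return False  # every must clause has to match
    if any(pred(doc) for pred in must_not):
        return False  # no must_not clause may match
    if should and not must:
        # Assumption: without must clauses, at least one should clause matches.
        return any(pred(doc) for pred in should)
    return True


docs = [
    {"age": 30, "city": "Sehili"},
    {"age": 30, "city": "Steinhatchee"},
    {"age": 25, "city": "Olney"},
]

is_30 = lambda d: d["age"] == 30
in_steinhatchee = lambda d: d["city"] == "Steinhatchee"

selected = [d for d in docs
            if bool_matches(d, must=[is_30], must_not=[in_steinhatchee])]
print(selected)
```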

Filters Query
