In the previous article we started Elasticsearch. Now we can communicate with it: inserting data, retrieving data, deleting data, and so on. Elasticsearch provides two ways to communicate with it: the Java API and the RESTful API.
Java API
If you are using Java, Elasticsearch comes with two built-in clients that you can use in your code:
Node client: The node client joins a cluster as a non-data node. In other words, it does not hold any data itself, but it knows which data lives on which node in the cluster and can forward requests directly to the correct node.
Transport client: The lighter-weight transport client can be used to send requests to a remote cluster. It does not join the cluster itself; it simply forwards requests to a node in the cluster.
Both Java clients talk to the cluster over port 9300, using the native Elasticsearch transport protocol. The nodes in the cluster also communicate with each other over port 9300. If this port is blocked, your nodes will not be able to form a cluster.
Precautions
The Java client must be the same version as the Elasticsearch nodes; otherwise, they may not be able to understand each other.
More information on the Java API can be found in the Guide.
RESTful API with JSON over HTTP
All other languages can communicate with Elasticsearch over port 9200 using its RESTful API. In fact, as you will see, you can even talk to Elasticsearch from the command line with the curl command.
A request to Elasticsearch consists of the same parts as any other HTTP request. For example, to count the number of documents in the cluster, we can use:
curl -XGET 'http://localhost:9200/_count?pretty' -d '
{
  "query": {
    "match_all": {}
  }
}
'
The request is made up of the following parts:
The appropriate HTTP method or verb: GET, POST, PUT, HEAD, or DELETE.
The protocol, hostname, and port of any node in the cluster.
The path of the request.
Any optional query-string parameters. For example, ?pretty pretty-prints the JSON response to make it easier to read.
A JSON-encoded request body, if needed.
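As a rough illustration of how these parts fit together, the following Python sketch assembles the same _count request using only the standard library. The helper name build_request and its parameters are hypothetical and exist only for this example. The sketch builds the request object without sending it, so it does not assume a running cluster.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_request(verb, host, port, path, params=None, body=None):
    """Assemble an HTTP request from the parts listed above.

    Hypothetical helper for illustration; it is not part of any
    Elasticsearch client library.
    """
    url = "http://{}:{}/{}".format(host, port, path.lstrip("/"))
    if params:
        # Optional query-string parameters, such as pretty.
        url += "?" + urlencode(params)
    # The request body is JSON-encoded only when one is supplied.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(url, data=data, headers=headers, method=verb)


# The _count request from the text: verb, node address, path,
# query string, and JSON body.
count_request = build_request(
    "GET", "localhost", 9200, "/_count",
    params={"pretty": "true"},
    body={"query": {"match_all": {}}},
)
print(count_request.get_method(), count_request.full_url)
```

The sketch spells the flag as pretty=true, whereas the curl example uses the bare ?pretty form. If a node is running, the request object could be passed to urllib.request.urlopen to actually send it.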
Elasticsearch returns an HTTP status code such as 200 OK and (except for HEAD requests) a JSON-encoded response body. The request above returns the following JSON body:
{
  "count": 0,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  }
}
We do not see the HTTP headers in the response because we did not ask curl to display them. To see the headers, use the curl command with the -i switch:
curl -i -XGET 'http://localhost:9200/_count'
Basic Concepts
Elasticsearch is a document-oriented database, which means that it stores entire objects or documents. It not only stores them but also indexes them so that you can search for them. In Elasticsearch you index, search, sort, and filter documents, not rows of columnar data. This is a fundamentally different way of thinking about data, and it is one of the reasons Elasticsearch can perform complex full-text searches.
Elasticsearch uses JSON (JavaScript Object Notation) as the serialization format for documents. JSON is supported by most programming languages and has become a standard format in the NoSQL world. It is simple, concise, and easy to read.
For example, suppose we have a user object. We can represent it as a JSON document like this:
{
  "email": "john@smith.com",
  "first_name": "John",
  "last_name": "Smith",
  "about": {
    "bio": "Eco-warrior and defender of the weak",
    "age": 25,
    "interests": ["Dolphins", "whales"]
  },
  "join_date": "2014/05/01"
}
Although the user object is complex, its structure and meaning are preserved in JSON. In Elasticsearch, converting an object to JSON and indexing it is much simpler than doing the equivalent with a flat table structure.
In Elasticsearch, a document belongs to a type, and types live inside an index. You can draw a rough analogy to a traditional relational database:
Relational database ⇒ databases ⇒ tables ⇒ rows ⇒ columns
Elasticsearch ⇒ indices ⇒ types ⇒ documents ⇒ fields
An Elasticsearch cluster can contain multiple indices (databases), which in turn contain multiple types (tables). These types hold multiple documents (rows), and each document has multiple fields (columns).
Easily confused terms
The term "index" is overloaded with several meanings, so we need to clarify them here:
Index (noun)
As explained above, an index is like a database in a traditional relational database. It is the place where related documents are stored.
Index (verb)
To index a document is to store it in an index (noun) so that it can be retrieved. This is much like the INSERT command in SQL, except that if the document already exists, the new document replaces the old one.
Inverted index
Relational databases add an index, such as a B-tree index, to specific columns to speed up data retrieval. Elasticsearch and Lucene use a structure called an inverted index for the same purpose.
Example
Imagine that we are building a new employee directory for a company called Megacorp. It should support real-time collaboration, so it needs to meet the following requirements:
Data can contain multi-value tags, numbers, and full text.
The full details of any employee can be retrieved.
Structured search is supported, such as finding employees over the age of 30.
Simple full-text search and more complex phrase search are supported.
Keywords are highlighted in the text of matching documents.
Analytics and management dashboards can be built over the data.
Data Storage
The first step is to store the employee data. It takes the form of an employee document: each document represents one employee. In Elasticsearch, each document belongs to a type, and types live inside an index.
So, to build the employee directory, we need to do the following:
Index a document for each employee, containing all the information about that employee.
Give each document the type employee.
Keep that type in the megacorp index.
Store that index in our Elasticsearch cluster.
In practice this is very simple, even though it looks like a lot of steps. We can do all of it with a single command:
curl -XPUT 'http://localhost:9200/megacorp/employee/1' -d '
{
  "first_name": "John",
  "last_name": "Smith",
  "age": 25,
  "about": "I love to go rock climbing",
  "interests": [
    "Sports",
    "Music"
  ],
  "join_time": "2014-11-24"
}'
Note that the path /megacorp/employee/1 contains three parts:
Name | Content
megacorp | The name of the index
employee | The name of the type
1 | The ID of the current employee
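To make the three-part structure concrete, here is a minimal Python sketch that splits such a path into its index, type, and ID. The function name split_document_path is hypothetical and chosen only for this illustration. It assumes the simple /index/type/id layout described above.

```python
def split_document_path(path):
    """Split '/index/type/id' into its three named parts.

    Hypothetical helper for illustration only.
    """
    parts = path.strip("/").split("/")
    if len(parts) != 3:
        raise ValueError("expected /index/type/id, got {!r}".format(path))
    index_name, type_name, doc_id = parts
    return {"index": index_name, "type": type_name, "id": doc_id}


employee_path = split_document_path("/megacorp/employee/1")
print(employee_path["index"], employee_path["type"], employee_path["id"])
```

The ID is kept as a string because Elasticsearch also reports it as a string in the _id field of its responses.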
The request body, the JSON document, contains all the information about this employee. His name is John Smith, he is 25 years old, and he enjoys rock climbing.
We did not need to perform any administrative tasks beforehand. Elasticsearch automatically detects the data structure and field types, creates the index, and makes the document searchable. All of this happens silently in the background.
Before moving on, let's add a few more employees to the directory:
curl -XPUT 'http://localhost:9200/megacorp/employee/2' -d '
{
  "first_name": "Jane",
  "last_name": "Smith",
  "age": 32,
  "about": "I like to collect rock albums",
  "interests": ["Music"],
  "join_time": "2012-10-15"
}'
curl -XPUT 'http://localhost:9200/megacorp/employee/3' -d '
{
  "first_name": "Douglas",
  "last_name": "Fir",
  "age": 35,
  "about": "I like to build cabinets",
  "interests": ["Forestry"],
  "join_time": "2016-01-24"
}'
Data Retrieval
Now that we have some data stored in Elasticsearch, we can start working on the requirements of the project. The first requirement is to be able to retrieve the data of any individual employee.
1. Retrieve by ID
This is easy in Elasticsearch. We simply execute an HTTP GET request and specify the address of the document: the index, the type, and the ID. Using those three pieces of information, we get back the original JSON document:
curl -XGET 'http://localhost:9200/megacorp/employee/1'
The response contains some metadata about the document, and John Smith's original JSON document appears in the _source field:
{
  "_index": "megacorp",
  "_type": "employee",
  "_id": "1",
  "_version": 2,
  "found": true,
  "_source": {
    "first_name": "John",
    "last_name": "Smith",
    "age": 25,
    "about": "I love to go rock climbing",
    "interests": [
      "Sports",
      "Music"
    ],
    "join_time": "2014-11-24"
  }
}
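The following Python sketch shows one way a client might pull the original document out of such a response. The response text is hard-coded from the example so that the sketch runs without a cluster, and the function name extract_source is hypothetical.

```python
import json

# Response body based on the example above, hard-coded so the sketch
# runs offline instead of querying a live node.
raw_response = """
{
  "_index": "megacorp",
  "_type": "employee",
  "_id": "1",
  "_version": 2,
  "found": true,
  "_source": {
    "first_name": "John",
    "last_name": "Smith",
    "age": 25,
    "about": "I love to go rock climbing",
    "interests": ["Sports", "Music"],
    "join_time": "2014-11-24"
  }
}
"""


def extract_source(response_text):
    """Return the stored document, or None if it was not found."""
    response = json.loads(response_text)
    if not response.get("found"):
        return None
    return response["_source"]


employee = extract_source(raw_response)
print(employee["first_name"], employee["last_name"], employee["age"])
```

Checking the found flag before reading _source is a defensive choice for this sketch, since a response for a missing document would not carry the document body.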
2. Simple Search
Let's start with the simplest possible search, a search for all employees:
curl -XGET 'http://localhost:9200/megacorp/employee/_search'
You can see that we are still using the megacorp index and the employee type, but instead of specifying a document ID, we now use the _search endpoint. The response includes all three of our documents in the hits array. By default, a search returns the top 10 results.
{
  "took": ...,
  "timed_out": false,
  "_shards": { "total": 2, "successful": 2, "failed": 0 },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "megacorp", "_type": "employee", "_id": "1", "_score": 1,
        "_source": {
          "first_name": "John", "last_name": "Smith", "age": 25,
          "about": "I love to go rock climbing",
          "interests": ["Sports", "Music"],
          "join_time": "2014-11-24"
        }
      },
      {
        "_index": "megacorp", "_type": "employee", "_id": "2", "_score": 1,
        "_source": {
          "first_name": "Jane", "last_name": "Smith", "age": 32,
          "about": "I like to collect rock albums",
          "interests": ["Music"],
          "join_time": "2012-10-15"
        }
      },
      {
        "_index": "megacorp", "_type": "employee", "_id": "3", "_score": 1,
        "_source": {
          "first_name": "Douglas", "last_name": "Fir", "age": 35,
          "about": "I like to build cabinets",
          "interests": ["Forestry"],
          "join_time": "2016-01-24"
        }
      }
    ]
  }
}
The response not only tells us which documents matched, but also includes the documents themselves: all the information we need to display the search results to the user.
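Because each hit carries the whole document, a client can build a result list without issuing further requests. The Python sketch below illustrates this with a trimmed, hard-coded response for the three employees indexed earlier. The function name list_employees is hypothetical, and most response fields are omitted for brevity.

```python
import json

# Trimmed copy of a search response for the three employees indexed
# earlier (hard-coded so the sketch runs without a cluster).
raw_search_response = """
{
  "hits": {
    "total": 3,
    "hits": [
      {"_id": "1", "_source": {"first_name": "John", "last_name": "Smith"}},
      {"_id": "2", "_source": {"first_name": "Jane", "last_name": "Smith"}},
      {"_id": "3", "_source": {"first_name": "Douglas", "last_name": "Fir"}}
    ]
  }
}
"""


def list_employees(response_text):
    """Return (document ID, full name) pairs from a search response body."""
    response = json.loads(response_text)
    return [
        (
            hit["_id"],
            "{} {}".format(
                hit["_source"]["first_name"], hit["_source"]["last_name"]
            ),
        )
        for hit in response["hits"]["hits"]
    ]


for doc_id, name in list_employees(raw_search_response):
    print(doc_id, name)
```

Note that the matching documents sit in the inner hits array, nested inside the outer hits object that also reports the total.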
Next, let's search for employees whose last name is Smith. To do this, we will use a lightweight search method that is easy to run from the command line. It is often called query-string search, because we pass the search terms as a URL query-string parameter:
curl -XGET 'http://localhost:9200/megacorp/employee/_search?q=last_name:Smith'
We still use the _search endpoint, and we pass the query itself in the q= parameter. The results contain all the employees with the last name Smith:
{
  "took": 2,
  "timed_out": false,
  "_shards": { "total": 2, "successful": 2, "failed": 0 },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "megacorp", "_type": "employee", "_id": "1", "_score": 1,
        "_source": {
          "first_name": "John", "last_name": "Smith", "age": 25,
          "about": "I love to go rock climbing",
          "interests": ["Sports", "Music"],
          "join_time": "2014-11-24"
        }
      },
      {
        "_index": "megacorp", "_type": "employee", "_id": "2", "_score": 1,
        "_source": {
          "first_name": "Jane", "last_name": "Smith", "age": 32,
          "about": "I like to collect rock albums",
          "interests": ["Music"],
          "join_time": "2012-10-15"
        }
      }
    ]
  }
}
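When a query-string search is built in code rather than typed by hand, the search terms should be URL-encoded. The Python sketch below shows one way to assemble such a URL with the standard library. The function name query_string_search_url and its default host and port are assumptions made for this example.

```python
from urllib.parse import urlencode


def query_string_search_url(index, doc_type, field, value,
                            host="localhost", port=9200):
    """Build a _search URL that passes 'field:value' in the q parameter.

    Hypothetical helper for illustration; host and port default to the
    local node used throughout this article.
    """
    # urlencode percent-encodes reserved characters such as ':' and
    # turns spaces into '+', so the terms survive the trip in a URL.
    query = urlencode({"q": "{}:{}".format(field, value)})
    return "http://{}:{}/{}/{}/_search?{}".format(
        host, port, index, doc_type, query
    )


print(query_string_search_url("megacorp", "employee", "last_name", "Smith"))
print(query_string_search_url("megacorp", "employee", "about", "rock climbing"))
```

The encoded form (last_name%3ASmith) decodes to the same last_name:Smith query that the curl example sends in plain text.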
3. Search with the Query DSL
Query-string search is handy for ad hoc searches from the command line, but it has its limitations (see the "Search Limitations" section). Elasticsearch provides a richer and more flexible query language called the query DSL, which lets you build more complex and powerful searches.
The DSL (domain-specific language) is specified in a JSON request body. We can find all employees with the last name Smith like this:
curl -XGET 'http://localhost:9200/megacorp/employee/_search' -d '
{
  "query": {
    "match": {
      "last_name": "Smith"
    }
  }
}'
This request returns the same results as before. Notice that we no longer use a query string. Instead, we send a request body built with JSON that uses a match query. We will look at other types of queries later.
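Since the request body is plain JSON, it can be built from ordinary data structures in most languages. As a minimal Python sketch (the function name match_query is hypothetical), the body of the match query could be produced like this:

```python
import json


def match_query(field, text):
    """Build the request body for a match query on a single field.

    Hypothetical helper that mirrors the JSON structure shown in the text.
    """
    return {"query": {"match": {field: text}}}


body = match_query("last_name", "Smith")
# Serialize the dictionary to the JSON text sent as the request body.
print(json.dumps(body))
```

Building the body as a dictionary and serializing it at the last moment avoids the quoting mistakes that are easy to make when JSON is concatenated by hand.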
More Complex Searches
Next, let's make the search a little more complicated. We still want to find employees with the last name Smith, but we only want those who are older than 30. Our query changes slightly to accommodate a filter, which lets us run structured searches efficiently:
curl -XGET 'http://localhost:9200/megacorp/employee/_search' -d '
{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "age": { "gt": 30 }
        }
      },
      "query": {
        "match": {
          "last_name": "Smith"
        }
      }
    }
  }
}'
The filter part of this statement is a range filter, which finds all employees older than 30; gt stands for "greater than".
The query part is the same match query that we used before.
Don't be intimidated by the amount of syntax; we will explain how it works later. For now, all you need to know is that we have added a filter that performs a range search on top of the match query. The results now show only one employee, the 32-year-old Jane Smith:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 1,
        "_source": {
          "first_name": "Jane",
          "last_name": "Smith",
          "age": 32,
          "about": "I like to collect rock albums",
          "interests": [
            "Music"
          ],
          "join_time": "2012-10-15"
        }
      }
    ]
  }
}
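The nesting of the filtered query is easier to see when the body is built step by step. The Python sketch below wraps a match query and a range filter into one request body. The function name filtered_query and its parameter names are hypothetical, and the structure simply mirrors the JSON used in this section.

```python
import json


def filtered_query(match_field, match_text, range_field, greater_than):
    """Build a request body that combines a match query with a range filter.

    Hypothetical helper; it reproduces the 'filtered' structure used in
    this article.
    """
    return {
        "query": {
            "filtered": {
                # Structured part: keep only documents above the threshold.
                "filter": {"range": {range_field: {"gt": greater_than}}},
                # Full-text part: the same match query as before.
                "query": {"match": {match_field: match_text}},
            }
        }
    }


body = filtered_query("last_name", "Smith", "age", 30)
print(json.dumps(body, indent=2))
```

The filtered query belongs to the older Elasticsearch releases this article is written for. In later releases the same combination is expressed with a bool query that has a filter clause, so treat the structure above as version-specific.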
Full-Text Search
The searches so far have been simple: search by name, filter by age. Let's try something more advanced, full-text search, a task that is difficult to implement in a traditional database. We will search for all employees who enjoy rock climbing:
curl -XGET 'http://localhost:9200/megacorp/employee/_search' -d '
{
  "query": {
    "match": {
      "about": "rock climbing"
    }
  }
}'
You will notice that we again use the match query, this time to search the about field for "rock climbing". We get back two matching documents:
{
"took": 7,
"timed_out": false,
"_shards"