Premise
With the rapid development of artificial intelligence and big data, fast retrieval over terabytes and even petabytes of data has become a requirement, and large enterprises are already drowning in the flood of data their systems generate. Big data technologies focus on how to store and process these massive amounts of data. Elasticsearch, a rising star in the open source world, has grown by leaps and bounds since 2010. With three advantages (open source, distributed, and a RESTful API), it has become the proverbial pig that flies because it is standing where the wind blows.
I have written several Elasticsearch source code analyses on my local machine, and looking back I thought I should also write an article on why I went to read its source code in the first place.
So why? Below I describe my own journey from first touching search to reading the source code today.
Follow me
Zhisheng
When reposting, please be sure to credit the original address: http://www.54tianzhisheng.cn/2018/08/24/why-see-es-code/
First Contact Search
When we think of search, the first thing that comes to mind is search engines such as Google and Baidu. That counts as the earliest contact.
In my own projects, my first contact with search was a practice project during the summer vacation of my sophomore year. It used Solr, and I learned a little more about it as the project went on.
Second Contact Search
I had used search in that first project, and afterwards I became more interested in the area. My next contact with search came when I joined a company as an intern. The first thing my boss asked me to do was learn to build an Elasticsearch cluster, so I set up three virtual machines on my computer and installed Elasticsearch on them one by one. I also recorded the process in a blog post: Elasticsearch series (2): An introductory tutorial on building a cluster for the full-text search engine Elasticsearch. When I built that cluster, ES had just been upgraded from 2.x to 5.x; as of this writing (2018.08.04), ES has already reached 7.0. Versions are released very quickly, which shows how active ES is and how quickly the engineers behind it maintain it. That also indirectly highlights the importance of reading its source code.
After I had built the cluster in my local test environment, I was given another task: understand the analyzers built into ES, the similarities and differences between the English analyzers and the Chinese analyzers, and the points to watch when building a custom analyzer. So at the time I contributed this article to the company wiki: Elasticsearch series (1): Comparison and usage of the default Elasticsearch analyzers and Chinese analyzers. The article covers almost all the analyzers on the market, including their similarities and differences, how to use them, and how to customize an analyzer.
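One way to compare analyzers is to send the same text to the `_analyze` REST API with different `analyzer` values and compare the tokens that come back. The sketch below only builds the request bodies; it is a minimal illustration, not the method used in the article. The helper name `analyzeBody` is hypothetical, and `ik_smart` assumes the IK plugin is installed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// analyzeRequest mirrors the two basic fields accepted by POST _analyze.
type analyzeRequest struct {
	Analyzer string `json:"analyzer"`
	Text     string `json:"text"`
}

// analyzeBody (hypothetical helper) builds the JSON body for POST _analyze.
func analyzeBody(analyzer, text string) ([]byte, error) {
	return json.Marshal(analyzeRequest{Analyzer: analyzer, Text: text})
}

func main() {
	// "standard" is built in; "ik_smart" is assumed to come from the IK plugin.
	for _, name := range []string{"standard", "ik_smart"} {
		body, err := analyzeBody(name, "hello world")
		if err != nil {
			panic(err)
		}
		fmt.Println(string(body))
	}
}
```

Each printed body would be sent to `_analyze` on a running cluster; the token lists in the responses show how the analyzers split the same input.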
A colleague in my group had a related task: studying the major mapping changes between 2.x and 5.x. I later read the document she wrote up, and it was very detailed.
After this round of contact with ES, and because I had a local environment, I tested some of its features myself: installing plugins (IK, X-Pack, SQL support), and then trying out ES indices, documents, and the REST API.
Third Contact Search
Because I was interested in it, I went looking for related videos. For example, the "Elasticsearch Top Master Series" video courses by Zhonghua Shishan are, in my opinion, quite good; after watching them, getting started should be no problem. For copyright reasons, no download links are provided.
There is also the translated edition of "Elasticsearch: The Definitive Guide". The translation is not complete, but it is worth reading: it is very detailed, and I doubt any other book on the market explains things so clearly. If your English is good, you can go straight to the English edition.
Then there is the official documentation, which is extremely detailed and includes demos. The official documentation for version 2.x is available in Chinese, so you can make do with that.
When learning something new, learn to read the official documentation, all the more so when it is as detailed as Elasticsearch's.
Fourth Contact Search
Towards the end of the internship, I was also assigned two modules of the company's middleware monitoring: component monitoring for Elasticsearch and HBase. So I once again had the chance to work with Elasticsearch, this time mainly using the REST APIs that Elasticsearch ships with:
_cluster/health
_cluster/stats
_nodes
_nodes/stats
These return cluster health information and node information (memory, CPU, network, JVM, and so on). For this project I also read a lot of similar articles online to learn which monitoring metrics are commonly used and how others do their monitoring. My job was mainly to collect the information, store it in the InfluxDB used by the company's larger project, and finally display it with Grafana. Later the ops expert in my group showed me the monitoring dashboard, and the interface looked really cool.
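As a rough illustration of the collect-and-store step, the sketch below decodes a `_cluster/health` response and renders it as one InfluxDB line protocol record. It is a minimal sketch under stated assumptions, not the company's actual collector: the measurement name `es_cluster_health`, the function names, and the sample JSON are all hypothetical, and only a subset of the response fields is decoded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ClusterHealth holds a subset of the fields returned by GET _cluster/health.
type ClusterHealth struct {
	ClusterName      string `json:"cluster_name"`
	Status           string `json:"status"`
	NumberOfNodes    int    `json:"number_of_nodes"`
	ActiveShards     int    `json:"active_shards"`
	UnassignedShards int    `json:"unassigned_shards"`
}

// parseHealth decodes the JSON body of a _cluster/health response.
// Fields not listed in ClusterHealth are ignored.
func parseHealth(body []byte) (ClusterHealth, error) {
	var h ClusterHealth
	err := json.Unmarshal(body, &h)
	return h, err
}

// toLineProtocol renders the snapshot as one InfluxDB line protocol record.
// Assumption: the cluster name contains no spaces, commas, or equals signs,
// so it can be used as a tag value without escaping.
func toLineProtocol(h ClusterHealth, tsNanos int64) string {
	return fmt.Sprintf(
		"es_cluster_health,cluster=%s status=%q,nodes=%di,active_shards=%di,unassigned_shards=%di %d",
		h.ClusterName, h.Status, h.NumberOfNodes, h.ActiveShards, h.UnassignedShards, tsNanos)
}

func main() {
	// Hand-written sample body; a real collector would fetch it over HTTP.
	sample := []byte(`{"cluster_name":"demo","status":"green","number_of_nodes":3,"active_shards":10,"unassigned_shards":0}`)
	h, err := parseHealth(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(toLineProtocol(h, 1533340800000000000))
}
```

Keeping the parsing and formatting separate from the HTTP call makes both steps easy to test without a running cluster.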
I wrote two blog posts at the time:
1. Elasticsearch series (3): Elasticsearch cluster monitoring
2. Elasticsearch series (4): Elasticsearch single-node monitoring
What I took from the internet I give back to the internet. I hope these posts offer some reference for anyone who later has to do a similar task.
After that I built an ELK (Elasticsearch, Logstash, Kibana) log analysis platform and played around with it.
Blog post on building the environment: Elasticsearch series (5): Building an ELK real-time log analysis platform
Fifth Contact Search
After that I had no contact with Elasticsearch for a while, as I was busy with other things.
Around the time I left the internship, graduated, and went job hunting, I spent a week skimming through "Elasticsearch: The Definitive Guide" again, and it helped quite a lot in my interviews. Because I had listed Elasticsearch monitoring among my projects, it would have been embarrassing if an interviewer asked about other aspects of Elasticsearch and I knew nothing, so I prepared. After reading the book, handling interview questions was not much of a problem.
Sixth Contact Search
It may look as if I had been in contact with Elasticsearch for a long time, but in fact I had never used it in a real project or built anything on Elasticsearch search. So when I was looking for a job, I also hoped to find one where I could build a project with Elasticsearch or work on a company project that used it.
As it turned out, it was soon put to use in a new project at my new company. This time, though, it was integrated with Golang rather than used in a Java project. The APIs are similar, however, and after getting familiar with them a few times I picked it up quickly. The key is to understand how to construct a DSL query for Elasticsearch; once you do, translating it into the Go API is fast.
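To make the point about DSL construction concrete, here is a minimal sketch that builds a `match` query body in Go using only the standard library. It does not use any particular Elasticsearch Go client, so it is not the project's actual code; the helper name `matchQuery` and the field name `title` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// matchQuery (hypothetical helper) builds the body of a full-text "match"
// query on a single field, limited to `size` hits.
func matchQuery(field, text string, size int) ([]byte, error) {
	body := map[string]interface{}{
		"size": size,
		"query": map[string]interface{}{
			"match": map[string]interface{}{
				field: text,
			},
		},
	}
	// json.Marshal writes map keys in sorted order, so the output is stable.
	return json.Marshal(body)
}

func main() {
	body, err := matchQuery("title", "elasticsearch", 10)
	if err != nil {
		panic(err)
	}
	// The resulting JSON would be sent as the body of a _search request.
	fmt.Println(string(body))
}
```

Once the JSON shape of a query is clear, mapping it onto a client library's builder types is mostly mechanical, which matches the experience described above.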
There also happened to be a graduate student from the Chinese Academy of Sciences who wrote a book on Elasticsearch, "From Lucene to Elasticsearch: Full-Text Search in Practice". His CSDN blog is also very popular and widely read. If you are interested, you can buy the book to support him.
Along the way, whenever I ran into an Elasticsearch problem I really could not solve, I would ask him, and he patiently taught a hopeless newbie like me. I would like to thank him here for looking after me during this time.
How the Idea of Reading the Source Code Started
I had been in contact with Elasticsearch for so long, had used it in a project, and had read books about it. Although I was not yet very familiar with it, wouldn't reading the source code deepen my understanding?
No sooner said than done: that evening at home I cloned the source from GitHub. I happened to be travelling back to my hometown, so I read the source on the train directly in VS Code, without even setting up debugging in an IDE.
As of writing this article, I have read through the entire Elasticsearch startup process (reading and loading the configuration, loading plugins, and so on) and how it supports the REST API. I will keep reading the source after work and keep sharing my source code analyses.
If you have an idea, act on it. If you don't try, how will you know whether it suits you?
Summary
In fact, the main reason I read the source code is that I am interested in it, and that I am actually using it in a project now; being familiar with the source should give me a more thorough understanding. On top of that, Elasticsearch is really popular and almost every company uses it, so learning it is well worth the effort.