Origin:
We run Hadoop, but the project itself is not distributed: every time we want to analyze the business logs, we first have to copy them over to the cluster by hand before Hadoop can process them.
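Flume automates exactly this step by shipping logs straight into HDFS. As a preview, a minimal agent configuration with an HDFS sink might look like the sketch below (the agent/source/channel names, the `tail -F` exec source, the log path, and the NameNode address are all illustrative assumptions, not values from this article):

```properties
# flume.conf -- minimal sketch: tail a local log file into HDFS
agent.sources  = src1
agent.channels = ch1
agent.sinks    = hdfs1

# Source: follow the business log as new lines are appended (path is an example)
agent.sources.src1.type     = exec
agent.sources.src1.command  = tail -F /var/log/app/business.log
agent.sources.src1.channels = ch1

# Channel: buffer events in memory between source and sink
agent.channels.ch1.type = memory

# Sink: write events into HDFS, one directory per day (NameNode address is an example)
agent.sinks.hdfs1.type                   = hdfs
agent.sinks.hdfs1.hdfs.path              = hdfs://namenode:9000/flume/logs/%Y-%m-%d
agent.sinks.hdfs1.hdfs.fileType          = DataStream
agent.sinks.hdfs1.hdfs.useLocalTimeStamp = true
agent.sinks.hdfs1.channel                = ch1
```

With a file like this in place, the agent would be started with something along the lines of `bin/flume-ng agent --conf conf --conf-file flume.conf --name agent`.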
Rather than keep doing that by hand, it is better to use Flume, whose out-of-the-box HDFS sink ships the logs into the cluster automatically and spares us those manual steps.
Preparation Environment:
You need a working Hadoop installation; mine is version 2.7.3. If you are not sure how to install it, you can refer to the article in my blog:
my write-up (2016-08-26) on installing Hadoop 2.7.3 on CentOS 7 by following the various online guides, including the pitfalls I ran into along the way.
OS Environment:
I am currently using CentOS; if you are on Ubuntu, much of this still applies.
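Before going further, it is worth confirming that Hadoop is actually reachable from the shell. A quick sanity check (this assumes the `hadoop` launcher is on your PATH, as it would be after a standard install):

```shell
# Print the installed Hadoop version, or a note if hadoop is not on PATH.
# Expects a first line like "Hadoop 2.7.3" on the setup described above.
HADOOP_INFO=$(command -v hadoop >/dev/null 2>&1 \
  && hadoop version | head -n 1 \
  || echo "hadoop not found on PATH")
echo "$HADOOP_INFO"
```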
The installation itself is fairly simple; the real difficulty is in the configuration.
Installation Steps
1. Download the latest stable Flume release from the official Apache site.
Then unpack it on the virtual machine; you can also download it directly with wget.
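The wget route can be sketched as follows. The version number and mirror URL here are my assumptions; check the Apache Flume download page for the current stable release before copying them:

```shell
# Download and unpack a Flume binary release.
# FLUME_VERSION and URL are assumptions -- adjust to the current stable release.
FLUME_VERSION="1.9.0"
TARBALL="apache-flume-${FLUME_VERSION}-bin.tar.gz"
URL="https://archive.apache.org/dist/flume/${FLUME_VERSION}/${TARBALL}"

# Fetch the release tarball (skipped if it is already present).
if [ ! -f "$TARBALL" ]; then
  wget -q "$URL" || echo "download failed -- check your network or mirror"
fi

# Unpack; this creates a directory named apache-flume-<version>-bin/.
if [ -f "$TARBALL" ]; then
  tar -xzf "$TARBALL"
fi
```

After unpacking, everything happens inside the extracted directory: the `flume-ng` launcher lives in `bin/` and the configuration files in `conf/`.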