Real-time computing requires collecting logs in real time, and logstash can do this. The current version is 1.4.2. The official documentation at http://www.logstash.net/docs/1.4.2/ provides detailed configuration instructions and is easy to follow. The tests below check how reliable logstash is.
Input is file: killing the logstash process
The test prints one log line every millisecond and reads the log with logstash. The logstash process is killed and restarted every 20 s. The result is that logstash has a high probability of re-sending logs, and it also sends a small number of empty messages. Downstream code must therefore filter out duplicate and empty messages.
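The filtering mentioned above could look roughly like the following. This is a minimal sketch, not part of the original test: the function name `filter_messages` is made up for illustration, and it assumes every log line is unique (for example because it carries a sequence number or a precise timestamp), so that an exact repeat can only be a re-send.

```python
from typing import Iterable, Iterator, Set


def filter_messages(messages: Iterable[str]) -> Iterator[str]:
    """Yield each non-empty message once, in first-seen order.

    Assumption: two identical lines are always a re-send, never two
    distinct events with the same text.
    """
    seen: Set[str] = set()
    for raw in messages:
        msg = raw.strip()
        if not msg:
            continue  # drop empty (or whitespace-only) messages
        if msg in seen:
            continue  # drop a message that was already delivered
        seen.add(msg)
        yield msg
```

The `seen` set grows without bound, so a long-running consumer would need to expire old entries, for instance by keeping only the most recent sequence numbers.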
Killing the output
- The output is Redis. After Redis is killed, logstash's writes to Redis block. Once Redis is restored, the writes go through and no data is lost.
- The output is Kafka, using the logstash-kafka plug-in (https://github.com/joekiller/logstash-kafka). Kafka usually runs as a cluster. If one of its processes is killed, the Kafka service is unavailable for a short period. logstash retries on failure, and no data is lost as long as the number of retries is sufficient. If all Kafka processes are killed, logstash keeps retrying until the retry threshold is exceeded, then discards the data, so data may be lost.
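The retry-then-discard behavior described for the Kafka output can be illustrated with a generic sketch. This is not the plug-in's actual code: `send_with_retry`, its parameters, and the use of `ConnectionError` as the failure signal are all assumptions made for the example.

```python
import time
from typing import Callable


def send_with_retry(send: Callable[[bytes], None], payload: bytes,
                    max_retries: int = 3, backoff_s: float = 0.1) -> bool:
    """Try to send payload; return True on success, False if it is dropped.

    The payload is attempted once plus up to max_retries more times.
    After that it is discarded, which is where data loss can occur.
    """
    for attempt in range(max_retries + 1):
        try:
            send(payload)
            return True
        except ConnectionError:
            if attempt < max_retries:
                time.sleep(backoff_s)  # wait before the next attempt
    return False  # threshold exceeded: the message is discarded
```

With a short outage the retries outlast the failure and nothing is lost. With a long outage every message eventually returns `False`, which matches the loss observed when all Kafka processes were killed.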
logstash as a single point of failure (SPOF)
A server generally runs only one logstash process. If the process fails and there is no automatic recovery mechanism, you have to restart it manually.
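One way to avoid manual restarts is a small supervisor that relaunches the process whenever it dies. The following is a minimal sketch under stated assumptions: the function `supervise` and its parameters are hypothetical, an exit code of 0 is treated as a deliberate shutdown, and any other exit (including being killed by a signal) triggers a restart.

```python
import subprocess
import time
from typing import List


def supervise(cmd: List[str], max_restarts: int = 5, delay_s: float = 1.0) -> int:
    """Run cmd and restart it after abnormal exits; return the launch count.

    Stops when the command exits with code 0 or when it has been
    restarted max_restarts times.
    """
    launches = 0
    while True:
        launches += 1
        exit_code = subprocess.call(cmd)  # blocks until the process exits
        if exit_code == 0 or launches > max_restarts:
            return launches
        time.sleep(delay_s)  # brief pause before relaunching
```

It could be called as `supervise(["bin/logstash", "agent", "-f", "logstash.conf"])`, where the command line and config path are assumptions to adapt to the local installation. In practice an existing process manager usually fills this role.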
Summary
Only the case where the input is file was tested here. In general, logstash does not deliver fewer messages than it reads, but it may deliver more, in the form of duplicates.