DockOne WeChat Share (72): The practice of log system integration in a Kubernetes container cluster

Tags: Solr, Kibana, Logstash, Fluentd, GlusterFS, OpenStack, Swift
"Editor's note" This course will share how to use Fluentd, ElasticSearch, Kibana to build a highly available and scalable log system in a kubernetes cluster to collect, process, analyze, and present container logs in kubernetes clusters.

Kubernetes is a container orchestration and management system with native support for load balancing, service discovery, high availability, rolling upgrades, automatic scaling, and other capabilities a container cloud platform needs. Today I will share our practical solution for log management in Kubernetes clusters. Besides Docker and Kubernetes, the main technologies involved are Fluentd, Elasticsearch, Kibana, and Swift.

Fig00: Technologies involved in the Kubernetes log system

Criteria for evaluating a container cloud platform log system:
    1. Scalable: supports cluster-scale growth
    2. Low overhead: uses as few system resources as possible
    3. Low intrusion: requires as few changes as possible to application containers and the cloud platform itself
    4. Centralized: logs distributed across host nodes can be analyzed and queried in one place
    5. Easy deployment: easy to deploy automatically to a distributed cluster
    6. Easy customization: easy to handle different log formats and to connect different storage backends
    7. Timeliness: logs can be viewed and analyzed shortly after they are generated
    8. Active community: eases future maintenance, updates, and functional extension


Introduction to Fluentd

Fluentd is a real-time log collection system that uses JSON as the intermediate format for log processing. Its flexible plug-in mechanism supports a wide variety of log inputs and outputs, as well as mechanisms for parsing, buffering, filtering, and formatting logs.

Architecture of Fluentd

Fluentd uses JSON as the intermediate format for data processing, and its plug-in architecture extends support for different applications or systems as log sources and log outputs. Suppose there are M input sources (WordPress, MySQL, Tomcat, ...) and N outputs (MySQL, MongoDB, Elasticsearch, ...): the number of code modules needed to connect them drops from M×N to M+N. In the Kubernetes cluster we collect logs mainly with the image https://hub.docker.com/r/fabri ... etes/ and the plugin https://github.com/fabric8io/d ... netes.

Fig01: Fluentd architecture


Fig02: Fluentd functions

Features of Fluentd:
    1. Uses JSON as the unified intermediate log format
    2. Ruby-based log collection tool with a smaller footprint than the JRuby-based Logstash
    3. Supported input sources and outputs are roughly equivalent to Logstash's
    4. Performance-critical parts are written in C, so it is fast
    5. Plug-in extension points: Input, Parser, Filter, Output, Formatter, and Buffer (see the configuration sketch below)
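
To make these plugin roles concrete, here is a minimal sketch of a Fluentd configuration that tails Docker's per-container JSON log files and ships them to Elasticsearch; the paths, tag, and host name are illustrative assumptions, not the exact configuration of the image we use:

# Input + Parser: tail Docker's JSON log files, parsing each line as JSON
<source>
  @type tail
  path /var/lib/docker/containers/*/*-json.log
  pos_file /var/log/fluentd-docker.pos
  tag docker.*
  format json
</source>

# Output + Buffer: batch records and send them to Elasticsearch
<match docker.**>
  @type elasticsearch
  host elasticsearch-logging
  port 9200
  logstash_format true     # write to daily logstash-YYYY.MM.DD indices
  buffer_chunk_limit 2M
  flush_interval 5s
</match>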


Fluentd deployment architecture in a Kubernetes cluster

There are two ways to run this fluentd-elasticsearch pod on each node: 1. place it under /etc/kubernetes/manifests and have it started automatically by script; 2. start it with a DaemonSet. Both approaches ensure that one Fluentd pod runs on every Kubernetes cluster node to collect logs. The Kubernetes DaemonSet API object is designed exactly for this kind of pod.

Fig03: Fluentd deployment architecture in a Kubernetes cluster

Deployment example in Kubernetes YAML

When deploying Fluentd in a Kubernetes cluster, a YAML file similar to the following deploys a pod using the fluentd-elasticsearch Docker image to every Kubernetes node.

Fig04: YAML for deploying Fluentd in a Kubernetes cluster
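
Since Fig04 is only available as an image, here is a minimal sketch of such a manifest using the DaemonSet approach described above (the API version matches 2016-era Kubernetes; the image tag, environment variable name, and mount paths are assumptions):

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
spec:
  template:
    metadata:
      labels:
        app: fluentd-elasticsearch
    spec:
      containers:
      - name: fluentd-elasticsearch
        image: fabric8/fluentd-kubernetes:v1.9   # tag is an assumption
        env:
        - name: ELASTICSEARCH_HOST               # variable name is an assumption
          value: "elasticsearch"
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers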
Runtime status of the Fluentd pods:

Fig05: Fluentd running state in the Kubernetes cluster

Reasons to choose Fluentd:
    • Low overhead: the core is written in C and plugins in Ruby, with no JDK to package
    • Low intrusion: deployed as pods, without interfering with application containers or host services
    • Easy deployment: shipped as a container image and deployed as a single-container pod
    • Easy customization: plugins are easy to add or change for your applications


Elasticsearch

Elasticsearch is a real-time, distributed search and analytics engine. It can be used for document storage, full-text search, structured search, and real-time analytics; in common Internet applications, the most typical scenarios are log analysis/query and full-text search.

Architecture of Elasticsearch

In Elasticsearch there are three types of nodes. The first is the data node, which stores the index data; Elasticsearch can keep multiple copies of the data to provide high availability. The second is the client node, or query node, which provides load balancing for queries. The third is the master-eligible node, or index node, which can be elected master; the master node controls the state of the whole Elasticsearch cluster.

Fig06: Architecture of Elasticsearch

Elasticsearch deployment architecture in Kubernetes

In Kubernetes, all three node types are pods built from the same image, elasticsearch-cloud-kubernetes; they differ only in the role set through an environment variable at startup. The query and index nodes provide web services to the outside and need to be published as Kubernetes Services, as in the sketch below. The data nodes need persistent storage bound to them: we create the storage volumes as Kubernetes PVs, and each data node binds its data volume through a Kubernetes PVC. The actual storage behind a PV can be NFS, GlusterFS, or other shared storage.
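
A minimal sketch of such a Service for the query nodes (the name, labels, and selector are assumptions; 9200 is Elasticsearch's default HTTP port):

apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
spec:
  selector:
    component: elasticsearch   # pod labels are assumptions
    role: client
  ports:
  - port: 9200                 # Elasticsearch HTTP API
    targetPort: 9200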

Fig07: Elasticsearch deployment architecture in Kubernetes

Features of Elasticsearch

    • Search engine: a full-text search engine based on Apache Lucene; as an open-source search engine it has become more popular than Solr
    • Document database: can serve as a document database, storing large volumes of documents and log data
    • Real-time analytics and query: supports real-time analysis and querying over large data volumes
    • Fully distributed: storage capacity and query throughput scale with the number of nodes
    • High availability: automatically detects failed shards, rebalances shard data across nodes, and can back up cold data to object storage
    • Plug-in extensions: custom plugins can support different backup targets


Analogy between Elasticsearch and a traditional relational database

Many Elasticsearch concepts have counterparts in traditional databases; the comparison below sets Elasticsearch side by side with a traditional relational database.

Fig08: Comparison of Elasticsearch with a traditional database
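
Since Fig08 is only available as an image, the commonly cited analogy is roughly the following (for Elasticsearch versions that still have mapping types):

    Relational database    Elasticsearch
    Database               Index
    Table                  Type
    Row                    Document
    Column                 Field
    Schema                 Mapping
    SQL                    Query DSL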

Elasticsearch deployment in a Kubernetes cluster

To deploy Elasticsearch in a Kubernetes cluster, we deploy three kinds of nodes as shown in the figure: the es-data nodes hold the index data, the es-master nodes provide the index write service, and the es-client nodes provide the index query service.

Fig09: Elasticsearch deployment in a Kubernetes cluster

Persistent storage of Elasticsearch data in a Kubernetes cluster

When deploying the es-data nodes, their data volumes are mounted from shared storage; here the PV/PVC mechanism mounts an NFS-backed PV as the data volume, as shown in the figure, with a minimal sketch following it.

Fig10: Persistent storage of Elasticsearch data in a Kubernetes cluster
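
A minimal sketch of such an NFS-backed PV and the PVC an es-data pod would claim (server address, export path, capacity, and names are assumptions):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-data-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteOnce
  nfs:                      # the backing storage could equally be GlusterFS etc.
    server: 10.0.0.10       # NFS server address (assumption)
    path: /exports/es-data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: es-data-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi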

Backup and restore of logs

Elasticsearch supports backup and restore of a single index or of an entire cluster. The supported snapshot repository types include:

S3 repository: use AWS S3 as the backup repository

Installation command:
sudo bin/elasticsearch-plugin install repository-s3

Example of creating a repository:
PUT _snapshot/my-s3-repository-1
{
  "type": "s3",
  "settings": {
    "bucket": "s3_repository_1",
    "region": "us-west"
  }
}

Azure repository: use Azure as the backup repository

Installation command:
sudo bin/elasticsearch-plugin install repository-azure

Example of creating a repository:
PUT _snapshot/my-azure-repository-1
{
  "type": "azure",
  "settings": {
    "container": "backup-container",
    "base_path": "backups",
    "chunk_size": "32m",
    "compress": true
  }
}

HDFS repository: use HDFS as the backup repository

Installation command:
sudo bin/elasticsearch-plugin install repository-hdfs

Example of creating a repository:
PUT _snapshot/my-hdfs-repository-1
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode:8020/",
    "path": "elasticsearch/repositories/my_hdfs_repository",
    "conf.dfs.client.read.shortcircuit": "true"
  }
}

GCS repository: use Google Cloud Storage as the backup repository

Installation command:
sudo bin/elasticsearch-plugin install repository-gcs

Example of creating a repository:
PUT _snapshot/my-gcs-repository-1
{
  "type": "gcs",
  "settings": {
    "bucket": "my_bucket",
    "service_account": "_default_"
  }
}

In a private cloud environment, most OpenStack-based IaaS layers can use OpenStack's object store, Swift, as the backup target.

Swift repository: use OpenStack Swift as the backup repository

Installation command:
sudo bin/elasticsearch-plugin install org.wikimedia.elasticsearch.swift/swift-repository-plugin/2.1.1
Example of creating a repository:
PUT _snapshot/my-swift-repository-1
{
  "type": "swift",
  "settings": {
    "swift_url": "http://localhost:8080/auth/v1.0/",
    "swift_container": "my-container",
    "swift_username": "myuser",
    "swift_password": "mypass!"
  }
}
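
Once a repository is registered, snapshots are created and restored through the same REST API. A minimal sketch (the snapshot and index names are assumptions):

Create a snapshot of one daily index:
PUT _snapshot/my-swift-repository-1/snapshot_2016.07.28?wait_for_completion=true
{
  "indices": "logstash-2016.07.28",
  "include_global_state": false
}

Restore it later:
POST _snapshot/my-swift-repository-1/snapshot_2016.07.28/_restore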

Reasons to choose Elasticsearch:
    • Easy to scale: storage capacity and indexing capability grow as nodes are added
    • Centralized: data from all pods and containers is gathered in one place for easy querying and analysis
    • Easy deployment: shipped as a container image and deployed as single-container pods
    • Easy customization: plugins are easy to add or change for your applications
    • Timeliness: logs can be viewed and analyzed shortly after they are generated
    • Active community: the Elasticsearch community is increasingly active and has caught up with Solr's


Kibana

Kibana deployment in Kubernetes

Integrating Kibana with Elasticsearch is straightforward: using the image https://hub.docker.com/r/fabric8/kibana4/, you only need to set the ELASTICSEARCH_URL parameter. Kibana is deployed in the Kubernetes cluster as a web front-end service, and it reads the ELASTICSEARCH_URL environment variable to locate the Elasticsearch service.

Fig11: Deploying Kibana in the Kubernetes cluster
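
A minimal sketch of such a deployment as a 2016-era Replication Controller (the Elasticsearch service name inside ELASTICSEARCH_URL is an assumption; 5601 is Kibana's default port):

apiVersion: v1
kind: ReplicationController
metadata:
  name: kibana
spec:
  replicas: 1
  selector:
    app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: fabric8/kibana4
        env:
        - name: ELASTICSEARCH_URL
          value: "http://elasticsearch:9200"   # points at the ES client Service (assumption)
        ports:
        - containerPort: 5601                   # Kibana web UI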

Architecture of the overall log management system

A Fluentd container runs on each node of the Kubernetes cluster and sends the collected container logs to the Elasticsearch cluster. The Elasticsearch cluster keeps one week of logs as hot data for real-time analysis and queries; through Kibana, users can view data for any node, namespace, service, pod, or container. Logs older than one week are automatically backed up by Elasticsearch to the corresponding bucket in the Swift object store.

Fig12: Architecture of the overall Kubernetes log management system

Q&A

Q: How are the logs of the Kubernetes hosts collected?

A: Similarly to container logs; in fact, container logs are also collected from log directories on the host.

Q: What should I pay attention to when feeding machine logs into the system, especially logs from mobile devices? There are currently two options: (1) have the app send logs to Fluentd, or (2) use syslog. I personally feel the second is more appropriate; are there better options?

A: Sorry, we focus mainly on server-side logs and have not studied mobile device logs. If mobile device logs are uploaded to the server through a server API, they can be processed the same way. As we understand it, mobile apps usually have their own logging code that periodically compresses logs and sends them to the server or to a third-party mobile logging platform; on the server side they are then processed like normal application logs, but tagged with the device and user.

Q: Can the Elasticsearch backup cycle be scheduled? What if I want to keep one month of logs available for queries?

A: It can, but through your own scripts or crontab tasks. ES itself currently mainly offers REST APIs to create, delete, and restore snapshots from a registered repository.

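For example, a crontab-driven script along these lines could snapshot yesterday's daily index every night; the repository name, index naming scheme, and service address are all assumptions:

#!/bin/sh
# snapshot-yesterday.sh: snapshot yesterday's daily index.
# Run from crontab, e.g.:  0 1 * * * /usr/local/bin/snapshot-yesterday.sh
DAY=$(date -d yesterday +%Y.%m.%d)
curl -XPUT "http://es-client:9200/_snapshot/backup_repo/snap_$DAY" \
     -d "{\"indices\": \"logstash-$DAY\"}"
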
Q: Can Fluentd collect logs that a containerized application writes to a specified directory (not standard output)? How is that set up?

A: The application's log directory is inside the container, so Fluentd cannot collect it unless that directory is an externally mounted shared directory. On a node managed by a plain Docker engine, another container can access that data via --volumes-from, but only for volumes the container declared with -v, not arbitrary directories; and this does not help in a Kubernetes cluster.

Q: How do you set the Elasticsearch log retention policy: by deleting through the API, or inside Elasticsearch?

A: Our current approach is to create indices by time, and then a script deletes old indices periodically.

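For example, a daily cleanup could be as simple as the following sketch (the index naming scheme, retention period, and service address are assumptions):

#!/bin/sh
# Delete the daily index that has aged past the 30-day retention window.
OLD=$(date -d '30 days ago' +%Y.%m.%d)
curl -XDELETE "http://es-client:9200/logstash-$OLD"
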
Q: What kind of file systems do the data nodes' PVC/PV mounts use? What problems have you hit in practice?

A: We mainly use NFS and GlusterFS. The initial PV implementation was weak: a PVC could not be matched to a PV by label, only by size and access mode, so a particular PV could not be selected precisely. The latest Kubernetes now supports PVC selectors for choosing PVs with specific labels, as in the sketch below.

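A minimal sketch of a claim using such a selector (the label key and value are assumptions):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: es-data-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  selector:
    matchLabels:
      disktype: glusterfs   # only PVs labeled disktype=glusterfs will be bound
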
Q: How are the logs of the Kubernetes hosts themselves collected?

A: With the corresponding Fluentd plugins, by mounting the matching host log directories. Container logs are now also collected through the host.

Q: Logs include both the standard-output logs captured from containers and the logs an application writes to files. For these two cases, how do you use Fluentd to automatically discover and collect logs from newly started containers?

A: For logs written to files, we recommend in principle that the log directory be host-mounted or shared. Automatic discovery and collection works by having Fluentd plugins filter the files under the specified directories; for example, standard-output logs always appear under the host's /var/lib/docker/containers/container-id/. Our integrated Fluentd image is packaged and configured with the corresponding plugin, https://github.com/fabric8io/f ... ilter, which you can consult.

Q: We also use Fluentd to collect container logs, and collecting logs written to files inside containers is slower than collecting standard-output logs. What are your tuning methods?

A: File logs being slower than standard output is normal. Tuning Fluentd follows the same idea as any performance tuning: first locate where it is slow, then test methods to speed it up, accumulating experience step by step. The following link has some tuning suggestions: http://docs.fluentd.org/articl ... uning

Q: How do you handle cluster self-healing, for example when elasticsearch-master or elasticsearch-client hangs?

A: In the Kubernetes cluster, both elasticsearch-master and elasticsearch-client are started through a Replication Controller (or ReplicaSet), so the system automatically keeps the service highly available. Other clustered services use similar mechanisms. In general, a highly available web application needs a mechanism to restart failed services and a mechanism for service discovery and load balancing; in a Kubernetes cluster these are the Replication Controller and kube-proxy.

Q: You run a Fluentd container on each node in the Kubernetes cluster; does that mean on a container or on the node where Docker is deployed?

A: On the host node (which may be a physical or virtual machine), that is, the Kubernetes node where Docker is deployed. The officially recommended method is to deploy it through a Kubernetes DaemonSet. Alternatively, you can maintain an automatic startup script on each node for pods that every node must run.

Q: The Kubernetes master is a single point; have you optimized that? You are now on 1.3; are your platform upgrades hot deployments?

A: We use podmaster to deploy a highly available master across at least 3 nodes: the api-servers are all active, while controller-manager and scheduler run with one active and two standbys. Platform upgrades are currently manual and do not affect running services, but they still require careful manual operation by an engineer and are not yet automated.

Q: What is the difference between Fluentd and Flume other than the implementation language? And what shard count is recommended for the daily ES index?

A: The design concepts of Fluentd and Flume are similar; one is based on CRuby and the other on the JVM, much like the split between Fluentd and Logstash. The Fluentd image is smaller and consumes less memory at runtime, while Flume's image runs to a few hundred megabytes because it has to pack in a JDK. In a Linux container environment, Java's cross-platform nature brings no particular advantage, so Fluentd's small image is a clear benefit. Another point of great significance to our practice is that the Fluentd-Kubernetes integration scheme is simple and clear. ES defaults to 5 shards; in general the right number should be measured against actual usage, and it can differ per index. Try not to set it to 1: with a single shard, adding shards later means moving too much data. Before you have solid production experience, 2 to 5 is a good range.

The above content was organized from the group share on the night of July 28, 2016. The speaker, Wang Xin, holds a master's degree in communications from Tsinghua University and is an (ISC)2-certified information systems security professional (CISSP), a senior architect, and a software development expert. Formerly a senior R&D manager at VMware and IBM and a senior engineer at BEA and Lucent, he has worked for more than 10 years on enterprise middleware, PaaS platforms, SDN, and other products, and holds more than 10 granted and pending invention patents in the US and China. He is now chief architect at Light Element. DockOne organizes weekly technical shares; to join the group, add WeChat ID liyingjiesz, and feel free to leave a message with topics you would like to hear or share.