Architecture, distributed, log queue: the title sounds like a bluff, but this is really just a log collection feature, with Kafka added in the middle as a message queue.
Kafka Introduction
Kafka is an open-source stream-processing platform developed by the Apache Software Foundation, written in Scala and Java. Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all the activity stream data of a consumer-scale website. This kind of activity (page views, searches, and other user actions) is a key ingredient of many social features on the modern web. Because of the throughput requirements, this data is usually handled through logging and log aggregation.
Characteristics
Kafka is a high-throughput distributed publish-subscribe messaging system with the following features:
- Provides message persistence through an O(1) disk data structure that maintains stable performance even with terabytes of stored messages.
- High throughput: even on very ordinary hardware, Kafka can support millions of messages per second.
- Supports partitioning messages across Kafka servers and distributing consumption across a cluster of consumer machines.
- Supports parallel data loading into Hadoop.
Key Features
- Publish and subscribe to streams of messages; in this respect Kafka resembles a message queue, which is why it is categorized as a message queuing framework.
- Record streams of messages in a fault-tolerant way; Kafka persists message streams as files.
- Messages can be processed as they are published.
Usage Scenarios
Message transfer process
Introduction to related terms
- Broker
A Kafka cluster contains one or more servers, each of which is called a broker.
- Topic
Every message published to the Kafka cluster belongs to a category called a topic. (Physically, messages of different topics are stored separately; logically, a topic's messages are stored on one or more brokers, but producers and consumers only need to specify the topic, without caring where the data is physically stored.)
- Partition
Partition is a physical concept; each topic contains one or more partitions.
- Producer
The message producer, the client that publishes messages to the Kafka broker.
- Consumer
The message consumer, the client that reads messages from the Kafka broker.
- Consumer Group
Each consumer belongs to a specific consumer group (a group name can be specified for each consumer; if none is specified, the consumer belongs to the default group).
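To make these terms concrete, here is a minimal, self-contained sketch using the plain Kafka Java client (the 0.10.x API that matches the version installed below). The broker address, topic name, key, and group id are placeholders, not values from the project.

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KafkaTermsDemo {

    public static void main(String[] args) {
        // Producer: publishes one keyed message to the "demo-topic" topic on the broker
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "192.168.1.170:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, String> producer = new KafkaProducer<>(producerProps);
        producer.send(new ProducerRecord<>("demo-topic", "demo-key", "hello kafka"));
        producer.close();

        // Consumer: joins the consumer group "demo-group" and reads from the same topic
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "192.168.1.170:9092");
        consumerProps.put("group.id", "demo-group");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("auto.offset.reset", "earliest");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(Collections.singletonList("demo-topic"));
            // a single poll may return nothing until the group has finished rebalancing
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                // partition() and offset() show where in the topic the message landed
                System.out.printf("partition=%d offset=%d key=%s value=%s%n",
                        record.partition(), record.offset(), record.key(), record.value());
            }
        }
    }
}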
Kafka Installation Environment
Linux, JDK, Zookeeper
Download Binary Program
wget https://archive.apache.org/dist/kafka/0.10.0.1/kafka_2.11-0.10.0.1.tgz
Installation
tar -zxvf kafka_2.11-0.10.0.1.tgz
cd kafka_2.11-0.10.0.1
Catalogue description
bin     start, stop, and other scripts
config  configuration files
libs    libraries
Parameter description
############################ Parameter description ############################
broker.id=0                        # unique ID of this machine in the cluster, similar in nature to ZooKeeper's myid
port=9092                          # port on which this Kafka instance serves clients; the default is 9092
host.name=192.168.1.170            # this parameter is disabled (commented out) by default
num.network.threads=3              # number of threads the broker uses for network processing
num.io.threads=8                   # number of threads the broker uses for disk I/O
log.dirs=/opt/kafka/kafkalogs/     # directory where messages are stored; several directories may be given, separated by commas. num.io.threads should not be smaller than the number of directories; when a new topic is created, its messages are persisted to whichever of the comma-separated directories currently holds the fewest partitions
socket.send.buffer.bytes=102400    # send buffer size; data is first collected in the buffer and sent once it reaches a certain size, which improves performance
socket.receive.buffer.bytes=102400 # Kafka receive buffer size; data is flushed to disk once it reaches a certain size
socket.request.max.bytes=104857600 # maximum size of a request sent to or fetched from Kafka; it must not exceed the JVM heap size
num.partitions=1                   # default number of partitions; a topic gets 1 partition by default
log.retention.hours=168            # default maximum retention time for messages: 168 hours, i.e. 7 days
message.max.byte=5242880           # maximum message size: 5 MB
default.replication.factor=2       # number of replicas Kafka keeps for each message; if one replica fails, another can continue serving
replica.fetch.max.bytes=5242880    # maximum number of bytes fetched per message
log.segment.bytes=1073741824       # Kafka appends messages to segment files; when this size is exceeded, a new file is started
log.retention.check.interval.ms=300000  # every 300000 ms, check the retention time configured above (log.retention.hours=168) and delete any expired messages found in the directories
log.cleaner.enable=false           # whether to enable log compaction; usually disabled, but enabling it can improve performance
zookeeper.connect=192.168.1.180:12181,192.168.1.181:12181,192.168.1.182:12181  # ZooKeeper connection addresses; a single non-cluster address also works
############################ Parameter description ############################
Start Kafka
Before starting Kafka, start the corresponding ZooKeeper cluster; install it yourself, it is not covered here.
# enter Kafka's bin directory
./kafka-server-start.sh -daemon ../config/server.properties
Kafka Integrated Environment
Spring Boot, Elasticsearch, Kafka
Add the dependency to pom.xml:
<!-- kafka message queue -->
<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
    <version>1.1.1.RELEASE</version>
</dependency>
Producers
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafka;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.core.ProducerFactory;

/**
 * Producer configuration
 * Creator: 科帮网
 * Created: February 4, 2018
 */
@Configuration
@EnableKafka
public class KafkaProducerConfig {

    @Value("${kafka.producer.servers}")
    private String servers;
    @Value("${kafka.producer.retries}")
    private int retries;
    @Value("${kafka.producer.batch.size}")
    private int batchSize;
    @Value("${kafka.producer.linger}")
    private int linger;
    @Value("${kafka.producer.buffer.memory}")
    private int bufferMemory;

    public Map<String, Object> producerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, servers);
        props.put(ProducerConfig.RETRIES_CONFIG, retries);
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, batchSize);
        props.put(ProducerConfig.LINGER_MS_CONFIG, linger);
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, bufferMemory);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        return props;
    }

    public ProducerFactory<String, String> producerFactory() {
        return new DefaultKafkaProducerFactory<>(producerConfigs());
    }

    @Bean
    public KafkaTemplate<String, String> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }
}
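The @Value placeholders above are resolved from the Spring Boot configuration. A sketch of the matching application.properties entries follows; the values shown are illustrative assumptions, not taken from the original project.

kafka.producer.servers=192.168.1.170:9092
kafka.producer.retries=0
kafka.producer.batch.size=4096
kafka.producer.linger=1
kafka.producer.buffer.memory=40960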
Consumers
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafka;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.config.KafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.listener.ConcurrentMessageListenerContainer;

/**
 * Consumer configuration
 * Creator: 科帮网
 * Created: February 4, 2018
 */
@Configuration
@EnableKafka
public class KafkaConsumerConfig {

    @Value("${kafka.consumer.servers}")
    private String servers;
    @Value("${kafka.consumer.enable.auto.commit}")
    private boolean enableAutoCommit;
    @Value("${kafka.consumer.session.timeout}")
    private String sessionTimeout;
    @Value("${kafka.consumer.auto.commit.interval}")
    private String autoCommitInterval;
    @Value("${kafka.consumer.group.id}")
    private String groupId;
    @Value("${kafka.consumer.auto.offset.reset}")
    private String autoOffsetReset;
    @Value("${kafka.consumer.concurrency}")
    private int concurrency;

    @Bean
    public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.setConcurrency(concurrency);
        factory.getContainerProperties().setPollTimeout(1500);
        return factory;
    }

    public ConsumerFactory<String, String> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(consumerConfigs());
    }

    public Map<String, Object> consumerConfigs() {
        Map<String, Object> propsMap = new HashMap<>();
        propsMap.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, servers);
        propsMap.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, enableAutoCommit);
        propsMap.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, autoCommitInterval);
        propsMap.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, sessionTimeout);
        propsMap.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        propsMap.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        propsMap.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
        propsMap.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, autoOffsetReset);
        return propsMap;
    }

    @Bean
    public Listener listener() {
        return new Listener();
    }
}
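Like the producer, the consumer reads its settings from application.properties. An illustrative sketch of those entries (the values, including the group id, are assumptions):

kafka.consumer.servers=192.168.1.170:9092
kafka.consumer.enable.auto.commit=true
kafka.consumer.session.timeout=6000
kafka.consumer.auto.commit.interval=100
kafka.consumer.group.id=itstyle
kafka.consumer.auto.offset.reset=latest
kafka.consumer.concurrency=10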
Log monitoring
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

import com.itstyle.es.common.utils.JsonMapper;
import com.itstyle.es.log.entity.SysLogs;
import com.itstyle.es.log.repository.ElasticLogRepository;

/**
 * Log listener
 * Creator: 科帮网
 * Created: February 4, 2018
 */
@Component
public class Listener {

    protected final Logger logger = LoggerFactory.getLogger(this.getClass());

    @Autowired
    private ElasticLogRepository elasticLogRepository;

    @KafkaListener(topics = {"itstyle"})
    public void listen(ConsumerRecord<?, ?> record) {
        logger.info("Kafka key: " + record.key());
        logger.info("Kafka value: " + record.value());
        if (record.key().equals("itstyle_log")) {
            try {
                SysLogs log = JsonMapper.fromJsonString(record.value().toString(), SysLogs.class);
                logger.info("Kafka save log: " + log.getUsername());
                elasticLogRepository.save(log);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
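The listener depends on the SysLogs entity and ElasticLogRepository from the sample project, whose sources are not reproduced in this article. Assuming Spring Data Elasticsearch, the repository would look roughly like this (a sketch, not the project's actual code):

import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

// Assumed sketch: a Spring Data Elasticsearch repository for the SysLogs entity.
// It inherits save(), which the listener above calls to index each log into Elasticsearch.
public interface ElasticLogRepository extends ElasticsearchRepository<SysLogs, Long> {
}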
Test log Transfer
/**
 * Kafka log queue test endpoint
 */
@GetMapping(value = "kafkaLog")
public @ResponseBody String kafkaLog() {
    SysLogs log = new SysLogs();
    log.setUsername("红薯");
    log.setOperation("开源中国社区");
    log.setMethod("com.itstyle.es.log.controller.kafkaLog()");
    log.setIp("192.168.1.80");
    log.setGmtCreate(new Timestamp(new Date().getTime()));
    log.setExceptionDetail("开源中国社区");
    log.setParams("{'name':'码云','type':'开源'}");
    log.setDeviceType((short) 1);
    log.setPlatFrom((short) 1);
    log.setLogType((short) 1);
    log.setDeviceType((short) 1);
    log.setId((long) 200000);
    log.setUserId((long) 1);
    log.setTime((long) 1);
    // simulate the log queue: serialize the log and send it to the "itstyle" topic
    String json = JsonMapper.toJsonString(log);
    kafkaTemplate.send("itstyle", "itstyle_log", json);
    return "success";
}
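The kafkaTemplate used in this endpoint is the bean declared in KafkaProducerConfig; the controller is assumed to inject it along these lines (a sketch, not the project's exact source):

@Autowired
private KafkaTemplate<String, String> kafkaTemplate;

Calling GET /kafkaLog then serializes the SysLogs object to JSON and sends it to the itstyle topic with key itstyle_log; the listener above receives it and writes it to Elasticsearch.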
Kafka and Redis
In an earlier article, the Redis distributed log queue for this JavaWeb project architecture was briefly introduced, and some readers pointed out that Redis Pub/Sub guarantees neither reliability nor persistence. Of course, that project only carried logs, which are not particularly important information, so a certain amount of loss was acceptable.
The biggest difference between Kafka and Redis Pub/Sub is that Kafka is a complete distributed publish-subscribe messaging system, while Redis Pub/Sub is just one feature of Redis.
Usage Scenarios
- Redis pub/sub
Message persistence requirements are low, throughput requirements are low, and data loss can be tolerated
- Kafka
High availability, high throughput, durability, and diverse consumption and processing models
Open source project source code (reference): https://gitee.com/52itstyle/spring-boot-elasticsearch