JavaWeb Project Architecture: Kafka Distributed Log Queue

Source: Internet
Author: User
Tags: message queue, serialization, zookeeper

Architecture, distributed, log queue: the title alone sounds impressive, but underneath it is simply a log collection feature, with Kafka inserted in the middle as a message queue.

Kafka Introduction

Kafka is an open-source stream-processing platform developed by the Apache Software Foundation, written in Scala and Java. It is a high-throughput distributed publish-subscribe messaging system that can handle all the action-stream data of a consumer-scale website. These actions (page views, searches, and other user activity) are a key ingredient of many social features on the modern web. Because of the throughput requirements, this kind of data is usually handled through log processing and log aggregation.

Characteristics

Kafka is a high-throughput distributed publish-subscribe messaging system with the following features:

    • Provides message persistence through an O(1) disk data structure that maintains stable performance even with terabytes of stored messages.
    • High throughput: even on very ordinary hardware, Kafka can support millions of messages per second.
    • Supports partitioning messages across Kafka servers, with consumption distributed over a cluster of consumer machines.
    • Supports parallel data loading into Hadoop.
Key Features
    • Publish and subscribe to message streams, much like a message queue; this is why Kafka is classified as a message-queuing framework
    • Record message streams in a fault-tolerant manner; Kafka stores message streams as files
    • Messages can be processed as they are published
Usage Scenarios
    • Building reliable pipelines between systems or applications for transmitting real-time data: the message-queue role
    • Building real-time stream-processing applications that transform or process data flows: the data-processing role (see the sketch after this list)
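
As an illustration of the second scenario, here is a minimal Kafka Streams sketch; the Streams API ships with Kafka from 0.10 onward. The topic names raw-logs and clean-logs and the uppercase transform are made up for this example, and the broker address is the one used later in this article.

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class LogTransform {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "log-transform");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.1.170:9092");
        props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        // Consume from one topic, transform each record, publish to another topic
        KStreamBuilder builder = new KStreamBuilder();
        KStream<String, String> source = builder.stream("raw-logs");
        source.mapValues(value -> value.toUpperCase()).to("clean-logs");

        new KafkaStreams(builder, props).start();
    }
}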
Message transfer process

Introduction to related terms
    • Broker
      A Kafka cluster consists of one or more servers, each of which is called a broker
    • Topic
      Every message published to a Kafka cluster belongs to a category called a topic. (Physically, messages of different topics are stored separately; logically, a topic's messages may live on one or more brokers, but a user only needs to specify the topic to produce or consume data, without caring where it is stored)
    • Partition
      A physical concept; each topic contains one or more partitions
    • Producer
      Responsible for publishing messages to a Kafka broker
    • Consumer
      The message consumer: a client that reads messages from a Kafka broker
    • Consumer Group
      Every consumer belongs to a specific consumer group (a group name can be specified for each consumer; consumers without one fall into the default group)
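
To make these terms concrete, here is a minimal sketch using the plain Kafka client API rather than the Spring integration shown later. The broker address and the topic name itstyle match this article's configuration; the group name log-group is invented for the example.

import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TermsDemo {
    public static void main(String[] args) {
        // Producer: publishes messages to a topic on a broker; messages with
        // the same key always land in the same partition of that topic
        Properties p = new Properties();
        p.put("bootstrap.servers", "192.168.1.170:9092");
        p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        p.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
            producer.send(new ProducerRecord<>("itstyle", "itstyle_log", "hello kafka"));
        }

        // Consumer: reads messages from the broker; consumers sharing a group.id
        // divide the topic's partitions among themselves
        Properties c = new Properties();
        c.put("bootstrap.servers", "192.168.1.170:9092");
        c.put("group.id", "log-group");
        c.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        c.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
            consumer.subscribe(Arrays.asList("itstyle"));
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                System.out.println(record.partition() + " " + record.key() + " = " + record.value());
            }
        }
    }
}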
Kafka Installation Environment

Linux, JDK, ZooKeeper

Download the binary release
wget https://archive.apache.org/dist/kafka/0.10.0.1/kafka_2.11-0.10.0.1.tgz
Installation
tar -zxvf kafka_2.11-0.10.0.1.tgz
cd kafka_2.11-0.10.0.1
Directory layout
bin      Startup, shutdown, and other scripts
config   Configuration files
libs     Libraries
Parameter description
############################# Parameter description #############################
# Unique ID of this machine in the cluster, similar in nature to ZooKeeper's myid
broker.id=0
# Port on which Kafka serves clients; the default is 9092
port=9092
# Bound host name/IP; this parameter is commented out by default
host.name=192.168.1.170
# Number of threads the broker uses for network processing
num.network.threads=3
# Number of threads the broker uses for disk I/O
num.io.threads=8
# Directory where messages are stored; may be a comma-separated list of paths.
# num.io.threads should not be smaller than the number of directories; with
# multiple directories, a newly created topic persists its messages in the
# directory that currently holds the fewest partitions
log.dirs=/opt/kafka/kafkalogs/
# Send buffer size: data is buffered and sent once the buffer reaches a certain
# size, which improves performance
socket.send.buffer.bytes=102400
# Receive buffer size: received data is flushed to disk once it reaches a threshold
socket.receive.buffer.bytes=102400
# Maximum size of a request sent to or received from Kafka; must not exceed the JVM heap size
socket.request.max.bytes=104857600
# Default number of partitions; a topic defaults to 1 partition
num.partitions=1
# Default maximum retention time for messages: 168 hours, i.e. 7 days
log.retention.hours=168
# Maximum size of a stored message: 5 MB
message.max.byte=5242880
# Number of replicas Kafka keeps per message; if one replica fails, another can keep serving
default.replication.factor=2
# Maximum number of bytes fetched per replication request
replica.fetch.max.bytes=5242880
# Kafka appends messages to segment files; when a file exceeds this size, a new one is rolled
log.segment.bytes=1073741824
# Every 300000 ms, check whether segments exceed the retention time configured
# above (log.retention.hours=168) and delete any expired ones
log.retention.check.interval.ms=300000
# Whether to enable log compaction; generally left off, but enabling it can improve performance
log.cleaner.enable=false
# ZooKeeper connection string; a single non-cluster address works as well
zookeeper.connect=192.168.1.180:12181,192.168.1.181:12181,192.168.1.182:12181
############################# End of parameter description #############################
Start Kafka

Before starting Kafka, start the corresponding ZooKeeper cluster first; install it yourself, as it is not explained here.

# Enter the Kafka bin directory
./kafka-server-start.sh -daemon ../config/server.properties
Kafka Integrated Environment

Spring Boot, Elasticsearch, Kafka

Add the dependency to pom.xml:
<!-- Kafka message queue -->
<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
    <version>1.1.1.RELEASE</version>
</dependency>
Producers
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafka;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.core.ProducerFactory;

/**
 * Producer configuration
 * Creator: 科帮网
 * Created: February 4, 2018
 */
@Configuration
@EnableKafka
public class KafkaProducerConfig {

    @Value("${kafka.producer.servers}")
    private String servers;
    @Value("${kafka.producer.retries}")
    private int retries;
    @Value("${kafka.producer.batch.size}")
    private int batchSize;
    @Value("${kafka.producer.linger}")
    private int linger;
    @Value("${kafka.producer.buffer.memory}")
    private int bufferMemory;

    public Map<String, Object> producerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, servers);
        props.put(ProducerConfig.RETRIES_CONFIG, retries);
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, batchSize);
        props.put(ProducerConfig.LINGER_MS_CONFIG, linger);
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, bufferMemory);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        return props;
    }

    public ProducerFactory<String, String> producerFactory() {
        return new DefaultKafkaProducerFactory<>(producerConfigs());
    }

    @Bean
    public KafkaTemplate<String, String> kafkaTemplate() {
        return new KafkaTemplate<String, String>(producerFactory());
    }
}
Consumers
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafka;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.config.KafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.listener.ConcurrentMessageListenerContainer;

/**
 * Consumer configuration
 * Creator: 科帮网
 * Created: February 4, 2018
 */
@Configuration
@EnableKafka
public class KafkaConsumerConfig {

    @Value("${kafka.consumer.servers}")
    private String servers;
    @Value("${kafka.consumer.enable.auto.commit}")
    private boolean enableAutoCommit;
    @Value("${kafka.consumer.session.timeout}")
    private String sessionTimeout;
    @Value("${kafka.consumer.auto.commit.interval}")
    private String autoCommitInterval;
    @Value("${kafka.consumer.group.id}")
    private String groupId;
    @Value("${kafka.consumer.auto.offset.reset}")
    private String autoOffsetReset;
    @Value("${kafka.consumer.concurrency}")
    private int concurrency;

    @Bean
    public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.setConcurrency(concurrency);
        factory.getContainerProperties().setPollTimeout(1500);
        return factory;
    }

    public ConsumerFactory<String, String> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(consumerConfigs());
    }

    public Map<String, Object> consumerConfigs() {
        Map<String, Object> propsMap = new HashMap<>();
        propsMap.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, servers);
        propsMap.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, enableAutoCommit);
        propsMap.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, autoCommitInterval);
        propsMap.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, sessionTimeout);
        propsMap.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        propsMap.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        propsMap.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
        propsMap.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, autoOffsetReset);
        return propsMap;
    }

    @Bean
    public Listener listener() {
        return new Listener();
    }
}
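
Both configuration classes resolve their settings from application.properties through the @Value placeholders above. The keys below are exactly those placeholders; the values are illustrative defaults rather than values taken from the original project.

kafka.producer.servers=192.168.1.170:9092
kafka.producer.retries=0
kafka.producer.batch.size=4096
kafka.producer.linger=1
kafka.producer.buffer.memory=40960

kafka.consumer.servers=192.168.1.170:9092
kafka.consumer.enable.auto.commit=true
kafka.consumer.session.timeout=20000
kafka.consumer.auto.commit.interval=100
kafka.consumer.group.id=itstyle
kafka.consumer.auto.offset.reset=latest
kafka.consumer.concurrency=10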
Log monitoring
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

import com.itstyle.es.common.utils.JsonMapper;
import com.itstyle.es.log.entity.SysLogs;
import com.itstyle.es.log.repository.ElasticLogRepository;

/**
 * Scan listener
 * Creator: 科帮网
 * Created: February 4, 2018
 */
@Component
public class Listener {

    protected final Logger logger = LoggerFactory.getLogger(this.getClass());

    @Autowired
    private ElasticLogRepository elasticLogRepository;

    @KafkaListener(topics = {"itstyle"})
    public void listen(ConsumerRecord<?, ?> record) {
        logger.info("kafka key: " + record.key());
        logger.info("kafka value: " + record.value());
        if (record.key().equals("itstyle_log")) {
            try {
                SysLogs log = JsonMapper.fromJsonString(record.value().toString(), SysLogs.class);
                logger.info("kafka save log: " + log.getUsername());
                elasticLogRepository.save(log);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
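
JsonMapper above is a utility class from the project (com.itstyle.es.common.utils) whose source is not shown in this article. A minimal equivalent built on Jackson, matching the two calls used here, might look like this:

package com.itstyle.es.common.utils;

import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonMapper {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Serialize any object to a JSON string
    public static String toJsonString(Object object) {
        try {
            return MAPPER.writeValueAsString(object);
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Deserialize a JSON string back into the given type
    public static <T> T fromJsonString(String json, Class<T> clazz) {
        try {
            return MAPPER.readValue(json, clazz);
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
    }
}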
Test Log Transfer
   /**
    * Kafka log queue test endpoint
    */
   @GetMapping(value = "kafkaLog")
   public @ResponseBody String kafkaLog() {
        SysLogs log = new SysLogs();
        log.setUsername("红薯");
        log.setOperation("开源中国社区");
        log.setMethod("com.itstyle.es.log.controller.kafkaLog()");
        log.setIp("192.168.1.80");
        log.setGmtCreate(new Timestamp(new Date().getTime()));
        log.setExceptionDetail("开源中国社区");
        log.setParams("{'name':'码云','type':'开源'}");
        log.setDeviceType((short) 1);
        log.setPlatFrom((short) 1);
        log.setLogType((short) 1);
        log.setId((long) 200000);
        log.setUserId((long) 1);
        log.setTime((long) 1);
        // Simulate the log queue
        String json = JsonMapper.toJsonString(log);
        kafkaTemplate.send("itstyle", "itstyle_log", json);
        return "success";
   }
Kafka and Redis

In a previous article, the Redis distributed log queue for the JavaWeb project architecture was briefly introduced, and readers pointed out that Redis Pub/Sub guarantees no reliability and does not persist messages. Of course, that project only carried logs, not critical information, so a certain amount of loss was acceptable.

The biggest difference between Kafka and Redis Pub/Sub is that Kafka is a complete distributed publish-subscribe messaging system, while Pub/Sub is just one feature of Redis.
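
The practical consequence is easy to demonstrate. In the sketch below (assuming a local Redis and the Jedis client, neither of which is part of this project), a publish is fire-and-forget: any message published while no subscriber is connected is simply gone, whereas Kafka retains messages on disk for the configured retention period (log.retention.hours above) so a consumer can catch up later.

import redis.clients.jedis.Jedis;

public class RedisPubDemo {
    public static void main(String[] args) {
        // Fire-and-forget: if no subscriber is listening on this channel right
        // now, Redis drops the message; there is no log to replay from
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            jedis.publish("itstyle", "log message");
        }
    }
}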

Usage Scenarios
    • Redis Pub/Sub
      Low requirements on message persistence and throughput; some data loss can be tolerated
    • Kafka
      High availability, high throughput, durable storage, and diverse consumption models
Open source project source code (reference): https://gitee.com/52itstyle/spring-boot-elasticsearch

