1. Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all the activity stream data of a consumer-scale website.
Step 1: Download the code
Download the 0.8.2.0 release and un-tar it.
> tar -xzf kafka_2.10-0.8.2.0.tgz
> cd kafka_2.10-0.8.2.0
Step 2: Start the server
Kafka uses ZooKeeper, so first start a ZooKeeper server:
> bin/zookeeper-server-start.sh config/zookeeper.properties
[2013-04-22 15:01:37,495] INFO Reading configuration from: config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
...
Then open a new terminal window and start the Kafka server:
> bin/kafka-server-start.sh config/server.properties
[2013-04-22 15:01:47,028] INFO Verifying properties (kafka.utils.VerifiableProperties)
[2013-04-22 15:01:47,051] INFO Property socket.send.buffer.bytes is overridden to 1048576 (kafka.utils.VerifiableProperties)
...
If nothing goes wrong, ZooKeeper normally listens on port 2181 and the Kafka server on port 9092. You cannot start the same server a second time, or a port-in-use error occurs. To check whether a port is occupied:
> lsof -i :9092
or
> netstat -anp | grep 9092
If the port is occupied, stop the process that holds it with kill -9 PID.
To test whether a port is reachable from another machine: telnet hostip port
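The port check above can also be scripted. The following is a minimal sketch, assuming bash (it relies on bash's /dev/tcp pseudo-device, which plain sh does not provide); port_in_use is a hypothetical helper name, not part of Kafka.

```shell
#!/usr/bin/env bash
# Sketch: report whether anything accepts TCP connections on a local port.
# port_in_use is a hypothetical helper; it needs bash for /dev/tcp.
port_in_use() {
  local port="$1"
  # Opening the pseudo-file attempts a TCP connect to 127.0.0.1:port.
  # The subshell keeps file descriptor 3 from leaking into the caller.
  (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null
}

# 2181 and 9092 are the ZooKeeper and Kafka ports used in this article.
for port in 2181 9092; do
  if port_in_use "$port"; then
    echo "port ${port} is in use"
  else
    echo "port ${port} is free"
  fi
done
```

Run it before starting the servers: if either port is reported as in use, find the owning process with lsof as shown above.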
Step 3: Create a topic
Let's create a topic named "test" with a single partition and only one replica:
> bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
We can now see that topic if we run the list topic command:
> bin/kafka-topics.sh --list --zookeeper localhost:2181
test
Alternatively, instead of manually creating topics you can also configure your brokers to auto-create topics when a non-existent topic is published to.
Step 4: Send some messages
Kafka comes with a command line client that will take input from a file or from standard input and send it out as messages to the Kafka cluster. By default each line will be sent as a separate message.
Run the producer and then type a few messages into the console to send to the server.
> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
This is a message
This is another message
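Since the producer reads standard input, a file can be sent by redirecting it, one message per line. A minimal sketch follows; messages.txt is a hypothetical file name chosen for illustration, and the producer call is guarded so the script does nothing harmful outside the Kafka directory.

```shell
#!/usr/bin/env bash
# Sketch: send each line of a file as one message via the console producer.
# messages.txt is a hypothetical file name used only for this illustration.
printf '%s\n' "This is a message" "This is another message" > messages.txt

# One line per message, so the line count is the number of messages.
count=$(wc -l < messages.txt | tr -d ' ')
echo "sending ${count} messages"

# Only call the producer when the script is run from the Kafka directory.
if [ -x bin/kafka-console-producer.sh ]; then
  bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test < messages.txt
else
  echo "bin/kafka-console-producer.sh not found; run this from the Kafka directory"
fi
```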
Step 5: Start a consumer
Kafka also has a command line consumer that will dump out messages to standard output.
> bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
This is a message
This is another message
If you have each of the above commands running in a different terminal then you should now be able to type messages into the producer terminal and see them appear in the consumer terminal.
All of the command line tools have additional options; running a command with no arguments will display usage information documenting them in more detail.
Step 6: Setting up a multi-broker cluster
So far we have been running against a single broker, but that's no fun. For Kafka, a single broker is just a cluster of size one, so nothing much changes other than starting a few more broker instances. But just to get a feel for it, let's expand our cluster to three nodes (still all on our local machine).
First we make a config file for each of the brokers:
> cp config/server.properties config/server-1.properties
> cp config/server.properties config/server-2.properties
Now edit these new files and set the following properties:
config/server-1.properties:
    broker.id=1
    port=9093
    log.dir=/tmp/kafka-logs-1
config/server-2.properties:
    broker.id=2
    port=9094
    log.dir=/tmp/kafka-logs-2
The broker.id property is the unique and permanent name of each node in the cluster. We have to override the port and log directory only because we are running these all on the same machine and we want to keep the brokers from all trying to register on the same port or overwrite each other's data.
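The copy-and-edit step can be scripted instead of done by hand. The sketch below is one possible approach, not the article's own method: make_broker_config is a hypothetical helper that removes any existing broker.id, port and log directory lines from the base file and appends the per-broker values. It also strips a log.dirs key, on the assumption that the base file might use that spelling.

```shell
#!/usr/bin/env bash
# make_broker_config is a hypothetical helper: derive a per-broker config
# from a base properties file by overriding broker.id, port and log.dir.
make_broker_config() {
  local base="$1" id="$2" port="$3" out="$4"
  # Drop existing values for the overridden keys (log.dir or log.dirs).
  grep -v -E '^(broker\.id|port|log\.dirs?)=' "$base" > "$out"
  # Append the per-broker values; the leading newline guards against a
  # base file that does not end with a line break.
  printf '\nbroker.id=%s\nport=%s\nlog.dir=/tmp/kafka-logs-%s\n' \
    "$id" "$port" "$id" >> "$out"
}

# Same values as in the article; skipped when not in the Kafka directory.
if [ -f config/server.properties ]; then
  make_broker_config config/server.properties 1 9093 config/server-1.properties
  make_broker_config config/server.properties 2 9094 config/server-2.properties
fi
```

Deriving the log directory from the broker id keeps the two values from drifting apart when more brokers are added.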
We already have ZooKeeper and our single node started, so we just need to start the two new nodes:
> bin/kafka-server-start.sh config/server-1.properties &
> bin/kafka-server-start.sh config/server-2.properties &
...
Now create a new topic with a replication factor of three:
> bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic
Okay, but now that we have a cluster, how can we know which broker is doing what? To see that, run the "describe topics" command:
> bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Topic:my-replicated-topic    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: my-replicated-topic    Partition: 0    Leader: 1    Replicas: 1,2,0    Isr: 1,2,0
Here is an explanation of the output. The first line gives a summary of all the partitions; each additional line gives information about one partition. Since we have only one partition for this topic there is only one such line.
- "Leader" is the node responsible for all reads and writes for the given partition. Each node will be the leader for a randomly selected portion of the partitions.
- "Replicas" is the list of nodes that replicate the log for this partition, regardless of whether they are the leader or even whether they are currently alive.
- "ISR" is the set of "in-sync" replicas. This is the subset of the replicas list that is currently alive and caught up to the leader.
Note that in my example node 1 is the leader for the only partition of the topic.
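To pull a single field such as the leader out of the describe output in a script, the partition line can be split on whitespace. This is a rough sketch: partition_field is a hypothetical helper, and it assumes each label is followed by whitespace and then its value (for example "Leader: 1"), which may differ between Kafka versions.

```shell
#!/usr/bin/env bash
# partition_field is a hypothetical helper: read one partition line of
# "--describe" output on stdin and print the value after a given label.
partition_field() {
  local label="$1"
  awk -v want="${label}:" '{
    # Walk the whitespace-separated fields; the value follows its label.
    for (i = 1; i < NF; i++) if ($i == want) { print $(i + 1); exit }
  }'
}

# Sample partition line, assuming the "Label: value" layout described above.
line='Topic: my-replicated-topic Partition: 0 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0'
echo "leader:   $(printf '%s\n' "$line" | partition_field Leader)"
echo "replicas: $(printf '%s\n' "$line" | partition_field Replicas)"
echo "isr:      $(printf '%s\n' "$line" | partition_field Isr)"
```

In practice the sample line would be replaced by piping the real describe output through the helper.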
We can run the same command on the original topic we created to see where it is:
> bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
Topic:test    PartitionCount:1    ReplicationFactor:1    Configs:
    Topic: test    Partition: 0    Leader: 0    Replicas: 0    Isr: 0
So there is no surprise there: the original topic has no replicas and is on server 0, the only server in our cluster when we created it.
Let's publish a few messages to our new topic:
> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic
...
my test message 1
my test message 2
^C
Now let's consume these messages:
> bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic my-replicated-topic
...
my test message 1
my test message 2
^C
Now let's test out fault-tolerance. Broker 1 is acting as the leader, so let's kill it:
> ps | grep server-1.properties
7564 ttys002
> kill -9 7564
Leadership has switched to one of the followers, and node 1 is no longer in the in-sync replica set:
> bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Topic:my-replicated-topic    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: my-replicated-topic    Partition: 0    Leader: 2    Replicas: 1,2,0    Isr: 2,0
But the messages are still available for consumption even though the leader that originally took the writes is down:
> bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic my-replicated-topic
...
my test message 1
my test message 2
^C
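After the failure, node 1 still appears in the replicas list but not in the ISR. That comparison is easy to automate. The sketch below assumes the two comma-separated lists have already been extracted from the describe output; out_of_sync is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# out_of_sync is a hypothetical helper: print every broker id that is in
# the replicas list but missing from the in-sync replica (ISR) list.
out_of_sync() {
  local replicas="$1" isr="$2" r
  for r in $(printf '%s' "$replicas" | tr ',' ' '); do
    # Wrap both sides in commas so that id 1 does not match id 11.
    case ",${isr}," in
      *",${r},"*) ;;          # still in sync
      *) echo "$r" ;;         # assigned replica that has fallen out
    esac
  done
}

# Values from the describe output after broker 1 was killed.
missing="$(out_of_sync "1,2,0" "2,0")"
echo "out of sync: ${missing}"
```

With the values from this article the helper reports broker 1, and it prints nothing for a healthy partition where both lists match.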
Reference: http://kafka.apache.org/
http://www.cloudera.com/content/cloudera/en/documentation/cloudera-kafka/latest/topics/kafka_spark.html
Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.
"Big Data Architecture" 3. Kafka Installation and use