Kafka Disks and Filesystem

If you reprint this article, please cite the source: http://www.cnblogs.com/dongxiao-yang/p/5206631.html

We recommend using multiple drives to get good throughput, and not sharing the drives used for Kafka data with application logs or other OS filesystem activity, to ensure good latency. You can either RAID these drives together into a single volume or format and mount each drive as its own directory. Since Kafka has replication, the redundancy provided by RAID can also be provided at the application level. This choice has several tradeoffs.

If you configure multiple data directories, partitions will be assigned round-robin to the data directories. Each partition will reside entirely in one of the data directories. If data is not well balanced among partitions, this can lead to load imbalance between disks.
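The imbalance described above can be illustrated with a small simulation. This is only a sketch of round-robin placement as the paragraph describes it, not Kafka's actual placement code, which may differ between versions. The partition names, sizes, and directory paths are hypothetical; in a real broker the data directories are the ones listed in the `log.dirs` setting.

```python
from collections import defaultdict


def assign_round_robin(partitions, data_dirs):
    """Place each partition wholly in one directory, cycling through the directories in order."""
    placement = {}
    for i, name in enumerate(partitions):
        placement[name] = data_dirs[i % len(data_dirs)]
    return placement


def size_per_dir(placement, partition_sizes):
    """Sum the size of the partitions that landed in each directory."""
    totals = defaultdict(int)
    for name, directory in placement.items():
        totals[directory] += partition_sizes[name]
    return dict(totals)


# Hypothetical partition sizes in GB: two large partitions, two small ones.
partition_sizes = {"orders-0": 400, "orders-1": 10, "clicks-0": 380, "clicks-1": 20}
# Hypothetical mount points, one per physical drive.
data_dirs = ["/data/disk1", "/data/disk2"]

placement = assign_round_robin(list(partition_sizes), data_dirs)
load = size_per_dir(placement, partition_sizes)
print(load)  # {'/data/disk1': 780, '/data/disk2': 30}
```

Both disks hold two partitions each, yet one carries 780 GB and the other 30 GB, because the assignment counts partitions and ignores how much data each one holds.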

RAID can potentially do better at balancing load between disks (although it doesn't always seem to) because it balances load at a lower level. The primary downside of RAID is that it is usually a big performance hit for write throughput and reduces the available disk space.
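To make the disk-space cost concrete, here is a rough sketch that compares usable capacity for a few common layouts using the standard capacity formulas for each RAID level. The function name, the choice of levels, and the example of six 4 TB drives are assumptions for illustration; the source does not say which RAID level it has in mind, and the write-throughput penalty is not modeled here.

```python
def usable_capacity_tb(layout, disks, disk_size_tb):
    """Return usable capacity in TB for equally sized disks under a given layout."""
    if layout in ("jbod", "raid0"):
        # Separate directories per disk, or plain striping: no space lost to redundancy.
        return disks * disk_size_tb
    if layout == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        # One disk's worth of space holds parity.
        return (disks - 1) * disk_size_tb
    if layout == "raid6":
        if disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        # Two disks' worth of space hold parity.
        return (disks - 2) * disk_size_tb
    if layout == "raid10":
        if disks < 4 or disks % 2 != 0:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        # Every block is mirrored, so half the raw space is usable.
        return (disks // 2) * disk_size_tb
    raise ValueError(f"unknown layout: {layout}")


# Hypothetical server with six 4 TB drives.
for layout in ("jbod", "raid5", "raid6", "raid10"):
    print(layout, usable_capacity_tb(layout, 6, 4))
```

With these assumed drives, mounting each disk as its own directory keeps all 24 TB, while RAID 5, RAID 6, and RAID 10 leave 20, 16, and 12 TB respectively.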

Another potential benefit of RAID is the ability to tolerate disk failures. However, our experience has been that rebuilding the RAID array is so I/O intensive that it effectively disables the server, so this does not provide much real availability improvement.
