Intel opens the Big Data intelligence era

Source: Internet
Author: User

(Source: IT168) With the increasing demand for big data solutions, Apache Hadoop has quickly become one of the preferred platforms for storing and processing massive volumes of structured and unstructured data. Businesses can deploy this open-source framework on a small number of Intel® Xeon® processor-based servers to start big data analysis quickly and at lower cost. The Apache Hadoop cluster can then be scaled up to hundreds or even thousands of nodes, shortening query response times on petabytes of data to seconds.

Intel collaborates with the Apache Hadoop community to help system administrators achieve the highest possible performance from their Apache Hadoop clusters while keeping complexity to a minimum. Intel has developed the HiTune performance analyzer and the HiBench benchmark suite, which reduce the complexity of Apache Hadoop performance tuning and allow users to design and implement Apache Hadoop solutions more confidently and in less time.

HiTune Performance Analyzer

One of the main advantages of Apache Hadoop is that it is easier to deploy and use than traditional data warehouses. However, because of the complex interactions between hardware and software in a distributed environment, optimizing an Apache Hadoop cluster and its workloads for performance is challenging. To meet this challenge, Intel developed HiTune, a simple tool that helps developers build highly scalable applications. This lightweight, scalable performance analyzer can help you deliver higher-performance Apache Hadoop clusters and applications to your customers, and help your customers gain greater value throughout the lifecycle of their clusters.

A typical Apache Hadoop query is written using an intuitive, high-level data flow model. This is ideal for programmers because all the intricacies of data partitioning, task distribution, load balancing, fault tolerance, and node communication are handled by the Apache Hadoop runtime environment. However, hiding this low-level complexity can also make performance tuning a cumbersome challenge. Engineers typically have little or no visibility into the low-level interactions between hardware and software, yet that visibility is an essential prerequisite for understanding and optimizing performance. As a result, they often rely on lengthy, time-consuming trial-and-error methods that frequently yield suboptimal performance.
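The data flow model that the runtime hides can be pictured as three stages. The sketch below is an illustrative, single-machine Python analogue (not Hadoop code) of a word-count job, showing the map, shuffle, and reduce stages that the framework would normally distribute across cluster nodes:

```python
from collections import defaultdict

def map_phase(lines):
    """Map stage: emit (word, 1) pairs, analogous to a Hadoop mapper."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    """Shuffle stage: group values by key, as the runtime does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce stage: sum the counts for each word."""
    return {word: sum(values) for word, values in groups.items()}

lines = ["big data big insight", "data flows through phases"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["big"])   # 2
print(counts["data"])  # 2
```

In a real cluster, each stage runs in parallel on many nodes; the programmer only writes the map and reduce logic, which is exactly why the low-level behavior stays out of sight.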

HiTune monitors key performance metrics on each server in the Apache Hadoop cluster, then aggregates these low-level metrics and correlates them with the high-level data flow model. This allows engineers to gain insight into the dynamic interaction between tasks and phases, and to quickly identify performance bottlenecks, application hotspots, and underperforming hardware.
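As an illustration of the idea (not HiTune's actual implementation), the following hypothetical Python sketch aggregates per-node phase durations, using made-up sample data, and flags a straggler node whose runtime far exceeds that of its peers:

```python
from collections import defaultdict

# Hypothetical per-node samples (node, phase, duration_s); in a real HiTune
# run these would be collected from every server in the cluster.
samples = [
    ("node-01", "map",    118),
    ("node-02", "map",    121),
    ("node-03", "map",    240),   # straggler: much slower than its peers
    ("node-01", "reduce",  64),
    ("node-02", "reduce",  61),
    ("node-03", "reduce",  66),
]

def find_stragglers(samples, factor=1.5):
    """Flag nodes whose phase duration exceeds factor * the phase mean."""
    by_phase = defaultdict(list)
    for node, phase, duration in samples:
        by_phase[phase].append((node, duration))
    stragglers = {}
    for phase, rows in by_phase.items():
        mean = sum(d for _, d in rows) / len(rows)
        slow = [node for node, d in rows if d > factor * mean]
        if slow:
            stragglers[phase] = slow
    return stragglers

print(find_stragglers(samples))  # {'map': ['node-03']}
```

Correlating the flagged node back to the map phase of the data flow is what turns raw metrics into an actionable diagnosis (a slow disk, a misconfigured task slot, and so on).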

1. Simplify and accelerate performance tuning. HiTune provides detailed analysis and visualization, with minimal performance impact on running applications and no need to modify source code. Intel engineers make extensive use of this tool and, in many cases, have achieved performance gains of up to sixfold from relatively simple hardware or software adjustments.

2. Extend analysis across thousands of servers. HiTune can analyze applications that run across thousands of servers and contain hundreds of thousands of concurrent processes in a production environment. The HiTune analysis engine can itself run as an Apache Hadoop job, enabling rapid analysis of massive volumes of performance data through large-scale parallel execution. Instead of analyzing only some of the applications running on part of a cluster, engineers can collect and analyze complete data sets and gain more useful insights.

3. Gradually obtain higher value. Intel will continue to extend and optimize HiTune for Apache Hadoop and other distributed big data solutions. Intel has already used HiTune to tune and optimize the performance of Apache Hive, an open-source data warehouse built on Apache Hadoop. The expertise you accumulate now will deliver higher value in the future.

HiBench Benchmark Suite

As the market matures, optimizing and validating the performance of Apache Hadoop clusters becomes more important, because customers are beginning to use big data insights to improve revenue streams, profitability, and operational efficiency in near real time. With the HiBench benchmark suite, you can accurately and consistently measure, validate, and compare the performance of Apache Hadoop clusters across different workloads, giving customers better information and greater confidence.

HiBench provides ten easy-to-use Apache Hadoop workloads that can be scaled, configured, and customized to reflect typical deployments. You can measure performance for specific common tasks, such as sorting and word counts, or measure performance for more complex practical applications, including web search, machine learning, and data analysis. Different workloads have different characteristics, enabling you to build test matrices that reflect the resource requirements of a particular environment.
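The measure-and-compare workflow can be sketched in miniature. The Python harness below is an illustrative stand-in for a benchmark suite (it is not HiBench, and the data sizes are made up): it times two toy workloads, sorting and word counting, keeping the best of several runs the way benchmark harnesses typically reduce timing noise:

```python
import random
import time

def bench(workload, data, repeats=3):
    """Time a workload over several runs and keep the best wall-clock result."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        workload(data)
        best = min(best, time.perf_counter() - start)
    return best

def word_count(words):
    """A simple word-count workload, one of the classic benchmark kernels."""
    counts = {}
    for w in words:
        counts[w] = counts.get(w, 0) + 1
    return counts

random.seed(7)  # fixed seed so runs are repeatable
numbers = [random.randint(0, 10**6) for _ in range(50_000)]
words = [random.choice(["alpha", "beta", "gamma"]) for _ in range(50_000)]

results = {
    "sort": bench(sorted, numbers),
    "wordcount": bench(word_count, words),
}
for name, seconds in results.items():
    print(f"{name}: {seconds:.4f}s")
```

A suite like HiBench applies the same principle at cluster scale: run each workload under controlled, repeatable conditions, then compare the timings across configurations to build the test matrix described above.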

Intel will continue to expand and improve HiBench, and will collaborate with leading vendors and standards bodies to develop industry-standard performance benchmarks for Apache Hadoop. Once these benchmarks are established, you will have a better foundation for understanding architectural issues and for measuring and validating the performance of Apache Hadoop solutions.

Build a Validated Foundation

Designing a fully optimized Apache Hadoop cluster requires an in-depth understanding of the entire solution stack. It can take months to explore Apache Hadoop workload characteristics and learn how they interact with the underlying hardware and software. Alternatively, you can take advantage of Intel's expertise, gained through research and through collaboration with companies running some of the world's largest and most successful Apache Hadoop deployments, including Google, Yahoo!, and some of the top telecommunications and financial services companies.

Intel codifies this experience into reference architectures, tuning guides, and best-practice recommendations that can serve as a starting point for designing and deploying an Apache Hadoop cluster. With clear guidance covering everything from hardware specifications to the complete software architecture, you can design, build, and configure the most appropriate solution faster and more economically.

You can also choose from a variety of leading Apache Hadoop distributions, all of which are highly optimized for Intel Xeon processors. Intel works with Cloudera, Hortonworks, IBM, and other commercial distributors to ensure that software that has been scaled, hardened, and tested for production readiness in enterprise environments achieves optimal performance on Intel architecture.
