This article is translated from apmblog.compuware.com by ImportNew-Tang youhua. To reprint this article, please refer to the reprinting requirements at the end of the article.
In recent weeks, my colleagues and I attended the Hadoop and Cassandra summits in the San Francisco Bay Area. It was a pleasure to have such in-depth discussions with so many experienced big data experts. Thanks to our partners DataStax and Hortonworks for hosting these events! At the same time, I saw that performance has become a major topic in the community. We collected a lot of feedback on typical big data performance problems and were surprised by the challenges they pose. Because the participants were experts, general issues and basic cluster monitoring are not covered here; this article looks at the advanced questions raised about Hadoop and Cassandra.
Here are the most interesting and common problems I collected from Hadoop and Cassandra deployments:
Hadoop focus problems

MapReduce data locality problems
Data locality is the core advantage of Hadoop MapReduce: map code is executed on the node where the data resides. Interestingly, many people find that in practice this is not always the case. They reported the following exceptions:
- Speculative execution
- Heterogeneous clusters
- Data distribution and placement
- Data layout and input splits
These problems occur more frequently in larger clusters: the more data nodes there are, the less local the data tends to be. The larger the cluster, the less homogeneous it is; nodes that are upgraded earlier than others skew the compute-to-data ratio. Speculative execution consumes compute capacity even where no local data is available, and a data node that is already busy with other work may force another node to do non-local processing. The root cause may also lie in the data layout and the input splits. In any case, processing non-local data puts pressure on the network and turns it into a scalability bottleneck. On top of that, data locality problems are very hard to observe and diagnose.
To improve data locality, you first need to know which of your jobs have data locality problems or whether locality degrades over time. With an APM (application performance management) solution, you can see which tasks accessed which data nodes. Solving locality problems is more complicated: it may involve changing the data placement and data layout, using a different scheduler, or simply changing the number of mapper and reducer slots for a job. Afterwards, you can run the same job again to verify whether the new setup yields a better data locality ratio.
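To get a first impression of how local a job actually ran, the standard Hadoop job counters can be queried after the fact. A minimal sketch, assuming the Hadoop 2 mapreduce API and that the job id is passed in from outside:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Cluster;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobCounter;
import org.apache.hadoop.mapreduce.JobID;

public class LocalityCheck {
    public static void main(String[] args) throws Exception {
        // args[0] is an existing job id, e.g. "job_201309101234_0042"
        Cluster cluster = new Cluster(new Configuration());
        Job job = cluster.getJob(JobID.forName(args[0]));

        Counters counters = job.getCounters();
        long dataLocal = counters.findCounter(JobCounter.DATA_LOCAL_MAPS).getValue();
        long rackLocal = counters.findCounter(JobCounter.RACK_LOCAL_MAPS).getValue();
        long totalMaps = counters.findCounter(JobCounter.TOTAL_LAUNCHED_MAPS).getValue();

        // Ratio of map tasks that ran on a node holding their input block.
        double ratio = totalMaps == 0 ? 0.0 : 100.0 * dataLocal / totalMaps;
        System.out.printf("data-local maps: %d of %d (%.1f%%), rack-local maps: %d%n",
                dataLocal, totalMaps, ratio, rackLocal);
    }
}
```

Running this before and after a change (scheduler, slot configuration, data placement) gives a simple before/after comparison of the locality ratio.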
Inefficient job code and Hadoop workload "profiling"
We then confirmed an interesting point: a lot of Hadoop jobs are very inefficient. Note that this is not a problem with Hadoop itself but with the executed jobs. However, in larger Hadoop clusters, "profiling" jobs is the main pain point: black-box monitoring alone is not enough, and traditional profilers cannot cope with the distributed nature of a Hadoop cluster. Our solution was well received by many senior Hadoop developers, and we also got a lot of interesting feedback on how to make our Hadoop job "profiling" even more effective.
TaskTracker performance and its impact on shuffle time
As we all know, the shuffle is the biggest performance factor in Hadoop jobs. Many Hadoop performance tuning articles describe how to optimize the map output (for example by using a combiner), the shuffle distribution (by using a partitioner), and the plain read/merge performance (number of threads and memory management). However, few articles talk about a specific TaskTracker slowing down, even though this has been widely discussed among senior "Hadoopers".
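The combiner and partitioner mentioned above are easy to wire up. A minimal sketch, assuming a word-count-style job with Text keys and IntWritable values (class names and values are placeholders; mapper, input and output paths are assumed to be configured elsewhere):

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.Reducer;

public class ShuffleTuning {

    /** Simple associative sum reducer, safe to reuse as a combiner. */
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }

    /** Hash partitioner spreading keys evenly across reducers to avoid skew. */
    public static class EvenPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

    public static Job configure(Configuration conf) throws IOException {
        Job job = Job.getInstance(conf, "shuffle-tuned-job");
        job.setJarByClass(ShuffleTuning.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setReducerClass(SumReducer.class);
        job.setCombinerClass(SumReducer.class);     // pre-aggregate map output before it is shuffled
        job.setPartitionerClass(EvenPartitioner.class);
        job.setNumReduceTasks(16);                  // illustrative value; tune to the cluster
        return job;
    }
}
```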
When a compute node is under heavy load, has weak hardware, or has otherwise degraded, the local TaskTracker is negatively affected. To put it simply, some nodes in a large cluster will slow down!
The result is that the affected TaskTracker node cannot serve map output to the reducers quickly enough, or errors occur while doing so. This hits basically every reducer, because the shuffle is the bottleneck of overall job execution time, which keeps growing. On a small cluster we can monitor the performance of the handful of running TaskTrackers, but on a real-world cluster this is not feasible: monitoring averages masks the TaskTracker that triggers the problem, so it is hard to determine which TaskTracker is at fault and why.
The solution we presented is baselining on top of the PurePath/PureStack model. Baselining the TaskTracker requests solves the averaging and monitoring problem: if mapOutput performance degrades on a TaskTracker, we are notified immediately and can identify the problematic TaskTracker right away. Next, the health state of the JVM and the host tells us whether the cause lies in the infrastructure, the Hadoop configuration, or the operating system. Finally, by tracing jobs, tasks, and mapOutput requests, we can see which jobs trigger the TaskTracker performance problem and which jobs are affected by it.
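The tooling referred to here is a commercial APM product, but the baselining idea itself can be illustrated. A rough sketch, not the product: each TaskTracker gets its own baseline for mapOutput request latency, and a request is flagged against that tracker's own baseline rather than against a cluster-wide average (window size and threshold are arbitrary choices):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch of per-TaskTracker baselining of mapOutput request latency. */
public class MapOutputBaseline {
    private static final int WINDOW = 100;        // samples used to establish the baseline
    private static final double THRESHOLD = 3.0;  // flag when latency exceeds 3x the baseline

    private final Map<String, Deque<Long>> samples = new ConcurrentHashMap<>();
    private final Map<String, Double> baselines = new ConcurrentHashMap<>();

    /** Record one mapOutput request duration (ms); returns true if this tracker looks degraded. */
    public synchronized boolean record(String tracker, long durationMs) {
        Deque<Long> window = samples.computeIfAbsent(tracker, t -> new ArrayDeque<>());
        window.addLast(durationMs);
        if (window.size() > WINDOW) window.removeFirst();

        Double baseline = baselines.get(tracker);
        if (baseline == null) {
            if (window.size() == WINDOW) {
                // Baseline = average of the first full window of requests for this tracker.
                baselines.put(tracker,
                        window.stream().mapToLong(Long::longValue).average().orElse(durationMs));
            }
            return false; // still learning the baseline
        }
        return durationMs > baseline * THRESHOLD;
    }
}
```

The point is that one slow tracker is compared only against its own history, so it cannot hide behind the healthy majority.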
Slow NameNode and DataNode
Like the TaskTrackers, the NameNode and the DataNodes also affect job performance. A slow NameNode or slow individual DataNodes have a significant impact on the whole cluster. The solution is again to baseline the requests and automatically detect performance degradation. We can also see which jobs and clients are affected by a slow NameNode or DataNode, and determine whether the cause is the infrastructure, high utilization, or an error in the service itself.
Cassandra focus problems
Spotify's talk at the Cassandra Summit was the best one. If you are using or planning to use Cassandra, we strongly recommend watching it!
Read performance degrades over time
When Cassandra is first deployed, everything is very fast, but read operations keep getting slower over time. In fact, all operations can show this pattern, but reads suffer the most: rows spread across many SSTables and deleted rows full of tombstones. All of these problems come down to access patterns and schema design mistakes, and they are usually data dependent. If you keep writing to the same row over a long period (several months), that row gets spread across many SSTables. Reading that row becomes slow, while access to "newer" rows (contained in a single SSTable) stays fast. It is even worse to keep deleting and re-inserting the same row: not only is the row's data spread everywhere, it is also full of tombstones, and its read performance becomes terrible. Yet the average read time only increases slowly (this is the averaging effect); in reality, the performance of the "old" rows drops sharply while the "new" rows stay fast.
To avoid this situation, do not delete data frequently and do not write to the same row over a very long period. To detect the problem, you should baseline the read requests per Cassandra column family. In contrast to averages, baselining detects changes in the distribution and tells you which requests are degrading and which remain fast. In addition, tying Cassandra requests to the real end-user transactions that issue them helps you locate the root cause quickly.
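One common way to follow the "do not write to the same row forever" advice is to bucket row keys by time, so that a logically unbounded row becomes a series of bounded physical rows, each of which stays in few SSTables. A small illustrative helper; the key format is an assumption, not from the article:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Illustrative time-bucketed row keys: one physical row per entity per month. */
public class RowKeyBucketing {
    private static final SimpleDateFormat MONTH = new SimpleDateFormat("yyyyMM", Locale.ROOT);
    static { MONTH.setTimeZone(TimeZone.getTimeZone("UTC")); }

    /** e.g. entity "42" on 2013-09-15 -> "42:201309" */
    public static synchronized String rowKey(String entityId, Date timestamp) {
        return entityId + ":" + MONTH.format(timestamp);
    }

    public static void main(String[] args) {
        System.out.println(rowKey("42", new Date()));
    }
}
```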
Slow nodes affect the entire cluster
Like any real-world system, a Cassandra node can slow down for many reasons (hardware, compaction, GC, network, disk, etc.). Cassandra is a clustered database: every row exists multiple times in the cluster, and every write request is sent to all nodes that contain the row, regardless of consistency level. The failure of a single node is not a big problem, because other nodes contain the same data and all read and write requests can be served as usual. In theory, one super slow node should not cause problems either, unless we explicitly request data at consistency level "ALL". In practice, however, every node keeps an internal coordinator queue that waits for all outstanding requests to finish, even when the client has already been answered. That queue can fill up because of one super slow node, and then a single node that is not even needed to fulfill any request can prevent the whole cluster from responding.
There are two ways to deal with this. If possible, use a token-aware client such as Astyanax. By talking directly to the nodes that contain the data, such a client largely bypasses the coordinator queue problem. In addition, you should baseline the Cassandra requests on the server nodes and alert when a node slows down. Strangely enough, shutting down the problematic node also fixes things temporarily, because Cassandra deals with a dead node almost instantly, and much better than with a slow one.
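For the first remedy, a token-aware Astyanax client can be configured roughly as follows. This is a sketch based on the Astyanax 1.x Thrift API; cluster name, keyspace, port, and seed address are placeholders:

```java
import com.netflix.astyanax.AstyanaxContext;
import com.netflix.astyanax.Keyspace;
import com.netflix.astyanax.connectionpool.NodeDiscoveryType;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolConfigurationImpl;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolType;
import com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor;
import com.netflix.astyanax.impl.AstyanaxConfigurationImpl;
import com.netflix.astyanax.thrift.ThriftFamilyFactory;

public class TokenAwareClient {
    public static Keyspace connect() {
        AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
                .forCluster("MyCluster")                        // placeholder cluster name
                .forKeyspace("MyKeyspace")                      // placeholder keyspace
                .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
                        .setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
                        // TOKEN_AWARE routes each request to a replica that owns the data,
                        // keeping an unrelated coordinator (and its queue) out of the path.
                        .setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE))
                .withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl("MyPool")
                        .setPort(9160)
                        .setMaxConnsPerHost(3)
                        .setSeeds("127.0.0.1:9160"))            // placeholder seed node
                .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
                .buildKeyspace(ThriftFamilyFactory.getInstance());
        context.start();
        return context.getClient();                             // getEntity() on older Astyanax versions
    }
}
```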
Too many requests / too much data read
Another typical Cassandra performance problem comes from habits carried over from SQL and is especially common among Cassandra beginners. It is really a data model and application design problem: a single transaction issues too many requests or reads too much data. This is not Cassandra's fault; it is simply that the many requests, or the large amount of data read, slow down the actual end-user transaction. This kind of problem is easy to detect with an APM approach, and the fix usually means changing the code and the data model.
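To illustrate the "too many requests" part: instead of issuing one read per key inside a transaction, the keys can be fetched in a single round trip. A sketch reusing the Astyanax client from the previous example; the column family name and keys are made up:

```java
import java.util.Arrays;
import java.util.List;

import com.netflix.astyanax.Keyspace;
import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
import com.netflix.astyanax.model.ColumnFamily;
import com.netflix.astyanax.model.Rows;
import com.netflix.astyanax.serializers.StringSerializer;

public class BatchedReads {
    // Assumed column family with String row keys and String column names.
    private static final ColumnFamily<String, String> CF_USERS =
            new ColumnFamily<>("users", StringSerializer.get(), StringSerializer.get());

    /** One round trip for all keys instead of one request per key. */
    public static Rows<String, String> readUsers(Keyspace keyspace, List<String> userIds)
            throws ConnectionException {
        return keyspace.prepareQuery(CF_USERS)
                .getKeySlice(userIds)
                .execute()
                .getResult();
    }

    public static void main(String[] args) throws ConnectionException {
        Keyspace keyspace = TokenAwareClient.connect();  // from the previous sketch
        Rows<String, String> rows = readUsers(keyspace, Arrays.asList("alice", "bob", "carol"));
        System.out.println("rows fetched: " + rows.size());
    }
}
```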
Summary
Hadoop and Cassandra are both highly scalable systems! But scalability does not solve performance problems: it cannot prevent them, and it cannot compensate for plain misuse.
Some of the problems described here are specific to these systems and do not occur elsewhere; others are not new, but have never shown up on distributed systems of this scale. Because of that scale, the problems are hard to diagnose (especially on Hadoop) and can have a huge impact (such as bringing a whole Cassandra cluster to a halt). Performance analysis experts can raise a toast: they will not run out of work for a long time to come.
Original article address: apmblog.compuware.com. Many thanks to the original author for this discussion of Hadoop and Cassandra performance.