China Eastern Airlines business-to-business system performance test

Source: Internet
Author: User

Background: Since September 4, the China Eastern Airlines B2B (business-to-business) system has gone down repeatedly over several days, and the cause is unknown.

Test involvement: I was fortunate to be assigned to the B2B performance test. The goal was to locate the problem module and give a direction for tuning.

I originally wanted to write up the whole testing process with a summary of each stage, but that would take too much time, and a wall of text would make it hard to find the key points later. So I will just record the shortcomings.

Shortcomings, solutions, and summary:

1 Communication mistakes

A Unclear requirements (serious)

Clarifying the requirement sounds simple. The real difficulty in practice is that the person handing the work to you often does not know what he wants or what he is trying to do. So treat the requirement with caution and avoid misunderstandings.

Take this performance requirement as an example. He was not interested in the system's overall performance. He wanted the performance test to compare the packages from before and after September 4 and confirm which module had the problem, so that it could be tuned.

1 If at this point you go straight to testing the two packages and report where the bottleneck is in the post-September 4 package and what its metrics are: sorry, you are already wrong.

2 If at this point you step in and tell him that module XXX has a serious performance problem and how to tune it: you are half right, half wrong.

In my first round of testing I did both 1 and 2. The results were not accepted and I was asked to redo the work.

Why? Mainly because nobody knows how many pitfalls the system really has. There may be N modules with performance problems, and what needed to be found this time was not "a module with a performance problem" but the module that brings the system down.

So the question becomes: what kind of performance problem in a module will cause downtime?

That is hard to say for sure. My initial idea was to first get the test environment to show the same errors as the production environment and the same state as just before the outage, and then use that simulated scenario to troubleshoot the modules.

So the requirement analysis actually breaks down into two steps:

1 Reproduce the error messages and outage state of the production environment

2 Find the problem module based on the scenario from step 1

B Too many people giving direction (serious)

Everyone has their own ideas and opinions, so for a vaguely defined performance requirement it is especially important to identify a single point of contact. When many people weigh in, your thinking gets pulled off course.

I made a mistake here. I normally check my own reading of a "vague requirement" by telling the other person how I analyzed it. But because I did not insist on talking only with the point of contact, I took "advice" from someone on the development team, and for some time afterwards my understanding of the requirement was off.

Once the point of contact is confirmed, always communicate one-to-one. That is more efficient.

C Unclear division of labor (moderate)

To be honest, I think some of our colleagues are quite bureaucratic and like to push work onto others. I don't know what they are thinking, but it does mean they get to leave on time every day.

The background: the test environment did not have enough data, so someone had to generate data, and the question was who would write the data-generation code. When the developers came and asked me to write it, I thought that however I generated the data, I could not reach the required number of rows within 7 days. CPU is limited, after all, and there is no way around that. All the APIs were written by the developers; if I wrote the generator, I would effectively be writing a wrapper that drives the UI with selenium automation, which is really inefficient to run. In the end their manager called and told me to write it, so I wrote it. The truth is that they were lazy and unwilling to go through the API code they had written themselves.

Later, more unpleasant things happened. I spent about 1 hour writing the code and handed it over. They then hit all sorts of problems getting it to run, said I was being irresponsible, and tied me up for a whole day.

There are two lessons here. First, I should not have agreed out of kindness, because the other side may simply wash their hands of the work. Second, in terms of communication, I should have politely redirected the other department lead's call to our own department lead.

D Not sticking to my own judgment and being too easily swayed by senior management

Most of the time senior leaders don't know the technical details, and sometimes they come up with work that is not very meaningful. Then you say that it is too time-consuming and not very effective.

In fact, I think nothing is truly meaningless; it is just that the time invested outweighs the benefit. For example, I had just said that generating the data myself would be too inefficient and take too long.

The other side immediately replied, "Then estimate the time it would take first," and I was left without a position.

Then there was one more thing. A single system was to be performance tested, but a manager told me to performance test the cluster. Isn't that nonsense? Cluster performance depends on load balancing, and only 4 days were available.

Again the same sentence: "Evaluate it first."

As it turned out, once the single-machine performance test was finished, there was no energy left for that.

In the future I will probably answer like this:

1 I do not do meaningless evaluation work.

2 It can go on the list at low priority and be considered once the normal work is done.

E Mindset: "time is tight"

The requester often describes the problem to you as extremely urgent: the B2B downtime must be fixed within 1 day. OK, so I got truly anxious, and then when I went over to his desk I found him shopping online.

Oh, he had bet everything on me. He was in no hurry at all, really calm.

Then somehow the deadline stretched to 7 days.

No comment. That is how it should have been from the start. After hearing "wolf" cried too many times, I can no longer tell what is truly urgent.

2 Mistakes in the test process

A Redundant operational steps (medium)

This one was my own dumb mistake, though it is also process-related. During performance testing you restart the server all the time, but that involves permissions, and I did not have an account that could restart the server.

At the start this was very inefficient, because every restart or redeployment meant finding the configuration staff.

Later I applied to the configuration staff for an account and learned how the startup works, and efficiency improved a lot.

B Insufficient tool preparation up front (mild)

It turned out that monitoring with jconsole was not supported, which floored me. I had not considered this beforehand, so I wrote a tool to do the monitoring, which cost an unexpected half day.

Murphy's Law: something unexpected always comes up.

But I think I handled it. In the end I did find a dedicated monitoring tool.
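
The post does not say what the hand-written monitoring tool looked like. As a rough illustration only, a stopgap monitor can be as small as a loop that calls a probe at a fixed interval and writes timestamped samples to CSV. The function name `sample`, the probe callable, and the CSV layout below are all assumptions made for this sketch, not the author's tool.

```python
import csv
import io
import time


def sample(probe, count, interval_s, out, clock=time.time, sleep=time.sleep):
    """Call probe() `count` times, `interval_s` seconds apart, writing CSV rows to `out`.

    `probe` is any zero-argument callable returning one metric value
    (hypothetical: heap usage, thread count, CPU load, ...).
    `clock` and `sleep` are injectable so the loop can be exercised without waiting.
    """
    writer = csv.writer(out)
    writer.writerow(["timestamp", "value"])
    for i in range(count):
        writer.writerow([f"{clock():.3f}", probe()])
        if i < count - 1:  # no need to wait after the last sample
            sleep(interval_s)


if __name__ == "__main__":
    # Demo with a fake probe; a real probe would read a metric from the server.
    readings = iter([41, 57, 63])
    buffer = io.StringIO()
    sample(lambda: next(readings), count=3, interval_s=0.1, out=buffer)
    print(buffer.getvalue())
```

Writing plain CSV keeps the samples easy to chart afterwards, which also helps with the reporting problem described in the next item.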

C Informal schedule (medium)

The plan was written in a lightweight way and did not take into account that management would be reading it. They want to see an intuitive schedule and a visual summary of the problems.

D Insufficient environment preparation (medium)

Preparation for a performance test covers the machine under test, the load generator, the system environment, and the data volume. If the environment is not ready, send the request straight back.

Before this, the gap in data volume was too large, but I gritted my teeth and tested anyway. The results were poor, and it was arguably a waste of time.

Then we switched to a data size matching the pre-production environment and reproduced the problem within 20 minutes.

--------------------------------------------------------------------------------------------------------------- ------------------

1 The performance test ended satisfactorily, and I was credited with rescuing the B2B e-commerce platform. Management recognized the work. The overall process was hard, but the result was quite good.

2 Also, my mindset is not great. I don't know why I can't be like those who just coast along; instead I wear myself out physically and mentally. I need to work on my state of mind.

3 Other people putting their lives above their work is not wrong. It is also a kind of respect for life.

4 When speaking and acting, tell yourself: calm down, then start again.

-----------------------------------------------------------------------------------------------------------

Attached: the problem found by this performance test.

In the user query module there is a query that uses Hibernate's association query feature, and it queries a table with many associated fields. This causes many redundant queries to be executed and consumes a large amount of memory.
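
The behavior described above resembles what is commonly called the N+1 query pattern: one query loads the parent rows, then one more query runs per row to load its associated data. The following is a minimal sketch of that pattern using Python's sqlite3 module. It is not the project's Hibernate code, and the `users` and `orders` tables are hypothetical. It only counts how many statements each approach issues for the same result.

```python
import sqlite3


def build_db(n_users=50):
    """Create an in-memory database with hypothetical users and orders tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [(i, f"user{i}") for i in range(n_users)],
    )
    # Three orders per user.
    conn.executemany(
        "INSERT INTO orders (user_id, amount) VALUES (?, ?)",
        [(i % n_users, 10.0) for i in range(n_users * 3)],
    )
    return conn


def per_row_lookup(conn):
    """N+1 style: one query for the users, then one extra query per user."""
    statements = 1
    users = conn.execute("SELECT id, name FROM users").fetchall()
    totals = {}
    for user_id, name in users:
        rows = conn.execute(
            "SELECT amount FROM orders WHERE user_id = ?", (user_id,)
        ).fetchall()
        statements += 1
        totals[name] = sum(amount for (amount,) in rows)
    return totals, statements


def single_join(conn):
    """Same totals from a single statement using a join and aggregation."""
    rows = conn.execute(
        "SELECT u.name, COALESCE(SUM(o.amount), 0) "
        "FROM users u LEFT JOIN orders o ON o.user_id = u.id "
        "GROUP BY u.id, u.name"
    ).fetchall()
    return dict(rows), 1


if __name__ == "__main__":
    conn = build_db()
    totals_a, n_a = per_row_lookup(conn)
    totals_b, n_b = single_join(conn)
    print(n_a, n_b, totals_a == totals_b)
```

With small test data the extra statements go unnoticed. The statement count grows with the number of rows, which fits the observation that the problem only showed up at production-like data volumes.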

---------------------------------------------------------------------------------------------------------

There were a lot of emails over those 7 days, so I will just attach the final email I sent for this project.

Hello everyone, the performance requirement is complete:

This performance test was completed in two phases: a troubleshooting phase and a performance comparison phase.

1 In the troubleshooting phase we located the performance problem in the "order query by name" optimization that went live in September. The solution is to roll back that part of the code and keep the previous logic.

2 In the performance comparison phase, the tuned package and the production package differed in performance by about 5%. The difference is small and meets the pre-agreed pass condition, so the test passes.
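
The comparison in item 2 comes down to a relative-difference check between a baseline metric and a candidate metric. A minimal sketch follows. The email does not state the agreed threshold or the raw measurements, so the 5% tolerance and the throughput figures below are hypothetical values chosen only to show the arithmetic.

```python
def relative_difference(baseline, candidate):
    """Return |candidate - baseline| / baseline as a fraction."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return abs(candidate - baseline) / baseline


def within_tolerance(baseline, candidate, tolerance=0.05):
    """True if the two measurements differ by no more than `tolerance` (assumed 5%)."""
    return relative_difference(baseline, candidate) <= tolerance


if __name__ == "__main__":
    production_tps = 200.0  # hypothetical transactions per second, production package
    tuned_tps = 192.0       # hypothetical transactions per second, tuned package
    diff = relative_difference(production_tps, tuned_tps)
    print(f"difference: {diff:.1%}, pass: {within_tolerance(production_tps, tuned_tps)}")
```

Fixing the metric, the baseline, and the tolerance before the test starts is what makes a result like "about 5%" a clear pass or fail.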

System performance assessment:

Although the tuned package meets the requirements, it still carries over the performance characteristics of the old production package: CPU usage is high and the query-by-name SQL statement executes inefficiently. For order queries over large data volumes, transactions per second and response time are both poor.

Suggestions

1 The B2B system places no time-range limit on user queries. If a user queries 1 year of data, the load on the server is heavy. (I raised this security vulnerability in August and hope it will be fixed soon.)

2 Increase the size of the test data in the test environment. With enough test data, this performance problem could have been detected during functional testing.

3 Have development write unit tests. This performance problem was largely caused by misuse of a Hibernate feature.
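
For suggestion 1, one common way to bound the load is to clamp the requested date range on the server side before the query runs. The sketch below assumes a hypothetical limit of 31 days and a hypothetical helper name, `clamp_query_range`. The actual limit would be a business decision that the email does not specify.

```python
from datetime import date, timedelta

MAX_QUERY_DAYS = 31  # hypothetical cap, not taken from the email


def clamp_query_range(start, end, max_days=MAX_QUERY_DAYS):
    """Return (start, end) with start moved forward so the span is at most max_days."""
    if start > end:
        raise ValueError("start must not be after end")
    earliest_allowed = end - timedelta(days=max_days)
    return max(start, earliest_allowed), end


if __name__ == "__main__":
    # A one-year request is narrowed to the most recent 31 days.
    print(clamp_query_range(date(2020, 1, 1), date(2020, 12, 31)))
    # A short request is returned unchanged.
    print(clamp_query_range(date(2020, 12, 20), date(2020, 12, 31)))
```

Rejecting an over-long range with an error is an equally valid design. Clamping is shown here only because it keeps existing callers working.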

Detailed data

Because the data set is large, it has been uploaded to FTP 172.25.3.212/test data/performance test/B2B read-only module performance test/second round

Or see the daily performance test reports from the past week.

---------------------------------------------------------------------------------------------------------

Planning, analysis, reporting roughly 6 minutes, just stick to the timetable.

Work schedule
