Classic case: Best Practices for managing cloud service performance

Source: Internet
Author: User
When an enterprise migrates its core IT systems to a private or public cloud, the work is not over. A new set of technical issues must then be addressed: how to manage the cloud so that the enterprise's investment delivers the desired efficiency and return on investment.

Cloud management and cloud monitoring have become more important since the outage of Amazon's EC2 (Elastic Compute Cloud) service this April. In that incident, the IT industry saw what happens when a cloud is in trouble, and many companies' businesses were interrupted by the outage. There have been other serious cloud outages recently as well.

According to IDC analyst Mary Johnston Turner, getting enterprise buy-in is one of the big pitfalls of the public cloud. In a recent survey of 250 user companies, she noted, performance guarantees in service-level agreements (SLAs) ranked second in importance, behind only the specific needs of the application itself.

Companies are very concerned about performance, Turner said. One reason businesses are so interested in private clouds is that IT leaders are responsible for delivering good performance to their users. They are not prepared to hand that responsibility to third-party cloud providers.

When it comes to cloud computing, management software is no longer an afterthought; it must be part of the implementation, and every time you make a decision you have to think about how to best integrate cloud capabilities into your enterprise's IT architecture.

Software as a service (SaaS) and infrastructure as a service (IaaS) are two types of cloud offering that present tremendous opportunities for enterprise IT. If IT pros want to stay ahead of the trend, they need to learn to talk like experts about SaaS and IaaS issues.

When it comes to cloud computing contracts, knowledge is the key and reading is the foundation.

It's not just a cloud issue, she adds, but a problem that arises from the complexity of composite applications, which are now being brought into cloud environments.

This is a huge challenge, Turner said. Users need to invest in application performance management products built for composite applications and virtualized environments. This is now a full product category.

The idea is to be able to independently monitor the application's performance in the network and its performance in the cloud, and then measure whether the application meets users' performance requirements, inside or outside the firewall.
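
A minimal sketch of that comparison, assuming response-time samples have already been collected by probes inside the firewall and in the cloud. The sample values, the 250 ms target, and the choice of the 95th percentile are illustrative assumptions, not figures from the article.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def meets_requirement(samples_ms, target_ms, pct=95):
    """True if the chosen percentile of response times is within the target."""
    return percentile(samples_ms, pct) <= target_ms


# Hypothetical response times in milliseconds, measured separately
# inside the firewall and in the cloud.
inside_firewall_ms = [120, 135, 110, 140, 128, 131, 119, 125, 133, 138]
in_cloud_ms = [180, 210, 175, 460, 190, 205, 185, 198, 220, 202]
TARGET_MS = 250  # assumed user-facing requirement

report = {
    "inside": meets_requirement(inside_firewall_ms, TARGET_MS),
    "cloud": meets_requirement(in_cloud_ms, TARGET_MS),
}
print(report)
```

Measuring each location separately shows where a requirement is missed: in this made-up data, a single slow cloud response is enough to push the cloud's 95th percentile past the target.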

IGN.com is one of the world's largest video game sites. For David Ting, the company's vice president of engineering, monitoring cloud performance is important because the survival of the business depends on connecting its 2.54 million users to the site's ad-supported online games.

"For us, performance is money, because page views are the key," Ting said. "We are supported by advertising, so every page view counts toward our revenue. This is something we pay close attention to."

To make it work, IGN Entertainment, a division of media giant News Corp., uses a performance monitoring tool from San Francisco-based New Relic to continuously observe its website's performance in the cloud. "We rely mainly on this tool," Ting said. "For us, what matters is the IGN website's response time and the number of requests processed per second."

Tracking performance as cloud deployments expand

IGN.com has been using the New Relic tool for about 18 months. It first migrated non-production development and other applications into the cloud to see how they worked. Now IGN.com is putting new projects on cloud servers, including a social media stack, so that the company can scale applications up as needed. In addition, one application planned for deployment in this cloud is the disaster recovery infrastructure for its network.

Ting said the company's IT systems would eventually migrate to the cloud. "We must make sure our performance is stable when we do that in the future. We're looking into it."

The New Relic monitoring tool provides performance metrics that IGN could not get from other tools, Ting said. The older tools are good for monitoring physical machines, but they cannot monitor the cloud without a lot of work by the engineering team.

By watching the New Relic management tool, IT staff were able to start more cloud-based servers, shut down underperforming application instances, and add new instances as needed to maintain user response times. With the previous tools, Ting's team could see only uptime, not response time.
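
The decision to add or remove instances based on response time can be sketched as a simple proportional rule. This is an illustration only: the function name, the instance bounds, and the rule itself are assumptions, not IGN's or New Relic's actual logic.

```python
import math


def desired_instances(current, observed_ms, target_ms,
                      min_instances=1, max_instances=20):
    """Scale the instance count in proportion to how far the observed
    response time is from the target, clamped to the allowed range."""
    if current < 1 or target_ms <= 0:
        raise ValueError("current must be >= 1 and target_ms must be > 0")
    wanted = math.ceil(current * observed_ms / target_ms)
    return max(min_instances, min(max_instances, wanted))


# Hypothetical readings: 4 instances running, target response time 200 ms.
print(desired_instances(4, observed_ms=300, target_ms=200))   # slower than target
print(desired_instances(4, observed_ms=100, target_ms=200))   # faster than target
print(desired_instances(4, observed_ms=2000, target_ms=200))  # capped at the maximum
```

A rule like this needs response-time data to work at all, which is why an uptime-only view was not enough for Ting's team.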

Ting explains that New Relic provides great visibility into response times. This allows IT staff to take action even while a server is still running. "For example, we found that a memcached (a high-performance distributed memory object caching system) server was performing much worse than the other servers in its pool. On further investigation, we found that a memory module was faulty. Under Nagios (a free, open-source network monitoring tool), that server would have kept running until it froze."
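
A minimal sketch of how a server that is up but slow might be spotted by comparing it with its peers. The server names, latency figures, and the "twice the pool median" threshold are all hypothetical; the article does not describe how New Relic flags such servers.

```python
from statistics import median


def find_slow_servers(latency_by_server, factor=2.0):
    """Return the servers whose median latency exceeds `factor` times
    the median of all per-server medians in the pool."""
    server_medians = {
        name: median(samples) for name, samples in latency_by_server.items()
    }
    pool_median = median(server_medians.values())
    return sorted(
        name for name, value in server_medians.items()
        if value > factor * pool_median
    )


# Hypothetical cache latencies in milliseconds for a four-server pool.
pool = {
    "cache-1": [1.1, 1.3, 1.2],
    "cache-2": [1.0, 1.4, 1.2],
    "cache-3": [4.8, 5.5, 5.1],  # still answering, but much slower
    "cache-4": [1.3, 1.1, 1.5],
}
print(find_slow_servers(pool))
```

An up/down check would report all four servers as healthy; only the latency comparison singles out the degraded one.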

IGN.com currently uses Amazon's EC2 service for its cloud deployment.

With the New Relic tool, IGN.com can observe every tier of its three-tier architecture, from the front end to the database to the API (application programming interface) layer. The tool helps ensure that user response times stay optimized and do not spike.

"We are able to see what is running in the cloud, collecting data using plug-ins and sending those results back to the New Relic tool," Ting said. "This data tells you in great detail how these server groups are performing."

"The amount of data and the accuracy of the data are very important," Ting said. "This is the starting point for us to look at metrics and use them to make intelligent business decisions."

In addition to migrating its IT infrastructure, IGN.com has been exploring using the cloud to host many of its more than 100 websites in order to improve performance and uptime. The main sites include IGN.com, AskMen.com, gamespy.com, fileplanet.com, teamxbox.com and gamestats.com.

"So far, the tests have been positive," Ting said. "We've moved some of the infrastructure components to the cloud. It is still at the experimental stage. We're checking performance."

Using a variety of tools

Bleacher Report, a San Francisco-based online publisher for fans of professional and college sports, quickly discovered the importance of performance monitoring a year ago after migrating its core infrastructure into the cloud.

Sam Parnell, a vice president at the company, said it was concerned about potential performance problems, including possible latency, because it wanted to scale up its capacity to meet the needs of 20 million users and 500 million page views a month. To prevent bottlenecks, he acquired a large set of tools to monitor and manage the new cloud environment for the ad-supported website.

"No single tool can do everything for us," says Parnell. "We use different tools at different levels to provide a comprehensive monitoring suite. So far, no latency issues have occurred. However, we use these tools to optimize the various parts of the system."

The company's toolbox includes Scout, a server-level tool. It lets IT staff see the workload on the primary and standby databases, as well as processor utilization and memory consumption on the servers. Monitoring, alerting, and status reporting are handled by agents running on the cloud servers.

The company also uses the Nagios enterprise monitoring tool and the open-source Monit tool. "Many of these tools certainly overlap," Parnell said. "However, each has features that it is good at. That is why we use them together."

Bleacher Report also uses the Pingdom ping-based monitoring tool to ensure that each website is up and running well.

"In every case we monitor, 100% uptime and fast web response times are very important," Parnell said. "If people can't access the site and see the ads, we lose money."

The company also uses the New Relic tool to monitor application performance. This lets IT staff see which pages are fast and which are slow, along with memory consumption and processor usage.

Real-time observation

Parnell said his staff watch the monitoring data in real time on displays.

He points out that the key is to use a wide range of products for monitoring. "That way, when something fails, you get more information sooner to fix the problem. In general, I would rather have too much data than not enough. The New Relic tool is a good way to display important information in a console, so you don't have to dig through the data. This is helpful when you want to check the running state quickly."

To observe performance in real time, Parnell's team uses large monitors that cycle through different reports, so that team members can see them all day. "We don't dig into these reports every day," Parnell explains. "But we do watch for things that look unusual. When we need to drill down into the data, all of these tools give us in-depth data."

The monitor screens are watched primarily by the team of engineers responsible for the site, especially when new features are being deployed or workloads are high.

Another important point to keep in mind is that the cloud environment and cloud monitoring are in the early stages. IT departments need to be flexible, find and use cloud monitoring tools, and continue to look for better new tools.

"We have been using Scout for only five or six months," Parnell said. "It works very well. But five months from now, other tools might do better. You need to keep your finger on the pulse of the market so that you can keep up with new tools. New companies keep appearing."

Another thing to keep in mind, he says, is to constantly monitor the servers your cloud vendor provides to ensure that you always have the best-performing servers.

"This is one of the biggest benefits of using cloud services. With cloud services, you can discard slow servers and select another server through the control panel."

Monitoring tools are also used internally to improve the development of new web features for Bleacher Report's readers.

"If an engineer is deploying a new feature, I ask them to watch performance and make sure the new feature does not adversely affect performance elsewhere. We continue to tune everything in this system to make sure it is as fast as possible. If important sports news breaks suddenly, our traffic will be very heavy. Everything needs to scale up. We need to be able to handle that."

Know what you're getting and what to monitor

"To get the functionality your company really needs, you have to take your specific requirements to your cloud provider," says James Staten, an analyst at Forrester Research.

One of the most important things is transparency, Staten says: what exactly are they going to offer you? This includes asking what level of monitoring they allow you to do directly and what logs they will send you, so that you can see what is happening. If the cloud vendor does not provide these things, ask for them.

Staten says the main part of your relationship with your cloud provider is managing your expectations. He points out that any performance monitoring that needs to be done is your responsibility, not your provider's.

If you can't do this kind of monitoring yourself, you can hire one of many companies to do it for you, including Hyperstratus, Keynote Bae, Hewlett-Packard, IBM, Accenture and others.

Many people believe that their service-level agreements cover performance monitoring, when in fact they do not. Service-level agreements cover availability, and that's all.

At the same time, not all of the applications and services your company runs in the cloud are mission-critical, he adds. Therefore, you do not need to monitor the performance of every application in the cloud. You have to figure out which applications are important.
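
To make the availability-only point concrete, the following sketch converts an availability percentage into the downtime it permits. The 99.9% and 99.95% figures and the 30-day month are illustrative assumptions, not terms from any particular provider's agreement.

```python
def allowed_downtime_minutes(availability_pct, days=30):
    """Minutes of downtime permitted over `days` at the given availability."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


# Hypothetical SLA levels over a 30-day month.
for pct in (99.9, 99.95):
    minutes = allowed_downtime_minutes(pct)
    print(f"{pct}% availability allows about {minutes:.1f} minutes of downtime")
```

Nothing in this calculation involves response time: a service that is reachable but slow for the whole month still counts as 100% available.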

End-to-end cloud management is still far off

The last thing to consider is that the cloud performance monitoring market is still immature, said IDC analyst Turner.

Turner says a lot of vendors talk about it from a roadmap point of view, but their offerings are not yet comprehensive. This year the main emphasis is still on automated configuration, which would allow real end-to-end cloud monitoring. "Once this year has passed, I think we're going to see something more advanced," she said.

As more companies move production environments into the cloud, this monitoring requirement will grow. "I think it will be a priority area for many organizations to invest in this year," Turner said. She predicts that it may take another two years to reach that level because of the sophistication required.

Of course, Staten says, all monitoring needs involve a trade-off. When you pay for monitoring to ensure that you get the contracted performance, you may undermine the cost savings that led your company to adopt cloud services in the first place. If you spend a lot of money on latency issues, should you also be spending a lot of money on cloud services?
