Remote monitoring basics and troubleshooting in Windows Azure


In the blog post on building blocks for powerful cloud applications, we introduced a series of blog posts and technical articles from the Azure CAT team that describe the Cloud Service Fundamentals in Windows Azure code project, published on the MSDN Code Gallery. The first component we want to introduce in this series is remote monitoring. It was one of the first reusable components we built while working on Windows Azure customer projects of various sizes. As someone once said, "trying to manage a complex cloud solution without the right remote monitoring infrastructure is like a blind and deaf person trying to cross a busy road." If you do not know where the problems are, you cannot take precautions, and it is easy to get into trouble. Conversely, if you collect adequate monitoring and diagnostic information about the state of your application components in a timely manner, you can make informed decisions about cost and efficiency analysis, capacity planning, and operational excellence. This post is accompanied by a wiki article that provides an in-depth overview of remote monitoring basics and troubleshooting.

There are different ways to support operations for cloud systems of any size when it comes to performance monitoring and application health. Using existing tools and techniques is challenging because the cloud platform abstracts away much of the underlying infrastructure. In addition, if your solution needs to scale, the amount of information generated by hundreds of web and worker roles, database partitions, and other services puts you at risk of being inundated with data that is low in value, irrelevant, or delayed. Providing an end-to-end view with consistent operational insight helps customers meet the SLAs they have with their users. At the same time, it reduces management costs by enabling smarter decisions about current and future resource consumption and deployment. This can be achieved only if all the layers involved are considered, from the infrastructure (resource usage such as CPU, I/O, and memory) through the application itself (database response times, exceptions, and so on) up to business activities and KPIs.

Both the operations team (maintaining service performance, analyzing resource consumption, handling support calls) and the development team (troubleshooting, planning new versions, and so on) can benefit from processing, correlating, and using this information.

The remote monitoring solution itself must be designed to scale: data acquisition and transformation activities run across multiple role instances and store raw data in multiple SQL Azure databases. To facilitate reporting and analysis, summary data resides in a centralized database that serves as the primary data source for predefined and customized reports and dashboards, as shown in the following simplified architecture diagram:

[Figure: simplified architecture diagram]
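
The raw-to-summary flow described in this section can also be sketched in code. The following is a minimal Python illustration, not the project's actual implementation: the `Sample` type, the `summarize` function, the role instance names, and the use of in-memory lists in place of the raw SQL Azure databases are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Sample:
    """One raw measurement, as a collector role instance might store it (hypothetical shape)."""
    role_instance: str
    metric: str
    timestamp: datetime
    value: float


def summarize(raw_partitions):
    """Roll up raw samples from every partition into per-metric, per-minute summaries.

    Returns a dict keyed by (metric, minute) with count, average and maximum,
    standing in for the rows of the centralized summary database.
    """
    buckets = defaultdict(list)
    for partition in raw_partitions:
        for sample in partition:
            # Truncate the timestamp to the minute to form the summary bucket.
            minute = sample.timestamp.replace(second=0, microsecond=0)
            buckets[(sample.metric, minute)].append(sample.value)
    return {
        key: {"count": len(vals), "avg": sum(vals) / len(vals), "max": max(vals)}
        for key, vals in buckets.items()
    }


# Two in-memory lists stand in for two raw databases (an assumption for this sketch).
t = datetime(2013, 1, 1, 12, 0, tzinfo=timezone.utc)
raw_db_1 = [
    Sample("Web_IN_0", "cpu_percent", t.replace(second=5), 40.0),
    Sample("Web_IN_0", "cpu_percent", t.replace(second=35), 60.0),
]
raw_db_2 = [Sample("Web_IN_1", "cpu_percent", t.replace(second=10), 80.0)]

summary = summarize([raw_db_1, raw_db_2])
print(summary[("cpu_percent", t)])
```

Keeping the raw samples partitioned and only the rolled-up rows centralized is what lets reports and dashboards query a single, much smaller data source.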

Because the subject is very large, we decided to divide it into a mini-series of four blog posts, each with a companion wiki article:

1. Remote monitoring basics and troubleshooting

2. Application health measurement

3. Data collection pipeline

4. Reporting and analysis

This first article introduces the basic principles of a remote monitoring solution, starting with the definition of the basic metrics and key indicators for our application's health. We also describe the various information sources that you can feed into an automated remote monitoring system or, for less complex deployments, use to troubleshoot an application manually.

Features such as Windows Azure Diagnostics (WAD), if properly configured, are the primary starting point for gathering and summarizing this critical information. Unfortunately, some data sources are not currently integrated with WAD (Windows Azure SQL Database, for example), so you need to use slightly different methods and APIs to extract their information. Windows Azure Storage Analytics is another good example of a source that requires specific effort to collect and consolidate its metrics.
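
One way to cope with sources that each need their own extraction method is to hide them behind a common reader interface and merge the results. The sketch below is only an illustration under that assumption: the reader functions, metric names, and values are hypothetical placeholders, and they do not call any real WAD, SQL Database, or Storage Analytics API.

```python
from typing import Callable, Dict, List

Reading = Dict[str, object]
Reader = Callable[[], List[Reading]]


def read_wad() -> List[Reading]:
    # Placeholder: a real reader would query the data that WAD transfers to storage.
    return [{"metric": "cpu_percent", "value": 42.0}]


def read_sql_database() -> List[Reading]:
    # Placeholder: a real reader would query the database's own diagnostic views.
    return [{"metric": "db_avg_response_ms", "value": 18.5}]


def read_storage_analytics() -> List[Reading]:
    # Placeholder: a real reader would read the storage account's metrics data.
    return [{"metric": "storage_total_requests", "value": 1200}]


def collect(readers: Dict[str, Reader]) -> List[Reading]:
    """Call every source-specific reader and tag each reading with its source."""
    merged: List[Reading] = []
    for source, reader in readers.items():
        for reading in reader():
            merged.append({"source": source, **reading})
    return merged


readings = collect({
    "wad": read_wad,
    "sql_database": read_sql_database,
    "storage_analytics": read_storage_analytics,
})
for reading in readings:
    print(reading)
```

With this shape, adding a new data source means writing one more reader function, and the code that consolidates and correlates the readings does not need to change.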

To read more on this topic, see the Remote Monitoring Basics and Troubleshooting wiki article, where we focus on an analysis approach for correlating all of these different data sources into a single view that describes the end-to-end health of the solution. To help you get there, we also provide tools (from Microsoft and from third parties) and scripts that can be used in troubleshooting sessions.

This article is the foundation for the rest of the series, which we will publish in upcoming posts. You can find the entire series on the Cloud Service Fundamentals TechNet Wiki landing page.
