Remote monitoring basics and troubleshooting in Windows Azure

Last Update:2014-12-24 Source: Internet

Author: User

Keywords Azure azure remote monitoring

In the introductory blog post on the building blocks of powerful cloud applications, we presented a series of blog posts and technical articles from the Azure CAT team that describe the Cloud Service Fundamentals in Windows Azure code project, published on the MSDN Code Gallery. The first component we want to introduce in this series is remote monitoring. It was one of the first reusable components we built while working on Windows Azure customer projects of all sizes. In fact, someone once said, "Trying to manage a complex cloud solution without the right remote monitoring infrastructure is like trying to cross a busy road blind and deaf." You do not know where problems will come from, and you cannot take precautions, so it is easy to get into trouble. Conversely, if you collect adequate monitoring and diagnostic information about the state of your application components over time, you can make informed decisions about cost and efficiency analysis, capacity planning, and operational excellence. This blog post is accompanied by a wiki article that provides a more in-depth overview of remote monitoring basics and troubleshooting.

For a system of any size in the cloud, the way you approach performance monitoring and application health differs in many practical respects from what you may be used to. Using existing tools and techniques is challenging because the cloud platform abstracts away much of the underlying infrastructure. In addition, if your solution needs to operate at scale, the amount of information generated by hundreds of web/worker roles, database partitions, and other services puts you at risk of being flooded with data that is low in quality, irrelevant, or delayed. Providing a good end-to-end experience requires constant insight into how the solution is operating, so that you can help your customers meet the SLAs they have with their own users. At the same time, you can reduce management costs by making smarter decisions about current and future resource consumption and deployment. This can be achieved only if every layer involved is taken into account: from the infrastructure (resource usage such as CPU, I/O, and memory), through the application itself (database response times, exceptions, and so on), up to business activities and KPIs.
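The layering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Layer` and `Sample` types, the metric names, the five-minute freshness window, and the fixed timestamp are all hypothetical and do not come from the original project. The sketch tags each sample with its layer and discards samples that are irrelevant (not on a watch list) or delayed (too old to act on).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class Layer(Enum):
    INFRASTRUCTURE = "infrastructure"  # CPU, I/O, memory
    APPLICATION = "application"        # database response times, exceptions
    BUSINESS = "business"              # business activities and KPIs


@dataclass(frozen=True)
class Sample:
    layer: Layer
    name: str
    value: float
    collected_at: datetime


def keep_useful(samples, watched_names, max_age, now):
    """Drop samples that nobody watches or that arrived too late to act on."""
    return [
        s for s in samples
        if s.name in watched_names and now - s.collected_at <= max_age
    ]


def group_by_layer(samples):
    """Bucket samples by layer so each layer can be reviewed on its own."""
    grouped = {layer: [] for layer in Layer}
    for s in samples:
        grouped[s.layer].append(s)
    return grouped


# Hypothetical data; the timestamp is an arbitrary fixed point for repeatability.
now = datetime(2014, 1, 1, 12, 0, tzinfo=timezone.utc)
samples = [
    Sample(Layer.INFRASTRUCTURE, "cpu_percent", 85.0, now - timedelta(minutes=1)),
    Sample(Layer.APPLICATION, "db_response_ms", 120.0, now - timedelta(minutes=2)),
    Sample(Layer.APPLICATION, "db_response_ms", 95.0, now - timedelta(hours=3)),  # delayed
    Sample(Layer.INFRASTRUCTURE, "fan_speed_rpm", 900.0, now),  # not watched
    Sample(Layer.BUSINESS, "orders_per_minute", 42.0, now - timedelta(minutes=4)),
]
watched = {"cpu_percent", "db_response_ms", "orders_per_minute"}

useful = keep_useful(samples, watched, timedelta(minutes=5), now)
by_layer = group_by_layer(useful)
for layer, items in by_layer.items():
    print(layer.value, [(s.name, s.value) for s in items])
```

Filtering before grouping keeps the per-layer views small, which matters more as the number of role instances grows.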

Both the operations team (maintaining service performance, analyzing resource consumption, handling support calls) and the development team (troubleshooting, planning new releases, and so on) benefit from processing, correlating, and consuming this information.

The remote monitoring solution itself must be designed to scale across multiple role instances, which perform the data acquisition and transformation activities and store the results in multiple raw-data SQL Azure databases. To make reporting and analysis easier, aggregated data resides in a centralized database that serves as the primary data source for predefined and custom reports and dashboards, as shown in the following simplified architecture diagram:

[Figure: simplified architecture diagram]
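The flow from several raw stores into one central summary can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: in-memory SQLite databases stand in for the SQL Azure databases, and the table names, column names, and sample values are all assumptions made for the example.

```python
import sqlite3


def make_raw_store(rows):
    """Create one in-memory raw store holding (role_instance, metric, value) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_samples (role_instance TEXT, metric TEXT, value REAL)"
    )
    conn.executemany("INSERT INTO raw_samples VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def roll_up(raw_stores, central):
    """Combine per-store partial aggregates and write them to the central store."""
    totals = {}  # metric -> [sample_count, sum_of_values]
    query = "SELECT metric, COUNT(*), SUM(value) FROM raw_samples GROUP BY metric"
    for store in raw_stores:
        for metric, count, total in store.execute(query):
            entry = totals.setdefault(metric, [0, 0.0])
            entry[0] += count
            entry[1] += total

    central.execute(
        "CREATE TABLE IF NOT EXISTS summary "
        "(metric TEXT PRIMARY KEY, sample_count INTEGER, average REAL)"
    )
    central.execute("DELETE FROM summary")  # rebuild the summary from scratch
    central.executemany(
        "INSERT INTO summary VALUES (?, ?, ?)",
        [(metric, count, total / count) for metric, (count, total) in totals.items()],
    )
    central.commit()


# Hypothetical samples written by two groups of role instances.
store_a = make_raw_store([
    ("web_0", "cpu_percent", 40.0),
    ("web_0", "cpu_percent", 60.0),
])
store_b = make_raw_store([
    ("worker_0", "cpu_percent", 80.0),
    ("worker_0", "requests_per_sec", 12.0),
])

central = sqlite3.connect(":memory:")
roll_up([store_a, store_b], central)

summary = {
    metric: (count, average)
    for metric, count, average in central.execute(
        "SELECT metric, sample_count, average FROM summary"
    )
}
print(summary)
```

Each raw store returns counts and sums rather than averages, because averages from stores with different sample counts cannot be combined correctly; the central average is computed only once all partial results are merged.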

Because the topic is so broad, we decided to split it into a mini-series of four blog posts and wiki articles:

1. Remote monitoring basics and troubleshooting

2. Application health measurement

3. Data collection pipeline

4. Reporting and analysis

In this first article, we introduce the basic principles of a remote monitoring solution, starting with the definition of the basic metrics and key indicators of our application's health. We also walk through the various information sources that you can tap into, either to feed an automated remote monitoring system or to troubleshoot less complex deployments manually.

Features such as Windows Azure Diagnostics (WAD), if properly configured, are the primary starting point for gathering and aggregating this critical information. Unfortunately, some data sources are not currently integrated with WAD (Windows Azure SQL Database, for example), so you need to use slightly different methods and APIs to extract their information. Windows Azure Storage Analytics is another good example of a source that requires specific effort to collect and consolidate its metrics.
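Because each source exposes its data in a different shape, one common approach is to put a small adapter in front of each source and convert everything into a single record format before analysis. The sketch below illustrates that idea only. The input field names (`role_instance`, `counter`, `stat`, `measure`, and so on) are hypothetical and are not the real schemas of WAD, SQL Database, or Storage Analytics, and no real service API is called.

```python
from datetime import datetime, timezone


def from_diagnostics(row):
    """Adapt a performance-counter style row (hypothetical field names)."""
    return {
        "source": "diagnostics",
        "resource": row["role_instance"],
        "metric": row["counter"],
        "value": float(row["value"]),
        "timestamp": row["time"],
    }


def from_database(row):
    """Adapt a database resource-usage row (hypothetical field names)."""
    return {
        "source": "database",
        "resource": row["database"],
        "metric": row["stat"],
        "value": float(row["reading"]),
        "timestamp": row["sampled_at"],
    }


def from_storage(row):
    """Adapt a storage metrics row (hypothetical field names)."""
    return {
        "source": "storage",
        "resource": row["account"],
        "metric": row["measure"],
        "value": float(row["amount"]),
        "timestamp": row["hour"],
    }


ADAPTERS = {
    "diagnostics": from_diagnostics,
    "database": from_database,
    "storage": from_storage,
}


def build_timeline(batches):
    """Turn (source_name, rows) batches into one list sorted by timestamp."""
    records = [
        ADAPTERS[source](row)
        for source, rows in batches
        for row in rows
    ]
    return sorted(records, key=lambda r: r["timestamp"])


def at(hour, minute):
    """Build a UTC timestamp on an arbitrary fixed day for the demo data."""
    return datetime(2014, 1, 1, hour, minute, tzinfo=timezone.utc)


timeline = build_timeline([
    ("diagnostics", [{"role_instance": "web_0", "counter": "cpu_percent",
                      "value": "91.5", "time": at(10, 5)}]),
    ("database", [{"database": "orders_db", "stat": "avg_cpu_percent",
                   "reading": 77, "sampled_at": at(10, 4)}]),
    ("storage", [{"account": "appstorage", "measure": "total_requests",
                  "amount": 1500, "hour": at(10, 0)}]),
])
for record in timeline:
    print(record["timestamp"].isoformat(), record["source"], record["metric"], record["value"])
```

Once every record shares the same keys, events from different services can be lined up on one timeline, which is the precondition for correlating them during troubleshooting.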

To read more on this topic, see the Remote Monitoring Basics and Troubleshooting wiki article, where we focus on an analysis approach that correlates all of these different data sources into a single view describing the end-to-end health of the solution. To help you get there, we also provide tools (from Microsoft and from third parties) and scripts that can be used in troubleshooting sessions.

This article lays the foundation for the rest of the series, which we will publish in upcoming posts. You can find the entire series on the Cloud Service Fundamentals TechNet Wiki landing page.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
