Cloud computing design patterns (11): Health Endpoint Monitoring pattern



Implement functional checks within an application that external tools can access through exposed endpoints at regular intervals. This pattern can help to verify that applications and services are performing correctly.

Background and issues


It is good practice, and often a business requirement, to monitor web applications, middle-tier services, and shared services to ensure that they are available and performing correctly. However, it is more difficult to monitor services running in the cloud than on-premises services. For example, you do not have full control of the hosting environment, and the services typically depend on other services provided by platform vendors and other companies.

Many factors also affect cloud-hosted applications, such as network latency, the performance and availability of the underlying compute and storage systems, and the network bandwidth between them. A service may fail completely or partially because of any of these factors. Therefore, you must verify at regular intervals that the service is performing correctly to ensure the required level of availability, which may be part of your service level agreement (SLA).

Solution


Implement health monitoring by sending requests to an endpoint on the application. The application should perform the necessary checks and return an indication of its status.

A health monitoring check typically combines two factors: the checks (if any) that the application or service performs in response to the request sent to the health verification endpoint, and the analysis of the result by the tool or framework that is performing the health verification check. The response code indicates the status of the application and, optionally, of any components or services it uses. The latency or response time check is performed by the monitoring tool or framework. Figure 1 shows an overview of the implementation of this pattern.

Figure 1: Overview of the pattern


Additional checks that the health monitoring code in the application might carry out include:
• Checking cloud storage or a database for availability and response time.
• Checking other resources or services located within the application, or located elsewhere but used by the application.

Several existing services and tools can be used to monitor Web applications by submitting a request to a configurable set of endpoints and evaluating the results against a set of configurable rules. It is relatively easy to create a service endpoint whose sole purpose is to perform some functional testing on the system.

Typical checks that monitoring tools can perform include:
• Validating the response code. For example, an HTTP response of 200 (OK) indicates that the application responded without error. The monitoring system might also check for other response codes to give a more comprehensive indication of the result.
• Checking the content of the response to detect errors, even when a 200 (OK) status code is returned. This can detect errors that affect only part of the returned web page or service response. For example, check the title of a page or look for a specific phrase that indicates the correct page was returned.
• Measuring the response time, which is a combination of the network latency and the time the application took to execute the request. An increasing value can indicate an emerging problem with the application or the network.
• Checking resources or services located outside the application, such as a content delivery network that the application uses to deliver content from global caches.
• Checking for expiration of SSL certificates.
• Measuring the response time of a DNS lookup for the URL of the application, in order to measure DNS latency and detect DNS failures.
• Validating the URL returned by the DNS lookup to ensure that the entries are correct. This can help to avoid malicious request redirection following a successful attack on the DNS server.
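
As a rough illustration of the response code, content, response time, and certificate checks in the list above, the following Python sketch evaluates a response that a monitoring tool has already fetched. The function names, the `expected_phrase` and `max_seconds` parameters, and the two second default threshold are assumptions made for this example; they do not come from any particular monitoring product.

```python
import ssl
from datetime import datetime, timezone


def evaluate_response(status, body, elapsed_seconds,
                      expected_phrase, max_seconds=2.0):
    """Return a list of problems found in one health-check response.

    An empty list means the response passed every check.
    """
    problems = []
    # 1. Validate the response code.
    if status != 200:
        problems.append(f"unexpected status code {status}")
    # 2. Check the content, even when the status code is 200 (OK).
    if expected_phrase not in body:
        problems.append(f"expected phrase {expected_phrase!r} not found")
    # 3. Compare the response time against a threshold (assumed value).
    if elapsed_seconds > max_seconds:
        problems.append(f"slow response: {elapsed_seconds:.2f}s")
    return problems


def days_until_expiry(not_after, now):
    """Whole days left before a certificate's notAfter date.

    `not_after` uses the text format found in the dict returned by
    ssl.SSLSocket.getpeercert(), e.g. "Jan  5 09:34:43 2018 GMT".
    `now` must be a timezone-aware datetime. A negative result means
    the certificate has already expired.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).days
```

A monitoring tool would obtain the status, body, and elapsed time from whichever HTTP client it uses, and the `notAfter` value from the TLS handshake, before calling these functions.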

It is also useful, where possible, to run these checks from different on-premises or hosted locations to measure and compare response times from different places. Ideally, you should monitor applications from locations close to your customers to get an accurate view of the performance from each location. In addition to providing a more robust checking mechanism, the results may influence the choice of deployment location for the application, and whether to deploy it in more than one data center.

Tests should also be run against all the service instances that customers use, to ensure that the application is working correctly for every customer. For example, if customer storage is spread across more than one storage account, the monitoring process must check all of them.

Issues and considerations


Consider the following points when deciding how to implement this pattern:
• How to validate the response. For example, is a single 200 (OK) status code sufficient to verify that the application is working correctly? While this provides the most basic measure of application availability, and is the minimum implementation of this pattern, it provides little information about the operations, trends, and possible upcoming issues in the application.


Note:

Make sure that the application returns a 200 status code only when the target resource is found and processed. In some scenarios, such as when a master page hosts the target web page, the server may send back a 200 (OK) status code instead of a 404 (Not Found) code, even when the target content page was not found.


• The number of endpoints to expose for an application. One approach is to expose at least one endpoint for the core services the application uses and another for ancillary or lower-priority services, so that different levels of importance can be assigned to each monitoring result. Also consider exposing more endpoints, such as one for each core service, to provide additional monitoring granularity. For example, a health verification check might check the database, storage, and an external geocoding service that the application uses, each of which requires a different level of uptime and response time. The application may still be healthy if the geocoding service, or some other background task, is unavailable for a few minutes.
• Whether to use the same endpoint for monitoring as for general access, but with a specific path designed for health verification checks; for example, /HealthCheck/{GUID}/ on the general access endpoint. This allows the monitoring tool to run some functional tests within the application, such as adding a new user registration, signing in, and placing a test order, while also verifying that the general access endpoint is available.
• The type of information to collect in the service in response to monitoring requests, and how to return this information. Most existing tools and frameworks look only at the HTTP status code that the endpoint returns. To return and validate additional information, you may need to create a custom monitoring utility or service.
• How much information to collect. Excessive processing during the check can overload the application and affect other users, and the time it takes may exceed the timeout of the monitoring system, which then marks the application as unavailable. Most applications include instrumentation, such as error handlers and performance counters, that logs performance and detailed error information. This may be sufficient, instead of returning additional information from a health verification check.
• How to configure security for the monitoring endpoints to protect them from public access, which might expose the application to malicious attacks, risk the exposure of sensitive information, or attract denial-of-service (DoS) attacks. Typically, this should be done in the application configuration so that it can be updated easily without restarting the application. Consider using one or more of the following techniques:
    ◦ Secure the endpoint by requiring authentication. This can be achieved by using an authentication security key in the request header or by passing credentials with the request, provided that the monitoring service or tool supports authentication.
    ◦ Use an obscure or hidden endpoint. For example, expose the endpoint on a different IP address from the one used by the default application URL, configure the endpoint on a non-standard HTTP port, and/or use a complex path to the test page. It is usually possible to specify additional endpoint addresses and ports in the application configuration, and to add entries for these endpoints to the DNS server if required, to avoid having to specify the IP address directly.
    ◦ Expose a method on an endpoint that accepts a parameter, such as a key value or an operation mode value. Depending on the value supplied for this parameter when a request is received, the code can perform a specific test or set of tests, or return a 404 (Not Found) error if the parameter value is not recognized. The recognized parameter values can be set in the application configuration.


Note:

DoS attacks are likely to have less impact on a separate endpoint that performs basic functional tests without compromising the operation of the application. Ideally, avoid using a test that might expose sensitive information. If you must return information that might be useful to an attacker, consider how you will protect the endpoint and the data from unauthorized access. In this case, relying on obscurity alone is not enough. You should also consider using an HTTPS connection and encrypting any sensitive data, although this increases the load on the server.
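
The authentication key and parameter techniques described above can be sketched in a few lines. The following Python example is framework-neutral and rests on several assumptions: the header name `X-Health-Key`, the function name `authorize_health_request`, and the idea that the expected key has already been loaded from the application configuration are all illustrative.

```python
import hmac


def authorize_health_request(headers, expected_key):
    """Return the HTTP status code to use before running any health checks.

    200 means the caller supplied the expected key and the checks may run;
    404 (Not Found) hides the endpoint from callers that did not.
    `headers` is a plain dict of request headers (an assumption; real
    frameworks usually offer a case-insensitive mapping).
    """
    supplied = headers.get("X-Health-Key", "")
    # An empty configured key never authorizes anyone.
    if not expected_key:
        return 404
    # compare_digest avoids stopping at the first differing character,
    # which reduces what response timing reveals about the key.
    if hmac.compare_digest(supplied.encode(), expected_key.encode()):
        return 200
    return 404
```

Returning 404 rather than 401 or 403 is a design choice in this sketch: it keeps the endpoint obscure as well as authenticated, so the two techniques reinforce each other.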


• How to access an endpoint that is secured using authentication. Not all tools and frameworks can be configured to include credentials with the health verification request. For example, the built-in health verification features of Microsoft Azure cannot provide authentication credentials. Some third-party alternatives that can are Pingdom, Panopta, NewRelic, and Statuscake.
• How to ensure that the monitoring agent is performing correctly. One approach is to expose an endpoint that simply returns a value from the application configuration, or a random value, that can be used to test the agent.



Note:

Also ensure that the monitoring system performs checks on itself, such as a self-test and a built-in test, to prevent it from issuing false positive results.
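
The earlier consideration about assigning different levels of importance to monitoring results can be sketched as follows. This Python example is illustrative only: the `overall_status` function, the `critical` flag, and the choice to report a degraded but still available application with a 200 status and a text summary are assumptions, not behaviour prescribed by the pattern.

```python
def overall_status(results):
    """Combine per-service check results into a status code and summary.

    `results` maps a service name to a (passed, critical) pair. Any failed
    critical service makes the application unhealthy (500); failed
    non-critical services are only reported in the summary.
    """
    failed_critical = sorted(
        name for name, (passed, critical) in results.items()
        if critical and not passed)
    failed_other = sorted(
        name for name, (passed, critical) in results.items()
        if not critical and not passed)
    if failed_critical:
        return 500, "unhealthy: " + ", ".join(failed_critical)
    if failed_other:
        return 200, "degraded: " + ", ".join(failed_other)
    return 200, "healthy"
```

With this approach, a geocoding service that is marked as non-critical can be down for a few minutes without the monitoring tool treating the whole application as unavailable.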

When to use this pattern


This pattern is ideally suited for:
• Monitoring websites and web applications to verify availability.
• Monitoring websites and web applications to check for correct operation.
• Monitoring middle-tier or shared services to detect and isolate a failure that could disrupt other applications.
• Complementing existing instrumentation within the application, such as performance counters and error handlers. Health verification checking does not replace the requirement for logging and auditing in the application. Instrumentation can provide valuable information for an existing framework that monitors counters and error logs to detect failures or other issues. However, it cannot provide information if the application is unavailable.

Example


The following code examples, taken from the HealthCheckController class in the HealthEndpointMonitoring.Web project (included in the samples that can be downloaded for this guide), demonstrate exposing an endpoint for performing a range of health checks.
The CoreServices method, shown below, performs a series of checks on the services used in the application. If all of the tests run without error, the method returns a 200 (OK) status code. If any of the tests throws an exception, the method returns a 500 (Internal Error) status code. The method could optionally return additional information when an error occurs, if the monitoring tool or framework is able to make use of it.

    // Requires the System, System.Diagnostics, System.Net,
    // and System.Web.Mvc namespaces.
    public ActionResult CoreServices()
    {
      try
      {
        // Run a simple check to ensure the database is available.
        DataStore.Instance.CoreHealthCheck();

        // Run a simple check on our external service.
        MyExternalService.Instance.CoreHealthCheck();
      }
      catch (Exception ex)
      {
        Trace.TraceError("Exception in basic health check: {0}", ex.Message);

        // This can optionally return different status codes based on the exception.
        // Optionally it could return more details about the exception.
        // The additional information could be used by administrators who access the
        // endpoint with a browser, or using a ping utility that can display the
        // additional information.
        return new HttpStatusCodeResult((int)HttpStatusCode.InternalServerError);
      }

      return new HttpStatusCodeResult((int)HttpStatusCode.OK);
    }

The ObscurePath method shows how you can read a path from the application configuration and use it as the endpoint for tests. This example also shows how you can accept an ID as a parameter and use it to check for valid requests.

    public ActionResult ObscurePath(string id)
    {
      // The id could be used as a simple way to obscure or hide the endpoint.
      // The id to match could be retrieved from configuration and, if matched,
      // perform a specific set of tests and return the result. If not matched it
      // could return a 404 (Not Found) status.

      // The obscure path can be set through configuration to hide the endpoint.
      var hiddenPathKey = CloudConfigurationManager.GetSetting("Test.ObscurePath");

      // If the value passed does not match that in configuration, return 404 (Not Found).
      if (!string.Equals(id, hiddenPathKey))
      {
        return new HttpStatusCodeResult((int)HttpStatusCode.NotFound);
      }

      // Else continue and run the tests...
      // Return results from the core services test.
      return this.CoreServices();
    }


The TestResponseFromConfig method shows how you can expose an endpoint that returns a status code taken from a specified configuration setting.

    public ActionResult TestResponseFromConfig()
    {
      // Health check that returns a response code set in configuration for testing.
      var returnStatusCodeSetting = CloudConfigurationManager.GetSetting(
                                                              "Test.ReturnStatusCode");

      // Fall back to 200 (OK) if the setting is missing or is not a number.
      int returnStatusCode;
      if (!int.TryParse(returnStatusCodeSetting, out returnStatusCode))
      {
        returnStatusCode = (int)HttpStatusCode.OK;
      }

      return new HttpStatusCodeResult(returnStatusCode);
    }

Monitoring endpoints in Azure hosted applications


Some options for monitoring endpoints in Azure applications are:
• Use the built-in features of Microsoft Azure, such as Management Services or Traffic Manager.
• Use a third-party service or a framework such as Microsoft System Center Operations Manager.
• Create a custom utility, or a service that runs on your own server or on a hosted server.

Note:

Although Azure provides a reasonably comprehensive set of monitoring options, you may decide to use additional services and tools to provide extra information.


Azure Management Services provides a comprehensive built-in monitoring mechanism based on alert rules. The Alerts section of the Management Services page in the Azure management portal lets you configure up to ten alert rules per subscription for your services. These rules specify a condition and a threshold value for a service, such as CPU load or the number of requests or errors per second, and the service can automatically send email notifications to the addresses you define in each rule.

The conditions you can monitor vary depending on the hosting mechanism you choose for your application (such as Web Sites, Cloud Services, Virtual Machines, or Mobile Services), but all of these include the ability to create an alert rule that uses a web endpoint you specify in the settings for your service. This endpoint should respond in a timely way so that the alert system can detect that the application is operating correctly.

Note:

For more information about creating monitoring alerts, see Managing Services on MSDN.


If you host your application in Azure Cloud Services web and worker roles, or in Virtual Machines, you can take advantage of a built-in Azure service called Traffic Manager. Traffic Manager is a routing and load-balancing service that can distribute requests to specific instances of your Cloud Services hosted application based on a range of rules and settings.

In addition to routing requests, Traffic Manager regularly pings a URL, port, and relative path that you specify to determine which instances of the application defined in its rules are active and responding to requests. If it detects a 200 (OK) status code, it marks the application as available; any other status code causes Traffic Manager to mark the application as offline. You can view the status in the Traffic Manager console, and configure the rule to reroute requests to other instances of the application that are responding.

However, keep in mind that Traffic Manager waits only ten seconds to receive a response from the monitoring URL. Therefore, you should make sure that your health verification code runs within this time, allowing for the network latency of the round trip from Traffic Manager to your application and back again.
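
One way to keep a health check inside a fixed time budget is to run the individual checks on worker threads and stop waiting once the budget is spent. The Python sketch below assumes a hypothetical `run_checks_within` helper and a caller-chosen budget that leaves headroom under the ten second limit; neither is part of Traffic Manager.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def run_checks_within(checks, budget_seconds):
    """Run the callables in `checks` in parallel and return a status code.

    Returns 200 only if every check finished without raising inside the
    budget; otherwise returns 500 without waiting for the stragglers.
    """
    if not checks:
        return 200
    pool = ThreadPoolExecutor(max_workers=len(checks))
    try:
        futures = [pool.submit(check) for check in checks]
        done, not_done = wait(futures, timeout=budget_seconds)
        if not_done:
            return 500  # at least one check is still running
        if any(f.exception() is not None for f in done):
            return 500  # at least one check raised an exception
        return 200
    finally:
        # Do not block the response on checks that are still running;
        # they keep running in the background until they finish.
        pool.shutdown(wait=False)
```

For example, `run_checks_within([check_database, check_storage], budget_seconds=8)` would leave roughly two seconds for the network round trip; `check_database` and `check_storage` stand in for the application's own checks.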

Note:

For more information about using Traffic Manager to monitor your applications, see Microsoft Azure Traffic Manager on MSDN. Traffic Manager is also discussed in the Multiple Datacenter Deployment Guidance.

This article is translated from MSDN: http://msdn.microsoft.com/en-us/library/dn589789.aspx
