Building a Front-End Performance Monitoring System

Source: Internet
Author: User
Tags: chrome developer tools, webpagetest

Introduction

At an earlier W3CTech event featuring Baidu's front-end team FEX, I "boasted" that within seven days of the talk you could build your own front-end performance monitoring system. Having said it, I have to deliver. The previous article, "The Beauty of Front-End Data", should have given you a basic understanding of front-end data; what follows is a detailed description of performance data and how to monitor it.

Getting started

Performance in this article mainly means web page load performance. Not sure what that covers? Don't worry: over the next few "days", follow me into the world of front-end performance.

Day 1 Why monitor performance?

"If you cannot measure it, you cannot improve it" - William Thomson

This is the most basic question: why care about and monitor front-end performance? For a company, performance is to some extent directly tied to revenue. There is a good deal of published research on this:

| Company | Performance | Benefits |
|---------|-------------|----------|
| Google | 400 ms delay | Search volume down 0.59% |
| Bing | 2 s delay | Revenue down 4.3% |
| Yahoo | 400 ms delay | Traffic down 5-9% |
| Mozilla | Page load time reduced by 2.2 s | Downloads up 15.4% |
| Netflix | Gzip enabled | Performance up 13.25%, bandwidth down 50% |

Data sources: http://www.slideshare.net/bitcurrent/impact-of-web-latency-on-conversion-rates and http://stevesouders.com/docs/jsdayit-20110511.pptx

Why does performance affect a company's revenue? The root cause is that performance affects the user experience. Loading delays, sluggish interactions and the like all degrade it. Mobile users in particular have little tolerance for slow page responses and dropped connections. Imagine opening a page on your phone to read a message and waiting ages for it to load: you would probably leave and go to another page. Google also uses page load speed as a ranking factor, and there are many studies on how load speed affects user experience and SEO.

Important as performance is, it inevitably gets overlooked during development, and it degrades as the product iterates. This is especially true on mobile, where the network has always been a major bottleneck while pages keep getting larger and more complex. No simple golden rule can take care of performance optimization once and for all. We need a performance monitoring system that monitors, evaluates and alerts on page performance, finds the bottlenecks, and guides the optimization work.

Day 2 What tools are available?

To do a good job, one must first sharpen one's tools

There are many mature, excellent tools for evaluating and monitoring page performance, and making good use of them multiplies your effort. Here is a brief introduction to a few commonly used ones:

PageSpeed

PageSpeed is a tool developed by Google for analyzing and optimizing web pages; it can be used as a browser plugin. It checks a site against a set of optimization rules and gives detailed recommendations for each rule that fails. YSlow is a similar tool. The GTmetrix website is recommended for viewing the results of several analysis tools side by side.

WebPageTest

WebPageTest is an excellent open-source tool for testing web front-end performance. You can use the online version or host it yourself. There are also performance test platforms in China built on WebPageTest; Ali Test is recommended (the example below uses it).

With WebPageTest you can examine in detail the waterfall chart, performance score, element breakdown, view analysis and other data for a page load. The view analysis is the most intuitive: it shows a screenshot of the page at each stage of loading.

The view analysis makes two important moments in a page visit visible: white screen time and first screen time, that is, how long it takes before the user sees any content on the page, and how long it takes before the first screen has finished rendering (including images and other elements). These two moments directly determine how long users wait before they see the information they came for. Google's optimization recommendations likewise advise reducing CSS and JS that the first screen does not need, so that the first screen renders as early as possible.

PhantomJS

PhantomJS makes it easy to automate monitoring. It is a headless WebKit browser with a server-side JavaScript API, on which web automation tests are easy to build. PhantomJS takes some programming effort but is more flexible in return. The official documentation already includes a complete example that captures a page load as a HAR file; see that documentation for details. There is also plenty of material on the tool in Chinese. BerserkJS, a similar tool developed by @貘吃馍香 at Sina, is also very good and thoughtfully provides first screen measurement; see the author's article for details.

Day 3 Starting real-user performance monitoring in production

Play to the strengths, avoid the weaknesses

Some readers will ask: with so many excellent tools available, why monitor the real performance experienced by users in production?

We found that simulated tests run with tools deviate somewhat from reality and sometimes fail to reflect performance fluctuations. Besides basic metrics such as white screen and first screen time, product lines also care about product-specific metrics, such as when ads become visible, when search becomes usable, or when check-in becomes usable. These depend directly on the page's JS loading and are hard to simulate with tools.

To continuously monitor how users on different networks experience the page and whether each page function is available, we chose to embed JS in the page to monitor real-user performance in production, with the existing analysis tools in a supporting role. Together they form a complete, diversified monitoring system that gives product lines reliable data for evaluation and optimization.

The following table gives a simple comparison of the two monitoring approaches:

| Type | Advantages | Disadvantages | Examples |
|------|------------|---------------|----------|
| Non-intrusive | Complete metrics, active monitoring from a client, can monitor competitors | Cannot tell how many users are affected by a performance problem, small samples are prone to distortion, cannot monitor complex applications or fine-grained functions | PageSpeed, PhantomJS, UAQ |
| Intrusive | Real data from a large user base, can monitor complex applications and business functions, user clicks and region rendering | Requires inserting a script to collect data, network metrics are incomplete, cannot monitor competitors | DP, Google Analytics |

Day 4 How do I collect performance data?

Monitor the user's pain points

Which metrics should be monitored in production? How can they best reflect what users perceive?

Users wonder why the page will not open, why a button cannot be clicked, why images load so slowly. Engineers tend to focus on browser load-process metrics such as DNS lookup, TCP connection and server response. Starting from the users' pain points, we extract four key metrics from the browser's loading process: white screen time, first screen time, user-operable time and total download time (defined in the previous article). How are these measured?

Determining the starting point for statistics

Measurement should start when the user enters the URL or clicks a link, since that captures the user's full wait. If a large share of your users run modern browsers, the Navigation Timing interface gives you both the starting point and the time spent in each stage of loading. Alternatively, a timestamp can be recorded in a cookie; note that the cookie method only captures navigations within your own site.
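The choice of starting point described above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the `timing` argument stands in for `window.performance.timing`, the `cookie` argument for `document.cookie`, and the cookie name `_nav_start` is a hypothetical name that the previous page would write on click.

```javascript
// Pick the measurement starting point (epoch milliseconds).
// Prefer Navigation Timing; fall back to a timestamp stored in a cookie.
// The cookie name `_nav_start` is hypothetical, chosen for illustration only.
function getStartTime(timing, cookie) {
  if (timing && timing.navigationStart > 0) {
    return timing.navigationStart;
  }
  const match = /(?:^|;\s*)_nav_start=(\d+)/.exec(cookie || "");
  return match ? Number(match[1]) : null; // null: no starting point available
}
```

In a browser this might be called as `getStartTime(window.performance && window.performance.timing, document.cookie)`. A `null` result means the visit cannot be measured from the true start, which is expected for first visits in browsers without Navigation Timing.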

Measuring white screen time

White screen time is the moment users first see content, also called first render time. Recent versions of Chrome expose a firstPaintTime interface for it, but most browsers do not, so another approach is needed. Looking closely at the WebPageTest view analysis, the white screen ends around the time the external resources in the head finish loading, because the browser only starts rendering the page after it has loaded and parsed the head resources. On this basis, white screen time can be approximated by the moment the head resources finish loading. It is not exact, but it accounts for the main factors that affect the white screen: time to first byte and head resource load time.

How do we measure when the head resources finish loading? Inline JS in the head normally has to wait for the preceding JS and CSS to load before it executes. So can we add an inline script at the very bottom of the head to record the moment the head resources finish loading? A simple example tests this:

    <!DOCTYPE html>
    <html>
    <head>
      <meta charset="UTF-8"/>
      <script>
        // Test start time; in real measurements the starting point is the DNS query
        var start_time = +new Date;
      </script>
      <!-- This JS returns after 3s -->
      <script src="script.php"></script>
      <script>
        var end_time = +new Date;              // end time
        var headtime = end_time - start_time;  // head resource load time
        console.log(headtime);
      </script>
    </head>
    <body>
      <p>Until the head resources have loaded, the page stays white</p>
      <p>script.php is set to return after a simulated 3s; the inline JS at the bottom of the head waits for it before executing</p>
      <p>Replacing script.php with JS that runs a long loop has the same effect</p>
    </body>
    </html>

The test shows that the measured head load time is close to the download time of the head resources, and when script.php is replaced by long-running JS, the measurement is likewise taken only after that JS finishes. So the method works (for the underlying reasons, see material on browser rendering and JS single-threaded execution).
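Putting the two sources together, white screen time could be derived along these lines. This is a sketch under assumptions: all arguments are epoch milliseconds, the function name is illustrative, and the caller is responsible for converting any browser-provided first paint value into milliseconds before passing it in.

```javascript
// Approximate white screen time in milliseconds.
// startTime:      measurement starting point
// headEndTime:    timestamp recorded by the inline script at the bottom of <head>
// firstPaintTime: optional browser-reported first paint; used when available
function whiteScreenTime(startTime, headEndTime, firstPaintTime) {
  const end = firstPaintTime > 0 ? firstPaintTime : headEndTime;
  return Math.max(0, end - startTime);
}
```

The head-based value is the fallback, so the same metric can be reported from every browser while more precise data is used where it exists.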

Measuring first screen time

First screen time is more complicated to measure because it involves images, asynchronous rendering and other factors. The load view shows that images are the main factor affecting the first screen, so the time at which first screen rendering completes can be obtained from the load times of the images within the first screen. The process is:

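Call the API at the first-screen position to start measuring -> bind the load event of every image in the first screen -> once the page has loaded, check which images lie within the first screen and find the slowest one -> first screen time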
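The final step of that flow, picking the slowest image inside the first screen, might look like the sketch below. The record shape (`top`, `loadedAt`) and the function name are assumptions for illustration: in a browser, `top` could come from `getBoundingClientRect().top` plus the scroll offset, and `loadedAt` from a timestamp taken in each image's load handler.

```javascript
// Compute first screen time from per-image load records.
// images:         [{ top: px from page top, loadedAt: epoch ms or null }]
// viewportHeight: height of the first screen in px
// startTime:      measurement starting point (epoch ms)
// fallbackTime:   timestamp of the statistics JS, used when no image qualifies
function firstScreenTime(images, viewportHeight, startTime, fallbackTime) {
  const loadedTimes = images
    .filter((img) => img.top < viewportHeight && img.loadedAt != null)
    .map((img) => img.loadedAt);
  // The slowest image inside the first screen decides when it is complete.
  const end = loadedTimes.length > 0 ? Math.max(...loadedTimes) : fallbackTime;
  return end - startTime;
}
```

Images below the fold are ignored even if they load late, which is what keeps this metric distinct from total download time.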

This is the basic logic for synchronously loaded pages. A few additional points:

    • If the page contains an iframe, its load time must be taken into account as well
    • GIF images can fire the load event repeatedly in IE and need to be excluded
    • With asynchronous rendering, the first screen should be calculated after the asynchronously fetched data has been inserted
    • Important CSS background images can be measured by requesting the image URL from JS (the browser will not download it twice)
    • If there are no images, use the execution time of the statistics JS as the first screen time, i.e. the time the text appears

Measuring user-operable time and total download time

User-operable time is by default measured at domready, because event handlers are usually bound at that point. Pages that load JS asynchronously through a module loader can explicitly mark in code the time at which important JS has loaded; product-specific metrics are measured the same way.

Total download time is by default measured at onload, which covers all synchronously loaded resources. If the page does a lot of asynchronous rendering, the time at which all asynchronous rendering completes can be used as the total download time instead.
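Explicit marking of important moments could be done with a small helper like the one below. This is a hypothetical sketch: the class name, the metric names and the injectable clock are all assumptions, the clock being there only so the helper can be exercised outside a browser.

```javascript
// Record named milestones relative to the measurement starting point.
// `now` is injectable for testing; it defaults to Date.now.
class MetricMarker {
  constructor(startTime, now = Date.now) {
    this.startTime = startTime;
    this.now = now;
    this.marks = {};
  }

  // Record the elapsed time for `name` once; later calls return the first value.
  mark(name) {
    if (!(name in this.marks)) {
      this.marks[name] = this.now() - this.startTime;
    }
    return this.marks[name];
  }
}
```

An asynchronously loaded module would call something like `marker.mark("search_ready")` as soon as it has bound its handlers, and the collected `marks` object would be sent along with the other metrics.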

Network metrics

Determining the network type

On mobile, the network is the most important factor in page load speed, and optimization has to be tailored to the network, for example by serving a lite version to 2G users. However, the web has no interface for obtaining the user's network type. To get it, we can use speed tests to determine which network each IP range corresponds to. For the speed test itself, Facebook's approach is a classic reference. Analysis of the speed test results shows that users' load rates fall into clearly separated ranges.

Each range corresponds to a different network type; after calibration with client-side tests, the accuracy can exceed 95%. With this library mapping IP ranges to rates, user data can be analyzed by IP to determine the user's network type.
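The mapping from a measured rate to a network type can be sketched as below. The band boundaries in `EXAMPLE_BANDS` are invented placeholders, not values from the article; real boundaries would have to come from the observed rate distribution and client-side calibration.

```javascript
// Throughput of a test download, in kilobits per second.
// bits per millisecond is numerically equal to kilobits per second.
function throughputKbps(bytes, durationMs) {
  return (bytes * 8) / durationMs;
}

// Hypothetical bands, ordered by ascending upper bound. Placeholder values only.
const EXAMPLE_BANDS = [
  { type: "2G", maxKbps: 150 },
  { type: "3G", maxKbps: 2000 },
  { type: "WiFi", maxKbps: Infinity },
];

// Return the first band whose upper bound covers the measured rate.
function classifyNetwork(kbps, bands) {
  for (const band of bands) {
    if (kbps <= band.maxKbps) return band.type;
  }
  return "unknown";
}
```

For example, `classifyNetwork(throughputKbps(50000, 4000), EXAMPLE_BANDS)` classifies a 50 KB download that took 4 seconds. Aggregating such results per IP range is what would build the IP library described above.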

Measuring network time

Network timing data can be obtained from the Navigation Timing interface; the similar Resource Timing interface gives the load time of every static resource on the page. With these it is easy to get the time spent on DNS, TCP, first byte, HTML transfer and so on.
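As a rough sketch, the stages can be derived from a Navigation Timing object by subtracting pairs of its attributes. The input attribute names are those of `performance.timing`; the output key names and the function name are assumptions made for this example.

```javascript
// Derive per-stage durations (ms) from a Navigation Timing style object.
function networkPhases(t) {
  return {
    dns: t.domainLookupEnd - t.domainLookupStart,
    tcp: t.connectEnd - t.connectStart,
    firstByte: t.responseStart - t.requestStart,
    htmlTransfer: t.responseEnd - t.responseStart,
    domReady: t.domContentLoadedEventStart - t.navigationStart,
    onload: t.loadEventStart - t.navigationStart,
  };
}
```

In a browser that supports the interface this might be called as `networkPhases(window.performance.timing)` after the load event has fired, since the later attributes are zero until then.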

The sections above focus on data collection, which is the most important part of the system: only if the data truly reflects what users perceive can we prescribe the right remedy and improve the user experience. Once collected, the data can be reported after the page has finished loading, for example:

http://xxx.baidu.com/tj.gif?dns=100&ct=210&st=300&tt=703&c_dnslookup=0&c_connecting=0&c_waiting=296&c_receiving=403&c_fetch_dns=0&c_nav_dns=75&c_nav_fetch=75&drt=1423&drt_end=1436&lt=3410&c_nfpt=619&nav_type=0&redirect_count=0&_screen=1366*768|1366*728&product_id=10&page_id=200&_t=1399822334414
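A request of that shape could be assembled with a helper like this one. It is a minimal sketch: the function name and the endpoint in the usage note are placeholders, and only a few of the parameters from the example URL are shown.

```javascript
// Build a beacon URL by serializing the metrics object into a query string.
function buildBeaconUrl(endpoint, metrics) {
  const query = Object.keys(metrics)
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(metrics[key]))
    .join("&");
  return endpoint + "?" + query;
}
```

In the page, the report is typically sent by assigning the URL to an image, for instance `new Image().src = buildBeaconUrl("http://stats.example.com/tj.gif", { dns: 100, ct: 210 })`, so that no response handling is needed.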

Day 5 How do I analyze performance data?

Let the data speak

As described in the previous article, data can be analyzed along multiple dimensions. Processing large volumes of data calls for Hadoop, Hive and the like; for an ordinary site, any back-end language will do.

Mean value and distribution

The mean and the distribution are the two most common ways of processing the data, because they directly show a metric's trend and spread and lend themselves to evaluation, bottleneck detection and alerting. Outliers should be removed during processing, for example dirty data that clearly exceeds a threshold.
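A minimal sketch of this processing step follows. The threshold and bucket edges are parameters chosen by the caller; the function name and the returned field names are assumptions for illustration.

```javascript
// Summarize one metric: drop outliers, then compute the mean and a histogram.
// samples:  array of measured values (ms)
// maxValid: values above this (or below 0) are treated as dirty data
// edges:    ascending bucket boundaries, e.g. [1000, 3000] gives
//           three buckets: <1000, 1000 to <3000, >=3000
function summarize(samples, maxValid, edges) {
  const valid = samples.filter((v) => v >= 0 && v <= maxValid);
  const mean = valid.length
    ? valid.reduce((sum, v) => sum + v, 0) / valid.length
    : null;
  const counts = new Array(edges.length + 1).fill(0);
  for (const v of valid) {
    let i = edges.findIndex((edge) => v < edge);
    if (i === -1) i = edges.length; // beyond the last edge
    counts[i] += 1;
  }
  return { mean, counts, dropped: samples.length - valid.length };
}
```

Reporting `dropped` alongside the mean makes it visible when the threshold is discarding more data than expected.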

There is a good deal of research on how to judge elapsed time. For example, three basic time limits have been proposed:

    • 0.1 second: the smallest delay a user can perceive; operations completed within this limit feel smooth and instantaneous
    • 1 second: a response completed within 1 second does not interrupt the user's flow of thought. The user notices the delay, but operations that complete in 0.1 to 1 second need no explicit loading indicator
    • 10 seconds: beyond 10 seconds users can no longer stay focused and may leave to do something else

Based on such industry research, combined with the characteristics of each metric, we define evaluation intervals for the distribution of each metric.

The evaluation intervals make it easy to understand the current state of performance and to react to fluctuations in the performance trend.

Multidimensional Analysis

To uncover potential performance bottlenecks, the data needs to be analyzed along multiple dimensions. On mobile, for example, the most important dimension is the network, so besides the overall figures the data should also be broken down by network type. Common dimensions include operating system, browser, region and carrier. You can also define dimensions based on the characteristics of your own product, such as page length distribution or lite version versus rich version.

Note that more dimensions is not always better; choose them according to the characteristics of the product and the device. The purpose of dimensions is to make performance bottlenecks easier to find.
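Breaking a metric down by one dimension can be sketched as a simple group-and-average. The record fields used in the usage note (`network`, `firstScreen`) are hypothetical names, as is the function name.

```javascript
// Average one metric per value of one dimension.
// records:   array of objects, one per reported page view
// dimension: field to group by, e.g. "network"
// metric:    numeric field to average, e.g. "firstScreen"
function meanByDimension(records, dimension, metric) {
  const groups = new Map();
  for (const record of records) {
    const key = record[dimension];
    const group = groups.get(key) || { sum: 0, count: 0 };
    group.sum += record[metric];
    group.count += 1;
    groups.set(key, group);
  }
  const result = {};
  for (const [key, group] of groups) {
    result[key] = group.sum / group.count;
  }
  return result;
}
```

Calling `meanByDimension(records, "network", "firstScreen")` would give one average per network type; comparing those against the overall mean is one way a slow segment shows up.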

Aside: I have seen people on Weibo say they would like to do monitoring but their company has no log server. A dedicated log server is not needed; it is enough to keep the access log for the statistics request. If the site's own servers cannot manage even that, create a new application in the Baidu Developer Center, write a simple web service that parses the reported data and saves it to Baidu Cloud's free database, and then process the previous day's data with MySQL each day. For an ordinary site that samples its performance data, this should be no problem. Please call me Lei Feng.
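Extracting the reported metrics from a saved access log might look like the sketch below. It assumes a common-log-style line containing the request `GET /tj.gif?...`; the actual log format depends on the web server configuration, and the function name is illustrative.

```javascript
// Parse the query string of a tj.gif request out of one access log line.
// Returns an object of string values, or null if the line is not a beacon.
function parseBeacon(logLine) {
  const match = /GET\s+\S*tj\.gif\?(\S+)/.exec(logLine);
  if (!match) return null;
  const record = {};
  for (const pair of match[1].split("&")) {
    const [key, value = ""] = pair.split("=");
    record[decodeURIComponent(key)] = decodeURIComponent(value);
  }
  return record;
}
```

All values come back as strings, so numeric metrics need converting (for example with `Number`) before they are averaged or bucketed.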

Day 6 How to use monitoring data to solve problems?

Find the bottleneck, prescribe the remedy

For charting, Highcharts is one of the better-known libraries, and ECharts, developed by Baidu, is also very good. Whichever tool you use, the key is to make the report focused and easy to read.

Before building a report, ask yourself a few questions: how can readers see the current state and likely problems at a glance, what should be emphasized, what can be removed, what are the readers' habits, and so on.

With data that truly reflects what users perceive, broken down by business function and backed by detailed network and other auxiliary data, solving front-end performance problems becomes much easier. The monitoring system continuously feeds back the state of production traffic; we pick a solution based on the current evaluation and the bottlenecks found, optimize, and then adjust according to the feedback. With this loop in place, performance optimization is no longer a hard problem.

How do you choose an optimization approach? That is a fairly big topic, but fortunately there is plenty of experience to draw on. The appendix collects some learning materials on performance that you can read as needed.

Day 7 Summary

After these seven "days" of effort we can build a small but elegant front-end performance monitoring system. But this is only the beginning: there is a lot of value still to be mined from front-end data. Performance optimization is also a subject that requires study. To create a smooth experience and make users happier, go build your own front-end data platform!

This article is based on my talk at the W3CTech Baidu front-end FEX session; the slides and video of the talk are available online.

gorgeous non-split line ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Bonus: a collection of front-end performance learning materials

Performance guidelines ★★★★★

    • Yahoo performance rules, with articles in Chinese
    • Google performance optimization, recommended articles

Analysis tools ★★★

Entry

    • PageSpeed: based on Google's performance guidelines; runs as a browser plugin
    • YSlow: a checking tool based on Yahoo's performance rules; runs as a browser plugin
    • PageCheck: developed internally at Baidu; complete metrics, supports automated runs

Advanced

    • WebPageTest: shows page load data such as the waterfall chart; an essential tool for advanced work
    • Chrome developer tools: powerful and worth learning
    • PhantomJS: a powerful analysis tool, the expert's Swiss Army knife
    • jsPerf: a site for benchmarking JS execution performance; try it and you will see

Browsers and HTML standards ★★

Entry

    • Browser caching mechanism
    • Navigation Timing and Resource Timing: essential knowledge; search Google for related articles
    • How DNS resolution works
    • High Performance Browser Networking, translated series

Advanced

    • SPDY protocol: HTTP 2.0, which is based on it, is about to be released
    • How browsers render: fairly hard to understand, but a classic
    • Chrome implementation principles learning guide: a rewarding summary by an expert

Development in practice ★★★★

General

    • High Performance JavaScript
    • Writing Fast, Memory-Efficient JavaScript
    • Understanding and Solving Internet Explorer Leak Patterns
    • Modular loading: FIS, SeaJS. FIS has a complete static resource management and optimization scheme; recommended.
    • Best practices for front-end performance optimization

Animation and rendering

    • requestAnimationFrame
    • Optimizing for 16 ms
    • Avoiding repaint and reflow caused by CSS and JS

Mobile development

    • Improving the Performance of Your HTML5 App
    • Steve Souders
    • Creating High-Performance Mobile Websites
    • HTML5 Techniques for Optimizing Mobile Performance
    • Mobile website optimization guide

Performance monitoring ★★★★

    • Choosing monitoring metrics
    • Complete Web Monitoring: Web Performance at eMetrics
    • Building a front-end performance monitoring platform with BerserkJS
    • NY Web Perf Meetup: Peeling the Web Performance Onion

Related conferences ★★★

    • Velocity: one of the best-known international conferences in the industry
    • Google I/O
    • QCon

Recommended blogs ★★

    • Web Performance Today
    • Perfplanet
    • stevesouders.com
    • Site-performance-and-optimization
