From entering a URL in the browser's address bar to the Web page being fully opened, what happens in between?

Source: Internet
Author: User









This is a classic interview question. I used to think I was the only one who liked it, but later I found on Weibo that other technical experts ask it too.






What this question actually tests is not any specific technology, but an understanding of the Internet as a whole and of the process behind it.






I will first lay out the answer points as I understand or expect them, and then explain the purpose and meaning of the question. The whole process divides into three large parts: the client, the network transport layer, and the server side.






Start from the URL that was typed in. The client resolves this URL first. If the browser has hooks installed, it may judge and respond to the input directly. For example, most third-party browsers in China (the kind that wrap IE in a shell) hijack the case where a keyword typed into the address bar should jump to Microsoft's search page. So the first step is the browser's own judgment and possible hijacking of the URL. The second step is the check against the local hosts file. The hosts file is also a favorite target of Trojans and rogue software: once they modify it, the hao123 you open is hijacked, and you can hardly tell where the change was made.
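
To make the hosts-file step concrete, here is a minimal sketch of the check that happens before any DNS query goes out. The function names `parse_hosts` and `lookup_hosts`, the sample file content, and the 203.0.113.7 address are all made up for illustration; a real operating system does this inside its resolver library.

```python
def parse_hosts(text):
    """Parse hosts-file text into a {hostname: ip} mapping (first entry wins)."""
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name.lower(), ip)
    return mapping


def lookup_hosts(hostname, mapping):
    """Return the IP pinned in the hosts mapping, or None to fall through to DNS."""
    return mapping.get(hostname.lower())


# Hypothetical file content: the last entry is the kind of line malware adds.
SAMPLE_HOSTS = """
127.0.0.1   localhost
# A single extra line is enough to redirect a well-known site.
203.0.113.7 www.hao123.com   # hypothetical hijacked entry
"""

hosts = parse_hosts(SAMPLE_HOSTS)
print(lookup_hosts("www.hao123.com", hosts))  # 203.0.113.7
print(lookup_hosts("example.org", hosts))     # None
```

A name found in the mapping never reaches the DNS server at all, which is why a one-line change in this file is so hard to notice from the browser.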






Once past the local client's checks, the domain name query is sent to the service provider's DNS server. That server checks its cache first. If the domain you want is not there, or the cached record has expired, it queries a root name server. The root server does not hold the final answer; it refers the resolver toward the name servers for the top-level domain, which in turn point to the DNS server responsible for resolving the domain. The provider's DNS server queries that server, obtains the IP corresponding to the domain name, updates its cache, and returns the IP to the client.
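
The cache-then-ask-upstream flow can be sketched as a toy resolver. This is not a real DNS implementation: the three dictionaries stand in for the root, top-level-domain, and authoritative servers under the standard model in which the upper levels return referrals, and every server name, domain, address, and TTL below is invented for illustration.

```python
import time

# Toy delegation data; all names and addresses are made up.
ROOT = {"com.": "tld-com"}
TLD = {"tld-com": {"example.com.": "auth-example"}}
AUTHORITATIVE = {"auth-example": {"www.example.com.": ("192.0.2.10", 60)}}


class CachingResolver:
    """Stand-in for the service provider's DNS server."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.cache = {}            # name -> (ip, expires_at)
        self.upstream_queries = 0  # how many times we had to ask another server

    def resolve(self, name):
        entry = self.cache.get(name)
        if entry and entry[1] > self.clock():
            return entry[0]        # fresh cache hit, no upstream traffic
        ip, ttl = self._iterate(name)
        self.cache[name] = (ip, self.clock() + ttl)
        return ip

    def _iterate(self, name):
        labels = name.rstrip(".").split(".")
        tld = labels[-1] + "."
        zone = ".".join(labels[-2:]) + "."
        self.upstream_queries += 1
        tld_server = ROOT[tld]                   # root refers us to the TLD servers
        self.upstream_queries += 1
        auth_server = TLD[tld_server][zone]      # TLD refers us to the zone's servers
        self.upstream_queries += 1
        return AUTHORITATIVE[auth_server][name]  # authoritative (ip, ttl) answer


resolver = CachingResolver()
print(resolver.resolve("www.example.com."))  # 192.0.2.10
```

Resolving the same name again before the TTL runs out is answered from `cache` and leaves `upstream_queries` unchanged, which is the behavior the paragraph describes.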






Of course, the local DNS service provider may also hijack domain names for its own purposes. As for the GFW, all I can say here is "hehe". Another well-known case: if you type a mistaken keyword into the IE address bar, the default should be a jump to Microsoft's search page. Assuming you are using stock IE, and there are no local tools that install IE plug-ins, your request should end up at Microsoft's search page. Yet the local telecom operator will still hijack it and show its own search results page instead. That is why some users change the DNS in their computer's network configuration to a more reliable public DNS rather than the local telecom's DNS. (The local telecom's hijacking ability of course goes beyond DNS and also includes content substitution and forced insertion, but those come later in the process.)






On the subject of resolving to an IP, a more thorough answer could also cover the principles and mechanisms of CDNs, as well as the intelligent resolution that returns different results to users in different regions. I will not expand on that here, and of course I could not necessarily explain it clearly anyway.






With the IP in hand, the browser sends a request to that IP for the specified file. This involves routing and packet transmission. I cannot explain the details, but it is good to at least know that tracert can trace the route. At the target IP, the server has a daemon listening on port 80 to accept the request, and the TCP three-way handshake takes place here. Next comes how the web server works: static pages are relatively simple to handle, while dynamic scripts need an interpreter to execute the code and return the output. This may involve server-side caching, databases, load balancing, polling, and so on. At a later stage the server may be a cluster rather than a single host. Beyond that lies the architect's territory, which I am even less able to expand on.
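
As a rough illustration of the request step, the sketch below opens a TCP connection and sends one plain HTTP/1.1 GET. The helper names (`build_request`, `parse_status_line`, `fetch`) are hypothetical, and the code assumes unencrypted HTTP on port 80 with no redirects, TLS, or chunked decoding. The three-way handshake itself is performed by the operating system inside `socket.create_connection`.

```python
import socket


def build_request(host, path="/"):
    """Build a minimal HTTP/1.1 GET request as bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close so we can read to EOF
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Return (version, status_code, reason) from raw response bytes."""
    first_line = response.split(b"\r\n", 1)[0].decode("iso-8859-1")
    version, code, *reason = first_line.split(" ", 2)
    return version, int(code), reason[0] if reason else ""


def fetch(host, port=80, path="/", timeout=5.0):
    """Connect over TCP, send one GET, and read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

For example, `parse_status_line(fetch("example.com"))` would return the server's status line as a tuple, provided the host is reachable and answers plain HTTP on port 80.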






But this is not the end. Why? The returned page usually embeds a large number of further requests, for CSS, for all kinds of small icons and images, and so on. The browser has to issue these requests, and there is logic worth noting here, such as the limit on the number of concurrent requests and the queuing of the rest. In addition, the page may contain executable code that runs in the browser, which has an important effect on what you finally see.
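
The follow-up requests and their queuing can be sketched as follows. `ResourceCollector` and `schedule` are hypothetical names, and the per-host limit of 6 is an assumption based on a common browser default for HTTP/1.1; real browsers use more elaborate prioritization.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ResourceCollector(HTMLParser):
    """Collect absolute URLs of images, scripts, and stylesheets in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.resources.append(urljoin(self.base_url, attrs["href"]))


def schedule(urls, per_host_limit=6):
    """Group URLs by host and split each group into rounds of limited size."""
    by_host = defaultdict(list)
    for url in urls:
        by_host[urlsplit(url).netloc].append(url)
    return {
        host: [group[i:i + per_host_limit]
               for i in range(0, len(group), per_host_limit)]
        for host, group in by_host.items()
    }
```

Feeding a page into `ResourceCollector` and passing its `resources` to `schedule` shows which requests can go out at once and which have to wait for an earlier round to finish.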






It is still not over. Between the request reaching the target data center and the data coming back to your computer, there is a risk of hijacking and tampering within the subnet, namely ARP spoofing. What is the ARP protocol, and why can the content you want to access be hijacked and tampered with? In addition, can the traffic be eavesdropped on or tampered with in transit? As mentioned above, besides the GFW there is also the powerful local telecom operator.
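
A toy model helps show why ARP spoofing works: a classic ARP cache accepts any reply and overwrites its IP-to-MAC mapping without verification. The `ArpCache` class and all addresses below are invented for illustration and do not touch the real network stack.

```python
class ArpCache:
    """Toy IP -> MAC table that trusts every reply, as classic ARP does."""

    def __init__(self):
        self.table = {}

    def on_reply(self, ip, mac):
        """Store the mapping; return True if it replaced a different MAC."""
        previous = self.table.get(ip)
        self.table[ip] = mac
        return previous is not None and previous != mac


cache = ArpCache()
gateway_ip = "192.168.1.1"                      # hypothetical gateway address
cache.on_reply(gateway_ip, "aa:aa:aa:aa:aa:aa")  # genuine gateway answers
changed = cache.on_reply(gateway_ip, "ee:ee:ee:ee:ee:ee")  # forged reply
print(changed)                  # True
print(cache.table[gateway_ip])  # ee:ee:ee:ee:ee:ee
```

After the forged reply, traffic meant for the gateway is sent to the attacker's MAC address, which is what makes tampering inside the subnet possible. A sudden change of the gateway's MAC is also one of the simplest signs to look for when troubleshooting.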






After such a long-winded answer, some people may think: I am applying to be a server-side programmer / front-end developer / operations engineer, do I need to know all this? Here is the thing: when I ask this question, I usually follow it with one or two extended topics.






Topic 1: If a user tells you that your website/game is very slow, how do you analyze it and how do you respond?






Topic 2: If a user tells you that opening your website pops up lewd ads, or that antivirus software reports a Trojan horse, how do you analyze it and how do you respond?






Seeing these, I believe many people will realize that these really are common problems. Troubleshooting them in fact involves every step described above. Is the lag and slowness a problem on the client, in the network transport layer, or on the server side? How do you quickly troubleshoot and locate the problem, and determine the scope of its impact? I will not expand on that today, but without knowledge of the process above, you will not be able to give a good answer on these topics.
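
One rough way to start on the "it is slow" complaint is to time each phase of a request separately. The sketch below is only an assumption about how one might do that: `time_phases` and `slowest_phase` are hypothetical helpers, the request is plain HTTP on port 80, and a single sample says little without repeated measurements from the affected user's network.

```python
import socket
import time


def time_phases(host, port=80, path="/", timeout=5.0):
    """Measure DNS lookup, TCP connect, and time-to-first-byte in seconds."""
    t0 = time.perf_counter()
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM)[0]
    t1 = time.perf_counter()  # name resolution finished
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.connect(addr)
        t2 = time.perf_counter()  # three-way handshake finished
        request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        sock.recv(1)
        t3 = time.perf_counter()  # first response byte arrived
    return {"dns": t1 - t0, "connect": t2 - t1, "first_byte": t3 - t2}


def slowest_phase(timings):
    """Return the name of the phase that took the longest."""
    return max(timings, key=timings.get)
```

A slow `dns` phase points toward the resolver, a slow `connect` toward the network path, and a slow `first_byte` toward the server side, which mirrors the three parts of the process described earlier.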






One opinion: why are full-stack engineers so valued now? Because when a problem arises, it does not tell you which field it belongs to. You have to explore and troubleshoot, and if you lack a broad view and an open mind, you will probably not find the key to the problem. How else would you show your ability?






Being proficient in the front end, in network protocols, and in the server side all at once is indeed too difficult. But if you have a holistic framework in mind and then master one of those areas, your ability and vision move up a level. At least in troubleshooting, technical cooperation, and similar work, you will appear more professional and more confident.






When I first came into contact with the Internet and began to write Web programs, I was very ignorant. I wrote a CGI program and wanted to get it running, but I was confused and did not know how the thing ran. I set up a web server and configured directory permissions, still confused. I first followed Coolfire's hacker manual step by step and obtained permissions on someone else's server, and was still confused; for a long time I did not understand what principle was at work. I always think that in those years nobody explained this to me or helped me sort it out, so I got a lot of things done without understanding the mechanism behind them, and without understanding how to tune and refine them.






Sorting all this out does not actually take much time or energy, yet afterwards the improvement in your understanding of problems is huge, and so is the improvement in your awareness of technical collaboration.


