[Work] Optimizing a proxy server: detecting URL changes on the target site


At work, I am responsible for a proxy module in our group. The module is a proxy for OWA, the Microsoft Office 365 mail portal. With it in place, users no longer need to enter the Office 365 URL to reach OWA. They simply enter the address of our proxy, and we forward their requests to Office 365 OWA. The goal is that the user experience is the same as accessing Office 365 OWA directly.

In principle, our proxy uses Node.js to build an HTTP server. It receives the request from the client (in practice, a browser), forwards that request to Office 365, and then returns the Office 365 response to the client. That is all the proxy function amounts to.

Of course, the actual implementation involves many details, including cookie handling, URL conversion, and so on, which I will not cover here.
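The forwarding idea can be sketched roughly as follows. This is only a minimal illustration, not the real module: `buildUpstreamOptions` and `createProxyServer` are hypothetical names, the target is assumed to be served over HTTPS on port 443, and cookie handling and URL conversion are left out entirely.

```javascript
var http = require('http');
var https = require('https');

// Hypothetical helper: build the options for the request sent to the target site.
function buildUpstreamOptions(req, targetHost) {
    // Copy the client's headers, but point the Host header at the target site.
    var headers = Object.assign({}, req.headers, { host: targetHost });
    return {
        hostname: targetHost,
        port: 443, // assumption: the target site is reached over HTTPS
        method: req.method,
        path: req.url,
        headers: headers
    };
}

// Hypothetical factory: returns an HTTP server that forwards every request.
function createProxyServer(targetHost) {
    return http.createServer(function (clientReq, clientRes) {
        var options = buildUpstreamOptions(clientReq, targetHost);
        var upstreamReq = https.request(options, function (upstreamRes) {
            // Relay the status code, headers and body back to the client.
            clientRes.writeHead(upstreamRes.statusCode, upstreamRes.headers);
            upstreamRes.pipe(clientRes);
        });
        upstreamReq.on('error', function (err) {
            // Answer with 502 unless a response has already been started.
            if (!clientRes.headersSent) {
                clientRes.writeHead(502);
            }
            clientRes.end();
        });
        // Stream the client's request body to the target site.
        clientReq.pipe(upstreamReq);
    });
}
```

To try it, something like `createProxyServer('outlook.office365.com').listen(8080)` would start the server on a local port (the port number is arbitrary).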

  

While developing and maintaining this module, however, I ran into a problem. Simply forwarding requests is not enough: many requests need special handling, and some complicated ones have to be researched before we can support them. As the maintainer of the proxy, I therefore have to know what types of requests Office 365, the target site, actually receives. In practice, that means knowing which different URLs exist, and different URLs differ in their paths.

  

So I made an optimization. Because the proxy is essentially an HTTP server, I log every request URL sent by the client, so that all the URLs can be collected from the log. The response status code returned for each URL is logged alongside it, which shows whether the proxy handled that URL correctly: a status code of 200 means the request was OK.
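A logging hook of this kind might look like the sketch below. The names `formatLogLine` and `attachUrlLogger` are made up for illustration, and the "url, status" line format is assumed from the log shown in this post. The hook relies on the `finish` event that Node.js emits on a response once it has been sent.

```javascript
var fs = require('fs');

// Hypothetical helper: one log line per request, in the form "url, status".
function formatLogLine(url, statusCode) {
    return url + ', ' + statusCode;
}

// Hypothetical hook: once the response has been sent, hand the log line
// for this request to writeLine.
function attachUrlLogger(req, res, writeLine) {
    res.on('finish', function () {
        writeLine(formatLogLine(req.url, res.statusCode));
    });
}

// One possible writeLine: append to the log file that the parsing script reads.
// The file name is an assumption taken from that script.
function appendToLogFile(line) {
    fs.appendFileSync('URLs.log', line + '\n');
}
```

Inside the server's request handler, a call such as `attachUrlLogger(req, res, appendToLogFile)` would then record every request. Note that a URL containing a comma would confuse a parser that splits each line on commas.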

  

With this logging in place, we get a log like the following:

    /___/outlook.office365.com/, 302
    /owa/, 302
    /__/LOGIN/LOGIN.SRF, 200
    /owa/prefetch.aspx, 200
    /___/r1.res.office365.com/owa/prem/16.801.12.1741001/scripts/preboot.js, 200
    /___/r1.res.office365.com/owa/prem/16.801.12.1741001/scripts/boot.worldwide.0.mouse.js, 200
    /___/OUTLOOK.OFFICE365.COM/GETUSERREALM.SRF, 200
    /___/r1.res.office365.com/owa/prem/16.801.12.1741001/scripts/boot.worldwide.1.mouse.js, 200
    /OWA/EV.OWA2, 200
    /OWA/EV.OWA2, 200
    /___/outlook.office365.com/, 302
    /OWA/EV.OWA2, 200
    /owa/, 302
    /__/LOGIN/LOGIN.SRF, 200
    /OWA/EV.OWA2, 200
    /OWA/SERVICE.SVC, 200
    /owa/prefetch.aspx, 200
    /___/r1.res.office365.com/owa/prem/16.807.12.1742334/scripts/preboot.js, 200
    /OWA/SERVICE.SVC, 200
    /___/r1.res.office365.com/owa/prem/16.807.12.1742334/scripts/boot.worldwide.0.mouse.js, 200
    /OWA/EV.OWA2, 200
    /OWA/EV.OWA2, 200
    /OWA/SERVICE.SVC, 200
    /OWA/SERVICE.SVC, 200
    /___/OUTLOOK.OFFICE365.COM/GETUSERREALM.SRF, 200
    /___/r1.res.office365.com/owa/prem/16.807.12.1742334/scripts/boot.worldwide.1.mouse.js, 200
    /__/LOGIN/PPSECURE/POST.SRF, 200
    /owa/, 302

Each row contains the request URL, followed by the response status code that the request received.

  

I also wrote a script to parse the log data. Because the data contains repeated entries, it needs to be deduplicated and sorted.

The script is as follows:

    var lineReader = require('line-reader');
    var fs = require('fs');

    var fileReadData = "URLs.log";
    var fileWriteData = "Result.txt";

    var ignoreNormalStatusCode = false;
    if (process.argv && process.argv[2]) {
        // passed as a command-line parameter; any non-empty value enables it
        ignoreNormalStatusCode = process.argv[2];
    }

    console.log("ignoreNormalStatusCode: " + ignoreNormalStatusCode);

    // create a data object from one log line ("url, status code")
    var createDataObjectFromLine = function (str) {
        var data = str.split(",");

        var obj = {
            url: data[0].trim(),
            statusCode: data[1].trim(),
            number: 1
        };

        return obj;
    };

    // get the index of obj in the array, or -1 if it is not there
    var indexOfObjInArray = function (array, obj) {
        var pos = -1;

        for (var i = 0; i < array.length; i++) {
            var e = array[i];

            if (e.url === obj.url && e.statusCode === obj.statusCode) {
                pos = i;
                break;
            }
        }

        return pos;
    };

    // compare by number, to sort in descending order
    var compare_number = function (a, b) {
        return b.number - a.number;
    };

    // write the array's data to file
    var writeResultToFile = function (result, number) {
        var string = "";
        string += "Here is this URL scan result blow,\n";
        string += "orignial URL number: " + number + "\n";
        string += "Unrepeat URL number: " + result.length + "\n";
        string += "------------------------------------------\n\n";
        string += "Req url, this url's response status code (200 is OK), number statics\n";
        fs.appendFileSync(fileWriteData, string);

        for (var i = 0; i < result.length; i++) {
            fs.appendFileSync(fileWriteData, result[i].url + ", " + result[i].statusCode + ", " + result[i].number + "\n");
        }
    };

    // create an array to save the URLs
    var result = [];

    // count the original number of URLs
    var number = 0;

    // main function
    lineReader.eachLine(fileReadData, function (line, last) {
        number++;

        // parse the data from every line
        var obj = createDataObjectFromLine(line);
        // console.log(obj);

        var pos = indexOfObjInArray(result, obj);
        if (pos !== -1) {
            // this object already exists in the result array
            result[pos].number++;
        }
        else {
            if (ignoreNormalStatusCode && obj.statusCode === '200') {
                // skip URLs that were handled normally
            }
            else {
                // add this obj to result
                result.push(obj);
            }
        }

        if (last) {
            // sort the array by number
            result.sort(compare_number);

            // write the result to file
            writeResultToFile(result, number);

            // stop reading lines from the file
            return false;
        }
    });

Here, the Node.js module line-reader is used to read the file one line at a time.
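As a side note, the counting step could also be written with a `Map` keyed by URL and status code, which avoids scanning the result array for every line. The sketch below is a variation for illustration, not the script used here. `countUrls` is a hypothetical name, and it takes an array of log lines rather than reading a file.

```javascript
// Hypothetical variation: count identical "url, status" lines with a Map.
// lines    - array of log lines such as "/owa/, 302"
// ignoreOk - when true, lines with status code 200 are left out
function countUrls(lines, ignoreOk) {
    var counts = new Map();

    lines.forEach(function (line) {
        // Split on the last comma, so the status code is always the final field.
        var idx = line.lastIndexOf(',');
        if (idx === -1) {
            return; // skip lines that do not match the expected format
        }
        var url = line.slice(0, idx).trim();
        var statusCode = line.slice(idx + 1).trim();
        if (ignoreOk && statusCode === '200') {
            return;
        }

        var key = url + ', ' + statusCode;
        var entry = counts.get(key);
        if (entry) {
            entry.number++;
        } else {
            counts.set(key, { url: url, statusCode: statusCode, number: 1 });
        }
    });

    // Most frequent entries first, as in the script above.
    return Array.from(counts.values()).sort(function (a, b) {
        return b.number - a.number;
    });
}
```

For a log of this size the difference hardly matters, but the lookup stays cheap as the number of distinct URLs grows.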

Running the script on the log gives the parsed result:

    Here is this URL scan result blow,
    orignial URL number: 142
    Unrepeat URL number: 6
    ------------------------------------------

    Req url, this url's response status code (200 is OK), number statics
    /owa/, 302, 10
    /___/outlook.office365.com/, 302, 5
    /owa/auth/15.1.225/themes/resources/segoeui-regular.ttf, 404, 3
    /owa/auth/15.1.225/themes/resources/segoeui-semilight.ttf, 404, 1
    /___/outlook.office365.com/favicon.ico, 302, 1
    /owa/auth/15.1.219/themes/resources/segoeui-semilight.ttf, 404, 1

Note that URLs with status code 200 are not shown in the result above. These are URLs that the proxy already handles normally, so there is no need to count or analyze them.

From the result, it is obvious that several URLs return 404. Our proxy does not handle these correctly, so they need further analysis and support in the code. This completes the optimization of the module.
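Picking out the problem URLs from the result can also be automated. The following is a small sketch under the assumption that each data row has the form "url, status code, count" as in the result above. `findFailedUrls` is a hypothetical name, and treating every status of 400 or higher as a failure is my own choice for the example.

```javascript
// Hypothetical helper: return the result rows whose status code is 400 or higher.
// resultLines - array of lines from the result file
function findFailedUrls(resultLines) {
    var failures = [];

    resultLines.forEach(function (line) {
        var parts = line.split(',').map(function (p) { return p.trim(); });
        if (parts.length !== 3) {
            return; // not a "url, status code, count" row
        }
        var statusCode = parseInt(parts[1], 10);
        var count = parseInt(parts[2], 10);
        if (isNaN(statusCode) || isNaN(count)) {
            return; // header line or other non-data row
        }
        if (statusCode >= 400) {
            failures.push({ url: parts[0], statusCode: statusCode, number: count });
        }
    });

    return failures;
}
```

Header lines are skipped because their second and third fields are not numbers.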

  

A small personal reflection: work is full of small things, and if you think something is right, you should stick with it. A small optimization, as long as it is meaningful, can be of great use :-)

Kevin Song

2015-7-22
