Pulling off all kinds of clever tricks with HTTP headers


As a front-end engineer, I used to never look at a page's headers; the only thing I cared about was whether the status code was 200. But headers really do matter: before the client gets any content from the server, all kinds of negotiation happen through them. Headers let us pull off plenty of tricks that improve a site's performance and the user's experience. All right, let's get a feel for it.

Basic tricks
    • Multi-language support ( Accept-Language )
    • Anti-hotlinking ( Referer )
    • Gzip, which simply put saves traffic ( Accept-Encoding , Content-Encoding )
Multi-language

A multi-language site is one that can switch between languages. We won't discuss building n separate sites, one per language; the question here is how a single site can intelligently return the language the user needs.

Server ⇄ Client:
    • Client: throws its Accept-Language at the server.
    • Server: receives the client's Accept-Language. The field looks something like zh,en-US;q=0.9,en;q=0.8.
    • Server: starts processing, turning the field into an array weighted by q.
    • Server: sorted, it looks roughly like [{"name":"zh","q":1},{"name":"en-US","q":0.9},{"name":"en","q":0.8}].
    • Server: returns the matching language by weight: if zh is supported, return zh; without zh, fall back to en-US, and so on.
    • Server: what if we don't even have the language pack the other side needs? Urgent, waiting online!
    • Server: no way around it; just serve our official (default) language.
    • Server: sent, please receive.
    • Client (matched): "Your Accept-Language has been matched." This site really gets it; it's a foreign site, but it knows I'm Chinese.
    • Client (no match): "We don't have a language pack for your region." Emmmm, is this Martian?
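The parse-and-sort step described above can be sketched on its own, independent of any server; parseAcceptLanguage is a hypothetical helper name:

```javascript
// Sketch: turn an Accept-Language header into a weight-sorted array
function parseAcceptLanguage(header) {
  return header
    .split(',')
    .map(part => {
      // Each entry is "name" or "name;q=0.x"; a missing q means weight 1
      let [name, q] = part.trim().split(';');
      return { name, q: q ? Number(q.split('=')[1]) : 1 };
    })
    .sort((a, b) => b.q - a.q); // highest weight first
}

console.log(parseAcceptLanguage('zh,en-US;q=0.9,en;q=0.8'));
// → zh (q=1) first, then en-US (q=0.9), then en (q=0.8)
```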

A simple implementation of multi-language support:

let http = require('http');

// The language packs we support
let languages = {
  zh: { title: 'Hello', content: 'Classmate' },
  en: { title: 'Hey', content: 'Guy' }
};
// Set a default language, in case we don't support the user's
let defaultLanguage = 'en';

function getLanguage(clientLangs) {
  let finalLanguage = defaultLanguage;
  try {
    if (clientLangs) {
      // Parse into { name, q } pairs and sort by weight, descending
      clientLangs = clientLangs.split(',').map(l => {
        let [name, q] = l.split(';');
        q = q ? Number(q.split('=')[1]) : 1;
        return { name, q };
      }).sort((a, b) => b.q - a.q);
      // Match against the languages the server supports and return the first hit
      for (let i = 0; i < clientLangs.length; i++) {
        let name = clientLangs[i].name;
        if (languages[name]) {
          finalLanguage = name;
          break;
        }
      }
    }
  } catch (e) {}
  return languages[finalLanguage];
}

http.createServer(function (req, res) {
  // Read the client's language preferences
  let clientLangs = req.headers['accept-language'];
  let lan = getLanguage(clientLangs);
  // Send the matched language pack back to the client
  res.end(`<p>${lan.title}</p><p>${lan.content}</p>`);
}).listen(3000);
Anti-hotlinking

This technique is most often used to restrict images: only our own domain may fetch them, and other domains shouldn't even think about it.

Server ⇄ Client:
    • Client: a page on some other site requests one of our images.
    • Server: checks Referer and finds that site's domain is not on our whitelist.
    • Server: this image is not available to that site.
    • Server: posts back a placeholder image instead.
    • Client: the page ends up showing "please support the genuine article on our website."

Implementation principle: here I use an iframe as the example. The idea is simple: compare the source of the request. Either it matches the host serving the resource, or it is on the whitelist; otherwise, refuse. And if there is no source at all, let the request through, in case someone opened the resource directly rather than hotlinking it:

let http = require('http');
let fs = require('fs');
let url = require('url');
let path = require('path');
// Set up the whitelist
let whitelist = ['localhost:3000'];

http.createServer(function (req, res) {
  // Get the requested path and map it to a physical file
  let { pathname } = url.parse(req.url);
  let realPath = path.join(__dirname, pathname);
  // Check the file's status
  fs.stat(realPath, function (err, statObj) {
    if (err) {
      res.statusCode = 404;
      res.end();
    } else {
      // The key part: where did this request come from?
      let referer = req.headers['referer'];
      if (referer) {
        // Compare the current host with the referring domain
        let current = req.headers['host'];
        referer = url.parse(referer).host;
        console.log(current, referer);
        // Same domain, or whitelisted: let it through!
        if (current === referer || whitelist.includes(referer)) {
          fs.createReadStream(realPath).pipe(res);
        } else {
          // Not allowed; this is hotlinking! Serve a placeholder page instead
          fs.createReadStream(path.join(__dirname, 'files/2.html')).pipe(res);
        }
      } else {
        // No source at all: let it through, in case it was opened directly
        fs.createReadStream(realPath).pipe(res);
      }
    }
  });
}).listen(3000);
Gzip

Modern browsers are quite advanced and can already accept compressed responses. Impressive. So how do we serve a compressed page?

Server ⇄ Client:
    • Client: throws its Accept-Encoding at the server.
    • Server: the field looks something like gzip, deflate, br.
    • Server: understood; start configuring compression.
    • Server: if compression is supported, first set the Content-Encoding response header.
    • Server: of the several compression methods listed, match the first one the server supports.
    • Server: compress the page on the fly and return it to the client once done.
    • Client: happy; traffic is saved and the experience is unaffected.

Sample code below; when you test it, don't forget to create the test HTML file (1.html) first.

let http = require('http');
// Libraries needed to read and compress the file
let fs = require('fs');
let path = require('path');
// The compression library
let zlib = require('zlib');

http.createServer(function (req, res) {
  // Which encodings does the client accept? (Node lowercases header names)
  let rule = req.headers['accept-encoding'];
  // Create a readable stream for the original file
  let originStream = fs.createReadStream(path.join(__dirname, '1.html'));
  if (rule) {
    if (rule.match(/\bgzip\b/)) {
      // If compression is supported, the header must be set!
      res.setHeader('Content-Encoding', 'gzip');
      originStream = originStream.pipe(zlib.createGzip());
    } else if (rule.match(/\bdeflate\b/)) {
      res.setHeader('Content-Encoding', 'deflate');
      originStream = originStream.pipe(zlib.createDeflate());
    }
  }
  // Output the processed readable stream
  originStream.pipe(res);
}).listen(3000);
Intermediate tricks

Most of the basic tricks can be achieved with header configuration alone. The intermediate ones naturally have to be harder: most of them need the client and the server to play along together.

    • Client sends content to the server ( Content-Type , Content-Length )
    • Client fetches partial content from the server ( Range , Content-Range )
    • Client as a crawler, scraping web pages
Client sends content to server
Server ⇄ Client:
    • Client: here's a pile of data, please handle it.
    • Server: no way; who knows what you're sending? Please set the headers.
    • Client: fine, here are Content-Type and Content-Length.
    • Server: that works; the data's content type and length are both necessary.
    • Client: I'm sending the data now, take a look.
    • Server: received. Each 'data' event delivers a chunk of Buffer.
    • Server: once receiving is complete, merge the Buffers.
    • Server: process the data based on Content-Type.
    • Server: format the data; done.

Server code

let http = require('http');
let server = http.createServer();
let arr = [];
server.on('request', (req, res) => {
  req.on('data', function (data) {
    // Collect every Buffer chunk into the array
    arr.push(data);
  });
  req.on('end', function () {
    // The request is over; time to process the chunks received bit by bit
    // Merge the buffers
    let r = Buffer.concat(arr).toString();
    if (req.headers['content-type'] === 'application/x-www-form-urlencoded') {
      let querystring = require('querystring');
      r = querystring.parse(r); // e.g. a=1&b=2, then format it
      console.log('querystring', r);
    } else if (req.headers['content-type'] === 'application/json') {
      // Said to be JSON
      console.log('json', JSON.parse(r));
    } else {
      // No type? Then leave it as it is.
      console.log('no type', r);
    }
    arr = [];
    res.end('Done!');
  });
});
server.listen(3000, () => {
  console.log(`server start`);
});

Client code

// Configuration for the request target
let opts = {
  host: 'localhost',
  port: 3000,
  path: '/',
  // The header setup matters. A lot. Really.
  headers: {
    'Content-Type': 'application/x-www-form-urlencoded',
    // If the length is wrong, anything beyond it is ignored
    'Content-Length': 7
  }
};
let http = require('http');
let client = http.request(opts, function (res) {
  res.on('data', function (data) {
    console.log(data);
  });
});
client.end('a=1&b=2');
Client gets partial content from server

Server ⇄ Client:
    • Client: I want part of the resource.
    • Server: sure, tell me the range.
    • Client: I put it in the Range header: bytes=0-3.
    • Server: Content-Range: bytes 0-3/8. Please accept; this file is 8 bytes in total, and the first 4 bytes are now yours.
    • Client: OK, then give me the next piece: bytes=4-7.
    • Server: here you are. The end.

As you can see, fetching data range by range like this is a bare-bones version of resumable downloading! One point that is easy to get wrong here is the size arithmetic: byte positions are counted from 0, so the valid range of an N-byte file runs from 0 to N-1, while the number after the slash in Content-Range is the total size N. Watch out for it.
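The Range arithmetic is worth seeing in isolation, using a hypothetical 8-byte file:

```javascript
// Sketch: parsing a Range header, with 0-based byte positions
let size = 8; // hypothetical 8-byte file
let range = 'bytes=0-3';

let [, start, end] = range.match(/(\d*)-(\d*)/);
start = start ? Number(start) : 0;  // missing start defaults to 0
end = end ? Number(end) : size - 1; // last valid byte index is size - 1

console.log(`bytes ${start}-${end}/${size}`); // "bytes 0-3/8"
```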

Server Side

let http = require('http');
let fs = require('fs');
let path = require('path');
// Size of the file being downloaded
let size = fs.statSync(path.join(__dirname, 'my.txt')).size;
let server = http.createServer(function (req, res) {
  let range = req.headers['range']; // the part the client asked for
  if (range) {
    let [, start, end] = range.match(/(\d*)-(\d*)/);
    start = start ? Number(start) : 0;
    end = end ? Number(end) : size - 1; // a 10-byte file spans bytes 0-9
    console.log(`bytes ${start}-${end}/${size}`);
    // The number after the slash is the file's total size
    res.setHeader('Content-Range', `bytes ${start}-${end}/${size}`);
    fs.createReadStream(path.join(__dirname, 'my.txt'), { start, end }).pipe(res);
  } else {
    // No Range header: write the whole file to the client
    fs.createReadStream(path.join(__dirname, 'my.txt')).pipe(res);
  }
});
server.listen(3000);

Client Side

let opts = {
  host: 'localhost',
  port: 3000,
  headers: {}
};
let http = require('http');
let start = 0;
let fs = require('fs');
function download() {
  // Chunked, partial download: request 4 bytes at a time
  opts.headers.Range = `bytes=${start}-${start + 3}`;
  start += 4;
  let client = http.request(opts, function (res) {
    let total = Number(res.headers['content-range'].split('/')[1]);
    res.on('data', function (data) {
      fs.appendFileSync('./download.txt', data);
    });
    res.on('end', function () {
      // When this chunk finishes, fetch the next one a second later
      setTimeout(() => {
        console.log(start, total);
        if (start < total)
          download();
      }, 1000);
    });
  });
  client.end();
}
download();
Client scrapes web content: a simple crawler

The operation itself is actually very simple: just make a request and fetch the page.
The difficulty lies in peeling the useful information out of the page and filtering away the useless parts.
I scraped the entertainment section of Baidu News here. Baidu is considerate enough to serve UTF-8; otherwise the result would be garbled.

let http = require('http');
let opts = {
  host: 'news.baidu.com',
  path: '/ent'
};
// Create a request to fetch the site's content
let client = http.request(opts, function (r) {
  let arr = [];
  // The resource can't arrive all at once, so push each chunk into arr
  r.on('data', function (data) {
    arr.push(data);
  });
  r.on('end', function () {
    // Merge the chunks
    let result = Buffer.concat(arr).toString();
    // Process the page; turning it into objects like this makes later steps easy
    let content = result
      .match(/<ul class="ulist mix-ulist">(?:[\s\S]*?)<\/ul>/img)
      .toString()
      .match(/<li>(?:[\s\S]*?)<\/li>/img);
    content = content.map((c) => {
      let href = /<a href="(?:[\s\S]*?)"/img.exec(c);
      let title = /">(?:[\s\S]*?)<\/a>/img.exec(c);
      return {
        href: href[0].replace(/"/img, '').replace('<a href=', ''),
        title: title[0].replace(/">/img, '').replace('</a>', '')
      };
    });
    console.log(JSON.stringify(content));
    arr = [];
  });
});
client.end();
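The regex-extraction idea can be tried on a toy snippet first; the HTML below is invented for illustration and just mimics the list structure the crawler targets:

```javascript
// Toy HTML shaped like the list the crawler scrapes (invented example)
let html =
  '<ul class="ulist mix-ulist">' +
  '<li><a href="http://x.com/1">First</a></li>' +
  '<li><a href="http://x.com/2">Second</a></li>' +
  '</ul>';

// Split into <li> items, then pull the link target and visible text out of each
let items = html.match(/<li>[\s\S]*?<\/li>/img).map(li => {
  let [, href, title] = li.match(/<a href="([\s\S]*?)">([\s\S]*?)<\/a>/i);
  return { href, title };
});

console.log(items);
```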
