Node.js static file server in practice


This is a fairly detailed article. It starts from serving static files and goes on to cover directory support, caching, gzip/deflate compression and Range requests, each with an explanation. The full text is reproduced below:

The structure of our app.js file is straightforward:

var PORT = 8000;
var http = require("http");

var server = http.createServer(function (request, response) {
    // TODO
});
server.listen(PORT);
console.log("Server running at port: " + PORT + ".");

Since what we want to build is a static file server, let's take Apache as an example and recall what a static file server does.

The browser sends a URL, and the server parses the URL and maps it to a file on disk. If the file exists, the server returns a 200 status code and sends the file to the browser; if the file does not exist, it returns a 404 status code and sends a 404 page to the browser.

The original article showed two screenshots here of these two classic Apache responses.

Now that the requirements are clear, let's start implementing them.

Implementing Routing

Routing has already been described in The Node Beginner Book, and we do it the same way here.

We need to add the url module and then parse the pathname.

Here is the implementation code:

var url = require("url");

var server = http.createServer(function (request, response) {
    var pathname = url.parse(request.url).pathname;
    response.write(pathname);
    response.end();
});

The code now writes the requested path back to the browser, much like an echo server. Next we add the ability to send the corresponding file.

Reading static files

To prevent users from viewing our code by requesting /app.js in the browser, we only allow requests for files under the assets directory. The server maps the request path onto the assets directory.

This part involves reading files, so the fs (file system) module is unavoidable. It also involves path handling, so the path module is needed as well.

We use the path.exists method of the path module to check whether a static file exists on disk. If it does not, we respond to the client with a 404 error directly.

If the file exists, we call fs.readFile to read it. If an error occurs, we respond with a 500 error to indicate an internal error. In the normal case we send the file content to the client with a 200 status.

var fs = require("fs");
var path = require("path");

var server = http.createServer(function (request, response) {
    var pathname = url.parse(request.url).pathname;
    var realPath = "assets" + pathname;
    // Note: path.exists is the API of the Node version this article targets;
    // later releases removed it in favour of fs.stat or fs.access.
    path.exists(realPath, function (exists) {
        if (!exists) {
            response.writeHead(404, {'Content-Type': 'text/plain'});
            response.write("This request URL " + pathname + " was not found on this server.");
            response.end();
        } else {
            fs.readFile(realPath, "binary", function (err, file) {
                if (err) {
                    response.writeHead(500, {'Content-Type': 'text/plain'});
                    // end() expects a string or buffer, not an Error object
                    response.end(err.toString());
                } else {
                    response.writeHead(200, {'Content-Type': 'text/html'});
                    response.write(file, "binary");
                    response.end();
                }
            });
        }
    });
});

This simple code, together with an assets directory, makes up our most basic static file server.
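As a hedged aside for readers on current Node.js releases, where path.exists is no longer available: the same 200/404/500 decision can be sketched from the error passed to fs.readFile alone. The helper names statusForReadError and sendStatic are hypothetical, and treating the ENOENT error code as "not found" is an assumption of this sketch, not the article's code.

```javascript
const fs = require("fs");

// Map the outcome of fs.readFile to an HTTP status code.
// Assumption: a missing file surfaces as err.code === "ENOENT".
function statusForReadError(err) {
    if (!err) {
        return 200;
    }
    return err.code === "ENOENT" ? 404 : 500;
}

// Hypothetical helper: read realPath and answer on an http.ServerResponse.
function sendStatic(realPath, pathname, response) {
    fs.readFile(realPath, function (err, file) {
        const status = statusForReadError(err);
        if (status === 200) {
            response.writeHead(200, {"Content-Type": "text/html"});
            response.end(file);
        } else {
            response.writeHead(status, {"Content-Type": "text/plain"});
            response.end(status === 404
                ? "This request URL " + pathname + " was not found on this server."
                : "Internal server error.");
        }
    });
}
```

Inside the request handler this would be called as sendStatic(realPath, pathname, response), replacing the nested exists/readFile callbacks with a single read.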

Sharp-eyed readers may already have spotted the problem with this most basic static file server: MIME type support. Our server also has to serve HTML, CSS, JS, PNG, GIF, JPG and other files, and not every file has a MIME type of text/html.

MIME type support

As in other servers, MIME support is implemented with a mapping table.

exports.types = {
    "css": "text/css",
    "gif": "image/gif",
    "html": "text/html",
    "ico": "image/x-icon",
    "jpeg": "image/jpeg",
    "jpg": "image/jpeg",
    "js": "text/javascript",
    "json": "application/json",
    "pdf": "application/pdf",
    "png": "image/png",
    "svg": "image/svg+xml",
    "swf": "application/x-shockwave-flash",
    "tiff": "image/tiff",
    "txt": "text/plain",
    "wav": "audio/x-wav",
    "wma": "audio/x-ms-wma",
    "wmv": "video/x-ms-wmv",
    "xml": "text/xml"
};

The code above lives in a separate mime.js file. This file lists only some commonly used MIME types, with the file extension as the key and the MIME type as the value. We then require the mime.js file.

var mime = require("./mime").types;

We get the file extension with path.extname. Because the return value of extname includes the ".", we use slice to remove it. Files without an extension are all treated as unknown.

var ext = path.extname(realPath);
ext = ext ? ext.slice(1) : 'unknown';

From here it is easy to get the real MIME type.

var contentType = mime[ext] || "text/plain";
response.writeHead(200, {'Content-Type': contentType});
response.write(file, "binary");
response.end();

For unknown types, we return text/plain.
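The lookup steps above can be gathered into one small function. This is a minimal sketch: contentTypeFor is a hypothetical name, the table is a shortened copy of the mapping, and lower-casing the extension is an added assumption so that LOGO.PNG resolves like logo.png.

```javascript
const path = require("path");

// Shortened copy of the extension-to-MIME table (illustrative subset).
const types = {
    "css": "text/css",
    "html": "text/html",
    "js": "text/javascript",
    "png": "image/png"
};

// Return the Content-Type for a file path, falling back to text/plain.
function contentTypeFor(realPath) {
    let ext = path.extname(realPath);          // e.g. ".png", or "" if none
    ext = ext ? ext.slice(1).toLowerCase() : "unknown";
    // Own-property check so names such as "constructor" do not hit
    // properties inherited from Object.prototype.
    return Object.prototype.hasOwnProperty.call(types, ext)
        ? types[ext]
        : "text/plain";
}

console.log(contentTypeFor("assets/index.html"));  // text/html
console.log(contentTypeFor("assets/README"));      // text/plain
```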

Cache support and control

With MIME support in place, the static file server looks complete. Any static file dropped into the assets directory can be served right away. It seems we have achieved the same effect as Apache acting as a static file server, and with only this many lines of code. Isn't that simple?

However, on every request the server calls fs.readFile to read the file from disk. As the request volume grows, the disk I/O will become unbearable.

Before solving this problem, we need to understand some of the browser's caching mechanisms and the strategies that improve performance.

    1. Compressing files with gzip reduces the size of the response and so saves bandwidth.
    2. When a copy of the file is stored in the browser cache but the browser cannot tell whether it is still valid, it issues a conditional GET request.
      1. The request headers include If-Modified-Since.
      2. If the file on the server has been modified since that time, the entire file is sent to the front end.
      3. If it has not been modified, a 304 status code is returned and the file is not sent.
      4. Another validation mechanism is the ETag. It is not discussed here.
    3. If the copy is valid, the GET request is skipped altogether. The most important way of judging validity is the Expires header in the server's response.
      1. The browser checks the Expires header and does not issue a new request until the given date has passed.
      2. Another way to achieve the same goal is to return Cache-Control: max-age=xxxx.

For more on caching mechanisms, see Steve Souders's book "High Performance Web Sites".

To keep things simple, we will do only the following:

    1. For files with certain extensions, add the Expires header and the Cache-Control: max-age header to the response. The expiry time is set to 1 year.
    2. Because this is a static file server, return the Last-Modified header in the response to every request.
    3. For requests that carry If-Modified-Since, compare the dates. If the file has not been modified, return 304. If it has been modified, return the file.

For the static file server above, the response headers produced by Node are very simple:

Connection: keep-alive
Content-Type: text/html
Transfer-Encoding: chunked

To keep the matching file extensions and the expiry time configurable, we create a config.js file.

exports.Expires = {
    fileMatch: /^(gif|png|jpg|js|css)$/ig,
    maxAge: 60 * 60 * 24 * 365
};

Require the config.js file.

var config = require("./config");

We check whether the extension is one of those that should get the expiry headers.

var ext = path.extname(realPath);
ext = ext ? ext.slice(1) : 'unknown';
if (ext.match(config.Expires.fileMatch)) {
    var expires = new Date();
    // maxAge is in seconds, setTime works in milliseconds
    expires.setTime(expires.getTime() + config.Expires.maxAge * 1000);
    response.setHeader("Expires", expires.toUTCString());
    response.setHeader("Cache-Control", "max-age=" + config.Expires.maxAge);
}

The response now carries two more headers.

Cache-Control: max-age=31536000
Connection: keep-alive
Content-Type: image/png
Expires: Fri, Nov 2012 12:55:41 GMT
Transfer-Encoding: chunked

Before sending a request, the browser checks Cache-Control and Expires (Cache-Control takes priority over Expires, but some browsers do not support Cache-Control, in which case Expires is used). If the copy has not expired, the request is not sent and the file is read directly from the cache.
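Both expiry headers are derived from the same max-age value, one as an absolute date and one as a number of seconds. A small sketch makes the unit conversion explicit; expiryHeaders is a hypothetical helper, and the now parameter is added here only so that the result is reproducible.

```javascript
// Build the Expires and Cache-Control headers for a max-age given in seconds.
// "now" is injectable so the result can be checked deterministically.
function expiryHeaders(maxAgeSeconds, now) {
    const base = now || new Date();
    // Date arithmetic is in milliseconds, max-age is in seconds.
    const expires = new Date(base.getTime() + maxAgeSeconds * 1000);
    return {
        "Expires": expires.toUTCString(),
        "Cache-Control": "max-age=" + maxAgeSeconds
    };
}

const oneYear = 60 * 60 * 24 * 365;
console.log(expiryHeaders(oneYear, new Date(Date.UTC(2012, 0, 1))));
```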

Next we add the Last-Modified header to every response.

The file's last modification time is read with the fs.stat() method of the fs module. For more information on stat, see the fs module documentation.

fs.stat(realPath, function (err, stat) {
    var lastModified = stat.mtime.toUTCString();
    response.setHeader("Last-Modified", lastModified);
});

We also need to detect whether the browser sent an If-Modified-Since request header. If it did, and its value equals the file's modification time, we return a 304 status.

// Node lower-cases incoming header names
var ifModifiedSince = "If-Modified-Since".toLowerCase();
if (request.headers[ifModifiedSince] && lastModified == request.headers[ifModifiedSince]) {
    response.writeHead(304, "Not Modified");
    response.end();
}

If the header is not sent, or does not match the file's modification time on disk, we send back the latest file from disk.
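The check above compares the two date strings for exact equality. A slightly more tolerant sketch, offered as an assumption and not as the article's code, parses both sides and compares them at one-second resolution, since HTTP dates carry no milliseconds. The name isNotModified is hypothetical.

```javascript
// Decide whether a conditional GET can be answered with 304.
// headers: request.headers (Node lower-cases header names)
// mtime:   the file's modification time as a Date (e.g. stat.mtime)
function isNotModified(headers, mtime) {
    const since = Date.parse(headers["if-modified-since"]);
    if (isNaN(since)) {
        return false;                       // header absent or unparsable
    }
    // HTTP dates have one-second resolution, so drop the milliseconds.
    const modified = Math.floor(mtime.getTime() / 1000) * 1000;
    return modified <= since;
}
```

In the request handler, a true result would lead to response.writeHead(304, "Not Modified") followed by response.end(), and a false result to sending the file as usual.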

With Expires and Last-Modified working together with the browser, a significant share of network traffic can be saved, and some disk I/O is avoided as well. With a CDN in front of it, the whole solution would be complete.

Since both Expires and max-age are evaluated by the browser, a cache hit means that no HTTP request reaches the server at all, so they can only be tested with Fiddler together with a browser. Last-Modified, however, can be tested with curl.

#:~$ curl --header "If-Modified-Since: Fri, one-19:14:51 GMT" -i http://localhost:8000
HTTP/1.1 304 Not Modified
Content-Type: text/html
Last-Modified: Fri, 19:14:51 GMT
Connection: keep-alive

Note that the 304 response carries no body, which meets our goal of saving bandwidth. A few lines of code can save a lot of bandwidth cost.

However, we mentioned gzip earlier. Without gzip, files such as CSS and JS still waste network bandwidth, so let's add gzip next.

Enabling gzip

If you work on the front end, you probably know compression tools such as YUI Compressor or Google Closure Compiler. Applying gzip on top of them reduces network traffic considerably. So let's see how to turn on gzip in Node.

Using gzip requires the zlib module, which Node has supported natively since version 0.5.8.

var zlib = require("zlib");

Image files do not need gzip compression, so we configure a list of compressible types in config.js.

exports.Compress = {
    match: /css|js|html/ig
};

To cope with large files, and to fit the way the zlib module is used, we read the file as a stream.

var raw = fs.createReadStream(realPath);
var acceptEncoding = request.headers['accept-encoding'] || "";
var matched = ext.match(config.Compress.match);
if (matched && acceptEncoding.match(/\bgzip\b/)) {
    response.writeHead(200, "OK", {
        'Content-Encoding': 'gzip'
    });
    raw.pipe(zlib.createGzip()).pipe(response);
} else if (matched && acceptEncoding.match(/\bdeflate\b/)) {
    response.writeHead(200, "OK", {
        'Content-Encoding': 'deflate'
    });
    raw.pipe(zlib.createDeflate()).pipe(response);
} else {
    response.writeHead(200, "OK");
    raw.pipe(response);
}

For file types configured for compression, when the browser accepts gzip or deflate, we compress the response. Otherwise the file stream is piped straight to the response.
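The negotiation described above boils down to a small decision function plus a zlib transform. The sketch below separates the decision so that it can be checked on its own. chooseEncoding is a hypothetical name, and the extension pattern is anchored here (an assumption of this sketch) so that it matches whole extensions only.

```javascript
const zlib = require("zlib");

// Extensions worth compressing; anchored so that only whole extensions match.
const compressible = /^(css|js|html)$/i;

// Return "gzip", "deflate" or null for a file extension and the
// request's Accept-Encoding header value.
function chooseEncoding(ext, acceptEncoding) {
    const accepted = acceptEncoding || "";
    if (!compressible.test(ext)) {
        return null;
    }
    if (/\bgzip\b/.test(accepted)) {
        return "gzip";
    }
    if (/\bdeflate\b/.test(accepted)) {
        return "deflate";
    }
    return null;
}

// Round trip to show what the gzip branch does to a repetitive payload.
const body = Buffer.from("body { margin: 0 }".repeat(50));
const packed = zlib.gzipSync(body);
console.log(chooseEncoding("css", "gzip, deflate"), body.length, "->", packed.length);
```

Keeping the decision separate from the piping makes it easy to reuse the same rule for both the full-file and the partial-content code paths.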

Enabling compression really is that simple. If you have Fiddler, you can monitor the traffic and see the compressed responses.

Security Issues

We have built quite a lot, but paid little attention to security. Where are problems most likely to occur?

The code above is still a bit messy, and I would rather not show such messy code to others. But what if someone visits http://localhost:8000/../app.js with a browser?

Don't worry: the browser resolves the two dots as a parent path by itself and rewrites the URL to http://localhost:8000/app.js. That file does not exist in the assets directory, so 404 Not Found is returned.

But a cleverer visitor might use curl -i http://localhost:8000/../app.js instead, and then the problem appears.

# curl -i http://localhost:8000/../app.js
HTTP/1.1 200 OK
Content-Type: text/javascript
Last-Modified: Thu, 17:16:51 GMT
Expires: Sat, 2012 04:59:27 GMT
Cache-Control: max-age=31536000
Connection: keep-alive
Transfer-Encoding: chunked

var PORT = 8000;
var http = require("http");
var url = require("url");
var fs = require("fs");
var path = require("path");
var mime = require("./mime").types;

So what do we do? The brute-force solution is to forbid parent paths.

First remove every "..", then call path.normalize to clean up the abnormal slashes.

var realPath = path.join("assets", path.normalize(pathname.replace(/\.\./g, "")));

Now, when we request curl -i http://localhost:8000/../app.js, /../app.js is first replaced by //app.js. The normalize method turns //app.js into /app.js. Joined with the real assets directory, the request maps to assets/app.js. That file does not exist, so 404 is returned. The parent path problem is fixed, and the behaviour matches the browser's.
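An alternative to stripping the dots, sketched here under the assumption that assets is the only directory that may be served, is to resolve the requested path and verify that the result still lies inside the root. The helper name resolveInsideRoot is hypothetical.

```javascript
const path = require("path");

// Resolve a URL pathname (expected to start with "/") against a root
// directory. Returns the absolute file path, or null if it would escape
// the root.
function resolveInsideRoot(root, pathname) {
    const rootAbs = path.resolve(root);
    // Prefix with "." so the absolute-looking pathname stays relative to root.
    const resolved = path.resolve(rootAbs, "." + pathname);
    if (resolved === rootAbs || resolved.startsWith(rootAbs + path.sep)) {
        return resolved;
    }
    return null;
}

console.log(resolveInsideRoot("assets", "/css/site.css"));
console.log(resolveInsideRoot("assets", "/../app.js"));   // null
```

A null result would be answered with 404, exactly as for a missing file. Unlike the replace-based approach, this keeps legitimate file names that happen to contain two consecutive dots.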

Welcome page: the icing on the cake

Recall the usual behaviour of Apache. When a directory path is requested, it looks for index.html, and if index.html does not exist it returns a directory index. We do not consider directory indexes here. If the requested path ends with /, we automatically append index.html to it. If that file does not exist, we still return a 404 error.

If the user requests a directory path without the trailing /, we append /index.html to it and resolve the path again.

If you dislike hard-coding, you will want to put this file name into config.js. That way any file with any extension can be chosen as the welcome page.

exports.Welcome = {
    file: "index.html"
};

The first step: for requests ending with /, automatically append "index.html".

if (pathname.slice(-1) === "/") {
    pathname = pathname + config.Welcome.file;
}

The second step: if a directory path is requested that does not end with /, we need an extra check. If the path being read is a directory, we append / and index.html.

if (stats.isDirectory()) {
    realPath = path.join(realPath, "/", config.Welcome.file);
}
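The directory check can be wrapped in one helper. This is a sketch under the assumption that the caller has already obtained an fs.Stats object for the path (or null if the path does not exist); withWelcomeFile is a hypothetical name.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// If realPath is a directory, point it at the welcome file inside it.
// stats is the fs.Stats for realPath, or null when the path does not exist.
function withWelcomeFile(realPath, stats, welcomeFile) {
    if (stats && stats.isDirectory()) {
        return path.join(realPath, welcomeFile);
    }
    return realPath;
}

// Demonstration against a directory that certainly exists.
const dir = os.tmpdir();
console.log(withWelcomeFile(dir, fs.statSync(dir), "index.html"));
```

Because path.join normalizes separators, the same helper covers a directory path both with and without the trailing slash.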

Our structure has changed a little at this point, so the function needs to be refactored. Moreover, fs.stat does more than the exists method, so we simply replace the latter with it.

That's it. A reasonably complete static file server is now in place.

Range support: resumable media downloads

For the definition of Range in HTTP/1.1, see these two documents:

    • http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
    • http://labs.apache.org/webarch/http/draft-fielding-http/p5-range.html

Next, I'll briefly describe what Range is for and how it is defined.

Suppose a user is listening to a song and the network drops halfway through (with half of the file downloaded). If the file server does not support resuming, the user has to download the whole file again to continue listening. With Range support, the client records the range of the file it has already read and, once the network is back, asks the server for the remaining range only. The server sends just the part the client requested instead of the entire file, which saves network bandwidth.
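From the client's point of view, resuming comes down to remembering how many bytes have arrived and asking for the rest. A minimal sketch of that bookkeeping follows; resumeRangeHeader is a hypothetical helper, not part of any library.

```javascript
// Build the Range header value a client would send to resume a download.
// received: number of bytes already stored locally
// total:    full size of the file, if known (optional)
function resumeRangeHeader(received, total) {
    if (typeof total === "number" && received >= total) {
        return null;                  // nothing left to fetch
    }
    return "bytes=" + received + "-"; // from offset "received" to the end
}

console.log(resumeRangeHeader(1024, 4096));  // bytes=1024-
```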

So what does the HTTP/1.1 specification stipulate for Range?

      1. If the server supports Range, it must first tell the client so, before the client can send requests with a Range header.
response.setHeader('Accept-Ranges', 'bytes');
      2. The server looks for Range: bytes=0-xxx in the request headers to decide whether this is a Range request. If the value is present and valid, only the requested part of the file is sent back, the response status code becomes 206 (Partial Content), and Content-Range is set. If the value is invalid, a 416 status code is returned, meaning Requested Range Not Satisfiable (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.17). If the request carries no Range header, the server responds in the regular way.
      3. The format of the Range request header needs some explanation.
ranges-specifier = byte-ranges-specifier
byte-ranges-specifier = bytes-unit "=" byte-range-set
byte-range-set = 1#( byte-range-spec | suffix-byte-range-spec )
byte-range-spec = first-byte-pos "-" [last-byte-pos]
first-byte-pos = 1*DIGIT
last-byte-pos = 1*DIGIT

The definition above comes from the protocol text hosted by the W3C at http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35. It can be roughly written as Range: bytes=[start]-[end][,[start]-[end]]. In short, there are these cases:

bytes=0-99: the bytes from offset 0 to 99.

bytes=-100: the last 100 bytes of the file.

bytes=100-: all bytes from offset 100 to the end of the file.

bytes=0-99,200-299: the bytes from offset 0 to 99 and from 200 to 299.

So let's implement it. We start by detecting the Range request and checking whether it is valid. To keep the code clean, we wrap this in a parseRange method. Since it is a utility, we put it in a utils.js file.

var utils = require("./utils");

For now we do not support multiple ranges, so when we encounter a comma we report a 416 error.

// Note: str is expected without the "bytes=" unit, e.g. "0-20"
exports.parseRange = function (str, size) {
    if (str.indexOf(",") != -1) {
        return;
    }
    var range = str.split("-"),
        start = parseInt(range[0], 10),
        end = parseInt(range[1], 10);
    // Case: -100
    if (isNaN(start)) {
        start = size - end;
        end = size - 1;
    // Case: 100-
    } else if (isNaN(end)) {
        end = size - 1;
    }
    // Invalid
    if (isNaN(start) || isNaN(end) || start > end || end > size) {
        return;
    }
    return {
        start: start,
        end: end
    };
};
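Note that the function above splits the raw string on "-", so it expects a value such as 0-20 without the bytes= unit that browsers send. A hedged variant that also accepts the standard form might look like the sketch below. parseRangeHeader is a hypothetical name. Like the article, it rejects multiple ranges and an end position past the file, although the HTTP specification would instead let a server treat an oversized last-byte-pos as the end of the file.

```javascript
// Parse a single-range "Range" header value against a file of "size" bytes.
// Returns {start, end} (inclusive offsets) or null when a 416 is appropriate.
function parseRangeHeader(value, size) {
    const m = /^(?:bytes=)?(\d*)-(\d*)$/.exec(String(value).trim());
    if (!m || (m[1] === "" && m[2] === "")) {
        return null;                       // malformed, or multiple ranges
    }
    let start, end;
    if (m[1] === "") {                     // suffix form: -100 (last 100 bytes)
        start = size - parseInt(m[2], 10);
        end = size - 1;
    } else {
        start = parseInt(m[1], 10);
        // open form: 100- (to the end of the file)
        end = m[2] === "" ? size - 1 : parseInt(m[2], 10);
    }
    if (start < 0 || start > end || end >= size) {
        return null;
    }
    return {start: start, end: end};
}

console.log(parseRangeHeader("bytes=0-20", 54));
```

The sketch also rejects a suffix longer than the file (which would otherwise yield a negative start) and an end equal to the file size, since valid offsets run from 0 to size - 1.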

If the Range is valid, we add Content-Range to the response and adjust Content-Length.

response.setHeader("Content-Range", "bytes " + range.start + "-" + range.end + "/" + stats.size);
response.setHeader("Content-Length", (range.end - range.start + 1));
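Because both positions are inclusive, the length of the slice is end - start + 1. The sketch below packages the two headers so that the arithmetic can be checked in isolation; rangeHeaders is a hypothetical helper.

```javascript
// Build the headers of a 206 response for an inclusive byte range.
function rangeHeaders(range, size) {
    return {
        "Content-Range": "bytes " + range.start + "-" + range.end + "/" + size,
        // Both ends are inclusive, hence the + 1.
        "Content-Length": range.end - range.start + 1
    };
}

console.log(rangeHeaders({start: 0, end: 20}, 54));
```

For a 54-byte file and the range 0-20 this gives Content-Range: bytes 0-20/54 and a Content-Length of 21.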

Happily, Node's file read stream natively supports reading a range.

var raw = fs.createReadStream(realPath, {"start": range.start, "end": range.end});

Set the status code to 206.

After the range is selected, the data still has to go through gzip, so the code starts to look like spaghetti. Let's refactor it. The code now looks roughly like this:

var compressHandle = function (raw, statusCode, reasonPhrase) {
    var stream = raw;
    var acceptEncoding = request.headers['accept-encoding'] || "";
    var matched = ext.match(config.Compress.match);
    if (matched && acceptEncoding.match(/\bgzip\b/)) {
        response.setHeader("Content-Encoding", "gzip");
        stream = raw.pipe(zlib.createGzip());
    } else if (matched && acceptEncoding.match(/\bdeflate\b/)) {
        response.setHeader("Content-Encoding", "deflate");
        stream = raw.pipe(zlib.createDeflate());
    }
    response.writeHead(statusCode, reasonPhrase);
    stream.pipe(response);
};

if (request.headers["range"]) {
    var range = utils.parseRange(request.headers["range"], stats.size);
    if (range) {
        response.setHeader("Content-Range", "bytes " + range.start + "-" + range.end + "/" + stats.size);
        response.setHeader("Content-Length", (range.end - range.start + 1));
        var raw = fs.createReadStream(realPath, {"start": range.start, "end": range.end});
        compressHandle(raw, 206, "Partial Content");
    } else {
        response.removeHeader("Content-Length");
        response.writeHead(416, "Request Range Not Satisfiable");
        response.end();
    }
} else {
    var raw = fs.createReadStream(realPath);
    compressHandle(raw, 200, "OK");
}

Test it with the request curl --header "Range:0-20" -i http://localhost:8000/index.html.

HTTP/1.1 206 Partial Content
Server: Node/V5
Accept-Ranges: bytes
Content-Type: text/html
Content-Length: 21
Last-Modified: Fri, 19:14:51 GMT
Content-Range: bytes 0-20/54
Connection: keep-alive

The index.html file is not sent to the client in full. The output may look shorter than 21 bytes, because \t and \r each count as one byte.

Then run the negative test with curl --header "Range:0-100" -i http://localhost:8000/index.html.

HTTP/1.1 416 Request Range Not Satisfiable
Server: Node/V5
Accept-Ranges: bytes
Content-Type: text/html
Last-Modified: Fri, Nov 19:14:51 GMT
Connection: keep-alive
Transfer-Encoding: chunked

That is the effect we wanted. Range support is now complete, and this static file server can serve some streaming media files.

Well. It's so simple.

This article is reproduced from an article by Tianyong in the CNode community.

About the author

Tianyong (Sina Weibo: @ Park) is a front-end engineer currently at SAP, working on mobile web app research and development. He is highly enthusiastic about Node.js and hopes to break down the barrier between front-end JavaScript and Node.js and introduce Node.js to more front-end engineers. Interests: reading thousands of books and travelling thousands of miles. Personal GitHub address: http://github.com/jacksontian. The address of this project is https://github.com/jacksontian/ping; you are welcome to follow its progress.
