This article introduces url-valid, a Node.js module for URL verification, and ends with the full source code for reference. In JavaScript, URL verification is usually done with a regular expression that checks whether the format is correct, for example:
The code is as follows:
/^https?:\/\//.test(url);
Of course, there are more thorough format checks, such as the valid-url library, which validates against RFC 3986, RFC 3966, RFC 4694, RFC 4759, and other standards.
However, format-based validation cannot tell whether the URL actually exists. That is why url-valid verifies a URL by sending an HTTP request.
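To make the limitation concrete, here is a minimal sketch of a format-only check. The helper name `hasValidFormat` is hypothetical, and it relies on the WHATWG `URL` class built into Node.js rather than on the valid-url library. It accepts any well-formed http(s) address, including one on the reserved `.invalid` top-level domain, which never resolves:

```javascript
// Hypothetical helper: checks only the format of an http(s) URL.
function hasValidFormat(input) {
  try {
    const parsed = new URL(input); // throws a TypeError if input cannot be parsed
    return parsed.protocol === 'http:' || parsed.protocol === 'https:';
  } catch (err) {
    return false;
  }
}

// Well formed, yet the ".invalid" domain is reserved and never resolves.
console.log(hasValidFormat('http://example.invalid/page')); // true
console.log(hasValidFormat('example.com/page'));            // false (no scheme)
```

A format check like this can only reject malformed input. Whether the page is reachable still has to be confirmed with a real request.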
Interface Design
In fact, we only need one function that takes a URL and reports, through a callback, whether the link is available.
However, a request can fail in unpredictable ways, so the callback takes an error as its first parameter. If it is not null, an error occurred.
We may also want to obtain the page's data, which can later be used to extract information from the page.
The operations should be chainable wherever possible.
So the final usage looks roughly like this:
The code is as follows:
valid(url)
  .on('check', function (err, status) {
    if (err) throw err;
    status ?
      console.log('url is usable') :
      console.log('url is unavailable');
  })
  .on('data', function (err, data) {
    console.log(data);
  })
  .on('end', function (err, data) {
    console.log('request termination');
  });
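The chaining in this interface works because each call to `on` returns the object itself. The following is a minimal sketch of that idea, not the module's actual code: the class name `Checker` is hypothetical, and the `fetch` flag records whether anyone has asked for page data.

```javascript
const { EventEmitter } = require('events');

// Hypothetical wrapper, for illustration only.
class Checker {
  constructor() {
    this.emitter = new EventEmitter();
    this.fetch = false; // becomes true once a 'data' listener is registered
  }

  // Register a listener and return `this` so that calls can be chained.
  on(event, callback) {
    if (event === 'data') this.fetch = true;
    this.emitter.on(event, callback);
    return this;
  }
}

const checker = new Checker()
  .on('check', (err, status) => {
    if (err) throw err; // error-first convention: a non-null err means failure
    console.log(status ? 'url is usable' : 'url is unavailable');
  })
  .on('end', () => console.log('request termination'));

// Simulate a successful check followed by the end of the request.
checker.emitter.emit('check', null, true); // prints "url is usable"
checker.emitter.emit('end');               // prints "request termination"
```

Returning `this` from `on` keeps the caller's code flat, and the `fetch` flag lets the checker skip downloading the body when nobody listens for `data`.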
HTTP GET or HTTP HEAD
Originally we wanted to use an HTTP HEAD request, because a HEAD request returns only the header information, which shortens the request time. However, not every link supports HEAD requests.
In the end, we use the HTTP GET method and abort the request as soon as the statusCode has been obtained.
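The GET-then-abort approach can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: `getStatusCode` is a hypothetical helper, and it calls `req.destroy()` because current Node.js releases deprecate `req.abort()`, which the article's code uses.

```javascript
const http = require('http');
const https = require('https');

// Hypothetical helper: reports the status code of a GET request without
// downloading the body. The callback is error-first: callback(err, statusCode).
function getStatusCode(url, callback) {
  const lib = url.startsWith('https:') ? https : http;
  let finished = false;
  const finish = (err, statusCode) => {
    if (finished) return; // report only once
    finished = true;
    callback(err, statusCode);
  };

  const req = lib.get(url, { agent: false }, (res) => {
    const { statusCode } = res;
    req.destroy(); // the headers have arrived; drop the connection, skip the body
    finish(null, statusCode);
  });
  req.on('error', (err) => finish(err));
}

// Usage (the URL is a placeholder):
// getStatusCode('http://example.com/', (err, code) => console.log(err || code));
```

Closing the connection right after the headers arrive gives most of the time saving of HEAD while still working with servers that reject HEAD requests.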
Handling 301-303
Because status codes 301 to 303 are redirects, we need to go on and check whether the address given in the Location header exists.
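The redirect step depends on turning the Location header into an absolute URL, since servers may send a relative path. Here is a small sketch of that step on its own. The helper name `redirectTarget` is hypothetical, and it uses the WHATWG `URL` constructor where the article's code uses `URL.resolve` from the `url` module:

```javascript
// Hypothetical helper: returns the absolute URL to check next when the response
// is a 301-303 redirect with a Location header, otherwise null.
function redirectTarget(currentUrl, statusCode, headers) {
  const isRedirect = statusCode > 300 && statusCode < 304;
  if (!isRedirect || !headers.location) return null;
  // Location may be relative, so resolve it against the URL that was requested.
  return new URL(headers.location, currentUrl).href;
}

console.log(redirectTarget('http://example.com/a/b', 301, { location: '/c' }));
// http://example.com/c
console.log(redirectTarget('http://example.com/a/b', 302, { location: 'https://example.org/' }));
// https://example.org/
console.log(redirectTarget('http://example.com/', 200, {}));
// null
```

A checker that follows redirects would probably also want to cap the number of hops so that a redirect loop cannot run forever. That limit is an assumption here and is not part of the code shown in this article.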
Asynchronous execution using process.nextTick
To make sure the request code runs only after the listeners have been registered, we use process.nextTick to run it asynchronously.
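The effect of the deferral can be seen in a stripped-down sketch. The class name `DeferredTask` and the `done` event are hypothetical and stand in for the real request logic:

```javascript
const { EventEmitter } = require('events');

// Hypothetical class: defers its work so that callers can attach listeners first.
class DeferredTask extends EventEmitter {
  constructor() {
    super();
    // Emitting synchronously here would fire before any listener exists.
    process.nextTick(() => this.emit('done', 'finished'));
  }
}

const log = [];
const task = new DeferredTask();
task.on('done', (msg) => {
  log.push(msg);
  console.log(log.join(' -> '));
});
log.push('listener registered');
// prints "listener registered -> finished"
```

Callbacks passed to `process.nextTick` run after the current synchronous code has finished, so every listener attached in the same chain of calls is in place before the first event fires.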
Implementation
The code is as follows:
/*!
 * valid
 * MIT Licensed
 */
module.exports = (function () {
  'use strict';
  var http = require('http')
    , https = require('https')
    , EventEmitter = require('events').EventEmitter
    , URL = require('url')
    , urlReg = /^(https?):\/\//;
  /**
   * Valid
   * @class
   */
  function Valid(url, callback) {
    var that = this;
    this.url = url;
    this.emitter = new EventEmitter();
    process.nextTick(function () {
      that.get(url);
    });
    this.fetch = false;
    callback && this.emitter.on('check', callback);
  }
  Valid.prototype = {
    constructor: Valid,
    /**
     * get
     * @param {String} url
     */
    get: function (url) {
      var match = url.match(urlReg)
        , that = this;
      if (match) {
        // Pick the client library that matches the URL scheme.
        var httpLib = (match[1].toLowerCase() === 'http') ? http : https
          , opts = URL.parse(url)
          , req;
        opts.agent = false;
        opts.method = 'GET';
        req = httpLib.request(opts, function (res) {
          var statusCode = res.statusCode;
          if (statusCode === 200) {
            that.emitter.emit('check', null, true);
            // Stream the body only if a 'data' listener was registered;
            // otherwise abort as soon as the status code is known.
            that.fetch ?
              (res.on('data', function (data) {
                that.emitter.emit('data', null, data);
              }) && res.on('end', function () {
                that.emitter.emit('end');
              })) :
              (req.abort() || that.emitter.emit('end'));
          } else if (300 < statusCode && statusCode < 304) {
            // 301-303: check the redirect target with a new Valid instance.
            req.abort();
            var emitter = that.emitter
              , valid = one(URL.resolve(url, res.headers.location), function (err, valid) {
                emitter.emit('check', err, valid);
              });
            that.fetch && valid.on('data', function (err, data) {
              emitter.emit('data', err, data);
            });
            valid.on('error', function (err) {
              that.emitter.emit('error', err);
            });
            valid.on('end', function () {
              that.emitter.emit('end');
            });
          } else {
            that.emitter.emit('check', null, false);
          }
          res.on('error', function (err) {
            req.abort();
            that.emitter.emit('data', err);
          });
        });
        // A failed request means the URL is unavailable.
        req.on('error', function (err) {
          req.abort();
          return that.emitter.emit('check', null, false);
        });
        req.end();
      } else {
        // Not an http(s) URL.
        return that.emitter.emit('check', null, false);
      }
    },
    /**
     * on
     * @param {String} event
     * @param {Function} callback
     */
    on: function (event, callback) {
      (event === 'data') && (this.fetch = true);
      this.emitter.on(event, callback);
      return this;
    },
    /**
     * destroy
     */
    destroy: function () {
      this.emitter.removeAllListeners();
      this.url = undefined;
      this.emitter = null;
      this.fetch = undefined;
    },
    /**
     * removeAllListeners
     * @param {String} [event]
     */
    removeAllListeners: function (event) {
      event ?
        this.emitter.removeAllListeners(event) :
        this.emitter.removeAllListeners();
      return this;
    },
    /**
     * listeners
     * @param {String} [event]
     */
    listeners: function (event) {
      if (event) {
        return this.emitter.listeners(event);
      } else {
        var res = []
          , that = this
          , _push = Array.prototype.push;
        Object.keys(this.emitter._events).forEach(function (key) {
          _push.apply(res, that.emitter.listeners(key));
        });
        return res;
      }
    }
  };
  /**
   * one
   * @param {String} url
   * @param {Function} callback
   * @return {Valid}
   */
  function one(url, callback) {
    return (new Valid(url, callback));
  }
  one.one = one;
  return one;
})();