This project crawls free public proxies (HTTP/SOCKS) from the web, parses and stores them, and serves scenarios that need to switch IP addresses (crawlers, voting, and so on).
Project address: https://github.com/Jwnie/proxyservice
1. Built with Spring Boot for rapid development, with MySQL for storage and HttpClient 4.x, selenium+chrome, and jsoup for downloading and parsing pages; crawled proxies are checked for connectivity on a schedule to confirm they are still usable.
2. Two proxy query interfaces are currently supported, and more can be added as needed:
(1) http://localhost:8888/proxy/getProxy?isDemostic=true&anonymousType=elite&protocolType=https
By default, the first 100 available proxies are returned.
Parameter description:
(1) isDemostic: optional; whether the proxy is domestic. Allowed values: true, false.
(2) anonymousType: optional; the proxy's anonymity level, one of four: transparent, anonymous, distorting, or elite (high anonymity).
(3) protocolType: optional; the proxy protocol, one of http, https, socks4, socks5, or socks (SOCKS proxies not subdivided into socks4 or socks5 are grouped under socks).
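Because all three parameters are optional, a client only needs to include the filters it cares about. The following is a minimal client-side sketch in Python (not part of the project) that builds such a query URL. The function name `build_get_proxy_url` and the constants are hypothetical, and it assumes the service expects the lowercase values used in the example URL above.

```python
from urllib.parse import urlencode

# Base URL taken from the example above; adjust host and port for your deployment.
BASE_URL = "http://localhost:8888/proxy/getProxy"

# Allowed values as listed in the parameter description.
# Assumption: the service expects them in lowercase, as in the example URL.
ANONYMOUS_TYPES = {"transparent", "anonymous", "distorting", "elite"}
PROTOCOL_TYPES = {"http", "https", "socks4", "socks5", "socks"}


def build_get_proxy_url(is_domestic=None, anonymous_type=None, protocol_type=None):
    """Build a getProxy URL; any argument left as None is omitted from the query."""
    params = {}
    if is_domestic is not None:
        # The service spells this parameter "isDemostic", so that spelling is kept.
        params["isDemostic"] = "true" if is_domestic else "false"
    if anonymous_type is not None:
        if anonymous_type not in ANONYMOUS_TYPES:
            raise ValueError(f"unknown anonymousType: {anonymous_type!r}")
        params["anonymousType"] = anonymous_type
    if protocol_type is not None:
        if protocol_type not in PROTOCOL_TYPES:
            raise ValueError(f"unknown protocolType: {protocol_type!r}")
        params["protocolType"] = protocol_type
    query = urlencode(params)
    return f"{BASE_URL}?{query}" if query else BASE_URL


# Reproduces the example request shown above.
print(build_get_proxy_url(is_domestic=True, anonymous_type="elite", protocol_type="https"))
```

Calling `build_get_proxy_url()` with no arguments yields the bare endpoint, which requests proxies without any filter.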
Return Data:
(2) http://localhost:8888/proxy/proxyStatistic
Returns the number of proxies, with counts grouped by source site.
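The scheduled connectivity check mentioned in item 1 could be sketched roughly as follows. This is an illustrative Python sketch, not the project's own code: it only tests whether a TCP connection to the proxy can be opened, whereas the project may validate proxies differently (for example, by sending a real request through them). The names `is_proxy_reachable` and `filter_reachable` are hypothetical, and the scheduling itself is left out.

```python
import socket


def is_proxy_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def filter_reachable(proxies, timeout=3.0):
    """Keep only the (host, port) pairs that still accept connections."""
    return [(host, port) for (host, port) in proxies
            if is_proxy_reachable(host, port, timeout)]
```

A scheduler would call `filter_reachable` on the stored proxies at a fixed interval and mark the ones that drop out as unavailable.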