Simulating a Login with Apache HttpClient

Source: Internet
Author: User
HttpClient is a subproject of Apache Jakarta Commons. It provides an efficient, up-to-date, feature-rich client-side programming toolkit for the HTTP protocol, and it supports the latest HTTP version and recommendations.

HttpClient

The main features provided by HttpClient are listed below. For more details, see the HttpClient home page.

  • Implements all HTTP methods (GET, POST, PUT, HEAD, etc.)
  • Supports automatic redirection
  • Supports HTTPS
  • Supports proxy servers

The following sections describe how to use these features one by one. First, HttpClient must be installed.

  • HttpClient can be downloaded from http://jakarta.apache.org/commons/httpclient/downloads.html
  • HttpClient uses Commons Logging, another subproject of Apache Jakarta Commons. Download it from http://jakarta.apache.org/site/downloads/downloads_commons-logging.cgi, extract commons-logging.jar from the downloaded package, and add it to the classpath.
  • HttpClient uses Commons Codec, also a subproject of Apache Jakarta Commons. Download the latest version from http://jakarta.apache.org/site/downloads/downloads_commons-codec.cgi, extract commons-codec-1.x.jar from the downloaded package, and add it to the classpath.

The following is a program I wrote:

    package com.newpalm.unicomfetch.threads;

    import org.apache.commons.httpclient.Cookie;
    import org.apache.commons.httpclient.HttpClient;
    import org.apache.commons.httpclient.NameValuePair;
    import org.apache.commons.httpclient.cookie.CookiePolicy;
    import org.apache.commons.httpclient.cookie.CookieSpec;
    import org.apache.commons.httpclient.methods.GetMethod;
    import org.apache.commons.httpclient.methods.PostMethod;
    import org.htmlparser.Node;
    import org.htmlparser.Parser;
    import org.htmlparser.tags.TableColumn;
    import org.htmlparser.tags.TableRow;
    import org.htmlparser.tags.TableTag;
    import org.htmlparser.visitors.ObjectFindingVisitor;

    /**
     * Shows an example of a logon form.
     * @author liudong
     */
    public class FormLoginDemo {

        public static void main(String[] args) throws Exception {
            Parser parser = null;
            ObjectFindingVisitor visitor = null;
            HttpClient client = new HttpClient();

            // Simulate the login page: post the form fields of splogin.aspx.
            PostMethod post = new PostMethod("http://211.90.119.58:9999/splogin.aspx");
            NameValuePair name = new NameValuePair("txthandsetnumber", "123456");
            NameValuePair pass = new NameValuePair("txtpassword", "123456");
            // The value must be copied from the hidden __VIEWSTATE field of the login page.
            NameValuePair __viewstate = new NameValuePair("__VIEWSTATE", "signature + o2w8ddxwpgw8vgv4dds + signature/Signature + xdxiclw + xdwvbwfycxvlzvw + oz4 + ozs + oz4 + oz4 + Signature = ");
            // Click coordinates of the image button used to submit the form.
            NameValuePair btnloginx = new NameValuePair("btnlogin.x", "0");
            NameValuePair btnloginy = new NameValuePair("btnlogin.y", "5");
            post.setRequestBody(new NameValuePair[] {name, pass, __viewstate, btnloginx, btnloginy});
            int status = client.executeMethod(post);

            post.releaseConnection();

            // View cookie information. The session cookie stays in the client's
            // state and is sent automatically with the following requests.
            CookieSpec cookiespec = CookiePolicy.getDefaultSpec();
            Cookie[] cookies = client.getState().getCookies();
            if (cookies.length == 0) {
                System.out.println("None");
            } else {
                for (int i = 0; i < cookies.length; i++) {
                    System.out.println(cookies[i].toString());
                }
            }

            // Access the required page
            GetMethod get = new GetMethod("http://211.90.119.58:9999/phonesearch.aspx");
            client.executeMethod(get);
            visitor = new ObjectFindingVisitor(TableTag.class);
            parser = new Parser();
            parser.setInputHTML(get.getResponseBodyAsString());
            parser.setEncoding("GBK");
            parser.visitAllNodesWith(visitor);

            // Retrieve the number of pages to be parsed
            Node[] tables = visitor.getTags();
            TableTag tabletag = (TableTag) tables[tables.length - 1];
            TableRow[] rows = tabletag.getRows();
            TableRow row = rows[0];
            TableColumn[] col = row.getColumns();
            int pagenumber = Integer.parseInt(col[0].getChildrenHTML().substring(25, 29));
            get.releaseConnection();

            // Request each result page by posting the paging form.
            for (int i = 1; i < pagenumber; i++) {
                PostMethod pt = new PostMethod("http://211.90.119.58:9999/phonesearch.aspx");
                NameValuePair txtpage = new NameValuePair("txtpage", Integer.toString(i));
                __viewstate = new NameValuePair("__VIEWSTATE", "");
                NameValuePair __eventtarget = new NameValuePair("__EVENTTARGET", "");
                NameValuePair __eventargument = new NameValuePair("__EVENTARGUMENT", "");
                NameValuePair tbmdn = new NameValuePair("tbmdn", "");
                NameValuePair tbservicetype = new NameValuePair("tbservicetype", "");
                NameValuePair tbstarttime = new NameValuePair("tbstarttime", "");
                NameValuePair tbendtime = new NameValuePair("tbendtime", "");
                NameValuePair btngotox = new NameValuePair("btngoto.x", "26");
                NameValuePair btngotoy = new NameValuePair("btngoto.y", "13");
                pt.setRequestBody(new NameValuePair[] {__eventtarget, __eventargument, __viewstate, tbmdn, tbservicetype, tbstarttime, tbendtime, txtpage, btngotox, btngotoy});
                int a = client.executeMethod(pt);

                parser.setInputHTML(pt.getResponseBodyAsString());
                parser.setEncoding("GBK");
                parser.visitAllNodesWith(visitor);

                tables = visitor.getTags();
                tabletag = (TableTag) tables[tables.length - 3];

                rows = tabletag.getRows();
                row = rows[1];
                col = row.getColumns();
                System.out.println(col[4].getChildrenHTML().toString());
                // Release the connection used by this POST (not the earlier GET).
                pt.releaseConnection();
            }

        }
    }

Common problems when using HttpClient

The following describes some problems that are often encountered when using HttpClient.

Character encoding

The encoding of a target page may be declared in two places: in the HTTP headers returned by the server, and in the HTML/XML page itself.

  • The Content-Type field in the HTTP header may contain character encoding information. For example, the returned header may contain Content-Type: text/html; charset=UTF-8, which states that the page is encoded in UTF-8. However, the header returned by the server may not match the actual content. For some double-byte languages, for example, the server may report UTF-8 while the real content is not UTF-8, so the encoding has to be obtained from another place. If the server reports a specific encoding other than UTF-8, such as gb2312, the information is more likely to be correct. The encoding named in the HTTP header can be obtained with the getResponseCharSet() method of the method object.
  • XML and HTML files allow the author to specify the encoding directly in the page, for example with a tag such as <meta http-equiv="Content-Type" content="text/html; charset=gb2312"/> or a declaration such as <?xml version="1.0" encoding="gb2312"?>. These declarations may conflict with the encoding returned in the HTTP header, in which case you need to determine for yourself which encoding the content really uses.
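The two places described above can be checked with a small helper. The following is only a sketch that uses the JDK alone: the class name CharsetSniffer, its method names, and the policy of preferring the in-page declaration over the header are assumptions made for illustration, not part of HttpClient. The markup argument is assumed to be the first part of the page, decoded with a single-byte encoding such as ISO-8859-1 so that the ASCII declarations stay readable.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds a charset name in a Content-Type value or in page markup. */
final class CharsetSniffer {
    // charset=xxx inside a Content-Type header value
    private static final Pattern HEADER_CHARSET =
        Pattern.compile("charset\\s*=\\s*[\"']?([\\w.:-]+)", Pattern.CASE_INSENSITIVE);
    // charset=xxx inside a <meta ...> tag
    private static final Pattern META_CHARSET =
        Pattern.compile("<meta[^>]*charset\\s*=\\s*[\"']?([\\w.:-]+)", Pattern.CASE_INSENSITIVE);
    // encoding="xxx" inside an XML declaration
    private static final Pattern XML_ENCODING =
        Pattern.compile("<\\?xml[^>]*encoding\\s*=\\s*[\"']([\\w.:-]+)[\"']", Pattern.CASE_INSENSITIVE);

    /** Returns the charset named in a Content-Type value, or null if there is none. */
    static String fromContentType(String contentType) {
        if (contentType == null) {
            return null;
        }
        Matcher m = HEADER_CHARSET.matcher(contentType);
        return m.find() ? m.group(1) : null;
    }

    /** Returns the charset declared in the markup (XML declaration first, then meta tag), or null. */
    static String fromMarkup(String markup) {
        if (markup == null) {
            return null;
        }
        Matcher xml = XML_ENCODING.matcher(markup);
        if (xml.find()) {
            return xml.group(1);
        }
        Matcher meta = META_CHARSET.matcher(markup);
        return meta.find() ? meta.group(1) : null;
    }

    /** Assumed policy: in-page declaration first, then the header, then the fallback. */
    static String choose(String contentType, String markup, String fallback) {
        String declared = fromMarkup(markup);
        if (declared != null) {
            return declared;
        }
        String header = fromContentType(contentType);
        return header != null ? header : fallback;
    }

    public static void main(String[] args) {
        String page = "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=gb2312\"/>";
        System.out.println(choose("text/html; charset=UTF-8", page, "ISO-8859-1"));
    }
}
```

Neither source is authoritative, so a real program may still have to verify the chosen encoding against the content itself.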

Automatic redirection

According to RFC 2616, there are two main types of redirection: 301 and 302. 301 means Moved Permanently: the requested resource has been moved to a fixed new location, and any request sent to the old address is redirected to the new one. 302 means a temporary redirect. For example, when a server-side servlet calls the sendRedirect method, the client receives a 302 status code, and the Location value in the response headers is the destination address passed to sendRedirect.

HttpClient supports automatic redirection, but not yet for requests that carry a body, such as POST and PUT. If 301 or 302 is returned after a POST is submitted, you must handle it yourself. Take the PostMethod example above: to enter the page shown after logging in (to a BBS, for example), you must issue a new request to that page, whose address is given in the Location header. Note that Location sometimes contains a relative path, so its value must be resolved against the original request address before the new request is sent.
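Resolving a relative Location value can be done with java.net.URI from the JDK. The sketch below assumes the status code and the Location value have already been read from the response (with commons-httpclient, typically the return value of executeMethod and post.getResponseHeader("Location").getValue()). The class name RedirectHelper and the example.com addresses are hypothetical.

```java
import java.net.URI;

/** Hypothetical helper for handling a 301/302 answer to a POST by hand. */
final class RedirectHelper {

    /** True for the two redirect codes discussed above. */
    static boolean isRedirect(int statusCode) {
        return statusCode == 301 || statusCode == 302;
    }

    /** Resolves a Location value, which may be relative, against the original request URL. */
    static String resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location.trim()).toString();
    }

    public static void main(String[] args) {
        String requestUrl = "http://www.example.com/bbs/login.jsp";
        // A relative path is resolved against the directory of the request URL.
        System.out.println(resolveLocation(requestUrl, "index.jsp"));
        // A path starting with "/" is resolved against the host.
        System.out.println(resolveLocation(requestUrl, "/home.jsp"));
        // An absolute URL is returned unchanged.
        System.out.println(resolveLocation(requestUrl, "http://other.example.org/a"));
    }
}
```

The resolved address would then be passed to a new GetMethod that is executed on the same HttpClient instance, so that the cookies set during login are sent along.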

Besides the response headers, a redirect can also be declared in the page itself. The tag that causes automatic page forwarding is <meta http-equiv="refresh" content="5; url=http://www.ibm.com/us">. To handle this case in a program, you have to parse the page and perform the redirect yourself. Note that the url value in the tag may also be a relative address, in which case it must be resolved before the request is forwarded.
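A page-level redirect of this kind can be detected with a regular expression. This is a rough sketch rather than a full HTML parser: it assumes the http-equiv attribute appears before the content attribute, as in the tag shown above, and the class name MetaRefresh is hypothetical.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the target of a <meta http-equiv="refresh"> tag. */
final class MetaRefresh {
    // Assumes http-equiv comes before content inside the tag.
    private static final Pattern REFRESH = Pattern.compile(
        "<meta[^>]*http-equiv\\s*=\\s*[\"']?refresh[\"']?[^>]*content\\s*=\\s*[\"']"
            + "\\s*\\d+\\s*;\\s*url\\s*=\\s*([^\"'>]+)",
        Pattern.CASE_INSENSITIVE);

    /** Returns the absolute redirect target, or null if the page has no refresh tag. */
    static String findTarget(String pageUrl, String html) {
        Matcher m = REFRESH.matcher(html);
        if (!m.find()) {
            return null;
        }
        // The url value may be relative, so resolve it against the page address.
        return URI.create(pageUrl).resolve(m.group(1).trim()).toString();
    }

    public static void main(String[] args) {
        String html = "<meta http-equiv = \"refresh\" content = \"5; url = http://www.ibm.com/us\">";
        System.out.println(findTarget("http://www.example.com/a/index.html", html));
    }
}
```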

Handling HTTPS

HttpClient provides SSL support. JSSE must be installed before SSL can be used. Starting with Sun JDK 1.4, JSSE is integrated into the JDK; with an earlier JDK, JSSE must be installed separately, and different JSSE vendors provide different implementations. The following describes how to use HttpClient to open an HTTPS connection. There are two ways to do so: the first is to obtain the certificate issued by the server and import it into the local keystore; the second is to extend HttpClient so that it accepts certificates automatically.

Method 1: obtain the certificate and import it into the local keystore:

  • Install JSSE (skip this step if you are using JDK 1.4 or later). This article uses IBM's JSSE as an example. Download the JSSE installation package from the IBM website, decompress it, and copy ibmjsse.jar to the <Java-Home>/lib/ext/ directory.
  • Obtain and import the certificate. The certificate can be obtained through IE:

    1. Use IE to open the HTTPS URL to be connected. A dialog box is displayed.

    2. Click "View Certificate", select "Details" in the dialog box that pops up, and click "Copy to File" to generate a certificate file for the web page with the wizard provided.

    3. Step 1 of the wizard: on the welcome page, click "Next".

    4. Step 2 of the wizard: select the export file format. Keep the default and click "Next".

    5. Step 3 of the wizard: enter the name of the export file and click "Next".

    6. Step 4 of the wizard: click "Finish" to complete the wizard.

    7. A final dialog box is displayed, indicating that the export was successful.

  • Use keytool to import the exported certificate into the local keystore. The keytool command is located in <Java-Home>/bin/. Open a command-line window and run the following command in the <Java-Home>/lib/security/ directory:

    keytool -import -noprompt -keystore cacerts -storepass changeit -alias yourEntry1 -file your.cer

    The alias parameter is followed by the unique identifier of the certificate in the keystore; it is not case-sensitive. The file parameter is followed by the path and file name of the certificate exported through IE. To delete the certificate just imported into the keystore, run the following command:

    keytool -delete -keystore cacerts -storepass changeit -alias yourEntry1

  • Write the program that accesses the HTTPS address. To test whether an HTTPS connection can be established, simply change the URL in the GetSample example to an HTTPS address.
    GetMethod getMethod = new GetMethod("https://www.yourdomain.com");

    Problems that may occur when running the program:

    1. The exception java.net.SocketException: Algorithm SSL not available is thrown. This may be because no JSSE provider has been added. If the IBM JSSE provider is used, add the following line to the program:

     if(Security.getProvider("com.ibm.jsse.IBMJSSEProvider") == null) Security.addProvider(new IBMJSSEProvider());

    Alternatively, open <Java-Home>/lib/security/java.security and, after the lines

    security.provider.1=sun.security.provider.Sun
    security.provider.2=com.ibm.crypto.provider.IBMJCE

    add the line security.provider.3=com.ibm.jsse.IBMJSSEProvider

    2. The exception java.net.SocketException: SSL implementation not available is thrown. This may be because ibmjsse.jar has not been copied to the <Java-Home>/lib/ext/ directory.

    3. The exception javax.net.ssl.SSLHandshakeException: unknown certificate is thrown. This indicates that JSSE is installed correctly, but the certificate has probably not been imported into the keystore of the JRE currently in use. Follow the steps described above to import the certificate.
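When diagnosing the first two problems, it can help to list the security providers that the running JRE has actually registered and to ask JSSE directly whether a protocol is available. The following sketch uses only standard JDK classes; the class name JsseCheck and the choice of "TLS" as the protocol to probe are assumptions made for illustration.

```java
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import java.security.Security;
import javax.net.ssl.SSLContext;

/** Hypothetical diagnostic: shows registered providers and probes for an SSL/TLS implementation. */
final class JsseCheck {

    /** Returns true if some registered provider implements the given SSLContext protocol. */
    static boolean isProtocolAvailable(String protocol) {
        try {
            SSLContext.getInstance(protocol);
            return true;
        } catch (NoSuchAlgorithmException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Providers are listed in the order given in java.security.
        for (Provider p : Security.getProviders()) {
            System.out.println(p.getName() + ": " + p.getInfo());
        }
        System.out.println("TLS available: " + isProtocolAvailable("TLS"));
    }
}
```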

Method 2: extend HttpClient to accept certificates automatically

Because this method automatically accepts all certificates, it carries certain security risks, so consider your system's security requirements carefully before using it. The procedure is as follows:

  • Provide a custom socket factory (test.MySecureProtocolSocketFactory). This class must implement the org.apache.commons.httpclient.protocol.SecureProtocolSocketFactory interface, and its implementation calls a custom X509TrustManager (test.MyX509TrustManager). You can obtain
  • Create an org.apache.commons.httpclient.protocol.Protocol instance, specifying the protocol name and default port number.
    Protocol myhttps = new Protocol("https", new MySecureProtocolSocketFactory(), 443);

  • Register the HTTPS protocol object just created.
    Protocol.registerProtocol("https", myhttps);

  • Then open the HTTPS target address in the usual way. For the code, see test.NoCertificationHttpsGetSample.
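The source of the two custom classes is not reproduced here. As a rough sketch of the JSSE part only, a trust manager that accepts every certificate, and an SSLSocketFactory built from it, might look like the following. The class names AcceptAllTrustManager and AcceptAllSsl and the "TLS" protocol name are assumptions; a SecureProtocolSocketFactory implementation would delegate its createSocket calls to the factory returned here, and that wiring is not shown.

```java
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

/** Accepts every certificate chain without checking it. Insecure: suitable for testing only. */
final class AcceptAllTrustManager implements X509TrustManager {
    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType) {
        // Deliberately empty: no client certificate is rejected.
    }

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType) {
        // Deliberately empty: no server certificate is rejected.
    }

    @Override
    public X509Certificate[] getAcceptedIssuers() {
        return new X509Certificate[0];
    }
}

/** Hypothetical holder for the socket factory that a custom protocol factory would wrap. */
final class AcceptAllSsl {
    static SSLSocketFactory socketFactory() {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            // No key managers, the accept-all trust manager, default source of randomness.
            context.init(null, new TrustManager[] { new AcceptAllTrustManager() }, null);
            return context.getSocketFactory();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("No usable SSL/TLS implementation", e);
        }
    }
}
```

Because server certificates are never verified, connections made this way are open to man-in-the-middle attacks, which is the security issue mentioned at the start of this method.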

Using a proxy server

Using a proxy server with HttpClient is very easy: call the setProxy method, which is reached through client.getHostConfiguration(). The first parameter of the method is the proxy server address, and the second is the port number. HttpClient also supports SOCKS proxies.
