HttpClient Getting Started

Source: Internet
Author: User

HttpClient Introduction

HTTP is probably the most widely used and most important protocol on the Internet today, and more and more Java applications need to access network resources directly over HTTP. Although the JDK's java.net package provides basic HTTP access, it lacks the functionality and flexibility that most applications need. HttpClient is a sub-project of Apache Jakarta Commons that provides an efficient, up-to-date, feature-rich client-side toolkit for HTTP, and it supports the latest versions and recommendations of the protocol. HttpClient is used in many projects; for example, Cactus and HtmlUnit, two other well-known open source projects, both use it, and more applications that use HttpClient are listed at http://wiki.apache.org/jakarta-httpclient/HttpClientPowered. The HttpClient project is very active and has many users. The current HttpClient version is 4.4 (beta) (Editor's note, 2014-11-28).

HttpClient Features

The main features provided by HttpClient are listed below; for more detail, see the HttpClient homepage.

    • Implements all HTTP methods (GET, POST, PUT, HEAD, etc.)
    • Supports automatic redirection
    • Supports the HTTPS protocol
    • Supports proxy servers, etc.

The following sections describe how to use these features. First, HttpClient must be installed.

    • HttpClient can be downloaded from http://jakarta.apache.org/commons/httpclient/downloads.html
    • HttpClient uses logging, another sub-project of Apache Jakarta Commons. Download commons-logging from http://jakarta.apache.org/site/downloads/downloads_commons-logging.cgi, extract commons-logging.jar from the downloaded zip package, and add it to the CLASSPATH
    • HttpClient uses codec, another sub-project of Apache Jakarta Commons. Download the latest commons-codec from http://jakarta.apache.org/site/downloads/downloads_commons-codec.cgi, extract commons-codec-1.x.jar from the downloaded package, and add it to the CLASSPATH

Using HttpClient requires all three jar packages above: httpclient, codec, and logging. If any of them is missing, the program will fail with an error.
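
A quick way to see whether the three jars are visible to a program is to try to locate one class from each. The following is a minimal sketch using only JDK classes; ClasspathCheck is a hypothetical helper name, and the three class names are simply one representative class per library.

```java
/**
 * Sketch: report whether one representative class from each required jar
 * can be found on the classpath.
 */
final class ClasspathCheck {

    /** Returns true if the named class can be located (without initializing it). */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] representatives = {
            "org.apache.commons.httpclient.HttpClient", // commons-httpclient
            "org.apache.commons.logging.LogFactory",    // commons-logging
            "org.apache.commons.codec.binary.Base64"    // commons-codec
        };
        for (String name : representatives) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

If any line reports MISSING, the corresponding jar still has to be added to the CLASSPATH.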

Using the basic functions of HttpClient

GET method

Using HttpClient takes the following six steps:

1. Create an instance of HttpClient.

2. Create an instance of one of the connection methods, here GetMethod, passing the address to connect to in its constructor.

3. Call the executeMethod method of the instance created in step 1 to execute the method instance created in step 2.

4. Read the response.

5. Release the connection. The connection must be released whether or not the method executed successfully.

6. Process the content that was received.

Following these steps, we write code that uses the GET method to fetch the content of a web page.

  • In most cases, the default HttpClient constructor is sufficient.
    HttpClient httpClient = new HttpClient(); // Editor's note: the constructor differs between versions
  • Create an instance of the GET method, passing the address to connect to in its constructor. GetMethod follows redirects automatically; to turn this off, call setFollowRedirects(false).
    GetMethod getMethod = new GetMethod("http://www.ibm.com/");
  • Call the executeMethod method of the httpClient instance to execute getMethod. Because the program runs over the network, executeMethod can throw two exceptions that must be handled: HttpException and IOException. The first is usually caused by a wrong protocol when constructing the GetMethod (for example, mistyping "http" as "htp") or by abnormal content returned by the server, and it is not recoverable. The second is generally caused by network problems; for this exception (IOException), HttpClient automatically tries to re-execute the method according to the recovery policy you specify. The recovery policy can be customized by implementing the HttpMethodRetryHandler interface and installing it with setParameter. This article uses the default recovery policy provided by the system, which automatically retries 3 times when the second kind of exception occurs. executeMethod returns an integer, the status code returned by the server, which indicates whether the method succeeded, whether authentication is required, whether the page was redirected (by default a GetMethod instance handles redirects automatically), and so on.
    // Use the default recovery policy, which automatically retries 3 times when an exception occurs;
    // a custom recovery policy can also be set here
    getMethod.getParams().setParameter(HttpMethodParams.RETRY_HANDLER, new DefaultHttpMethodRetryHandler());
    // Execute getMethod
    int statusCode = httpClient.executeMethod(getMethod);
    if (statusCode != HttpStatus.SC_OK) {
        System.err.println("Method failed: " + getMethod.getStatusLine());
    }
  • Once the returned status code is correct, the content can be read. There are three ways to get the content of the target address. The first, getResponseBody, returns the content as a byte array. The second, getResponseBodyAsString, returns a String; note that the string is decoded using the charset declared in the response's Content-Type header, or a default when none is declared, so the result may be wrong if the server's declaration does not match the content. This is described in detail in the "Character encoding" section of this article. The third, getResponseBodyAsStream, is the best choice when the target address returns a large amount of data. Here we use the simplest method, getResponseBody.
    byte[] responseBody = getMethod.getResponseBody();
  • Release the connection. The connection must be released whether or not the method executed successfully.
    getMethod.releaseConnection();
  • Process the content. In this step the content is handled according to your needs; the example simply prints it to the console.
    System.out.println(new String(responseBody));
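
The last step decodes the body with new String(responseBody), which uses the platform default charset. The following sketch, using only JDK classes, shows the difference an explicit charset makes. The sample text and the class name DecodeDemo are illustrative assumptions and are not part of the original sample.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DecodeDemo {

    /** Decodes a response body with an explicitly chosen charset. */
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // Stand-in for the bytes returned by getResponseBody(); here they are UTF-8 encoded.
        byte[] body = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        System.out.println(decode(body, StandardCharsets.UTF_8));      // matching charset
        System.out.println(decode(body, StandardCharsets.ISO_8859_1)); // mismatched charset, garbled
    }
}
```

Passing the charset explicitly keeps the result independent of the machine the program happens to run on.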

The complete code of the program follows; it can also be found as test.GetSample in the attachment.

    package test;

    import java.io.IOException;
    import org.apache.commons.httpclient.*;
    import org.apache.commons.httpclient.methods.GetMethod;
    import org.apache.commons.httpclient.params.HttpMethodParams;

    public class GetSample {
        public static void main(String[] args) {
            // Construct an instance of HttpClient
            HttpClient httpClient = new HttpClient();
            // Create an instance of the GET method
            GetMethod getMethod = new GetMethod("http://www.ibm.com");
            // Use the default recovery policy provided by the system
            getMethod.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
                    new DefaultHttpMethodRetryHandler());
            try {
                // Execute getMethod
                int statusCode = httpClient.executeMethod(getMethod);
                if (statusCode != HttpStatus.SC_OK) {
                    System.err.println("Method failed: " + getMethod.getStatusLine());
                }
                // Read the content
                byte[] responseBody = getMethod.getResponseBody();
                // Process the content
                System.out.println(new String(responseBody));
            } catch (HttpException e) {
                // Fatal exception: either the protocol is wrong or the returned content is problematic
                System.out.println("Please check your provided HTTP address!");
                e.printStackTrace();
            } catch (IOException e) {
                // A network exception occurred
                e.printStackTrace();
            } finally {
                // Release the connection
                getMethod.releaseConnection();
            }
        }
    }
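
The recovery policy mentioned in this section retries a request when an IOException occurs. The idea can be sketched independently of HttpClient with JDK classes only. RetrySketch and withRetries are hypothetical names, and this is not how DefaultHttpMethodRetryHandler is implemented; it only illustrates retrying on a recoverable exception while letting other exceptions through.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

final class RetrySketch {

    /**
     * Runs the action, retrying up to maxRetries additional times when it throws IOException.
     * Any other exception is treated as unrecoverable and propagates immediately.
     */
    static <T> T withRetries(Callable<T> action, int maxRetries) throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must not be negative");
        }
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return action.call();
            } catch (IOException e) {
                last = e; // recoverable: try again if attempts remain
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated request that fails twice with a network error, then succeeds.
        String result = withRetries(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated network failure");
            }
            return "succeeded on attempt " + calls[0];
        }, 3);
        System.out.println(result);
    }
}
```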

POST method

According to RFC 2616, POST is defined as follows: the POST method is used to request that the origin server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line. POST is designed to allow a uniform method to cover the following functions:

    • Annotation of existing resources
    • Posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles
    • Providing a block of data, such as the result of submitting a form, to a data-handling process
    • Extending a database through an append operation

Calling PostMethod in HttpClient is much like calling GetMethod: the steps are the same, and only the setup of the PostMethod instance differs. The example below omits the steps that are identical to GetMethod and explains only the differences, using a login to the Tsinghua University BBS as the illustration.

  • The steps before constructing the PostMethod are the same. Like GetMethod, the PostMethod constructor takes a URI parameter; in this example the login address is http://www.newsmth.net/bbslogin2.php. After creating the PostMethod instance, fill in the form values on it. The BBS login form needs two fields: the user name (field name id) and the password (field name passwd). Form fields are represented by the class NameValuePair, whose constructor takes the field name as its first parameter and the field value as its second; all the form values are set on the method with PostMethod's setRequestBody. In addition, a successful BBS login redirects to another page, but HttpClient does not automatically follow redirects for requests that require follow-up handling, such as POST and PUT, so the redirect has to be handled by hand. See the discussion of automatic redirection later in this article for the details. The code is as follows:
    String url = "http://www.newsmth.net/bbslogin2.php";
    PostMethod postMethod = new PostMethod(url);
    // Fill in the value of each form field
    NameValuePair[] data = {
        new NameValuePair("id", "youusername"),
        new NameValuePair("passwd", "yourpwd") };
    // Put the form values into the PostMethod
    postMethod.setRequestBody(data);
    // Execute the PostMethod
    int statusCode = httpClient.executeMethod(postMethod);
    // HttpClient cannot automatically follow redirects for requests
    // that require follow-up handling, such as POST and PUT
    // 301 or 302
    if (statusCode == HttpStatus.SC_MOVED_PERMANENTLY
            || statusCode == HttpStatus.SC_MOVED_TEMPORARILY) {
        // Take the redirect address from the response header
        Header locationHeader = postMethod.getResponseHeader("location");
        String location = null;
        if (locationHeader != null) {
            location = locationHeader.getValue();
            System.out.println("The page was redirected to:" + location);
        } else {
            System.err.println("Location field value is null.");
        }
        return;
    }
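
For orientation, a form submitted this way is typically sent as an application/x-www-form-urlencoded body. The sketch below shows that encoding with JDK classes only. It is an illustration of the wire format under that assumption, not HttpClient's own code, and FormBody is a hypothetical helper name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class FormBody {

    /** Joins fields as name=value&name=value, URL-encoding each part as UTF-8. */
    static String encode(Map<String, String> fields) throws UnsupportedEncodingException {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
                .append('=')
                .append(URLEncoder.encode(field.getValue(), "UTF-8"));
        }
        return body.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        Map<String, String> fields = new LinkedHashMap<>(); // keeps insertion order
        fields.put("id", "youusername");
        fields.put("passwd", "yourpwd");
        System.out.println(encode(fields));
    }
}
```

Characters such as spaces, "&" and "=" inside a value are escaped, so they cannot be confused with the separators between fields.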

      

  

Common problems when using HttpClient

The following are some problems commonly encountered when using HttpClient.

Character encoding

The encoding of a target page may be declared in two places: in the HTTP headers returned by the server, and in the HTML/XML page itself.

    • The Content-Type field in the HTTP header may contain character encoding information. For example, the header may read Content-Type: text/html; charset=UTF-8, which indicates that the page is encoded in UTF-8. However, the header returned by the server may not match the content. For example, in some countries that use double-byte languages, the server may report UTF-8 while the actual content is not UTF-8 encoded, so the encoding has to be determined from somewhere else. If the server reports a specific encoding other than UTF-8, such as gb2312, the information is more likely to be correct. The encoding declared in the HTTP header can be obtained with the getResponseCharSet() method of the method object.
    • Files such as XML or HTML allow the author to specify the encoding directly in the page. For example, HTML may contain a tag such as <meta http-equiv="Content-Type" content="text/html; charset=gb2312"/>, and XML may contain a declaration such as <?xml version="1.0" encoding="gb2312"?>. In these cases the declaration may conflict with the encoding information in the HTTP header, and it is up to the user to decide which encoding is actually correct.
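
To compare the two declarations, the charset name can be pulled out of either the Content-Type header value or the HTML meta tag. The following is a minimal sketch using a regular expression and JDK classes only. CharsetSniffer is a hypothetical name, the pattern is deliberately simple, and it does not handle the XML encoding declaration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CharsetSniffer {

    // Matches charset=NAME, with or without quotes around NAME.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*[\"']?([^\\s;\"'>]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the declared charset name, or null when the text declares none. */
    static String declaredCharset(String text) {
        if (text == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String header = "text/html; charset=UTF-8";
        String meta = "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=gb2312\"/>";
        // The two declarations can disagree; the caller has to decide which one to trust.
        System.out.println("header: " + declaredCharset(header));
        System.out.println("meta:   " + declaredCharset(meta));
    }
}
```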

Automatic redirection

According to RFC 2616, there are two main kinds of automatic redirect: 301 and 302. 301 means Moved Permanently: the requested resource has been moved to a new fixed location, and any request to the old address will be redirected to the new one. 302 means a temporary redirect; for example, when a server-side servlet calls the sendRedirect method, the client receives a 302 status code, and the Location field of the response headers holds the target address passed to sendRedirect.

HttpClient supports automatic redirection, but not yet for requests that require follow-up handling, such as POST and PUT, so a 301 or 302 returned after a POST has to be handled by the program itself. As in the PostMethod example: to reach the page shown after logging in to the BBS, a new request must be issued to the address given in the Location header field. Note that Location sometimes holds a relative path, so the value needs some processing before a request can be sent to the new address.
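
One way to do that processing is to resolve the Location value against the URL that was originally requested. The following is a minimal sketch with java.net.URI. RedirectTarget is a hypothetical name, and the Location values in main are made-up examples.

```java
import java.net.URI;

final class RedirectTarget {

    /** Resolves a Location value, which may be relative, against the requested URL. */
    static String resolve(String requestedUrl, String location) {
        return URI.create(requestedUrl).resolve(location).toString();
    }

    public static void main(String[] args) {
        String requested = "http://www.newsmth.net/bbslogin2.php";
        System.out.println(resolve(requested, "/frames.html"));                // path from the site root
        System.out.println(resolve(requested, "bbs/index.html"));              // path relative to the request
        System.out.println(resolve(requested, "http://www.example.com/home")); // already absolute
    }
}
```

An absolute Location passes through unchanged, so the same call can be used for every redirect.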

Besides the response headers, the page itself can also cause a redirect. The tag that makes a page redirect automatically is <meta http-equiv="refresh" content="5; url=http://www.ibm.com/us">. To handle this case, the program has to parse the page itself and perform the redirect. Note that the url value in this tag can also be a relative address, in which case it must be resolved before the redirect can be followed.
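
Extracting the target from such a tag can be sketched with a regular expression. This is a simplified illustration using JDK classes only: MetaRefresh is a hypothetical name, and the pattern assumes that http-equiv appears before content, as in the tag quoted here. A real page may need an HTML parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MetaRefresh {

    // Simplified pattern: expects http-equiv before content inside the meta tag.
    private static final Pattern REFRESH = Pattern.compile(
            "<meta[^>]*http-equiv\\s*=\\s*[\"']?refresh[\"']?[^>]*"
            + "content\\s*=\\s*[\"']?\\s*\\d+\\s*;\\s*url\\s*=\\s*([^\"'>\\s]+)",
            Pattern.CASE_INSENSITIVE);

    /** Returns the refresh target found in the page, or null if there is none. */
    static String targetOf(String html) {
        Matcher m = REFRESH.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String page = "<html><head>"
                + "<meta http-equiv=\"refresh\" content=\"5; url=http://www.ibm.com/us\">"
                + "</head></html>";
        System.out.println(targetOf(page));
    }
}
```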

Handling the HTTPS protocol

HttpClient provides support for SSL, and JSSE must be installed before SSL can be used. JSSE has been integrated into the JDK since Sun's version 1.4; with a JDK earlier than 1.4, JSSE must be installed separately. Different vendors provide different JSSE implementations. The following describes how to open an HTTPS connection with HttpClient. There are two ways: the first is to obtain the certificate issued by the server and import it into the local keystore; the second is to extend HttpClient so that it accepts certificates automatically.

Method 1: obtain the certificate and import it into the local keystore (Editor's note: this method is rarely used in practice)

  • Install JSSE (skip this step if you are using JDK 1.4 or later). This article uses IBM's JSSE as an example. First download the JSSE installation package from the IBM website, then unpack it and copy ibmjsse.jar to the <java-home>\lib\ext\ directory.
  • Obtain and import the certificate. The certificate can be obtained through IE:

    1. Open the HTTPS URL you need to connect to in IE; a dialog box pops up.

    2. Click View Certificate, select Details in the dialog box that pops up, then click Copy to File and follow the wizard to generate the certificate file for the site you want to access.

    3. In the first step of the wizard, the welcome screen, click "Next".

    4. In the second step of the wizard, choose the export file format; keep the default and click "Next".

    5. In the third step of the wizard, enter the name of the export file and click "Next".

    6. In the fourth step of the wizard, click Finish to complete the wizard.

    7. Finally, a dialog box appears, showing that the export was successful.

  • Use the keytool tool to import the certificate you just exported into the local keystore. The keytool command is located under <java-home>\bin\. Open a command-line window, change to the <java-home>\lib\security\ directory, and run the following command:

    keytool -import -noprompt -keystore cacerts -storepass changeit -alias yourEntry1 -file your.cer

    Here the value after the -alias parameter is the unique identifier of the certificate within the keystore (it is not case-sensitive), and the -file parameter is followed by the path and file name of the certificate you just exported through IE. To delete the certificate you just imported into the keystore, use the command:

    keytool -delete -keystore cacerts -storepass changeit -alias yourEntry1
  • Write a program that accesses the HTTPS address. To test whether you can connect over HTTPS, change the GetSample example slightly so that the request targets an HTTPS address.
    GetMethod getMethod = new GetMethod("https://www.yourdomain.com");

    Problems that may occur when you run the program:

    1. The exception java.net.SocketException: Algorithm SSL not available is thrown. This may be because no JSSE provider has been added. If you are using the IBM JSSE provider, add this line to the program:

     if (Security.getProvider("com.ibm.jsse.IBMJSSEProvider") == null) Security.addProvider(new IBMJSSEProvider());

    Alternatively, open <java-home>\lib\security\java.security and, after the lines

    security.provider.1=sun.security.provider.Sun
    security.provider.2=com.ibm.crypto.provider.IBMJCE

    add the line

    security.provider.3=com.ibm.jsse.IBMJSSEProvider

    2. The exception java.net.SocketException: SSL implementation not available is thrown. This may be because you did not copy ibmjsse.jar to the <java-home>\lib\ext\ directory.

    3. The exception javax.net.ssl.SSLHandshakeException: unknown certificate is thrown. This indicates that JSSE is installed correctly but the certificate has probably not been imported into the keystore of the JRE that is currently running; import your certificate by following the steps described earlier.
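
When diagnosing the first two problems, it can help to see which security providers the running JRE actually has installed and whether any of them offers an SSL/TLS implementation. The following is a minimal sketch using only java.security classes. SslProviderCheck is a hypothetical name, and the protocol name "TLS" is an assumption that suits current JDKs.

```java
import java.security.Provider;
import java.security.Security;

final class SslProviderCheck {

    /** Returns true if at least one installed provider offers the given SSLContext protocol. */
    static boolean hasSslContext(String protocol) {
        Provider[] providers = Security.getProviders("SSLContext." + protocol);
        return providers != null && providers.length > 0;
    }

    public static void main(String[] args) {
        // List every installed security provider in preference order.
        for (Provider p : Security.getProviders()) {
            System.out.println(p.getName());
        }
        System.out.println("TLS available: " + hasSslContext("TLS"));
    }
}
```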

Method 2: extend HttpClient so that it accepts certificates automatically

Because this method accepts every certificate automatically, it carries security risks, so consider your system's security requirements carefully before using it. The steps are as follows:

    • Provide a custom socket factory (test.MySecureProtocolSocketFactory). This class must implement the interface org.apache.commons.httpclient.protocol.SecureProtocolSocketFactory, and the implementation calls a custom X509TrustManager (test.MyX509TrustManager)
    • Create an instance of org.apache.commons.httpclient.protocol.Protocol, specifying the protocol name and the default port number
      Protocol myhttps = new Protocol("https", new MySecureProtocolSocketFactory(), 443);
    • Register the HTTPS protocol object you just created
      Protocol.registerProtocol("https", myhttps);
    • Then open the HTTPS target address in the usual way; see test.NoCertificationHttpsGetSample

The following example uses HttpClient 4.1 to open an HTTPS connection with a trust store loaded from a file:

    package com.ipmotor.sm.db;

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.security.KeyStore;

    import org.apache.http.HttpResponse;
    import org.apache.http.client.HttpClient;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.conn.scheme.Scheme;
    import org.apache.http.conn.ssl.SSLSocketFactory;
    import org.apache.http.impl.client.DefaultHttpClient;

    /**
     * Uses HttpClient to simulate an HTTPS connection.
     * Uses version 4.1.
     * @since 2011.7.7
     */
    public class Test {
        /**
         * Runs the main method.
         * @param args
         * @throws Exception
         */
        public static void main(String[] args) throws Exception {
            // Get the HttpClient object
            HttpClient httpClient = new DefaultHttpClient();
            // Load the keystore
            KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType());
            FileInputStream instream = new FileInputStream(new File("D:/zzaa"));
            // Keystore password
            trustStore.load(instream, "123456".toCharArray());
            // Register the keystore
            SSLSocketFactory socketFactory = new SSLSocketFactory(trustStore);
            // Do not verify the host name
            socketFactory.setHostnameVerifier(SSLSocketFactory.ALLOW_ALL_HOSTNAME_VERIFIER);
            // The Scheme constructor needs a default port; 800 matches the target URL below
            Scheme sch = new Scheme("https", 800, socketFactory);
            httpClient.getConnectionManager().getSchemeRegistry().register(sch);
            // Get the HttpGet object
            HttpGet httpGet = null;
            httpGet = new HttpGet("https://10.15.32.176:800/cgi-bin/service.cgi?session=caef0c3742c8f8ef4c98772e860c9fd2&rand=128&domain=sun.com&type=domain&cmd=disable");
            // Send the request
            HttpResponse response = httpClient.execute(httpGet);
            // Print the response
            InputStream is = response.getEntity().getContent();
            BufferedReader br = new BufferedReader(new InputStreamReader(is));
            String line = "";
            while ((line = br.readLine()) != null) {
                System.out.println(line);
            }
        }
    }

Required jar packages:

    commons-codec-1.4.jar
    commons-logging-1.1.1.jar
    httpclient-4.1.1.jar
    httpclient-cache-4.1.1.jar
    httpcore-4.1.jar
    httpmime-4.1.1.jar
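
Method 2 relies on a custom X509TrustManager (test.MyX509TrustManager), whose source is not shown in this article. The following is a minimal sketch of what an accept-everything trust manager could look like, using only JDK classes. TrustAllSketch and AcceptAll are hypothetical names, not the classes from the attachment. Accepting every certificate disables server authentication, so this is suitable for test environments only.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

final class TrustAllSketch {

    /** Trust manager that accepts any certificate chain without checking it. */
    static final class AcceptAll implements X509TrustManager {
        @Override
        public void checkClientTrusted(X509Certificate[] chain, String authType) {
            // no check: every client certificate is accepted
        }

        @Override
        public void checkServerTrusted(X509Certificate[] chain, String authType) {
            // no check: every server certificate is accepted
        }

        @Override
        public X509Certificate[] getAcceptedIssuers() {
            return new X509Certificate[0];
        }
    }

    /** Builds an SSLContext whose sockets use the accept-all trust manager. */
    static SSLContext insecureContext() throws GeneralSecurityException {
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, new TrustManager[] { new AcceptAll() }, new SecureRandom());
        return context;
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SSLContext context = insecureContext();
        System.out.println("Protocol: " + context.getProtocol());
    }
}
```

A custom socket factory would then create its sockets from context.getSocketFactory().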

  

Using a proxy server

Using a proxy server with HttpClient is very simple: call the setProxy method on the HttpClient's host configuration, passing the proxy server address as the first parameter and the port number as the second. HttpClient also supports SOCKS proxies.

    httpClient.getHostConfiguration().setProxy(hostName, port);

Conclusion

As the introduction above shows, HttpClient supports the HTTP protocol very well: it is easy to use, updated frequently, powerful, and flexible and extensible enough for most needs. HttpClient is a great tool for programmers who want to access HTTP resources directly from Java applications.
