HttpClient can manage cookies automatically: it accepts cookies set by the server and returns them to the server when appropriate. It also lets you set cookies manually and send them to the server. Unfortunately, several conflicting specifications govern cookie handling (the Netscape cookie draft, RFC 2109, and RFC 2965), and many software vendors ship cookie implementations that follow none of them. To cope with this, HttpClient provides policy-driven cookie management. The cookie specifications supported by HttpClient are:
Netscape cookie draft: the earliest cookie specification and the basis for RFC 2109. It nevertheless differs considerably from RFC 2109, so it may be needed for compatibility with some servers.
RFC 2109: the first official cookie specification, published by the IETF. In theory, every server that handles version 1 cookies follows it, which is why HttpClient uses it as the default. Unfortunately, the specification is so strict that many servers implement it incorrectly or still behave according to the Netscape draft. In those cases, use the compatibility specification.
Compatibility: designed to work with as many servers as possible, including those that do not follow the standards. Consider switching to it when you run into cookie parsing problems.
RFC 2965: not yet supported by HttpClient (support is planned). It defines version 2 cookies, addresses the shortcomings of version 1 cookies, and is intended to eventually replace RFC 2109.
In HttpClient, there are two ways to specify which cookie specification to use.
    HttpClient client = new HttpClient();
    client.getState().setCookiePolicy(CookiePolicy.COMPATIBILITY);
A specification set this way applies only to the current HttpState. The argument can be CookiePolicy.COMPATIBILITY, CookiePolicy.NETSCAPE_DRAFT, or CookiePolicy.RFC2109.
    System.setProperty("apache.commons.httpclient.cookiespec", "COMPATIBILITY");
A specification set this way applies to every newly created HttpState object. The value can be "COMPATIBILITY", "NETSCAPE_DRAFT", or "RFC2109".
When cookies cannot be parsed, switching to the compatibility specification often solves the problem.
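To make the difference between the specifications concrete, the sketch below parses one Netscape-style header and one RFC 2109/2965-style header. It uses the JDK's java.net.HttpCookie as a stand-in, not HttpClient's own parser, and the header values and the class name CookieVersionDemo are made up for illustration.

```java
import java.net.HttpCookie;
import java.util.List;

public class CookieVersionDemo {

    /** Parses one Set-Cookie / Set-Cookie2 header line and returns the first cookie. */
    public static HttpCookie first(String header) {
        List<HttpCookie> cookies = HttpCookie.parse(header);
        return cookies.get(0);
    }

    /** Cookie version the JDK parser assigns: 0 = Netscape style, 1 = RFC 2109/2965 style. */
    public static int versionOf(String header) {
        return first(header).getVersion();
    }

    /** Relative lifetime in seconds, as carried by the Max-Age attribute. */
    public static long maxAgeOf(String header) {
        return first(header).getMaxAge();
    }

    public static void main(String[] args) {
        // Netscape style: lifetime is an absolute "Expires" date.
        String netscape =
            "Set-Cookie: session=abc123; Path=/; Expires=Wed, 09 Jun 2021 10:18:14 GMT";
        // RFC 2109/2965 style: explicit Version attribute and a relative Max-Age.
        String rfc = "Set-Cookie2: session=abc123; Version=1; Max-Age=3600; Path=/";

        System.out.println("netscape version = " + versionOf(netscape));
        System.out.println("rfc version      = " + versionOf(rfc));
        System.out.println("rfc max-age      = " + maxAgeOf(rfc));
    }
}
```

A server that mixes the two styles in one header is the kind of case where a strict parser gives up and a lenient (compatibility) policy helps.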
9. What to do when you run into problems with HttpClient
Access the server with a browser to confirm that the server is responding properly.
If you are going through a proxy, try bypassing it.
Try another server (preferably one running different server software).
Check that your code follows the tutorial.
Set the log level to debug to find out where the problem occurs.
Enable the wire trace to follow the client-server communication and pinpoint where the problem occurs.
Use telnet or netcat to send requests to the server manually; this is useful for testing a hypothesis once you suspect a cause.
Run netcat in listen mode as a stand-in server to check how HttpClient handles a given response.
Try the latest HttpClient release; the bug may already be fixed there.
Ask for help on the mailing list.
Report the bug in Bugzilla.
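For the debug-log and wire-trace steps above, HttpClient logs through Commons Logging. One way to switch the logs on, assuming the SimpleLog backend bundled with Commons Logging is in use, is to set the system properties below before the first HttpClient class is loaded. The class name WireLogSetup is hypothetical.

```java
public class WireLogSetup {

    /** Sets the Commons Logging system properties; call before any HttpClient class loads. */
    public static void enableWireLog() {
        // Route Commons Logging to its built-in SimpleLog implementation.
        System.setProperty("org.apache.commons.logging.Log",
                "org.apache.commons.logging.impl.SimpleLog");
        System.setProperty("org.apache.commons.logging.simplelog.showdatetime", "true");
        // "httpclient.wire" logs the data exchanged with the server.
        System.setProperty(
                "org.apache.commons.logging.simplelog.log.httpclient.wire", "debug");
        // Context log for HttpClient's own classes.
        System.setProperty(
                "org.apache.commons.logging.simplelog.log.org.apache.commons.httpclient",
                "debug");
    }

    public static void main(String[] args) {
        enableWireLog();
        System.out.println(System.getProperty(
                "org.apache.commons.logging.simplelog.log.httpclient.wire"));
    }
}
```

The wire log can be very large for big responses, so it is usually enabled only while reproducing a problem.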
10. SSL
With the Java Secure Socket Extension (JSSE), HttpClient fully supports HTTP over Secure Sockets Layer (SSL) or IETF Transport Layer Security (TLS). JSSE is bundled with JRE 1.4 and later; earlier versions require manual installation and configuration, see the Sun website or these study notes.
Using SSL in HttpClient is straightforward, as the following two examples show:
    HttpClient httpclient = new HttpClient();
    GetMethod httpget = new GetMethod("https://www.verisign.com/");
    httpclient.executeMethod(httpget);
    System.out.println(httpget.getStatusLine().toString());
If you go through a proxy that requires authentication, do the following:
    HttpClient httpclient = new HttpClient();
    httpclient.getHostConfiguration().setProxy("myproxyhost", 8080);
    httpclient.getState().setProxyCredentials("my-proxy-realm", "myproxyhost",
        new UsernamePasswordCredentials("my-proxy-username", "my-proxy-password"));
    GetMethod httpget = new GetMethod("https://www.verisign.com/");
    httpclient.executeMethod(httpget);
    System.out.println(httpget.getStatusLine().toString());
The steps for customizing SSL in HttpClient are as follows:
Provide a socket factory that implements the org.apache.commons.httpclient.protocol.SecureProtocolSocketFactory interface. The socket factory is responsible for opening a socket to the server, using either the standard or a third-party SSL library, and for performing any required initialization such as the connection handshake. Typically, this initialization happens automatically when the socket is created.
Instantiate an org.apache.commons.httpclient.protocol.Protocol object. Creating it requires a valid protocol scheme (such as https), the custom socket factory, and a default port number (such as 443 for HTTPS).
    Protocol myhttps = new Protocol("https", new MySSLSocketFactory(), 443);
This instance can then be set as the protocol handler.
    HttpClient httpclient = new HttpClient();
    httpclient.getHostConfiguration().setHost("www.whatever.com", 443, myhttps);
    GetMethod httpget = new GetMethod("/");
    httpclient.executeMethod(httpget);
Calling Protocol.registerProtocol registers the custom instance as the default handler for a given protocol identifier. This makes it easy to define your own protocol scheme (such as myhttps).
    Protocol.registerProtocol("myhttps",
        new Protocol("https", new MySSLSocketFactory(), 9443));
    ...
    HttpClient httpclient = new HttpClient();
    GetMethod httpget = new GetMethod("myhttps://www.whatever.com/");
    httpclient.executeMethod(httpget);
If you want to replace the default HTTPS handler with your own, simply register it under "https".
    Protocol.registerProtocol("https",
        new Protocol("https", new MySSLSocketFactory(), 443));
    HttpClient httpclient = new HttpClient();
    GetMethod httpget = new GetMethod("https://www.whatever.com/");
    httpclient.executeMethod(httpget);
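A custom SecureProtocolSocketFactory usually delegates socket creation to a JSSE SSLSocketFactory. The sketch below shows only that JSSE part: it builds a factory from an SSLContext that trusts the JVM's default CA certificates. The HttpClient interface methods that would wrap it are not shown, and the class name JsseFactorySketch is hypothetical. In a real customization the trust managers would typically come from your own key store.

```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;
import javax.net.ssl.TrustManagerFactory;

public class JsseFactorySketch {

    /** Builds an SSLSocketFactory backed by the JVM's default trust store. */
    public static SSLSocketFactory newFactory() {
        try {
            TrustManagerFactory tmf = TrustManagerFactory.getInstance(
                    TrustManagerFactory.getDefaultAlgorithm());
            // Passing null selects the default trust store of the JVM.
            tmf.init((KeyStore) null);

            SSLContext context = SSLContext.getInstance("TLS");
            // No client certificates (null key managers), default SecureRandom.
            context.init(null, tmf.getTrustManagers(), null);
            return context.getSocketFactory();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("JSSE is not set up correctly", e);
        }
    }

    public static void main(String[] args) {
        SSLSocketFactory factory = newFactory();
        System.out.println("supported cipher suites: "
                + factory.getSupportedCipherSuites().length);
    }
}
```

A class such as the MySSLSocketFactory used above could hold one such factory and call its createSocket methods from each method the interface requires.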
Known limitations and issues
Persistent SSL connections do not work on Sun JVMs below 1.4 because of a bug in the JVM.
Non-preemptive authentication fails when accessing the server through a proxy. This is caused by a design flaw in HttpClient that will be fixed in a future release.
Troubleshooting
Many problems, especially on JVMs below 1.4, are caused by an incorrect JSSE installation.
The following code can be used as a definitive test, because it opens an SSL socket directly without going through HttpClient.
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.net.Socket;
    import javax.net.ssl.SSLSocketFactory;

    // The class name is arbitrary.
    public class SslTest {
        public static final String TARGET_HTTPS_SERVER = "www.verisign.com";
        public static final int TARGET_HTTPS_PORT = 443;

        public static void main(String[] args) throws Exception {
            Socket socket = SSLSocketFactory.getDefault()
                .createSocket(TARGET_HTTPS_SERVER, TARGET_HTTPS_PORT);
            try {
                Writer out = new OutputStreamWriter(
                    socket.getOutputStream(), "ISO-8859-1");
                // Send a minimal HTTP request by hand.
                out.write("GET / HTTP/1.1\r\n");
                out.write("Host: " + TARGET_HTTPS_SERVER + ":" +
                    TARGET_HTTPS_PORT + "\r\n");
                out.write("Agent: SSL-TEST\r\n");
                out.write("\r\n");
                out.flush();
                BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), "ISO-8859-1"));
                String line = null;
                // Print the raw response until the server closes the connection.
                while ((line = in.readLine()) != null) {
                    System.out.println(line);
                }
            } finally {
                socket.close();
            }
        }
    }
11. Multithreading with HttpClient
The main reason to use multiple threads is to download in parallel. While HttpClient runs, each HTTP method uses an HttpConnection instance. Connections are a limited resource, and each connection can be used by only one thread and one method at a time, so connections must be allocated properly when they are needed. HttpClient manages connections with an approach similar to a JDBC connection pool; this management is done by MultiThreadedHttpConnectionManager.
    MultiThreadedHttpConnectionManager connectionManager =
        new MultiThreadedHttpConnectionManager();
    HttpClient client = new HttpClient(connectionManager);
The client can now be used to execute multiple methods from multiple threads. Each call to HttpClient.executeMethod() asks the connection manager for a connection instance. A connection obtained this way is checked out and must be returned to the manager after use. The manager supports two settings:
maxConnectionsPerHost: the maximum number of parallel connections per host, default 2
maxTotalConnections: the maximum total number of parallel connections for the client, default 20
When reusing connections, the manager takes a least-recently-used approach: the connection that was returned earliest is reused first.
Because the application using HttpClient, not HttpClient itself, reads the response body, HttpClient cannot tell when a connection is no longer in use. You must therefore call releaseConnection() explicitly once you have finished reading the response body.
    MultiThreadedHttpConnectionManager connectionManager = new MultiThreadedHttpConnectionManager();
    HttpClient client = new HttpClient(connectionManager);
    ...
    // In a thread.
    GetMethod get = new GetMethod("http://jakarta.apache.org/");
    try {
        client.executeMethod(get);
        // Print the response to stdout.
        System.out.println(get.getResponseBodyAsString());
    } finally {
        // Be sure the connection is released back to the connection
        // manager.
        get.releaseConnection();
    }
Every HttpClient.executeMethod() call must be matched by a method.releaseConnection() call.
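The check-out-and-return discipline can be illustrated with a counting semaphore. The sketch below is not HttpClient's implementation; it only models the per-host limit (2 by default, as stated above) to show why a connection that is never released eventually starves other threads. The class name HostConnectionLimiter is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

public class HostConnectionLimiter {
    private final int maxPerHost;
    private final ConcurrentHashMap<String, Semaphore> permits = new ConcurrentHashMap<>();

    public HostConnectionLimiter(int maxPerHost) {
        this.maxPerHost = maxPerHost;
    }

    private Semaphore forHost(String host) {
        // One semaphore per host, created on first use.
        return permits.computeIfAbsent(host, h -> new Semaphore(maxPerHost));
    }

    /** Tries to check out a connection slot; returns false if the host is at its limit. */
    public boolean tryCheckOut(String host) {
        return forHost(host).tryAcquire();
    }

    /** Returns a slot to the pool; plays the role of releaseConnection(). */
    public void release(String host) {
        forHost(host).release();
    }

    public static void main(String[] args) {
        HostConnectionLimiter limiter = new HostConnectionLimiter(2);
        System.out.println(limiter.tryCheckOut("example.org")); // first slot
        System.out.println(limiter.tryCheckOut("example.org")); // second slot
        System.out.println(limiter.tryCheckOut("example.org")); // refused: limit reached
        limiter.release("example.org");
        System.out.println(limiter.tryCheckOut("example.org")); // succeeds after release
    }
}
```

With a blocking acquire instead of tryAcquire, the third caller would wait until some other thread released its slot, which is what a missing releaseConnection() call turns into a hang.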
12. HTTP methods
HttpClient supports 8 HTTP methods, described below.
1. OPTIONS
The HTTP OPTIONS method requests information about the communication options available on the request/response chain for the resource identified by the request URL. It lets the client determine the options and/or requirements associated with a resource, or the capabilities of the server, before taking any concrete action on the resource. The most typical use of this method is to find out which HTTP methods the server supports.
HttpClient provides the OptionsMethod class for this HTTP method; its getAllowedMethods method makes the typical use described above simple to implement.
    OptionsMethod options = new OptionsMethod("http://jakarta.apache.org");
    // Execute the method and handle any exceptions.
    ...
    Enumeration allowedMethods = options.getAllowedMethods();
    options.releaseConnection();
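On the wire, a server answers an OPTIONS request with an Allow header that lists the supported methods. The sketch below parses such a header value with plain JDK code to show what the information looks like; the header value and the class name AllowHeader are illustrative, and this is not how OptionsMethod is implemented.

```java
import java.util.ArrayList;
import java.util.List;

public class AllowHeader {

    /** Splits an Allow header value such as "GET, HEAD, POST" into method names. */
    public static List<String> parse(String headerValue) {
        List<String> methods = new ArrayList<>();
        for (String token : headerValue.split(",")) {
            String name = token.trim();
            if (!name.isEmpty()) {
                methods.add(name);
            }
        }
        return methods;
    }

    /** HTTP method names are case-sensitive, so this is an exact comparison. */
    public static boolean allows(String headerValue, String method) {
        return parse(headerValue).contains(method);
    }

    public static void main(String[] args) {
        String allow = "GET, HEAD, POST, OPTIONS";
        System.out.println(parse(allow));
        System.out.println("POST allowed:   " + allows(allow, "POST"));
        System.out.println("DELETE allowed: " + allows(allow, "DELETE"));
    }
}
```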
2. GET
The HTTP GET method retrieves whatever information (in the form of an entity) is identified by the Request-URI; the word "get" simply means "fetch". If the Request-URI refers to a data-producing process, the data produced by that process is returned as the entity in the response, not the source code of the process.
If the request contains an If-Modified-Since, If-Unmodified-Since, If-Match, If-None-Match, or If-Range header field, the GET becomes a conditional GET: the entity is transferred only if the conditions described by those fields are met. This reduces unnecessary network traffic and avoids issuing several requests for one resource (for example, one to check and a second to download). Browsers work on the same principle: they typically keep a cache directory for page data and, when a page is visited again, download only the content that has changed, which speeds up browsing. For checks alone, HEAD is usually a better choice than GET. If the request contains a Range header field, the GET becomes a partial GET: only the part of the entity within the specified range is returned. (Anyone who has used a multithreaded download tool will find this familiar.)
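The partial GET is what a multithreaded downloader relies on: each thread requests one byte range. A minimal sketch of computing the Range header values for a resource of known length follows. RangeSplitter and rangeHeaders are hypothetical names, and the code only builds header strings; it does not send any request.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeSplitter {

    /** Splits [0, totalBytes) into at most `parts` contiguous ranges, as Range header values. */
    public static List<String> rangeHeaders(long totalBytes, int parts) {
        if (totalBytes <= 0 || parts <= 0) {
            throw new IllegalArgumentException("totalBytes and parts must be positive");
        }
        List<String> headers = new ArrayList<>();
        long chunk = (totalBytes + parts - 1) / parts; // ceiling division
        for (long start = 0; start < totalBytes; start += chunk) {
            // Both ends of an HTTP byte range are inclusive.
            long end = Math.min(start + chunk, totalBytes) - 1;
            headers.add("bytes=" + start + "-" + end);
        }
        return headers;
    }

    public static void main(String[] args) {
        // A 1000-byte resource split across 3 download threads.
        for (String header : rangeHeaders(1000, 3)) {
            System.out.println("Range: " + header);
        }
    }
}
```

Each value would be sent as the Range header of a separate GET, and the pieces written back at their start offsets.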
A typical use of this method is downloading documents from a web server. HttpClient provides the GetMethod class for it; the getResponseBody, getResponseBodyAsStream, and getResponseBodyAsString methods of GetMethod give you the document (such as an HTML page) carried in the response body. Of the three, getResponseBodyAsStream is usually the best choice, mainly because it avoids buffering all the downloaded data before the document is processed.
    GetMethod get = new GetMethod("http://jakarta.apache.org");
    // Execute the method and handle a failed request.
    ...
    InputStream in = get.getResponseBodyAsStream();
    // Process the data from the input stream.
    get.releaseConnection();
The most common mistake with GetMethod is failing to read the entire response body. Also remember that the connection must be released explicitly.
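Reading the stream to the end in fixed-size chunks addresses both points: the whole body is consumed, and memory use stays bounded. A minimal sketch with plain JDK streams follows. StreamDrain is a hypothetical helper, and a ByteArrayInputStream stands in for the stream that getResponseBodyAsStream would return.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class StreamDrain {

    /** Copies the stream to `out` in 8 KB chunks until end of stream; returns bytes copied. */
    public static long copy(InputStream in, OutputStream out) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for a response body of 20000 bytes.
        InputStream body = new ByteArrayInputStream(new byte[20000]);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println(copy(body, sink) + " bytes copied");
    }
}
```

In real use the sink would be a file or a parser rather than an in-memory buffer, and the connection would be released in a finally block after the copy.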
3. HEAD
The HTTP HEAD method is identical to GET except that the server must not return a message body in the response. With this method, a client can obtain basic information about a resource without downloading it. It is commonly used to test whether a hyperlink is reachable and whether a resource has been modified recently.
The most typical use of the HTTP HEAD method is to obtain basic information about a resource. HttpClient provides the HeadMethod class for this method. Like the other *Method classes, HeadMethod uses getResponseHeaders() to retrieve the header information and has no special methods of its own.
    HeadMethod head = new HeadMethod("http://jakarta.apache.org");
    // Execute the method and handle a failed request.
    ...
    // Retrieve all the response headers.
    Header[] headers = head.getResponseHeaders();
    // Retrieve only the Last-Modified header.
    String lastModified = head.getResponseHeader("Last-Modified").getValue();
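To decide whether a resource has changed, the Last-Modified value has to be parsed and compared with the time of the previous check. The sketch below does this with the JDK's java.time API, which is an assumption about the runtime (Java 8 or later); LastModifiedCheck and its method names are hypothetical, and the date is an example value.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class LastModifiedCheck {

    /** Parses an HTTP date such as "Wed, 21 Oct 2015 07:28:00 GMT". */
    public static Instant parseHttpDate(String value) {
        return ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME).toInstant();
    }

    /** True if the resource changed after the moment we last looked at it. */
    public static boolean modifiedSince(String lastModifiedHeader, Instant lastChecked) {
        return parseHttpDate(lastModifiedHeader).isAfter(lastChecked);
    }

    public static void main(String[] args) {
        String lastModified = "Wed, 21 Oct 2015 07:28:00 GMT";
        Instant lastChecked = Instant.parse("2015-01-01T00:00:00Z");
        System.out.println("modified since last check: "
                + modifiedSince(lastModified, lastChecked));
    }
}
```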
4. POST
The HTTP POST method ("post" in the sense of depositing something) asks the server to accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI. In essence, the server stores the entity, and the request is usually handled by a server-side program. POST is designed to cover the following functions in a uniform way:
Annotating existing resources
Posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles
Submitting a block of data to a data-handling process
Extending a database through an append operation
These operations are expected to produce some "side effect" on the server, such as modifying a database.
HttpClient provides the PostMethod class for this HTTP method. Using POST in HttpClient involves two basic steps: preparing the data for the request, then reading the server's response. Call setRequestBody() to supply the request data; it accepts three kinds of argument: an input stream, an array of name/value pairs, or a string. To read the response, call one of the getResponseBody* methods, exactly as with GET.
Common problems are failing to read the entire response (whether or not the program needs it) and failing to release the connection.
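When the request data is given as name/value pairs, it travels as an application/x-www-form-urlencoded body. The sketch below shows that encoding with the JDK's URLEncoder, assuming UTF-8. FormBody and the field names are made up for illustration; HttpClient performs its own encoding internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class FormBody {

    /** Encodes the fields as name=value pairs joined by '&', in insertion order. */
    public static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(urlEncode(field.getKey()))
                .append('=')
                .append(urlEncode(field.getValue()));
        }
        return body.toString();
    }

    private static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("user", "John Doe");
        fields.put("query", "a&b=c");
        System.out.println(encode(fields)); // user=John+Doe&query=a%26b%3Dc
    }
}
```

Reserved characters inside a value are escaped so that they cannot be mistaken for the '&' and '=' separators.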