How sessions work and their security issues (explained with PHP examples)

Source: Internet
Author: User
First, we will briefly review some HTTP basics to understand the stateless nature of the protocol. Then we will look at some basic cookie operations. Finally, I will explain step by step how to use simple and effective methods to improve the security and stability of your PHP application.

I think most PHP beginners assume that PHP's default session mechanism is secure. In fact, the opposite is true: the PHP team only provides a convenient session solution for programmers. Security has to be strengthened by the programmers themselves; it is the responsibility of the application development team. Because there are many ways to do this, there is no single best approach, only better ones. Attack methods keep changing, so defenders must keep adapting their tactics as well. For that reason, I personally think the PHP team's approach is wise.

I. Stateless HTTP

HTTP is a stateless protocol: it does not require the browser to identify itself in each request, and there is no persistent connection between the browser and the server across multiple page visits. When a user visits a site, the browser sends an HTTP request to the server and the server returns an HTTP response. Put simply, the client sends a request and the server sends a response; this is the whole of HTTP-based communication.
Because web applications communicate over HTTP, and HTTP is stateless, maintaining the state of a web application is harder, and this is no small challenge for developers. Cookies were born as an extension of HTTP, mainly to make up for its statelessness by providing a way to maintain state between the client and the server. However, for security reasons some users disable cookies in their browsers. In that case, state information can only be passed to the server through parameters in the URL, which is much less secure. In principle, the client should identify itself so that state can be maintained with the server, but for security's sake we must all keep in mind that information from the client cannot be fully trusted.
Despite this, there are still elegant solutions to the problem of maintaining state in web applications. That said, there is no perfect solution, and no single solution suits every situation. This article introduces techniques that can be used to maintain application state and to defend against session attacks such as session hijacking. You will also learn how cookies work, what PHP sessions do, and how sessions can be hijacked.

II. HTTP overview

How can we maintain the state of a web application and choose the most appropriate solution? Before answering this question, you must first understand the protocol underlying the web: the Hypertext Transfer Protocol (HTTP).

When a user visits http://example.org, the browser establishes a TCP/IP connection with the server and then sends an HTTP request to port 80 of the example.org server. The request syntax is as follows:

The code is as follows:


GET / HTTP/1.1
Host: example.org

The first line is called the request line, and its second element (a forward slash in this example) indicates the path of the requested resource. The forward slash represents the document root, which the server maps to a specific directory in its file system.
Apache users usually set this root path with the DocumentRoot directive. If the requested URL is http://example.org/path/to/script.php, the requested path is /path/to/script.php. If the document root is defined as /usr/local/apache/htdocs, the full resource path for the request is /usr/local/apache/htdocs/path/to/script.php.
The second line illustrates the syntax of an HTTP header. In this example the header is Host, which identifies the domain name from which the browser wants to obtain the resource. An HTTP request can contain many other headers, such as User-Agent; in PHP, you can use $_SERVER['HTTP_USER_AGENT'] to read the value of that header from the request.
Unfortunately, nothing in this request example uniquely identifies the client that sent it. Some developers use the client's IP address to identify the client, but this approach has many problems. For example, if user A connects to www.example.com through proxy B, the server sees the IP address of proxy B rather than that of user A. If the user disconnects and then reconnects through a proxy, the IP address may change, so one user corresponds to multiple IP addresses; a server that identifies users by IP address would treat these requests as coming from different users when they actually come from the same one. Conversely, many users on the same LAN may reach the Internet through a single router and then visit www.example.com. Because they share one public IP address, the server would treat their requests as coming from a single user.
The first step in maintaining application state is knowing how to uniquely identify each client. Because only the information contained in the HTTP request can be used for this, the request must contain something that identifies the client uniquely. Cookies were designed to solve this problem.

III. Cookies

Cookies are much easier to understand if you think of them as an extension of the HTTP protocol, which is essentially what they are. Two HTTP headers are responsible for setting and sending cookies: Set-Cookie and Cookie. When the server returns an HTTP response that contains a Set-Cookie header, it is instructing the client to create a cookie, and the client then sends that cookie back automatically in subsequent HTTP requests until the cookie expires. If the cookie is meant to last only for the browser session, the browser keeps it in memory and clears it when the browser closes. Otherwise, the cookie is stored on the client's hard disk and is not cleared when the browser closes; the next time the user opens the browser and visits the website, the cookie is automatically sent to the server again. Setting and sending a cookie involves the following four steps:

1. The client sends an HTTP request to the server.
2. The server sends an HTTP response to the client, which contains the Set-Cookie header.
3. The client sends an HTTP request to the server, including the Cookie header.
4. The server sends an HTTP response to the client.
[Figure not available: diagram of the four-step cookie exchange described above.]


The Cookie header in the client's second request gives the server the information it needs to uniquely identify the client. At this point the server can also determine whether the client has cookies enabled. A user could disable cookies in the middle of interacting with the application, but this is so unlikely that you can ignore it, and practice bears this out.
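
To make steps 2 and 3 more concrete, here is a minimal sketch of what the server sees when a cookie comes back. In a real PHP script you would simply read $_COOKIE; the helper name parseCookieHeader and the theme cookie are hypothetical and only illustrate how a raw Cookie header breaks down into name/value pairs.

```php
<?php
/**
 * Split a raw Cookie request header into name => value pairs.
 * Simplified sketch: quoted values and other edge cases are not handled.
 */
function parseCookieHeader(string $header): array
{
    $cookies = [];
    foreach (explode(';', $header) as $pair) {
        $pair = trim($pair);
        if ($pair === '' || strpos($pair, '=') === false) {
            continue; // skip empty or malformed pieces
        }
        [$name, $value] = explode('=', $pair, 2);
        $cookies[trim($name)] = urldecode(trim($value));
    }
    return $cookies;
}

// Step 2: the server's response carries, for example:
//   Set-Cookie: PHPSESSID=12345
// Step 3: the browser's next request carries (theme is a made-up second cookie):
//   Cookie: PHPSESSID=12345; theme=dark
$cookies = parseCookieHeader('PHPSESSID=12345; theme=dark');

// If the cookie came back, the client evidently accepts cookies.
$cookiesEnabled = isset($cookies['PHPSESSID']);
```

PHP performs this parsing for you, so the sketch is only meant to show that the identifier travels as plain text in a header the client controls.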

IV. GET and POST data

In addition to cookies, the client can include data for the server in the request URL, for example in the query parameters or in the request path. Let's look at an example:

The code is as follows:


GET /index.php?foo=bar HTTP/1.1
Host: example.org

The above is an ordinary HTTP GET request sent to the index.php script on the web server for the example.org domain. In index.php, you can use $_GET['foo'] to obtain the value of the foo parameter in the URL, that is, 'bar'. Most PHP developers call this kind of data GET data; a few call it query data or URL variables. Note, however, that GET data is not limited to HTTP GET requests: an HTTP POST request can also carry GET data, as long as the data is included in the request URL. In other words, the transmission of GET data does not depend on the request type.

Another way for the client to transmit data to the server is to include the data in the content area of the http request. For this method, the request type is POST. See the following example:

The code is as follows:


POST /index.php HTTP/1.1
Host: example.org
Content-Type: application/x-www-form-urlencoded
Content-Length: 7

foo=bar


In this case, you can use $_POST['foo'] in the index.php script to obtain the value 'bar'. Developers call this POST data; it is the familiar way in which a form submits its request with the POST method.
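
As a rough illustration of what PHP does behind $_GET and $_POST, the sketch below decodes a query string and a URL-encoded body by hand with parse_url() and parse_str(). The variable names are hypothetical; in a real script PHP fills the superglobals for you.

```php
<?php
// Request target from the GET example and body from the POST example.
$requestTarget = '/index.php?foo=bar';
$requestBody   = 'foo=bar';

// Query string -> roughly what PHP exposes as $_GET.
$queryString = (string) parse_url($requestTarget, PHP_URL_QUERY);
parse_str($queryString, $get);

// URL-encoded body -> roughly what PHP exposes as $_POST.
parse_str($requestBody, $post);

// Content-Length is the size of the body in bytes.
$contentLength = strlen($requestBody);
```

Both arrays end up holding the same key and value, which underlines the point that GET and POST data differ only in where they travel in the request.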

A request can contain both types of data:

The code is as follows:


POST /index.php?myget=foo HTTP/1.1
Host: example.org
Content-Type: application/x-www-form-urlencoded
Content-Length: 10

mypost=bar

These two data transmission methods are more reliable than cookies, because cookies may be disabled, whereas GET and POST data are always available. We can include PHPSESSID in the URL of the HTTP request, as in the following example:

The code is as follows:

GET /index.php?PHPSESSID=12345 HTTP/1.1
Host: example.org


In this way, the session id is passed just as it would be in the Cookie header. The drawback is that the developer must append the session id to every URL or add it to every form as a hidden field. Cookies need no such effort: once the server has instructed the client to create a cookie, the client automatically sends the unexpired cookie back in subsequent requests. Of course, if session.use_trans_sid is enabled, PHP can automatically append the session id to URLs and add it to forms as a hidden field. However, this option is not recommended for security reasons, because it makes the session id easy to leak. For example, if a user bookmarks or shares a URL, the session id in it is exposed, and as long as that session has not expired, anyone holding the URL can use it, unless the server verifies the user's legitimacy by some means other than the session id.
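
If you want to be explicit about keeping the session id out of URLs, one possible sketch is to set the relevant options at runtime. session.use_trans_sid is the option discussed above; session.use_only_cookies is a related PHP setting not mentioned in this article and is included here as an assumption about what a cautious configuration might look like. Both values shown are already the defaults in recent PHP versions.

```php
<?php
// Must run before session_start() and before any output is sent.

// Do not rewrite URLs and forms to carry the session id.
ini_set('session.use_trans_sid', '0');

// Ignore session ids that arrive in the URL; accept them from cookies only.
ini_set('session.use_only_cookies', '1');
```

The same values can be set in php.ini instead, which avoids having to repeat them in every script.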

Passing the session id in POST data is much safer than passing it in GET data. The drawback is that it is cumbersome: it is clearly impractical to convert every request in your application into a POST request.

V. Session management

Until now, I have only discussed how to maintain application state, which simply means maintaining the relationship between requests. Next, I will explain a technique used more often in practice: session management. Session management involves not only maintaining state between requests but also keeping the data used for each specific user during the session. We usually call this session data, because it is associated with the session between a specific user and the server. If you use PHP's built-in session management mechanism, session data is generally stored on the server in the /tmp directory and is automatically made available in the superglobal array $_SESSION. The simplest example of using a session is to carry session data from one page to another (note: what is actually passed is the session id). Sample code 1, start.php, demonstrates this:

Sample Code 1 - start.php

The code is as follows:




continue.php
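
The listing for start.php did not survive in this copy of the article; only the link text remains. A minimal sketch consistent with the surrounding description (it stores 'bar' under the key foo and links to continue.php) might look like this. The exact original markup is an assumption.

```php
<?php
// start.php (sketch)

// Start or resume the session; PHP sends the PHPSESSID cookie if needed.
session_start();

// Store a value in the session data kept on the server.
$_SESSION['foo'] = 'bar';

// Link to the next page; the session id travels in the cookie, not in this URL.
echo '<a href="continue.php">continue.php</a>';
```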



If you click the link in start.php to go to continue.php, then in continue.php you can use $_SESSION['foo'] to get the value 'bar' that was set in start.php. See sample code 2 below:

Sample Code 2 - continue.php

The code is as follows:
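
The listing for continue.php is also missing from this copy. Assuming it does what the text describes, a minimal sketch could be:

```php
<?php
// continue.php (sketch)

// Resume the session identified by the PHPSESSID the browser sent.
session_start();

// Prints the value stored by start.php, or nothing if this session has no such data.
echo $_SESSION['foo'] ?? '';
```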



Simple, isn't it? However, I want to point out that if you write code like this, it suggests you are not very familiar with how PHP implements sessions under the hood. Without understanding how much PHP does for you automatically, you will find such code hard to debug when something goes wrong. In fact, such code offers no security at all.

VI. Session security issues

Many developers have always believed that PHP's built-in session management mechanism is secure and can defend against common session attacks. In fact, this is a misunderstanding. The PHP team only implements a convenient and effective mechanism; specific security measures must be implemented by the application development team. As mentioned at the beginning, there is no best solution, only the one that suits you best.

Now, let's look at a common session attack:

1. The user visits http://www.example.org and logs in.
2. The example.org server instructs the client to set the cookie PHPSESSID=12345.
3. The attacker visits http://www.example.org/ and sends the cookie PHPSESSID=12345 in the request.
4. Because the example.org server uses PHPSESSID to identify the user, it mistakenly treats the attacker as the legitimate user.
[Figure not available: diagram of the session hijacking steps above.]


Of course, the prerequisite for this attack is that the attacker must somehow capture or guess the PHPSESSID of a legitimate user. Although this seems difficult, it is not impossible.

VII. Enhanced security

Many techniques can be used to enhance session security. The main idea is to make the verification process as simple as possible for legitimate users and as complicated as possible for attackers. Of course, this balance is hard to strike; you need to decide based on the specific design of your application.

The simplest form of an HTTP/1.1 request consists of the request line and the Host header:

The code is as follows:


GET / HTTP/1.1
Host: example.org


If the client passes the session identifier as PHPSESSID, it can be sent in the Cookie header:

The code is as follows:


GET / HTTP/1.1
Host: example.org
Cookie: PHPSESSID=12345


Similarly, the client can put the session identifier in the request url for transmission.

The code is as follows:


GET /?PHPSESSID=12345 HTTP/1.1
Host: example.org


Of course, session identifiers can also be included in POST data, but this affects the user experience, so this method is rarely used.

Because TCP/IP information cannot be fully trusted, web developers should not rely on it to enhance security. However, to impersonate a legitimate user, an attacker must still present that user's unique identifier. Therefore, it seems that the only effective protection is to hide the session identifier as well as possible or make it hard to guess. Ideally, do both.

PHP automatically generates a random session ID that is practically impossible to guess, so this aspect is reasonably well covered. Preventing attackers from capturing a valid session ID is much harder, and it is largely outside the developer's control.

In fact, session IDs can leak in many situations. For example, if the session ID is transmitted in GET data, this sensitive identifier may be exposed: users may cache links that contain session IDs, add them to their favorites, or send them in emails. Cookies are a relatively secure mechanism, but users can disable cookies on the client. Some versions of IE also had serious security vulnerabilities, the most famous of which leaked cookies to malicious websites.

Therefore, as a developer, you can be fairly sure that the session ID cannot be guessed, but attackers may still obtain it by other means. You must take additional security measures to prevent this from happening in your application.

In fact, in addition to the Host header, a standard HTTP request usually contains some optional headers. For example, look at the following request:

The code is as follows:


GET / HTTP/1.1
Host: example.org
Cookie: PHPSESSID=12345
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1
Accept: text/html;q=0.9, */*;q=0.1
Accept-Charset: ISO-8859-1, utf-8;q=0.66, *;q=0.66
Accept-Language: en


The preceding request contains four additional headers: User-Agent, Accept, Accept-Charset, and Accept-Language. Because these headers are optional, it is unwise to rely on them entirely in your application. However, if a user's browser does send these headers, it will almost certainly send the same headers in the next request made by the same user with the same browser, apart from a very small number of special cases. Suppose the preceding example is a request from a user who has an established session with the server, and consider the following request:

The code is as follows:


GET / HTTP/1.1
Host: example.org
Cookie: PHPSESSID=12345
User-Agent: Mozilla/5.0



Because the Cookie header of this request contains the same session id, the same PHP session will be accessed. However, the User-Agent header differs from the one in the previous request. Can the system assume that these two requests were sent by the same user?

In this case the User-Agent header has changed, but you cannot be certain that the request comes from an attacker. A good approach is to prompt the user to enter their password: the user experience is not greatly affected, and the attack is effectively prevented.

You can add code to your system that checks the User-Agent header, similar to the code in example 3:

Sample Code 3:

The code is as follows:
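
The original listing is missing from this copy. As a minimal sketch of the check described above, assuming the MD5 hash of the User-Agent was stored in the session under the key HTTP_USER_AGENT when the session was initialized, the comparison could be written as a small function. The function name userAgentMatches is hypothetical, and it takes the session and server arrays as parameters so that it can be tried without a live session.

```php
<?php
/**
 * Return true when the current request's User-Agent matches the
 * MD5 hash stored in the session when the session was initialized.
 */
function userAgentMatches(array $session, array $server): bool
{
    if (!isset($session['HTTP_USER_AGENT'])) {
        return false; // nothing stored yet, so nothing to compare against
    }
    $current = md5($server['HTTP_USER_AGENT'] ?? '');
    return hash_equals($session['HTTP_USER_AGENT'], $current);
}

// In a real script (sketch):
//   session_start();
//   if (!userAgentMatches($_SESSION, $_SERVER)) { /* prompt for password */ exit; }

// Illustration with made-up values:
$session = ['HTTP_USER_AGENT' => md5('Mozilla/5.0 (Macintosh)')];
$same    = userAgentMatches($session, ['HTTP_USER_AGENT' => 'Mozilla/5.0 (Macintosh)']);
$changed = userAgentMatches($session, ['HTTP_USER_AGENT' => 'Mozilla/5.0']);
```

Here $same is true and $changed is false, which corresponds to the two requests compared in the text.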




Of course, you must first hash the User-Agent information with MD5 and save it in the session during the first request that initializes the session, similar to the code in example 4 below:

Sample Code 4:

The code is as follows:
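
This listing is also missing from this copy. A minimal sketch of the initialization step, assuming the hash is kept under the session key HTTP_USER_AGENT (the function name rememberUserAgent is hypothetical), might be:

```php
<?php
/**
 * On the first request of a session, store the MD5 hash of the
 * User-Agent header so that later requests can be compared against it.
 */
function rememberUserAgent(array &$session, array $server): void
{
    if (!isset($session['HTTP_USER_AGENT'])) {
        $session['HTTP_USER_AGENT'] = md5($server['HTTP_USER_AGENT'] ?? '');
    }
}

// In a real script (sketch):
//   session_start();
//   rememberUserAgent($_SESSION, $_SERVER);

// Illustration with made-up values:
$session = [];
rememberUserAgent($session, ['HTTP_USER_AGENT' => 'Mozilla/5.0']);
// A later request with a different header does not overwrite the stored hash.
rememberUserAgent($session, ['HTTP_USER_AGENT' => 'curl/8.0']);
```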



Although it is not strictly necessary to hash the User-Agent information with MD5, doing so means you do not need to filter the $_SERVER['HTTP_USER_AGENT'] data. Otherwise, you must filter the data before using it, because any data from the client is untrusted.

Once you check the client's User-Agent header, the attacker must complete two steps to hijack a session:

1. Obtain a valid session id.
2. Include an identical User-Agent header in the forged request.

You may say that an attacker who can obtain a valid session id will have no difficulty forging the same User-Agent. That is true, but the check at least causes him some extra trouble and increases the security of the session mechanism to a certain extent.

You can take this one step further. Since checking the User-Agent header enhances security, we can combine it with other information to generate a hashed token and have the client carry this token in subsequent requests. Attackers then have little chance of guessing how the token is generated. This is like paying by credit card at the supermarket: you must have the credit card (like the session id) and the payment password (like the token), and only when both conditions are met can the payment go through. See the following code:

The code is as follows:
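
The listing is missing from this copy. One way to build such a token, sketched here under assumptions, is to hash the User-Agent together with another stable header and a secret string known only to the server. The function name buildToken, the choice of Accept-Language as the second header, and the secret value are all hypothetical; the Accept header is left out on purpose.

```php
<?php
/**
 * Build a token from request headers plus a server-side secret.
 * The secret is what keeps an attacker from reproducing the token
 * even if he knows which headers are used.
 */
function buildToken(array $server, string $secret): string
{
    $string  = $server['HTTP_USER_AGENT'] ?? '';
    $string .= $server['HTTP_ACCEPT_LANGUAGE'] ?? '';
    $string .= $secret;
    return md5($string);
}

// Illustration with made-up values:
$secret  = 'example-secret'; // hypothetical; keep the real one out of source control
$request = ['HTTP_USER_AGENT' => 'Mozilla/5.0', 'HTTP_ACCEPT_LANGUAGE' => 'en'];
$token   = buildToken($request, $secret);

// The same browser produces the same token on the next request...
$again = buildToken($request, $secret);
// ...while a request with a different User-Agent does not.
$other = buildToken(['HTTP_USER_AGENT' => 'curl/8.0', 'HTTP_ACCEPT_LANGUAGE' => 'en'], $secret);
```

MD5 is used only because the article uses it; any stronger hash function available through hash() would serve the same purpose.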





Note: the Accept header should not be used to generate the token, because some browsers change this header automatically when the page is refreshed.

After you add this hard-to-guess token to your verification mechanism, security is greatly improved. If the token is passed in the same way as the session id, an attacker must complete three steps to hijack the user's session:

1. Obtain a valid session ID.
2. Include the same User-Agent header in the forged request.
3. Include the victim's token in the forged request.

There is a problem here. If both the session id and the token are transmitted in GET data, an attacker who can obtain the session ID can obtain the token as well. It is therefore safer and more reliable to transmit the session ID and the token by two different methods, for example the session id in a cookie and the token in GET data. Then, even if an attacker obtains the user's session id by some means, he is unlikely to obtain the token at the same time, which is relatively secure.
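
As a rough sketch of this split, the helpers below append the token to a URL and later compare the token received in GET data with the copy kept in the session, while the session id itself travels in the cookie as usual. The function names, the parameter name token, and the session key token are all hypothetical.

```php
<?php
/** Append the token to a URL as a GET parameter. */
function withToken(string $url, string $token): string
{
    $separator = (strpos($url, '?') === false) ? '?' : '&';
    return $url . $separator . 'token=' . urlencode($token);
}

/** Check the token from GET data against the copy stored in the session. */
function tokenIsValid(array $get, array $session): bool
{
    return isset($get['token'], $session['token'])
        && is_string($get['token'])
        && hash_equals($session['token'], $get['token']);
}

// In a real script (sketch):
//   session_start();
//   if (!tokenIsValid($_GET, $_SESSION)) { /* prompt for password */ exit; }

// Illustration with made-up values:
$session = ['token' => 'abc123'];
$link    = withToken('/index.php', $session['token']);
$valid   = tokenIsValid(['token' => 'abc123'], $session);
$forged  = tokenIsValid(['token' => 'guess'], $session);
```

Keep in mind that a token in the URL can leak in the same ways a session id in the URL can, so this only helps because the attacker would need both pieces at once.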

There are many other techniques for enhancing the security of your session mechanism. I hope that, with a general understanding of how sessions work internally, you can design a verification mechanism suited to your application and greatly improve its security. After all, you are among the developers who know the system best, and you can implement additional security measures of your own according to the actual situation.

VIII. Summary

This article has briefly described how sessions work and outlined some security measures. Remember, however, that the methods above enhance security; they do not fully protect your system. I hope readers will investigate the topic further on their own, and I believe that in doing so you will find a solution of real practical value.
