Session blocking, garbage collection, and Redis session sharing in PHP

Source: Internet
Author: User

What is the relationship between sessions and cookies?

Cookies are another technology born of HTTP's statelessness. They too are used to store the visitor's identity and some data. Every time the client makes an HTTP request, the cookie data is added to the HTTP headers and submitted to the server, so the server can identify the visitor from the cookie's contents.

Sessions and cookies do similar jobs. The difference is that a session keeps the data on the server and retrieves it using the session_id the client submits, while a cookie keeps the data on the client and sends it to the server with every request.

As mentioned above, the session_id can be passed either in the URL or in a cookie. Because the URL approach is less secure and less convenient, a cookie is normally used.

The server generates the session_id and sends it to the client (for example, a browser) in the HTTP response. On receiving it, the client creates a cookie that holds the session_id. Cookies are stored as key/value pairs and look like this:

PHPSESSID=e4tqo2ajfbqqia9prm8t83b1f2

In PHP, the cookie that holds the session_id is named PHPSESSID by default. The name can be changed with session.name in php.ini or with the session_name() function.



Why PHP's files session handler is not recommended

PHP's default session handler is files, and users can also implement a handler themselves (see "Custom Session Handlers" in the manual). Many mature session handlers already exist: Redis, Memcached, MongoDB, and so on. Why is PHP's own files handler not recommended? The official PHP manual gives the reason:

Whether you start a session manually by calling session_start() or automatically through the session.auto_start configuration item, with file-based session storage (PHP's default behavior) the session data file is locked when the session starts and stays locked until the PHP script finishes executing or explicitly calls session_write_close(). During this time, other scripts cannot access the same session data file.

Source of the quotation above: the manual's "Basic usage" page for sessions.

To prove this, we created 2 files:

File: session1.php

<?php
session_start();        // locks this session's data file
sleep(5);               // simulate a slow operation while the lock is held
var_dump($_SESSION);
?>


File: session2.php

<?php
session_start();        // waits here if another request holds the lock
var_dump($_SESSION);
?>


In one browser, open http://127.0.0.1/session1.php, then immediately open http://127.0.0.1/session2.php in a new tab of the same browser. In this experiment, session1.php produced its output after 5 seconds, and session2.php also took nearly 5 seconds, whereas session2.php requested on its own opens instantly. Next, open session1.php in one browser and immediately open session2.php in a different browser. This time session1.php still takes 5 seconds, while session2.php opens instantly.

The reason is as follows. In the example above the session_id is passed in a cookie by default, and the cookie's scope is the same for both URLs. Both addresses are therefore requested from the same browser with the same session_id (which is what lets the server recognize the visitor, the effect we want). When session1.php is requested, PHP uses the submitted session_id to find the corresponding session file in the session save directory on the server (/tmp by default; it can be changed with session.save_path in php.ini or with the session_save_path() function) and locks it. Unless session_write_close() is called explicitly, the file lock is not released until the script finishes. If the script performs a time-consuming operation (such as sleep(5) in the example), another request carrying the same session_id is forced to wait because the file is locked, and so the request blocks.
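
The waiting described above can be reproduced without a web server. The following is a minimal sketch, assuming a platform such as Linux where an flock() lock belongs to the open file handle. The temporary file only stands in for a session file, and the variable names are illustrative.

```php
<?php
// A temporary file stands in for the session file (illustrative only).
$path = tempnam(sys_get_temp_dir(), 'sess_');

// Two handles play the roles of two concurrent requests
// that carry the same session_id.
$requestA = fopen($path, 'c+');
$requestB = fopen($path, 'c+');

// Request A takes the exclusive lock, as happens when a session starts.
$aLocked = flock($requestA, LOCK_EX);

// Request B tries without blocking; a real request would simply wait here.
$bLockedWhileAHolds = flock($requestB, LOCK_EX | LOCK_NB);

// Releasing A's lock corresponds to session_write_close() or script end.
flock($requestA, LOCK_UN);
$bLockedAfterRelease = flock($requestB, LOCK_EX | LOCK_NB);

var_dump($aLocked, $bLockedWhileAHolds, $bLockedAfterRelease);

// Clean up.
flock($requestB, LOCK_UN);
fclose($requestA);
fclose($requestB);
unlink($path);
```

The second handle can take the lock only after the first one releases it, which is the position session2.php is in while session1.php sleeps.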

In that case, wouldn't calling session_write_close() explicitly, right after you finish using the session, solve the problem? In the example above, that means calling session_write_close() before sleep(5).

Indeed, session2.php is then no longer blocked by session1.php. However, explicitly calling session_write_close() writes the data to the file and ends the current session, so if later code needs the session again, it must call session_start() again.

For example:

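<?php
ob_start();              // buffer output so headers are not sent before the second session_start()
session_start();
$_SESSION['name'] = 'Jing';
var_dump($_SESSION);
session_write_close();   // write the data and release the file lock
sleep(5);                // other requests are not blocked during this time
session_start();         // reopen the session before using it again
$_SESSION['name'] = 'Mr.Jing';
var_dump($_SESSION);
?>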



The official solution:

For websites that make heavy use of AJAX or concurrent requests, this can be a serious problem. The easiest way to deal with it is to call session_write_close() as soon as you have finished modifying the session variables, which saves the session data and releases the file lock. Another option is to replace the file-based session save handler with one that supports concurrent access.

My recommendation is to use Redis as the session handler.
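
Switching handlers is mostly a matter of configuration. Here is a minimal sketch, assuming the phpredis extension is installed and a Redis server listens on 127.0.0.1:6379 (both are assumptions for illustration, not details from this article):

```php
<?php
// Assumes the phpredis extension, which registers the "redis" save handler.
ini_set('session.save_handler', 'redis');
// Host and port are placeholders for your own Redis server.
ini_set('session.save_path', 'tcp://127.0.0.1:6379');

session_start();
$_SESSION['name'] = 'Jing';
```

The same two settings can be placed in php.ini instead, so that application code does not need to change.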

When is session data deleted?

Interviewers often ask this question.

First, look at what the official manual says:

session.gc_maxlifetime specifies the number of seconds after which data is considered "garbage" and may be cleaned up. Garbage collection may start when a session starts (depending on session.gc_probability and session.gc_divisor). session.gc_probability and session.gc_divisor together manage the probability that the GC (garbage collection) process starts. The probability is calculated as gc_probability/gc_divisor. For example, 1/100 means there is a 1% chance that the GC process starts on each request. session.gc_probability defaults to 1, and session.gc_divisor defaults to 100.
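
To make the arithmetic concrete, here is a small sketch of the gc_probability/gc_divisor rule quoted above. The function names are hypothetical helpers, not part of PHP, and the second function simply assumes each request is an independent trial.

```php
<?php
/**
 * Probability that a single session start triggers GC,
 * following the gc_probability / gc_divisor rule.
 */
function gcStartProbability(int $gcProbability, int $gcDivisor): float
{
    if ($gcDivisor <= 0) {
        throw new InvalidArgumentException('gc_divisor must be positive');
    }
    $ratio = $gcProbability / $gcDivisor;
    // A probability cannot leave the range 0..1, so clamp the ratio.
    return (float) min(1.0, max(0.0, $ratio));
}

/**
 * Chance that at least one of $requests session starts triggers GC,
 * assuming the requests are independent.
 */
function gcRunsAtLeastOnce(float $p, int $requests): float
{
    return 1.0 - (1.0 - $p) ** $requests;
}

$p = gcStartProbability(1, 100);   // the default settings
printf("per request: %.2f\n", $p);
printf("within 100 requests: %.3f\n", gcRunsAtLeastOnce($p, 100));
```

With the defaults, even 100 session starts leave a sizeable chance that GC has not run at all, which is why the expiry time is only approximate.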

To continue with a not entirely apt analogy: if we leave things in a supermarket locker and never collect them, then after a long time (say one month) the security guard clears those lockers out. Of course, passing the deadline does not mean the guard comes to clean up right away. Maybe he is lazy, or maybe it simply does not occur to him.

Now look at two more passages from the manual:

If you use the default file-based session handler, the file system must keep track of access times (atime). The Windows FAT file system does not, so if you are stuck with FAT or any other file system that cannot track atime, you will have to find another way to garbage-collect session data. Since PHP 4.2.3, mtime (modification time) has been used instead of atime, so file systems that cannot track atime are no longer a problem.

GC runs only with a certain probability, so its timing is not precise, and this setting does not guarantee that old session data is deleted. Some session storage modules do not use this setting at all.

I have my doubts about such a deletion mechanism.

For example, if gc_probability/gc_divisor is set to a large value, or the site receives a large number of requests, the GC process starts very frequently.

Moreover, each time the GC process starts it has to traverse the list of session files, compare each file's modification time with the server's current time, and decide whether the file has expired and should be deleted.
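
The traversal just described can be sketched as follows. This is not PHP's internal implementation, only an illustration of the same idea: one stat call per session file on every GC run. The function name and the demo file names are hypothetical.

```php
<?php
/**
 * Delete files in $dir whose name starts with "sess_" and whose
 * modification time is more than $maxLifetime seconds before $now.
 * Returns the number of files removed.
 */
function sweepExpiredSessions(string $dir, int $maxLifetime, int $now): int
{
    $removed = 0;
    foreach (glob($dir . '/sess_*') ?: [] as $file) {
        $mtime = filemtime($file);
        if ($mtime !== false && $now - $mtime > $maxLifetime && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}

// Demo in a throwaway directory.
$dir = sys_get_temp_dir() . '/gc_demo_' . uniqid();
mkdir($dir);
$now = time();
touch($dir . '/sess_old', $now - 3600);   // last modified an hour ago
touch($dir . '/sess_fresh', $now);        // just modified

$removed = sweepExpiredSessions($dir, 1440, $now);
$left = array_map('basename', glob($dir . '/sess_*') ?: []);
var_dump($removed, $left);

// Clean up the demo directory.
array_map('unlink', glob($dir . '/sess_*') ?: []);
rmdir($dir);
```

The cost of each sweep grows with the number of session files in the directory, which is why frequent GC runs hurt on a busy site.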

That is why I think PHP's built-in files session handler should not be used. Redis and Memcached natively support key/value expiration, which makes them a good fit for session handling. Alternatively, implement a file-based handler yourself that checks whether the file has expired at the moment it fetches the individual session file for a given session_id.
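
The last idea, checking expiry at the moment a single session file is read, might look like the sketch below. The function name is hypothetical; in a real handler the same check would sit in the read step of a custom session handler.

```php
<?php
/**
 * Read one session file, treating it as missing if it is older than
 * $maxLifetime seconds. An expired file is deleted on the spot.
 */
function readSessionIfFresh(string $file, int $maxLifetime, int $now): ?string
{
    clearstatcache(true, $file);
    if (!is_file($file)) {
        return null;
    }
    if ($now - filemtime($file) > $maxLifetime) {
        unlink($file);      // expired: remove it now, no GC pass required
        return null;
    }
    $data = file_get_contents($file);
    return $data === false ? null : $data;
}

// Demo with a temporary file standing in for a session file.
$file = tempnam(sys_get_temp_dir(), 'sess_');
file_put_contents($file, 'name|s:4:"Jing";');
$now = time();

$fresh = readSessionIfFresh($file, 1800, $now);   // still valid
touch($file, $now - 3600);                        // pretend it is an hour old
$stale = readSessionIfFresh($file, 1800, $now);   // expired and removed

var_dump($fresh, $stale, file_exists($file));
```

Because the check looks at only one file per request, expiry no longer depends on whether a probabilistic GC pass happens to run.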


Why session data cannot be retrieved after the browser is restarted

session.cookie_lifetime specifies the lifetime, in seconds, of the cookie sent to the browser. A value of 0 means "until the browser is closed". The default is 0.

In fact, the session data has not been deleted (it may have been, but the probability is small; see the previous section). It is just that closing the browser discards the cookie that stores the session_id. In other words, you have lost the key (the session_id) that opens the supermarket locker.

Similarly, clearing the browser's cookies manually, or having other software clean them up, has the same result.

Why am I logged out after a long period of inactivity even though the browser is still open

This is an idle-timeout safeguard that protects the security of the user's account.

This section is included because the implementation of this feature may be related to the session deletion mechanism ("may" because the feature does not have to be implemented with sessions; cookies can do it too).

Put simply, after a long period of inactivity the session file on the server expires and is deleted.


An interesting observation

During my experiments I noticed something interesting. I set the GC start-up probability to 100%. If only one visitor makes requests, then even when that visitor makes a second request long after the expiration time has passed, the session data still exists (the session file is still in the session.save_path directory). Yes, it is clearly past its expiration time, yet GC did not delete it. When I then made a request from another browser (equivalent to another visitor), that request created a new session file, and the session file created for the first browser finally disappeared from the session.save_path directory.

I also found that after the session file has been deleted, a new request creates a session file with the same name as before (the browser was never closed, so the session_id sent with the request is the same, and so is the file name of the regenerated session file). What I don't understand is that the creation time of this new file is the time the file was first created. Did it come back from the Recycle Bin? (Yes, I ran this experiment on Windows.)

My guess at the reason: when a session starts, PHP finds and opens the session file for the session_id before it starts the GC process. The GC process then only checks files other than the current session file and deletes the ones that have expired. So even if the current session file has expired, GC does not delete it.

I think this is unreasonable.

I have not studied the problem in depth, for three reasons: its impact is not very large (after all, on a production site with many requests, it is quite likely that the current request's expired file will be removed by a GC run triggered by some other request), I am not confident about reading the PHP source code, and I do not use the built-in files session handler in production. Please bear with me. The code used in the experiment:

<?php
// Expiration time set to 30 seconds
ini_set('session.gc_maxlifetime', '30');
// GC start-up probability set to 100%
ini_set('session.gc_probability', '100');
ini_set('session.gc_divisor', '100');
session_start();
$_SESSION['name'] = 'Jing';
var_dump($_SESSION);
?>





How to set a session that expires strictly after 30 minutes

The first answer

The most common answer is to set the session's expiration time, that is, session.gc_maxlifetime. This is incorrect for the following reasons:

1. First, PHP runs session GC only with a certain probability, governed by session.gc_probability and session.gc_divisor (see "In-depth understanding of PHP principles: a small-probability Notice from session GC"). The defaults are 1 and 100, so there is a 1% chance that PHP runs GC when a session starts. That cannot guarantee expiry after exactly 30 minutes.

2. What about setting a high probability of cleanup? That is still wrong. First, PHP stats the session file and uses its modification time to judge whether it has expired, so raising the probability hurts performance. Second, PHP stores all the variables of one session in a single file. Suppose I set a session variable a=1 and, five minutes later, set another session variable b=2. The file's modification time then becomes the time b was added, so a is not cleaned up 30 minutes after it was set. There is also the third reason below.

3. By default (taking Linux as an example) PHP uses /tmp as the storage directory for sessions, and the manual has the following note:

Note: If different scripts have different session.gc_maxlifetime values but share the same place to store session data, the script with the smallest value will clean up the data. In this case, use this directive together with session.save_path.

That is, suppose two applications do not specify their own save_path, one with an expiry of 2 minutes (call it A) and one with 30 minutes (call it B). Every time A's session GC runs, it also deletes the session files that belong to application B once they are more than 2 minutes old.

So the first answer is not "strictly" correct.

The second answer

Another common answer is to set the expiration time of the carrier of the session ID, the cookie, that is, session.cookie_lifetime. This answer is also incorrect, for the following reason:

This expiry is only the cookie's expiry. The difference between a cookie and a session is that session expiry is enforced by the server, while cookie expiry is enforced by the client (the browser). Even if you set the cookie to expire, that only guarantees that a standards-compliant browser stops sending the cookie (which contains the session ID) after it expires. If someone constructs the request by hand, the session ID can still be used.

The third answer

Use Memcache, Redis, and so on. OK, this is a correct answer. However, the interviewer will obviously go on to ask: what if you can only use PHP?

The fourth answer

Of course, the interview is not meant to make things hard for you but to examine how thoroughly you think. During the interview I point out these traps, so in general the approach is:

1. Set the cookie expiration time to 30 minutes and set the session lifetime to 30 minutes as well.
2. Add a timestamp to each session value.
3. Before each access, check the timestamp.
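
Steps 2 and 3 can be sketched in a few lines. This is one possible shape, not a canonical implementation: the helper names and the array layout are assumptions, and a plain array stands in for $_SESSION so the example runs anywhere.

```php
<?php
const SESSION_TTL = 1800; // 30 minutes, in seconds

/** Store a value together with the time it was written. */
function setTimed(array &$session, string $key, $value, int $now): void
{
    $session[$key] = ['value' => $value, 'stored_at' => $now];
}

/** Return the value, or null (and drop it) once it is older than the TTL. */
function getTimed(array &$session, string $key, int $now, int $ttl = SESSION_TTL)
{
    if (!isset($session[$key])) {
        return null;
    }
    if ($now - $session[$key]['stored_at'] >= $ttl) {
        unset($session[$key]);
        return null;
    }
    return $session[$key]['value'];
}

// A plain array stands in for $_SESSION.
$session = [];
$t0 = 1700000000;                       // arbitrary fixed timestamp
setTimed($session, 'a', 1, $t0);
setTimed($session, 'b', 2, $t0 + 300);  // five minutes later

$aAt29 = getTimed($session, 'a', $t0 + 1740);  // 29 minutes after a was set
$aAt30 = getTimed($session, 'a', $t0 + 1800);  // 30 minutes after a was set
$bAt30 = getTimed($session, 'b', $t0 + 1800);  // b is only 25 minutes old
var_dump($aAt29, $aAt30, $bAt30);
```

Each value carries its own timestamp, so a expires 30 minutes after it was set even though b was written to the same session later.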

Finally, some readers have asked why anyone would need a strict 30-minute expiry. First, this is an interview question. Second, there are real use cases, for example a discount that expires after 30 minutes.



Why sessions should not be stored in Memcached

Titas Norkūnas is a co-founder of Bear Mountain, a DevOps consultancy. Seeing that the Ruby/Rails community was ignoring the issues raised in Dormando's two articles, he recently wrote a further elaboration of them. The problem, he argues, is that memcached is a system designed to cache data rather than to store it, and so it should not be used to store sessions.

Of Dormando's two articles, he considers the reasons given in the first easy to understand, while the reasons given in the second are often poorly understood. He therefore explains the latter in detail:

Memcached uses a least recently used (LRU) algorithm to evict cached data. But memcached's LRU runs per slab class, not globally.

This means that if all sessions are roughly the same size, they will fall into two or three slab classes. All other data of roughly the same size is placed in the same slabs and competes with the sessions for storage space. Once a slab is full, data is evicted rather than moved into a larger slab, even if the larger slab has room ... Within a particular slab, the users with the oldest sessions are logged out. Users start getting logged out at random, and worst of all, you probably won't even notice until they start complaining ...

In addition, Norkūnas mentions that when new data is added to a session, the session grows, which may also cause users to be logged out.

Some have suggested using separate memcached instances for sessions and for other data. However, because memcached's LRU algorithm is local, that approach not only leads to low memory utilization but also fails to eliminate the risk that users are randomly logged out because their sessions are evicted.
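
The effect of a per-slab-class LRU can be illustrated with a toy model. The sketch below is not memcached's allocator: the chunk sizes, the fixed item count per class, and the class name ToySlabCache are all made up for illustration, and recency is updated on writes only.

```php
<?php
/**
 * Toy model of per-slab-class LRU eviction. Real memcached sizing
 * and page handling are more involved.
 */
final class ToySlabCache
{
    /** @var array<int, array<string, string>> chunk size => items, oldest first */
    private array $classes = [];
    private int $itemsPerClass;

    public function __construct(array $chunkSizes, int $itemsPerClass)
    {
        sort($chunkSizes);
        foreach ($chunkSizes as $size) {
            $this->classes[$size] = [];
        }
        $this->itemsPerClass = $itemsPerClass;
    }

    /** Store a value; returns the key evicted to make room, if any. */
    public function set(string $key, string $value): ?string
    {
        $size = $this->classFor(strlen($value));
        foreach (array_keys($this->classes) as $s) {
            unset($this->classes[$s][$key]);    // drop any previous copy
        }
        $evicted = null;
        if (count($this->classes[$size]) >= $this->itemsPerClass) {
            // Evict the oldest item of THIS class only.
            $evicted = (string) array_key_first($this->classes[$size]);
            unset($this->classes[$size][$evicted]);
        }
        $this->classes[$size][$key] = $value;   // newest goes last
        return $evicted;
    }

    public function has(string $key): bool
    {
        foreach ($this->classes as $items) {
            if (array_key_exists($key, $items)) {
                return true;
            }
        }
        return false;
    }

    /** Smallest chunk size that can hold $bytes. */
    private function classFor(int $bytes): int
    {
        foreach (array_keys($this->classes) as $size) {
            if ($bytes <= $size) {
                return $size;
            }
        }
        throw new InvalidArgumentException('value too large for any class');
    }
}

$cache = new ToySlabCache([64, 1024], 2);   // two classes, two items each
$cache->set('session:alice', str_repeat('s', 50));
$cache->set('session:bob', str_repeat('s', 50));
// An unrelated cached fragment of similar size lands in the same class.
$evicted = $cache->set('fragment:menu', str_repeat('f', 60));
var_dump($evicted, $cache->has('session:alice'), $cache->has('session:bob'));
```

In this model the oldest session is evicted although the 1024-byte class is completely empty, which mirrors the random logouts described above.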

Readers who are still keen to use memcached to speed up session reads can borrow the memcached + RDBMS pattern that Norkūnas proposes (in some cases NoSQL also works):

When the user logs in, "set" the session in memcached and also write it to the database;
Add a field to the session that records when the session was last written to the database;
On each page load, read the session from memcached first, and fall back to the database if it is not there;
After every N page loads or Y minutes, write the session to the database again;
Get expired sessions from the database, preferring the latest data from memcached.
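
The first four steps above can be sketched as follows. Plain arrays stand in for memcached and the database, and every name here (HybridSessionStore, db_written_at, loads) is an illustrative assumption rather than part of Norkūnas's proposal. The last step, expiry handling, is left out to keep the sketch short.

```php
<?php
/**
 * Sketch of the cache-plus-database pattern with in-memory stand-ins.
 */
final class HybridSessionStore
{
    public array $cache = [];   // stand-in for memcached (may lose entries)
    public array $db = [];      // stand-in for the durable database

    private int $everyNLoads;
    private int $everyYSeconds;

    public function __construct(int $everyNLoads, int $everyYSeconds)
    {
        $this->everyNLoads = $everyNLoads;
        $this->everyYSeconds = $everyYSeconds;
    }

    /** Login: write to both stores and record when the DB was written. */
    public function login(string $id, array $data, int $now): void
    {
        $session = ['data' => $data, 'db_written_at' => $now, 'loads' => 0];
        $this->cache[$id] = $session;
        $this->db[$id] = $session;
    }

    /** Page load: prefer the cache, fall back to the database. */
    public function load(string $id, int $now): ?array
    {
        $session = $this->cache[$id] ?? $this->db[$id] ?? null;
        if ($session === null) {
            return null;
        }
        $session['loads']++;
        $due = $session['loads'] >= $this->everyNLoads
            || $now - $session['db_written_at'] >= $this->everyYSeconds;
        if ($due) {
            $session['loads'] = 0;
            $session['db_written_at'] = $now;
            $this->db[$id] = $session;   // periodic write-back
        }
        $this->cache[$id] = $session;
        return $session['data'];
    }
}

$store = new HybridSessionStore(3, 600);     // every 3 loads or 10 minutes
$store->login('abc', ['name' => 'Jing'], 1000);
unset($store->cache['abc']);                 // simulate a memcached eviction
$afterEviction = $store->load('abc', 1010);  // served from the database
$store->load('abc', 1020);
$store->load('abc', 1030);                   // third load triggers write-back
var_dump($afterEviction, $store->db['abc']['db_written_at']);
```

An eviction from the cache costs one database read here instead of logging the user out, which is the point of the pattern.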
