A detailed explanation of the principles of PHP serialization and deserialization

Source: Internet
Author: User
This article, part of a series on PHP deserialization vulnerabilities, explains how PHP serialization and deserialization work. It is intended as a reference for anyone who needs the background.

0. Preface

I will not repeat what object serialization and deserialization are. In PHP, the result of serialization is a string in a PHP-specific format, somewhat like JSON.

In any language, designing object serialization and deserialization means solving several problems:

Self-description: the serialized result must describe itself, so that the concrete type of the object can be recovered from it. Knowing the type is not enough, of course; the exact value must be recorded as well.

Access control at serialization time: it should be possible to choose which fields are serialized, as Golang does very conveniently.

Time performance: in performance-sensitive scenarios, such as high-performance services, object serialization must not become a bottleneck (I often use protobuf for serialization there).

Space performance: the serialized result must not be too long. If an int in memory becomes ten times the size of an int after serialization, the serialization algorithm has a problem.

This article explains serialization and deserialization in PHP purely from the point of view of PHP code. Keep one thing in mind: serialization and deserialization handle only an object's data, which is easier to understand with some object-oriented development experience.

1. The serialization function serialize and the deserialization function unserialize

PHP provides object serialization natively, unlike C++ ... ^_^. It is also very simple to use: there are just two functions.

    class fobnn
    {
        public $hack_id;
        private $hack_name;

        public function __construct($name, $id)
        {
            $this->hack_name = $name;
            $this->hack_id = $id;
        }

        public function print()
        {
            echo $this->hack_name . PHP_EOL;
        }
    }

    $obj = new fobnn('fobnn', 1);
    $obj->print();

    $serializedstr = serialize($obj); // serialize through the serialize function
    echo $serializedstr . PHP_EOL;

    $toobj = unserialize($serializedstr); // deserialize through unserialize
    $toobj->print();

    fobnn
    O:5:"fobnn":2:{s:7:"hack_id";i:1;s:16:"fobnnhack_name";s:5:"fobnn";}
    fobnn

Look at the second line of the output: this string is the serialization result. Its structure is quite readable, and you can see that the data is mapped by class name and member name. Members with different access levels are labeled slightly differently after serialization: the private member hack_name is stored as fobnnhack_name, prefixed with the class name and wrapped in two invisible NUL bytes, which is why its recorded length is 16 rather than 14.

With the questions raised above in mind, let's look at how PHP handles them.
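The different labels per access level are easier to see when the NUL bytes are made visible. The following is a minimal sketch; the class Demo and its members are hypothetical names used only for illustration.

```php
<?php
// Hypothetical class, not part of the article's example.
class Demo
{
    public $a = 1;
    protected $b = 2;
    private $c = 3;
}

$raw = serialize(new Demo());

// Replace each invisible NUL byte with the two visible characters \0.
$visible = str_replace("\0", '\0', $raw);
echo $visible . PHP_EOL;
// prints: O:4:"Demo":3:{s:1:"a";i:1;s:4:"\0*\0b";i:2;s:7:"\0Demo\0c";i:3;}
```

Public names are stored as-is, protected names are prefixed with NUL * NUL, and private names with NUL, the class name, and NUL. The lengths (4 and 7) count the NUL bytes as one byte each.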

1. Self-describing function

In O:5:"fobnn":2, the O indicates that the value is an object, 5 is the length of the type name, and fobnn is the type name itself. The 2 that follows indicates that the object has 2 members.

Each member is described with the same set of rules, so the format is a recursive definition.

Self-description is implemented mainly by recording the object name and member names as strings.
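The same tag-length-value scheme applies to scalars and arrays, which is what makes the definition recursive. A short sketch using only built-in values:

```php
<?php
// Each value is written as a type tag, an optional length, and the value.
echo serialize(42) . PHP_EOL;    // i:42;
echo serialize('abc') . PHP_EOL; // s:3:"abc";
echo serialize(true) . PHP_EOL;  // b:1;
echo serialize(null) . PHP_EOL;  // N;

// Containers nest the same encoding: a:<count>:{<key><value>...}
$nested = serialize(['id' => 1, 'tags' => ['x', 'y']]);
echo $nested . PHP_EOL;
// a:2:{s:2:"id";i:1;s:4:"tags";a:2:{i:0;s:1:"x";i:1;s:1:"y";}}
```

An object uses the same layout as an array, with O and the class name in place of a.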

2. Performance issues

This article does not analyze the time performance of PHP serialization; that is left for later. The serialized result, however, resembles a protocol in the style of JSON/BSON: there is a protocol header that describes the type and a protocol body that describes the value, and the result is not compressed.
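To get a rough feel for the space cost, the size of the serialized string can be compared with the JSON encoding of the same data. This is only an illustrative sketch with a made-up array, not a benchmark.

```php
<?php
// Sample data chosen for illustration only.
$data = ['id' => 1, 'name' => 'fobnn'];

$native = serialize($data);   // a:2:{s:2:"id";i:1;s:4:"name";s:5:"fobnn";}
$json   = json_encode($data); // {"id":1,"name":"fobnn"}

printf("serialize:   %d bytes\n", strlen($native));
printf("json_encode: %d bytes\n", strlen($json));

// A small integer is stored as text: tag, colon, digits, semicolon.
printf("serialize(1): %d bytes\n", strlen(serialize(1)));
```

The type tags and explicit lengths make the native format longer than JSON for this input, which is the price of self-description.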

2. The Magic method in deserialization

PHP also has a solution for the second problem mentioned above, controlling what gets serialized. One way is magic methods; the other is a custom serialization function. Let's start with the magic methods __sleep and __wakeup.

    class fobnn
    {
        public $hack_id;
        private $hack_name;

        public function __construct($name, $id)
        {
            $this->hack_name = $name;
            $this->hack_id = $id;
        }

        public function print()
        {
            echo $this->hack_name . PHP_EOL;
        }

        public function __sleep()
        {
            return array("hack_name");
        }

        public function __wakeup()
        {
            $this->hack_name = 'haha';
        }
    }

    $obj = new fobnn('fobnn', 1);
    $obj->print();

    $serializedstr = serialize($obj);
    echo $serializedstr . PHP_EOL;

    $toobj = unserialize($serializedstr);
    $toobj->print();

    fobnn
    O:5:"fobnn":1:{s:16:"fobnnhack_name";s:5:"fobnn";}
    haha

__sleep is called before serialization and returns an array of the member names to serialize, which lets us control what data is serialized. In this example I return only hack_name, and you can see that only the hack_name member appears in the result.

After deserialization is complete, __wakeup is called. Here we can do follow-up work, such as reconnecting to the database.
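The reconnection case can be sketched as follows. The class LogStore is hypothetical, and a php://memory stream stands in for a real database connection, since a live resource cannot be serialized.

```php
<?php
// Hypothetical class; the stream is a stand-in for a database handle.
class LogStore
{
    private $target;
    private $handle;

    public function __construct($target)
    {
        $this->target = $target;
        $this->connect();
    }

    private function connect()
    {
        $this->handle = fopen($this->target, 'w+');
    }

    public function __sleep()
    {
        // Serialize only the configuration, not the live resource.
        return ['target'];
    }

    public function __wakeup()
    {
        // Called by unserialize() once the properties are restored.
        $this->connect();
    }

    public function isConnected()
    {
        return is_resource($this->handle);
    }
}

$copy = unserialize(serialize(new LogStore('php://memory')));
var_dump($copy->isConnected()); // bool(true)
```

Leaving the handle out in __sleep keeps the serialized string free of state that would be meaningless in another process.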

3. Custom Serializable interface

    // Signature of the built-in interface (already defined by PHP)
    interface Serializable
    {
        public function serialize();              // returns a string
        public function unserialize($serialized); // receives that string
    }

Through this interface we can customize the serialization and deserialization behavior, which can be used to customize our serialization format.

    class fobnn implements Serializable
    {
        public $hack_id;
        private $hack_name;

        public function __construct($name, $id)
        {
            $this->hack_name = $name;
            $this->hack_id = $id;
        }

        public function print()
        {
            echo $this->hack_name . PHP_EOL;
        }

        public function __sleep()
        {
            return array('hack_name');
        }

        public function __wakeup()
        {
            $this->hack_name = 'haha';
        }

        public function serialize()
        {
            return json_encode(array('id' => $this->hack_id, 'name' => $this->hack_name));
        }

        public function unserialize($var)
        {
            $array = json_decode($var, true);
            $this->hack_name = $array['name'];
            $this->hack_id = $array['id'];
        }
    }

    $obj = new fobnn('fobnn', 1);
    $obj->print();

    $serializedstr = serialize($obj);
    echo $serializedstr . PHP_EOL;

    $toobj = unserialize($serializedstr);
    $toobj->print();

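    fobnn
    C:5:"fobnn":23:{{"id":1,"name":"fobnn"}}
    fobnn

When a class implements the Serializable interface, the magic methods __sleep and __wakeup are no longer called. Note the C tag in the output: it marks a custom-serialized object, followed by the length of the payload (23) and the payload produced by our serialize method.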
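Newer PHP versions (7.4 and later) also offer the magic methods __serialize and __unserialize, and the Serializable interface itself is deprecated as of PHP 8.1. The sketch below shows the same idea with that mechanism; the class name Hacker and the getName method are hypothetical and not part of the original example.

```php
<?php
// Hypothetical class; requires PHP 7.4 or later.
class Hacker
{
    public $hack_id;
    private $hack_name;

    public function __construct($name, $id)
    {
        $this->hack_name = $name;
        $this->hack_id = $id;
    }

    public function getName()
    {
        return $this->hack_name;
    }

    public function __serialize(): array
    {
        // Return the data to store as a plain array.
        return ['id' => $this->hack_id, 'name' => $this->hack_name];
    }

    public function __unserialize(array $data): void
    {
        // Rebuild the object state from that array.
        $this->hack_id = $data['id'];
        $this->hack_name = $data['name'];
    }
}

$str = serialize(new Hacker('fobnn', 1));
echo $str . PHP_EOL; // O:6:"Hacker":2:{s:2:"id";i:1;s:4:"name";s:5:"fobnn";}

$back = unserialize($str);
echo $back->getName() . PHP_EOL; // fobnn
```

Unlike Serializable, this variant keeps the ordinary O format and lets PHP encode the returned array, so the class does not have to define its own string format.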

4. PHP dynamic typing and PHP deserialization

As mentioned above, the serialized result is self-describing and stores the type of each value. Since PHP is also a dynamically typed language, we can run a simple experiment.

    class fobnn
    {
        public $hack_id;
        public $hack_name;

        public function __construct($name, $id)
        {
            $this->hack_name = $name;
            $this->hack_id = $id;
        }

        public function print()
        {
            var_dump($this->hack_name);
        }
    }

    $obj = new fobnn('fobnn', 1);
    $obj->print();

    $serializedstr = serialize($obj);
    echo $serializedstr . PHP_EOL;

    $toobj = unserialize($serializedstr);
    $toobj->print();

    $toobj2 = unserialize("O:5:\"fobnn\":2:{s:7:\"hack_id\";i:1;s:9:\"hack_name\";i:12345;}");
    $toobj2->print();

In the string passed to the last unserialize call, we changed hack_name to a value of type int, i:12345.

    string(5) "fobnn"
    O:5:"fobnn":2:{s:7:"hack_id";i:1;s:9:"hack_name";s:5:"fobnn";}
    string(5) "fobnn"
    int(12345)

As you can see, the object is deserialized successfully and works normally, even though hack_name now holds an int instead of a string. This mechanism gives PHP a lot of flexibility, but it also introduces security risks. Later articles in this series continue with the security problems caused by PHP's serialization and deserialization features.
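One way to reduce the risk, sketched here under assumptions, is to restrict which classes unserialize may instantiate (the allowed_classes option, available since PHP 7.0) and to check member types after deserialization. The class Profile is a hypothetical stand-in for the example class.

```php
<?php
// Hypothetical class used only for this sketch.
class Profile
{
    public $hack_id;
    public $hack_name;
}

// Payload whose hack_name was changed to an int, as in the experiment.
$payload = 'O:7:"Profile":2:{s:7:"hack_id";i:1;s:9:"hack_name";i:12345;}';

// Allow only the expected class, then validate the member types.
$obj = unserialize($payload, ['allowed_classes' => ['Profile']]);
$valid = $obj instanceof Profile
    && is_int($obj->hack_id)
    && is_string($obj->hack_name);
var_dump($valid); // bool(false)

// With no classes allowed, the object is not instantiated as Profile.
$blocked = unserialize($payload, ['allowed_classes' => false]);
echo get_class($blocked) . PHP_EOL; // __PHP_Incomplete_Class
```

This does not make deserializing untrusted input safe in general; it only narrows what a tampered string can produce.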
