Python Source Code Analysis reading notes: Chapter 3, string objects

Source: Internet
Author: User
Chapter 3: String objects

String object definition:
typedef struct {
    PyObject_VAR_HEAD
    long ob_shash;
    int ob_sstate;
    char ob_sval[1];
} PyStringObject;
Because a string is a variable-length object, the struct begins with the variable-length object header (PyObject_VAR_HEAD), which includes the ob_size field that records the length.
ob_shash caches the hash value of the string, so the hash does not have to be recomputed every time the string is used as a key in a dict lookup.
ob_sstate indicates whether the string has been interned.
Finally, ob_sval is a neat trick. It is declared as a one-element character array, but in practice it marks the start of the string buffer, which lives in the same memory block as the object header. When allocating a string object, Python requests sizeof(PyStringObject) plus the string length. Python requires the string buffer to end with '\0', and the one byte already declared for ob_sval leaves room for that terminator. The buffer that starts at ob_sval is therefore exactly length + 1 bytes long and sits at the tail of the string object.
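The size arithmetic above can be checked with a small model. This is only a sketch: PyStringObjectModel, allocation_size, and buffer_capacity are hypothetical names, the three header fields stand in for what PyObject_VAR_HEAD expands to in a non-debug build, and the absolute sizes depend on the platform.

```python
import ctypes


class PyStringObjectModel(ctypes.Structure):
    """Illustrative ctypes model of the struct above, not the real CPython type."""
    _fields_ = [
        # Assumed expansion of PyObject_VAR_HEAD
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
        ("ob_size", ctypes.c_ssize_t),
        # Fields from the notes
        ("ob_shash", ctypes.c_long),
        ("ob_sstate", ctypes.c_int),
        ("ob_sval", ctypes.c_char * 1),
    ]


def allocation_size(length):
    """Bytes requested for a string of `length` characters: struct size + length."""
    return ctypes.sizeof(PyStringObjectModel) + length


def buffer_capacity(length):
    """Bytes available from the start of ob_sval to the end of the allocation."""
    return allocation_size(length) - PyStringObjectModel.ob_sval.offset


# Every extra character costs exactly one extra byte.
print(allocation_size(5) - allocation_size(0))
# The buffer always has room for the characters plus the trailing '\0'.
print(buffer_capacity(5) >= 5 + 1)
```

Because the one-byte ob_sval array is already counted in the struct size, adding only the string length still leaves space for the terminator.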

The intern mechanism of string objects:
To save memory, Python avoids keeping two separate objects for interned strings that have the same value. The intern mechanism is maintained by a global intern dictionary. When a newly created string is interned, the system looks it up in the intern dictionary. If an equal string already exists, the reference is redirected to the object in the dictionary and that object's reference count is increased. If not, the new string is added to the dictionary.
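The lookup-or-insert behaviour described above can be sketched with an ordinary dict. The names _intern_table and intern_string are hypothetical and only model the idea; sys.intern is the real Python 3 entry point for interning.

```python
import sys

# Hypothetical stand-in for the global intern dictionary.
_intern_table = {}


def intern_string(s):
    """Return the canonical object for s, registering s if it is new."""
    # setdefault returns the stored object when an equal key already exists,
    # otherwise it stores s and returns it.
    return _intern_table.setdefault(s, s)


# Build two equal strings at run time.
a = "".join(["py", "thon"])
b = "".join(["py", "thon"])

x = intern_string(a)
y = intern_string(b)
print(x is y)  # both names now refer to the single registered object

# The built-in mechanism gives the same guarantee.
print(sys.intern(a) is sys.intern(b))
```

Once two strings are interned, equality can be decided by comparing pointers instead of comparing characters.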

Like integer objects, string objects also have an object pool, which gives fast access to single-character strings. When a string is created, the PyString_FromString function checks its length. If the length is 1, the string object is created first, then interned, and finally its pointer is saved in the characters object pool. Later requests for the same one-character string return the pointer directly from the pool.
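A toy model of the pool may make the order of steps clearer. Everything here (ToyString, _characters, _interned, toy_from_string) is a hypothetical name; the sketch imitates the behaviour described above and is not CPython's implementation.

```python
class ToyString:
    """Minimal stand-in for a string object."""

    def __init__(self, value):
        self.value = value


_characters = [None] * 256  # one slot per single-byte character
_interned = {}              # stand-in for the intern dictionary


def toy_from_string(text):
    """Create a ToyString, reusing pooled objects for one-character strings."""
    poolable = len(text) == 1 and ord(text) < 256
    if poolable and _characters[ord(text)] is not None:
        # Fast path: the object is already in the pool.
        return _characters[ord(text)]

    obj = ToyString(text)  # step 1: create the object
    if poolable:
        obj = _interned.setdefault(text, obj)  # step 2: intern it
        _characters[ord(text)] = obj           # step 3: save it in the pool
    return obj


first = toy_from_string("a")
second = toy_from_string("a")
print(first is second)  # the second call is served from the pool
```

Longer strings skip the pool entirely, so each call with a multi-character string creates a fresh object in this model.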

Efficiency of string concatenation:
There are two ways to concatenate strings: the "+" operator and the str.join(iterable) method. The "+" operator calls the string_concat function, which allocates a new block of memory, copies both strings into it, and returns a new string object. Concatenating N strings this way therefore calls string_concat N-1 times, which means N-1 allocations and repeated copying of the characters accumulated so far, so it is very inefficient. str.join(iterable) is the recommended alternative. It accepts a list, tuple, or other iterable, counts the strings and their total length, allocates memory once, copies every string into the new memory, and returns a single string object. The latter is much more efficient.
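The difference in copying work can be estimated with a simplified cost model. The function names below are hypothetical, and the model assumes every "+" allocates a fresh buffer and copies both operands, as described above; it ignores any optimisation a particular interpreter may apply.

```python
def chars_copied_by_plus(parts):
    """Characters copied when parts are joined pairwise with '+'."""
    copied = 0
    length = 0
    for index, part in enumerate(parts):
        length += len(part)
        if index > 0:
            # Each '+' copies the accumulated prefix and the new part.
            copied += length
    return copied


def chars_copied_by_join(parts):
    """Characters copied when the result is allocated once and filled."""
    return sum(len(part) for part in parts)


parts = ["ab"] * 4

# Both approaches build the same string.
plus_result = parts[0] + parts[1] + parts[2] + parts[3]
join_result = "".join(parts)
print(plus_result == join_result)

# Three '+' operations copy 4 + 6 + 8 characters; join copies 8.
print(chars_copied_by_plus(parts))  # 18
print(chars_copied_by_join(parts))  # 8
```

In this model the work done by "+" grows roughly with the square of the number of parts, while the work done by join grows linearly.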
