What is the string interning (string resident) and the string intern mechanism in python, interningintern
Incomputer science, string interning is a method of storing only onecopy of each distinct string value, which must be immutable. interning strings makes some stringprocessing tasks more time-or space-efficient at the cost of requiring moretime when the string is created or interned. the distinct values are stored ina string intern pool. -- imported from Wikipedia
That is to say, a string object with the same value will only save one copy, which is shared. This also determines that the string must be an immutable object. Think about it. Just like the numeric type, you only need to save a copy of the same numeric value, and there is no need to distinguish it with different objects.
The string in python adopts the intern mechanism and will automatically intern.
> A = 'yzc'
> B = 'K' + 'zc'
> Id ()
55704656
> Id (B)
55704656
We can see that they are the same object.
The advantage of the intern mechanism is that strings with the same value (such as identifiers) are used directly from the pool to avoid frequent creation and destruction, improve efficiency, and save memory. The disadvantage is that splicing strings and modifying strings affect performance. Because it is immutable, modifying the string is not an inplace operation. You need to create a new object. This is why join () is not recommended when splicing multiple strings (). Join () is to calculate the length of all strings first, then copy them one by one, and only create the object once.
Be carefulPitfallNot all strings use the intern mechanism.Only contains underlines, numbers, and lettersIs intern.
> A = 'Hello world'
> B = 'Hello world'
> Id ()
56400384
> Id (B)
56398336
Here, there are spaces, and all are not intern.
But why? Since pythonBuilt-in function intern ()Allows you to explicitly intern any string. It is not a problem of implementation difficulty.
The answer can be found in the comment in source code stringobject. h,
/* ...... This is generally restricted tostrings that"Looklike" Python identifiers, Although the intern () builtincan be used to force interning of any string ......*/
That is to say, only intern is performed for those that look like python identifiers.
Next let's look at anotherPitfall,
Example 1.
> 'Yz' + 'C' is 'yzc'
True
Example 2.
> S1 = 'kz'
> S2 = 'yzc'
> S1 + 'C' is 'kzc'
False
Why is the second chestnut False, which only contains letters? Shouldn't it be automatically intern?
This is because in the first Chestnut, 'kz' + 'C' is evaluated in compile time and replaced with 'kzc '.
In the second case, s1 + 'C' is spliced at run-time, resulting in no automatic intern.