About pointers and stacks

Source: Internet
Author: User

Reprint: http://blog.qdac.cc/?p=2804

"Code God" [Changchun]swish (109867294) 21:17:40

The point of this piece is to understand one thing: from the CPU's point of view, the only data our program can manipulate directly is the handful of values in its registers. Everything else, memory and whatever is on disk, is external to the CPU core.

"Code God" [Changchun]swish (109867294) 21:19:07
It is like a person: your brain directly controls only your own body and limbs, and you affect the people and things around you through your body and language.

"Code Emperor" "Ningbo" Empty (9534557) 21:19:09
A variable is a nickname for an address.

"Code God" [Changchun]swish (109867294) 21:19:32
Right. The CPU does not manipulate memory directly.

"Code God" [Changchun]swish (109867294) 21:20:24
So here is the question: what do we have to do to manipulate a piece of memory?

"Code God" [Changchun]swish (109867294) 21:21:54
To call on a person, you first have to know where that person is. If you can't even find them, manipulating them is a daydream. Likewise, you only get the chance to manipulate a piece of memory once you know where it is.

"Code of God" [Shenzhen] Voice small white (2514718952) 21:22:08
@[Changchun]swish So that's why, when we look at the disassembly, there's a bunch of mov eax, [xxxx]

"Code of God" [Shenzhen] Voice small white (2514718952) 21:22:44
The memory address must be placed in the register before the operation can be performed

"Code God" [Changchun]swish (109867294) 21:23:33
What we use to mark the location of a memory block is its address. A variable that stores such an address is what programming languages call a pointer.

"Code God" [Changchun]swish (109867294) 21:24:17
In other words, a pointer is a variable that a programming language uses to store the address of a piece of memory.

"Code God" [Changchun]swish (109867294) 21:25:11
So here's the question: the variable itself also takes up memory, so it has its own address too. Where does that address live?

"Code God" [Changchun]swish (109867294) 21:27:15
When we declare a pointer-type variable, 99% of the time it lives on the stack; the remaining 1% of the time it lives on the heap.

"Code God" [Changchun]swish (109867294) 21:27:52
For example:
var
  P: Pointer;
...

"Code God" [Changchun]swish (109867294) 21:28:15

P itself lives on the stack, not on the heap.

"Code God" [Changchun]swish (109867294) 21:29:26
But the interesting part is that the address stored in the pointer is, 99% of the time, on the heap.
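
The split between "where the pointer lives" and "where it points" can be sketched in C++ rather than Delphi. This is only an illustration: the function name is made up for the example, and `&p` plays the role of @P while `p.get()` plays the role of the value stored in P.

```cpp
#include <cassert>
#include <memory>

// Returns true if the pointer variable and the memory it points to
// live at different addresses.
bool pointer_and_pointee_are_distinct() {
    // `p` is a local variable, so the pointer itself sits in this stack frame.
    // The int it points to is allocated on the heap by make_unique.
    std::unique_ptr<int> p = std::make_unique<int>(42);
    assert(*p == 42);

    const void* address_of_pointer = &p;       // like @P in Delphi
    const void* address_in_pointer = p.get();  // like the value of P

    return address_of_pointer != address_in_pointer;
}
```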

"Code God" [Changchun]swish (109867294) 21:31:16
Let's use a simple example to illustrate pointers:
var
  S: UnicodeString;
begin
  SetLength(S, 0);
  SetLength(S, 100);
  S := 'abc';
  S := 'def';
end;

"Code God" [Changchun]swish (109867294) 21:31:47
This example is simple. UnicodeString is the type that the default String maps to in the XE series.

"Code God" [Changchun]swish (109867294) 21:32:24
Note here that a String is not a simple type.

"Code Demon" [Qingdao] Itsuki (345148965) 21:33:05
A string is a special object with its own life cycle

"Code God" [Changchun]swish (109867294) 21:33:11
To understand the nature of the String type, let's bring in its corresponding C++ definition.

"Code God" [Changchun]swish (109867294) 21:34:34
We'll just pick out the core part.
"Code God" [Changchun]swish (109867294) 21:34:35
#pragma pack(push,1)
  struct StrRec {
#ifdef _WIN64
    int _Padding;
#endif /* _WIN64 */
    unsigned short codePage;
    unsigned short elemSize;
    int refCnt;
    int length;
  };
#pragma pack(pop)

  const StrRec& GetRec() const;
  StrRec& GetRec();

private:
  WideChar *Data;
"Code God" [Changchun]swish (109867294) 21:35:12
That is its definition. You can see that it takes the form of a StrRec record plus a Data pointer.
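
To make the "record plus data pointer" shape concrete, here is a small C++ sketch. It assumes the 32-bit layout from the definition above and assumes the header sits immediately in front of the character data, which is how the Delphi RTL is generally described. The helper names `make_block` and `length_from_data` are invented for the example, and 1200 is used as the UTF-16LE code page value.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

#pragma pack(push, 1)
struct StrRec {               // 32-bit layout, no _Padding member
    std::uint16_t codePage;
    std::uint16_t elemSize;
    std::int32_t  refCnt;
    std::int32_t  length;
};
#pragma pack(pop)
static_assert(sizeof(StrRec) == 12, "packed header is 12 bytes");

// Hypothetical helper: builds header + UTF-16 payload + terminator in one block.
// The "Data" pointer for this block is block.data() + sizeof(StrRec).
std::vector<unsigned char> make_block(const char16_t* text, std::int32_t len) {
    assert(len >= 0);
    std::vector<unsigned char> block(sizeof(StrRec) + (len + 1) * sizeof(char16_t), 0);
    StrRec rec{1200, 2, 1, len};
    std::memcpy(block.data(), &rec, sizeof rec);
    std::memcpy(block.data() + sizeof rec, text, len * sizeof(char16_t));
    return block;
}

// Steps back from the data pointer to the header in front of it and reads length.
std::int32_t length_from_data(const unsigned char* data) {
    assert(data != nullptr);
    StrRec rec;
    std::memcpy(&rec, data - sizeof(StrRec), sizeof rec);
    return rec.length;
}
```

Under this layout, asking for a string's length never scans the characters; it reads one field at a fixed negative offset from Data.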

"Code God" [Changchun]swish (109867294) 21:35:27
Now back to the example:
var
  S: UnicodeString;

"Code God" [Changchun]swish (109867294) 21:35:39
Let's talk about what's going on here.

"Code of God" [Shenzhen] Voice small white (2514718952) 21:36:47
It builds a StrRec on the stack.

"Code God" [Changchun]swish (109867294) 21:36:52
First, because the memory is allocated on the stack, the address of S is an address on the current stack. That address is the address of the variable S, which is what we get when we write @S.

"Code God" [Changchun]swish (109867294) 21:37:19
Voice left out the Data member

"Code God" [Changchun]swish (109867294) 21:37:58
That is, the value of ESP moves by at least the size of StrRec plus Data

"Code of God" [Shenzhen] Voice small white (2514718952) 21:38:08
Well
"Code God" [Changchun]swish (109867294) 21:38:46
If we assign the value that @S yields to a variable, that variable is a pointer-type variable

"Code God" [Changchun]swish (109867294) 21:39:09
Now let's move on and see what SetLength(S, 0) does.

"Code God" [Changchun]swish (109867294) 21:39:43
In fact, as a built-in Delphi type, UnicodeString enjoys a lot of special treatment.

"Code God" [Changchun]swish (109867294) 21:40:07
1. Unlike a normal record, it is initialized automatically;
2. Its value is cleaned up automatically;
3. It manages the release of the Data member based on reference counting;

"Code God" [Changchun]swish (109867294) 21:41:58
So in fact, before SetLength is even called, initialization has already happened quietly: the block of stack memory between the original ESP address and the new ESP address has been set to 0.

"Code God" [Changchun]swish (109867294) 21:42:53
Then SetLength is called. It compares StrRec.length with the length you asked for, and if they are equal it does nothing

"Code God" [Changchun]swish (109867294) 21:45:13
SetLength(S, 100)
What this does is compare the lengths, then use GetMem to allocate memory for 100 characters (for a UnicodeString that is 200 bytes of character data, plus the header and terminator) and save its address in the Data member. Data is a pointer to the new memory: originally it pointed to the empty (nil) address, and now it holds a new address.

"Code God" [Changchun]swish (109867294) 21:46:41
We run to the next step:
S := 'abc';
What's going on here?

"Code God" [Changchun]swish (109867294) 21:49:22
'abc' is first of all a string constant, and it sits in some obscure corner of memory

"Code God" [Changchun]swish (109867294) 21:49:57
A couple of things actually happen here.

"Code God" [Changchun]swish (109867294) 21:53:23
The first instruction saves the address of S in EAX, and the second puts the memory address of the 'abc' constant in EDX

Then the _UStrLAsg function is called to carry out the actual assignment.

"Code God" [Changchun]swish (109867294) 21:54:12
procedure _UStrLAsg(var Dest: UnicodeString; const Source: UnicodeString); // locals
var
  P: Pointer;
begin
  if Pointer(Source) <> nil then
    _UStrAddRef(Pointer(Source));
  P := Pointer(Dest);
  Pointer(Dest) := Pointer(Source);
  _UStrClr(P);
end;

"Code God" [Changchun]swish (109867294) 21:54:29
Well, we can see the source of this function.

"Code God" [Changchun]swish (109867294) 21:56:23

What the code above does is: if the source is not empty, increase the source's reference count by 1, then save the source address into the destination, and finally release the destination's old value with _UStrClr

"Code God" [Changchun]swish (109867294) 21:57:36
But notice the parameter: 'abc' is already a UnicodeString here. So the string constants you define have in fact already become UnicodeStrings at startup.

"Code God" [Changchun]swish (109867294) 21:57:50

Note that WideString does not have a reference count

"Code God" [Changchun]swish (109867294) 21:59:40
So the whole story of what actually happens is:
1. 'abc' is stored away as a constant; at this point its reference count is 1;
2. The address of the data in this string is assigned to the target variable S, and the reference count is increased;
3. The original memory block is cleaned up;
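
The order of operations in _UStrLAsg (add a reference to the source first, swap the pointer, then release the old value) can be sketched in C++. This is not the RTL code: `Block`, `add_ref`, `release` and `assign` are hypothetical stand-ins, and "freeing" is modelled by setting a flag so the effect can be observed.

```cpp
#include <cassert>

struct Block {        // hypothetical stand-in for a StrRec plus its data
    int  refCnt;
    bool freed;
};

void add_ref(Block* b) {
    if (b) ++b->refCnt;
}

void release(Block* b) {
    if (!b) return;
    assert(b->refCnt > 0);
    if (--b->refCnt == 0) b->freed = true;   // a real runtime would free the memory here
}

// Same order as _UStrLAsg: reference the source, swap, then release the old value.
void assign(Block*& dest, Block* source) {
    add_ref(source);
    Block* old = dest;
    dest = source;
    release(old);
}
```

Taking the new reference before releasing the old one is what keeps self-assignment safe in this sketch: if `dest` and `source` are the same block, the count goes up before it comes back down, so it never reaches zero in between.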

"Code God" [Changchun]swish (109867294) 22:01:01
OK, now let's get back to pointers.

"Code God" [Changchun]swish (109867294) 22:01:41
Let's see what has changed and what has stayed the same.

"Code God" [Changchun]swish (109867294) 22:02:11
Throughout the whole process, the stack position where S lives is constant, so the value of @S is fixed

"Code God" [Changchun]swish (109867294) 22:02:43
Whether you assign a value to S or delete an element, @S is always the same as it was when your function was entered.

"Code God" [Changchun]swish (109867294) 22:04:02
Similarly, because the addresses of the individual members between @S and @S + SizeOf(S) are constant, the address of the Data pointer itself is constant too. The only thing that changes is the content stored at Data's address

"Code God" [Changchun]swish (109867294) 22:07:26
Let's simplify. Assume the stack addresses start at 0. We won't work out the real address of Data itself; we just pick one arbitrarily, say 16. After initialization is complete, this memory block looks like this:
offset: 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19
value:   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
At this point, the content of Data, the four bytes at offsets 16-19, is 0 0 0 0
When we use SetLength, assignment and so on, Data stores a new address. Assuming this address is 1, the contents above change to
offset: 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19
value:   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0

"Code God" [Changchun]swish (109867294) 22:07:55
The reason the 1 comes first is that Windows on x86 is little-endian: the low-order byte goes in front. That's the short version

"Code God" [Changchun]swish (109867294) 22:08:28
At this point the address @Data is unchanged; the content of Data has changed and become 1
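
Both points, that the slot's own address stays put and that the low-order byte comes first, can be checked with a short C++ sketch. The function names are invented for the example, and the byte order shown by `bytes_of` is whatever the host machine uses (little-endian on x86).

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Copies the in-memory bytes of a 32-bit value, lowest address first.
std::array<unsigned char, 4> bytes_of(std::uint32_t value) {
    std::array<unsigned char, 4> out{};
    std::memcpy(out.data(), &value, sizeof value);
    return out;
}

// Changing what a slot holds does not change where the slot is.
bool slot_address_is_stable() {
    std::uint32_t data = 0;        // plays the role of the Data member
    const void* before = &data;    // like @Data
    data = 1;                      // like storing a new address in Data
    assert(bytes_of(data) != bytes_of(0));
    return before == &data;
}
```

On a little-endian host, `bytes_of(1)` yields 1 0 0 0, which matches the row of values in the table above.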

"Code God" [Changchun]swish (109867294) 22:09:51
Now let's continue. Since a pointer points to one address, then by the principle of "one radish, one pit", when you access data through the pointer you can only ever directly access the current element

"Code God" [Changchun]swish (109867294) 22:10:36
But the radish can be moved from pit to pit, so the address can jump backward and it can jump forward

"Code God" [Changchun]swish (109867294) 22:12:56
Speaking of which, we need to know the rules for jumping between pits.

"Code Demon" [Qingdao] Itsuki (345148965) 22:13:00
Is the Data content on the heap or the stack?

"Code Emperor" "Ningbo" Empty (9534557) 22:13:11
@[Qingdao] Amu

"Code God" [Changchun]swish (109867294) 22:13:55
The rule for a typed pointer is that it only jumps by whole elements, never by half of one

"Code Demon" [Zhejiang] Empty number Waxberray (286195153) 22:14:30
How big can a heap allocation be?

"Code God" [Changchun]swish (109867294) 22:15:13
Leaving aside the difference between 32-bit and 64-bit programs, the size of the heap is limited by the available physical memory

"Code God" [Changchun]swish (109867294) 22:15:34
We've talked about this before.

"Code Demon" [Zhejiang] Empty number Waxberray (286195153) 22:15:34
What about the stack?

"Code God" [Changchun]swish (109867294) 22:15:45
The stack defaults to 1 MB

"Code God" [Changchun]swish (109867294) 22:15:57
It can be changed in the project settings

"Code God" [Changchun]swish (109867294) 22:16:14
Let's carry on; we'll discuss heap and stack issues later.

"Code God" [Changchun]swish (109867294) 22:17:16
For example, take a pointer P to a 32-bit integer type: when you add 1 it jumps 4 bytes, and when you subtract 1 it jumps -4 bytes, never any other amount

"Code God" [Changchun]swish (109867294) 22:17:31
Similarly, a PByte pointer jumps one byte at a time

"Code God" [Changchun]swish (109867294) 22:18:18
It is equivalent to:
P := Pointer(NativeInt(P) + SizeOf(P^));
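
The same stepping rule holds for typed pointers in C++, so it can be checked there. A minimal sketch, with a made-up helper name, that measures how many bytes `p + 1` actually covers for a given element type:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Distance in bytes covered by `p + 1` for a pointer to T.
template <typename T>
std::ptrdiff_t step_in_bytes() {
    T items[2] = {};
    T* p = items;
    T* next = p + 1;               // one whole element forward
    assert(next > p);
    return reinterpret_cast<const char*>(next) - reinterpret_cast<const char*>(p);
}
```

For a 32-bit integer type the result is 4 and for a byte type it is 1, which are the two cases mentioned in the chat.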

"Code God" [Changchun]swish (109867294) 22:20:26
Because the step of a typed pointer is fixed, and the memory block the pointer's address belongs to is contiguous (99.9% of the time), you can jump more than one pit at a time. Of course, if you jump past the boundary and fall off, that is your own business.

"Code God" [Changchun]swish (109867294) 22:20:36
This is also where the risk of using pointers lies

"Code Demon" [Zhejiang] Empty number Waxberray (286195153) 22:21:39
So how do you avoid this risk?

"Code God" [Changchun]swish (109867294) 22:21:50
Because a pointer can jump around freely, C++ classifies it as a random-access iterator. We won't expand on the concept of random-access iterators; just knowing the term is enough.

"Code God" [Changchun]swish (109867294) 22:22:11
You will learn about these if you take an interest in the C++ STL later.

"Code of God" [Shenzhen] Voice small white (2514718952) 22:22:26
Check that it has not gone past the upper limit

"Code of God" [Shenzhen] Voice small white (2514718952) 22:22:31
P + Len
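
The "P + Len" check can be written out as a small C++ sketch. The helper name `in_bounds` is invented here, and the comparison is only meaningful when `p` was derived from the same block that starts at `first`.

```cpp
#include <cassert>
#include <cstddef>

// True if `p` may be dereferenced inside the block [first, first + len).
template <typename T>
bool in_bounds(const T* first, std::size_t len, const T* p) {
    assert(first != nullptr);
    return p >= first && p < first + len;   // first + len is one past the end
}
```

Note that `first + len` itself is a valid address to compare against but not one to read from, so the upper comparison is strict.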

"Code God" [Changchun]swish (109867294) 22:22:39
This risk has to be controlled by the programmer; the compiler can't control it for you.

"Code God" [Changchun]swish (109867294) 22:23:54
When you go out of bounds, the most typical result is an AV (access violation) error

"Code God" [Changchun]swish (109867294) 22:24:39
But note that it does not happen 100% of the time; you can't say that going out of bounds always causes one

"Code God" [Changchun]swish (109867294) 22:24:59
That more or less covers pointers.

"Code God" [Changchun]swish (109867294) 22:25:10
Now let's come back and briefly explain what the heap and the stack are

"Code God" [Changchun]swish (109867294) 22:26:23
The memory on the stack is pre-allocated: it is there for you whether you use it or not.

"Code Demon" [Zhejiang] Empty number Waxberray (286195153) 22:26:58
Why are stacks designed like this?

"Code God" [Changchun]swish (109867294) 22:27:09
The memory on the heap is on demand: you request it only when you need it, the operating system allocates it to you, and when you are done you are responsible for giving it back

"Code God" [Changchun]swish (109867294) 22:27:44
Why it is designed this way would be another long lecture. In short: for convenience and performance

"Code God" [Changchun]swish (109867294) 22:28:57
Because the stack is a contiguous pre-allocated block, declaring a new variable has almost no extra overhead: the only thing to do is adjust the value of ESP, and the allocation is done. To release, ESP is adjusted again, and the release is done. Of course, this is for simple types.
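
The idea that stack allocation is "just adjust one value" can be modelled with a toy C++ class. This is a simplification under stated assumptions: `ToyStack` is invented for the example, its `top_` offset plays the role of ESP, and it grows upward for readability, whereas the real x86 stack grows toward lower addresses.

```cpp
#include <cassert>
#include <cstddef>

class ToyStack {
public:
    // Reserves `size` bytes; returns their address, or nullptr if the stack is full.
    void* push(std::size_t size) {
        if (size > sizeof(buffer_) - top_) return nullptr;
        void* p = buffer_ + top_;
        top_ += size;              // "allocation" is just this adjustment
        return p;
    }
    // Releases the most recently reserved `size` bytes.
    void pop(std::size_t size) {
        assert(size <= top_);
        top_ -= size;              // "release" is the reverse adjustment
    }
    std::size_t used() const { return top_; }

private:
    unsigned char buffer_[1024] = {};  // pre-allocated whether it is used or not
    std::size_t top_ = 0;
};
```

Nothing is searched for and nothing is handed back to the operating system here, which is why this kind of allocation is so cheap compared with a heap request.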

"Code God" [Changchun]swish (109867294) 22:29:36
In the case of the heap, because memory is requested as it is needed, an allocation is limited only by the size of the contiguous memory blocks available in physical memory.

"Code God" [Changchun]swish (109867294) 22:30:09
The theoretical upper limit of a memory request on the heap is the size of the largest contiguous block in your physical memory.

"Code Emperor" "Ningbo" Empty (9534557) 22:30:36
@[Shenzhen] Voice small white This is why I have been suffering ever since moving to XE7 this month. I struggled with it for a long time.

"Code God" [Changchun]swish (109867294) 22:30:54
Because memory on the heap is repeatedly requested and released, this gives rise to the concept of memory fragmentation, but that is outside the scope of our discussion.

"Code God" [Changchun]swish (109867294) 22:31:10
Let's stop here for now. Any questions?

Question: Is the Length of a UnicodeString the number of characters?

"Code God" [Changchun]swish (109867294) 22:34:12
The number of characters in a UnicodeString and its Length are not necessarily equal

"Code of God" [Shenzhen] Voice small white (2514718952) 22:34:19
Then what is it?

"Code Emperor" "Ningbo" Empty (9534557) 22:34:40
No, it isn't the character count. I'm not entirely sure about this either, because what gets counted is WideChar units, and 1 character may take 2 WideChars

"Code Demon" [Zhejiang] Empty number Waxberray (286195153) 22:34:46
I always thought it was the number of characters

"Code God" [Changchun]swish (109867294) 22:35:06
Unicode has the concept of a so-called extension area

"Code God" [Changchun]swish (109867294) 22:35:15
Characters in the extension area take 4 bytes

"Code of God" [Shenzhen] Voice small white (2514718952) 22:35:20
Well, for the Chinese, English and digits we normally use, it is right: it is the number of characters

"Code God" [Changchun]swish (109867294) 22:35:24
$D800~$DFFF
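
The difference between length in 16-bit units and number of characters can be sketched in C++ with `std::u16string`, which also stores UTF-16 code units. The function name is made up for the example; it counts a high surrogate ($D800~$DBFF) followed by a low surrogate ($DC00~$DFFF) as one character.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-16 string.
std::size_t count_code_points(const std::u16string& s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const bool high = s[i] >= 0xD800 && s[i] <= 0xDBFF;
        const bool low_next =
            i + 1 < s.size() && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF;
        if (high && low_next) ++i;   // skip the second half of the pair
        ++count;
    }
    assert(count <= s.size());
    return count;
}
```

For example, the string `u"a\U00020000"` has a size of 3 code units but contains only 2 characters, because U+20000 lies in the extension area and needs a surrogate pair.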

"Code of God" [Shenzhen] Voice small white (2514718952) 22:35:40
It seems it might not be the same for other languages.

"Code God" [Changchun]swish (109867294) 22:35:41
Voice, there can be a problem even when it comes to Chinese.

"Code God" [Changchun]swish (109867294) 22:36:02
Let me give you a character.

"Code of God" [Shenzhen] Voice small white (2514718952) 22:36:05
In a UnicodeString, can a Chinese character ever be something other than two bytes?

"Code of God" [Shenzhen] Voice small white (2514718952) 22:36:50
Is that the exception the group owner mentioned?

"Code God" [Changchun]swish (109867294) 22:37:00
I'd guess many people don't know this character:

"Code God" [Changchun]swish (109867294) 22:46:23
One more hint: in GBK (with the 4-byte forms coming from its GB18030 extension), a Chinese character is also 2 or 4 bytes

"Code God" [Changchun]swish (109867294) 22:46:38
GB2312 is a subset of GBK

Question 2: What is the most space-saving encoding?

The most space-saving is the GBK encoding; this was not explained last night because it was too late. Many people think UTF-8 is the most space-saving, but in fact it is not. UTF-8 and GBK both use 1 byte for English, but for Chinese, GBK uses either 2 bytes or (very rarely) 4 bytes, while UTF-8 generally uses 3 to 4 bytes. Taken together, GBK actually takes up less space. The disadvantage of GBK is that it was developed from GB2312, so converting it to Unicode requires a table lookup, and its universality is slightly worse.
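
The UTF-8 side of that comparison follows directly from the encoding rules and can be sketched in C++. The function name is invented for the example, and surrogate code points are not filtered out in this sketch.

```cpp
#include <cassert>

// Number of bytes UTF-8 needs for one Unicode code point.
int utf8_length(char32_t cp) {
    assert(cp <= 0x10FFFF);       // anything above this is outside Unicode
    if (cp < 0x80) return 1;      // ASCII, e.g. English letters and digits
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;   // most commonly used Chinese characters
    return 4;                     // extension-area characters
}
```

So an English letter costs 1 byte, a common Chinese character such as U+4E2D costs 3 bytes, and an extension-area character such as U+20000 costs 4 bytes, against the 2 bytes GBK spends on a common Chinese character.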
