Linux-0.11 Kernel Source Code Analysis Series: Memory Management, get_free_page() Function Analysis

Source: Internet
Author: User

The memory management module is one of the harder parts of the Linux-0.11 source to understand, so I am publishing my personal understanding of it here.
First up is an analysis of the get_free_page() function in Linux-0.11 kernel memory management.

I will write about other functions or files when I have time :)

/*
 * author:  davidlin
 * date:    2014-11-11 pm
 * email:   [email protected] or [email protected]
 * world:   the City of SZ, in China
 * ver:     000.000.001
 * history: editor      time          do
 *          1) Linpeng  2014-11-11    Created this file!
 *          2)
 */

Here is Linus's source code:

    /*
     * Get physical address of first (actually last :-) free page, and mark it
     * used. If no free pages left, return 0.
     */
    unsigned long get_free_page(void)
    {
    register unsigned long __res asm("ax");

    __asm__("std ; repne ; scasb\n\t"
        "jne 1f\n\t"
        "movb $1,1(%%edi)\n\t"
        "sall $12,%%ecx\n\t"
        "addl %2,%%ecx\n\t"
        "movl %%ecx,%%edx\n\t"
        "movl $1024,%%ecx\n\t"
        "leal 4092(%%edx),%%edi\n\t"
        "rep ; stosl\n\t"
        "movl %%edx,%%eax\n"
        "1:"
        :"=a" (__res)
        :"0" (0),"i" (LOW_MEM),"c" (PAGING_PAGES),
        "D" (mem_map+PAGING_PAGES-1)
        :"di","cx","dx");
    return __res;
    }

1. Function Purpose:
Search mem_map[0 .. (PAGING_PAGES-1)] for a free entry, that is, an entry with mem_map[i] == 0.
If one is found, mark it as used and return the physical address of that page; if none is found, return 0.
2. Tips:
Why is this code written as assembly embedded in C?
My personal view is that a plain C function would set up a stack frame, and the task stack might be polluted.
In addition, the function is called frequently, and register-level assembly instructions can run more efficiently than the equivalent C :)
3. Code Analysis:
(1) register unsigned long __res asm("ax");
__res is a register variable whose value lives in the AX register (EAX in 32-bit code), so operating on __res is the same as operating on that register. This is done for efficiency.
(2) __asm__("std ; repne ; scasb\n\t"
Repeated comparison: search for an entry with mem_map[i] == 0.
std sets DF = 1, so scasb steps downward. The instruction involves three registers, AL, ECX and ES:(E)DI, which are initialized by the operand lists at the end of the function:
: "0" (0), "i" (LOW_MEM), "c" (PAGING_PAGES),
"D" (mem_map+PAGING_PAGES-1)
: "di", "cx", "dx");
That is:

AL = 0; mem_map[i] == 0 denotes a free page and any other value an allocated page, so AL holds 0 as the value to compare against.

ECX = PAGING_PAGES; the number of pages in main memory.

ES:DI = (mem_map+PAGING_PAGES-1); the last entry of the memory management array.
The instruction starts at the last entry of the array mem_map[0 .. (PAGING_PAGES-1)],
mem_map[PAGING_PAGES-1], and compares each mem_map[i] with 0 (the 0 is held in the AL register).
After every comparison, ES:DI is decremented by 1 and ECX is decremented by 1. If the entry is not equal to 0, the scan continues with the next lower entry, like mem_map[i--], until ECX == 0.
If the entry is equal to 0, the loop ends.

An equivalent implementation in C looks roughly like this:

    ......
    index_ = -1;
    for (i = PAGING_PAGES-1; i >= 0; i--) {
        if (0 != mem_map[i]) {
            continue;       /* page in use: continue the loop */
        } else {
            index_ = i;     /* free page: jump out of the loop */
            break;
        }
    }
    if (-1 == index_) {
        goto label_1;       /* no free page found */
    }
    ......
    label_1:
    ......
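To make the register bookkeeping of step (2) concrete, here is a minimal sketch that models "std; repne; scasb" in portable C. The function name scan_down and the use of an array index in place of the real EDI address are assumptions made for illustration; this is not kernel code.

```c
#include <assert.h>

/*
 * Hypothetical model of "std; repne; scasb" over a byte array.
 * *ecx plays the role of ECX, *edi is an array index standing in
 * for the address in EDI. Returns the final ZF: 1 if a byte equal
 * to al was found, 0 if the count ran out first.
 */
static int scan_down(const unsigned char *map, unsigned char al,
                     unsigned long *ecx, long *edi)
{
    int zf = 0;

    assert(map != 0);
    while (*ecx != 0) {
        zf = (map[*edi] == al); /* scasb: compare AL with the byte at EDI */
        *edi -= 1;              /* DF = 1: EDI steps down after every compare */
        *ecx -= 1;              /* the repeat prefix decrements ECX */
        if (zf)                 /* repne stops as soon as ZF = 1 */
            break;
    }
    return zf;
}
```

In this model, when the scan stops on a match at index i, ECX equals i and EDI already sits one entry below the match. Both facts matter for the instructions that follow the scan.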

(3) "jne 1f\n\t"
If none of the entries in mem_map[0 .. (PAGING_PAGES-1)] equals 0,
jump to label 1 and continue from there. Nf refers to the nearest label N ahead (forward) and Nb to the nearest label N behind (backward), where N is a decimal number.
(4) "movb $1,1(%%edi)\n\t"
Here mem_map[i] == 0: this is the first entry equal to 0 found when scanning mem_map[0 .. (PAGING_PAGES-1)] in reverse order.
Because scasb has already decremented EDI past the matching entry, 1(%%edi), the byte at EDI+1, is that entry. The instruction sets mem_map[i] = 1, marking the page as occupied rather than free.
(5) "sall $12,%%ecx\n\t"
At this point ECX holds the index i of mem_map[i], that is, the relative page number.
Example:
Suppose the last entry of mem_map[0 .. (PAGING_PAGES-1)] is free,
mem_map[PAGING_PAGES-1] == 0, i.e. i == (PAGING_PAGES-1).
Then ECX == PAGING_PAGES-1,
and the relative page address is 4K * (PAGING_PAGES-1).
Each physical page holds 1024 4-byte entries, and a left shift by 12 bits is a multiplication by 4096 (2 to the 12th power).
(6) "addl %2,%%ecx\n\t"
Add the low-memory base address to get the actual physical address.
%2 is LOW_MEM, as defined by the following operand list:
"0" (0), "i" (LOW_MEM), "c" (PAGING_PAGES),
Question:
Why is 4K * (PAGING_PAGES-1) not the actual physical address?
The answer lies in the initialization code shown below:
mem_map[0 .. (PAGING_PAGES-1)] is the main memory management array.
It manages only the 1M-16M range, i.e. PAGING_MEMORY = ((16-1)*1024*1024),

and does not cover 0-1M (of 0-1M, in fact 0-640K is already occupied by the kernel).

        #define LOW_MEM 0x100000
        #define PAGING_MEMORY (15*1024*1024)
        #define PAGING_PAGES (PAGING_MEMORY>>12)
        #define MAP_NR(addr) (((addr)-LOW_MEM)>>12)

        void mem_init(long start_mem, long end_mem)
        {
            int i;

            HIGH_MEMORY = end_mem;
            for (i=0; i<PAGING_PAGES; i++) {
                mem_map[i] = USED;
            } /* all main memory areas are initialized as occupied */
            i = MAP_NR(start_mem);
            end_mem -= start_mem;
            end_mem >>= 12;
            while (end_mem-->0)
                mem_map[i++]=0;
        }
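The relationship between a mem_map index and a physical address can be checked with a small standalone C sketch. The macros are copied from the listing above, with UL suffixes added so that the arithmetic is unsigned; page_to_addr is a hypothetical helper name introduced for illustration, not a kernel function.

```c
#include <assert.h>

#define LOW_MEM        0x100000UL
#define PAGING_MEMORY  (15UL * 1024 * 1024)
#define PAGING_PAGES   (PAGING_MEMORY >> 12)
#define MAP_NR(addr)   (((addr) - LOW_MEM) >> 12)

/*
 * Hypothetical helper: the value that
 * "sall $12,%ecx ; addl $LOW_MEM,%ecx" leaves in ECX
 * when ECX starts out as a mem_map index.
 */
static unsigned long page_to_addr(unsigned long index)
{
    assert(index < PAGING_PAGES);
    return (index << 12) + LOW_MEM; /* 4096 * index, offset by the first 1M */
}
```

With these definitions PAGING_PAGES is 3840, index 0 maps to 0x100000 (1M), the last index 3839 maps to 0xFFF000 (16M minus one page), and MAP_NR is the inverse of page_to_addr.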

(7) "movl %%ecx,%%edx\n\t"
Copy the value of the ECX register, the actual physical address, into the EDX register.


(8) "movl $1024,%%ecx\n\t"
Load 1024 into the ECX register: each page occupies 4096 bytes (4K),
which in physical memory is 1024 entries of 4 bytes each.
(9) "leal 4092(%%edx),%%edi\n\t"
Because each entry occupies 4 bytes and is 4-byte aligned,
the last entry of the current physical page is at offset 4096-4 = 4092 = (1024-1)*4.

Store the address of that last entry in the EDI register,
i.e. the address EDX+4092 is stored in EDI.


(10) "rep ; stosl\n\t"
Starting from EDX+4092, moving in the reverse direction in steps of 4 and repeating 1024 times,
fill all 1024 entries of the physical page with the value of the EAX register.
EAX is initialized to 0 by the following operand list (AL = 0, AX = 0, EAX = 0):
: "0" (0), "i" (LOW_MEM), "c" (PAGING_PAGES),
So all 1024 entries of the physical page are zeroed.
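Steps (8) to (10) can be modelled in portable C as follows. This is only a sketch under stated assumptions: the function name clear_page_backwards is hypothetical, the page is an ordinary buffer rather than physical memory, and EDI is represented as a byte offset from the page start (the role EDX plays in the assembly).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of
 *   movl $1024,%ecx ; leal 4092(%edx),%edi ; rep ; stosl
 * executed with DF = 1. Returns the final EDI as an offset
 * relative to the start of the page.
 */
static long clear_page_backwards(uint8_t *page, uint32_t eax)
{
    uint32_t ecx = 1024;  /* 1024 doublewords = 4096 bytes */
    long edi = 4092;      /* offset of the last doubleword in the page */

    assert(page != NULL);
    while (ecx != 0) {
        memcpy(page + edi, &eax, 4); /* stosl: store EAX at EDI */
        edi -= 4;                    /* DF = 1: EDI moves down by 4 */
        ecx--;                       /* rep decrements ECX */
    }
    return edi;
}
```

After the loop every byte of the page holds the stored value, and the offset has moved one doubleword below the page start, which is why the assembly keeps the page address in EDX instead of recovering it from EDI.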


(11) "movl %%edx,%%eax\n"
Put the start address of the physical page into the EAX register.
Under the x86 C calling convention,
the EAX register is used to hold the function return value.
(12) "1:"
Label 1, the target of the "jne 1f\n\t" statement; jumping here returns 0.
Note:
EAX is assigned only in "movl %%edx,%%eax\n".
Its initial value is 0, so if execution jumps to label "1:",
the return value is 0, indicating that there is no free physical page.
(13) : "=a" (__res)
The list of output operands. It has a single entry, in which a stands for the EAX register.
(14) : "0" (0), "i" (LOW_MEM), "c" (PAGING_PAGES),
"0" means: use the same register as operand 0 above, so "0" refers to the output register EAX.
That is, EAX is both an output register and an input register.
Of course, at the finest time granularity EAX cannot be an input and an output at the same instant;
at any one moment it serves as either an input or an output.

"i" (LOW_MEM) is %2. Operands are numbered %0, %1, %2 ... %n, starting with the outputs and continuing through the inputs.
"i" means an immediate operand. It is not the code for EDI; the code for EDI is "D".

"c" (PAGING_PAGES) means that the ECX register is loaded with PAGING_PAGES;
"c" is the code for the ECX register.

(15) "D" (mem_map+PAGING_PAGES-1)
"D" selects the EDI register, so EDI is loaded with the value (mem_map+PAGING_PAGES-1),
i.e. %%edi = &mem_map[PAGING_PAGES-1].
(16) : "di", "cx", "dx");
The clobber list. It tells the compiler that the three registers "di", "cx" and "dx" are modified by the assembly code,
so the compiler must not rely on them keeping their values across it.
(17) return __res;
Return the value held in __res.
This corresponds to ret in assembly, which implicitly returns the EAX register,
whereas in C the return is explicit.
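Putting steps (1) to (17) together, the overall behaviour can be approximated in portable C. This is a simplified user-space sketch, not the kernel implementation: SIM_PAGES, sim_memory and get_free_page_sim are hypothetical names, the map has only 8 entries instead of PAGING_PAGES, and a static buffer stands in for the physical memory above LOW_MEM.

```c
#include <assert.h>
#include <string.h>

#define LOW_MEM   0x100000UL
#define SIM_PAGES 8            /* assumption: tiny stand-in for PAGING_PAGES */

static unsigned char mem_map[SIM_PAGES];
static unsigned char sim_memory[SIM_PAGES * 4096]; /* simulated page frames */

/*
 * Scan mem_map from the top down, claim the first free entry,
 * zero the matching page and return its (simulated) physical
 * address, or 0 if every entry is in use.
 */
static unsigned long get_free_page_sim(void)
{
    long i;

    for (i = SIM_PAGES - 1; i >= 0; i--) {
        if (mem_map[i] != 0)
            continue;                        /* in use, keep scanning */
        mem_map[i] = 1;                      /* movb $1,1(%edi) */
        memset(sim_memory + ((unsigned long)i << 12), 0, 4096); /* rep stosl */
        return ((unsigned long)i << 12) + LOW_MEM; /* sall $12 ; addl LOW_MEM */
    }
    return 0;                                /* the "jne 1f" path */
}
```

For example, with only entries 2 and 5 free, successive calls return 0x105000, then 0x102000, then 0.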

4. Assembly instructions and syntax rules explained. See the official Intel documents "Volume 2A: Instruction Set Reference (A-M)"
and "Volume 2B: Instruction Set Reference (N-Z)", and the GNU assembler rules.
(1) std:
Sets the direction for ESI and/or EDI to decrement. Its counterpart is cld (which sets the direction to increment).
1) Description
Sets the DF flag in the EFLAGS register. When the DF flag is set to 1, string operations
decrement the index registers (ESI and/or EDI).
This instruction's operation is the same in non-64-bit modes and 64-bit mode.
2) Operation
DF <- 1;
(2) repne:
1) Description
Repeats a string instruction the number of times specified in the count register or
until the indicated condition of the ZF flag is no longer met. The REP (repeat), REPE
(repeat while equal), REPNE (repeat while not equal), REPZ (repeat while zero), and
REPNZ (repeat while not zero) mnemonics are prefixes that can be added to one of
the string instructions. The REP prefix can be added to the INS, OUTS, MOVS, LODS,
and STOS instructions, and the REPE, REPNE, REPZ, and REPNZ prefixes can be
added to the CMPS and SCAS instructions. (The REPZ and REPNZ prefixes are synonymous
forms of the REPE and REPNE prefixes, respectively.) The behavior of the REP
prefix is undefined when used with non-string instructions.
The REP prefixes apply only to one string instruction at a time. To repeat a block of
instructions, use the LOOP instruction or another looping construct. All of these
repeat prefixes cause the associated instruction to be repeated until the count in
register is decremented to 0. See Table 4-13.

2) Operation

    IF AddressSize = 16
        THEN
            Use CX for CountReg;
        ELSE IF AddressSize = 64 and REX.W used
            THEN Use RCX for CountReg; FI;
        ELSE
            Use ECX for CountReg;
    FI;
    WHILE CountReg != 0
        DO
            Service pending interrupts (if any);
            Execute associated string instruction;
            CountReg <- (CountReg - 1);
            IF CountReg = 0
                THEN exit WHILE loop; FI;
            IF (Repeat prefix is REPZ or REPE) and (ZF = 0)
            or (Repeat prefix is REPNZ or REPNE) and (ZF = 1)
                THEN exit WHILE loop; FI;
        OD;
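The two exit tests inside the WHILE loop can be restated as a small C predicate. This is a sketch for illustration only; the enum and the function name rep_exits are hypothetical, and the count argument is the value of the count register after it has been decremented for the current iteration.

```c
#include <assert.h>

/* Hypothetical names for the three kinds of repeat prefix. */
enum rep_prefix { REP, REPE, REPNE };

/*
 * Returns 1 if the repeat loop ends after the current iteration,
 * 0 if the string instruction is executed again.
 */
static int rep_exits(enum rep_prefix prefix, unsigned long count_after_dec,
                     int zf)
{
    if (count_after_dec == 0)
        return 1;               /* count exhausted */
    if (prefix == REPE && zf == 0)
        return 1;               /* REPE/REPZ: stop on the first mismatch */
    if (prefix == REPNE && zf == 1)
        return 1;               /* REPNE/REPNZ: stop on the first match */
    return 0;
}
```

For get_free_page() the relevant case is REPNE: the scan ends either when a zero byte sets ZF = 1 or when the count reaches 0 without a match.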

(3) scasb:
GNU assembly
In assembly language, SCASB is a string manipulation instruction whose name abbreviates "SCAn String Byte". The detailed operation of this instruction is:
    ---------------------------------------------------------------------------
    Opcode | Mnemonic | Description
    ---------------------------------------------------------------------------
    AE     | SCAS m8  | Compare AL with byte at ES:(E)DI and set status flags
    AF     | SCAS m16 | Compare AX with word at ES:(E)DI and set status flags
    AF     | SCAS m32 | Compare EAX with doubleword at ES:(E)DI and set status flags
    AE     | SCASB    | Compare AL with byte at ES:(E)DI and set status flags
    AF     | SCASW    | Compare AX with word at ES:(E)DI and set status flags
    AF     | SCASD    | Compare EAX with doubleword at ES:(E)DI and set status flags
    ---------------------------------------------------------------------------

Computes AL minus the byte at [ES:EDI] and sets the corresponding bits of the flags register.
It then updates the EDI register: if the DF flag is 0, EDI is incremented; if DF is 1, EDI is decremented.
The SCASB instruction is often used in combination with the repeat prefixes REPZ/REPNZ.

For example, REPNZ SCASB means that the SCASB instruction is run again as long as ECX > 0 and the flag ZF = 0.
That is, as long as the byte is not equal to the value in register AL, the search is repeated.
(4) sall
For example: sall $12,%ecx.
This instruction is an arithmetic left shift, equivalent to the C left-shift operator <<.
It is SAL (shift arithmetic left) in the Intel instruction set.
According to the GNU (AT&T) syntax rules,
because the operand is long (32-bit, ECX),
an "l" suffix is added to the Intel mnemonic sal,
which turns it into sall.


(5) stosl
The stosl instruction stores the value in EAX at the address that ES:EDI points to.
If the direction flag in EFLAGS is set (that is, the std instruction was executed before stosl),
EDI is then decremented by 4; otherwise (after cld) EDI is incremented by 4.

(6) eax, ax, ah, al

        00000000 00000000 00000000 00000000
        |===============eax===============|--32 bits, 4 bytes, 2 words, 1 doubleword
                          |======ax=======|--16 bits, 2 bytes, 1 word
                          |==ah===|----------8 bits, 1 byte
                                  |===al==|--8 bits, 1 byte

EAX is a 32-bit register that simply extends the original 8086 CPU register AX by another 16 bits.

So EAX and AX are not independent at all; their relationship is that of whole and part.
Assigning directly to EAX changes its low 16 bits and therefore changes the value of AX,
and a change to AX in turn affects EAX as a whole. The same holds for the relationship between AH, AL and AX.
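The whole-and-part relationship can be illustrated with plain integer masks in C. The helper names below (get_ax, get_ah, get_al, set_ax, set_al) are hypothetical; they model the register views on an ordinary 32-bit value rather than touching real registers.

```c
#include <assert.h>
#include <stdint.h>

/* Read the sub-register views of a 32-bit EAX value. */
static uint16_t get_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }
static uint8_t  get_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }
static uint8_t  get_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }

/* Writing AX replaces bits 0..15 and keeps the upper 16 bits of EAX. */
static uint32_t set_ax(uint32_t eax, uint16_t ax)
{
    return (eax & 0xFFFF0000u) | ax;
}

/* Writing AL replaces bits 0..7 and keeps the upper 24 bits of EAX. */
static uint32_t set_al(uint32_t eax, uint8_t al)
{
    return (eax & 0xFFFFFF00u) | al;
}
```

With EAX = 0x12345678, AX reads as 0x5678, AH as 0x56 and AL as 0x78. This is also why initializing EAX to 0 in get_free_page() makes AL equal to 0 for scasb.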

Please credit the source when reprinting, thanks :-)
Go Redatoms! To Levin and Chenhao.
myblog:http://blog.csdn.net/linpeng12358
MyMail: [email protected] or [email protected]
mygithub:davillin1577
Welcome everybody!
:-)
