Kernel Matters: Memory Management (1) - Kernel Mapping (Part 1)


The previous article briefly described the three different address spaces. This one focuses on the mapping from the linear address space to the physical address space.


Let's start with 32-bit systems.

On a 32-bit system, the linear address space is 2^32 bytes, i.e. 4GB. The kernel generally divides the linear address space into two parts at a 3:1 ratio:

    • 0 ~ 3GB: the user address space.

    • 3GB ~ 4GB: the kernel address space.
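As a quick sanity check, the split can be expressed as a one-line predicate (a standalone sketch; the 0xC0000000 value of PAGE_OFFSET is the conventional 3GB boundary, and the helper name is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Conventional 3GB boundary between user and kernel linear addresses. */
#define PAGE_OFFSET 0xC0000000UL

/* Illustrative helper: does a 32-bit linear address fall in the
 * kernel's 1GB portion of the address space? */
static inline int is_kernel_address(uint32_t vaddr)
{
        return vaddr >= PAGE_OFFSET;
}
```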

[Figure: 32-Mapping.png, the 3:1 split of the 32-bit linear address space]


The management and mapping of the user address space is a big topic; we will cover it in detail later.


The kernel address space is only 1GB in size, so it can map at most 1GB of physical memory. This raises a question: what about physical memory beyond 1GB?

The kernel's solution is to divide the 1GB kernel address space into two parts:

    • The first 896MB, i.e. 3GB ~ (3GB+896MB), is directly mapped to physical memory 0 ~ 896MB.

    • The last 128MB, i.e. (3GB+896MB) ~ 4GB, can be dynamically mapped to different regions of physical memory.
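The direct mapping above is just fixed-offset arithmetic. Here is a minimal sketch modeled on the kernel's __pa()/__va() conversions (the helper names are invented for illustration and operate on plain integers rather than real addresses):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET  0xC0000000UL          /* 3GB */
#define HIGH_MEMORY  (896UL << 20)         /* 896MB direct-map limit */

/* Model of __pa(): virtual -> physical inside the directly mapped region. */
static inline uint32_t virt_to_phys_direct(uint32_t vaddr)
{
        return vaddr - PAGE_OFFSET;
}

/* Model of __va(): physical -> virtual inside the directly mapped region. */
static inline uint32_t phys_to_virt_direct(uint32_t paddr)
{
        return paddr + PAGE_OFFSET;
}

/* Physical addresses at or above 896MB have no permanent direct
 * mapping in the kernel address space ("high memory"). */
static inline int phys_is_highmem(uint32_t paddr)
{
        return paddr >= HIGH_MEMORY;
}
```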


By time-sharing this last 128MB of the linear address space, the kernel can access all of physical memory. As the saying goes, take a step back and the whole sky opens up.


Because of this, physical memory above 896MB cannot be mapped directly into the kernel address space; it is commonly referred to as high memory.


How does this 128MB of kernel address space dynamically map physical memory at different locations? The kernel provides three mechanisms for this, as shown in the figure below.


[Figure: Kernel-mapping.png, the three kernel mapping mechanisms]


Although the three mechanisms differ, they all complete the mapping by modifying page tables. Let us first introduce a special set of page tables: the kernel page tables.


We know that each process has its own set of page tables, used to translate linear addresses into physical addresses. The kernel also maintains a set of page tables of its own, called the kernel page tables.

However, this set of page tables is a bit special: after system initialization completes, the kernel page tables are not used directly by any process, whether in user mode or kernel mode. Huh? If no process uses them, what are they for?

The page directory of this set of page tables, the so-called master kernel Page Global Directory, contains the highest 32 page directory entries (128/4 = 32) covering that last 128MB, and it serves as a reference model for every process's page tables. When these page directory entries in the master kernel Page Global Directory are modified, the changes eventually propagate to the page tables actually used by processes. How this "propagation" works will be described in detail later.
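The figure of 32 entries follows from simple arithmetic: on a non-PAE 32-bit system each page directory entry covers 4MB, so the top 128MB needs 128/4 = 32 entries. A trivial sketch (names are illustrative, not kernel macros):

```c
#include <assert.h>

#define PGDIR_COVERAGE  (4UL << 20)    /* 4MB covered per PGD entry, non-PAE */
#define TOP_REGION      (128UL << 20)  /* the last 128MB of kernel space */

/* How many page directory entries cover the top 128MB region. */
static unsigned long top_pgd_entries(void)
{
        return TOP_REGION / PGDIR_COVERAGE;
}
```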


Persistent kernel mapping (persistent Kernel Mappings)

Persistent kernel mapping maps high-memory pages into the kernel address space in a relatively long-lasting manner. Since it is a mapping, it involves three elements: a set L, a set P, and a mapping M between the two sets.


Set L: the kernel reserves an area in the 128MB linear address space. As shown in the figure, this area ranges from PKMAP_BASE to FIXADDR_START.

Set P: the high-memory pages.

Mapping M: this is what we will discuss here.


Data structures

Persistent kernel mapping dedicates one page table in the master kernel page tables to its own use: pkmap_page_table.

The macro LAST_PKMAP defines how many page table entries this page table has.

#ifdef CONFIG_X86_PAE
#define LAST_PKMAP 512
#else
#define LAST_PKMAP 1024
#endif

If PAE is not enabled, this page table has 1024 page table entries, so persistent kernel mapping can map at most 4MB of high memory at a time.
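The conversion between a linear address in the pkmap area and an index into pkmap_page_table is simple shift arithmetic, modeled here on the kernel's PKMAP_ADDR()/PKMAP_NR() macros (the PKMAP_BASE value below is illustrative, not the real one):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define PAGE_SIZE   (1UL << PAGE_SHIFT)
#define LAST_PKMAP  1024                 /* non-PAE */
#define PKMAP_BASE  0xFE000000UL         /* illustrative base address */

/* Index of a pkmap page table entry -> its linear address. */
static inline uint32_t pkmap_addr(unsigned nr)
{
        return PKMAP_BASE + ((uint32_t)nr << PAGE_SHIFT);
}

/* Linear address inside the pkmap area -> page table entry index. */
static inline unsigned pkmap_nr(uint32_t vaddr)
{
        return (vaddr - PKMAP_BASE) >> PAGE_SHIFT;
}
```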


An integer array, pkmap_count, describes the usage of each page table entry in pkmap_page_table.

static int pkmap_count[LAST_PKMAP];

    • Counter equal to 0: the corresponding page table entry maps no high-memory page and can be used.

    • Counter equal to 1: the corresponding page table entry maps no high-memory page, but it cannot be used yet, because its corresponding TLB entries have not been flushed.

    • Counter equal to n, greater than 1: the corresponding page table entry maps a high-memory page, which is being used by (n-1) users.
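These conventions can be modeled with a few toy helpers (a hypothetical single-threaded sketch, not kernel code; locking and the page table itself are omitted):

```c
#include <assert.h>

#define LAST_PKMAP 1024

static int pkmap_count[LAST_PKMAP];

/* Hypothetical predicates expressing the counter convention. */
static int entry_is_free(int nr)     { return pkmap_count[nr] == 0; }
static int entry_needs_flush(int nr) { return pkmap_count[nr] == 1; }
static int entry_users(int nr)
{
        return pkmap_count[nr] > 1 ? pkmap_count[nr] - 1 : 0;
}

/* Simulate mapping a page into entry nr: establishing the mapping
 * sets the counter to 1, and each user adds 1 on top of that. */
static void model_kmap(int nr)
{
        if (pkmap_count[nr] == 0)
                pkmap_count[nr] = 1;     /* entry now maps a page */
        pkmap_count[nr]++;               /* one more user */
}

/* Simulate releasing one use of entry nr. */
static void model_kunmap(int nr)
{
        pkmap_count[nr]--;               /* back to 1 when the last user leaves */
}
```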


To make it easy to find the linear address to which a high-memory page has been mapped by the persistent kernel mapping mechanism, the kernel uses a hash table: page_address_htable.

Each entry in the hash table is a struct page_address_map:

/*
 * Describes one page->virtual association
 */
struct page_address_map {
        struct page *page;
        void *virtual;
        struct list_head list;
};


Before describing how the kernel uses these data structures to establish persistent kernel mappings, let's look at how the kernel finds the linear address to which a physical page has been mapped.


Lookup

void *page_address(struct page *page)
{
        unsigned long flags;
        void *ret;
        struct page_address_slot *pas;

        if (!PageHighMem(page))
                return lowmem_page_address(page);

        pas = page_slot(page);
        ret = NULL;
        spin_lock_irqsave(&pas->lock, flags);
        if (!list_empty(&pas->lh)) {
                struct page_address_map *pam;

                list_for_each_entry(pam, &pas->lh, list) {
                        if (pam->page == page) {
                                ret = pam->virtual;
                                goto done;
                        }
                }
        }
done:
        spin_unlock_irqrestore(&pas->lock, flags);
        return ret;
}


The function page_address() distinguishes two cases:

1. The page does not belong to high memory. Then the page lies within physical memory 0 ~ 896MB, a range that is directly mapped into the kernel address space, so the page's linear address always exists and can be computed directly from the page's frame number.

static __always_inline void *lowmem_page_address(struct page *page)
{
        return __va(page_to_pfn(page) << PAGE_SHIFT);
}


2. The page belongs to high memory. In this case we use the hash table mentioned earlier. If the page is found in the hash table, a persistent kernel mapping has been established for it and its linear address is returned; otherwise NULL is returned.


Mapping

Establishing a persistent kernel mapping is done by the function kmap().

void *kmap(struct page *page)
{
        might_sleep();
        if (!PageHighMem(page))
                return page_address(page);
        return kmap_high(page);
}

If the page does not belong to high memory, its linear address is returned directly. Otherwise, the function kmap_high() completes the mapping operation.


void fastcall *kmap_high(struct page *page)
{
        unsigned long vaddr;

        /*
         * For highmem pages, we can't trust "virtual" until
         * after we have the lock.
         *
         * We cannot call this from interrupts, as it may block
         */
        spin_lock(&kmap_lock);
        vaddr = (unsigned long)page_address(page);
        if (!vaddr)
                vaddr = map_new_virtual(page);
        pkmap_count[PKMAP_NR(vaddr)]++;
        BUG_ON(pkmap_count[PKMAP_NR(vaddr)] < 2);
        spin_unlock(&kmap_lock);
        return (void *) vaddr;
}

The function first checks whether the page is already mapped. If not, a new mapping is established via map_new_virtual().

Finally, the corresponding count in the array pkmap_count is incremented.


static inline unsigned long map_new_virtual(struct page *page)
{
        unsigned long vaddr;
        int count;

start:
        count = LAST_PKMAP;
        /* Find an empty entry */
        for (;;) {
                last_pkmap_nr = (last_pkmap_nr + 1) & LAST_PKMAP_MASK;
                if (!last_pkmap_nr) {
                        flush_all_zero_pkmaps();
                        count = LAST_PKMAP;
                }
                if (!pkmap_count[last_pkmap_nr])
                        break;  /* Found a usable entry */
                if (--count)
                        continue;

                /*
                 * Sleep for somebody else to unmap their entries
                 */
                {
                        DECLARE_WAITQUEUE(wait, current);

                        __set_current_state(TASK_UNINTERRUPTIBLE);
                        add_wait_queue(&pkmap_map_wait, &wait);
                        spin_unlock(&kmap_lock);
                        schedule();
                        remove_wait_queue(&pkmap_map_wait, &wait);
                        spin_lock(&kmap_lock);

                        /* Somebody else might have mapped it while we slept */
                        if (page_address(page))
                                return (unsigned long)page_address(page);

                        /* Re-start */
                        goto start;
                }
        }
        vaddr = PKMAP_ADDR(last_pkmap_nr);
        set_pte_at(&init_mm, vaddr,
                   &(pkmap_page_table[last_pkmap_nr]), mk_pte(page, kmap_prot));

        pkmap_count[last_pkmap_nr] = 1;
        set_page_address(page, (void *)vaddr);

        return vaddr;
}

The function does two things:

    • Find an available page table entry by scanning the array pkmap_count, and use that entry to establish the mapping from a linear address to the physical page.

    • Once the mapping is established, insert the mapping relationship into the hash table to facilitate later lookups.


The search for an available page table entry starts from the position of the previous search (last_pkmap_nr). When the end of the array pkmap_count is reached, the search wraps around to the beginning; but before continuing from there, the function flush_all_zero_pkmaps() is called. This function looks for page table entries whose counter is 1 and does four things with them:

    • Resets each counter to 0.

    • Clears the corresponding page table entries in pkmap_page_table.

    • Deletes the corresponding elements from the hash table.

    • Flushes all the TLB entries belonging to the linear address range of the persistent kernel mapping.
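The wrap-around scan and the flush-on-wrap behavior can be sketched as a simplified, single-threaded model of the search loop (blocking and the actual page-table update are omitted; all names are illustrative):

```c
#include <assert.h>

#define LAST_PKMAP      1024
#define LAST_PKMAP_MASK (LAST_PKMAP - 1)

static int pkmap_count[LAST_PKMAP];
static unsigned last_pkmap_nr;

/* Fill every entry as "mapped and in use" for demonstration. */
static void model_fill_busy(void)
{
        for (unsigned i = 0; i < LAST_PKMAP; i++)
                pkmap_count[i] = 2;
}

/* Model of flush_all_zero_pkmaps(): entries with counter 1 map no page
 * but await a TLB flush; after the flush they become reusable (0). */
static void model_flush_all_zero_pkmaps(void)
{
        for (unsigned i = 0; i < LAST_PKMAP; i++)
                if (pkmap_count[i] == 1)
                        pkmap_count[i] = 0;
}

/* Scan pkmap_count circularly, starting after the previous position;
 * flush stale entries whenever the scan wraps to index 0.  Returns the
 * index found, or -1 after one full fruitless pass (where the real
 * code would put the caller to sleep instead). */
static int model_find_entry(void)
{
        int count = LAST_PKMAP;

        for (;;) {
                last_pkmap_nr = (last_pkmap_nr + 1) & LAST_PKMAP_MASK;
                if (!last_pkmap_nr) {
                        model_flush_all_zero_pkmaps();
                        count = LAST_PKMAP;
                }
                if (!pkmap_count[last_pkmap_nr])
                        return last_pkmap_nr;
                if (!--count)
                        return -1;
        }
}
```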


If no available page table entry is found, the current process goes to sleep until another process releases a page table entry in pkmap_page_table. Because of this, kmap() may block the current process and must not be used in interrupt context.

If an available page table entry is found, the corresponding linear address is computed, the mapping is established in the corresponding entry of pkmap_page_table, the counter in pkmap_count is set to 1, and the mapping relationship is inserted into the hash table.


Releasing a persistent kernel mapping is done by the function kunmap().

void kunmap(struct page *page)
{
        if (in_interrupt())
                BUG();
        if (!PageHighMem(page))
                return;
        kunmap_high(page);
}

If the page does not belong to high memory, there is nothing to do. Otherwise, the function kunmap_high() removes the mapping.


void fastcall kunmap_high(struct page *page)
{
        unsigned long vaddr;
        unsigned long nr;
        int need_wakeup;

        spin_lock(&kmap_lock);
        vaddr = (unsigned long)page_address(page);
        BUG_ON(!vaddr);
        nr = PKMAP_NR(vaddr);

        /* ... */
        need_wakeup = 0;
        switch (--pkmap_count[nr]) {
        case 0:
                BUG();
        case 1:
                need_wakeup = waitqueue_active(&pkmap_map_wait);
        }
        spin_unlock(&kmap_lock);

        /* do wake-up, if needed, race-free outside of the spin lock */
        if (need_wakeup)
                wake_up(&pkmap_map_wait);
}

The implementation of this function is simple: it decrements the corresponding counter in the array pkmap_count. If the counter drops to 1, the corresponding page table entry is no longer in use, and the processes that went to sleep in map_new_virtual() waiting for an available page table entry are woken up.
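The bookkeeping boils down to a toy model (hypothetical sketch; locking and the real wait queue are omitted): the function reports whether waiters should be woken.

```c
#include <assert.h>

#define LAST_PKMAP 1024

static int pkmap_count[LAST_PKMAP];

/* Model of kunmap_high()'s counter logic: drop one use of entry nr and
 * return 1 if waiters should be woken (counter fell back to 1, meaning
 * the entry's last user is gone), 0 otherwise. */
static int model_kunmap_high(int nr)
{
        return --pkmap_count[nr] == 1;
}
```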


This article is from the "Kernel blogs" blog; please keep this source: http://richardguo.blog.51cto.com/9343720/1682364

