- Brief introduction
- Many kinds of driver programming require some knowledge of how the virtual memory subsystem works
- When writing more complex, performance-critical subsystems, the material covered in this chapter will sooner or later be needed
- The contents of this chapter are divided into three parts
- Describes how the mmap system call is implemented
- Describes how to access the memory pages of user space directly from the kernel side
- Describes direct memory access (DMA) I/O operations, which give peripherals direct access to system memory
- Memory Management for Linux
- Address Type
- Linux is a virtual memory system: the addresses used by user programs are not the same as the physical addresses used by the hardware
- With virtual memory, programs running on the system can allocate more memory than is physically present; even a single process can have a virtual address space larger than the system's physical memory
- Below is a list of address types used by Linux
- user virtual address
- These are the regular addresses seen by user-space programs
- Physical Address
- This address is used between the processor and the system memory
- bus address
- These addresses are used between peripheral buses and memory; often they are the same as the physical addresses used by the processor
- kernel logical addresses
- kernel logical addresses make up the general address space of the kernel
- On most architectures, a logical address differs from its associated physical address only by a constant offset
- Memory returned by kmalloc has a kernel logical address
- kernel virtual address
- Like kernel logical addresses, they map addresses in kernel space to physical addresses
- However, the mapping from a kernel virtual address to a physical address need not be linear or one-to-one
- all logical addresses are kernel virtual addresses, but many kernel virtual addresses are not logical addresses
- Memory allocated by vmalloc has a kernel virtual address
- <asm/page.h>
- __pa ()
- Returns the physical address corresponding to a kernel logical address
- __va ()
- Converts a physical address back into a logical address; this is valid only for low-memory pages
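- As a minimal sketch (mine, not from the book): memory from kmalloc has a kernel logical address, so __pa()/__va() can convert it back and forth; this would not work for vmalloc'ed or high memory:

```c
#include <linux/slab.h>
#include <linux/kernel.h>
#include <asm/page.h>

static void address_types_demo(void)
{
	void *buf = kmalloc(PAGE_SIZE, GFP_KERNEL); /* kernel logical address */
	unsigned long phys;

	if (!buf)
		return;
	phys = __pa(buf);            /* logical -> physical */
	WARN_ON(__va(phys) != buf);  /* __va() is valid only for low memory */
	kfree(buf);
}
```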
- Physical Address and Page
- The physical address is divided into discrete units called pages.
- <asm/page.h>
- Most systems currently use 4,096 bytes per page
- High-end and low-end memory
- A 32-bit system can address only 4GB of memory
- The kernel splits the 4GB virtual address space between user space and kernel space; a typical split is 3GB for user space and 1GB for kernel space
- Low-end memory
- Memory for which logical addresses exist in kernel space
- High-end memory
- Memory for which no logical address exists
- Memory Mapping and page structure
- <linux/mm.h>
- struct page
- atomic_t count;
- The number of references to the page; when the count drops to 0, the page is returned to the free list
- void *virtual;
- If the page is mapped, this points to the page's kernel virtual address; it is NULL if the page is not mapped
- unsigned long flags;
- A series of flags describing the state of a page
- PG_locked indicates that the page is locked in memory
- PG_reserved indicates that the memory management system must not touch the page
- struct page *virt_to_page (void *kaddr);
- struct page *pfn_to_page (int pfn);
- Returns the page structure pointer for the given page frame number
- void *page_address (struct page *page);
- Returns the kernel virtual address of the page
- <linux/highmem.h>
- <asm/kmap_types.h>
- void *kmap (struct page *page);
- For low-end memory pages, return the logical address of the page
- For high-end memory, create special mappings in the dedicated kernel address space
- void kunmap (struct page *page);
- void *kmap_atomic (struct page *page, enum km_type type);
- void kunmap_atomic (void *addr, enum km_type type);
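- A small sketch (not from the text) of using kmap/kunmap to copy data out of a page that may live in high memory; kmap may sleep, so kmap_atomic would be the choice in atomic context:

```c
#include <linux/highmem.h>
#include <linux/string.h>

/* Copy len bytes (len <= PAGE_SIZE) from an arbitrary struct page. */
static void copy_from_page(struct page *page, void *dst, size_t len)
{
	/* Low-memory pages: kmap() just returns the logical address.
	 * High-memory pages: a temporary kernel mapping is created. */
	void *vaddr = kmap(page);

	memcpy(dst, vaddr, len);
	kunmap(page);
}
```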
- Page table
- The processor must use some mechanism to convert virtual addresses into their corresponding physical addresses; that mechanism is called the page table
- It is essentially a multilevel tree of structured arrays containing virtual-to-physical mappings and associated flags
- Virtual Memory Area
- The virtual memory area (VMA) is the kernel data structure used to manage distinct regions of a process's address space
- The memory map for the process contains the following areas
- Executable code area of the program
- Several data areas, containing initialized data, uninitialized (BSS) data, and the program stack
- One area for each active memory mapping
- /proc/<pid>/maps
- Start-end Perm offset Major:minor inode image
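- A hypothetical line in this format (the values are purely illustrative), showing the read/execute text segment of /bin/cat:

```
08048000-0804c000 r-xp 00000000 03:01 64593      /bin/cat
```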
- The vm_area_struct structure
- <linux/mm.h>
- struct vm_area_struct
- unsigned long vm_start;
- unsigned long vm_end;
- struct file *vm_file;
- unsigned long vm_pgoff;
- unsigned long vm_flags;
- struct vm_operations_struct *vm_ops;
- void *vm_private_data;
- struct vm_operations_struct
- void (*open) (struct vm_area_struct *vma);
- void (*close) (struct vm_area_struct *vma);
- struct page * (*nopage) (struct vm_area_struct *vma, unsigned long address, int *type);
- int (*populate) (struct vm_area_struct *vm, unsigned long address, unsigned long len, pgprot_t prot, unsigned long pgoff, int nonblock);
- The process memory map
- <linux/sched.h>
- current->mm
- Mmap Device operation
- Memory mapping provides the ability for a user program to directly access device memory
- Mapping a device means associating a range of user-space addresses with device memory
- Abstractions like serial ports and other stream-oriented devices cannot be mmapped
- Mappings must be done in PAGE_SIZE units
- The mmap method is part of the file_operations structure
- mmap (caddr_t addr, size_t len, int prot, int flags, int fd, off_t offset) (the user-space call)
- int (*mmap) (struct file *filp, struct vm_area_struct *vma); (the driver method)
- There are two ways of building the page tables
- Build them all at once with remap_pfn_range
- Build them one page at a time with the nopage VMA method
- Using remap_pfn_range
- int remap_pfn_range (struct vm_area_struct *vma, unsigned long virt_addr, unsigned long pfn, unsigned long size, pgprot_t prot);
- int io_remap_page_range (struct vm_area_struct *vma, unsigned long virt_addr, unsigned long phys_addr, unsigned long size, pgprot_t prot);
- vma
- virt_addr
- The user virtual address at which remapping begins
- pfn
- The page frame number corresponding to the physical address to which the virtual address should be mapped
- The page frame number is simply the physical address shifted right by PAGE_SHIFT bits
- size
- prot
- The "protection" attributes requested for the new VMA
- A simple implementation
- drivers/char/mem.c
- remap_pfn_range (vma, vma->vm_start, vma->vm_pgoff, vma->vm_end - vma->vm_start, vma->vm_page_prot);
- Adding operations to the VMA
- struct vm_operations_struct simple_remap_vm_ops = { .open = simple_vma_open, .close = simple_vma_close, };
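- Putting the pieces together, an mmap implementation along the lines of LDD3's simple module could look like the sketch below (the open/close methods just log; this is not claimed to match the book's listing verbatim):

```c
#include <linux/kernel.h>
#include <linux/fs.h>
#include <linux/mm.h>

static void simple_vma_open(struct vm_area_struct *vma)
{
	printk(KERN_NOTICE "Simple VMA open, virt %lx, phys %lx\n",
	       vma->vm_start, vma->vm_pgoff << PAGE_SHIFT);
}

static void simple_vma_close(struct vm_area_struct *vma)
{
	printk(KERN_NOTICE "Simple VMA close.\n");
}

static struct vm_operations_struct simple_remap_vm_ops = {
	.open  = simple_vma_open,
	.close = simple_vma_close,
};

static int simple_remap_mmap(struct file *filp, struct vm_area_struct *vma)
{
	/* Build all the page table entries at once; vm_pgoff holds the page
	 * frame number requested by user space through the mmap offset. */
	if (remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
			    vma->vm_end - vma->vm_start, vma->vm_page_prot))
		return -EAGAIN;

	vma->vm_ops = &simple_remap_vm_ops;
	simple_vma_open(vma);
	return 0;
}
```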
- Mapping memory with nopage
- To support the mremap system call, the nopage method must be implemented
- struct page * (*nopage) (struct vm_area_struct *vma, unsigned long address, int *type);
- get_page (struct page *pageptr);
- static int simple_nopage_mmap(struct file *filp, struct vm_area_struct *vma)
- {
- unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
- if (offset >= __pa(high_memory) || (filp->f_flags & O_SYNC))
- vma->vm_flags |= VM_IO;
- vma->vm_flags |= VM_RESERVED;
- vma->vm_ops = &simple_nopage_vm_ops;
- simple_vma_open(vma);
- return 0;
- }
- struct page *simple_vma_nopage(struct vm_area_struct *vma, unsigned long address, int *type)
- {
- struct page *pageptr;
- unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
- unsigned long physaddr = address - vma->vm_start + offset;
- unsigned long pageframe = physaddr >> PAGE_SHIFT;
- if (!pfn_valid(pageframe))
- return NOPAGE_SIGBUS;
- pageptr = pfn_to_page(pageframe);
- get_page(pageptr);
- if (type)
- *type = VM_FAULT_MINOR;
- return pageptr;
- }
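- The nopage method above would be tied into the VMA through a vm_operations_struct like this (LDD3-era interface, where nopage is still a VMA operation):

```c
static struct vm_operations_struct simple_nopage_vm_ops = {
	.open   = simple_vma_open,
	.close  = simple_vma_close,
	.nopage = simple_vma_nopage,
};
```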
- Remapping RAM
- A limitation of remap_pfn_range is that it can give access only to reserved pages and to physical addresses above the top of physical memory
- remap_pfn_range does not allow the remapping of conventional addresses (normal RAM)
- Remapping RAM with the nopage method
- Use vm_ops->nopage to handle one page fault at a time
- Remap the kernel virtual address
- page = vmalloc_to_page (pageptr);
- get_page (page);
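- A hedged sketch of a nopage method for a vmalloc'ed buffer, modeled on the idea behind LDD3's scullv example; my_vmalloc_buffer is a hypothetical module-level buffer obtained with vmalloc():

```c
#include <linux/mm.h>
#include <linux/vmalloc.h>

static char *my_vmalloc_buffer;	/* hypothetical, allocated with vmalloc() */

static struct page *vmalloc_vma_nopage(struct vm_area_struct *vma,
				       unsigned long address, int *type)
{
	unsigned long offset = (address - vma->vm_start)
			     + (vma->vm_pgoff << PAGE_SHIFT);
	struct page *page;

	/* Translate the kernel virtual address into its struct page. */
	page = vmalloc_to_page(my_vmalloc_buffer + offset);
	get_page(page);
	if (type)
		*type = VM_FAULT_MINOR;
	return page;
}
```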
- Performing direct I/O
- If the amount of data to be transferred is very large, transferring it directly, without an extra copy through kernel space, can greatly improve the speed
- The overhead of setting up direct I/O is significant
- Using direct I/O requires that the write system call execute synchronously
- The application cannot proceed until each write operation has completed
- <linux/mm.h>
- int get_user_pages (struct task_struct *tsk, struct mm_struct *mm, unsigned long start, int len, int write, int force, struct page **pages, struct vm_area_struct **vmas);
- tsk
- A pointer to the task performing the I/O; it is almost always current
- mm
- A pointer to the memory management structure describing the address space to be mapped
- For drivers, this parameter is always current->mm
- write
- If write is nonzero, write access to the mapped pages is requested
- force
- Drivers should always pass 0 for this parameter
- pages
- If the call succeeds, pages contains a list of pointers to the page structures for the user-space buffer
- vmas
- If the call succeeds, vmas contains pointers to the corresponding VMAs
- Devices that use direct I/O typically use DMA operations
- Once a direct I/O operation completes, the user memory pages must be released
- <linux/page-flags.h>
- void SetPageDirty (struct page *page);
- void page_cache_release (struct page *page);
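- A hedged sketch (LDD3-era API; uaddr, count, and reading are hypothetical parameters) of pinning a user buffer for direct I/O with get_user_pages and releasing it afterwards:

```c
#include <linux/mm.h>
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/pagemap.h>
#include <linux/page-flags.h>

static int pin_user_buffer(unsigned long uaddr, size_t count, int reading)
{
	int i, res, npages = (count + PAGE_SIZE - 1) >> PAGE_SHIFT;
	struct page **pages = kmalloc(npages * sizeof(*pages), GFP_KERNEL);

	if (!pages)
		return -ENOMEM;

	/* get_user_pages must be called with mmap_sem held. */
	down_read(&current->mm->mmap_sem);
	res = get_user_pages(current, current->mm, uaddr, npages,
			     reading /* write access if data flows into user memory */,
			     0 /* force */, pages, NULL);
	up_read(&current->mm->mmap_sem);

	if (res == npages) {
		/* ... perform the transfer (typically via DMA) on these pages ... */

		for (i = 0; i < npages; i++) {
			if (reading && !PageReserved(pages[i]))
				SetPageDirty(pages[i]); /* we changed user memory */
			page_cache_release(pages[i]);
		}
	}
	/* (a full driver would also release pages after a partial pin) */

	kfree(pages);
	return res == npages ? 0 : -EFAULT;
}
```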
- asynchronous I/O
- <linux/aio.h>
- ssize_t (*aio_read) (struct kiocb *iocb, char *buffer, size_t count, loff_t offset);
- ssize_t (*aio_write) (struct kiocb *iocb, const char *buffer, size_t count, loff_t offset);
- int (*aio_fsync) (struct kiocb *iocb, int datasync);
- int is_sync_kiocb (struct kiocb *iocb);
- int aio_complete (struct kiocb *iocb, long res, long res2);
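- A hedged sketch of an aio_read method using the LDD3-era prototype: handle the synchronous case at once via a hypothetical blocking my_read helper, otherwise queue the request and report completion later with aio_complete:

```c
#include <linux/aio.h>
#include <linux/fs.h>
#include <linux/errno.h>

/* hypothetical synchronous read method of the same driver */
static ssize_t my_read(struct file *filp, char __user *buf,
		       size_t count, loff_t *ppos);

static ssize_t my_aio_read(struct kiocb *iocb, char __user *buffer,
			   size_t count, loff_t offset)
{
	if (is_sync_kiocb(iocb))
		/* Synchronous iocb: just perform a normal blocking read. */
		return my_read(iocb->ki_filp, buffer, count, &offset);

	/* ... queue the operation for asynchronous completion (not shown);
	 * when it finishes, call: aio_complete(iocb, nbytes, 0); ... */
	return -EIOCBQUEUED;
}
```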
- Direct Memory Access
- DMA is the hardware mechanism that allows peripheral devices and main memory to transfer I/O data directly, without involving the system processor
- Using this mechanism can greatly increase the throughput of communication with the device
- DMA Data Transfer Overview
- There are two ways to trigger data transfer
- Software requests for data
- When a process calls read, the driver method allocates a DMA buffer and instructs the hardware to transfer its data into that buffer; the process is put to sleep
- The hardware writes the data to the DMA buffer and raises an interrupt when it is done
- The interrupt handler obtains the input data, acknowledges the interrupt, and wakes up the process, which can now read the data
- Hardware asynchronously passes data to the system
- The hardware interrupts, announcing the arrival of new data
- The interrupt handler allocates a buffer and tells the hardware where to transfer the data
- Peripheral device writes data to buffer, resulting in another interrupt after completion
- The handler dispatches the new data, wakes up any relevant processes, and performs cleanup
- Allocating a DMA buffer
- The main issue with DMA buffers is that, when they are larger than one page, they must occupy contiguous pages in physical memory, because the device transfers data over the ISA or PCI system bus, both of which carry physical addresses
- Do-it-yourself allocation
- The get_free_pages function can allocate up to a few megabytes of memory, but higher-order requests, even ones well below 128KB, are prone to failure because system memory becomes fragmented over time
- When the kernel cannot satisfy the request, or when more than 128KB is needed, an alternative to returning -ENOMEM is to allocate memory at boot time or to reserve the top of physical RAM for the buffer
- Another option is to allocate the buffer with the __GFP_NOFAIL allocation flag
- Bus Address
- Device drivers using DMA have to talk to hardware connected to a bus interface; the hardware uses physical (bus) addresses, whereas program code uses virtual addresses
- <asm/io.h>
- unsigned long virt_to_bus (volatile void *address);
- void *bus_to_virt (unsigned long address);
- Universal DMA Layer
- The kernel provides a DMA layer independent of the bus architecture
- <linux/dma-mapping.h>
- Dealing with complex hardware
- int dma_set_mask (struct device *dev, u64 mask);
- The mask is a bit pattern describing which address bits the device is capable of driving
- dma_set_mask returns 0 on success; if it fails (returns a nonzero error code), DMA with that mask cannot be used with the device
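- A minimal sketch of checking the mask at initialization time; the 24-bit mask and the mydev name are assumptions for a hypothetical device with limited addressing ability:

```c
#include <linux/dma-mapping.h>
#include <linux/kernel.h>

/* Returns nonzero if DMA can be used with a 24-bit address mask. */
static int my_setup_dma(struct device *dev)
{
	if (dma_set_mask(dev, 0x00ffffff)) {
		printk(KERN_WARNING "mydev: 24-bit DMA not available, using PIO\n");
		return 0;	/* DMA cannot be used */
	}
	return 1;		/* the mask was accepted */
}
```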
- DMA Mapping
- A DMA mapping is the combination of allocating a DMA buffer and generating an address for that buffer that is accessible by the device
- DMA mappings use a new type, dma_addr_t, to represent bus addresses
- The PCI code distinguishes between two types of DMA mappings, depending on how long the DMA buffer is expected to stay around
- Consistent DMA Mapping
- These mappings usually exist for the life of the driver
- The buffer of the consistency map must be accessible both by the CPU and the peripheral device
- The overhead of establishing and using a consistency map is significant
- Streaming DMA Mapping
- Streaming mappings are usually set up for a single operation
- Kernel developers recommend using streaming mappings whenever possible, before considering consistent mappings
- On systems that support mapping registers, each DMA mapping uses one or more mapping registers on the bus
- On some hardware, streaming mappings can be optimized in ways that are not available to consistent mappings
- Establishing a consistent DMA mapping
- void *dma_alloc_coherent (struct device *dev, size_t size, dma_addr_t *dma_handle, int flag);
- The return value is the kernel virtual address of the buffer
- The associated bus address is returned in dma_handle
- void dma_free_coherent (struct device *dev, size_t size, void *vaddr, dma_addr_t dma_handle);
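- A short sketch of a coherent mapping kept for the lifetime of the driver; dev and BUF_SIZE are placeholders:

```c
#include <linux/dma-mapping.h>
#include <linux/gfp.h>
#include <linux/errno.h>

#define BUF_SIZE 4096		/* hypothetical buffer size */

static void *cpu_addr;		/* used by the driver (kernel virtual address) */
static dma_addr_t bus_addr;	/* handed to the device */

static int my_alloc_coherent(struct device *dev)
{
	cpu_addr = dma_alloc_coherent(dev, BUF_SIZE, &bus_addr, GFP_KERNEL);
	return cpu_addr ? 0 : -ENOMEM;
}

static void my_free_coherent(struct device *dev)
{
	dma_free_coherent(dev, BUF_SIZE, cpu_addr, bus_addr);
}
```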
- DMA Pool
- DMA pool is a mechanism for generating small, consistent DMA mappings
- <linux/dmapool.h>
- struct dma_pool *dma_pool_create (const char *name, struct device *dev, size_t size, size_t align, size_t allocation);
- If allocation is nonzero, allocations will not cross a boundary of allocation bytes
- void dma_pool_destroy (struct dma_pool *pool);
- void *dma_pool_alloc (struct dma_pool *pool, int mem_flags, dma_addr_t *handle);
- The returned DMA buffer address is a kernel virtual address; the corresponding bus address is stored in handle
- void dma_pool_free (struct dma_pool *pool, void *vaddr, dma_addr_t addr);
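- A brief sketch of a DMA pool for small, fixed-size hardware descriptors; the 32-byte size, alignment, and names are illustrative only:

```c
#include <linux/dmapool.h>
#include <linux/gfp.h>
#include <linux/errno.h>

static struct dma_pool *desc_pool;

static int my_create_pool(struct device *dev)
{
	/* 32-byte objects, 32-byte aligned, no extra boundary restriction */
	desc_pool = dma_pool_create("mydev-desc", dev, 32, 32, 0);
	return desc_pool ? 0 : -ENOMEM;
}

static void my_use_pool(void)
{
	dma_addr_t handle;
	void *desc = dma_pool_alloc(desc_pool, GFP_KERNEL, &handle);

	if (desc) {
		/* fill in the descriptor, hand 'handle' to the device ... */
		dma_pool_free(desc_pool, desc, handle);
	}
}
```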
- Creating streaming DMA mappings
- When a streaming mapping is set up, you must tell the kernel the direction of the data flow
- enum dma_data_direction
- DMA_TO_DEVICE
- DMA_FROM_DEVICE
- DMA_BIDIRECTIONAL
- DMA_NONE
- dma_addr_t dma_map_single (struct device *dev, void *buffer, size_t size, enum dma_data_direction direction);
- void dma_unmap_single (struct device *dev, dma_addr_t dma_addr, size_t size, enum dma_data_direction direction);
- There are a few very important principles for streaming DMA mapping
- A buffer can be used only for transfers whose direction matches the direction given when it was mapped
- Once the buffer is mapped, it will belong to the device, not the processor
- The buffer mapping cannot be undone during DMA's active period, otherwise it will severely damage the system's stability
- void dma_sync_single_for_cpu (struct device *dev, dma_addr_t bus_addr, size_t size, enum dma_data_direction direction);
- void dma_sync_single_for_device (struct device *dev, dma_addr_t bus_addr, size_t size, enum dma_data_direction direction);
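- A sketch of one streaming mapping for a single transfer toward the device; dev, buf, and len are assumptions, and buf must be DMA-able memory (e.g. from kmalloc), never stack or vmalloc memory:

```c
#include <linux/dma-mapping.h>

static int my_send_buffer(struct device *dev, void *buf, size_t len)
{
	dma_addr_t bus = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
	/* (a real driver should also check for a failed mapping here) */

	/* ... give 'bus' to the device, start the transfer, and wait for it;
	 * the CPU must not touch 'buf' while the mapping is active ... */

	dma_unmap_single(dev, bus, len, DMA_TO_DEVICE);
	return 0;
}
```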
- Single-page streaming mapping
- dma_addr_t dma_map_page (struct device *dev, struct page *page, unsigned long offset, size_t size, enum dma_data_direction direction);
- void dma_unmap_page (struct device *dev, dma_addr_t dma_address, size_t size, enum dma_data_direction direction);
- Scatter/gather mappings
- This is a special kind of streaming DMA mapping
- Suppose there are several buffers that all need to be transferred to or from the device
- This situation can arise in several ways
- From a readv or writev system call
- From a clustered disk I/O request
- From a list of pages in a mapped kernel I/O buffer
- Many devices can accept a scatterlist of array pointers and lengths and transfer them all in one DMA operation
- The first step in mapping a scatter/gather transfer is to create and fill in an array of struct scatterlist describing the buffers to be transferred
- <linux/scatterlist.h>
- struct scatterlist
- struct page *page;
- unsigned int length;
- unsigned int offset;
- int dma_map_sg (struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction direction);
- nents is the number of scatterlist entries passed in
- The return value is the number of DMA buffers to be transferred
- The driver should transmit each buffer returned by the DMA_MAP_SG function
- dma_addr_t sg_dma_address (struct scatterlist *sg);
- unsigned int sg_dma_len (struct scatterlist *sg);
- void dma_unmap_sg (struct device *dev, struct scatterlist *list, int nents, enum dma_data_direction direction);
- nents must be the number of entries previously passed to dma_map_sg
- void dma_sync_sg_for_cpu (struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction direction);
- void dma_sync_sg_for_device (struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction direction);
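- A sketch of mapping an already-filled scatterlist and walking the resulting DMA segments; dev, sg, and nents are assumptions:

```c
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>
#include <linux/errno.h>

static int my_map_and_run_sg(struct device *dev, struct scatterlist *sg, int nents)
{
	int i, count = dma_map_sg(dev, sg, nents, DMA_TO_DEVICE);

	if (count == 0)
		return -EIO;

	for (i = 0; i < count; i++) {
		dma_addr_t addr = sg_dma_address(&sg[i]);
		unsigned int len = sg_dma_len(&sg[i]);

		/* ... program one device transfer with (addr, len) ... */
		(void)addr;
		(void)len;
	}

	/* once the device has finished with all the segments: */
	dma_unmap_sg(dev, sg, nents, DMA_TO_DEVICE);
	return 0;
}
```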
- PCI Dual Address cycle mapping
- Typically, the DMA support layer uses a 32-bit bus address, which is constrained by the device's DMA mask
- The PCI bus also supports a 64-bit addressing mode, the dual-address cycle (DAC)
- If your device needs to use large buffers placed in high-end memory, you can consider implementing DAC support
- <linux/pci.h>
- int pci_dac_set_dma_mask (struct pci_dev *pdev, u64 mask);
- DAC addressing can be used only if this function returns 0
- dma64_addr_t pci_dac_page_to_dma (struct pci_dev *pdev, struct page *page, unsigned long offset, int direction);
- Direction
- PCI_DMA_TODEVICE
- PCI_DMA_FROMDEVICE
- PCI_DMA_BIDIRECTIONAL
- void pci_dac_dma_sync_single_for_cpu (struct pci_dev *pdev, dma64_addr_t dma_addr, size_t len, int direction);
- void pci_dac_dma_sync_single_for_device (struct pci_dev *pdev, dma64_addr_t dma_addr, size_t len, int direction);
- DMA for ISA devices
- The ISA bus allows for two kinds of DMA transfers: native (local) DMA and ISA bus-master DMA
- Native DMA uses the standard DMA controller circuitry on the motherboard to drive the signal lines on the ISA bus
- ISA bus-master DMA is controlled entirely by the peripheral device
- Three entities are involved in DMA data transfers on the ISA bus
- 8237 DMA Controller (DMAC)
- The peripheral device
- It must assert the DMA request signal when it is ready to transfer data
- Device drivers
- The driver has little to do; it is only responsible for supplying the DMA controller with the transfer direction, the bus address, and the size of the transfer
- Registering DMA
- <asm/dma.h>
- int request_dma (unsigned int channel, const char *name);
- A return value of 0 indicates success
- void free_dma (unsigned int channel);
- Communicating with the DMA controller
- unsigned long claim_dma_lock (void);
- void release_dma_lock (unsigned long flags);
- The information that must be loaded into the controller consists of three items: the RAM address, the number of atomic units to be transferred, and the direction of the transfer
- void set_dma_mode (unsigned int channel, char mode);
- mode
- DMA_MODE_READ
- DMA_MODE_WRITE
- DMA_MODE_CASCADE
- Releases control of the bus
- void set_dma_addr (unsigned int channel, unsigned int addr);
- void set_dma_count (unsigned int channel, unsigned int count);
- void disable_dma (unsigned int channel);
- void enable_dma (unsigned int channel);
- int get_dma_residue (unsigned int channel);
- Returns the number of bytes that have not yet been transferred
- void clear_dma_ff (unsigned int channel);
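- A sketch of programming the 8237 controller for one transfer, following the sequence discussed above; channel, bus_addr, and count are assumptions, and the channel must already have been obtained with request_dma:

```c
#include <asm/dma.h>

static void my_setup_isa_dma(unsigned int channel, unsigned int bus_addr,
			     unsigned int count, int to_device)
{
	unsigned long flags = claim_dma_lock();

	disable_dma(channel);
	clear_dma_ff(channel);		/* clear the DMA address flip-flop */
	set_dma_mode(channel,
		     to_device ? DMA_MODE_WRITE : DMA_MODE_READ);
	set_dma_addr(channel, bus_addr);	/* bus address of the buffer */
	set_dma_count(channel, count);		/* number of bytes to transfer */
	enable_dma(channel);

	release_dma_lock(flags);
}
```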
"Linux Device Drivers", chapter 15th, Memory Mapping and Dma--note