Http://www.cnblogs.com/hanyan225/archive/2010/10/28/1863854.html
1. Concept
DMA (direct memory access) is a hardware mechanism that allows data to be transferred in both directions between peripherals and system memory without involving the CPU. Using DMA frees the CPU from the actual I/O data transfer, which greatly improves system throughput. A DMA transfer is controlled by the DMA controller (DMAC), and the CPU can perform other tasks concurrently while the transfer is in progress. When the transfer ends, the DMAC notifies the CPU with an interrupt, and the CPU then runs the corresponding interrupt service routine for any follow-up processing.
2. DMA and Cache
Cache may seem to have little to do with DMA, and indeed, if the memory targeted by a DMA transfer does not overlap any memory held in the cache, the two do not interfere. If they do overlap, however, a DMA operation changes the memory behind the cached data without the CPU knowing. The CPU still believes that the cache holds the current contents of memory, so later accesses to that memory return stale cached data, and the cache and memory become inconsistent. If the driver does not handle this case, it will not work correctly. So how can it be solved? The simplest method is to disable caching for the DMA target address range. This sacrifices performance but is highly reliable; performance and reliability cannot both be maximized here, and balancing the two is genuinely difficult.
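The stale-data problem can be illustrated with a toy user-space model. This is only a sketch: the `toy_system` structure and the three helper functions are hypothetical names invented for illustration, and a real cache works on lines and is managed by hardware and kernel primitives, not by code like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: one word of main memory and one cache entry that may shadow it. */
struct toy_system {
    uint32_t memory;      /* what is really in RAM */
    uint32_t cache_data;  /* the CPU's cached copy */
    bool     cache_valid; /* true once the CPU has cached the word */
};

/* CPU read: served from the cache whenever the entry is valid. */
static uint32_t cpu_read(struct toy_system *s)
{
    assert(s != NULL);
    if (!s->cache_valid) {
        s->cache_data = s->memory; /* cache miss: fill from RAM */
        s->cache_valid = true;
    }
    return s->cache_data;
}

/* DMA write: goes straight to RAM and does not touch the cache. */
static void dma_write(struct toy_system *s, uint32_t value)
{
    assert(s != NULL);
    s->memory = value;
}

/* Invalidate the cached copy so the next CPU read goes back to RAM. */
static void cache_invalidate(struct toy_system *s)
{
    assert(s != NULL);
    s->cache_valid = false;
}
```

In this model, a `cpu_read()` that follows a `dma_write()` keeps returning the old value until `cache_invalidate()` is called, which is the inconsistency described above.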
3. DMA Programming
The area of memory used to exchange data with a peripheral is called a DMA buffer. If the device does not support scatter/gather (SG) operations, the DMA buffer must be physically contiguous.
For ISA devices, DMA can only be performed on memory below 16 MB, so the GFP_DMA flag should be used when requesting DMA buffers with kmalloc(), __get_free_pages(), and similar functions. This ensures that the memory obtained is DMA-capable.
The kernel defines __get_dma_pages() as a "shortcut" to __get_free_pages() for DMA; it adds GFP_DMA to the request flags, as follows:
    #define __get_dma_pages(gfp_mask, order) __get_free_pages((gfp_mask) | GFP_DMA, (order))
Another function for requesting DMA memory is dma_mem_alloc(), as follows:
    static unsigned long dma_mem_alloc(int size)
    {
        int order = get_order(size); /* smallest page order that covers size bytes */
        return __get_dma_pages(GFP_KERNEL, order);
    }
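To show what the order value means, here is a user-space sketch of the calculation. It assumes 4 KiB pages, and `toy_get_order()` and `TOY_PAGE_SIZE` are hypothetical names; the kernel's own get_order() is architecture-specific and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Smallest order such that (TOY_PAGE_SIZE << order) >= size. */
static int toy_get_order(size_t size)
{
    int order = 0;
    size_t span = TOY_PAGE_SIZE;

    assert(size > 0);
    while (span < size) {
        span <<= 1; /* each order doubles the number of pages */
        order++;
    }
    return order;
}
```

Under this assumption a 4097-byte request needs order 1 (two pages), so page-order allocators can waste a good deal of memory when the size is just above a power of two.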
Note that DMA hardware uses bus addresses rather than physical addresses. A bus address is the memory address as seen from the device, while a physical address is the untranslated memory address as seen from the CPU (the translated one is the virtual address). On a PC, the bus address is identical to the physical address for both ISA and PCI, but this is not true on every platform. The interface bus is sometimes connected through a bridge circuit that maps I/O addresses to different physical addresses. For example, on a PReP (PowerPC Reference Platform) system, physical address 0 appears as 0x80000000 on the device side and is usually mapped to virtual address 0xc0000000, so the same location has three identities: physical address 0, bus address 0x80000000, and virtual address 0xc0000000. Some systems also provide a page-mapping mechanism that can map arbitrary pages to contiguous peripheral bus addresses.
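The three identities in the PReP example can be modelled with fixed offsets. This is a user-space sketch only: the `toy_` functions are hypothetical, the offsets are the ones quoted in the example, and real platforms may use bridges or IOMMUs where no fixed offset applies.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets taken from the PReP example in the text; other platforms differ. */
#define TOY_BUS_OFFSET  0x80000000u /* bus address    = physical + offset */
#define TOY_VIRT_OFFSET 0xc0000000u /* kernel virtual = physical + offset */

static uint32_t toy_phys_to_bus(uint32_t phys)
{
    return phys + TOY_BUS_OFFSET;
}

static uint32_t toy_phys_to_virt(uint32_t phys)
{
    return phys + TOY_VIRT_OFFSET;
}

/* Virtual to bus: go back to the physical address first. */
static uint32_t toy_virt_to_bus(uint32_t virt)
{
    assert(virt >= TOY_VIRT_OFFSET); /* only directly mapped addresses */
    return toy_phys_to_bus(virt - TOY_VIRT_OFFSET);
}

/* Bus to virtual: the inverse of the function above. */
static uint32_t toy_bus_to_virt(uint32_t bus)
{
    assert(bus >= TOY_BUS_OFFSET);
    return toy_phys_to_virt(bus - TOY_BUS_OFFSET);
}
```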
The kernel provides the following functions for simple virtual address/bus address translation:
    unsigned long virt_to_bus(volatile void *address);
    void *bus_to_virt(unsigned long address);
Note also that a device cannot necessarily perform DMA on every memory address. In that case, the DMA address mask should be set with the following function:
    int dma_set_mask(struct device *dev, u64 mask);
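To make the idea of an address mask concrete, the sketch below builds a mask with the low n bits set and checks whether a buffer is reachable under it. The `toy_` helpers are hypothetical user-space stand-ins; the kernel provides the DMA_BIT_MASK(n) macro for building such masks, and its own reachability checks are more involved than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low n bits set, for 1 <= n <= 64 (hypothetical helper). */
static uint64_t toy_dma_bit_mask(unsigned int n)
{
    assert(n >= 1 && n <= 64);
    return (n == 64) ? ~0ULL : ((1ULL << n) - 1);
}

/* True if every byte of [addr, addr + size) lies at or below the mask. */
static bool toy_dma_addr_ok(uint64_t addr, uint64_t size, uint64_t mask)
{
    assert(size > 0);
    if (addr > UINT64_MAX - (size - 1))
        return false; /* the end address would overflow */
    return addr + size - 1 <= mask;
}
```

A 24-bit mask built this way is 0x00FFFFFF, which covers exactly the low 16 MB that ISA DMA can reach.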
For example, a device that can perform DMA only on 24-bit addresses should call dma_set_mask(dev, 0x00FFFFFF).
DMA mapping involves two pieces of work: allocating a DMA buffer, and generating an address for that buffer that the device can access. As discussed above, a DMA mapping must also take cache consistency into account. The kernel provides a function for allocating a DMA-consistent (coherent) memory area:
    void *dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp);
The return value of this function is the virtual address of the requested DMA buffer. In addition, the function returns the bus address of the DMA buffer via the parameter handle. The corresponding release function is:
    void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle);
The following function allocates a write-combining DMA buffer:
    void *dma_alloc_writecombine(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp);
The corresponding release function is dma_free_writecombine(). It is actually dma_free_coherent(), simply renamed with a #define.
The interface for streaming DMA mappings is more complex than that for consistent DMA mappings. For a single buffer that has already been allocated, use dma_map_single() to create a streaming DMA mapping:
    dma_addr_t dma_map_single(struct device *dev, void *buffer, size_t size, enum dma_data_direction direction);
If the mapping succeeds, the bus address is returned. Because the return type is dma_addr_t rather than a pointer, failure should be detected with dma_mapping_error() instead of by comparing against NULL. The last parameter is the DMA direction, which may be DMA_TO_DEVICE, DMA_FROM_DEVICE, DMA_BIDIRECTIONAL, or DMA_NONE. The corresponding inverse function is:
    void dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size, enum dma_data_direction direction);
Normally, a device driver should not access a streaming DMA buffer until it has been unmapped. If the driver really must access it while it is still mapped, the driver takes on the responsibility of first obtaining ownership of the DMA buffer with the following function:
    void dma_sync_single_for_cpu(struct device *dev, dma_addr_t bus_addr, size_t size, enum dma_data_direction direction);
After the driver has accessed the DMA buffer, it should hand ownership back to the device with the following function:
    void dma_sync_single_for_device(struct device *dev, dma_addr_t bus_addr, size_t size, enum dma_data_direction direction);
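The ownership hand-off can be pictured as a small state machine. The sketch below is a user-space illustration with hypothetical `toy_` names; it only tracks who owns the buffer and performs none of the cache maintenance that the real calls carry out.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_owner { TOY_UNMAPPED, TOY_OWNED_BY_DEVICE, TOY_OWNED_BY_CPU };

struct toy_stream_buf {
    enum toy_owner owner;
};

/* Mapping hands the buffer to the device. */
static void toy_map(struct toy_stream_buf *b)
{
    assert(b->owner == TOY_UNMAPPED);
    b->owner = TOY_OWNED_BY_DEVICE;
}

/* The driver borrows the buffer while it is still mapped. */
static void toy_sync_for_cpu(struct toy_stream_buf *b)
{
    assert(b->owner == TOY_OWNED_BY_DEVICE);
    b->owner = TOY_OWNED_BY_CPU;
}

/* The driver gives the buffer back to the device. */
static void toy_sync_for_device(struct toy_stream_buf *b)
{
    assert(b->owner == TOY_OWNED_BY_CPU);
    b->owner = TOY_OWNED_BY_DEVICE;
}

/* Unmapping ends the mapping; the buffer is ordinary memory again. */
static void toy_unmap(struct toy_stream_buf *b)
{
    assert(b->owner != TOY_UNMAPPED);
    b->owner = TOY_UNMAPPED;
}

/* The CPU may touch the buffer only while the device does not own it. */
static bool toy_cpu_may_access(const struct toy_stream_buf *b)
{
    return b->owner != TOY_OWNED_BY_DEVICE;
}
```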
If the device needs a large DMA buffer and supports SG mode, requesting several small, discontiguous DMA buffers is usually the way to avoid requesting one large physically contiguous region. In the Linux kernel, the following function maps an SG list:
    int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction direction);
Here nents is the number of scatterlist entries. The return value is the number of DMA buffers actually mapped, which may be less than nents. For each entry in the scatterlist, dma_map_sg() generates an appropriate bus address for the device, and it merges physically adjacent memory areas. The scatterlist structure is:
    struct scatterlist
    {
        struct page *page;
        unsigned int offset;    /* offset within the page */
        dma_addr_t dma_address; /* bus address */
        unsigned int length;    /* buffer length */
    };
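Why the mapped count can be smaller than nents is easiest to see in a small model of the merging step. This is a user-space sketch with a simplified, hypothetical `toy_seg` type; it is not the kernel's algorithm, which also depends on the architecture and on any IOMMU present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_seg {
    uint64_t addr; /* physical start address */
    uint32_t len;  /* length in bytes */
};

/*
 * Merge entries that are physically adjacent, in place.
 * Returns the number of segments left, which is <= nents.
 */
static int toy_merge_sg(struct toy_seg *sg, int nents)
{
    int out = 0; /* index of the segment currently being extended */

    assert(sg != NULL && nents > 0);
    for (int i = 1; i < nents; i++) {
        if (sg[out].addr + sg[out].len == sg[i].addr)
            sg[out].len += sg[i].len; /* adjacent: extend the current segment */
        else
            sg[++out] = sg[i];        /* gap: start a new segment */
    }
    return out + 1;
}
```

With five input entries of which two pairs are adjacent, the function reports three segments, mirroring the case where dma_map_sg() returns fewer than nents.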
After dma_map_sg() has run, sg_dma_address() returns the bus address of the buffer for a scatterlist entry, and sg_dma_len() returns the length of that buffer. Their prototypes are:
    dma_addr_t sg_dma_address(struct scatterlist *sg);
    unsigned int sg_dma_len(struct scatterlist *sg);
After the DMA transfer is complete, the mapping can be removed with dma_unmap_sg(), the inverse of dma_map_sg():
    void dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction direction);
An SG mapping is a streaming DMA mapping. As with a single-buffer streaming mapping, if the device driver must access the SG buffers while they are still mapped, it should first call the following function:
    void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction direction);
After the access is complete, ownership is returned to the device with the following function:
    void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction direction);
A relatively simple way to pre-allocate a buffer on a Linux system is to reserve memory at boot time with the "mem=" parameter. For example, on a system with 64 MB of memory, passing mem=62M on the kernel command line leaves the top 2 MB reserved for use as I/O memory. This 2 MB can be statically mapped, or mapped with ioremap().
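The arithmetic behind this example can be sketched as follows. The sketch assumes that RAM starts at physical address 0 and is contiguous, which is not true on every board, and `toy_reserved_region()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MIB (1024ull * 1024ull)

struct toy_region {
    uint64_t base; /* physical start of the reserved area */
    uint64_t size; /* size of the reserved area in bytes */
};

/* Region left unused when the kernel is limited to mem_limit bytes. */
static struct toy_region toy_reserved_region(uint64_t total_ram, uint64_t mem_limit)
{
    struct toy_region r;

    assert(mem_limit <= total_ram);
    r.base = mem_limit;             /* assumes RAM starts at physical 0 */
    r.size = total_ram - mem_limit;
    return r;
}
```

For 64 MB of RAM and a 62 MB limit, this gives a 2 MB region starting at physical address 0x03E00000, which is the address a driver would then pass to ioremap() under the same assumption.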
As with interrupts, a device driver must request a DMA channel from the system before using DMA. The function for requesting a DMA channel is:
    int request_dma(unsigned int dmanr, const char *device_id);
Similarly, the device structure pointer can serve as the argument passed in for device_id. After the driver has finished with the DMA channel, it should release the channel with the following function:
    void free_dma(unsigned int dmanr);
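The request/release pairing amounts to simple bookkeeping over a fixed set of channels. The sketch below is a user-space model with hypothetical `toy_` names; the channel count of 8 and the error codes are assumptions chosen for illustration and should be checked against the target kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_DMA_CHANNELS 8 /* assumption: ISA-style controller pair */

/* Owner name per channel; NULL means the channel is free. */
static const char *toy_dma_owner[TOY_MAX_DMA_CHANNELS];

/* Claim a channel; fails if it is out of range or already taken. */
static int toy_request_dma(unsigned int dmanr, const char *device_id)
{
    assert(device_id != NULL);
    if (dmanr >= TOY_MAX_DMA_CHANNELS)
        return -EINVAL;
    if (toy_dma_owner[dmanr] != NULL)
        return -EBUSY;
    toy_dma_owner[dmanr] = device_id;
    return 0;
}

/* Release a channel so that another driver can claim it. */
static void toy_free_dma(unsigned int dmanr)
{
    if (dmanr < TOY_MAX_DMA_CHANNELS)
        toy_dma_owner[dmanr] = NULL;
}
```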
[Linux drivers] [Linux Memory] DMA Learning Note One