Linux DMA explanation

Source: Internet
Author: User

Direct Memory Access (DMA) is a method of transferring data within a computer that does not require intervention by the central processing unit (CPU).

DMA is implemented differently on different computer architectures. This article therefore focuses on the implementation and operation of the DMA subsystem of the IBM PC, the IBM PC/AT, and all of their successors and compatibles.

The PC DMA subsystem is built on the Intel 8237 DMA controller. The 8237 contains four DMA channels. Each channel can be programmed independently, and any one of them can be active at a given moment. These channels are numbered 0, 1, 2, and 3. Starting with the PC/AT, IBM added a second 8237 chip and numbered its channels 4, 5, 6, and 7.

The original DMA controller (channels 0, 1, 2, and 3) moves one byte per transfer. The second DMA controller (channels 4, 5, 6, and 7) moves 16 bits from two adjacent memory locations per transfer, and the first byte always comes from an even-numbered address. The two controllers are identical chips. The difference in transfer size comes from the way the second controller is wired into the system.

The 8237 has two electrical signals for each channel, named DRQ and -DACK. There are also some additional signals: HRQ (Hold Request), HLDA (Hold Acknowledge), -EOP (End of Process), and the bus control signals -MEMR (Memory Read), -MEMW (Memory Write), -IOR (I/O Read), and -IOW (I/O Write).

The 8237 is what is called a "fly-by" DMA controller. This means that the data being moved neither passes through the DMA chip nor is stored in it. As a consequence, the DMA can only transfer data between an I/O port and a memory address, not between two I/O ports or two memory locations.

Note: The 8237 does allow two channels to be connected together for memory-to-memory transfers in a non-"fly-by" mode. However, nobody in the PC industry uses it this way, because it is faster to move data between memory locations with the CPU.

In the PC architecture, a DMA channel is normally activated only when the hardware that uses that channel requests a transfer by asserting the channel's DRQ line.

9.1.1 An example of a DMA transfer

This example shows the steps that trigger and carry out a DMA transfer. In this example, the floppy disk controller (FDC) has just read a byte from a diskette and wants the DMA to place it in memory at location 0x00123456. The process begins when the FDC asserts the DRQ2 signal (the DRQ line for DMA channel 2) to alert the DMA controller.

The DMA controller notes that DRQ2 is asserted. It then makes sure that DMA channel 2 has been programmed and is unmasked (enabled). It also makes sure that no other DMA channel that is active, or wants to be active, has a higher priority. Once these checks are complete, the DMA asks the CPU to release the bus so that the DMA may use it. The DMA requests the bus by asserting the HRQ signal, which goes to the CPU.

  

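The CPU detects the HRQ signal and finishes the instruction it is executing. It then releases the bus by tri-stating its own bus control signals and asserts HLDA to tell the DMA controller that it now controls the bus. Depending on the processor, the CPU may be able to execute a few more instructions after giving up the bus. However, it has to wait as soon as it reaches an instruction that must read something from memory that is not already in the internal processor cache or pipeline.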

Now that the DMA "is in charge", it activates its -MEMR, -MEMW, -IOR, and -IOW output signals, and the address outputs of the DMA are set to 0x3456. This address directs the byte that is about to be transferred to a specific memory location.

The DMA then lets the device that requested the transfer know that the transfer is starting. It does this by asserting the -DACK signal. In the case of the floppy disk controller, -DACK2 is asserted.

The floppy disk controller is now responsible for placing the byte to be transferred on the bus data lines. Unless the floppy controller needs more time to get the byte onto the bus (in which case it alerts the DMA through the READY signal), the DMA waits one DMA clock and then de-asserts the -MEMW and -IOR signals. This makes the memory latch and store the byte that was on the bus, and it tells the FDC that the byte has been transferred.

Because the DMA cycle transfers only one byte at a time, the FDC now drops the DRQ2 signal, so the DMA knows that it is no longer needed. The DMA de-asserts -DACK2 so that the FDC knows it must stop placing data on the bus.

The DMA now checks whether any other DMA channel has work to do. If no channel has its DRQ line asserted, the DMA controller has finished its work and tri-states the -MEMR, -MEMW, -IOR, -IOW, and address signals.

Finally, the DMA de-asserts the HRQ signal. The CPU sees this and de-asserts the HLDA signal. The CPU then reactivates its own -MEMR, -MEMW, -IOR, -IOW, and address lines, and resumes executing instructions and accessing main memory and the peripherals.

For a typical floppy disk sector, the above process is repeated 512 times, once for each byte. Each time a byte is transferred, the address register in the DMA is incremented and the counter that holds the number of bytes still to be transferred is decremented.

When the counter reaches zero, the DMA asserts the EOP signal. This indicates that the counter has reached zero and that no more data will be transferred until the CPU reprograms the DMA controller. This event is also called the Terminal Count (TC). There is only one EOP signal, and since only one DMA channel can be active at any instant, the channel that is currently active must be the one that just completed its task.
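The register bookkeeping described above can be illustrated with a small simulation. This is only a sketch: the function name `simulate_sector_transfer` is hypothetical, and the counter here simply holds the number of bytes remaining, which is a simplification of how the real chip is loaded.

```python
SECTOR_SIZE = 512  # bytes in a typical floppy sector, as in the text


def simulate_sector_transfer(start_offset, length=SECTOR_SIZE):
    """Model the DMA address and count registers for one buffer.

    Simplified model (an assumption of this sketch): `remaining` holds the
    number of bytes still to move, and terminal count is reported when it
    reaches zero.
    """
    address = start_offset & 0xFFFF   # the 8237 address register is 16 bits
    remaining = length
    cycles = 0
    while remaining > 0:
        cycles += 1                        # one DRQ / -DACK cycle per byte
        address = (address + 1) & 0xFFFF   # address register is incremented
        remaining -= 1                     # transfer counter is decremented
    terminal_count = remaining == 0        # EOP would be asserted here
    return cycles, address, terminal_count


cycles, next_address, tc = simulate_sector_transfer(0x3456)
print(cycles, hex(next_address), tc)  # 512 0x3656 True
```

After the 512 cycles the address register points just past the sector buffer, and the channel stays idle until it is reprogrammed.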

If a peripheral wants to generate an interrupt when the transfer of a buffer is complete, it can test for its -DACKn signal and the EOP signal being asserted at the same time. When that happens, it means the DMA will not transfer any more data for that peripheral without CPU intervention. The peripheral can then assert one of the interrupt signals to get the processor's attention. In the PC architecture, the DMA chip itself cannot generate an interrupt. The peripheral and its associated hardware are responsible for generating any interrupt that occurs. Consequently, it is possible to have a peripheral that uses DMA but does not use interrupts.

It is important to understand that although the CPU always releases the bus to the DMA when the DMA requests it, this action is invisible to both applications and the operating system, apart from slight changes in the time the processor takes to execute instructions while the DMA is active. Consequently, the processor must poll the peripheral, poll the registers in the DMA chip, or receive an interrupt from the peripheral to know for certain that a DMA transfer has completed.

9.1.2 DMA page registers and the 16 megabyte address space limit

You may have noticed that, although we said earlier that the DMA would set the address lines to 0x00123456, the DMA actually set only 0x3456. The reason for this takes a little explaining.

When the original IBM PC was designed, IBM chose to use DMA and interrupt controller chips that had been designed for the 8085, an 8-bit processor with an address space of only 16 bits (64 KB). Since the IBM PC supported more than 64 KB of memory, something had to be done to let the DMA read or write memory locations above the 64 KB mark. IBM's solution was to add an external data latch for each DMA channel. Each latch holds the upper bits of the address to be read from or written to. Whenever a DMA channel is active, the contents of that channel's latch are placed on the address bus and kept there until the DMA operation is complete. IBM called these latches "Page Registers".

So, for the example above, the DMA puts the 0x3456 part of the address on the bus, and the Page Register for DMA channel 2 puts 0x0012xxxx on the bus. Together, these two values form the complete address of the memory location to be accessed.
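As a quick illustration of how the two parts fit together, the following sketch splits and recombines the example address. The helper names `split_address` and `physical_address` are hypothetical, and an 8-bit page register (as on the PC/AT) is assumed.

```python
def split_address(address):
    """Split a 24-bit physical address into (page register, 8237 offset)."""
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("address does not fit in 24 bits")
    return address >> 16, address & 0xFFFF


def physical_address(page, offset):
    """Combine a page register value with the 16-bit address from the 8237."""
    if not 0 <= page <= 0xFF:
        raise ValueError("page register holds 8 bits in this sketch")
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("the 8237 address register holds 16 bits")
    # The two parts never overlap, so OR-ing them is enough.
    return (page << 16) | offset


page, offset = split_address(0x00123456)
print(hex(page), hex(offset))               # 0x12 0x3456
print(hex(physical_address(page, offset)))  # 0x123456
```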

Because the Page Register latch is independent of the DMA chip, the area of memory to be read or written must not cross a 64 KB physical boundary. For example, if the DMA accesses memory location 0xFFFF, it increments the address register after that transfer and then accesses the next byte at location 0x0000, not 0x10000. The result is probably not what was intended.

Note: The 64 KB "physical" boundaries should not be confused with the 64 KB "segments" of 8086 mode, which are formed by arithmetically adding a segment register to an offset register. Page Registers have no address overlap and are simply OR-ed together with the DMA address.
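The wrap-around described above can be checked before a transfer is set up. The sketch below uses hypothetical helper names and models only the behaviour stated in the text: the page part stays fixed while the 16-bit offset wraps.

```python
def crosses_64k_boundary(address, length):
    """Return True if [address, address + length) spans a 64 KB boundary."""
    if length <= 0:
        raise ValueError("length must be positive")
    last = address + length - 1
    return (address >> 16) != (last >> 16)


def next_dma_address(address):
    """Address actually driven after one increment: only the offset changes."""
    page_part = address & ~0xFFFF
    return page_part | ((address + 1) & 0xFFFF)


print(crosses_64k_boundary(0x123456, 512))  # False
print(crosses_64k_boundary(0x12FF00, 512))  # True
print(hex(next_dma_address(0x12FFFF)))      # 0x120000
```

The last line shows the problem case: the byte after 0x12FFFF lands at 0x120000 rather than 0x130000.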

To complicate matters further, the external DMA address latches on the PC/AT hold only eight bits, which gives 8 + 16 = 24 bits in total. This means that the DMA can only address memory between 0 and 16 MB. On newer computers with more than 16 MB of memory, the standard PC-compatible DMA cannot access memory locations above 16 MB.

To get around this restriction, the operating system reserves a buffer in memory that lies below 16 MB and does not cross a 64 KB physical boundary. The DMA is then programmed to transfer data from the peripheral into this buffer, and once the transfer is done the operating system copies the data from the buffer to its real destination.

When data is written from an address above 16 MB to a DMA-based peripheral, the data is first copied into a buffer below 16 MB, and the DMA then transfers it from that buffer to the hardware. In FreeBSD, these reserved buffers are called "bounce buffers". In the MS-DOS world, they are sometimes called "smart buffers".
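The decision of whether a transfer needs such an intermediate buffer follows directly from the two limits above. The following is a minimal sketch, not how any particular operating system implements it; `needs_bounce_buffer` is a hypothetical name.

```python
DMA_LIMIT = 1 << 24  # 8 page bits + 16 address bits = 24 bits = 16 MB


def needs_bounce_buffer(address, length):
    """Return True if the standard PC DMA cannot reach this buffer directly.

    A buffer is unusable if any part of it lies at or above 16 MB, or if it
    crosses a 64 KB physical boundary.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    last = address + length - 1
    above_limit = last >= DMA_LIMIT
    crosses_boundary = (address >> 16) != (last >> 16)
    return above_limit or crosses_boundary


print(needs_bounce_buffer(0x00123456, 512))  # False
print(needs_bounce_buffer(0x01000000, 512))  # True, above 16 MB
print(needs_bounce_buffer(0x0012FF00, 512))  # True, crosses a 64 KB boundary
```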

Note: A newer implementation of the 8237, called the 82374, provides 16 bits of page register and allows access to the entire 32-bit address space without the use of bounce buffers.

9.1.3 DMA operating modes and settings

The 8237 DMA can be operated in several modes. The main ones are:

Single

A single byte (or word) is transferred. The DMA must release and re-acquire the bus for each additional byte. This mode is commonly used by devices that cannot transfer an entire block of data at once. The peripheral requests the DMA each time it is ready for another transfer.

The standard PC-compatible floppy disk controller (NEC 765) has only a one-byte buffer, so it uses this mode.

Block/demand

Once the DMA acquires the system bus, an entire block of data is transferred, up to a maximum of 64 KB. If the peripheral needs additional time, it can assert the READY signal to suspend the transfer briefly. READY should not be used excessively. For slow peripherals, Single transfer mode should be used instead.

The difference between Block and Demand is that once a Block transfer has started, it runs until the transfer count reaches zero, and DRQ only needs to stay asserted until -DACK is asserted. In Demand mode, bytes are transferred until DRQ is de-asserted, at which point the DMA suspends the transfer and releases the bus back to the CPU. When DRQ is asserted again later, the transfer resumes where it was suspended.

Older hard disk controllers used Demand mode until CPU speeds increased to the point where it became more efficient to transfer the data with the CPU, particularly when the memory locations used in the transfer were above the 16 MB mark.

Cascade

This mechanism allows a DMA channel to request the bus, but the attached peripheral, rather than the DMA, is then responsible for placing the addressing information on the bus. It is also used to implement a technique known as "bus mastering".

When a DMA channel in Cascade mode receives control of the bus, the DMA does not place addresses and I/O control signals on the bus as it normally does when it is active. Instead, the DMA only asserts the -DACK signal for the active DMA channel.

At this point it is up to the peripheral connected to that DMA channel to provide the address and bus control signals. The peripheral has complete control over the system bus and can read or write any address below 16 MB. When the peripheral has finished with the bus, it de-asserts the DRQ line, and the DMA controller can then return control to the CPU or to another DMA channel.

Cascade mode can be used to chain multiple DMA controllers together, and this is exactly what DMA channel 4 is used for in the PC architecture. When a peripheral requests the bus on DMA channel 0, 1, 2, or 3, the slave DMA controller asserts HLDREQ, but this signal goes to the master DMA controller rather than to the CPU. The master DMA controller, believing it has work to do on channel 4, requests the bus from the CPU using its own HLDREQ signal. Once the CPU grants the bus to the master DMA controller, the master asserts its -DACK signal for channel 4, which is wired to the HLDA input of the slave DMA controller. The slave DMA controller then transfers data for the channel that requested it (0, 1, 2, or 3), or it grants the bus to a peripheral that wants to perform its own bus mastering, such as a SCSI controller.

Because of this wiring arrangement, only DMA channels 0, 1, 2, 3, 5, 6, and 7 are available to peripherals on PC/AT systems.

Note: In early IBM PC computers, DMA channel 0 was reserved for memory refresh operations.

When a peripheral is performing bus mastering, it is important that it transfers data to or from memory continuously while it holds the system bus. If it cannot do so, it must release the bus frequently so that the system can refresh main memory.

The dynamic RAM used for main memory in all PCs must be accessed frequently so that the bits stored in it remain "charged". Dynamic RAM essentially consists of millions of capacitors, each of which holds one bit of data. A capacitor is charged to represent a 1 and drained to represent a 0. Because all capacitors leak, charge must be added at regular intervals to keep the 1 values intact. The RAM chips themselves carry out this task, but the rest of the system must tell them when to do it so that the refresh does not interfere with normal memory accesses. If the computer is unable to refresh memory, the contents of memory become corrupted within a short time.

Because memory read and write cycles "count" as refresh cycles (a dynamic RAM refresh cycle is actually an incomplete memory read cycle), memory is refreshed as long as the peripheral controller keeps reading or writing data at sequential memory addresses.

Bus mastering is found in some SCSI host adapters and other high-performance peripheral controllers.

Autoinitialize

This mode causes the DMA to perform Byte, Block, or Demand transfers, but when the DMA transfer counter reaches zero, the counter and the address are reset to the values they had when the DMA channel was originally programmed. This means that as long as the peripheral keeps requesting transfers, they will be granted.

This technique is frequently used with audio devices that have only a small hardware "sample" buffer or none at all. Managing this "circular" buffer adds CPU overhead, but in some cases it is the only way to eliminate the latency that occurs when the DMA counter reaches zero and the DMA stops transferring until it is reprogrammed.
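The effect of autoinitialization on the address and counter can be seen in a small model. This is a simplified sketch: the class `AutoInitChannel` is hypothetical and, as an assumption of the model, counts remaining bytes directly rather than mirroring the real register encoding.

```python
class AutoInitChannel:
    """Toy model of one DMA channel with optional autoinitialize."""

    def __init__(self, base_address, length, auto_init=True):
        self.base_address = base_address & 0xFFFF  # value programmed by the CPU
        self.base_count = length                   # value programmed by the CPU
        self.auto_init = auto_init
        self.address = self.base_address
        self.remaining = length
        self.terminal_counts = 0

    def transfer_byte(self):
        """Service one DRQ; return the address used, or None if stopped."""
        if self.remaining == 0:
            return None                    # channel waits to be reprogrammed
        used = self.address
        self.address = (self.address + 1) & 0xFFFF
        self.remaining -= 1
        if self.remaining == 0:
            self.terminal_counts += 1      # terminal count reached
            if self.auto_init:
                # Reload the originally programmed values: circular buffer.
                self.address = self.base_address
                self.remaining = self.base_count
        return used


ring = AutoInitChannel(0x1000, 4)
print([hex(ring.transfer_byte()) for _ in range(6)])
# ['0x1000', '0x1001', '0x1002', '0x1003', '0x1000', '0x1001']

one_shot = AutoInitChannel(0x1000, 4, auto_init=False)
print([one_shot.transfer_byte() for _ in range(6)][4:])  # [None, None]
```

With autoinitialize enabled the channel wraps back to the start of the buffer on its own, while the one-shot channel stops after four bytes.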

9.1.4 DMA Programming

The DMA channel to be programmed should always be "masked" before any settings are loaded. This is because the hardware might unexpectedly assert DRQ for that channel, and the DMA might respond even though not all of the parameters have been loaded or updated.

Once the channel is masked, the host must specify the direction of the transfer (memory to I/O or I/O to memory) and the mode of DMA operation to be used (Single, Block, Demand, Cascade, and so on), and finally load the address and the length of the transfer. The length that is loaded is one less than the amount of data you expect the DMA to transfer. The LSB and MSB of the address and of the length are written to the same 8-bit I/O port, so another port must be written first to guarantee that the DMA accepts the first byte as the LSB and the second byte as the MSB.

Then be sure to update the Page Register, which is external to the DMA and is accessed through a different set of I/O ports.

Once all the settings are ready, the DMA channel can be unmasked. The channel is now considered "armed" and will respond when its DRQ line is asserted.
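The programming order described above can be written down as a list of port writes. The sketch below only builds the list and does not touch hardware. The port numbers for DMA controller 1 are the standard PC ones, while the mode and mask bit values are assumptions taken from the usual 8237 register layout and should be checked against the data book. The function name `program_channel` is hypothetical.

```python
MASK_PORT = 0x0A             # single mask register bit
MODE_PORT = 0x0B             # mode register
CLEAR_FLIPFLOP_PORT = 0x0C   # clear LSB/MSB flip-flop
ADDRESS_PORTS = {0: 0x00, 1: 0x02, 2: 0x04, 3: 0x06}
COUNT_PORTS = {0: 0x01, 1: 0x03, 2: 0x05, 3: 0x07}
PAGE_PORTS = {0: 0x87, 1: 0x83, 2: 0x81, 3: 0x82}

# Assumed 8237 bit values (verify against the hardware data book).
MASK_SET = 0x04              # set the mask bit for the selected channel
MODE_SINGLE = 0x40           # Single transfer mode
MODE_IO_TO_MEMORY = 0x04     # device writes to memory
MODE_MEMORY_TO_IO = 0x08     # device reads from memory


def program_channel(channel, address, length, to_memory=True):
    """Return the (port, value) writes that arm an 8-bit DMA channel."""
    if channel not in ADDRESS_PORTS:
        raise ValueError("this sketch covers channels 0-3 only")
    if not 1 <= length <= 0x10000:
        raise ValueError("length must be between 1 and 64 KB")
    last = address + length - 1
    if last > 0xFFFFFF or (address >> 16) != (last >> 16):
        raise ValueError("buffer must lie below 16 MB within one 64 KB page")
    count = length - 1       # the chip is loaded with length minus one
    direction = MODE_IO_TO_MEMORY if to_memory else MODE_MEMORY_TO_IO
    return [
        (MASK_PORT, MASK_SET | channel),                  # 1. mask the channel
        (MODE_PORT, MODE_SINGLE | direction | channel),   # 2. direction and mode
        (CLEAR_FLIPFLOP_PORT, 0x00),                      # 3. next byte is an LSB
        (ADDRESS_PORTS[channel], address & 0xFF),         # 4. address LSB
        (ADDRESS_PORTS[channel], (address >> 8) & 0xFF),  #    address MSB
        (COUNT_PORTS[channel], count & 0xFF),             # 5. count LSB
        (COUNT_PORTS[channel], (count >> 8) & 0xFF),      #    count MSB
        (PAGE_PORTS[channel], (address >> 16) & 0xFF),    # 6. page register
        (MASK_PORT, channel),                             # 7. unmask: armed
    ]


for port, value in program_channel(2, 0x00123456, 512):
    print(f"out 0x{port:02x}, 0x{value:02x}")
```

On real hardware each pair would become a port write issued by privileged code. Returning the list keeps the sketch runnable anywhere and makes the ordering easy to inspect.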

Refer to a hardware data book for precise 8237 programming details. You will also need the I/O port map of the PC system, which describes where the DMA and Page Register ports are located. A complete port map follows.

9.1.5 DMA port map

All systems based on the IBM PC and the PC/AT have the DMA hardware at the same I/O ports. The complete list is given below. The ports assigned to DMA controller 2 are undefined on non-AT designs.

9.1.5.1 0x00-0x1f DMA controller 1 (channels 0, 1, 2, and 3)

DMA address and count registers

0x00  Write  Channel 0 starting address
0x00  Read   Channel 0 current address
0x01  Write  Channel 0 starting word count
0x01  Read   Channel 0 remaining word count
0x02  Write  Channel 1 starting address
0x02  Read   Channel 1 current address
0x03  Write  Channel 1 starting word count
0x03  Read   Channel 1 remaining word count
0x04  Write  Channel 2 starting address
0x04  Read   Channel 2 current address
0x05  Write  Channel 2 starting word count
0x05  Read   Channel 2 remaining word count
0x06  Write  Channel 3 starting address
0x06  Read   Channel 3 current address
0x07  Write  Channel 3 starting word count
0x07  Read   Channel 3 remaining word count

DMA command registers

0x08  Write  Command register
0x08  Read   Status register
0x09  Write  Request register
0x09  Read   -
0x0a  Write  Single mask register bit
0x0a  Read   -
0x0b  Write  Mode register
0x0b  Read   -
0x0c  Write  Clear LSB/MSB flip-flop
0x0c  Read   -
0x0d  Write  Master clear/reset
0x0d  Read   Temporary register (not available in newer versions)
0x0e  Write  Clear mask register
0x0e  Read   -
0x0f  Write  Write all mask register bits
0x0f  Read   Read all mask register bits (only in Intel 82374)

9.1.5.2 0xc0-0xdf DMA controller 2 (channels 4, 5, 6, and 7)

DMA address and count registers

0xc0  Write  Channel 4 starting address
0xc0  Read   Channel 4 current address
0xc2  Write  Channel 4 starting word count
0xc2  Read   Channel 4 remaining word count
0xc4  Write  Channel 5 starting address
0xc4  Read   Channel 5 current address
0xc6  Write  Channel 5 starting word count
0xc6  Read   Channel 5 remaining word count
0xc8  Write  Channel 6 starting address
0xc8  Read   Channel 6 current address
0xca  Write  Channel 6 starting word count
0xca  Read   Channel 6 remaining word count
0xcc  Write  Channel 7 starting address
0xcc  Read   Channel 7 current address
0xce  Write  Channel 7 starting word count
0xce  Read   Channel 7 remaining word count

DMA command registers

0xd0  Write  Command register
0xd0  Read   Status register
0xd2  Write  Request register
0xd2  Read   -
0xd4  Write  Single mask register bit
0xd4  Read   -
0xd6  Write  Mode register
0xd6  Read   -
0xd8  Write  Clear LSB/MSB flip-flop
0xd8  Read   -
0xda  Write  Master clear/reset
0xda  Read   Temporary register (not available in Intel 82374)
0xdc  Write  Clear mask register
0xdc  Read   -
0xde  Write  Write all mask register bits
0xdf  Read   Read all mask register bits (only in Intel 82374)
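The address and count ports of both controllers follow a regular pattern: controller 1 uses consecutive ports starting at 0x00, and controller 2 uses every second port starting at 0xc0. A small helper can compute them instead of hard-coding a table. This is a sketch based on that pattern only, and `channel_ports` is a hypothetical name.

```python
def channel_ports(channel):
    """Return (address_port, count_port) for DMA channels 0-7."""
    if 0 <= channel <= 3:
        base, step, local = 0x00, 1, channel       # controller 1
    elif 4 <= channel <= 7:
        base, step, local = 0xC0, 2, channel - 4   # controller 2
    else:
        raise ValueError("DMA channel must be in the range 0-7")
    address_port = base + step * (2 * local)
    count_port = base + step * (2 * local + 1)
    return address_port, count_port


for ch in (2, 5, 7):
    addr, cnt = channel_ports(ch)
    print(ch, hex(addr), hex(cnt))
# 2 0x4 0x5
# 5 0xc4 0xc6
# 7 0xcc 0xce
```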

9.1.5.3 0x80-0x9f DMA page registers

0x87  Read/write  Channel 0 low byte (bits 23-16) page register
0x83  Read/write  Channel 1 low byte (bits 23-16) page register
0x81  Read/write  Channel 2 low byte (bits 23-16) page register
0x82  Read/write  Channel 3 low byte (bits 23-16) page register
0x8b  Read/write  Channel 5 low byte (bits 23-16) page register
0x89  Read/write  Channel 6 low byte (bits 23-16) page register
0x8a  Read/write  Channel 7 low byte (bits 23-16) page register
0x8f  Read/write  Low byte page refresh

9.1.5.4 0x400-0x4ff 82374 enhanced DMA registers

The Intel 82374 EISA System Component (ESC) appeared in 1996. It combines a DMA controller that provides a superset of the 8237 functionality with other PC-compatible core peripheral components in a single package. The chip is aimed at both EISA and PCI platforms and provides modern DMA features such as scatter-gather, ring buffers, and direct access by the system DMA to the entire 32-bit address space.

If these features are used, code should also be included that provides similar functionality on the previous 16 years' worth of PC-compatible computers. For compatibility reasons, some of the 82374 registers must be programmed after the traditional 8237 registers have been programmed for each transfer. Writing to a traditional 8237 register forces the contents of some of the 82374 enhanced registers to zero in order to provide backward software compatibility.

0x401        Read/write  Channel 0 high byte (bits 23-16) word count
0x403        Read/write  Channel 1 high byte (bits 23-16) word count
0x405        Read/write  Channel 2 high byte (bits 23-16) word count
0x407        Read/write  Channel 3 high byte (bits 23-16) word count
0x4c6        Read/write  Channel 5 high byte (bits 23-16) word count
0x4ca        Read/write  Channel 6 high byte (bits 23-16) word count
0x4ce        Read/write  Channel 7 high byte (bits 23-16) word count
0x487        Read/write  Channel 0 high byte (bits 31-24) page register
0x483        Read/write  Channel 1 high byte (bits 31-24) page register
0x481        Read/write  Channel 2 high byte (bits 31-24) page register
0x482        Read/write  Channel 3 high byte (bits 31-24) page register
0x48b        Read/write  Channel 5 high byte (bits 31-24) page register
0x489        Read/write  Channel 6 high byte (bits 31-24) page register
0x48a        Read/write  Channel 7 high byte (bits 31-24) page register
0x48f        Read/write  High byte page refresh
0x4e0        Read/write  Channel 0 stop register (bits 7-2)
0x4e1        Read/write  Channel 0 stop register (bits 15-8)
0x4e2        Read/write  Channel 0 stop register (bits 23-16)
0x4e4        Read/write  Channel 1 stop register (bits 7-2)
0x4e5        Read/write  Channel 1 stop register (bits 15-8)
0x4e6        Read/write  Channel 1 stop register (bits 23-16)
0x4e8        Read/write  Channel 2 stop register (bits 7-2)
0x4e9        Read/write  Channel 2 stop register (bits 15-8)
0x4ea        Read/write  Channel 2 stop register (bits 23-16)
0x4ec        Read/write  Channel 3 stop register (bits 7-2)
0x4ed        Read/write  Channel 3 stop register (bits 15-8)
0x4ee        Read/write  Channel 3 stop register (bits 23-16)
0x4f4        Read/write  Channel 5 stop register (bits 7-2)
0x4f5        Read/write  Channel 5 stop register (bits 15-8)
0x4f6        Read/write  Channel 5 stop register (bits 23-16)
0x4f8        Read/write  Channel 6 stop register (bits 7-2)
0x4f9        Read/write  Channel 6 stop register (bits 15-8)
0x4fa        Read/write  Channel 6 stop register (bits 23-16)
0x4fc        Read/write  Channel 7 stop register (bits 7-2)
0x4fd        Read/write  Channel 7 stop register (bits 15-8)
0x4fe        Read/write  Channel 7 stop register (bits 23-16)
0x40a        Write       Channels 0-3 chaining mode register
0x40a        Read        Channel interrupt status register
0x4d4        Write       Channels 4-7 chaining mode register
0x4d4        Read        Chaining mode status
0x40c        Read        Chain buffer expiration control register
0x410        Write       Channel 0 scatter-gather command register
0x411        Write       Channel 1 scatter-gather command register
0x412        Write       Channel 2 scatter-gather command register
0x413        Write       Channel 3 scatter-gather command register
0x415        Write       Channel 5 scatter-gather command register
0x416        Write       Channel 6 scatter-gather command register
0x417        Write       Channel 7 scatter-gather command register
0x418        Read        Channel 0 scatter-gather status register
0x419        Read        Channel 1 scatter-gather status register
0x41a        Read        Channel 2 scatter-gather status register
0x41b        Read        Channel 3 scatter-gather status register
0x41d        Read        Channel 5 scatter-gather status register
0x41e        Read        Channel 6 scatter-gather status register
0x41f        Read        Channel 7 scatter-gather status register
0x420-0x423  Read/write  Channel 0 scatter-gather descriptor table pointer register
0x424-0x427  Read/write  Channel 1 scatter-gather descriptor table pointer register
0x428-0x42b  Read/write  Channel 2 scatter-gather descriptor table pointer register
0x42c-0x42f  Read/write  Channel 3 scatter-gather descriptor table pointer register
0x434-0x437  Read/write  Channel 5 scatter-gather descriptor table pointer register
0x438-0x43b  Read/write  Channel 6 scatter-gather descriptor table pointer register
0x43c-0x43f  Read/write  Channel 7 scatter-gather descriptor table pointer register
