Direct Memory Access (DMA) is a method of transferring data within a computer without the intervention of the central processor (CPU).
DMA is implemented differently on different computer architectures, so this discussion is limited to the implementation and operation of the DMA subsystem on the IBM PC, the IBM PC/AT, and all of their successors and compatibles.
The PC DMA subsystem is built on the Intel 8237 DMA controller. The 8237 contains four DMA channels that can be programmed independently, and any one of the channels may be active at any moment. These channels are numbered 0, 1, 2, and 3. Starting with the PC/AT, IBM added a second 8237 chip and numbered its channels 4, 5, 6, and 7.
The original DMA controller (channels 0, 1, 2, and 3) moves one byte in each transfer. The second DMA controller (channels 4, 5, 6, and 7) moves 16 bits from two adjacent memory locations in each transfer, with the first 8 bits always coming from an even-numbered address. The two controllers are identical components; the difference in transfer size comes from the way the second controller is wired into the system.
The 8237 has two electrical signals for each channel, named DRQ and -DACK. There are additional signals named HRQ (Hold Request), HLDA (Hold Acknowledge), -EOP (End of Process), and the bus control signals -MEMR (Memory Read), -MEMW (Memory Write), -IOR (I/O Read), and -IOW (I/O Write).
The 8237 is known as a "fly-by" DMA controller. This means that the data being moved from one location to another does not pass through the DMA chip and is not stored in the DMA chip. As a consequence, the DMA can only transfer data between an I/O port and a memory address, not between two I/O ports or two memory locations.
Note: The 8237 does allow two channels to be connected together to perform memory-to-memory transfers in a non-"fly-by" mode. However, nobody in the PC industry uses this mode, because it is faster to move data between memory locations with the CPU.
In the PC architecture, a DMA channel is normally activated only when the hardware that uses that channel requests a transfer by asserting the DRQ line for the channel.
9.1.1 A Sample DMA Transfer
This example shows the steps that trigger and perform a DMA transfer. In this example, the floppy disk controller (FDC) has just read a byte from a diskette and wants the DMA to place it in memory at location 0x00123456. The process begins when the FDC asserts the DRQ2 signal (the DRQ line for DMA channel 2) to alert the DMA controller.
The DMA controller notes that DRQ2 is asserted. It then checks that DMA channel 2 has been programmed and is unmasked (enabled). It also checks that no other DMA channel with a higher priority is active or wants to be active. Once these checks are complete, the DMA asks the CPU to release the bus so that the DMA can use it. The DMA requests the bus by asserting the HRQ signal, which goes to the CPU.
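The CPU detects the HRQ signal and finishes executing the current instruction. Once the processor reaches a state in which it can release the bus, it places the signals it normally drives (-MEMR, -MEMW, -IOR, -IOW, and a few others) in a tri-stated condition and asserts the HLDA signal, which tells the DMA controller that it is now in charge of the bus. Depending on the processor, the CPU may be able to execute a few more instructions while it no longer has the bus, but it will eventually have to wait when it reaches an instruction that must read something from memory that is not in the internal processor cache or pipeline.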
Now that the DMA is "in charge", it activates its -MEMR, -MEMW, -IOR, and -IOW output signals and sets its address outputs to 0x3456. This address directs the byte that is about to be transferred to a specific memory location.
The DMA then lets the device that requested the transfer know that the transfer is starting. It does this by asserting the -DACK signal; in the case of the floppy disk controller, -DACK2 is asserted.
The floppy disk controller is now responsible for placing the byte to be transferred on the bus data lines. Unless the floppy controller needs more time to get the data byte onto the bus (in which case the peripheral alerts the DMA with the READY signal), the DMA waits one DMA clock and then de-asserts the -MEMW and -IOR signals so that the memory latches and stores the byte that was on the bus. The FDC then knows that the byte has been transferred.
Because the DMA cycle transfers only a single byte at a time, the FDC now drops the DRQ2 signal, so the DMA knows that it is no longer needed. The DMA de-asserts the -DACK2 signal so that the FDC knows it must stop placing data on the bus.
The DMA now checks whether any other DMA channel has work to do. If none of the channels has its DRQ line asserted, the DMA controller has completed its work and tri-states the -MEMR, -MEMW, -IOR, -IOW, and address signals.
Finally, the DMA de-asserts the HRQ signal. The CPU sees this and de-asserts the HLDA signal. The CPU then reactivates its -MEMR, -MEMW, -IOR, -IOW, and address lines and resumes executing instructions and accessing main memory and the peripherals.
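The handshake described above can be summarized as an ordered list of signal events. The following Python sketch is only an illustrative model: the function name `single_byte_cycle` and the wording of the event strings are invented for this example and do not correspond to any real API.

```python
def single_byte_cycle(channel):
    """Return the ordered signal events for one single-byte DMA cycle.

    Illustrative model only; the event wording is made up for this sketch.
    """
    drq = f"DRQ{channel}"
    dack = f"-DACK{channel}"
    return [
        f"peripheral asserts {drq}",
        "DMA asserts HRQ",
        "CPU tri-states its bus signals and asserts HLDA",
        "DMA drives address and bus control signals",
        f"DMA asserts {dack}",
        "peripheral places the byte on the data lines",
        "DMA de-asserts -MEMW and -IOR; memory latches the byte",
        f"peripheral de-asserts {drq}",
        f"DMA de-asserts {dack}",
        "DMA tri-states its address and bus control signals",
        "DMA de-asserts HRQ",
        "CPU de-asserts HLDA and resumes driving the bus",
    ]


for step, event in enumerate(single_byte_cycle(2), start=1):
    print(f"{step:2d}. {event}")
```

Calling `single_byte_cycle(2)` lists the steps for the floppy controller example on channel 2.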
For a typical floppy disk sector, the above process is repeated 512 times, once for each byte. Each time a byte is transferred, the address register in the DMA is incremented and the counter in the DMA that holds the number of bytes still to be transferred is decremented.
When the counter reaches zero, the DMA asserts the EOP signal, which indicates that the counter has reached zero and that no more data will be transferred until the DMA controller is reprogrammed by the CPU. This event is also called the Terminal Count (TC). There is only one EOP signal, and since only one DMA channel can be active at any instant, the channel that is currently active must be the channel that just completed its task.
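The address and count bookkeeping for one sector can be sketched with a toy model. The class below is hypothetical and simplified: it assumes the count register is loaded with the length minus one (as described in the programming section) and that terminal count is signalled on the transfer that finds the count at zero.

```python
class DmaChannel:
    """Toy model of one 8237 channel's address and count registers."""

    def __init__(self, address, length):
        self.address = address & 0xFFFF      # 16-bit address register
        self.count = (length - 1) & 0xFFFF   # programmed as length minus one
        self.terminal_count = False

    def transfer_byte(self):
        """Account for one byte moved; return True once TC is reached."""
        if self.terminal_count:
            raise RuntimeError("channel must be reprogrammed after TC")
        self.address = (self.address + 1) & 0xFFFF
        if self.count == 0:
            self.terminal_count = True       # EOP would be asserted here
        else:
            self.count -= 1
        return self.terminal_count


channel = DmaChannel(address=0x3456, length=512)
transfers = 0
while not channel.terminal_count:
    channel.transfer_byte()
    transfers += 1
print(transfers, hex(channel.address))  # 512 0x3656
```

After 512 single-byte cycles the model reports terminal count, and the address register has advanced by 0x200.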
If a peripheral wants to generate an interrupt when the transfer of a buffer is complete, it can test for its -DACKn signal and the EOP signal both being asserted at the same time. When that happens, the DMA will not transfer any more data for that peripheral without intervention by the CPU. The peripheral can then assert one of the interrupt signals to get the processor's attention. In the PC architecture, the DMA chip itself cannot generate an interrupt; the peripheral and its associated hardware are responsible for generating any interrupt that occurs. Consequently, it is possible to have a peripheral that uses DMA but does not use interrupts.
It is important to understand that although the CPU always releases the bus to the DMA when the DMA requests it, this action is invisible to both applications and the operating system, apart from slight changes in the time the processor takes to execute instructions while the DMA is active. Consequently, the processor must poll the peripheral, poll the registers in the DMA chip, or receive an interrupt from the peripheral to know for certain when a DMA transfer has completed.
9.1.2 DMA Page Registers and the 16 MB Address Space Limit
You may have noticed that, although we said earlier that the DMA would set the address lines to 0x00123456, the DMA actually set only 0x3456. The reason for this takes a bit of explaining.
When the original IBM PC was designed, IBM chose DMA and interrupt controller chips that had been designed for use with the 8085, an 8-bit processor with an address space of only 16 bits (64 KB). Since the IBM PC supported more than 64 KB of memory, something had to be done to allow the DMA to read or write memory locations above the 64 KB mark. IBM solved this problem by adding an external data latch for each DMA channel that holds the upper bits of the address to be read or written. Whenever a DMA channel is active, the contents of that latch are placed on the address bus and kept there until the DMA operation for the channel ends. IBM called these latches "Page Registers".
So for the example above, the DMA puts the 0x3456 part of the address on the bus, and the Page Register for DMA channel 2 puts 0x0012xxxx on the bus. Together, these two values form the complete address in memory that is to be accessed.
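The way the two parts combine can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the 8-bit page register of the PC/AT; the helper names `split_dma_address` and `combine_dma_address` are hypothetical.

```python
def split_dma_address(physical):
    """Split a 24-bit physical address into (page, offset)."""
    if not 0 <= physical < (1 << 24):
        raise ValueError("address outside the 24-bit range")
    return physical >> 16, physical & 0xFFFF


def combine_dma_address(page, offset):
    """Rebuild the bus address from the page register and the 8237 address."""
    return ((page & 0xFF) << 16) | (offset & 0xFFFF)


page, offset = split_dma_address(0x00123456)
print(hex(page), hex(offset))                  # 0x12 0x3456
print(hex(combine_dma_address(page, offset)))  # 0x123456
```

The page value goes into the channel's Page Register and the offset into the 8237 address register.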
Because the Page Register latch is independent of the DMA chip, the area of memory to be read or written must not span a physical 64 KB boundary. For example, if the DMA accesses memory location 0xffff, it then increments its address register and accesses the next byte at location 0x0000, not 0x10000. The results of letting this happen are probably not what was intended.
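The wraparound and a simple check for it can be sketched as follows. The function names are hypothetical, and the model assumes that only the 16-bit address register inside the 8237 counts up while the page bits stay fixed.

```python
def next_dma_address(page, offset):
    """Address used after one byte: only the 16-bit offset counts up."""
    return (page << 16) | ((offset + 1) & 0xFFFF)


def crosses_64k_boundary(start, length):
    """True if [start, start + length) spans a physical 64 KB boundary."""
    if length <= 0:
        raise ValueError("length must be positive")
    return (start >> 16) != ((start + length - 1) >> 16)


print(hex(next_dma_address(0x12, 0xFFFF)))   # 0x120000, not 0x130000
print(crosses_64k_boundary(0x12FF00, 512))   # True
print(crosses_64k_boundary(0x123456, 512))   # False
```

A driver would typically run a check like `crosses_64k_boundary` before programming a transfer and choose a different buffer when it returns True.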
Note: "Physical" 64 KB boundaries should not be confused with the 64 KB "segments" of 8086 mode, which are created by mathematically adding a segment register to an offset register. Page Registers have no address overlap and are mathematically OR-ed together with the DMA address.
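The difference between the two schemes can be made concrete with a short comparison. This is an illustrative sketch; both function names are invented for the example.

```python
def real_mode_address(segment, offset):
    """8086 real mode: segment * 16 + offset, so many pairs overlap."""
    return ((segment << 4) + offset) & 0xFFFFF


def dma_page_address(page, offset):
    """DMA page register: page bits OR-ed above a 16-bit offset."""
    return (page << 16) | (offset & 0xFFFF)


# Two different segment:offset pairs reach the same byte.
print(real_mode_address(0x1234, 0x0010) == real_mode_address(0x1235, 0x0000))  # True

# Each page:offset pair names exactly one byte; pages are adjacent, not overlapping.
print(dma_page_address(0x01, 0x0000) == dma_page_address(0x00, 0xFFFF) + 1)    # True
```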
To complicate matters further, the external DMA address latches on the PC/AT hold only eight bits, which gives 8 + 16 = 24 address bits. This means that the DMA can only point at memory locations between 0 and 16 MB. On newer computers with more than 16 MB of memory, the standard PC-compatible DMA cannot access memory locations above 16 MB.
To get around this restriction, operating systems reserve a RAM buffer in an area below 16 MB that also does not span a physical 64 KB boundary (and is therefore no larger than 64 KB). The DMA is then programmed to transfer data from the peripheral into that buffer. Once the DMA has moved the data into the buffer, the operating system copies the data from the buffer to the address where it is really supposed to be stored.
When writing data from an address above 16 MB to a DMA-based peripheral, the data must first be copied from where it resides into a buffer located below 16 MB, and then the DMA can copy the data from the buffer to the hardware. In FreeBSD, these reserved buffers are called "Bounce Buffers". In the MS-DOS world, they are sometimes called "Smart Buffers".
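The bounce buffer decision can be sketched as follows. This is a simplified model, not how any particular operating system implements it: a `bytearray` stands in for physical RAM, slice assignments stand in for the DMA transfer and the CPU copy, and the names `dma_reachable` and `read_from_device` are hypothetical.

```python
DMA_LIMIT = 1 << 24  # 16 MB reachable with 8 + 16 address bits


def dma_reachable(start, length):
    """True if the DMA controller can address the whole region directly."""
    end = start + length - 1
    return end < DMA_LIMIT and (start >> 16) == (end >> 16)


def read_from_device(memory, dest, data, bounce_addr):
    """Store `data` at `dest`, staging through a bounce buffer if required.

    Returns True when the bounce buffer had to be used.
    """
    n = len(data)
    if dma_reachable(dest, n):
        memory[dest:dest + n] = data              # DMA straight to the target
        return False
    if not dma_reachable(bounce_addr, n):
        raise ValueError("bounce buffer is not DMA-reachable")
    memory[bounce_addr:bounce_addr + n] = data    # DMA into the bounce buffer
    memory[dest:dest + n] = memory[bounce_addr:bounce_addr + n]  # CPU copy
    return True


ram = bytearray(17 * 1024 * 1024)
used_bounce = read_from_device(ram, 0x1000100, b"sector", bounce_addr=0x20000)
print(used_bounce, bytes(ram[0x1000100:0x1000106]))  # True b'sector'
```

The extra CPU copy is the cost of the workaround, which is why drivers prefer to allocate DMA-reachable buffers in the first place when they can.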
Note: A newer implementation of the 8237, called the 82374, allows 16 bits of page register to be specified and gives access to the entire 32-bit address space without the use of bounce buffers.
9.1.3 DMA Operational Modes and Settings
The 8237 DMA can be operated in several modes. The main ones are:
-
Single
-
A single byte (or word) is transferred. The DMA must release and re-acquire the bus for each additional byte. This mode is commonly used by devices that cannot transfer an entire block of data immediately. The peripheral requests the DMA each time it is ready for another transfer.
The standard PC-compatible floppy disk controller (NEC 765) has only a one-byte buffer, so it uses this mode.
-
Block/Demand
-
Once the DMA acquires the system bus, an entire block of data is transferred, up to a maximum of 64 KB. If the peripheral needs additional time, it can assert the READY signal to suspend the transfer briefly. READY should not be used excessively; for slow peripherals, Single Transfer Mode should be used instead.
The difference between Block and Demand is that once a Block transfer has started, it runs until the transfer count reaches zero, and DRQ only needs to be asserted until -DACK is asserted. In Demand Mode, bytes are transferred until DRQ is de-asserted, at which point the DMA suspends the transfer and releases the bus back to the CPU. When DRQ is asserted again later, the transfer resumes where it was suspended.
Older hard disk controllers used Demand Mode until CPU speeds increased to the point where it was more efficient to transfer the data with the CPU, particularly if the memory locations involved were above the 16 MB mark.
-
Cascade
-
This mechanism allows a DMA channel to request the bus, but then the attached peripheral, rather than the DMA, is responsible for placing the addressing information on the bus. It is also used to implement a technique known as "Bus Mastering".
When a DMA channel in Cascade Mode receives control of the bus, the DMA does not place addresses and I/O control signals on the bus as it normally does when it is active. Instead, the DMA only asserts the -DACK signal for the active DMA channel.
At this point it is up to the peripheral connected to that DMA channel to provide the address and bus control signals. The peripheral has complete control over the system bus and can read or write any address below 16 MB. When the peripheral is finished with the bus, it de-asserts the DRQ line, and the DMA controller can then return control to the CPU or to some other DMA channel.
Cascade Mode can be used to chain multiple DMA controllers together, and this is exactly what DMA channel 4 is used for in the PC architecture. When a peripheral requests the bus on DMA channel 0, 1, 2, or 3, the slave DMA controller asserts HLDREQ, but this wire is connected to DRQ4 on the primary DMA controller instead of to the CPU. The primary DMA controller, thinking it has work to do on channel 4, requests the bus from the CPU with its own HLDREQ signal. Once the CPU grants the bus to the primary DMA controller, -DACK4 is asserted, and that wire is connected to the HLDA signal on the slave DMA controller. The slave DMA controller then transfers data for the DMA channel that requested it (0, 1, 2, or 3), or it may grant the bus to a peripheral that wants to perform its own bus mastering, such as a SCSI controller.
Because of this wiring arrangement, only DMA channels 0, 1, 2, 3, 5, 6, and 7 are usable by peripherals on PC/AT systems.
Note: DMA channel 0 was reserved for refresh operations in early IBM PC computers, but it is generally available to peripherals on later systems.
When a peripheral is performing bus mastering, it is important that it transfers data to or from memory constantly while it holds the system bus. If the peripheral cannot do this, it must release the bus frequently so that the system can perform refresh operations on main memory.
The dynamic RAM used for main memory in all PCs must be accessed frequently to keep the bits stored in it "charged". Dynamic RAM essentially consists of millions of capacitors, each of which holds one bit of data. A capacitor is charged to represent a 1 and drained to represent a 0. Because all capacitors leak, charge must be added at regular intervals to keep the 1 values intact. The RAM chips themselves handle the task of recharging the appropriate locations, but the rest of the computer must tell them when to do it so that the refresh activity does not interfere with normal memory accesses. If the computer is unable to refresh memory, the contents of memory become corrupted within a very short time.
Since memory read and write cycles "count" as refresh cycles (a dynamic RAM refresh cycle is actually an incomplete memory read cycle), all of memory is refreshed as long as the peripheral controller keeps reading or writing data at sequential memory locations.
Bus mastering is found in some SCSI host interfaces and other high-performance peripheral controllers.
-
Autoinitialize
-
This mode causes the DMA to perform Byte, Block, or Demand transfers, but when the DMA transfer counter reaches zero, the counter and address are reset to the values they had when the DMA channel was originally programmed. This means that as long as the peripheral requests transfers, they will be granted.
This technique is frequently used with audio devices that have small hardware "sample" buffers or none at all. Managing this "circular" buffer adds CPU overhead, but in some cases it is the only way to eliminate the latency that occurs when the DMA counter reaches zero and the DMA stops transferring until it is reprogrammed.
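The reload behaviour can be illustrated with a toy model. The class below is hypothetical and simplified: it assumes the count is programmed as the length minus one and that the base values are restored on the transfer that reaches terminal count.

```python
class AutoInitChannel:
    """Toy model of an 8237 channel with auto-initialize enabled."""

    def __init__(self, base_address, length):
        self.base_address = base_address & 0xFFFF
        self.base_count = (length - 1) & 0xFFFF
        self.address = self.base_address
        self.count = self.base_count
        self.wraps = 0

    def transfer_byte(self):
        """Return the address used for this byte, reloading at terminal count."""
        used = self.address
        if self.count == 0:
            # Terminal count: reload the base values instead of stopping.
            self.address = self.base_address
            self.count = self.base_count
            self.wraps += 1
        else:
            self.address = (self.address + 1) & 0xFFFF
            self.count -= 1
        return used


ch = AutoInitChannel(base_address=0x4000, length=4)
print([hex(ch.transfer_byte()) for _ in range(10)])
print(ch.wraps)  # 2
```

The printed addresses cycle through 0x4000 to 0x4003 repeatedly, which is the circular buffer behaviour an audio driver relies on.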
9.1.4 Programming the DMA
The DMA channel that is to be programmed should always be "masked" before any settings are loaded. This is because the hardware might unexpectedly assert DRQ for that channel, and the DMA might respond even though not all of the parameters have been loaded or updated.
Once the channel is masked, the host must specify the direction of the transfer (memory-to-I/O or I/O-to-memory) and the mode of DMA operation to be used (Single, Block, Demand, Cascade, and so on), and finally load the address and length of the transfer. The length that is loaded is one less than the amount you expect the DMA to transfer. The LSB and MSB of the address and length are written to the same 8-bit I/O port, so another port must be written first to guarantee that the DMA accepts the first byte as the LSB and the second byte as the MSB.
Then be sure to update the Page Register, which is external to the DMA and is accessed through a different set of I/O ports.
Once all the settings are ready, the DMA channel can be unmasked. The channel is now considered "armed" and will respond when the DRQ line for that channel is asserted.
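The programming sequence above can be sketched as a list of port writes. The port numbers come from the port map in this chapter; the mode and mask bit values are assumptions taken from the 8237 data sheet and should be verified there before use. The function `program_channel` is hypothetical, handles only the 8-bit channels 0 to 3, and merely returns the writes rather than touching hardware.

```python
# Ports for DMA controller 1, taken from the port map in this chapter.
MASK_PORT, MODE_PORT, CLEAR_FF_PORT = 0x0A, 0x0B, 0x0C
PAGE_PORT = {0: 0x87, 1: 0x83, 2: 0x81, 3: 0x82}

# Mode register bits (assumed from the 8237 data sheet).
MODE_SINGLE = 0x40         # bits 7-6 = 01: single transfer mode
MODE_AUTOINIT = 0x10       # bit 4: auto-initialize
MODE_IO_TO_MEMORY = 0x04   # bits 3-2 = 01: device to RAM
MODE_MEMORY_TO_IO = 0x08   # bits 3-2 = 10: RAM to device


def program_channel(channel, address, length, mode):
    """Return the (port, value) writes that arm an 8-bit DMA channel."""
    if channel not in PAGE_PORT:
        raise ValueError("only channels 0-3 are handled in this sketch")
    end = address + length - 1
    if length < 1 or end >= (1 << 24) or (address >> 16) != (end >> 16):
        raise ValueError("buffer must lie below 16 MB within one 64 KB page")
    count = length - 1                        # loaded as length minus one
    addr_port, count_port = 2 * channel, 2 * channel + 1
    return [
        (MASK_PORT, 0x04 | channel),          # mask the channel first
        (MODE_PORT, mode | channel),          # direction and transfer mode
        (CLEAR_FF_PORT, 0x00),                # next byte written is an LSB
        (addr_port, address & 0xFF),          # address LSB
        (addr_port, (address >> 8) & 0xFF),   # address MSB
        (count_port, count & 0xFF),           # count LSB
        (count_port, (count >> 8) & 0xFF),    # count MSB
        (PAGE_PORT[channel], (address >> 16) & 0xFF),
        (MASK_PORT, channel),                 # unmask: the channel is armed
    ]


for port, value in program_channel(2, 0x123456, 512,
                                   MODE_SINGLE | MODE_IO_TO_MEMORY):
    print(f"out 0x{port:02x}, 0x{value:02x}")
```

In a real driver each pair would be issued with the platform's port output primitive while interrupts are handled appropriately; returning the list keeps the sketch testable without hardware.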
Refer to a hardware data book for precise programming details for the 8237. You will also need to refer to the I/O port map for the PC system, which describes where the DMA and Page Register ports are located. A complete port map table is given below.
9.1.5 DMA Port Map
All systems based on the IBM PC and PC/AT have the DMA hardware located at the same I/O ports. The complete list is given below. The ports assigned to DMA controller 2 are undefined on non-AT designs.
9.1.5.1 0x00-0x1f DMA Controller 1 (Channels 0, 1, 2, and 3)
DMA Address and Count Registers
Port | Access | Description
0x00 | Write | Channel 0 starting address
0x00 | Read | Channel 0 current address
0x01 | Write | Channel 0 starting word count
0x01 | Read | Channel 0 remaining word count
0x02 | Write | Channel 1 starting address
0x02 | Read | Channel 1 current address
0x03 | Write | Channel 1 starting word count
0x03 | Read | Channel 1 remaining word count
0x04 | Write | Channel 2 starting address
0x04 | Read | Channel 2 current address
0x05 | Write | Channel 2 starting word count
0x05 | Read | Channel 2 remaining word count
0x06 | Write | Channel 3 starting address
0x06 | Read | Channel 3 current address
0x07 | Write | Channel 3 starting word count
0x07 | Read | Channel 3 remaining word count
DMA Command Registers
Port | Access | Description
0x08 | Write | Command Register
0x08 | Read | Status Register
0x09 | Write | Request Register
0x09 | Read | -
0x0a | Write | Single Mask Register Bit
0x0a | Read | -
0x0b | Write | Mode Register
0x0b | Read | -
0x0c | Write | Clear LSB/MSB Flip-Flop
0x0c | Read | -
0x0d | Write | Master Clear/Reset
0x0d | Read | Temporary Register (not available in newer versions)
0x0e | Write | Clear Mask Register
0x0e | Read | -
0x0f | Write | Write All Mask Register Bits
0x0f | Read | Read All Mask Register Bits (only in Intel 82374)
9.1.5.2 0xc0-0xdf DMA Controller 2 (Channels 4, 5, 6, and 7)
DMA Address and Count Registers
Port | Access | Description
0xc0 | Write | Channel 4 starting address
0xc0 | Read | Channel 4 current address
0xc2 | Write | Channel 4 starting word count
0xc2 | Read | Channel 4 remaining word count
0xc4 | Write | Channel 5 starting address
0xc4 | Read | Channel 5 current address
0xc6 | Write | Channel 5 starting word count
0xc6 | Read | Channel 5 remaining word count
0xc8 | Write | Channel 6 starting address
0xc8 | Read | Channel 6 current address
0xca | Write | Channel 6 starting word count
0xca | Read | Channel 6 remaining word count
0xcc | Write | Channel 7 starting address
0xcc | Read | Channel 7 current address
0xce | Write | Channel 7 starting word count
0xce | Read | Channel 7 remaining word count
DMA Command Registers
Port | Access | Description
0xd0 | Write | Command Register
0xd0 | Read | Status Register
0xd2 | Write | Request Register
0xd2 | Read | -
0xd4 | Write | Single Mask Register Bit
0xd4 | Read | -
0xd6 | Write | Mode Register
0xd6 | Read | -
0xd8 | Write | Clear LSB/MSB Flip-Flop
0xd8 | Read | -
0xda | Write | Master Clear/Reset
0xda | Read | Temporary Register (not present in Intel 82374)
0xdc | Write | Clear Mask Register
0xdc | Read | -
0xde | Write | Write All Mask Register Bits
0xdf | Read | Read All Mask Register Bits (only in Intel 82374)
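The address and count ports in the two tables above follow a regular pattern, which a driver might capture in a small helper instead of a lookup table. This is a sketch derived only from the port numbers listed here; the function name `channel_ports` is hypothetical.

```python
def channel_ports(channel):
    """Return (address_port, count_port) for DMA channels 0-7."""
    if 0 <= channel <= 3:
        base, stride = 0x00, 1      # controller 1: consecutive ports
        index = channel
    elif 4 <= channel <= 7:
        base, stride = 0xC0, 2      # controller 2: every other port
        index = channel - 4
    else:
        raise ValueError("channel must be 0-7")
    address_port = base + 2 * index * stride
    return address_port, address_port + stride


for ch in range(8):
    addr, count = channel_ports(ch)
    print(f"channel {ch}: address 0x{addr:02x}, count 0x{count:02x}")
```

The Page Register ports do not follow such a pattern and are better kept in an explicit table.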
9.1.5.3 0x80-0x9f DMA Page Registers
Port | Access | Description
0x87 | Read/write | Channel 0 low byte (bits 23-16) page register
0x83 | Read/write | Channel 1 low byte (bits 23-16) page register
0x81 | Read/write | Channel 2 low byte (bits 23-16) page register
0x82 | Read/write | Channel 3 low byte (bits 23-16) page register
0x8b | Read/write | Channel 5 low byte (bits 23-16) page register
0x89 | Read/write | Channel 6 low byte (bits 23-16) page register
0x8a | Read/write | Channel 7 low byte (bits 23-16) page register
0x8f | Read/write | Low byte page refresh
9.1.5.4 0x400-0x4ff 82374 Enhanced DMA Registers
The Intel 82374 EISA System Component (ESC) was introduced in 1996 and includes a DMA controller that provides a superset of 8237 functionality, as well as other PC-compatible core peripheral components, in a single package. This chip is targeted at both EISA and PCI platforms and provides modern DMA features such as scatter-gather, ring buffers, and direct access by the system DMA to the full 32-bit address space.
If these features are used, code should also be included to provide similar functionality on the previous 16 years' worth of PC-compatible computers. For compatibility reasons, some of the 82374 registers must be programmed after the traditional 8237 registers for each transfer. Writing to a traditional 8237 register forces the contents of some of the 82374 enhanced registers to zero to provide backward software compatibility.
Port | Access | Description
0x401 | Read/write | Channel 0 high byte (bits 23-16) word count
0x403 | Read/write | Channel 1 high byte (bits 23-16) word count
0x405 | Read/write | Channel 2 high byte (bits 23-16) word count
0x407 | Read/write | Channel 3 high byte (bits 23-16) word count
0x4c6 | Read/write | Channel 5 high byte (bits 23-16) word count
0x4ca | Read/write | Channel 6 high byte (bits 23-16) word count
0x4ce | Read/write | Channel 7 high byte (bits 23-16) word count
0x487 | Read/write | Channel 0 high byte (bits 31-24) page register
0x483 | Read/write | Channel 1 high byte (bits 31-24) page register
0x481 | Read/write | Channel 2 high byte (bits 31-24) page register
0x482 | Read/write | Channel 3 high byte (bits 31-24) page register
0x48b | Read/write | Channel 5 high byte (bits 31-24) page register
0x489 | Read/write | Channel 6 high byte (bits 31-24) page register
0x48a | Read/write | Channel 7 high byte (bits 31-24) page register
0x48f | Read/write | High byte page refresh
0x4e0 | Read/write | Channel 0 Stop Register (bits 7-2)
0x4e1 | Read/write | Channel 0 Stop Register (bits 15-8)
0x4e2 | Read/write | Channel 0 Stop Register (bits 23-16)
0x4e4 | Read/write | Channel 1 Stop Register (bits 7-2)
0x4e5 | Read/write | Channel 1 Stop Register (bits 15-8)
0x4e6 | Read/write | Channel 1 Stop Register (bits 23-16)
0x4e8 | Read/write | Channel 2 Stop Register (bits 7-2)
0x4e9 | Read/write | Channel 2 Stop Register (bits 15-8)
0x4ea | Read/write | Channel 2 Stop Register (bits 23-16)
0x4ec | Read/write | Channel 3 Stop Register (bits 7-2)
0x4ed | Read/write | Channel 3 Stop Register (bits 15-8)
0x4ee | Read/write | Channel 3 Stop Register (bits 23-16)
0x4f4 | Read/write | Channel 5 Stop Register (bits 7-2)
0x4f5 | Read/write | Channel 5 Stop Register (bits 15-8)
0x4f6 | Read/write | Channel 5 Stop Register (bits 23-16)
0x4f8 | Read/write | Channel 6 Stop Register (bits 7-2)
0x4f9 | Read/write | Channel 6 Stop Register (bits 15-8)
0x4fa | Read/write | Channel 6 Stop Register (bits 23-16)
0x4fc | Read/write | Channel 7 Stop Register (bits 7-2)
0x4fd | Read/write | Channel 7 Stop Register (bits 15-8)
0x4fe | Read/write | Channel 7 Stop Register (bits 23-16)
0x40a | Write | Channels 0-3 Chaining Mode Register
0x40a | Read | Channel Interrupt Status Register
0x4d4 | Write | Channels 4-7 Chaining Mode Register
0x4d4 | Read | Chaining Mode Status
0x40c | Read | Chain Buffer Expiration Control Register
0x410 | Write | Channel 0 Scatter-Gather Command Register
0x411 | Write | Channel 1 Scatter-Gather Command Register
0x412 | Write | Channel 2 Scatter-Gather Command Register
0x413 | Write | Channel 3 Scatter-Gather Command Register
0x415 | Write | Channel 5 Scatter-Gather Command Register
0x416 | Write | Channel 6 Scatter-Gather Command Register
0x417 | Write | Channel 7 Scatter-Gather Command Register
0x418 | Read | Channel 0 Scatter-Gather Status Register
0x419 | Read | Channel 1 Scatter-Gather Status Register
0x41a | Read | Channel 2 Scatter-Gather Status Register
0x41b | Read | Channel 3 Scatter-Gather Status Register
0x41d | Read | Channel 5 Scatter-Gather Status Register
0x41e | Read | Channel 6 Scatter-Gather Status Register
0x41f | Read | Channel 7 Scatter-Gather Status Register
0x420-0x423 | Read/write | Channel 0 Scatter-Gather Descriptor Table Pointer Register
0x424-0x427 | Read/write | Channel 1 Scatter-Gather Descriptor Table Pointer Register
0x428-0x42b | Read/write | Channel 2 Scatter-Gather Descriptor Table Pointer Register
0x42c-0x42f | Read/write | Channel 3 Scatter-Gather Descriptor Table Pointer Register
0x434-0x437 | Read/write | Channel 5 Scatter-Gather Descriptor Table Pointer Register
0x438-0x43b | Read/write | Channel 6 Scatter-Gather Descriptor Table Pointer Register
0x43c-0x43f | Read/write | Channel 7 Scatter-Gather Descriptor Table Pointer Register