Understanding the GPU from the bottom up (GPU driver initialization process)

Last Update:2018-07-26 Source: Internet

Author: User

background

These notes were meant to be written alongside the project, but very little reference material exists for graphics card drivers, so the only option is to work things out from the kernel code. The goal of the project is blunt and simple: implement a 2D hardware accelerator on an embedded device that supports the Mesa open source 3D graphics library, EGL, DDX, and the DRM modules, and in the end provide a hardware-accelerated 3D application development environment and display platform in a desktop-like environment. This article uses the GPU kernel code, that is, the DRM code, to walk through graphics driver initialization from the bottom up. The graphics cards covered are the AMD Radeon R600 and later series.

The basic flow is: load the driver, initialize the hardware, set up the hardware-independent modules (such as the memory manager), and configure the display (resolution and so on).

radeon_driver_load_kms()

This function is defined in radeon_kms.c and is the starting point for everything related to GPU initialization. It calls radeon_device_init() to initialize the non-display hardware, and radeon_modeset_init() to initialize the display-related hardware (CRTCs, connectors, encoders, and so on).

radeon_device_init()

radeon_device_init(), defined in radeon_device.c, does the main work of driver initialization. It first initializes a number of structures the driver needs, then calls radeon_asic_init(). That function sets up a set of ASIC-specific function pointers for things such as suspend/resume, hardware reset, setting up and handling interrupts, and setting and reading clocks. Generic code calls through these registered pointers to reach the ASIC-specific implementation. For example, enabling and handling interrupts is not the same on RV100 as on RV700. Because the hardware keeps changing between generations, there are many ASIC families, each handled differently. The pointer table also lets families mix and match functions where the programming is shared: the r1xx and r3xx chips use the same interrupt handling, but have different initialization paths (r100_init() and r300_init()).

Set the DMA mask for the driver
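
As a rough picture of what a DMA mask expresses, here is a minimal sketch in Python (a toy model, not kernel code). The helper names `dma_mask` and `reachable` and the 32-bit and 40-bit widths are assumptions chosen for illustration.

```python
def dma_mask(bits: int) -> int:
    """Return a mask with the low `bits` bits set, e.g. 32 -> 0xFFFFFFFF."""
    if not 0 < bits <= 64:
        raise ValueError("address width must be between 1 and 64 bits")
    return (1 << bits) - 1


def reachable(addr: int, bits: int) -> bool:
    """True if a device with a `bits`-wide DMA mask can address `addr`."""
    return (addr & ~dma_mask(bits)) == 0
```

With a 32-bit mask, an address at the 4 GiB boundary is out of reach, while a 40-bit mask covers it.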

This step tells the kernel how large an address space the card can address. On Radeon, this matters for GPU access to graphics buffers that are stored in system memory, which the GPU reaches through the GART table. AGP and older on-chip GART mechanisms are limited to 32 bits. Some newer on-chip GART mechanisms support a larger address space.

Set up MMIO
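
To picture register access through a mapped BAR, the sketch below models the register aperture as a plain byte array. It is a toy Python model: `RegisterWindow`, `rreg32`, `wreg32`, and the register offset are hypothetical names and values, and a real driver maps the BAR and goes through the kernel's I/O accessors instead.

```python
import struct


class RegisterWindow:
    """Toy stand-in for a mapped register BAR holding 32-bit little-endian registers."""

    def __init__(self, size: int):
        self.mem = bytearray(size)

    def _check(self, offset: int) -> None:
        # Registers are 4 bytes wide and must lie fully inside the window.
        if offset % 4 or not 0 <= offset <= len(self.mem) - 4:
            raise ValueError(f"bad register offset {offset:#x}")

    def rreg32(self, offset: int) -> int:
        self._check(offset)
        return struct.unpack_from("<I", self.mem, offset)[0]

    def wreg32(self, offset: int, value: int) -> None:
        self._check(offset)
        struct.pack_into("<I", self.mem, offset, value & 0xFFFFFFFF)


HYPOTHETICAL_REG = 0x10  # made-up register offset
regs = RegisterWindow(0x100)
regs.wreg32(HYPOTHETICAL_REG, 0xDEADBEEF)
```

Every register read or write is simply a load or store at `base + offset`, which is why the register BAR has to be mapped before anything else can be configured.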

PCI/PCIe/AGP devices are programmed through BARs (base address registers). These mappings give access to the on-chip registers, the framebuffer, and other resources on the card. The GPU is configured through registers, so to access them you need to map the register BAR. To write to the framebuffer (to put data on the screen), you need to map the framebuffer BAR. Here we map the register BAR; the driver uses this mapping to configure the graphics card (GPU).

vga_client_register()

vga_client_register() is called here. A full explanation is beyond the scope of this post; in short, it provides a basic way to arbitrate access to VGA resources on a PCI bus that has multiple VGA devices.

radeon_init()

radeon_init() is actually a macro defined in radeon.h that calls the ASIC-specific initialization function registered earlier by radeon_asic_init(). For RV100 this is r100_init(), defined in r100.c; for RV770 it is rv770_init().
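
The function pointer scheme set up by radeon_asic_init() and used by the radeon_init() macro can be pictured with a small dispatch table. This is a toy Python model rather than the kernel's C structures: the table layout, the family keys, the interrupt handler names, and the returned strings are assumptions for illustration. Only r100_init, r300_init, and rv770_init are names taken from the text.

```python
from dataclasses import dataclass
from typing import Callable


def r100_init() -> str:
    return "r100 init path"


def r300_init() -> str:
    return "r300 init path"


def rv770_init() -> str:
    return "rv770 init path"


def r100_irq_process() -> str:      # hypothetical name
    return "r1xx-style interrupt handling"


def rv770_irq_process() -> str:     # hypothetical name
    return "rv770-style interrupt handling"


@dataclass(frozen=True)
class AsicFuncs:
    """Per-family table of callbacks, standing in for the C function pointers."""
    init: Callable[[], str]
    irq_process: Callable[[], str]


ASIC_TABLE = {
    "R100": AsicFuncs(r100_init, r100_irq_process),
    # r3xx has its own init path but reuses the r1xx interrupt handling.
    "R300": AsicFuncs(r300_init, r100_irq_process),
    "RV770": AsicFuncs(rv770_init, rv770_irq_process),
}


class Device:
    def __init__(self, family: str):
        # Conceptually what radeon_asic_init() does: pick the table for this chip.
        self.asic = ASIC_TABLE[family]


def radeon_init(dev: Device) -> str:
    """Generic code calls through the registered pointer."""
    return dev.asic.init()
```

Generic code never names a chip family directly; it only calls `dev.asic.<callback>`, so supporting a new family means adding one table entry.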

That covers most of the work done by radeon_device_init(). Next we look at what the ASIC-specific initialization functions do.
They all follow the same pattern, although some ASICs do more or less depending on their features. Let's look at r100_init() in r100.c.

First, debugfs is initialized. This is a kernel debugging facility and is not discussed here. Then r100_vga_render_disable() turns off the card's VGA engine. The VGA engine provides VGA compatibility; since we program the card directly, we shut it down.
Next, radeon_scratch_init(), defined in radeon_device.c, sets up the GPU's scratch registers. The CP (command processor) uses scratch registers to signal events in the rendering stream; they are usually used to implement what we call fences. A write to a scratch register can be added to the command stream and sent to the GPU. When the GPU executes that command, it writes a specific value to the scratch register. The driver can then read the scratch register to determine whether the fence has been reached. For example, to find out whether the GPU has finished rendering to a buffer, insert a fence after the rendering commands, then check the scratch register to see whether the fence has passed, which means the rendering is complete.

radeon_get_bios()

Loads the video BIOS from the PCI ROM BAR. The video BIOS contains data tables and command tables. The data tables describe things such as the number and type of connectors on the card, how those connectors map to encoders, the GPIO registers and bit fields used for DDC and other I²C buses, LVDS panel information for laptops, and the limits of the PLLs. The command tables are used to initialize the hardware (normally the system BIOS does this at boot, but it is also needed for things like suspend/resume and initializing secondary graphics cards). On systems with an ATOM BIOS, the command tables are also used to set up the display and to change things like the engine and memory clocks.

Initialize the BIOS scratch registers

radeon_combios_init() calls radeon_combios_initialize_bios_scratch_regs(). These registers are how the system firmware and the graphics driver communicate. They carry information such as lid or mode-change events, which outputs are connected, and whether the driver or the firmware will handle a given event.

radeon_boot_test_post_card()

Checks whether the system BIOS has already posted (initialized) the card. This determines whether the driver must initialize the card itself using the BIOS command tables, or whether the system BIOS has already done the work.

radeon_get_clock_info()

Gets the PLL (phase-locked loop, used to generate clocks) information from the BIOS tables. This covers the display, engine, and memory PLLs, along with the reference clock from which each PLL generates its final clock.

radeon_pm_init()

Initializes the power management features of the chip.

Initialize the MC (memory controller)
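
The layout decision made in this step can be pictured as carving a single GPU address space into a VRAM range and a GART range. The sketch is a toy Python model: the sizes, the placement policy (VRAM at offset 0 with the GART directly after it), and all names are assumptions, not what r100_mc_init() actually computes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    start: int
    size: int

    @property
    def end(self) -> int:
        return self.start + self.size  # exclusive upper bound

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.end


def layout(vram_size: int, gart_size: int):
    """Assumed policy: VRAM first, GART immediately after it."""
    vram = Range(0, vram_size)
    gart = Range(vram.end, gart_size)
    return vram, gart


def gpu_address(region: Range, offset: int) -> int:
    """Turn an offset inside a region into an address in the GPU address space."""
    if not 0 <= offset < region.size:
        raise ValueError("offset outside region")
    return region.start + offset


MiB = 1 << 20
vram, gart = layout(64 * MiB, 32 * MiB)
crtc_base = gpu_address(vram, 0x1000)      # a scanout buffer in video memory
texture_base = gpu_address(gart, 0x2000)   # a texture in GART-backed memory
```

Both the CRTC base and the texture base are plain GPU addresses; which physical memory they reach depends only on the range they fall into.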

This is done in r100_mc_init(). Like the CPU, the GPU has its own address space, into which VRAM and the GART are mapped. The blocks on the chip (3D engine, display controllers, and so on) access these resources through the GPU address space. VRAM is mapped at one offset and the GART at another. To read a texture from GART memory, you point the texture base address at the appropriate offset within the GART range of the GPU address space. To display a buffer that lives in video memory, you point the CRTC base address at that buffer's address in the GPU address space. The memory controller initialization function decides how much of the GPU address space is used for VRAM and how much for the GART.

radeon_fence_driver_init()
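
Fences were described in the scratch register step: the driver puts a scratch register write into the command stream, the GPU performs the write when it reaches that point, and the driver compares values. A toy Python model of that handshake follows. `ToyGpu`, `emit_fence`, and `fence_signaled` are hypothetical names, and sequence number wraparound is ignored.

```python
class ToyGpu:
    def __init__(self):
        self.scratch = 0      # stands in for one scratch register
        self.commands = []    # queued command stream
        self.next_seq = 1

    def emit_draw(self, name: str) -> None:
        self.commands.append(("draw", name))

    def emit_fence(self) -> int:
        """Queue a scratch register write and return its sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        self.commands.append(("fence", seq))
        return seq

    def execute(self, count: int) -> None:
        """Execute up to `count` queued commands, in order."""
        for _ in range(min(count, len(self.commands))):
            kind, arg = self.commands.pop(0)
            if kind == "fence":
                self.scratch = arg  # the GPU writes the value when it gets here

    def fence_signaled(self, seq: int) -> bool:
        return self.scratch >= seq


gpu = ToyGpu()
gpu.emit_draw("triangle")
fence = gpu.emit_fence()
gpu.execute(1)                        # only the draw has run so far
before = gpu.fence_signaled(fence)
gpu.execute(1)                        # now the fence write has run
after = gpu.fence_signaled(fence)
```

Because commands execute in order, seeing the fence value in the scratch register implies that everything queued before it has completed.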

Initializes the common code used for fences. Fences were described earlier.

radeon_irq_kms_init()

Initializes the common code used for interrupt requests.

radeon_bo_init()

Initializes the memory manager.

r100_pci_gart_init()
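
The remapping that a GART performs can be pictured as a page table that turns a contiguous range of GPU addresses into scattered physical pages. The sketch is a toy Python model with assumed values: 4 KiB pages, a made-up GART base address, and made-up physical page addresses.

```python
PAGE_SIZE = 4096  # assumed page size


class ToyGart:
    def __init__(self, base: int, num_pages: int):
        self.base = base                  # start of the GART range in GPU address space
        self.table = [None] * num_pages   # one physical page address per entry

    def bind(self, first_page: int, phys_pages) -> None:
        """Point consecutive GART entries at arbitrary physical pages."""
        for i, phys in enumerate(phys_pages):
            if phys % PAGE_SIZE:
                raise ValueError("physical pages must be page aligned")
            self.table[first_page + i] = phys

    def translate(self, gpu_addr: int) -> int:
        """Translate a GPU address inside the GART range to a physical address."""
        index, offset = divmod(gpu_addr - self.base, PAGE_SIZE)
        if gpu_addr < self.base or index >= len(self.table) or self.table[index] is None:
            raise LookupError(f"no mapping for {gpu_addr:#x}")
        return self.table[index] + offset


gart = ToyGart(base=0x04000000, num_pages=8)
# Three physically scattered pages appear as 12 KiB of contiguous GPU address space.
gart.bind(0, [0x73000, 0x12000, 0x9A000])
```

From the GPU's side, the buffer spans `0x04000000` to `0x04002FFF` without gaps, even though the underlying pages are nowhere near each other.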

Sets up the on-chip GART; radeon_agp_init() initializes the AGP GART instead. Either one lets the GPU access buffers in system memory. Because system memory is paged, most allocations are not physically contiguous. The GART uses address remapping to make many separate physical pages look like one contiguous region. With AGP, the GPU only needs the AGP base address provided by the northbridge. The on-chip GART provides the same functionality on systems that do not support AGP.

r100_set_safe_registers()
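
The kind of check that a safe-register list makes possible can be pictured as a whitelist lookup over the register writes in a command buffer. This is a toy Python model: the command format (a list of register offset and value pairs) and the offsets in `SAFE_REGISTERS` are invented for illustration and do not match real command packets or the real register list.

```python
SAFE_REGISTERS = frozenset({0x1C00, 0x1C04, 0x2000})  # made-up offsets


def check_command_buffer(writes):
    """Return the offsets of writes that target non-whitelisted registers."""
    return [offset for offset, _value in writes if offset not in SAFE_REGISTERS]


def submit(writes) -> int:
    """Accept the buffer only if every register write is allowed."""
    rejected = check_command_buffer(writes)
    if rejected:
        names = ", ".join(f"{offset:#x}" for offset in rejected)
        raise PermissionError(f"unauthorized registers: {names}")
    return len(writes)
```

A buffer that touches only whitelisted registers is accepted as a whole; a single forbidden offset causes the whole submission to be rejected.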

This function sets the list of registers that user-space command buffers are allowed to access. When user-space drivers such as the DDX (2D) or Mesa (3D) send commands to the GPU, the DRM checks those command buffers to prevent access to unauthorized registers or memory.

r100_startup()

Programs the hardware using everything set up by r100_init(). It is a separate function because it is also called when resuming from suspend, at which point the hardware has to be programmed again with the current configuration. VRAM and GART settings are programmed in r100_mc_program() and r100_pci_gart_enable(); interrupts are set up in r100_irq_set().

r100_cp_init()

Initializes the CP (command processor) and sets up the ring buffer. The CP is the part of the chip responsible for feeding commands to the GPU. The driver (on the CPU) writes commands into the ring buffer, and the CP reads and processes them. Besides commands, the ring can also hold pointers to command buffers stored elsewhere in the GPU address space, known as indirect buffers. For example, the 3D driver sends a command buffer to the DRM; after checking it, the DRM places a pointer to that command buffer on the ring, followed by a fence. When the CP reaches the pointer on the ring, it fetches the command buffer it points to, processes the commands in it, and then returns to the ring where it left off. Buffers referenced by the command buffer stay locked until the fence passes, because the GPU accesses them while the commands execute.

r100_wb_init()

Initializes scratch register writeback, a feature that lets the GPU update a copy of the scratch register values in GART memory. This lets the driver (running on the CPU) read the contents of these registers without going through the MMIO registers, which requires a trip across the bus.

r100_ib_init()
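
The ring buffer and indirect buffer flow described for the command processor can be modeled in a few lines. This is a toy Python model under stated assumptions: the packet names, the ring size, and the dictionary standing in for GPU memory are all invented for illustration.

```python
class ToyRing:
    def __init__(self, size: int):
        self.slots = [None] * size
        self.wptr = 0   # write pointer, advanced by the driver
        self.rptr = 0   # read pointer, advanced by the CP

    def space(self) -> int:
        # One slot always stays empty so that rptr == wptr means "ring empty".
        return (self.rptr - self.wptr - 1) % len(self.slots)

    def write(self, packet) -> None:
        if self.space() == 0:
            raise BufferError("ring full")
        self.slots[self.wptr] = packet
        self.wptr = (self.wptr + 1) % len(self.slots)


def cp_run(ring: ToyRing, gpu_memory: dict) -> list:
    """Consume the ring like a CP would, following indirect buffer pointers."""
    executed = []
    while ring.rptr != ring.wptr:
        kind, arg = ring.slots[ring.rptr]
        ring.rptr = (ring.rptr + 1) % len(ring.slots)
        if kind == "ib":
            # Fetch the command buffer the pointer refers to, run it,
            # then continue on the ring where we left off.
            executed.extend(gpu_memory[arg])
        else:
            executed.append((kind, arg))
    return executed


# A checked command buffer sitting somewhere in (toy) GPU memory.
gpu_memory = {0x8000: [("draw", "mesh0"), ("draw", "mesh1")]}
ring = ToyRing(8)
ring.write(("ib", 0x8000))   # pointer to the indirect buffer
ring.write(("fence", 1))     # fence that follows it
trace = cp_run(ring, gpu_memory)
```

The trace shows the indirect buffer's commands running before the fence, which is what lets the driver release the referenced buffers once the fence has passed.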

Initializes the indirect buffers used to send commands to the CP from, for example, a user-space 3D driver.

Initialization of the display section

This is done in radeon_modeset_init(). First we set the display limits and modes. Then we create some output properties (radeon_modeset_create_props()), which are exposed as xrandr properties when X is running.

radeon_crtc_init() initializes the CRTCs. A CRTC, also known as a display controller, is the block on the chip that generates the display timing and determines which part of the framebuffer a particular monitor displays. Each CRTC provides a separate "head". Most Radeon ASICs have two CRTCs; the newer Evergreen chips have six.

radeon_setup_enc_conn() sets up the connector and encoder mappings based on the video BIOS data tables. An encoder converts the digital signal from the CRTC into the signal (for example, analog) that is output on a connector, and the connector is what attaches to the display device. An encoder can be tied to one or more connectors. This mapping matters because the driver needs to know which encoders are in use and which connectors they are tied to in order to drive the displays correctly.
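
The encoder to connector mapping can be pictured as a small lookup table. This is a toy Python model: the encoder and connector names are invented, and in the driver this information comes from the video BIOS data tables rather than a hard-coded dictionary.

```python
# Made-up example board: one DAC feeding VGA, one TMDS encoder shared by two connectors.
ENCODER_TO_CONNECTORS = {
    "DAC1": ["VGA-0"],
    "TMDS1": ["DVI-0", "HDMI-0"],
}


def connectors_to_encoders(mapping: dict) -> dict:
    """Invert the table so a connector can be looked up directly."""
    inverse = {}
    for encoder, connectors in mapping.items():
        for connector in connectors:
            inverse.setdefault(connector, []).append(encoder)
    return inverse


def encoder_for(connector: str) -> str:
    """Pick the encoder that drives a connector (first match in this toy model)."""
    encoders = connectors_to_encoders(ENCODER_TO_CONNECTORS).get(connector)
    if not encoders:
        raise KeyError(connector)
    return encoders[0]
```

Lighting up a display then starts from the connector the monitor is plugged into and walks back to the encoder that has to be programmed.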

radeon_hpd_init() is a macro that points to an ASIC-specific function that initializes the HPD (hot plug detect) hardware for digital displays. HPD generates an interrupt when a display is connected or disconnected. When the interrupt occurs, the driver takes the appropriate action and generates an event that user-space programs can listen for. An application can then show a message asking the user what to do, and so on.

radeon_fbdev_init()

Sets up the kernel FB interface on top of the DRM. This provides a kernel framebuffer device for the console or other kernel FB applications.

When the driver is unloaded, the whole process runs in reverse, and the *_fini() functions are called to tear everything down.
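
The reverse-order teardown can be sketched with a stack of cleanup steps. This is a toy Python model: the step names loosely follow functions mentioned in this article, but the `*_fini` names and the exact ordering are illustrative only.

```python
class InitStack:
    """Record each init step and undo the steps in reverse order on unload."""

    def __init__(self):
        self._fini = []   # cleanup names, pushed as each init step succeeds
        self.log = []     # what was "called", in order

    def step(self, name: str) -> None:
        self.log.append(f"{name}_init")
        self._fini.append(f"{name}_fini")

    def unload(self) -> None:
        while self._fini:
            self.log.append(self._fini.pop())  # last initialized, first torn down


drv = InitStack()
for name in ("mc", "fence_driver", "irq_kms", "bo"):
    drv.step(name)
drv.unload()
```

Tearing down in reverse order means each component is released before the things it depends on.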

This took a long time to write. It is mainly for my own future reference, but I hope it also helps others who are learning the GPU kernel code.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
