The previous posts covered parts of the DRM driver, the graphics card memory management mechanisms, and the interrupt mechanism, so the AMD DRM driver initialization process should now be much easier to follow.
Here is an article written by an AMD developer (I am putting it here for the time being and will add my own commentary later).
Understanding GPUs from the ground up
I get asked a lot about learning how to program GPUs. Bringing up Evergreen KMS support seems like a good place to start, so I figured I'd write a series of articles detailing the process based on the actual Evergreen patches. First, to get a better understanding of how GPUs work, take a look at the radeon DRM. This article assumes a basic understanding of C and computer architecture. The basic process is that the driver loads, initializes the hardware, sets up non-hardware-specific things like the memory manager, and sets up the displays. This first article describes the basic driver flow when the DRM loads in KMS mode.
radeon_driver_load_kms() (in radeon_kms.c) is where everything starts. It calls radeon_device_init() to initialize the non-display hardware and radeon_modeset_init() (in radeon_display.c) to initialize the display hardware.
The main workhorse of the driver initialization is radeon_device_init(), found in radeon_device.c. First we initialize a bunch of the structs used in the driver. Then radeon_asic_init() is called. This function sets up the ASIC-specific function pointers for various things such as suspend/resume callbacks, ASIC reset, setting/processing IRQs, setting/getting engine clocks, etc. The common code then uses these callbacks to call the ASIC-specific code to achieve the requested functionality. For example, enabling and processing interrupts works differently on an RV100 vs. an RV770. Since functionality changes in stages, some routines are used for multiple ASIC families. This lets us mix and match the appropriate functions for the specifics of how the chip is programmed. For example, r1xx and r3xx chips both use the same interrupt scheme (as defined in r100_irq_set()/r100_irq_process()), but they have different initialization routines (r100_init() vs. r300_init()).
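The callback-table idea can be sketched in a few lines of standalone C. This is only an illustration and not the driver's actual definitions: `struct asic_funcs`, `asic_select()`, and the `sketch_*` stubs are hypothetical names, and the return values are arbitrary tags used to tell the stubs apart.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-ASIC callback table. */
struct asic_funcs {
    int (*init)(void);
    int (*irq_set)(void);
};

enum chip_family { CHIP_R100, CHIP_R300 };

/* Stub callbacks; the return values are arbitrary tags, not real status codes. */
static int sketch_r100_init(void)    { return 100; }
static int sketch_r300_init(void)    { return 300; }
static int sketch_r100_irq_set(void) { return 1; }

/* Both families share the interrupt routine but use different init routines. */
static const struct asic_funcs r100_asic = { sketch_r100_init, sketch_r100_irq_set };
static const struct asic_funcs r300_asic = { sketch_r300_init, sketch_r100_irq_set };

/* Pick the callback table for a chip family; returns NULL if unknown. */
static const struct asic_funcs *asic_select(enum chip_family family)
{
    switch (family) {
    case CHIP_R100: return &r100_asic;
    case CHIP_R300: return &r300_asic;
    }
    return NULL;
}
```

Common code would then call `asic_select(family)->init()` without knowing which chip it is running on.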
Next we set up the DMA masks for the driver. These let the kernel know what size address space the card is able to address. In the case of radeons, it's used for GPU access to graphics buffers stored in system memory, which are accessed via a GART (Graphics Address Remapping Table). AGP and the older on-chip GART mechanisms are limited to 32 bits. Newer on-chip GART mechanisms have larger address spaces.
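To make the idea of a DMA mask concrete, here is a small userspace sketch, not kernel code. `dma_bit_mask()` and `dma_addressable()` are hypothetical helpers, and the 40-bit width used in the usage note is just an example of a "larger" address space, not a figure from the article.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low `bits` bits set; 64 and above is special-cased
 * because shifting a 64-bit value by 64 is undefined in C. */
static uint64_t dma_bit_mask(unsigned bits)
{
    return bits >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

/* True if a device limited to `bits` address bits can reach every
 * byte of the buffer [addr, addr + size). */
static bool dma_addressable(uint64_t addr, uint64_t size, unsigned bits)
{
    uint64_t mask = dma_bit_mask(bits);

    if (addr > mask)
        return false;
    /* Compare against the remaining room to avoid overflowing addr + size. */
    return size == 0 || size - 1 <= mask - addr;
}
```

With a 32-bit mask, a page ending exactly at 4 GiB is reachable, while a buffer starting at 4 GiB is reachable only with a wider mask such as 40 bits.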
After the DMA masks, we set up the MMIO aperture. PCI/PCIE/AGP devices are programmed via apertures called BARs (Base Address Registers). These apertures provide access to resources on the card such as registers, framebuffers, and ROMs. GPUs are configured via registers; if you want to access those registers, you'd map the register BAR. If you want to write to the framebuffer (some of which is displayed on your screens), you would map the framebuffer BAR. In this case we map the register BAR; this register mapping is then used by the driver to configure the card.
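Once the register BAR is mapped, register access comes down to reads and writes at offsets from the mapping. The sketch below assumes a hypothetical `struct sketch_device` whose `rmmio` pointer stands in for the mapped BAR; in a test it can simply point at an ordinary array. The helper names are illustrative and are not the driver's own accessors.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device handle: `rmmio` stands in for the mapped register BAR. */
struct sketch_device {
    volatile uint32_t *rmmio;
    uint32_t rmmio_size;   /* size of the mapping in bytes */
};

/* Read one 32-bit register at a byte offset into the mapping. */
static uint32_t reg_read32(const struct sketch_device *dev, uint32_t offset)
{
    assert(offset % 4 == 0 && offset < dev->rmmio_size);
    return dev->rmmio[offset / 4];
}

/* Write one 32-bit register at a byte offset into the mapping. */
static void reg_write32(struct sketch_device *dev, uint32_t offset, uint32_t value)
{
    assert(offset % 4 == 0 && offset < dev->rmmio_size);
    dev->rmmio[offset / 4] = value;
}

/* Read-modify-write: change only the bits selected by `mask`. */
static void reg_update32(struct sketch_device *dev, uint32_t offset,
                         uint32_t mask, uint32_t value)
{
    uint32_t tmp = reg_read32(dev, offset);

    tmp = (tmp & ~mask) | (value & mask);
    reg_write32(dev, offset, tmp);
}
```

The `volatile` qualifier matters here: it keeps the compiler from caching or reordering accesses that have side effects on real hardware.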
vga_client_register() comes next, and is beyond the scope of this article. It's basically a way to work around the limitations of VGA on PCI buses with multiple VGA devices.
Next up is radeon_init(). This is actually a macro defined in radeon.h that references the ASIC init callback we initialized in radeon_asic_init() several steps ago. The ASIC-specific init function is called. For an RV100, it would be r100_init(), defined in r100.c; for an RV770, it's rv770_init().
That's pretty much it for radeon_device_init(). Next let's look at what happens in the ASIC-specific init functions. They all follow the same pattern, although some ASICs may do more or less depending on the functionality. Let's take a look at r100_init() in r100.c. First we initialize debugfs; this is a kernel debugging framework and is outside the scope of this article. Next we call r100_vga_render_disable(), which disables the VGA engine on the card. The VGA engine provides VGA compatibility; since we are going to be programming the card directly, we disable it.
Following that, we set up the GPU scratch registers (radeon_scratch_init(), defined in radeon_device.c). These are scratch registers used by the CP (Command Processor) to signal graphics events. In general they are used for what we call fences. A write to one of these scratch registers can be added to the command stream sent to the GPU. When the GPU encounters this command, it writes the value specified to a scratch register. The driver can then check the value of the scratch register to determine whether that fence has come up or not. For example, if you want to know if the GPU is done rendering to a buffer, you'd insert a fence after the rendering commands. You can then check the scratch register to determine if that fence has passed (and hence the rendering is done).
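The fence mechanism can be modelled with a sequence counter and a single value that plays the role of the scratch register. Everything here is a simplified assumption: `struct sketch_fence_driver` and its helpers are hypothetical, and `gpu_reach_fence()` merely simulates the GPU executing the scratch write. The signed-difference comparison is a choice made for this sketch so the check survives 32-bit wraparound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fence state; `scratch` stands in for one GPU scratch register. */
struct sketch_fence_driver {
    uint32_t last_emitted;      /* last sequence number handed out */
    volatile uint32_t scratch;  /* value most recently written by the "GPU" */
};

/* Allocate the next sequence number; a real driver would also queue a
 * command that writes this value to the scratch register. */
static uint32_t fence_emit(struct sketch_fence_driver *fd)
{
    return ++fd->last_emitted;
}

/* Simulates the GPU reaching the fence command in the command stream. */
static void gpu_reach_fence(struct sketch_fence_driver *fd, uint32_t seq)
{
    fd->scratch = seq;
}

/* A fence has passed once the scratch value has caught up with it.
 * The signed difference keeps this valid across counter wraparound. */
static bool fence_signaled(const struct sketch_fence_driver *fd, uint32_t seq)
{
    return (int32_t)(fd->scratch - seq) >= 0;
}
```

A caller would emit a fence after its rendering commands, then poll `fence_signaled()` (or sleep until an interrupt) before reusing the buffer.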
radeon_get_bios() loads the video BIOS from the PCI ROM BAR. The video BIOS contains data and command tables. The data tables define things like the number and type of connectors on the card and how those connectors are mapped to encoders, the GPIO registers and bitfields used for DDC and other I2C buses, LVDS panel information for laptops, display and engine PLL limits, etc. The command tables are used for initializing the hardware (normally done by the system BIOS during POST, but required for things like suspend/resume and initializing secondary cards), and on systems with ATOM BIOS the command tables are used for setting up the displays and changing things like engine and memory clocks.
Next, we initialize the BIOS scratch registers (radeon_combios_initialize_bios_scratch_regs() via radeon_combios_init()). These registers are a way for the firmware on the system to communicate state to the graphics driver. They contain things like connected outputs, whether the driver or the firmware will handle things like lid or mode change events, etc.
radeon_boot_test_post_card() checks whether the system BIOS has posted the card. This is used to determine whether the card needs to be initialized by the driver using the BIOS command tables or whether the system BIOS has already done it.
radeon_get_clock_info() gets the PLL (Phase Locked Loop, used to generate clocks) information from the BIOS tables. This includes the display PLLs, the engine and memory PLLs, and the reference clock that the PLLs use to generate their final clocks.
radeon_pm_init() initializes the power management features of the chip.
Next the MC (Memory Controller) is initialized (r100_mc_init()). The GPU has its own address space, similar to the CPU. Within that address space you map VRAM and GART. The blocks on the chip (3D engine, display controllers, etc.) access these resources via the GPU's address space. VRAM is mapped at one offset and GART at another. If you want to read from a texture located in GART memory, you'd point the texture base address at some offset in the GART aperture in the GPU's address space. If you want to display a buffer in VRAM on your monitor, you'd point one of your CRTC base addresses to an address in the VRAM aperture in the GPU's address space. The MC init function determines how much VRAM is on the card and where to place VRAM and GART in the GPU's address space.
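As a rough illustration of laying out the GPU's address space, the sketch below places VRAM at offset 0 and GART at the next aligned address after it. That placement policy is an assumption made for the example, not necessarily what r100_mc_init() does, and `struct sketch_mc`, `mc_layout()`, and `mc_classify()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of the GPU's view of memory. */
struct sketch_mc {
    uint64_t vram_start, vram_size;
    uint64_t gtt_start, gtt_size;
};

enum mc_domain { MC_NONE, MC_VRAM, MC_GTT };

/* Assumed policy: VRAM at 0, GART at the next `align`-aligned address.
 * `align` must be a power of two. */
static void mc_layout(struct sketch_mc *mc, uint64_t vram_size,
                      uint64_t gtt_size, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    mc->vram_start = 0;
    mc->vram_size = vram_size;
    mc->gtt_start = (vram_size + align - 1) & ~(align - 1);
    mc->gtt_size = gtt_size;
}

/* Report which aperture, if any, a GPU address falls into. The unsigned
 * subtraction wraps for addresses below the start, so they fail the test. */
static enum mc_domain mc_classify(const struct sketch_mc *mc, uint64_t gpu_addr)
{
    if (gpu_addr - mc->vram_start < mc->vram_size)
        return MC_VRAM;
    if (gpu_addr - mc->gtt_start < mc->gtt_size)
        return MC_GTT;
    return MC_NONE;
}
```

A texture base address or a CRTC base address is then just a GPU address that lands in one of these two ranges.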
radeon_fence_driver_init() initializes the common code used for fences. See above for more on fences.
radeon_irq_kms_init() initializes the common code used for IRQs.
radeon_bo_init() initializes the memory manager.
r100_pci_gart_init() sets up the on-board GART mechanism and radeon_agp_init() initializes AGP GART. This allows the GPU to access buffers in system memory. Since system memory is paged, large allocations are not contiguous. The GART provides a way to make many disparate pages look like one contiguous block by using address remapping. With AGP, the northbridge provides the address remapping, and you just point the GPU's AGP aperture at the one provided by the northbridge. The on-board GART provides the same functionality for non-AGP systems (PCI or PCIE).
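The remapping itself is essentially a page table lookup. The sketch below models a tiny GART with eight 4 KiB pages; the sizes, the `struct sketch_gart` layout, and the helper names are assumptions for illustration only, and a real GART table lives in memory that the hardware walks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GART_PAGE_SIZE 4096u
#define GART_NUM_PAGES 8u      /* tiny table, for illustration only */

/* Hypothetical GART: one table entry per page of the aperture. */
struct sketch_gart {
    uint64_t aperture_base;              /* aperture start in GPU address space */
    uint64_t page_addr[GART_NUM_PAGES];  /* system page backing each slot */
    bool     valid[GART_NUM_PAGES];
};

/* Bind scattered system pages to consecutive aperture slots from `first`. */
static int gart_bind(struct sketch_gart *g, unsigned first,
                     const uint64_t *pages, unsigned count)
{
    if (first >= GART_NUM_PAGES || count > GART_NUM_PAGES - first)
        return -1;
    for (unsigned i = 0; i < count; i++) {
        g->page_addr[first + i] = pages[i];
        g->valid[first + i] = true;
    }
    return 0;
}

/* Translate a GPU address inside the aperture to a system address.
 * Returns 0 on success, -1 if the address is outside or unbound. */
static int gart_translate(const struct sketch_gart *g, uint64_t gpu_addr,
                          uint64_t *phys)
{
    uint64_t off, idx;

    if (gpu_addr < g->aperture_base)
        return -1;
    off = gpu_addr - g->aperture_base;
    idx = off / GART_PAGE_SIZE;
    if (idx >= GART_NUM_PAGES || !g->valid[idx])
        return -1;
    *phys = g->page_addr[idx] + off % GART_PAGE_SIZE;
    return 0;
}
```

From the GPU's side the three bound pages look like one 12 KiB block, even though they sit at unrelated system addresses.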
Next up we have r100_set_safe_registers(). This function sets the list of registers that command buffers from userspace are allowed to access. When a userspace driver like the DDX (2D) or Mesa (3D) sends commands to the GPU, the DRM checks those command buffers to prevent access to unauthorized registers or memory.
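One simple way to picture the register check is a bitmap with one bit per register, consulted for every register write in a submitted buffer. The sketch below is an assumption-laden model: the bitmap size, the convention that a set bit means "allowed", and the flat (offset, value) command format are all made up for the example and do not reflect the real packet format or the driver's own bitmap convention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_REGS 256u   /* registers covered by the bitmap (assumption) */

/* One bit per 32-bit register; in this sketch a set bit means "allowed". */
struct reg_allowlist {
    uint32_t bits[SKETCH_NUM_REGS / 32];
};

/* Made-up command format: a flat list of register writes. */
struct reg_write {
    uint32_t reg_offset;   /* byte offset of the register */
    uint32_t value;
};

static void allow_reg(struct reg_allowlist *al, uint32_t reg_offset)
{
    uint32_t idx = reg_offset / 4;

    assert(reg_offset % 4 == 0 && idx < SKETCH_NUM_REGS);
    al->bits[idx / 32] |= UINT32_C(1) << (idx % 32);
}

static bool reg_allowed(const struct reg_allowlist *al, uint32_t reg_offset)
{
    uint32_t idx = reg_offset / 4;

    if (reg_offset % 4 != 0 || idx >= SKETCH_NUM_REGS)
        return false;
    return (al->bits[idx / 32] >> (idx % 32)) & 1u;
}

/* Returns the index of the first rejected write, or -1 if all are allowed. */
static int check_command_buffer(const struct reg_allowlist *al,
                                const struct reg_write *cmds, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (!reg_allowed(al, cmds[i].reg_offset))
            return (int)i;
    return -1;
}
```

A buffer that fails the check would be rejected before anything from it is placed on the ring.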
Finally, r100_startup() programs the hardware with everything set up in r100_init(). It's a separate function since it's also called when resuming from suspend, as the current hardware configuration needs to be restored in that case as well. The VRAM and GART setup are programmed in r100_mc_program() and r100_pci_gart_enable(); IRQs are set up in r100_irq_set().
r100_cp_init() initializes the CP and sets up the ring buffer. The CP is the part of the chip that feeds acceleration commands to the GPU. It's fed by a ring buffer that the driver (CPU) writes to and the GPU reads from. Besides commands, you can also write pointers to command buffers stored elsewhere in the GPU's address space (called indirect buffers). For example, the 3D driver might send a command buffer to the DRM; after checking it, the DRM would put a pointer to that command buffer on the ring, followed by a fence. When the CP gets to the pointer in the ring, it fetches the command buffer and processes the commands in it, then returns to where it left off on the ring. Buffers referenced by the command buffer are "locked" until the fence passes, since the GPU is accessing them in the execution of those commands.
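The producer/consumer relationship between the driver and the CP can be sketched as a small ring with a write pointer and a read pointer. The ring size, the packet opcodes `PKT_IB` and `PKT_FENCE`, and the five-dword layout used by `ring_schedule_ib()` are all invented for this example; the real packet formats are different.

```c
#include <assert.h>
#include <stdint.h>

#define RING_DWORDS 8u                  /* must be a power of two */
enum { PKT_IB = 1, PKT_FENCE = 2 };     /* made-up opcodes */

struct sketch_ring {
    uint32_t buf[RING_DWORDS];
    uint32_t wptr;   /* next dword the CPU (driver) writes */
    uint32_t rptr;   /* next dword the "CP" reads */
};

/* One slot is kept empty so that wptr == rptr always means "empty". */
static uint32_t ring_free_dwords(const struct sketch_ring *r)
{
    return (r->rptr - r->wptr - 1) & (RING_DWORDS - 1);
}

/* Driver side: append n dwords, or fail if they do not fit. */
static int ring_write(struct sketch_ring *r, const uint32_t *dw, uint32_t n)
{
    if (n > ring_free_dwords(r))
        return -1;
    for (uint32_t i = 0; i < n; i++) {
        r->buf[r->wptr] = dw[i];
        r->wptr = (r->wptr + 1) & (RING_DWORDS - 1);
    }
    return 0;
}

/* "CP" side: fetch the next dword, or fail if the ring is empty. */
static int ring_read(struct sketch_ring *r, uint32_t *out)
{
    if (r->rptr == r->wptr)
        return -1;
    *out = r->buf[r->rptr];
    r->rptr = (r->rptr + 1) & (RING_DWORDS - 1);
    return 0;
}

/* Queue a pointer to an indirect buffer, followed by a fence. */
static int ring_schedule_ib(struct sketch_ring *r, uint32_t ib_gpu_addr,
                            uint32_t ib_dwords, uint32_t fence_seq)
{
    const uint32_t pkt[5] = { PKT_IB, ib_gpu_addr, ib_dwords,
                              PKT_FENCE, fence_seq };

    return ring_write(r, pkt, 5);
}
```

Only the pointer and length travel through the ring; the command buffer contents stay where they are until the CP fetches them.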
r100_wb_init() initializes scratch register writeback, which is a feature that lets the GPU update copies of the scratch registers in GART memory. This allows the driver (running on the CPU) to access the content of those registers without having to read them from the MMIO register aperture, which requires a trip across the bus.
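The benefit of writeback is easy to see in a sketch that counts bus reads. Here `struct sketch_wb` and `scratch_read()` are hypothetical, and both the register and its writeback copy are modelled as plain variables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical writeback state for a single scratch register. */
struct sketch_wb {
    bool enabled;                       /* is writeback set up? */
    const volatile uint32_t *wb_copy;   /* copy in system memory, GPU-updated */
    const volatile uint32_t *mmio_reg;  /* the register itself, across the bus */
    unsigned mmio_reads;                /* how many slow reads were needed */
};

/* Prefer the in-memory copy; fall back to the MMIO register otherwise. */
static uint32_t scratch_read(struct sketch_wb *wb)
{
    if (wb->enabled)
        return *wb->wb_copy;
    wb->mmio_reads++;
    return *wb->mmio_reg;
}
```

With writeback enabled, polling a fence value costs an ordinary memory read instead of a round trip over the bus.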
r100_ib_init() initializes the indirect buffers used for feeding command buffers to the CP from userspace drivers like the 3D driver.
The display side is set up in radeon_modeset_init(). First we set up the display limits and mode callbacks, then we set up the output properties (radeon_modeset_create_props()) that are exposed via xrandr properties when X is running.
Next, we initialize the CRTCs in radeon_crtc_init(). CRTCs (also called display controllers) are the blocks on the chip that provide the display timing and determine where in the framebuffer a particular monitor points. A CRTC provides an independent "head." Most radeon ASICs have two CRTCs; the new Evergreen chips have six.
radeon_setup_enc_conn() sets up the connector and encoder mappings based on the video BIOS data tables. Encoders are things like DACs for analog outputs like VGA and TV, and TMDS or LVDS encoders for things like digital DVI or LVDS panels. An encoder can be tied to one or more connectors (e.g., the TV DAC is often tied to both the S-video port and a VGA port or the analog portion of a DVI-I port). The mapping is important, as you need to know which encoders are in use and what they are tied to in order to program the displays properly.
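A connector-to-encoder mapping can be pictured as a small table where each connector carries a mask of the encoders that can drive it. The enums, the `struct sketch_connector` layout, and the helper below are invented for illustration; the real mapping comes from the BIOS tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoder and connector kinds, following the text. */
enum encoder_type   { ENC_PRIMARY_DAC, ENC_TV_DAC, ENC_TMDS, ENC_LVDS };
enum connector_type { CONN_VGA, CONN_DVI_I, CONN_SVIDEO, CONN_LVDS };

#define ENC_BIT(e) (UINT32_C(1) << (e))

struct sketch_connector {
    enum connector_type type;
    uint32_t encoder_mask;   /* bit n set: encoder_type n can drive this port */
};

/* Count how many connectors are wired to a given encoder. More than one
 * means the ports share it and cannot all be driven independently. */
static unsigned connectors_on_encoder(const struct sketch_connector *conns,
                                      size_t count, enum encoder_type enc)
{
    unsigned n = 0;

    for (size_t i = 0; i < count; i++)
        if (conns[i].encoder_mask & ENC_BIT(enc))
            n++;
    return n;
}
```

For a made-up board with a DVI-I port on TMDS plus the TV DAC, an S-video port on the TV DAC, and a VGA port on the primary DAC, the helper reports two connectors sharing the TV DAC.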
radeon_hpd_init() is a macro that points to the ASIC-specific function that initializes the HPD (Hot Plug Detect) hardware for digital monitors. HPD lets you get an interrupt when a digital monitor is connected or disconnected. When this happens, the driver takes the appropriate action and generates an event which userspace apps can listen for. The app can then display a message asking the user what they want to do, etc.
Finally, radeon_fbdev_init() sets up the DRM kernel FB interface. This provides a kernel FB interface on top of the DRM for the console or other kernel FB apps.
When the driver is unloaded, the whole process happens in reverse; this time all the *_fini() functions are called to tear down the driver.
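The init-forward, fini-backward pattern can be sketched with a table of step pairs. `struct init_step`, `run_init()`, and `run_fini()` are hypothetical, and the stub steps only record a letter in a trace string so the ordering is visible. Unwinding already-completed steps when a later one fails is a common C idiom assumed here, not something the article describes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical pairing of an init step with its teardown counterpart. */
struct init_step {
    int  (*init)(void);
    void (*fini)(void);
};

/* Trace of which steps ran, so the ordering can be inspected. */
static char trace[32];

static void trace_add(char c)
{
    size_t n = strlen(trace);

    if (n + 1 < sizeof trace) {
        trace[n] = c;
        trace[n + 1] = '\0';
    }
}

static int  init_a(void)    { trace_add('A'); return 0; }
static int  init_b(void)    { trace_add('B'); return 0; }
static int  init_c(void)    { trace_add('C'); return 0; }
static int  init_fail(void) { trace_add('X'); return -1; }
static void fini_a(void)    { trace_add('a'); }
static void fini_b(void)    { trace_add('b'); }
static void fini_c(void)    { trace_add('c'); }

/* Tear down the first `count` steps in reverse order. */
static void run_fini(const struct init_step *steps, int count)
{
    for (int i = count - 1; i >= 0; i--)
        steps[i].fini();
}

/* Run the steps in order; on failure, undo the ones that succeeded. */
static int run_init(const struct init_step *steps, int count)
{
    for (int i = 0; i < count; i++) {
        if (steps[i].init() != 0) {
            run_fini(steps, i);
            return -1;
        }
    }
    return 0;
}
```

Running three successful steps and then tearing them down leaves the trace `ABCcba`; if the third step fails instead, the trace is `ABXba`.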
The next set of articles will walk through the Evergreen patches (available here), which have already been applied upstream, and explain what each patch does to bring up support for Evergreen chips.
Graphics systems in "original" Linux environments and AMD R600 graphics programming (8): AMD graphics DRM driver initialization process