x86: another article on cacheable, write-combining and uncacheable memory, and whether ioremap memory is cacheable


From: http://kerneltrap.org/mailarchive/linux-kernel/2008/4/29/1657814

Amid some heavy flaming, it's clear that there is a lot of confusion about how
cachability and ioremap cooperate on x86 at the hardware level, and how this
interacts with Linux (both before 2.6.24 and in current trees).
This email tries to describe the various aspects and constraints involved,
in the hope of taking away the confusion and making clear how Linux works,
both in the past and going forward.
(Without degrading to flames again; let's keep this thread technical, please.)

Cachable... what does it mean?
-----------------------------
For the CPU, if a piece of memory is cachable (how it decides that I'll cover
later), it means that
1) The CPU is allowed to read the data from the memory into its cache at any
   point in time, even if the program will never actually read the memory.
   Hardware prefetching, speculative execution etc. can all cause the CPU
   to get content into its caches if it's cachable. The CPU is also allowed
   to hold on to this content as long as it wants, until something in the
   system forces the CPU to remove the cache line from its cache.
2) The CPU is allowed to write the contents from its cache back to memory at
   any point in time, even if the program will never actually write to the
   cacheline; the latter is the result of speculation etc. What will be written
   in that case is the clean cacheline that was in the cache.
   (AMD CPUs seem to do this relatively aggressively; Intel CPUs may or may
   not do this.)
3) The CPU is allowed to write a full cacheline without having read it; it
   will just get the cacheline exclusive in this case.
4) The CPU is allowed to hold on to written cache lines without writing them
   back for as long as it wants, until something in the cache coherency
   protocol forces a commit or discard.

Practically speaking this means that a memory location that the CPU sees as
cachable needs to be on some device that takes part in the cache coherency
protocol, or be a very special case (such as ROM) for which:
- The device must be readable.
- Writing must be idempotent, ordering-independent and access-size-independent.
- Writing back a read value must be safe and side-effect-free.
- Any side effects due to a write can be delayed until the data is explicitly
  flushed by software.

Anything else will lead to data loss (read: corruption) or other "very weird",
unpredictable things happening.
Regular memory is cache coherent, and DMA (with a few very special case
exceptions that are beyond the scope of this document) is cache coherent with
the CPU on a PC. PCI MMIO regions and other similar pieces of device memory
are not cache coherent.

Uncachable... What does that mean?
---------------------------------
Uncachable is easier than cachable for the CPU... In short it means that
1) Every read will go over the bus and will come from the actual device,
   not the CPU's caches.
2) Every write will go over the bus and will bypass the CPU's caches.
Note: On PCI, the PCI chipset is allowed to buffer (post) such writes and
group them into bigger transactions before devices actually see the data.
However, reads will not pass writes.

Write combining... What does that mean?
--------------------------------------
Write combining is like uncachable in several ways, with one exception:
1) The CPU is allowed to buffer and group consecutive writes into bigger I/Os.
This feature is used mostly by graphics cards for accelerating bigger copies
into their video memory.
What happens if you read the data that is being buffered is somewhat undefined;
however, if your CPU supports "self snooping" ("ss" in /proc/cpuinfo) then the
expected thing happens (you get the data you just wrote).
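
As a rough illustration of the self-snooping check mentioned above, the C
sketch below looks for the "ss" flag in the first "flags" line of
/proc/cpuinfo. The helper names flags_line_has() and cpu_has_self_snoop() are
made up for this example; only the file and the flag name come from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if `flag` appears as a whole whitespace-separated word in
 * `line`, so that "ss" does not match "sse" or "ssse3". */
bool flags_line_has(const char *line, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = line;

    assert(len > 0);
    while ((p = strstr(p, flag)) != NULL) {
        bool starts_word = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        char end = p[len];
        bool ends_word = (end == '\0' || end == ' ' || end == '\n');

        if (starts_word && ends_word)
            return true;
        p++;
    }
    return false;
}

/* Returns 1 if the first "flags" line lists "ss", 0 if it does not,
 * and -1 if /proc/cpuinfo cannot be opened. */
int cpu_has_self_snoop(void)
{
    char buf[8192];
    FILE *f = fopen("/proc/cpuinfo", "r");
    int result = 0;

    if (!f)
        return -1;
    while (fgets(buf, sizeof buf, f)) {
        if (strncmp(buf, "flags", 5) == 0) {
            const char *colon = strchr(buf, ':');

            if (colon && flags_line_has(colon + 1, "ss"))
                result = 1;
            break;  /* all CPUs normally report the same flags */
        }
    }
    fclose(f);
    return result;
}
```

The whole-word match matters here because a plain substring search for "ss"
would also hit the SSE feature flags.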

Mixing... can we do that?
------------------------
What if you mix the rules from above for the same piece of physical memory?

The short answer is: Don't do that!

The longer answer is: very weird things can happen, including CPU or chipset
lockups. The software developer manuals from various CPU vendors explain how
you can safely do transitions from one to the other.

How does something become cachable/uncachable?
------------------------------------------------
So far so good, easy stuff. However, now it gets more tricky (and more x86
specific).
First of all, there are some CPU configuration bits (CR registers and MSRs)
that allow you to turn caching on/off entirely. The BIOS will turn these on,
and Linux will pretty much never touch these, so I'll leave them out for the
rest of this discussion and just assume these are enabled.

There are 2 major factors that decide if a (virtual) memory location is
considered cachable by the CPU:
1) The page table bits for the virtual memory location
2) The memory type range registers (MTRRs) for the physical memory location

The page table bits can select, in practice, 4 different settings for a piece
of memory (these are often called PAT bits):
1) Cachable (default)
2) Write combining
3) Weak uncachable (UC-, "UC minus")
4) Strong uncachable
The MTRRs can also select between different settings:
1) Cachable
2) Write combining
3) Uncachable

If both the page table and the MTRR agree, things are easy. But what if they
disagree?
The table below describes the end result for the various combinations:

      | UC-  WC   WB    [PAT]
 -----+---------------
  UC  | UC   WC   UC
  WC  | UC   WC
  WB  | UC   WC   WB
[MTRR]

UC  - (strong) uncachable
UC- - weak uncachable
WC  - write combining
WB  - write back (cachable)
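
For readers who prefer code, the table can be written down as a lookup. This
is only a transcription of the table as printed above, not of the vendor
manuals: the third entry of the WC row is missing in this copy, so it is
returned as MT_UNSPECIFIED, as is any combination the table does not list.
The enum and function names are invented for the example.

```c
#include <assert.h>

enum mem_type { MT_UC, MT_UC_MINUS, MT_WC, MT_WB, MT_UNSPECIFIED };

/* Row of the table for an MTRR type, or -1 if the table has no such row. */
static int mtrr_row(enum mem_type t)
{
    switch (t) {
    case MT_UC: return 0;
    case MT_WC: return 1;
    case MT_WB: return 2;
    default:    return -1;
    }
}

/* Column of the table for a PAT type, or -1 if the table has no such column. */
static int pat_col(enum mem_type t)
{
    switch (t) {
    case MT_UC_MINUS: return 0;
    case MT_WC:       return 1;
    case MT_WB:       return 2;
    default:          return -1;
    }
}

/* Effective memory type for an MTRR/PAT combination, as printed above. */
enum mem_type effective_type(enum mem_type mtrr, enum mem_type pat)
{
    /* Rows: MTRR = UC, WC, WB.  Columns: PAT = UC-, WC, WB. */
    static const enum mem_type table[3][3] = {
        { MT_UC, MT_WC, MT_UC },
        { MT_UC, MT_WC, MT_UNSPECIFIED },  /* entry missing in this copy */
        { MT_UC, MT_WC, MT_WB },
    };
    int row, col;

    /* Guard against values outside the enum. */
    assert(mtrr <= MT_UNSPECIFIED && pat <= MT_UNSPECIFIED);

    row = mtrr_row(mtrr);
    col = pat_col(pat);
    if (row < 0 || col < 0)
        return MT_UNSPECIFIED;
    return table[row][col];
}
```

For example, effective_type(MT_UC, MT_WC) yields MT_WC, which is the "turn
uncachable space into write combining space" case for graphics memory.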

What happens on PCs
-------------------
On PCs, if the BIOS is not too buggy, the BIOS will set up the MTRRs such that
all regular memory is cachable and all MMIO space is set to uncachable,
with a possible exception for the video memory, which may be set to write
combining.
Linux will not remove MTRRs the BIOS sets up.
(Removing them tends to give problems in SMM mode or with suspend/resume.)
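
To see what the BIOS actually set up, kernels built with MTRR support expose
the registers in /proc/mtrr, one per line. The sketch below pulls the base
address and type string out of one such line. It assumes the usual layout
"regNN: base=0x... (...MB), size=...MB, count=N: <type>", which is not
described in this text, and parse_mtrr_line() is a made-up helper name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parse one /proc/mtrr line. On success stores the base address in *base,
 * copies the type string (e.g. "write-back") into `type`, and returns 0.
 * Returns -1 if the line does not look like an MTRR entry or the type
 * string does not fit into `type_len` bytes. */
int parse_mtrr_line(const char *line, unsigned long long *base,
                    char *type, size_t type_len)
{
    const char *b, *t;
    size_t n;

    assert(line && base && type);

    b = strstr(line, "base=0x");
    t = strrchr(line, ':');          /* the type follows the last colon */
    if (!b || !t || t < b)
        return -1;

    /* Skip "base="; strtoull() with base 16 accepts the "0x" prefix. */
    *base = strtoull(b + 5, NULL, 16);

    t++;
    while (*t == ' ')
        t++;
    n = strcspn(t, "\n");
    if (n == 0 || n >= type_len)
        return -1;
    memcpy(type, t, n);
    type[n] = '\0';
    return 0;
}
```

Feeding each line of /proc/mtrr through this function gives a quick view of
which physical ranges the BIOS marked write-back, uncachable or
write-combining.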

The operating system tends to use the page table bits to control cachability,
and Linux (well, X.org) will add MTRRs for the graphics memory if the BIOS did
not set this up as write combining and there are some free MTRRs left for
programming.

The net effect of this is (see the table above) that MMIO space is not cachable
by the CPU; the only thing that the OS can do is turn uncachable space into
write combining space for a few special cases.

Regular memory is cachable; the OS can decide to mark pieces of it uncachable.
This may be useful for very specific hardware tricks and for things like AGP
textures or video cards that use main memory as video RAM.

ioremap - past, present and future
----------------------------------
ioremap() is the Linux API to "map" memory on devices (such as MMIO space on
PCI cards) into the kernel's address space so that Linux can then access this
memory, generally from the device driver.

Up to Linux version 2.6.24, Linux would not set any special cache bits in the
page table for ioremap()'d device memory on x86. In practice, as long as the
BIOS was not too buggy, the MTRRs would take care of making sure that card
memory was accessed in an uncached way (see the table). Occasionally the
BIOS would be buggy and weird things would happen. Other types of memory
that get ioremap'd... that depends on the BIOS.

Quite some time ago, an API function called "ioremap_uncached()" was introduced
that, in theory, should be used when the device driver knows it only wants an
uncachable memory mapping. Use of this API is limited to a handful of
drivers, even though the vast majority really want (and get) uncachable memory.

Recently, the behavior of ioremap() has changed: ioremap() now explicitly sets
the (weak) uncachable bits in the page table. An ioremap_cached() function can
be used by the handful of places that really want a cached mapping (but beware
of the caching rules! PCI MMIO space shouldn't use this unless it's a ROM! See
the rules above.)

There are several reasons for this change:
1) MTRRs are a problem and Linux is, over the next kernel releases, going to
   depend less and less on them (the PAT work is a step in that direction).
2) Depending on the sanity of the BIOS is a trap, especially since there are
   good ways to make sure we get the type we want (uncachable).
3) Almost all users want uncachable memory, even though they don't explicitly
   ask for it.
4) Most other architectures already make ioremap() explicitly uncached.

The rest of the story was a large flamewar that I'm not going to repeat here;
the intention of this text was to make explicit what behavior is happening
so that everyone can understand how this stuff works.
