GEM vs. TTM

From: http://lwn.net/Articles/283793/

Getting high-performance, three-dimensional graphics working under Linux is quite a challenge even when the fundamental hardware programming information is available. One component of this problem is memory management: a graphics processor (GPU) is, essentially, a computer of its own with a distinct view of memory. Managing the GPU's memory, and its view of system RAM, must be done carefully if the resulting system is to work at all, much less with acceptable performance.

Not that long ago, it appeared that this problem had been solved with the translation table maps (TTM) subsystem. TTM remains outside of the mainline kernel, though, as do all drivers which use it. A recent query about what would be required to get TTM merged led to an interesting discussion where it turned out that, in fact, TTM may not be the future of graphics memory management after all.

A number of complaints about TTM have been raised. Its API is far larger than is needed for any free Linux driver; it has, in other words, a certain amount of code dedicated to the needs of binary-only drivers. The fencing mechanism (which manages concurrency between the host CPUs and the GPU) is seen as being complex, difficult to work with, and not always yielding the best performance. Heavy use of memory-mapped buffers can create performance problems of its own. The TTM API is an exercise in trying to provide for everything in all situations; as a result it is, according to some driver developers, hard to match to any specific hardware, hard to get started with, and still insufficiently flexible. And, importantly, there is a distinct shortage of working free drivers which use TTM. So Dave Airlie worries:

I was hoping that by now, one of the radeon or nouveau drivers would have adopted TTM, or at least demoed something working using it, this hasn't happened which worries me... The real question is whether TTM suits the driver writers for use in Linux desktop and embedded environments, and I think so far I'm not seeing enough positive feedback from the desktop side.

All of these worries would seem to be moot, since TTM is available and there is nothing else out there. Except that, as it turns out, there is something out there: it's called the Graphics Execution Manager, or GEM. The Intel-sponsored GEM project is all of one month old, as of this writing. The GEM developers had not really intended to announce their work quite yet, but the TTM discussion brought the issue to the fore.

Keith Packard's introduction to GEM includes a document describing the API as it exists so far. There are a number of significant differences in how GEM does things. To begin with, GEM allocates graphical buffer objects using normal, anonymous, user-space memory. That means that these buffers can be forced out to swap when memory gets tight. There are clear advantages to this approach, and not just in memory flexibility: it also makes the implementation of suspend and resume easier by automatically providing backing store for all buffer objects.

The GEM API tries to do away with the mapping of buffers into user space. That mapping is expensive to do and brings all sorts of interesting issues with cache coherency between the CPU and GPU. So, instead, buffer objects are accessed with simple read() and write() calls. Or, at least, that's the way it would be if the GEM developers could attach a file descriptor to each buffer object. The kernel, however, does not make the management of that many file descriptors easy (yet), so the real API uses separate handles for buffer objects and a series of ioctl() calls.
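
To make the handle-based access pattern concrete, here is a minimal user-space sketch. It is a toy model only, not the GEM interface: the `BufferManager` class and its `create`, `pwrite`, `pread`, and `close` methods are hypothetical names chosen for illustration, standing in for whatever ioctl() calls the real API provides.

```python
# Toy model of handle-based buffer access; NOT the real GEM ioctl() interface.
# All names here (BufferManager, create, pwrite, pread, close) are hypothetical.
import itertools


class BufferManager:
    """Hands out small integer handles instead of one file descriptor per buffer."""

    def __init__(self):
        self._buffers = {}                      # handle -> bytearray backing store
        self._next_handle = itertools.count(1)  # handles start at 1

    def create(self, size):
        """Allocate a zero-filled buffer object and return its handle."""
        handle = next(self._next_handle)
        self._buffers[handle] = bytearray(size)
        return handle

    def pwrite(self, handle, offset, data):
        """Copy data into the buffer at the given offset; no mapping involved."""
        buf = self._buffers[handle]
        if offset < 0 or offset + len(data) > len(buf):
            raise ValueError("write outside buffer object")
        buf[offset:offset + len(data)] = data
        return len(data)

    def pread(self, handle, offset, length):
        """Copy length bytes out of the buffer starting at offset."""
        buf = self._buffers[handle]
        if offset < 0 or offset + length > len(buf):
            raise ValueError("read outside buffer object")
        return bytes(buf[offset:offset + length])

    def close(self, handle):
        """Release the buffer object; the handle becomes invalid."""
        del self._buffers[handle]


mgr = BufferManager()
vertices = mgr.create(16)                 # caller only ever sees the handle
mgr.pwrite(vertices, 4, b"\x01\x02\x03")  # copy data in at offset 4
print(mgr.pread(vertices, 0, 8))
```

The point of the sketch is that the caller never holds a pointer into the buffer, so the manager remains free to move or swap the backing store between calls.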

That said, it is possible to map a buffer object into user space. But then the user-space driver must take explicit responsibility for the management of cache coherency. To that end, there is a set of ioctl() calls for managing the "domain" of a buffer; the domain, essentially, describes which component of the system owns the buffer and is entitled to operate on it. Changing the domains (there are two, one for read access and one for writes) of a buffer will perform the necessary cache flushes. In a sense, this mechanism resembles the streaming DMA API, where the ownership of DMA buffers can be switched between the CPU and the peripheral controller. That is not entirely surprising, as a very similar problem is being solved.
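
The domain idea can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the GEM implementation: the `DomainTrackedBuffer` class, the `set_domain` method, and the specific flush and invalidate rules are all invented here to show how a change of ownership might trigger cache maintenance.

```python
# Toy simulation of read/write domains; the class, method, and flush rules
# are assumptions made for illustration, not the real GEM behaviour.
CPU, GPU = "cpu", "gpu"


class DomainTrackedBuffer:
    def __init__(self, size):
        self.data = bytearray(size)
        self.read_domain = CPU    # who may currently read the buffer
        self.write_domain = CPU   # who may currently write it (None = nobody)
        self.cache_ops = []       # log of simulated cache maintenance

    def set_domain(self, read_domain, write_domain):
        """Hand the buffer to a new owner, doing cache work only if needed."""
        # The previous writer may hold dirty cache lines: flush them first.
        if self.write_domain is not None and self.write_domain != read_domain:
            self.cache_ops.append(("flush", self.write_domain))
        # A new reader may hold stale cache lines: invalidate them.
        if read_domain != self.read_domain:
            self.cache_ops.append(("invalidate", read_domain))
        self.read_domain = read_domain
        self.write_domain = write_domain


buf = DomainTrackedBuffer(64)
buf.set_domain(GPU, GPU)    # CPU filled the buffer; GPU will now render into it
buf.set_domain(GPU, GPU)    # same owner again: no cache work is logged
buf.set_domain(CPU, None)   # CPU wants to read the result back
for op in buf.cache_ops:
    print(op)
```

As with the streaming DMA API, the cost is paid at the ownership transition; repeated access from the same side is free in this model.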

This API also does away with the need for explicit fence operations. Instead, a CPU operation which requires access to a buffer will simply wait, if necessary, for the GPU to finish any outstanding operations involving that buffer.
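
A rough sketch of this implicit waiting, with a thread standing in for the GPU, might look like the following. The `Buffer` class and its `submit_gpu_fill` and `cpu_read` methods are hypothetical; the sketch only shows that the caller never creates or waits on a fence object itself.

```python
# Toy model of implicit synchronization; a thread plays the part of the GPU.
# Buffer, submit_gpu_fill, and cpu_read are hypothetical names for illustration.
import threading
import time


class Buffer:
    def __init__(self, size):
        self.data = bytearray(size)
        self._pending = 0                    # outstanding "GPU" operations
        self._cond = threading.Condition()

    def submit_gpu_fill(self, value, delay=0.05):
        """Queue a simulated GPU command that fills the buffer later."""
        with self._cond:
            self._pending += 1               # buffer is now busy

        def run():
            time.sleep(delay)                # stand-in for GPU execution time
            with self._cond:
                self.data[:] = bytes([value]) * len(self.data)
                self._pending -= 1
                self._cond.notify_all()      # wake any waiting CPU access

        threading.Thread(target=run, daemon=True).start()

    def cpu_read(self):
        """CPU access blocks until no GPU work involving the buffer remains."""
        with self._cond:
            self._cond.wait_for(lambda: self._pending == 0)
            return bytes(self.data)


frame = Buffer(4)
frame.submit_gpu_fill(0xAB)
result = frame.cpu_read()                    # waits for the fill to complete
print(result.hex())
```

The synchronization still exists, of course; it has simply moved behind the access call, where the caller cannot get it wrong.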

Finally, the GEM API does not try to solve the entire problem; a number of important operations (such as the execution of a set of GPU commands) are left for the hardware-specific driver to implement. GEM is, thus, quite specific to the needs of Intel's driver at this time; it does not try for the same sort of generality that was a goal of TTM. As described by Eric Anholt:

The problem with TTM is that it's designed to expose one general API for all hardware, when that's not what our drivers want... We're trying to come at it from the other direction: implement one driver well. When someone else implements another driver and finds that there's code that should be common, make it into a support library and share it.

The advantage to this approach is that it makes it relatively easy to create something which works well with Intel drivers. And that may well be a good start; one working set of drivers is better than none. On the other hand, that means that a significant amount of work may be required to get GEM to the point where it can support drivers for other hardware. There seem to be two points of view on how that might be done: (1) add capabilities to GEM when needed by other drivers, or (2) have each driver use its own memory manager.

The first approach is, in several ways, more pleasing. But it implies that the GEM API could change significantly over time. And that, in turn, could delay the merging of the whole thing; the GEM API is exported to user space and, as a result, must remain compatible as things change. So there may be resistance to a quick merge of an API which looks like it may yet have to evolve for some time.

The second approach, instead, is best described by Dave Airlie:

Well the thing is I can't believe we don't know enough to do this in some way generically, but maybe the TTM vs GEM thing proves its not possible. So we can then punt to having one memory manager per driver, but I suspect this will be a maintenance nightmare, so if people decide this is the way forward, I'm happy to see it happen. However the person submitting the memory manager n + 1 must damn well be willing to stand behind the interface until time ends, and explain why they couldn't re-use 1..n memory managers.

One other remaining issue is performance. Keith Whitwell posted some benchmark results showing that the i915 driver performs significantly worse with either TTM or GEM than it does without them. Keith Packard gets different results, though; his tests show that the GEM-based driver is significantly faster. Clearly there is a need for a set of consistent benchmarks; performance of graphics drivers is important, but performance cannot be optimized if it cannot be reliably measured.

The use of anonymous memory also raises some performance concerns: a first-person shooter game will not provide the same experience if its blood-and-gore textures must be continually paged in. Anonymous memory can also be high memory and, thus, not necessarily accessible via a 32-bit pointer. Some GPU hardware cannot address high memory; that will likely force the use of bounce buffers within the kernel. In the end, GEM will have to prove that it can deliver good performance; GEM's developers are highly motivated to make their hardware look good, so there is a reasonable chance that things will work out on this front.
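
For readers unfamiliar with bounce buffers, the idea can be sketched as follows. Everything in this example is an assumption made for illustration: the 4 GiB limit, the `prepare_for_dma` function, and the `alloc_low_page` helper do not correspond to any kernel interface. The sketch only shows where the extra copy comes from.

```python
# Toy illustration of a bounce buffer; the limit and all names are hypothetical.
DMA_LIMIT = 1 << 32   # assumed: the device can only address the low 4 GiB

low_memory = {}       # simulated pool of device-reachable pages: addr -> bytes


def alloc_low_page(size):
    """Pretend to allocate a page that the device is able to address."""
    addr = 0x1000 * (len(low_memory) + 1)
    low_memory[addr] = bytearray(size)
    return addr, low_memory[addr]


def prepare_for_dma(page_addr, payload):
    """Return (address the device should use, whether a copy was needed)."""
    if page_addr + len(payload) <= DMA_LIMIT:
        return page_addr, False        # device can reach the page directly
    bounce_addr, bounce_buf = alloc_low_page(len(payload))
    bounce_buf[:] = payload            # the extra copy is the cost of bouncing
    return bounce_addr, True


direct = prepare_for_dma(0x2000, b"abcd")    # low page: used in place
bounced = prepare_for_dma(1 << 33, b"abcd")  # high page: copied down first
print(direct, bounced)
```

Every bounced transfer costs a memory copy on top of the transfer itself, which is why this matters for the performance question raised above.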

The conclusion to draw from all of this is that the GPU memory management problem cannot yet be considered solved. GEM might eventually become that solution, but it is a very new API which still needs a fair amount of work. There is likely to be a lot of work yet to be done in this area.

(Thanks to Timo Jyrinki for suggesting this topic.)
