Why do caches have multiple levels? Why is each level bigger than the one before it? Is a bigger cache always better?

Source: Internet
Author: User

How to balance L1, L2 and L3 within a fixed transistor budget to get the best overall result is a balancing art. After years of practice, the split is now relatively fixed. The L1 hit rate on Intel and AMD CPUs is now often above 95%, so adding more L1 has little effect. The current trend is to enlarge L3 instead, to get more done for the same cost.

Can the cache be made very large?

L3 is now tens of megabytes, far more generous than it used to be, but its growth lags well behind the growth in main-memory capacity that Moore's law has delivered. Why does cache grow so slowly? Once again the problem is cost. Even the simplest SRAM cell uses 6 transistors to store one bit.

Add the tag and you need dozens of transistors at a minimum, which is very expensive. So how much hit rate, the usual measure of cache performance, does a larger cache buy for that cost?
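To get a feel for the numbers, here is a minimal sketch that counts the transistors in a cache's data and tag arrays. Only the 6 transistors per bit comes from the text. The geometry (32 KiB, 64-byte lines, 8-way set associative, 48-bit physical addresses), the helper name `cache_transistors`, and the choice to hold tag bits in 6T cells are all assumptions made for illustration. Valid/dirty bits, comparators, decoders and sense amplifiers are ignored.

```python
TRANSISTORS_PER_SRAM_BIT = 6  # the 6T cell mentioned in the text


def cache_transistors(capacity_bytes, line_bytes=64, ways=8, addr_bits=48):
    """Rough transistor count for data and tag arrays (hypothetical helper).

    Assumes power-of-two sizes and that tag bits also live in 6T cells.
    Ignores valid/dirty/replacement bits and all peripheral logic.
    """
    lines = capacity_bytes // line_bytes
    sets = lines // ways
    offset_bits = line_bytes.bit_length() - 1  # log2 of the line size
    index_bits = sets.bit_length() - 1         # log2 of the number of sets
    tag_bits = addr_bits - index_bits - offset_bits
    data = capacity_bytes * 8 * TRANSISTORS_PER_SRAM_BIT
    tags = lines * tag_bits * TRANSISTORS_PER_SRAM_BIT
    return data, tags


data, tags = cache_transistors(32 * 1024)
print(f"data array: {data:,} transistors")
print(f"tag array:  {tags:,} transistors")
```

Under these assumptions a 32 KiB cache already needs more than 1.5 million transistors for data alone, and the count scales linearly with capacity.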

To simplify, we assume the L1 hit rate is held at just under 60% (in reality it is around 95%). As L2 capacity grows, the L2 hit rate and the overall hit rate both rise quickly at first, which shows that adding L2 capacity pays off clearly. Once capacity passes 64KB, the L2 hit rate grows more slowly, the overall hit rate slows with it, and eventually it barely changes. Each additional batch of transistors buys less benefit: the marginal utility diminishes.
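The link between the L2 hit rate and the overall hit rate is simple arithmetic, because only accesses that miss in L1 ever reach L2. The sketch below uses the 60% L1 figure assumed above. The L2 hit rates in the loop are made-up values for illustration, not measurements.

```python
def overall_hit_rate(l1_hit, l2_hit):
    """Fraction of all accesses served by L1 or L2.

    l2_hit is the local L2 hit rate: L2 hits divided by accesses
    that actually reach L2 (that is, L1 misses).
    """
    return l1_hit + (1.0 - l1_hit) * l2_hit


L1_HIT = 0.60  # simplifying assumption from the text
for l2_hit in (0.20, 0.50, 0.80, 0.90, 0.95):  # illustrative values only
    total = overall_hit_rate(L1_HIT, l2_hit)
    print(f"L2 hit {l2_hit:.0%} -> overall {total:.1%}")
```

With L1 at a realistic 95%, the same L2 improvement moves the overall rate far less, since only 5% of accesses reach L2 in the first place.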

How is a cache organized, and how does it work? Do different mapping schemes make a difference to the result?

Image source: ExtremeTech

Note that the left side is occupied by the L2 cache and the right side also holds some L1 cache. Overall, more than 65% of the die area is used for cache.

A thought experiment

Let us set aside the annoying cost and process problems for a moment. Suppose money is no object, and we build an L1 cache the size of a sheet of A4 paper with our precious core sitting in the middle.

Our L1 cache could now hold 1GB, and we would not need main memory at all. Wouldn't that give us a CPU with awesome performance? Unfortunately, not at all. With an L1 that large, every cell has to be reachable within one clock domain (a synchronous clock design is much simpler than an asynchronous one), so the clock frequency cannot be high. And because the core has to complete an L1 access within a cycle, it must give up its 3GHz-plus clock and run at a much lower frequency as well. Taken together, the result may well be worse than using part of the area for L1 and the rest for L2 and L3. That way L1 can run at a very high frequency and the core does not have to slow itself down. This shows, from another angle, why a cache hierarchy is needed.
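A back-of-the-envelope calculation shows how severe the frequency limit is. The sketch below assumes a request has to travel from the core at the centre of the A4 sheet to the farthest corner and back within one cycle, and that the signal moves at the speed of light in vacuum. Both are simplifying assumptions, and the second is very optimistic, since real on-chip wires are far slower. The function name is hypothetical.

```python
import math

C = 299_792_458.0          # speed of light in vacuum, m/s
A4_W, A4_H = 0.210, 0.297  # A4 sheet dimensions in metres


def max_single_cycle_hz(width_m, height_m, signal_speed=C):
    """Upper bound on the clock rate if a signal must go from the centre
    to the farthest corner and back within a single cycle."""
    half_diagonal = math.hypot(width_m / 2, height_m / 2)
    round_trip_s = 2 * half_diagonal / signal_speed
    return 1.0 / round_trip_s


f = max_single_cycle_hz(A4_W, A4_H)
print(f"best-case clock: {f / 1e9:.2f} GHz")
```

Even with this physically unreachable best case, the bound comes out below 1 GHz, well short of the 3GHz-plus clock mentioned above.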

Conclusion

Intel adds a new cache level roughly every 10 years. So far we have reached the L4 cache. On the other hand, whether the same number of transistors is best spent on cache, on a better branch predictor, or on more cores is also a test of the designers' skill.

The next article will introduce CAM (content-addressable memory), which forms the main body of the tag, and its relationship to SRAM. Stay tuned.

