Question: How do I implement last-level cache (LLC) partitioning for a simulated CPU? For example, given a 2 MB LLC with 64-byte cache lines and 32-way set associativity, how do I restrict it to a 16-way partition while keeping the number of cache sets unchanged?
For example, the following shows a cache with 4 sets and 8 ways after partitioning, where only 4 ways per set remain valid (flag 1). Each row is one set and each column is one way.
1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
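To make the numbers in the question concrete, here is a small stand-alone calculation (plain Python, not gem5 code; the function names are hypothetical). It assumes the usual relation sets = size / (line size x associativity) and shows that limiting the cache to 16 of 32 ways leaves the set count alone while halving the usable capacity.

```python
def cache_geometry(size_bytes, line_bytes, assoc, part):
    """Return (num_sets, usable_bytes) for a cache limited to `part` ways."""
    if not 1 <= part <= assoc:
        raise ValueError("part must be between 1 and assoc")
    # The number of sets depends on the full associativity, not on part.
    num_sets = size_bytes // (line_bytes * assoc)
    usable_bytes = num_sets * part * line_bytes
    return num_sets, usable_bytes


def valid_mask(num_sets, assoc, part):
    """One row per set; 1 marks a way inside the partition, 0 a way outside."""
    return [[1 if way < part else 0 for way in range(assoc)]
            for _ in range(num_sets)]


if __name__ == "__main__":
    sets, usable = cache_geometry(2 * 1024 * 1024, 64, 32, 16)
    print(sets, usable)  # 1024 sets, 1048576 usable bytes (1 MB)
    for row in valid_mask(4, 8, 4):
        print(" | ".join(str(flag) for flag in row))
```

The second call reproduces the 4-set, 8-way picture above.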
Ideas:
- The key files for partitioning are lru.cc and cacheset.hh;
- In lru.cc, the block access function accessBlock() and the replacement function findVictim() need to be modified;
- accessBlock() must ensure that each LLC access touches only the content within the partition (the portion marked 1);
- findVictim() must ensure that, when an LLC replacement occurs, the victim is the last block within the partition;
- Add an option for specifying the partition size, such as --part=16.
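The two behaviors in this list can be sketched on a single set before touching any gem5 source. The following is only an illustrative Python model: the Blk class and both helper functions are hypothetical stand-ins, not gem5 APIs. Lookup scans ways 0 to part-1 only, and the replacement candidate is the block in way part-1.

```python
class Blk(object):
    """A stand-in for a cache block: just a tag and a valid bit."""

    def __init__(self, tag=None, valid=False):
        self.tag = tag
        self.valid = valid


def find_blk(blks, tag, part):
    """Search only ways [0, part). Returns (way_id, blk); way_id == part on a miss."""
    for i in range(part):
        if blks[i].valid and blks[i].tag == tag:
            return i, blks[i]
    return part, None


def find_victim(blks, part):
    """The replacement candidate is the last way inside the partition."""
    return blks[part - 1]


if __name__ == "__main__":
    blks = [Blk() for _ in range(8)]       # one 8-way set
    blks[1] = Blk(tag=0xA, valid=True)     # inside a 4-way partition
    blks[5] = Blk(tag=0xB, valid=True)     # outside a 4-way partition
    print(find_blk(blks, 0xA, 4)[0])       # way 1: visible
    print(find_blk(blks, 0xB, 4)[0])       # 4: miss, way 5 is never searched
    print(find_blk(blks, 0xB, 8)[0])       # way 5: visible without partitioning
```

Setting part equal to the associativity gives back the unpartitioned behavior, which is why the L1 and L2 caches can share the same code path.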
Method:
1. Modify configs/common/Options.py to add the --part option;
2. Make the partition size a parameter of the cache, like the cache's assoc parameter: modify src/mem/cache/BaseCache.py and add part = Param.Int("PartitionSize")
3. To make part take effect in the cache configuration, modify the settings for the L1, L2, and L3 caches in configs/common/CacheConfig.py, as follows:
# L3 cache definition: add part=options.part to specify the LLC partition size
if options.l3cache:
    system.l3 = l3_cache_class(clk_domain=system.cpu_clk_domain,
                               size=options.l3_size,
                               part=options.part,
                               assoc=options.l3_assoc)

# Set the part parameter of the L1 caches. Only the LLC is partitioned,
# so part equals the L1 associativity and behavior is unchanged.
if options.caches:
    icache = icache_class(size=options.l1i_size,
                          part=options.l1i_assoc,
                          assoc=options.l1i_assoc)
    dcache = dcache_class(size=options.l1d_size,
                          part=options.l1d_assoc,
                          assoc=options.l1d_assoc)

# Set the part parameter of the L2 cache. Only the LLC is partitioned,
# so part equals l2_assoc and behavior is unchanged.
if options.l2cache:
    system.cpu[i].l2 = l2_cache_class(clk_domain=system.cpu_clk_domain,
                                      size=options.l2_size,
                                      part=options.l2_assoc,
                                      assoc=options.l2_assoc)
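As a rough illustration of steps 1 to 3, the option definition and a sanity check might look like the sketch below. It is self-contained rather than a patch to the real configuration files: the use of optparse, the default values, the help strings, and the check_part helper are all assumptions made for this example.

```python
from optparse import OptionParser


def build_parser():
    parser = OptionParser()
    # Stand-in for the existing associativity option (default is illustrative).
    parser.add_option("--l3_assoc", type="int", default=32,
                      help="L3 associativity")
    # Hypothetical definition of the new option; default and help are assumptions.
    parser.add_option("--part", type="int", default=16,
                      help="number of LLC ways that remain usable")
    return parser


def check_part(part, assoc):
    """Reject partition sizes that cannot name a valid way of the set."""
    if not 1 <= part <= assoc:
        raise ValueError("--part must be in [1, %d], got %d" % (assoc, part))
    return part


if __name__ == "__main__":
    options, _ = build_parser().parse_args(["--l3_assoc=32", "--part=16"])
    print(check_part(options.part, options.l3_assoc))
```

A check of this kind is worth having because the replacement code indexes the block array with part - 1, so a value of 0 or a value larger than the associativity would point outside the set.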
4. For the part partition to affect cache access behavior, the replacement policy must be changed. First, define part in src/mem/cache/tags/lru.hh:
// Add the definition of part below assoc. It must be in this order, which
// needs to match the order of the definitions in src/mem/cache/BaseCache.py.
protected:
    /** The associativity of the cache. */
    const unsigned assoc;
    /** The partition size. by sff */
    const unsigned part;
    /** The number of sets in the cache. */
    const unsigned numSets;
Second, modify src/mem/cache/tags/lru.cc to initialize the part parameter.
// Add the initialization part(p->part)
LRU::LRU(const Params *p)
    : BaseTags(p), assoc(p->assoc), part(p->part),
      numSets(p->/ (p->* p->assoc))
Third, add the part parameter to the LRU class definition in src/mem/cache/tags/Tags.py, because the Params *p reference requires it.
class LRU(BaseTags):
    'LRU'
    'LRU'
    "mem/cache/tags/lru.hh"
    "associativity")
    "partition"
Testing shows that the part parameter is not recognized in CacheSet, so src/mem/cache/tags/cacheset.hh must be modified to add its definition:
template <class Blktype>
class CacheSet
{
  public:
    /** The associativity of this set. */
    int assoc;
    // Added:
    /** The partition size. */
Further testing shows that the part value seen in CacheSet is incorrect, because sets[i].part is never assigned in src/mem/cache/tags/lru.cc.
// In the LRU initialization, add sets[i].part = part;
unsigned0; // index into blks array
for (unsigned i = 0; i < numSets; ++i) {
    sets[i].assoc = assoc;
    sets[i].part = part;
Test method notes:
Use the following debugging procedure to test. Note that L1, L2, and L3 are given different settings so that they are easy to tell apart while debugging.
# After modifying anything under src, be sure to rebuild before running; otherwise the changes will not take effect.
scons -j8 build/ALPHA/gem5.debug
# Start debugging
gdb --args build/ALPHA/gem5.debug configs/example/spec06_l3_se.py --benchmark=bzip2 -n 1 --cpu-type=detailed --cpu-clock=2GHz --caches --l1i_size= +KB --l1i_assoc=4 --l1d_size= +KB --l1d_assoc=8 --l2cache --l2_size= theKB --l2_assoc= - --l3cache --l3_size=2MB --l3_assoc= + --part=9
# Set breakpoints
b lru.cc: the      # in the LRU initialization code
b cacheset.hh: -   # at way_id = i; to test part
# After running to lru.cc:80, print the following to check whether the settings are correct; the same applies in CacheSet
p assoc
p part
p name()
5. Modify access control for the partition. A cache access calls the findBlk function in src/mem/cache/tags/cacheset.hh, so modify that function to limit LLC accesses to the first part ways.
template <class Blktype>
Blktype*
CacheSet<Blktype>::findBlk(Addr tag, int& way_id) const
{
    /**
     * Way_id returns the id of the way that matches the block
     * If no block is found way_id is set to assoc.
     */
    /**
     * This is the original code.
    way_id = assoc;
    for (int i = 0; i < assoc; ++i) {
        if (blks[i]->tag == tag && blks[i]->isValid()) {
            way_id = i;
            return blks[i];
        }
    }
    */
    // Modified for LLC partitioning. L1 and L2 accesses behave normally.
    way_id = part;
    for (int i = 0; i < part; ++i) {
        if (blks[i]->tag == tag && blks[i]->isValid()) {
            way_id = i;
            return blks[i];
        }
    }
    return NULL;
}
6. Modify how blocks are replaced within the partition. In the findVictim function in src/mem/cache/tags/lru.cc, select the last block of the part partition as the victim.
LRU::BlkType*
LRU::findVictim(Addr addr, PacketList &writebacks)
{
    unsigned set = extractSet(addr);
    // grab a replacement candidate
    // BlkType *blk = sets[set].blks[assoc - 1];  // This is the original code.
    BlkType *blk = sets[set].blks[part - 1];      // modified

    if (blk->isValid()) {
        DPRINTF(CacheRepl,
                "set %x: selecting blk %x for replacement, Blk RefCount: %d, DRD: %d\n",
                set, regenerateBlkAddr(blk->tag, set), blk->refCount, blk->DRD);
    }
    return blk;
}
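To see why restricting lookup and victim selection to the first part ways is enough, it can help to run a toy model. The sketch below is plain Python, not gem5 code, and the class and function names are hypothetical. It assumes a move-to-head LRU ordering in which way 0 is the most recently used block. Under that assumption the ways from part upward are never written, and the set behaves exactly like a part-way LRU set.

```python
class PartitionedLRUSet(object):
    """One cache set whose LRU stack is confined to the first `part` ways."""

    def __init__(self, assoc, part):
        assert 1 <= part <= assoc
        self.part = part
        # Each way holds a tag, or None when invalid. Way 0 is the MRU position.
        self.ways = [None] * assoc

    def access(self, tag):
        """Return True on a hit; on a miss the block is filled and False returned."""
        live = self.ways[:self.part]
        if tag in live:
            hit, idx = True, live.index(tag)
        else:
            hit, idx = False, self.part - 1   # victim: last way of the partition
        # Move-to-head: shift ways [0, idx) down by one, then place the block in way 0.
        self.ways[1:idx + 1] = self.ways[0:idx]
        self.ways[0] = tag
        return hit


def count_hits(cache_set, trace):
    """Replay a list of tags against one set and count the hits."""
    return sum(1 for tag in trace if cache_set.access(tag))


if __name__ == "__main__":
    trace = [1, 2, 3, 4, 1, 5, 2, 1, 6, 3]
    partitioned = PartitionedLRUSet(assoc=8, part=4)
    print(count_hits(partitioned, trace))                             # partitioned 8-way set
    print(count_hits(PartitionedLRUSet(assoc=4, part=4), trace))      # plain 4-way set
    print(partitioned.ways[4:])                                       # ways outside the partition
```

On this trace the partitioned 8-way set and the plain 4-way set report the same hit count, and the last four ways of the partitioned set stay invalid.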
7. Repeat the test from the notes above and check that the value of part and the data after partitioning match expectations. Done! The partitioning modification is complete.
Note:
Add the following to the se.py used for the three-level cache configuration. It prevents the LLC from being partitioned silently when you later run se.py without setting the part parameter.
import sys

# default LLC(L3) partition 16, we should specify this args.
# Otherwise when we do not want partition, it still executes. by sff
# Accept both the '--part=PART' and the '--part PART' spellings.
if not any(arg == '--part' or arg.startswith('--part=') for arg in sys.argv):
    print("Error: default LLC(L3) partition is 16, we should specify this args '--part=PART'. Otherwise when we do not want partition, it still executes.")
    sys.exit(1)
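One detail to keep in mind when checking sys.argv for an option: a plain membership test only matches an argument that is exactly '--part', so the '--part=9' spelling used in the test command earlier is not found by it. A minimal sketch of a check that accepts both spellings (the helper name part_flag_given is hypothetical):

```python
def part_flag_given(argv):
    """True if --part was passed, either as '--part=N' or as '--part N'."""
    return any(arg == "--part" or arg.startswith("--part=") for arg in argv)


if __name__ == "__main__":
    print(part_flag_given(["se.py", "--part=9"]))       # '=' spelling
    print(part_flag_given(["se.py", "--part", "9"]))    # separate-value spelling
    print(part_flag_given(["se.py", "--l3_assoc=32"]))  # option absent
    print("--part" in ["se.py", "--part=9"])            # plain membership test
```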
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
GEM5: Implementing last-level cache (LLC) partitioning