A Fine-grain and scalable set-based cache partitioning through thread classification

As contemporary processors utilize more and more cores, cache partitioning algorithms tend to preserve cache associativity with a finer-grain of control to achieve higher throughput and fairness goals. In this study, we propose a scalable set-based cache partitioning mechanism, which welds an allocation policy and an enforcement scheme together. We also propose a set-based classifier to better allocate partitions to more deserving threads, a fast set redirection logic to map accesses to dedicated cache sets, and a double access mechanism to overcome the performance penalty due to a repartitioning phase. We compare our work to the best line-grain cache partitioning scheme that is available in the literature. Our results show that set-based partitioning improves throughput and fairness by 5.6 % and 4.8 % on average, respectively. The maximum achievable gains are as high as 33 % in terms of throughput and 23 % in terms of fairness.