Gürhan KÜÇÜK, İsa Ahmet GÜNEY

SET-BASED DYNAMIC CACHE PARTITIONING ON CHIP MULTIPROCESSORS

Most of today's chip multiprocessors include a last-level cache shared among the cores to reduce the latency of off-chip memory accesses. However, applications running in parallel on the cores can cause a large number of cache conflicts, which limits the benefit of such a cache. Numerous studies in the literature try to reduce these conflicts by partitioning this cache level and giving each application a dedicated area. Most of them focus on assigning each core an appropriate number of cache ways. More recent studies have also proposed partitioning the cache by sets, which has a number of advantages over way-based partitioning. This study aims to improve processor performance with a mechanism that dynamically partitions the last-level cache on a set basis. Partition resizing decisions are made using run-time statistics of the running applications, collected periodically with hardware support.
