Set-Based Dynamic Cache Partitioning in Multicore Processors

Today's multicore processors include a last-level cache shared among the cores to reduce the latency of off-core memory accesses. However, applications running in parallel on the cores can cause a large number of cache conflicts, which limits the benefit such a cache can deliver. Many studies in the literature partition this cache level to give each application a private region and thereby reduce cache conflicts. Most of these studies focus on assigning each core a number of cache ways that matches its needs. More recently, studies proposing set-based cache partitioning have also appeared. Set-based partitioning has a number of advantages over way-based partitioning. This work aims to improve processor performance by partitioning last-level caches on a set basis. Partitioning decisions are made using run-time statistics about the running applications, collected periodically with hardware support.
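The abstract does not spell out the partitioning policy, so the following is only a minimal sketch of the general idea under stated assumptions, not the mechanism proposed in this work. It assumes the periodically collected statistic is a per-core miss count, that sets are handed out in proportion to those counts, and that each core receives a contiguous range of sets. The names `allocate_sets`, `partition_bounds`, and `set_index` are hypothetical.

```python
def allocate_sets(miss_counts, total_sets):
    """Split total_sets among cores in proportion to their miss counts.

    Assumption: misses per interval are the run-time statistic used.
    Every core keeps at least one set; requires total_sets >= number of cores.
    """
    n = len(miss_counts)
    spare = total_sets - n
    total = sum(miss_counts)
    if total == 0:
        # No misses observed in this interval: fall back to an even split.
        weights = [1] * n
        total = n
    else:
        weights = list(miss_counts)
    shares = [1 + spare * w // total for w in weights]
    # Integer division may leave a few sets unassigned; give them to the
    # cores with the highest weights first.
    leftover = total_sets - sum(shares)
    order = sorted(range(n), key=lambda i: weights[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def partition_bounds(shares):
    """Turn per-core set counts into contiguous (base, size) set ranges."""
    bounds, base = [], 0
    for size in shares:
        bounds.append((base, size))
        base += size
    return bounds


def set_index(address, base, size, line_bytes=64):
    """Map an address to a set inside one core's partition (hypothetical mapping)."""
    return base + (address // line_bytes) % size


shares = allocate_sets([300, 100], total_sets=8)
bounds = partition_bounds(shares)
print(shares, bounds)
```

In this sketch the allocation would be recomputed at the end of every sampling interval. A hardware design would more likely restrict partition sizes to powers of two so that the modulo reduces to bit masking; the arbitrary sizes here are a simplification.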

Keywords:

Simultaneous Multithreaded Processors, Dynamic Cache Partitioning

