Journal of Nanjing University (Natural Sciences) ›› 2014, Vol. 50 ›› Issue (3): 330–.


Research on memory architectures for 3D many-core chip multi-processors

Li Li*, Zhang Yuang, Fu Yuxiang, Pan Hongbing, Han Feng

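  • Publication date: 2014-06-01 Release date: 2014-06-01
  • About the authors: (Institute of Microelectronics Design, School of Electronic Science and Engineering, Nanjing University, Nanjing, 210046)
  • Funding:
    General Program of the National Natural Science Foundation of China (61176024); Specialized Research Fund for the Doctoral Program of Higher Education (20120091110029)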

The study of memory architectures for 3D chip multi-processors

Li Li, Zhang Yuang, Fu Yuxiang, Pan Hongbing, Han Feng, Yang Dan   

  • Online:2014-06-01 Published:2014-06-01
  • About author: (School of Electronic Science and Engineering, Nanjing University, Nanjing, 210046, China)

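Summary: Research on three-dimensional (3D) many-core chip multi-processors has attracted wide attention from academia in recent years. 3D integrated circuit technology makes it possible to integrate memory layers fabricated in different processes into a single chip, so 3D many-core chip multi-processors can integrate larger on-chip caches as well as main memory. This work studies memory architectures for 3D many-core chip multi-processors and explores how integrating SRAM L2 cache layers, DRAM main memory layers, and similar options affects their performance. According to the simulation results, a 3D many-core chip multi-processor with two L2 cache layers improves performance by up to 55%, and by 34% on average, compared with one that integrates a single L2 cache layer. Integrating DRAM main memory on chip improves system performance by up to 80%, and by 34.2% on average.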

Abstract: This paper studies the performance improvement of memory architectures for three-dimensional chip multi-processors (3D CMPs). As CMPs integrate more and more cores, the memory subsystem comes under heavy data access pressure, and designers face the challenge of feeding enough data to a massive number of on-die cores. Three-dimensional integrated circuits (3D ICs) can stack memories built in different process technologies into the same chip. The bandwidth of stacked memory can be increased by using fine-pitch through-silicon vias (TSVs), which mitigates the pressure on the I/O infrastructure of CMPs. We start by studying the potential benefits of 3D integration and recent advances in research on memory architectures for 3D CMPs. Both large caches and main memories can be stacked in 3D CMPs. Hence, we examine memory architectures for 3D CMPs from two aspects: stacked cache architecture and stacked main memory architecture. 3D CMPs can integrate much larger L2 caches than their 2D counterparts in the same area footprint, and the L2 cache can span several layers. We first explore the performance improvement from stacking SRAM L2 cache layers atop the processor layers of 3D CMPs. The experimental results show that 3D CMPs with two L2 cache layers improve performance by up to 55%, and by 34% on average, compared with 3D CMPs with one L2 cache layer. 3D CMPs also provide opportunities for composing future systems by integrating memories built in disparate technologies: DRAM main memory that would otherwise be off-chip can be stacked on the processor layers. We then study the performance benefit of integrating DRAM main memory into 3D CMPs. The experimental results show that stacking DRAM main memory provides up to 80%, and on average 34.2%, performance improvement for 3D CMPs compared with 2D CMPs with off-chip DRAM main memory.
Our analysis and experimental results provide guidelines for designing efficient 3D CMPs with stacked SRAM L2 caches and DRAM main memories.
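
The two effects described in the abstract, a larger stacked L2 cache and a lower-latency stacked main memory, can be illustrated with the textbook average memory access time (AMAT) model. The sketch below is not the simulation methodology of the paper; the function name `amat` and every latency and miss rate in it are hypothetical values chosen only to show the direction of the effect.

```python
def amat(l1_hit, l1_miss_rate, l2_hit, l2_miss_rate, mem_latency):
    """Average memory access time (cycles) for a two-level cache hierarchy."""
    return l1_hit + l1_miss_rate * (l2_hit + l2_miss_rate * mem_latency)


# All numbers below are illustrative assumptions, not results from the paper.
baseline = amat(l1_hit=2, l1_miss_rate=0.10, l2_hit=12,
                l2_miss_rate=0.40, mem_latency=300)

# Assumption: a second stacked L2 layer doubles capacity and halves the L2 miss rate.
bigger_l2 = amat(l1_hit=2, l1_miss_rate=0.10, l2_hit=12,
                 l2_miss_rate=0.20, mem_latency=300)

# Assumption: stacking DRAM over TSVs halves the main memory access latency.
stacked_dram = amat(l1_hit=2, l1_miss_rate=0.10, l2_hit=12,
                    l2_miss_rate=0.40, mem_latency=150)

for name, value in [("baseline", baseline),
                    ("two L2 layers", bigger_l2),
                    ("stacked DRAM", stacked_dram)]:
    print(f"{name}: {value:.1f} cycles")
```

In this toy model both changes shorten the average access time, which is the mechanism behind the speedups the abstract reports; the actual magnitudes depend on the workloads and simulator configuration used in the paper.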

