Confidential
HBM: Memory Solution for High Performance Processors Kevin Tran and June Ahn
SK hynix Inc. October 2014
© 2014 SK hynix Inc. This material is proprietary of SK hynix Inc. and subject to change without notice.
Agenda
• “Completion of HBM1 Qualification” Announcement
• Landscape of memory solution challenges
• Why HBM is the suitable solution
• HBM Architecture Review
• Conclusion
Achievement of HBM1 Qualification
SK hynix achieved Customer Qualification Level samples in Sep’14
SK hynix World-First HBM Products
Worldwide first HBM provider
• Customer Qualification Samples shipping today
• Volume production begins Q1’15
• HBM2 design wins in progress with major SoCs in multiple markets
SK hynix TSV chronicle
• ’08: 4Gb Flash
• ’10: 4Gb DRAM DDP -W LP
• ’11: 16Gb DRAM 9M CP
• ’11: 16/32GB DIMM
• ’11: 4hi KGSD WIO
• ’13: 5m KGSD HBM
(Figure: bottom view, top view, and section view)
Challenges #1) Bandwidth
All DRAMs are expected to face immense bandwidth requirements
Challenges
• DRAM transistor and tester limit over 10Gbps
• Trade off with power consumption
• Severe die overhead
Data Rate/Pin
DRAM     Gb/s/pin   IO
GDDR5    7.0        X32
DDR4     3.2        X16
LPDDR4   4.2        X32
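The per-pin rates and IO widths in the table combine into a peak per-device bandwidth of rate x width / 8. The following is a minimal sketch of that arithmetic, assuming the slide's figures are peak rates; the helper name is illustrative and not from the presentation.

```python
# Peak per-device bandwidth from the Data Rate/Pin table.
# The (Gb/s/pin, IO width) pairs come from the slide; the helper name is hypothetical.
DEVICES = {
    "GDDR5": (7.0, 32),
    "DDR4": (3.2, 16),
    "LPDDR4": (4.2, 32),
}


def peak_gbytes_per_s(gbps_per_pin, io_width):
    """Convert Gb/s per pin and pin count into GB/s for the whole device."""
    return gbps_per_pin * io_width / 8


for name, (rate, width) in DEVICES.items():
    print(f"{name}: {peak_gbytes_per_s(rate, width):.1f} GB/s per device")
```

Even the fastest device here stays in the tens of GB/s, which is why many devices are needed to reach the bandwidth a single wide-IO stack targets.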
Challenges #2) Density/Latency
Density has increased by 1000X over the past two decades, while latency has decreased by only 56%
Challenges
• Capacity / Cost Limitation
• DRAM Scaling Challenges
• Severe die overhead increase
Density/Latency
Year   Density   Mode         tRC(ns)
1992   4Mb       Cache DRAM   110
2002   256Mb     DDR          6X
2003   1Gb       DDR2         5X
2012   4Gb       DDR3         4X~5X
2014   8Gb       DDR4         4X
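The two headline numbers can be cross-checked against the table. This is a rough sketch using only figures stated on the slide; it assumes the "two decades" span runs from 1992 to 2012 and that the masked entries such as "4X" stand for forty-something nanoseconds.

```python
# Density and latency trend, using only figures stated on the slide.
density_1992_mb = 4           # 4Mb part (1992)
density_2012_mb = 4 * 1024    # 4Gb part (2012), expressed in Mb
trc_1992_ns = 110             # tRC of the 1992 part
latency_reduction = 0.56      # "decreased by only 56%"

density_growth = density_2012_mb / density_1992_mb
implied_trc_ns = trc_1992_ns * (1 - latency_reduction)

print(f"density growth 1992 -> 2012: {density_growth:.0f}x")
print(f"tRC implied by a 56% reduction: {implied_trc_ns:.1f} ns")
```

Under those assumptions the density ratio comes out near 1000X and the implied tRC lands in the forty-something nanosecond range, in line with the last rows of the table.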
Challenges #3) Power Efficiency
Reducing power and increasing performance are always trade-offs
Challenges
• Trade-off between performance and power
(Chart: Performance and Power, normalized to DDR256 = 100%)
Challenges #4) Form Factor
System level board design challenges for # of DRAM
Revolutionary Changes with TSV memories
TSV is a revolutionary technology enabling next generation memories
▼ High Speed
▼ Lower Power
▼ High Density
▼ Small Form Factor
General Memory Requirements
Each application has different memory requirements, but the most common are high bandwidth and density based on real-time random operation.
HPC
Bandwidth Density Power
Graphics
Real time Based Random Operation
Datacenter
Density Bandwidth Power Latency
Networking
Bandwidth Power Latency
Bandwidth Latency Power
Low Power
Lower speed/pin and Cio reduce power consumption by 42% compared to GDDR5
Low Latency
Pseudo channel improves tFAW by 60% compared to DDR4
Small Form Factor
1GB HBM package size is smaller than 1 tablet of aspirin
PKG Size (HBM 1GB = X1)
• DDR3: X44 bigger vs. HBM 1GB
• DDR4: X37 bigger vs. HBM
(Figure: 2.5D stacking size, SoC surrounded by four HBM 1GB stacks)
Silicon in Package
SiP enhances memory channel conditions
Memory Channel Conditions
(Figure: DRAM vs. SiP channel. Long distance: Loading, Power, C&R ↑; Speed, tDV ↓)
DRAM vs. SiP
Items                DRAM (Off-chip)   SiP (2.5D)
Physical dimension   very large        small
Signal Distance      long              short
Signal Loading       high              low
Driver Strength      large             small
IO Speed/Pin         high              low
Termination          need              no
Power Consumption    high              low
Through Silicon Via - TSV
TSV is the underlying technology for 3D Stack (High Density / Small size PKG / High speed)
(Figure: < Wire bonding PKG > with Die #1, Die #2, die pads, bond wire, and PKG pad, compared with < TSV PKG > with die pad, TSV, and PKG pad)
HBM 2.5D SiP Structure
System-in-Package implementation with KGSD*
(Figure: HBM in 2.5D SiP. Labels: KGSD 3D Memory (HBM) with four DRAM slices, TSV, DA ball, base die, and side molding; SoC; PHY on both sides; interposer (silicon die); PKG substrate; FBGA)
* KGSD (Known Good Stacked Die)
(Source : esilicon)
HBM Overall specification
HBM1
• 2Gb density per DRAM die
• 1Gbps speed/pin
• 128GB/s bandwidth
• 4 Hi Stack (1GB)
• x1024 IO
• 1.2V VDD
• KGSD w/ μBump
HBM2
• 8Gb per DRAM die
• 2Gbps speed/pin
• 256GB/s bandwidth/stack
• 4/8 Hi Stack (4GB/8GB)
(Figure labels: Base Die, Interposer)
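The bandwidth and capacity figures follow directly from the per-pin speed, IO count, die density, and stack height. The sketch below reproduces them; it assumes HBM2 keeps the x1024 IO listed for HBM1 (which the 256GB/s figure at 2Gbps implies), and the function names are illustrative.

```python
# Derive the headline HBM figures from the per-die and per-pin numbers on the slide.
# Assumption: HBM2 uses the same x1024 IO as HBM1.

def stack_bandwidth_gbytes(gbps_per_pin, io_count=1024):
    """Peak stack bandwidth in GB/s."""
    return gbps_per_pin * io_count / 8


def stack_capacity_gbytes(die_gbits, stack_height):
    """Stack capacity in GB from die density (Gb) and number of core dies."""
    return die_gbits * stack_height / 8


print("HBM1:", stack_bandwidth_gbytes(1), "GB/s,", stack_capacity_gbytes(2, 4), "GB")
print("HBM2:", stack_bandwidth_gbytes(2), "GB/s,",
      stack_capacity_gbytes(8, 4), "or", stack_capacity_gbytes(8, 8), "GB")
```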
HBM Architecture Overview
4 Core DRAM + 1 Base logic die (Chip on Wafer)
(Figure labels: Core Die 0 to Core Die 3 with banks B0 to B7 and 128 I/O per channel; 1024 TSV I/O; Base Logic Die with PHY, POWER/TSV, DA, microbump array, and probe pad (microbump depopulated))
[1] D.U Lee, SK hynix, ISSCC 2014
Items                Target
# of Stack           4(Core) + 1(Base)
Ch./Slice            2
Total Ch. for KGSD   8
IO/Ch.               128
Total IO/KGSD        1024(=128 x 8)
Address/CMD          Dual CMD
Data Rate            DDR
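The channel and IO totals in the table are products of the stack hierarchy. A small sketch of that relationship follows; the class and field names are hypothetical, and the default values are the targets from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KgsdConfig:
    """Stack hierarchy from the Items/Target table (names are illustrative)."""
    core_dies: int = 4          # "# of Stack": 4 core dies (the base die has no channels)
    channels_per_die: int = 2   # "Ch./Slice"
    io_per_channel: int = 128   # "IO/Ch."

    @property
    def total_channels(self) -> int:
        return self.core_dies * self.channels_per_die

    @property
    def total_io(self) -> int:
        return self.total_channels * self.io_per_channel


cfg = KgsdConfig()
print(cfg.total_channels, "channels,", cfg.total_io, "IO per KGSD")
```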
HBM Core Architecture
• Each HBM die has 2 channels
• 1 channel consists of 128 TSV I/O with 2n prefetch
• 1 bank: 2 sub-banks (64Mb), non-shared I/O between sub-banks
(Figure: core die floorplan with CH-Left and CH-Right. Labels per channel: banks B0 to B7, XCTRL and YCTRL blocks, DWORD 0 to DWORD 3 at 32 I/O each, and AWORD)
[2] D.U Lee, SK hynix, ISSCC 2014
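The core organization can be checked against the 2Gb die density quoted for HBM1. This sketch assumes that "64Mb" is the size of one sub-bank and that each channel has eight banks (B0 to B7, as labelled in the figure); neither is stated explicitly in the text.

```python
# Consistency check of the core die organization.
# Assumptions: 64Mb per sub-bank, 8 banks per channel (B0..B7 in the figure).
SUB_BANK_MBITS = 64
SUB_BANKS_PER_BANK = 2
BANKS_PER_CHANNEL = 8
CHANNELS_PER_DIE = 2
IO_PER_CHANNEL = 128
PREFETCH = 2  # 2n prefetch

bank_mbits = SUB_BANK_MBITS * SUB_BANKS_PER_BANK
channel_mbits = bank_mbits * BANKS_PER_CHANNEL
die_mbits = channel_mbits * CHANNELS_PER_DIE

# With 2n prefetch, each column access moves two bits per I/O from the core.
bits_per_column_access = IO_PER_CHANNEL * PREFETCH

print(f"bank: {bank_mbits}Mb, channel: {channel_mbits}Mb, die: {die_mbits}Mb")
print(f"bits per column access per channel: {bits_per_column_access}")
```

Under these assumptions the die total is 2048Mb, which matches the 2Gb per-die density given for HBM1.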
HBM Base Die Architecture
Base die consists of 3 areas: PHY, TSV, Test Port Area
HBM ballout area: 6,050 x 3,264 μm
[3] D.U Lee, SK hynix, ISSCC 2014
Pseudo Channel Concept
• HBM is comprised of 8 channels (2 channels/die) with 128 DQs per channel.
• Each channel (CH) consists of 2 pseudo channels (PS). Only BL4 is supported.
(Figure: CH-0 to CH-7, each split into PS-CH0 and PS-CH1. Labels per pseudo channel: banks B0 to B7 and 64 I/O, with ADD/CMD between the two 64 I/O groups)
(Note : Pseudo channel is only applicable to HBM2)
Pseudo Channel Mode
Pseudo channels share the AWORD, but each has separate banks and independent 64 I/Os
(Figure: without pseudo channels, one channel command stream (ACT#-RD#) drives 128 DQ and returns D0 to D3 at 128 each, 16 Bank/CH. In pseudo channel mode, CH #_PC0 (ACT#0-RD#0) and CH #_PC1 (ACT#1-RD#1) share the channel command stream; each drives 64 DQ and returns D0 to D3 at 64 each, 16 Bank/PC0 and 16 Bank/PC1)
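The data transferred per read follows from the DQ width and the burst length. The sketch below assumes the D0 to D3 labels in the figure denote a four-beat burst (BL4) in both modes; the helper name is illustrative.

```python
# Bytes moved per read burst, from DQ width and burst length.
# Assumption: D0..D3 in the figure represent a four-beat burst in both modes.

def burst_bytes(dq_width, burst_length):
    """Bytes transferred by one burst on a bus of dq_width bits."""
    return dq_width * burst_length // 8


full_channel = burst_bytes(128, 4)     # 128 DQ channel
pseudo_channel = burst_bytes(64, 4)    # one 64 DQ pseudo channel

print(f"128 DQ channel: {full_channel} bytes per burst")
print(f"64 DQ pseudo channel: {pseudo_channel} bytes per burst")
print(f"two pseudo channels together: {2 * pseudo_channel} bytes")
```

Two pseudo channels running independent bursts move the same amount of data as one full-width burst, but in two separately addressed halves.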
Restriction of tFAW in Legacy mode
• For Legacy Mode, each channel has 2KB page size
• Restriction of gapless bank activation by tFAW (four-activate window): tFAW = 30ns > 4 banks x tRRD = 16ns, which lowers bandwidth efficiency
• Suppose tCK = 2ns, tFAW = 30ns, tRRD = 4ns
(Figure: timing diagram showing gaps between groups of bank activations)
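The gap can be reproduced with a small scheduler that issues ACT commands as early as tRRD and the rolling activate window allow. This is a simplified sketch using the slide's example timings; the function name and the omission of every other timing constraint are assumptions.

```python
# Earliest-issue ACT schedule under tRRD and a rolling N-activate window.
# Only tRRD and the window constraint are modelled (simplifying assumption).

def schedule_activates(n_acts, t_rrd, t_window, acts_per_window):
    """Return the earliest issue time in ns for each of n_acts ACT commands."""
    times = []
    for i in range(n_acts):
        t = 0 if not times else times[-1] + t_rrd
        if i >= acts_per_window:
            # No more than acts_per_window ACTs may fall inside any t_window span.
            t = max(t, times[i - acts_per_window] + t_window)
        times.append(t)
    return times


T_RRD_NS, T_FAW_NS = 4, 30
times = schedule_activates(8, T_RRD_NS, T_FAW_NS, acts_per_window=4)
gap_ns = times[4] - (times[3] + T_RRD_NS)

print(times)
print(f"gap before the fifth ACT: {gap_ns} ns")
```

With tRRD = 4ns the first four activates finish issuing at 12ns, but the fifth has to wait until 30ns, so the command bus sits idle for 14ns in every window.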
Benefit of Pseudo Channel
• Pseudo channel has reduced page size compared to Legacy mode: 2KB → 1KB
• Lower active power (IDD0) by 1KB page size
• Define tEAW (1KB x 8 ACT) instead of tFAW (2KB x 4 ACT)
• Bandwidth improvement by more activations during tFAW
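The effect of trading tFAW for tEAW can be worked through with the example timings from the legacy-mode slide. This sketch assumes the tEAW window has the same 30ns length as tFAW and that tRRD stays at 4ns; neither value is given for pseudo channel mode, and the helper name is illustrative.

```python
# Compare the activate-window stall and the row data opened per window.
# Assumptions: tEAW = tFAW = 30 ns and tRRD = 4 ns in both modes.
T_RRD_NS = 4
T_WINDOW_NS = 30


def window_stall_ns(acts_per_window, t_rrd=T_RRD_NS, t_window=T_WINDOW_NS):
    """Idle time before the next ACT once the window's ACT budget is spent."""
    return max(0, t_window - acts_per_window * t_rrd)


legacy_stall = window_stall_ns(4)   # tFAW: 4 ACT of 2KB pages
pseudo_stall = window_stall_ns(8)   # tEAW: 8 ACT of 1KB pages

legacy_row_bytes = 4 * 2048
pseudo_row_bytes = 8 * 1024

print(f"legacy stall: {legacy_stall} ns, pseudo channel stall: {pseudo_stall} ns")
print(f"row bytes opened per window: {legacy_row_bytes} vs {pseudo_row_bytes}")
```

Under these assumptions the same amount of row data is opened per window, so the activation power budget is unchanged, while the eight smaller activates can be issued back to back without a stall.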
Base Die Customization – Future HBM Concept
Logic Layer: Host I/F + Memory I/F + Base Logic/IP Block
Memory Die: Overcome Memory Scaling
Customization to meet various requirements
- Timing
- Refresh
- Parallel-to-Serial(P2S)/S2P
- JTAG, PMBIST
- Configuration Registers
- Error Handling
- …
TSV Ecosystem
Complicated TSV Ecosystem requires close collaboration among all stakeholders
• Memory Vendors: KGSD/KGD
• Foundries: SoC & Interposer
• OSATs: SiP Assy’ & Test
• Set Makers: Business Model for SiP
• IP Enablers: PHY & MC IP, Controller IP
Conclusion
• HBM enables new memory subsystem architecture for next generation high performance processors.
• SK hynix is the leader in HBM technology.
• SK hynix HBM1 Customer Qualification samples are shipping.
• HBM addresses the major memory requirements with lower power, lower latency, and smaller form factor.
• HBM2 pseudo channel improves bus utilization and system performance.
Thank You