Reaching Coverage Closure in Post-silicon Validation
Allon Adir, Amir Nahir, Avi Ziv
IBM Haifa Research Lab
Charles Meissner, John Schumann
IBM Server and Technology Group
© 2010 IBM Corporation
Some Verification Realities

Complexity of designs is increasing rapidly
– And the complexity of verification is increasing even faster
Many designs have problems that require additional tape-outs, mostly due to functional bugs
Design of high-end processors calls for more than one (planned) tape-out
– Development of system and software
– Tune-up of the manufacturing process
Functional verification of the actual silicon is (or should be) an integral part of the verification process
Verification Flow in Post-Silicon

[Flow diagram: a Test-Template feeds a Random Stimuli Generator, which produces Tests for the DUV (running on a simulator); Checking and Assertions yield Pass/Fail; Coverage Information from the run feeds Coverage Reports and a Coverage Analysis Tool]
But …
– Most of the compute time is spent in the environment
– Silicon has very limited observability and controllability
Test-Template Example

Genesys-Pro: Symmetric multi-processor test program generator
From Test-Template to Test

Process 0,0:
  lwa 1000, R1
  stw R5, 1000
  stw R6, 1000
  …
  lwa 1000, R15

Process 0,1:
  stw R2, 1000
  stw R11, 1000
  …
  stw R3, 1000
  stw R2, 1000

[Figure: many more test instances generated from the same test-template, each with different randomly chosen registers and addresses]
Bridging the Pre- and Post-Silicon Gap

We need a unified methodology for the pre- and post-silicon domains
– Share the goals using a common test-plan
– Easy sharing and transfer of information between the platforms
  • E.g., for debug purposes
– Proven pre-silicon methodologies are a good starting point
Each platform has its own solutions, adapted to its characteristics
Post-Silicon Coverage

Coverage is one of the main means for monitoring the quality and progress of the verification process
– Helps identify areas in the DUV that are not verified or only lightly verified
– Tracks and drives the progress of the verification process
Coverage relies heavily on observing behaviors of the DUV
– Which makes it difficult to implement in a post-silicon environment
Post-silicon Coverage – Possible Solutions

Use in-silicon coverage monitors
– Take advantage of existing monitors
  • Performance monitors, etc.
– Add dedicated coverage monitors
Adding coverage monitors to silicon has a negative effect on timing, power, area, …
Unlike in-silicon checkers, coverage monitors are not really useful in the field
This solution is therefore limited to a small number of really important coverage monitors
Regression Suites

Regression suites are sets of testcases or test-templates that are used periodically to ensure increasing verification quality
– Interesting testcases and test-templates are harvested
– Harvesting is based on quality measures
  • E.g., coverage
Two major types of regression suites
– Deterministic, based on testcases
  • Known properties
  • Hard to maintain, sensitive to changes
– Probabilistic, based on test-templates
  • Easier to maintain
  • Exact behavior is unknown
Regression Suites in Post-silicon

Coverage of regression suites is known a priori
– Because the suites are built from previously executed testcases or test-templates
No need to measure coverage of the suites
  • Limited observability is no longer an issue
Question: How do we know in advance the coverage of a given test / test-template?
Answer: Execute the test / test-template on a platform that allows coverage measurement
“Guaranteed” Coverage Implementation

Run post-silicon generation tools (exercisers) on a pre-silicon platform
– Simulation is too slow to run exercisers
  • Need a faster execution platform – emulation, acceleration
Collect coverage data and harvest interesting testcases or test-templates
– For example, tests that contribute to coverage or tests that reveal bugs (see the sketch below)
Use the harvested tests / test-templates as (part of) the post-silicon regression
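A purely illustrative Python sketch of this harvesting step (not the actual IBM tooling; the data layout and function name are assumptions): a template is kept if its run on the coverage-enabled platform added new coverage or revealed a bug.

    # Toy harvesting sketch: keep templates that add coverage or find bugs.
    def harvest(runs, already_covered):
        """runs: list of (template, events_hit, found_bug) tuples collected
        on the pre-silicon platform; already_covered: events seen so far."""
        harvested, covered = [], set(already_covered)
        for template, events_hit, found_bug in runs:
            new_events = set(events_hit) - covered
            if new_events or found_bug:      # template is "interesting"
                harvested.append(template)
                covered |= new_events
        return harvested, covered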
What to Harvest?

Running the same test in simulation and on silicon may not produce the exact same coverage behavior
– Slight differences in models, asynchronous interface behavior
Need “smart harvesting”
– Instead of harvesting specific tests, harvest templates that provide a non-negligible probability of hitting events
– The large number of silicon cycles converts these probabilities to almost certainty
– Harvest specific tests only in special cases
  • No proof of non-negligible probability – e.g., event hit just once in many cycles
  • Test found an interesting bug

Side example – Assume:
1. The execution speed ratio is 10^5
2. The probability of hitting the coverage event in a 10-minute run is 0.1%
Then the probability of not hitting the event in 10 minutes on silicon is
(1 - 1/1,000)^100,000 ≈ 4 × 10^-44
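A quick check of the side example's arithmetic, as a minimal Python sketch using the slide's numbers (10-minute runs, a 10^5 speed ratio, a 0.1% per-run hit probability):

    # If a template hits an event with probability p per simulated run, and
    # silicon effectively executes ~speed_ratio runs in the same wall-clock
    # time, the chance of missing the event on silicon is vanishingly small.
    p_hit_per_run = 1 / 1_000      # 0.1% chance of hitting the event per 10-minute run
    speed_ratio = 10 ** 5          # silicon vs. simulation speed ratio
    effective_runs = speed_ratio   # same 10 minutes on silicon ~ 100,000 simulated runs

    p_miss = (1 - p_hit_per_run) ** effective_runs
    print(f"Probability of never hitting the event on silicon: {p_miss:.1e}")
    # ~3.5e-44, i.e. roughly the 4 x 10^-44 quoted on the slide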
Regression Suite Algorithm

For each event e and each test-template t
– If t covers e more than 3 times, Cov(e, t) = 1
– Else, Cov(e, t) = 0
  • 3 is a small constant used to avoid counting events hit only by chance
  • The resulting Cov matrix is a 0-1 coverage matrix of test-templates
Solve a (deterministic) set-cover problem
– There are many known algorithms for this problem
– A simple greedy algorithm works well
The solution is a regression suite of test-templates (a minimal greedy sketch follows below)
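A minimal Python sketch of this flow; the 3-hit threshold and the greedy selection rule follow the slide, while the data structures and names are illustrative assumptions, not IBM's actual implementation.

    # hits[template][event] = number of times the template hit the event,
    # as measured on the pre-silicon platform (names are illustrative).
    def build_cov(hits, min_hits=3):
        """Cov(e, t) = 1 iff template t hit event e more than min_hits times."""
        return {t: {e for e, n in ev.items() if n > min_hits} for t, ev in hits.items()}

    def greedy_regression_suite(hits, events, min_hits=3):
        """Greedy set cover: repeatedly pick the template covering the most
        still-uncovered events, until no template adds coverage."""
        cov = build_cov(hits, min_hits)
        uncovered, suite = set(events), []
        while uncovered:
            best = max(cov, key=lambda t: len(cov[t] & uncovered))
            gain = cov[best] & uncovered
            if not gain:            # remaining events are not covered by any template
                break
            suite.append(best)
            uncovered -= gain
        return suite, uncovered     # selected templates and any uncoverable events

    # Example usage with toy data:
    hits = {
        "tmpl_A": {"e1": 7, "e2": 5, "e3": 1},
        "tmpl_B": {"e2": 9, "e4": 4},
        "tmpl_C": {"e3": 6},
    }
    suite, missed = greedy_regression_suite(hits, events={"e1", "e2", "e3", "e4"})
    print(suite, missed)   # e.g. ['tmpl_A', 'tmpl_B', 'tmpl_C'], set()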
POWER7 Implementation

The design was partitioned into groups of related test-plan items
– With the goal of creating a separate regression suite for each group
Test-templates were assigned costs according to their relevance to the group
– Templates related to the group have the lowest cost
  • These templates are identified a priori by the bring-up team
– Templates for other units have the highest cost
The set-cover algorithm minimizes the cost of the suite (see the cost-aware sketch below)
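The cost-based selection can be folded into the same greedy loop by ranking templates on cost per newly covered event; the sketch below only illustrates the idea, and the cost values and function names are assumptions rather than the actual POWER7 setup.

    # Cost-aware greedy set cover: prefer low-cost (group-related) templates,
    # fall back to higher-cost unit/other templates only when needed.
    def weighted_regression_suite(cov, events, cost):
        """cov: template -> set of events it reliably covers;
        cost: template -> relative cost (directed < unit < other)."""
        uncovered, suite = set(events), []
        while uncovered:
            candidates = {t: cov[t] & uncovered for t in cov if cov[t] & uncovered}
            if not candidates:
                break
            # pick the template with the lowest cost per newly covered event
            best = min(candidates, key=lambda t: cost[t] / len(candidates[t]))
            suite.append(best)
            uncovered -= candidates[best]
        return suite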
Post-silicon Coverage Platform

[Flow diagram, same structure as the earlier pre-silicon flow: a Test-Template drives a Random Stimuli Generator / Exerciser, which produces Tests for the DUV; the DUV now runs on a Simulator, an Accelerator, or Silicon. Checking and Assertions yield Pass/Fail, and Coverage Information feeds Coverage Reports and the Coverage Analysis Tool. Exercisers-on-Accelerator is the pre-silicon platform; silicon is the post-silicon platform]
Exercisers-on-Accelerators (EoA)

Use post-silicon tools (exercisers) on a pre-silicon platform (accelerators)
– Note, this is not the only way to utilize accelerators
Provide the post-silicon tools with the benefits of the pre-silicon environment
– Added observability and controllability
Added benefits
– Early validation and tune-up of the post-silicon tools
– Post-silicon tools contribute to the pre-silicon verification effort
– The EoA platform can be used to recreate and analyze bugs found on silicon with the same exercisers
– Really bridges the gap between pre- and post-silicon verification
Results

Coverage-driven EoA was used in the verification of the POWER7 processor
Encouraging pre-silicon results
– EoA was fully integrated into the pre-silicon flow
– EoA reached a high level of coverage
  • Almost as high as simulation
– The coverage-driven approach led to
  • Finding many holes in the activation of the exercisers
  • Finding bugs in the exercisers
  • Finding bugs in the design
    – Including some juicy ones that escaped simulation
Post-silicon regression suites sped up the bring-up process and increased confidence in its quality
POWER7 Core Coverage Results

Unit         Unit Sim   Core Sim   EoA      Total
IFU          96.79%     96.77%     94.99%   98.65%
ISU          96.48%     92.49%     92.78%   97.42%
FXU          99.60%     84.72%     85.85%   99.85%
FPU          97.44%     98.15%     90.20%   99.58%
LSU          94.33%     91.04%     85.32%   98.66%
PC           92.51%     76.95%     55.23%   93.51%
Core Total   96.18%     92.78%     88.70%   98.06%
Post-Silicon Regression for POWER7 LSU

Regression   Test-Plan Coverage Events   Directed Templates   Directed   Unit   Other
1            26                          13                   2          0      0
2            19                          9                    1          1      0
3            22                          16                   0          1      0
4            111                         18                   5          0      0
5            55                          17                   4          0      1
6            68                          7                    2          1      0
Summary

Coverage is an important tool for assessing the quality and progress of the verification process
Measuring coverage in post-silicon is difficult because of the limited observability
We proposed a method for reaching coverage closure in post-silicon validation
– Based on creating test-template regression suites in a pre-silicon environment using post-silicon exercisers
Very good results from usage in POWER7
– EoA provided high verification value
– Generated regression suites improved confidence in post-silicon bring-up