
C20-3
Strong Subthreshold Current Array PUF with 2^65 Challenge-Response Pairs Resilient to Machine Learning Attacks in 130nm CMOS
Xiaodan Xi, Haoyu Zhuang, Nan Sun, and Michael Orshansky
The University of Texas at Austin, TX, USA, [email protected]

Abstract
This paper presents a strong silicon physically unclonable function (PUF) immune to machine learning (ML) attacks. The PUF, termed the subthreshold current array (SCA) PUF, is composed of a pair of two-dimensional transistor arrays and a low-offset comparator. The fabricated PUF chip allows 2^65 challenge-response pairs (CRPs) and achieves high reliability, with an average bit error rate (BER) of 5.8% for temperatures of -20 to 80°C and VDD ± 10%. A calibration-based CRP filtering method improves the BER to 2.6% with a 10% loss of CRPs. When subjected to ML attacks, the PUF shows resilience that is 100X higher than known alternatives, with negligible loss in PUF unpredictability.

Keywords: PUF, SCA, security, nonlinearity and attacks

Introduction
PUFs have great promise as hardware authentication primitives due to their physical unclonability, high resistance to reverse engineering, and difficulty of mathematical cloning. Strong PUFs are distinguished by an exponentially large number of CRPs, in contrast with weak PUFs, which have a smaller CRP set. Because the adversary cannot create an enumeration clone (by recording all CRPs) even when in physical possession of a PUF, strong PUFs enable secure direct authentication, which does not require cryptography and is thus attractive for low-energy and IoT applications. Unfortunately, early strong PUFs [1][2][3][4] have been shown to be vulnerable to attacks that construct a mathematical model of the PUF via machine learning. The concept of an ML-resilient PUF based on the subthreshold current array was proposed in [5], promising high resilience against ML attacks. In this paper, we present the first ever silicon implementation of a strong PUF resilient to ML attacks. We fabricated a 2^65-CRP SCA-PUF in 130nm CMOS. When used in conjunction with reliability-driven dynamic thresholding, a BER of only 2.6% is achieved for temperatures of -20 to 80°C and VDD ± 10% at 90% CRP utilization.
When subjected to a suite of ML attacks, the PUF shows resilience that is 100X higher than known alternatives, with negligible loss in PUF unpredictability.

SCA-PUF: Principle of Operation
The basic building block of the SCA-PUF is a pair of nominally identical n×k two-dimensional transistor arrays (Fig. 1), with all devices subject to stochastic variability and operating in the subthreshold region. Under the impact of process randomness, the output voltages of the two arrays differ and are converted to a binary output by a comparator. Each array consists of k columns and n rows of a unit cell (Fig. 2). A unit cell consists of a stochastic PFET (Mijx) that always operates in the subthreshold region and a non-stochastic switch transistor (Mij) arranged in parallel with the stochastic PFET. The term "stochastic transistor" refers to a device with a large amount of threshold voltage variability (achieved by using a minimum-size transistor). Switch transistors are controlled by an external challenge word, leading to 2^(nk) possible output voltages. The transistors Mc1 and Mc2 act as current sources biased to ensure that the diode-connected stochastic transistors of the array operate in the subthreshold region (Fig. 2). To avoid a direct up-down path (when all switch transistors are shorted in one column), PMOS transistors (Mjy) are placed on top of each transistor column (Fig. 2). The principal feature of the circuit is that it has a highly nonlinear boundary between the regions of PUF 1-outputs and 0-outputs in the nk-dimensional space of Vth (Fig. 3). As a result, existing machine learning methods fail to predict the responses of the SCA-PUF, which gives the PUF its high security.
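The size of the challenge space follows directly from the array dimensions: one challenge bit per switch transistor gives 2^(n·k) challenge words. The short Python sketch below works through that arithmetic. It assumes the 5×13 array and the 6 Kb/s response rate reported for the fabricated chip, and it further assumes, purely for illustration, that an attacker could read responses at that same rate; the function names are hypothetical and not part of the paper.

```python
# Illustrative arithmetic only: size of the challenge space of an n x k array
# and the time needed to enumerate it at a given response rate.

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def challenge_count(n_rows: int, n_cols: int) -> int:
    """One challenge bit per switch transistor, so 2^(n*k) challenge words."""
    return 2 ** (n_rows * n_cols)


def enumeration_years(n_rows: int, n_cols: int, bits_per_second: float) -> float:
    """Years needed to record every response at one response bit per challenge."""
    seconds = challenge_count(n_rows, n_cols) / bits_per_second
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    n, k = 5, 13  # array dimensions reported for the fabricated chip
    rate = 6000   # assumed read-out rate in bits/s (reported throughput)
    print("challenge words:", challenge_count(n, k))
    print(f"years to enumerate: {enumeration_years(n, k, rate):.3g}")
```

Under these assumptions the enumeration time is on the order of hundreds of millions of years, which is why an enumeration clone is impractical even with physical possession of the device.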

Proposed Circuit Design
We built a 65-bit-input SCA-PUF based on a 5×13 array architecture (Fig. 2). The array analog outputs are Vout_a and Vout_b. The distribution of ΔVout = Vout_a - Vout_b determines the allowable maximum offset of the comparator. The measured ΔVout distribution has μ = 17.9mV, σ = 13.6mV (Fig. 5a). We design the comparator to have an input offset Vos < 100μV, which guarantees that 99.6% of CRPs are correctly resolved. This low input offset is achieved by using an active offset cancellation scheme, in which the offset is cancelled by adjusting Vcalp and Vcaln of the comparator (Fig. 6).

An innovative circuit stabilizes the PUF output. In Fig. 2, if node X is directly connected to VDD, the common-mode output voltage Vout_CM = (Vout_a + Vout_b) / 2 is highly sensitive to the challenge (causing it to vary from 300mV to 900mV). That complicates comparator offset cancellation, which is feasible only when Vout_CM is well controlled. To achieve a fixed Vout_CM, we separate node X from VDD by inserting a PFET Mf5, whose gate voltage Vf is generated by a novel common-mode feedback circuit (Fig. 4). A proper size ratio between Mc1, Mc2, and Mc3 ensures that (If1 + If2) / 2 = If3, making (Vgs1 + Vgs2) / 2 ≈ Vgs3. As a result, Vout_CM = (Vout_a + Vout_b) / 2 ≈ Vref, regardless of the challenge (Fig. 5b).

Results and Discussion
Inter-die Hamming distance (HD) and Hamming weight (HW), quantifying PUF output uniqueness and randomness, respectively, are measured across 50 dies with 500 challenges. For ideal PUFs, both inter-die HD and HW are 0.5. For our PUF, the average normalized inter-die HD is 0.499 (σ = 0.043) (Fig. 7a), and the average HW is 0.528 (σ = 0.109) (Fig. 7b). Intra-die HD, used as a measure of BER to quantify temporal stability, is measured across 5 dies with 500 challenges, under operating conditions of -20 to 80°C and 1.08-1.32V VDD. The average intra-die HD is 0.058 (σ = 0.038) (Fig. 7a). The separate effects of temperature and VDD on BER are studied in Fig. 8, which shows that high temperature affects bit reliability most. The average BER in the worst case is 9%.

Dynamic thresholding is used to further reduce BER. We developed an efficient indirect method to determine individual CRP reliability levels, which allows unstable CRPs to be discarded. Rather than measuring ΔVout (CRP), which is a direct measure of reliability, we measure the response's sensitivity to comparator offset by scanning the calibration voltage Vcal. This method reduces the average BER to 2.6% with a 10% loss of CRPs, or to < 0.1% (0.4% in the worst case) with a 42% loss of CRPs (Fig. 9).

To prove resilience to ML attacks, we apply a suite of standard ML techniques, including support vector machines (SVM), logistic regression (LR), and neural networks (NN), to predict PUF responses given a number of training-set CRPs. The prediction error stays over 40% on 10^4 training points, which is 100X higher than for a 65-bit arbiter PUF, and the error decreases slowly compared with the arbiter PUF (Fig. 10a). The secrecy capacity [7] increases linearly with the number of CRPs, while it saturates for the arbiter PUF (Fig. 10b).

Table I summarizes the measurement results and compares them with prior work. The SCA-PUF generates response bits at 6Kb/s while consuming 68nW and 11pJ/bit. The design occupies 44700μm^2 (die photo in Fig. 11).

References
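The uniqueness, randomness, and reliability figures reported in the results can be computed from raw response bits as sketched below. This is a minimal Python sketch using the standard definitions of normalized Hamming distance and Hamming weight; the paper does not list its exact computation, so the function names and the averaging over all die pairs are assumptions made for illustration.

```python
from itertools import combinations

# Illustrative PUF quality metrics on lists of 0/1 response bits.
# Each response list holds one die's answers to the same ordered challenges.


def hamming_distance(a, b):
    """Normalized Hamming distance: fraction of positions where a and b differ."""
    if len(a) != len(b):
        raise ValueError("responses must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def hamming_weight(bits):
    """Normalized Hamming weight: fraction of response bits equal to 1."""
    return sum(bits) / len(bits)


def inter_die_hd(responses_by_die):
    """Mean pairwise HD between different dies (uniqueness, ideal value 0.5)."""
    pairs = list(combinations(responses_by_die, 2))
    return sum(hamming_distance(a, b) for a, b in pairs) / len(pairs)


def intra_die_hd(reference, remeasurements):
    """Mean HD between a die's reference response and repeated measurements,
    for example taken at other temperature or VDD corners (a BER estimate)."""
    total = sum(hamming_distance(reference, r) for r in remeasurements)
    return total / len(remeasurements)


if __name__ == "__main__":
    # Made-up 8-bit responses from three hypothetical dies.
    dies = [
        [0, 1, 1, 0, 1, 0, 0, 1],
        [1, 1, 0, 0, 1, 0, 1, 1],
        [0, 0, 1, 1, 1, 1, 0, 0],
    ]
    noisy = [0, 1, 1, 0, 1, 0, 0, 0]  # die 0 re-measured with one bit flipped
    print("inter-die HD:", inter_die_hd(dies))
    print("HW of die 0:", hamming_weight(dies[0]))
    print("intra-die HD of die 0:", intra_die_hd(dies[0], [noisy, dies[0]]))
```

With measured data, each list would hold the 500 response bits of one die, and the intra-die re-measurements would come from the different temperature and VDD corners.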


2017 Symposium on VLSI Circuits Digest of Technical Papers

978-4-86348-606-5 ©2017 JSAP

[1] K. Yang, et al., ISSCC Tech. Dig., pp. 254-255, 2015.
[2] R. Maes, et al., ESSCIRC, pp. 486-489, 2014.
[3] J. W. Lee, et al., VLSI Tech. Dig., pp. 176-179, 2004.
[4] G. E. Suh, et al., DAC Tech. Dig., pp. 9-14, 2007.
[5] M. Kalyanaraman, et al., HOST, pp. 13-18, 2013.
[6] U. Rührmair, et al., IEEE TIFS, 8, 1876, 2013.
[7] G. Hospodar, et al., WIFS, pp. 37-42, 2012.

Fig. 1: SCA-PUF structure
Fig. 2: PUF circuit diagram and subthreshold array structure
Fig. 3: Nonlinearity of PUF output
Fig. 4: Common-mode feedback circuit
Fig. 5: (a) Differential PUF output voltage and (b) common-mode PUF output voltage
Fig. 6: Low-offset comparator circuit
Fig. 7: (a) PUF uniqueness (inter-die HD) and reliability (intra-die HD) and (b) uniformity/randomness (Hamming weight)
Fig. 8: Average BER across VDD and temperature
Fig. 9: BER vs. discarded CRPs at different calibration voltages Vcal
Fig. 10: Machine learning attacks on the 65-bit SCA-PUF and arbiter PUF: (a) prediction error and (b) secrecy capacity results
Fig. 11: Die micrograph of 130nm CMOS SCA-PUF test

Table I: State-of-the-art silicon PUF and chip ID designs

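                             | This work                        | ISSCC'15 [1]  | ESSCIRC'14 [2] | VLSI'04 [3] | DAC'07 [4]
Technology                   | 0.13μm                           | 40nm          | 65nm           | 0.18μm      | 90nm (FPGA)
Number of CRPs               | ~3.7×10^19                       | ~5.5×10^28    | ~1.8×10^19     | 1.4×10^20   | 523776
BER in worst case            | 9% (averaged),                   | 9%            | 4.5%           | 4.8%        | 0.48%
                             | 0.4% (42% CRPs discarded)        |               |                |             |
Power (μW) / Energy (pJ/bit) | 0.068 / 11.0                     | 28.4 / 17.75  | -              | -           | -
VDD range                    | 1.08-1.32V                       | 0.7-1.2V      | 1.2-1.44V      | ± 2%        | 1.08-1.2V
Temperature range            | -20 to 80°C                      | -25 to 125°C  | -40 to 85°C    | 27 to 67°C  | 20 to 120°C

This work, remaining rows: Type: SCA PUF; post-attack security (prediction error on 10^4 CRPs): 40%; secrecy capacity (bits, on 2^65 CRPs): ~2^65.
Prior-work entries for these rows, in source order (column alignment not recoverable from the source): Type: Delay PUF, RO PUF; post-attack security and secrecy capacity: -, 1%, -, 1% [6], ~2000, -.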
