A Parallelizable Chaos-based True Random Number Generator based on Mobile Device Cameras for the Android Platform
Wei-Zhu Yeoh · Je Sen Teh · Huey Rong Chern
Abstract True random number generators are used in high security applications such as cryptography where non-determinism is required. However, they are slower than their pseudorandom counterparts because they need to extract entropy from physical phenomena. To overcome this drawback, generators have been designed to extract unpredictability from devices such as computer processing units or microphones. This paper introduces a new generator for the Android mobile platform based on images captured by a built-in camera. Although similar generators exist, they suffer from poor performance and a lack of proper security evaluation. The proposed generator implements a chaos-based postprocessing algorithm that eliminates statistical defects and increases its throughput. These goals are achieved by using the inherent properties of a chaotic system to amplify entropy extracted from the captured images. The proposed generator is evaluated in two phases: first, statistical test suites are executed to identify statistical defects. Next, the generator’s forward and backward security is analysed. Results indicate that the proposed true random number generator is able to generate statistically secure true random number sequences faster than existing mobile-based generators. In addition, the generator is designed to support parallel processing, thus allowing its performance to scale according to the mobile device’s multicore architecture.¹

Wei-Zhu Yeoh
Inti International College Penang

Je Sen Teh
Universiti Sains Malaysia
E-mail: jesen [email protected]

Huey Rong Chern
Inti International College Penang

¹ This is a pre-print of an article that will be published in Multimedia Tools and Applications. The final authenticated version will be available online at: https://doi.org/10.1007/s11042-018-7015-0.
Wei-Zhu Yeoh et al.
Keywords True random number generator · chaos theory · Android · mobile device · digital camera
1 Introduction

Random numbers are vital for applications such as computer simulation, gambling and cryptography. These applications rely heavily on the statistical quality of the random number sequence generated by a random number generator (RNG). There are two main categories of RNGs: pseudorandom number generators (PRNGs) and true random number generators (TRNGs). PRNGs are deterministic: they produce the same sequence of random numbers given the same initial input (seed). TRNGs are instead nondeterministic: they produce random numbers in an unpredictable manner even when all initial conditions remain constant. PRNGs are used if reproducibility is required. However, for high security applications such as generating encryption keys for cryptography, TRNGs are preferred for their unpredictability.

TRNGs are generally slower than their PRNG counterparts because they must harvest entropy from physical phenomena. These entropy sources can include radioactive decay [?], lasers [?] and electrical noise [?]. TRNGs based on the unpredictability of computer components have also been introduced in an effort to design generators that do not require specialized equipment for entropy extraction. Examples of these entropy sources include air turbulence from disk drives [?], GPU race conditions [?], and mouse movements [?].

Mobile devices are quickly replacing personal computers as the dominant computing platform. Users check their e-mails, browse webpages, and even shop online using their mobile devices. Therefore, these devices also require secure TRNGs to generate security parameters. Mobile devices are commonly equipped with multiple hardware sensors. Some of these sensors, such as fingerprint sensors [?], accelerometers, gyroscopes and GPS [?,?], have been used as viable entropy sources for TRNGs. Due to its high sampling rate, a mobile device’s built-in camera is also a suitable entropy source.
Modern mobile devices come equipped with powerful cameras that can capture a large amount of photon data as an image. Although these cameras may seem deterministic in nature, slight environmental changes and electrical noise contribute towards the unpredictability of pixel values. Numerous researchers have attempted to generate random numbers based on images [?,?,?,?]. Their findings verify that images are indeed suitable entropy sources, provided that suitable algorithms are used for entropy extraction and postprocessing. However, existing designs suffer from drawbacks such as low performance, insufficient security evaluation and, in some cases, reduced flexibility as they require active participation of the user. Some of the aforementioned generators have been designed by using chaotic maps [?,?,?,?]. Chaotic maps are mathematical functions that exhibit random-like behaviour, are highly sensitive to slight changes to their initial conditions,
Title Suppressed Due to Excessive Length
and have confusion and diffusion properties. Due to these characteristics, there has been an increased interest in the application of chaos theory in fields such as cryptography [?,?] and random number generators.
This paper outlines the design of a new TRNG for Android mobile devices based on image data and chaos theory. The proposed generator extracts entropy solely from images captured by the mobile device’s camera without human intervention or additional hardware. Thus, the TRNG can be easily deployed to any existing Android-based mobile device. It uses a chaos-based postprocessing algorithm to amplify data entropy that has been extracted from images. Its security is evaluated in two phases: statistical testing using various test suites followed by an analysis of its forward and backward security. This two-phase security evaluation is based on the standardization efforts of the German Federal Office for Information Security (BSI) for the security evaluation of RNGs [?]. The security evaluation performed on the proposed TRNG sets it apart from existing work that relies solely on statistical test suites. The TRNG is also designed to be parallelizable, allowing it to scale according to the latest multicore technology.

The remaining sections of this paper are organized as follows: Section ?? discusses related work, followed by Section ?? that provides an introduction to image formats and chaos theory. Section ?? describes the architecture of the proposed algorithm. Next, Sections ?? and ?? discuss the security and performance of the proposed algorithm respectively. The paper is then concluded in Section ??.
2 Related Work

This section discusses several existing TRNGs based on mobile devices, specifically those that utilize images as their entropy source. Bouda et al. proposed a TRNG for mobile devices that uses a hash function for entropy extraction [?]. Preprocessing is first performed by sampling many images in low resolution and extracting only high entropy bits from each sampled image. Carter-Wegman universal hashing [?] is then used for further entropy amplification. The performance of the TRNG is approximately 36 bits per second (bps). The downsampling performed during the preprocessing stage improves the quality of the TRNG but lowers its performance as a result.

Zhao et al. proposed three variants of a TRNG that uses chaotic systems as a postprocessing algorithm [?]. Three different chaos-based approaches were compared: one based on the Arnold cat map, a sorting algorithm based on the logistic map, and a chaos-based image cipher. Security testing was performed using the NIST statistical test suite. They found that the methods based on the Arnold cat map and the chaos-based cipher yielded the best statistical results. However, all three variants have poor performance: the fastest of the three has a throughput of approximately 8.6 Kilobits per second (Kbps).

In 2014, Zhang et al. proposed a TRNG based on readings obtained from the image sensor of smartphone cameras [?]. This generator requires users to obstruct the camera lens with their finger. Then, the red channel of image data is used as the entropy source. However, since it only uses one out of three available channels, the random number output rate is limited to one-third of its maximum potential. Another drawback of this generator is the requirement of human interaction to produce random numbers, which in turn reduces its practicality. Postprocessing is performed with the help of vector-matrix multiplication. The output rate of the generator ranges between 1 and 2 Megabits per second (Mbps).

Sanguinetti et al. proposed a different method that involves passing raw pixel data through a 2000-bit input/500-bit output complex extractor [?]. This method results in data compression by a factor of 4. If implemented on an FPGA, their algorithm is able to achieve speeds of up to 3 Gbps. However, the software implementation of their proposed method on a consumer grade device achieves throughputs of approximately 1 Mbps. Security evaluation was performed using statistical tests such as NIST and DIEHARD.

Based on the existing work, there are several issues that can be addressed. First, most of the existing generators use RGB channel values or raw data captured by camera sensors. There are other pixel formats that are worth investigating, such as the YUV format, which is the standard image data format for Android devices starting from version 5.0. This alternative data format may be a more suitable candidate for an entropy source compared to RGB or raw data. Next, the performance of existing mobile-based TRNGs is still subpar compared to TRNGs implemented on other platforms. Improvements can be made by using alternative postprocessing algorithms that are able to increase performance while eliminating statistical defects. A parallelizable design can also be considered for a further performance boost. In terms of security evaluation, most of the existing designs rely only on statistical testing. This is insufficient because these test suites cannot verify forward and backward unpredictability, which is an important requirement for high security applications such as cryptography. The proposed TRNG in this paper was designed to address these issues.
3 Preliminaries

3.1 Image Formats

Images captured by mobile device cameras can be encoded in different formats. The most common format is JPEG which performs lossy compression when encoding image data. To access the underlying RGB pixel values, one needs to decode the JPEG data. The additional encoding and decoding processes incur additional computational overhead, especially for generators that require RGB values as their entropy source. Thus, the JPEG format is not optimal for RNGs because it will indirectly contribute to a decrease in overall performance.

An alternative image encoding format that can be used to design RNGs is the YUV format. In this paper, the term YUV refers to the YCbCr format in keeping with Android documentation [?]. Y refers to the brightness whereas U and V are the color components. In addition to the JPEG format, Android 5.0 onwards supports the YCbCr 4:2:0 format for its new Camera2 API, where
Cb and Cr represent the blue-difference chroma and red-difference chroma respectively. Image data represented as YUV can be used directly by the TRNG since the data is already extracted as the Y, U and V arrays without the need for decoding.

Raw image data captured by the camera is also available as another image format. This format does not undergo any lossy compression and provides the best image quality available for the user. Different camera vendors have different types of proprietary raw encoding formats, which leads to a lack of standardization. Thus, this image format is not suitable for a TRNG because it will lower the flexibility and interoperability of the algorithm.

After thorough consideration, the YUV image format was chosen for the proposed TRNG because it provides a performance advantage over JPEG due to the absence of decoding. The compression and conversion processes for JPEG images add computational overhead, which lowers the generator’s efficiency. In addition, JPEG involves lossy compression, which contributes to a loss in data entropy. On the other hand, the RAW format was not used due to the lack of standardization, which would result in inconsistencies with regard to its data entropy. The YUV format is also supported by a majority of mobile phones, accounting for approximately 89% of devices in the market that use Android OS 5.0 and above [?].² If YUV is not supported by the target device, the JPEG image format will be used as the alternative.
3.2 Chaos Theory

Chaos theory is the study of dynamical systems that are extremely sensitive to their initial conditions, whereby any slight change leads to a greatly varying output. This characteristic, along with other properties such as random-like behavior, aperiodicity, and confusion and diffusion capabilities, is analogous to the requirements of cryptographic algorithms. Thus, chaotic systems have been used in the design of hash functions [?], encryption algorithms [?,?], and even key distribution [?]. Due to these intrinsic properties, the proposed TRNG utilizes a hyperchaotic system known as the chaotic coupled map lattice (CCML) as part of its postprocessing mechanism. The CCML is a complex chaotic system that consists of multiple simple chaotic maps that are coupled in a nearest-neighbour fashion.

3.2.1 Logistic Map

A chaotic map is a mathematical function that exhibits chaotic behaviour. The proposed TRNG requires a simple and fast chaotic map to be used as the local map of the CCML. The logistic map was chosen for this purpose due to its simplicity and speed. In addition, the logistic map is also one of

² As of September 2018.
Fig. 1 Logistic Map Bifurcation Diagram (horizontal axis: r from 2.4 to 4.0; vertical axis: x from 0.0 to 1.0)
the most well-studied chaotic maps in the field of chaotic cryptography. Its mathematical formula is
$$x_{t+1} = r x_t (1 - x_t), \tag{1}$$
where $x_t$ is the logistic map’s state, $t$ is the time index or iteration number, and $r$ is the control parameter. The values of $x$ and $r$ fall in the ranges [0, 1] and [0, 4] respectively. Increasing the value of $r$ from 0 to 4 changes the behaviour of the logistic map from periodic to aperiodic, as shown in its bifurcation diagram in Fig. ??. The shaded areas denote the chaotic regions that are suitable for the design of a generator. Note that there exists a window of periodicity in the range of [3.82, 3.86]. Therefore, the proposed work utilizes $r > 3.86$ for its postprocessing algorithm to ensure that the logistic map operates within its chaotic region.
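As a minimal illustration of (1), the following Python sketch iterates the logistic map in a periodic and in a chaotic setting. The function name `logistic_orbit`, the starting value 0.3 and the perturbation of 1e-10 are illustrative assumptions and are not taken from the proposed generator.

```python
def logistic_orbit(x0, r, steps):
    """Iterate x_{t+1} = r * x_t * (1 - x_t) and return the whole orbit."""
    orbit = [x0]
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit

# Periodic regime: for r = 2.8 the orbit settles on the fixed point 1 - 1/r.
settled = logistic_orbit(0.3, 2.8, 1000)[-1]

# Chaotic regime: two almost identical starting points drift far apart.
a = logistic_orbit(0.3, 3.9, 100)
b = logistic_orbit(0.3 + 1e-10, 3.9, 100)
max_gap = max(abs(p - q) for p, q in zip(a, b))

print(settled, max_gap)
```

The second experiment is the sensitivity to initial conditions that the postprocessing stage relies on: a difference far below the resolution of a pixel value grows to the scale of the state itself within a few dozen iterations.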
3.2.2 Chaotic Coupled Map Lattice

CCML is a spatiotemporal chaotic system that couples multiple local maps based on a nearest-neighbor structure. The CCML has increased period length and more complex behaviour as compared to simple chaotic maps. The CCML equation is calculated as
$$x_{t+1}^{i} = \varepsilon f(x_t^{i}) + \frac{1-\varepsilon}{2}\left(f(x_t^{i+1}) + f(x_t^{i-1})\right), \tag{2}$$
where $\varepsilon$ is the coupling coefficient that ranges between [0,1], $f(x)$ is the local chaotic map, $i = \{0, 1, ..., L-1\}$ is the map index, and $L$ is the lattice size (number of local maps in the system). The minimum number of iterations to fully diffuse one chaotic state ($x_t^i$) to affect all other states is $flr(\frac{L}{2})$, where $flr(R)$ is a function that rounds a real number, $R$, down to the nearest integer. Figs. ?? and ?? illustrate the minimum number of iterations required for complete diffusion of a single state for $L = 5$ and $L = 6$ respectively.
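The diffusion bound can be checked numerically with a short Python sketch of (2). The helper names `ccml_step` and `affected_after`, the sample state values and the nudge of 1e-3 are assumptions made for illustration; the ring-shaped (periodic) boundary follows the modulo-L indexing used in the postprocessing algorithm.

```python
def logistic(x, r=3.9):
    """Local map f(x) of the lattice."""
    return r * x * (1.0 - x)

def ccml_step(states, eps=0.5, r=3.9):
    """One CCML update following (2), with neighbours taken modulo L."""
    L = len(states)
    fx = [logistic(x, r) for x in states]
    return [
        eps * fx[i] + (1.0 - eps) / 2.0 * (fx[(i + 1) % L] + fx[(i - 1) % L])
        for i in range(L)
    ]

def affected_after(states, index, steps, delta=1e-3):
    """Count the states that differ after `steps` updates when one state is nudged."""
    base = list(states)
    nudged = list(states)
    nudged[index] += delta
    for _ in range(steps):
        base = ccml_step(base)
        nudged = ccml_step(nudged)
    return sum(1 for p, q in zip(base, nudged) if p != q)

states5 = [0.11, 0.23, 0.37, 0.41, 0.59]
states6 = states5 + [0.73]
print(affected_after(states5, 2, 5 // 2), affected_after(states6, 2, 6 // 2))
```

Each update spreads a change by one lattice position in both directions, so the number of affected states grows as 3, 5, 7, ... until it covers the ring, which is reached after flr(L/2) updates.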
Fig. 2 CCML Diffusion for L = 5
Fig. 3 CCML Diffusion for L = 6
4 Proposed Algorithm

4.1 Overview

The proposed algorithm has three main phases: image sampling, preprocessing, and postprocessing. During the image sampling phase, images will be captured using the mobile device’s camera. Next, the preprocessing phase will combine pairs of images via XOR operation and interlace the resulting image data to maximize entropy. Here, image data refers to the brightness and colour components of the YUV image format. The postprocessing module further amplifies data entropy by iterating the CCML using the resulting image data as initial conditions. The output of the postprocessing phase will be the final random bits generated by the proposed TRNG. This postprocessing phase is designed to be parallelizable for higher throughput. The detailed descriptions of the sampling, preprocessing and postprocessing phases are covered in Sections ??, ??, and ?? respectively whereas parallelization of the proposed algorithm is
Fig. 4 Flowchart of the Proposed Algorithm
described in Section ??. The flowchart of the proposed algorithm is depicted in Fig. ??.
4.2 Image Sampling

The proposed algorithm sets the camera settings to auto-exposure and autofocus so that it can adjust itself in dark, bright or normal lighting conditions. The first 24 camera frames are discarded to provide sufficient time for the camera to adjust its focus and exposure automatically. The largest image resolution that can be supported by the device is chosen to maximize the throughput of the generator. Images are captured in a burst fashion whereby multiple shots are taken back-to-back once the camera is successfully initialized. Images are specifically captured in multiples of two to prepare for the preprocessing phase. Images can also be captured and stored ahead of time to further increase the TRNG’s throughput by avoiding the need for reinitialization. These images can be stored in an entropy pool and used when required.

The captured images are stored in the YUV format, specifically YCbCr 4:2:0, where the size of the Y array is double the size of the U and V arrays combined. Each color sample is stored as 8 bits. In the event that the YUV format is not supported by the mobile device, RGB values decoded from the JPEG image format will be used as the entropy source. The use of both YUV and RGB will be described in the following subsection.
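The size relationship between the three arrays can be made concrete with a small Python calculation. This is a sketch that assumes tightly packed planes and even image dimensions; the function name `yuv420_plane_sizes` and the 1920 × 1080 resolution are illustrative, and real camera buffers may contain additional row or pixel padding.

```python
def yuv420_plane_sizes(width, height):
    """Return the (Y, U, V) sample counts of a YCbCr 4:2:0 frame.

    Chroma is subsampled by two horizontally and vertically, so each chroma
    plane holds a quarter as many samples as the luma plane.
    """
    y = width * height
    chroma = (width // 2) * (height // 2)
    return y, chroma, chroma

y, u, v = yuv420_plane_sizes(1920, 1080)
print(y, u, v, y == 2 * (u + v))
```

With one byte per sample, such a frame contributes Y + U + V bytes of raw data to each pair of images consumed by the preprocessing phase.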
4.3 Preprocessing

The preprocessing stage performs basic data transformation and reshuffling to obtain higher entropy from image data. Let two images be denoted by $A$ and $B$ respectively, each with their own $Y$, $U$ and $V$ arrays. The two images will first undergo an XOR operation denoted by
$$\hat{Y} = \{(y_0^A \oplus y_{n-1}^B), (y_1^A \oplus y_{n-2}^B), ..., (y_{n-1}^A \oplus y_0^B)\}, \tag{3}$$
where $y_i^X$ is the $i$-th $Y$ array element of image $X$, and $n$ is the array size in bytes. The same procedure is performed to obtain the $\hat{U}$ and $\hat{V}$ arrays. Let $\hat{y}_i$, $\hat{u}_i$ and $\hat{v}_i$ represent these new array elements. After the XOR operations, the array elements are interlaced in the format of $Z = \{z_0, z_1, ..., z_{l-1}\} = \{\hat{y}_0, \hat{u}_0, \hat{y}_1, \hat{v}_0, \hat{y}_2, \hat{u}_1, \hat{y}_3, \hat{v}_1, ..., \hat{y}_{n-2}, \hat{u}_{m-1}, \hat{y}_{n-1}, \hat{v}_{m-1}\}$, where $n = 2m$ and $l = 2n$. If any array elements remain, they are appended at the end of $Z$ in a linear fashion to maximize the potential output of the algorithm. The preprocessing algorithm is summarized in Algorithm ??.

Whenever the YUV format is unavailable, RGB values from a JPEG image can be used as the entropy source. Let $\rho_i^X$, $\gamma_i^X$ and $\beta_i^X$ be the $i$-th red, green and blue elements respectively for an image $X$. Then, two images, $A$ and $B$, will undergo an XOR operation denoted by
$$\hat{R} = \{(\rho_0^A \oplus \rho_{n-1}^B), (\rho_1^A \oplus \rho_{n-2}^B), ..., (\rho_{n-1}^A \oplus \rho_0^B)\}. \tag{4}$$
The same equation is repeated for the green and blue colour elements to obtain the $\hat{G}$ and $\hat{B}$ arrays respectively. Then, the array elements of $\hat{R}$, $\hat{G}$ and $\hat{B}$ will be interlaced as $Z = \{z_0, z_1, ..., z_{l-1}\} = \{\hat{\rho}_0, \hat{\gamma}_0, \hat{\beta}_0, \hat{\rho}_1, \hat{\gamma}_1, \hat{\beta}_1, ..., \hat{\rho}_{n-1}, \hat{\gamma}_{n-1}, \hat{\beta}_{n-1}\}$, where $l = 3n$. The postprocessing phase described in the following subsection is the same for both the YUV and RGB arrays.
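The YUV branch of the preprocessing stage can be sketched in Python as follows. The sketch follows the prose description of (3) and the interlacing order rather than any particular implementation; the function names are hypothetical, and it assumes that both chroma arrays have equal length and that the Y array holds at least twice as many bytes as each of them.

```python
def reverse_xor(a, b):
    """XOR element i of `a` with element n-1-i of `b`, as in (3)."""
    n = len(a)
    return bytes(a[i] ^ b[n - 1 - i] for i in range(n))

def interlace_yuv(y_hat, u_hat, v_hat):
    """Emit y0,u0,y1,v0,y2,u1,y3,v1,... and append the leftover Y bytes."""
    m = len(u_hat)
    out = bytearray()
    for k in range(m):
        out += bytes([y_hat[2 * k], u_hat[k], y_hat[2 * k + 1], v_hat[k]])
    out += y_hat[2 * m:]  # remaining Y values, appended linearly
    return bytes(out)

def preprocess(ya, yb, ua, ub, va, vb):
    """Combine two YUV images A and B into the interlaced byte array Z."""
    return interlace_yuv(reverse_xor(ya, yb), reverse_xor(ua, ub), reverse_xor(va, vb))

# Tiny synthetic example: 8 luma bytes and 2 bytes per chroma plane.
z = interlace_yuv(bytes(range(8)), bytes([10, 11]), bytes([20, 21]))
print(list(z))
```

Pairing the first element of one image with the last element of the other means that bytes from the same sensor position are never combined with each other, which is the reshuffling effect described above.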
4.4 Postprocessing

The postprocessing algorithm takes the preprocessed image data, $Z$, as an input and generates the final random bits as an output. Postprocessing is performed using the CCML with logistic maps as its underlying local maps. Values in $Z$ will be normalised and assigned as the CCML’s initial values. These values are interpreted as signed 32-bit integers in a big-endian manner. The data is then normalized to the range of [0,1], represented as 64-bit floating point values. This normalization is performed to conform to the logistic map’s input requirement. Feature scaling is used as the data normalization technique [?]. The formula for feature scaling is calculated as
$$z_{norm} = \frac{z_i - \min(Z)}{\max(Z) - \min(Z)}, \tag{5}$$
Algorithm 1 Preprocessing Algorithm
Input: Image arrays: $Y^A$, $Y^B$, $U^A$, $U^B$, $V^A$, $V^B$
Output: Array of interlaced bytes: $Z$

  //Reverse XOR operation
  $lenY \leftarrow length(Y^B) - 1$ //Length of the Y array
  $m \leftarrow lenY$
  for $i = 0$ to $lenY$ do
    $\hat{y}_i \leftarrow y_i^A \oplus y_m^B$
    $m \leftarrow m - 1$
  end for
  $lenU \leftarrow length(U^B) - 1$ //Length of the U/V array
  $m \leftarrow lenU$
  for $i = 0$ to $lenU$ do
    $\hat{u}_i \leftarrow u_i^A \oplus u_m^B$
    $\hat{v}_i \leftarrow v_i^A \oplus v_m^B$
    $m \leftarrow m - 1$
  end for

  //Interlacing
  for $i = 0$ to $(2 \times lenU)$ do
    if $i \bmod 2 == 0$ then
      $z_{2i} \leftarrow \hat{y}_i$
      $z_{2i+1} \leftarrow \hat{u}_{\frac{i}{2}}$
    else
      $z_{2i} \leftarrow \hat{y}_i$
      $z_{2i+1} \leftarrow \hat{v}_{flr(\frac{i}{2})}$
    end if
  end for

  //Append remaining values
  $lenZ \leftarrow lenY + (2 \times lenU)$ //Length of the final Z array
  $e \leftarrow 2 \times lenU$ //Starting index for remaining Y values
  for $i = 4 \times lenU$ to $lenZ$ do
    $z_i \leftarrow \hat{y}_e$
    $e \leftarrow e + 1$
  end for
  return $Z \leftarrow \{z_0, z_1, ..., z_{lenZ}\}$
where $z_i$ and $z_{norm}$ denote the current (integer) and normalized (floating point) values respectively. The $\min(Z)$ and $\max(Z)$ functions produce the smallest and largest values in the $Z$ array respectively. Let the state values and control parameters for each of the logistic maps within the CCML be $x_t^i$ and $r_t^i$ respectively, where $i = \{0, 1, 2, ..., L-1\}$ and $t$ is the number of iterations. After calculating $z_{norm}$, it is assigned to one of the CCML’s state values ($x_0^i$). The process is repeated until all $L$ of the initial values have been set. The coupling coefficient is set to $\varepsilon = 0.5$ to ensure that the current chaotic state will have heavier weightage on the result of calculating (??). Based on the results of statistical testing, the size of the CCML is set to $L = 6$ to achieve a balance between performance and statistical quality of the generated random numbers. If the available data is not a multiple of 6,
values from the beginning of the dataset will be repeated. Each logistic map is iterated 50 times to eliminate transient effects [?]. The initial control parameters, $r_0^i$, are set to 3.9 to ensure that the logistic maps operate within their chaotic region. For further entropy amplification, the control parameters will be recalculated as
$$r_t^i = r_t^i + 0.001 x_t^i + c, \tag{6}$$
where $c$ is a small constant. Any arbitrary value for $c$ within the range of [0.001, 0.005] can be chosen, but $c = 0.002$ was chosen for the implementation of the TRNG in this paper. The value of $c$ ensures that $r_t^i$ will be modified even when $x_t^i \approx 0$. If a situation arises where $r_t^i > 4$ after computing (??), it will be further recalculated as
$$r_t^i = 3.9 + 0.0025 r_t^i, \tag{7}$$
which prevents the logistic maps from being iterated out of scope. The scaled value of the control parameter, $0.0025 r_t^i$, is added to a constant value of 3.9 to obtain slightly varying results each time $r$ is reset. The perturbation of the control parameters in (??) and (??) is performed before each set of 50 logistic map iterations.

Under certain circumstances such as extreme darkness or brightness, images that are captured will have multiple pixel values that are equal. After performing the XOR operation in the preprocessing phase, there will be multiple state values of $x_t^i = 0$. This scenario is undesirable as it will cause the chaotic map’s state to remain at zero for every iteration and lead to nonrandom outputs. If this occurs, $x_t^i$ is modified as
$$x_t^i = \frac{r_t^i}{4}, \tag{8}$$
where $\frac{1}{4}$ is chosen as the coefficient for $r_t^i$ to ensure that the resulting value of $x_t^i$ will remain in the range of (0, 1). This remapping enables the TRNG to operate optimally regardless of the environmental lighting condition.

After processing the entire CCML for $flr(\frac{L}{2})$ iterations, the final states (represented as 64-bit floating point numbers) are compressed by performing an XOR operation. The 32 most significant bits (MSB) and the 32 least significant bits (LSB) are XOR-ed together to form the final 32-bit output data. This is performed because the MSBs have lower entropy than the LSBs. By performing the XOR operation, all the bits will play a role in the final output without being discarded. The final output is then stored in a byte array in big-endian fashion. Big-endian is chosen over little-endian to remain consistent with the rest of the algorithm, but using either representation will have a similar result. This byte array is the final random number sequence produced by the proposed TRNG. The modified logistic map equation is detailed in Algorithm ?? whereas the rest of the postprocessing algorithm is as shown in Algorithm ??.
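The building blocks of this subsection can be sketched in Python as follows: feature scaling (5), the modified logistic map with (6), (7) and (8), and the folding of a 64-bit state into 32 output bits. The function names are hypothetical. Returning zeros when all input words are equal is an assumption of this sketch, since (5) is undefined in that case, and the IEEE 754 binary64 layout is assumed for the 64-bit floating point states.

```python
import struct

def normalize(z_bytes):
    """Read big-endian signed 32-bit words and feature-scale them to [0, 1]."""
    count = len(z_bytes) // 4
    words = struct.unpack(">%di" % count, z_bytes[:4 * count])
    lo, hi = min(words), max(words)
    if hi == lo:  # assumption: (5) is undefined here, so fall back to zeros
        return [0.0] * count
    return [(w - lo) / (hi - lo) for w in words]

def modified_logistic(x, r, c=0.002, iterations=50):
    """Remap a zero state, perturb r, reset r if needed, then iterate the map."""
    if x == 0.0:
        x = r / 4.0                # remapping (8)
    r = r + 0.001 * x + c          # perturbation (6)
    if r > 4.0:
        r = 3.9 + 0.0025 * r       # reset (7)
    for _ in range(iterations):
        x = r * x * (1.0 - x)
    return x, r

def fold_to_32_bits(x):
    """XOR the upper and lower 32 bits of the binary64 encoding of x."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return (bits >> 32) ^ (bits & 0xFFFFFFFF)

def to_big_endian_bytes(word):
    """Store a 32-bit output word as four bytes, most significant first."""
    return struct.pack(">I", word)

x, r = modified_logistic(0.0, 3.9)
print(x, r, to_big_endian_bytes(fold_to_32_bits(x)).hex())
```

The upper 32 bits of a binary64 value hold the sign, the exponent and the leading mantissa bits, which change slowly for states confined to (0, 1); folding them onto the lower mantissa bits keeps every bit involved without enlarging the output.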
Algorithm 2 Modified Logistic Map Algorithm
Input: 64-bit floating point values: $x$, $r$, $c$
Output: 64-bit floating point values: $x$, $r$

  if $x == 0$ then
    $x \leftarrow \frac{r}{4}$
  end if
  $r \leftarrow r + 0.001 \times x + c$ //Perturb r with (??)
  if $r > 4$ then
    $r \leftarrow 3.9 + 0.0025 \times r$ //Reset r with (??)
  end if
  for $i = 0$ to $49$ do
    $x \leftarrow r \times x \times (1 - x)$
  end for
  return $x$, $r$
Algorithm 3 Postprocessing Algorithm
Input: Byte array from preprocessing: $Z$, lattice size: $L$, parameter values $\{r_0^0, ..., r_0^{L-1}\}$
Output: Array of random numbers: $O$

  $j = 0$
  while $j < length(Z)$ do
    for $i = 0$ to $L - 1$ do
      $x_0^i \leftarrow \frac{z_{i+j} - \min(Z)}{\max(Z) - \min(Z)}$ //Normalization based on (??)
    end for

    //Iterate CCML for full diffusion
    for $t = 0$ to $flr(\frac{L}{2} - 1)$ do
      //Loop for each local map
      for $i = 0$ to $L - 1$ do
        $x_{t+1}^i \leftarrow \varepsilon f(x_t^i, \bar{r}_t^i, c) + \frac{\varepsilon}{2}\left(f(x_t^{i+1 \bmod L}, \bar{r}_t^{i+1 \bmod L}, c) + f(x_t^{i-1 \bmod L}, \bar{r}_t^{i-1 \bmod L}, c)\right)$
      end for
    end for

    //Push final random numbers into O
    for $i = 0$ to $L - 1$ do
      $O \leftarrow MSB32(x_{flr(\frac{L}{2}-1)}^i) \oplus LSB32(x_{flr(\frac{L}{2}-1)}^i)$
    end for

    $j \leftarrow j + L$ //Increment index
  end while
  return $O$

Note: $\bar{r}_t^i$ denotes passing $r_t^i$ by reference
4.5 Parallelization

The proposed algorithm is designed to be parallelizable to achieve higher throughput when multicore architecture is available. The most computationally heavy stage of the proposed TRNG is its postprocessing algorithm. By parallelizing the postprocessing module, the performance of the algorithm can be drastically improved.
When parallelized, the TRNG will execute the postprocessing algorithm across multiple threads. The number of working threads is set to the number of cores available on the CPU. Increasing the thread count beyond the available number of cores increases the overall execution time because it increases the amount of thread context switching. During parallel execution, each local chaotic map is initialized with a different control parameter to improve the entropy of the generated random numbers. The control parameter is calculated based on the current thread ID, $d = \{0, 1, ..., (T-1)\}$, as well as the total number of threads, $T$, as
$$r_0^d = 3.9 + \frac{d}{10.01 \times (T - 1)}. \tag{9}$$
This equation evenly distributes the control parameter within the range of [3.9, 3.9999] to ensure that all the logistic maps operate within their chaotic region. The parallel processing strategy employed for the TRNG is input and output partitioning. This strategy ensures that multiple threads will not access the same memory location, thus eliminating the need to perform thread synchronization. Using this strategy, a set of worker threads is first initialized to a size equal to the number of available physical cores. The input of the postprocessing module, Z, will be divided into equal-sized subarrays based on the number of worker threads. Each subarray will be handled by exactly one worker thread. The control parameter of the logistic map for each worker thread is initialized according to (??). Each worker thread will then proceed to compute Algorithm ??. Upon completion, the outputs of all threads are concatenated in order to produce the final random number byte array. The significant performance boost from parallelization is discussed in Section ??.
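The partitioning strategy can be sketched in Python as follows. The names `control_parameter`, `partition` and `run_parallel` are hypothetical, and `worker` stands in for the postprocessing routine. Two choices are assumptions of this sketch rather than part of the design: the single-thread case, where (9) is undefined, falls back to 3.9, and the chunks are of near-equal size when the input length is not a multiple of the thread count.

```python
from concurrent.futures import ThreadPoolExecutor

def control_parameter(d, total):
    """Initial control parameter for worker d out of `total` workers, following (9)."""
    if total == 1:
        return 3.9  # assumption: (9) divides by zero for T = 1
    return 3.9 + d / (10.01 * (total - 1))

def partition(data, total):
    """Split `data` into `total` contiguous chunks of near-equal size."""
    size = -(-len(data) // total)  # ceiling division
    return [data[k * size:(k + 1) * size] for k in range(total)]

def run_parallel(data, total, worker):
    """Run worker(chunk, r0) on every chunk and concatenate the outputs in order."""
    chunks = partition(data, total)
    params = [control_parameter(d, total) for d in range(total)]
    with ThreadPoolExecutor(max_workers=total) as pool:
        return b"".join(pool.map(worker, chunks, params))

# Placeholder worker that returns its chunk unchanged, to show the data flow.
data = bytes(range(10))
print(run_parallel(data, 4, lambda chunk, r0: chunk) == data)
```

Because every worker reads only its own chunk and returns its own output, no locks are needed. In CPython the threads illustrate the partitioning only, since the global interpreter lock prevents a speed-up for CPU-bound work.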
5 Security Evaluation

This section evaluates the security of the proposed TRNG based on the evaluation criteria proposed by Schindler and Killmann [?]. Their work is part of the BSI efforts to standardize the security evaluation of RNGs. Based on this standardization effort, two functionality classes for RNGs were identified, P1 and P2. P1 generators are suitable for applications that require statistical randomness but whose outputs can be openly transmitted, such as challenges or initialization vectors for encryption algorithms. These generators have the minimum requirement of passing statistical test suites to identify the presence of statistical defects. Both PRNGs and TRNGs can comply with the requirements of the P1 functionality class. For sensitive applications such as generating encryption keys or signatures, P2 generators are recommended. A P1 generator must additionally have forward and backward unpredictability in order to satisfy P2 requirements.
Table 1 ENT Results
Test Name                        Test Output    Ideal Output   Result
Entropy                          7.999999       8.0            PASS
Chi-Square                       50.95%         50%            PASS
Arithmetic Mean                  127.4957       127.5          PASS
Monte Carlo Value for Pi         3.142303099    3.141592653    PASS
Serial Correlation Coefficient   -0.000001      0.0            PASS
Therefore, evaluation of the proposed TRNG includes statistical testing followed by evaluation of forward and backward security to ensure that it falls under the P2 functionality class.
5.1 Statistical Testing

Three statistical test suites, ENT [?], DIEHARDER [?] and NIST [?], were used to locate statistical defects in the random number sequences generated by the proposed TRNG. However, they cannot indicate if the generator is non-deterministic. The ENT test suite consists of five different statistical tests that measure randomness based on different criteria. The sample size used for the ENT test is 1000 Mbit, selected to be the same as for the NIST suite. Test results are benchmarked against the ideal values for each test as depicted in Table ??. All the results are near-ideal, which indicates that the proposed TRNG successfully passed the ENT test suite.

The DIEHARDER test suite is an extension to the DIEHARD battery of tests [?]. It consists of 31 individual statistical tests, each of which produces at least one P-value. Whenever the P-value of a test falls below 0.01 or exceeds 0.99, the random number sequence is said to have failed the test. A sample size of 62.58 Gbit is used as required by the test suite. Test results are tabulated in Table ?? where worst case P-values are shown for tests that produce more than one output. The proposed TRNG successfully passed all tests in DIEHARDER.

The final test suite used to evaluate the proposed generator is the NIST SP 800-22. Each of its 15 individual tests is performed on 1000 samples of 1-Mbit length. Each test outputs a single P-value and passing rate (PR). The final P-value is calculated based on the distribution of 1000 individual P-values obtained from testing each sample. A final P-value of at least 0.0001 is required to indicate uniform distribution. For a significance level of α = 0.01, the minimum PR for each test is 0.9805 with the exception of the random excursions test and its variant. These two tests permit a slightly lower PR (0.9780) because a lower number of samples are tested. The proposed TRNG successfully passed all the NIST tests as shown in Table ??.
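The minimum passing rates quoted above can be reproduced with the confidence interval given in NIST SP 800-22, where the acceptable proportion of passing sequences is (1 − α) ± 3·sqrt(α(1 − α)/m) for m tested samples. The Python sketch below evaluates the lower bound; the function name `minimum_pass_rate` is hypothetical, and the sample count of 619 used for the random excursions tests is an assumption chosen because it matches the reported threshold, as the exact count is not stated in the text.

```python
import math

def minimum_pass_rate(samples, alpha=0.01):
    """Lower bound of the acceptable proportion of sequences passing a test."""
    p = 1.0 - alpha
    return p - 3.0 * math.sqrt(p * (1.0 - p) / samples)

print(minimum_pass_rate(1000))  # all tests except random excursions
print(minimum_pass_rate(619))   # assumed sample count for random excursions
```

For 1000 samples the bound evaluates to approximately 0.98056, which corresponds to the 0.9805 threshold used in the NIST results table. The bound drops as fewer samples are tested, which is why the random excursions tests permit a lower PR.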
Table 2 DIEHARDER Results

Test Name                          P-Value      Result
Birthday                           0.27832259   PASS
OPERM5                             0.51100287   PASS
32 × 32 Binary Rank                0.27832259   PASS
6 × 8 Binary Rank                  0.49047325   PASS
Bitstream                          0.80291839   PASS
OPSO                               0.94147304   PASS
OQSO                               0.42922786   PASS
DNA                                0.13983407   PASS
Count-the-1s (Stream)              0.84488077   PASS
Count-the-1s (Byte)                0.70458042   PASS
Parking Lot                        0.57621075   PASS
Minimum Distance                   0.80544089   PASS
3D Sphere                          0.12038852   PASS
Squeeze                            0.26085107   PASS
Sums                               0.02909476   PASS
Runs                               0.20203097   PASS
Craps                              0.95163613   PASS
GCD                                0.94235563   PASS
STS Monobit                        0.32666868   PASS
STS Runs                           0.4021927    PASS
STS Serial Test                    0.80898850   PASS
RGB Bit Distribution               0.35348487   PASS
RGB Generalized Minimum Distance   0.8373383    PASS
RGB Permutations                   0.85900303   PASS
RGB Lagged Sum                     0.9770688    PASS
RGB Kolmogorov-Smirnov Test        0.43157471   PASS
DAB Byte Distribution              0.10611634   PASS
DAB DCT                            0.92396432   PASS
DAB Fill Tree                      0.64173909   PASS
DAB Fill Tree 2                    0.71333388   PASS
DAB Monobit 2                      0.26969197   PASS
5.2 Analysis of Forward and Backward Security

The preceding section has shown that the proposed TRNG can generate random numbers that are statistically inconspicuous, thus fulfilling the requirement for the P1 class of generators. However, passing the statistical tests does not guarantee that the TRNG has forward and backward security. Forward and backward security is a vital property which ensures that an adversary cannot predict past or future values even with the ability to corrupt the internal state of a generator [?].
Table 3 NIST Results

Test Name                   P-Value    PR       Minimum PR   Result
Frequency                   0.516113   0.988    0.9805       PASS
Block Frequency             0.928857   0.993    0.9805       PASS
Cumulative Sums             0.572847   0.989    0.9805       PASS
Runs                        0.122325   0.991    0.9805       PASS
Longest Run                 0.291091   0.985    0.9805       PASS
Rank                        0.530120   0.995    0.9805       PASS
FFT                         0.858002   0.990    0.9805       PASS
Non-Overlapping Templates   0.743915   0.990    0.9805       PASS
Overlapping Templates       0.502247   0.984    0.9805       PASS
Universal                   0.373625   0.981    0.9805       PASS
Approximate Entropy         0.647530   0.995    0.9805       PASS
Random Excursions           0.292960   0.9863   0.9780       PASS
Random Excursions Variant   0.966685   0.9893   0.9780       PASS
Serial                      0.402962   0.988    0.9805       PASS
Linear Complexity           0.433590   0.984    0.9805       PASS
To evaluate a generator for this property, it is essential to show evidence that the increase in entropy per internal random number is sufficiently large [?]. The Coron test [?] can be used for this purpose; however, the generator must first be shown to be at least Markovian (independent and memoryless) before the test can be applied. To describe the Markovian property, the following notions of an ideal TRNG are observed [?]:

Definition 1 A TRNG that generates numbers from a finite set M = {0, 1, ..., m − 1} is ideal if and only if it produces a sequence of statistically independent and identically distributed (i.i.d.) discrete random variables uniformly distributed over the m numbers, and has no memory of past or future generated symbols.

Definition 2 A TRNG is unpredictable if and only if it is an ideal TRNG.

The Average Shannon Entropy (ASE) can be calculated to estimate how closely a generator resembles an ideal TRNG. The ASE is calculated as

$$H(X) = \lim_{k \to \infty} -\frac{1}{k} \sum_{\sigma_k \in \{0,1\}^k} P(\sigma_k) \log_2 P(\sigma_k), \qquad (10)$$

where the summation runs over the finite set of all binary k-tuples of the form $\sigma_k = \{b_0, ..., b_{k-1}\}$, $b_i \in \{0, 1\}$. $P(\sigma_k)$ is the probability of generating the k-tuple $\sigma_k$, and if $P(\sigma_k) = 0$, then $P(\sigma_k) \log_2 P(\sigma_k)$ is taken to be zero. Since the output of the proposed TRNG consists of 1-byte random numbers, k = 8 is used. The ASE lies in the range [0, 1], with 1 representing the ideal case.
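As an illustration of (10) with k = 8, the ASE of a byte sequence can be estimated from the empirical byte frequencies: the Shannon entropy of the byte histogram, in bits, divided by 8. The following is a minimal Python sketch under that assumption; the function name is hypothetical and the sketch is not the authors' implementation.

```python
import math
from collections import Counter

K_BITS = 8  # tuple length k; one output symbol is one byte


def average_shannon_entropy(data: bytes) -> float:
    """Empirical ASE for k = 8, normalised to the range [0, 1].

    P(sigma_k) is estimated by the relative frequency of each byte value.
    Byte values that never occur contribute zero, following the convention
    that P * log2(P) is taken as zero when P = 0.
    """
    if not data:
        raise ValueError("data must not be empty")
    total = len(data)
    counts = Counter(data)
    entropy_bits = -sum(
        (count / total) * math.log2(count / total)
        for count in counts.values()
    )
    return entropy_bits / K_BITS


if __name__ == "__main__":
    uniform = bytes(range(256)) * 4      # every byte value equally often
    repeated = bytes(range(32)) * 1024   # one fixed 32-byte block repeated
    print(average_shannon_entropy(uniform))
    print(average_shannon_entropy(repeated))
```

A perfectly uniform byte histogram gives an ASE of 1, whereas a source that keeps emitting one fixed block of 32 distinct bytes cannot exceed log2(32)/8 = 0.625, however long the sequence is.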
Based on the aforementioned definitions, the following test, proposed in [?], is carried out to demonstrate the unpredictable nature of the proposed generator. The TRNG first produces a 1-megabyte (MB) random number sequence under a specific constraint: the TRNG is reset to its initial state after generating each block of 256 random bits (32 random numbers). This process is repeated until the full 1-MB random data sequence has been gathered. A total of (1024 × 1024 × 8)/256 = 32768 repetitions are required to generate the 1-MB random sequence. Next, the 1-MB sequence is used to calculate the ASE based on (10), which results in H(X) = 0.999977. This result indicates that the proposed generator closely resembles an ideal TRNG that has both the independence and the memoryless property. Resetting the TRNG to its initial state is similar to repeatedly reinitializing a PRNG with the same initial seed. In the case of a PRNG, reinitializing it with the same seed causes it to fail this test because the same set of 256 random bits is produced repeatedly, leading to a low ASE value.

The aforementioned test demonstrates that the proposed TRNG is at least Markovian. Next, the Coron test can be applied to obtain a more precise estimate of the entropy for a fixed block size. The settings for the Coron test are based on [?]: L = 8, Q = 2560, and K = 256000. The output of the test is f = 7.99824, which exceeds the requirement of f > 7.976 needed to fulfil the security criteria. Therefore, the proposed TRNG has been shown to possess the forward and backward security property. Along with the findings in Section ??, the generator fulfils the P2 property for RNGs.
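The Coron statistic is not written out in the text, so the sketch below assumes its commonly cited form: f = (1/K) Σ g(A_n) over the K test blocks that follow Q initialization blocks, where A_n is the distance to the previous occurrence of the n-th L-bit block (or n if it has not occurred before) and g(i) = (1/ln 2) Σ_{j=1}^{i−1} 1/j. The function name is hypothetical, L is fixed at 8 so that one block is one byte, and the input used in the demonstration comes from Python's pseudorandom generator rather than from the proposed TRNG.

```python
import math
import random


def coron_statistic(data: bytes, q: int = 2560, k: int = 256000) -> float:
    """Coron's entropy estimate for 8-bit blocks (L = 8).

    The first q bytes initialise the table of last positions; the next
    k bytes are the test blocks whose g-values are averaged.
    """
    if len(data) < q + k:
        raise ValueError("need at least q + k bytes of input")

    # harmonic[i] = sum_{j=1..i} 1/j, so g(i) = harmonic[i - 1] / ln(2).
    harmonic = [0.0] * (q + k + 1)
    for i in range(1, q + k + 1):
        harmonic[i] = harmonic[i - 1] + 1.0 / i

    # Positions are 1-indexed; 0 means "not seen yet", so that the
    # distance for a first occurrence at position n equals n.
    last_seen = [0] * 256
    for n in range(1, q + 1):
        last_seen[data[n - 1]] = n

    total = 0.0
    for n in range(q + 1, q + k + 1):
        block = data[n - 1]
        distance = n - last_seen[block]
        total += harmonic[distance - 1]
        last_seen[block] = n

    return total / (k * math.log(2))


if __name__ == "__main__":
    rng = random.Random(2018)  # stand-in data source, not the proposed TRNG
    sample = bytes(rng.getrandbits(8) for _ in range(2560 + 256000))
    f = coron_statistic(sample)
    print(f, f > 7.976)
```

For independent, uniformly distributed bytes the expected value of g(A_n) is 8, so an output close to 8 is consistent with near-full entropy per block, while a constant input yields 0.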
6 Performance Evaluation

6.1 Standard Performance

The proposed algorithm is implemented on the Android platform and written in Java. The Android device used for the experiments is a Huawei P9 with a HiSilicon Kirin 955 SoC. This SoC uses an octa-core CPU with four high-performance Cortex-A72 cores and four low-performance Cortex-A53 cores, clocked at 2.5 GHz and 1.8 GHz respectively. The camera of this device can capture images at a resolution of 2976 × 3968. The throughput of the proposed TRNG is 22.5 Mbps when the initial camera setup time is included. If the camera is initialized ahead of time or the TRNG has been operating for a long period of time, the throughput is approximately 29.2 Mbps.

The proposed TRNG outperforms previously proposed image-based TRNGs for mobile devices. Both Zhang et al. [?] and Sanguinetti et al. [?] reported similar throughputs of 1-2 Mbps, so the proposed generator is at least 11 times faster (22.5/2 ≈ 11.3). Although the measurements were taken on different devices, the proposed TRNG is expected to remain faster even after accounting for the improvement of mobile CPUs over the past three years.
The proposed TRNG’s speed was measured by capturing the number of images needed to generate the target random number sequence, rather than by drawing from an entropy pool. If an entropy pool were used, the performance of the TRNG would increase. The code can also be further optimized by using the C programming language or the Android-specific RenderScript instead of Java. In addition, the generator can be parallelized for a further performance boost, as described in the following subsection.
6.2 Parallel Performance

Parallel computation is applied only to the postprocessing module as described in Section ??. An analysis was performed to evaluate the effect of the number of threads on the performance of the postprocessing module based on the following parameters:

$$\text{Overhead, } T_o = pT_p - T_s, \qquad (11)$$

$$\text{Speedup, } S = \frac{T_s}{T_p}, \qquad (12)$$

$$\text{Efficiency, } E = \frac{S}{p}, \qquad (13)$$

$$\text{Running Cost, } C = pT_p, \qquad (14)$$

where p is the number of cores or working threads, and $T_s$ and $T_p$ are the serial and parallel runtimes of the algorithm respectively. Overhead indicates the additional cost, whereas speedup indicates the performance improvement when running the algorithm in parallel. Efficiency measures the performance improvement as a function of the number of threads or cores being used, whereas the running cost is the total computational cost of running multiple threads in parallel.

Fig. 5 and Table 4 both show that the postprocessing module benefits greatly from parallel computation. Parallelization speeds up postprocessing by a factor of up to about five (S = 5.37 with eight threads). The efficiency of parallelization drops when the number of threads is increased past four because the additional threads start to utilize the low-performance cores. These low-performance cores cannot complete tasks as quickly as their high-performance counterparts, which reduces efficiency. Increasing the number of threads beyond the number of available cores does not yield any gain in performance. Instead, there appears to be a performance degradation due to thread context switching. Therefore, the number of working threads should be equal to the number of available computing cores to maximize the throughput of the parallelized postprocessing module. Based on these findings, the proposed TRNG is computed in parallel using eight threads to maximize its throughput. The resulting performance boost improves the TRNG’s throughput from 22.5 Mbps to 118.9 Mbps.
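The guideline of matching the number of working threads to the number of available cores can be sketched as a simple worker pool. The Python sketch below assumes that the postprocessing input can be split into independent chunks; the function names are hypothetical, and `postprocess_chunk` uses SHA-256 purely as a placeholder transform, not the chaos-based postprocessing of the proposed generator.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional


def postprocess_chunk(chunk: bytes) -> bytes:
    """Placeholder transform (SHA-256); NOT the paper's chaotic map."""
    return hashlib.sha256(chunk).digest()


def split_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Split the input into consecutive chunks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def postprocess_parallel(data: bytes, chunk_size: int = 1 << 16,
                         workers: Optional[int] = None) -> bytes:
    """Process chunks with one worker per available core by default."""
    if workers is None:
        workers = os.cpu_count() or 1
    chunks = split_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, so the output is
        # identical to processing the chunks serially.
        return b"".join(pool.map(postprocess_chunk, chunks))


if __name__ == "__main__":
    raw = bytes(range(256)) * 1024
    out = postprocess_parallel(raw, chunk_size=4096)
    print(len(out))
```

Because results are collected in input order, the parallel output equals the serial output regardless of the worker count. In CPython, threads only run concurrently while the work releases the interpreter lock, so a process pool may be the better choice for pure-Python transforms.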
Fig. 5 Parallel performance of the postprocessing module
Table 4 Analysis of Parallelization

Threads   Tp (s)   Ts (s)   To     S      E      C
1         5.98     5.98     0.00   1.00   1.00   5.98
2         3.05     5.98     0.11   1.96   0.98   6.09
3         2.06     5.98     0.21   2.90   0.97   6.19
4         1.56     5.98     0.26   3.83   0.96   6.24
5         1.49     5.98     1.45   4.02   0.80   7.43
6         1.41     5.98     2.49   4.24   0.71   8.46
7         1.26     5.98     2.86   4.73   0.68   8.84
8         1.11     5.98     2.93   5.37   0.67   8.91
9         1.15     5.98     4.37   5.20   0.58   10.34
10        1.13     5.98     5.31   5.30   0.53   11.29
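The derived columns of Table 4 follow directly from (11)-(14). The following minimal Python sketch recomputes them from p, Ts and Tp; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelMetrics:
    overhead: float    # T_o = p * T_p - T_s
    speedup: float     # S = T_s / T_p
    efficiency: float  # E = S / p
    cost: float        # C = p * T_p


def parallel_metrics(p: int, t_serial: float,
                     t_parallel: float) -> ParallelMetrics:
    """Evaluate equations (11)-(14) for p threads."""
    if p < 1 or t_serial <= 0 or t_parallel <= 0:
        raise ValueError("p must be >= 1 and runtimes must be positive")
    cost = p * t_parallel
    speedup = t_serial / t_parallel
    return ParallelMetrics(
        overhead=cost - t_serial,
        speedup=speedup,
        efficiency=speedup / p,
        cost=cost,
    )


if __name__ == "__main__":
    # Four-thread row of Table 4: Ts = 5.98 s, Tp = 1.56 s.
    print(parallel_metrics(4, 5.98, 1.56))
```

For the four-thread row this gives To ≈ 0.26, S ≈ 3.83, E ≈ 0.96 and C ≈ 6.24, in agreement with the table. Some other rows differ in the last digit when recomputed from the two-decimal Tp shown (for example, eight threads give C = 8.88 rather than 8.91), presumably because the tabulated values were derived from unrounded timings.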
7 Conclusion

In this work, a new TRNG for Android-based mobile devices was proposed. The generator captures image data samples using a mobile device’s digital camera and amplifies their entropy by using a spatiotemporal chaotic system. Because the built-in digital camera requires no additional hardware, the algorithm can be easily deployed to any existing Android-based mobile device. The proposed TRNG passed multiple statistical test suites (ENT, NIST, and DIEHARDER), which indicates the absence of statistical defects. The generator also has forward and backward unpredictability, which is a requirement for high security applications. In terms of performance, the proposed TRNG achieves a high throughput of approximately 22.5 Mbps, which is at least 11 times faster than existing image-based mobile TRNGs. When parallel computation is used, the TRNG can achieve a throughput of approximately 118.9 Mbps. In short, the proposed algorithm is a viable TRNG for Android-based mobile platforms, capable of generating true random numbers at a high throughput without the need for external hardware.
Acknowledgment

This is a pre-print of an article that will be published in Multimedia Tools and Applications. The final authenticated version will be available online at: https://doi.org/10.1007/s11042-018-7015-0. This work has been partially supported by Universiti Sains Malaysia under Grant No. 304/PKOMP/6315190 and the National Natural Science Foundation of China under Grant No. 61702212.

References

1. Android image format. URL https://developer.android.com/reference/android/graphics/ImageFormat.html
2. Android distribution dashboard (2018). URL https://developer.android.com/about/dashboards/
3. Addabbo, T., Fort, A., Rocchi, S., Vignoli, V.: Chaos Based Generation of True Random Bits, pp. 355–377. Springer Berlin Heidelberg, Berlin, Heidelberg (2009). DOI 10.1007/978-3-540-95972-4_17. URL https://doi.org/10.1007/978-3-540-95972-4_17
4. Aksoy, S., Haralick, R.M.: Feature normalization and likelihood-based similarity measures for image retrieval. Pattern Recognition Letters 22(5), 563–582 (2001). DOI https://doi.org/10.1016/S0167-8655(00)00112-4. URL http://www.sciencedirect.com/science/article/pii/S0167865500001124. Image/Video Indexing and Retrieval
5. Altaf, M., Ahmad, A., Khan, F.A., Uddin, Z., Yang, X.: Computationally efficient selective video encryption with chaos based block cipher. Multimedia Tools and Applications (2018). DOI 10.1007/s11042-018-6022-5. URL https://doi.org/10.1007/s11042-018-6022-5
6. Bassham, L.E., Rukhin, A.L., Soto, J., Nechvatal, J.R., Smid, M.E., Leigh, S.D., Levenson, M., Vangel, M., Heckert, N.A., Banks, D.L.: A statistical test suite for random and pseudorandom number generators for cryptographic applications. Tech. rep., National Institute of Standards and Technology (2010). URL https://www.nist.gov/publications/statistical-test-suite-random-and-pseudorandom-number-generators-cryptographic
7. Bouda, J., Krhovjak, J., Matyas, V., Svenda, P.: Towards true random number generation in mobile environments. In: A. Jøsang, T. Maseng, S.J. Knapskog (eds.) Identity and Privacy in the Internet Age, pp. 179–189. Springer Berlin Heidelberg, Berlin, Heidelberg (2009)
8. Brown, R.G.: dieharder (2018). URL http://webhome.phy.duke.edu/~rgb/General/dieharder.php
9. Carter, J., Wegman, M.N.: Universal classes of hash functions. Journal of Computer and System Sciences 18(2), 143–154 (1979). DOI https://doi.org/10.1016/0022-0000(79)90044-8. URL http://www.sciencedirect.com/science/article/pii/0022000079900448
10. Cicek, I., Pusane, A.E., Dundar, G.: A novel design method for discrete time chaos based true random number generators. Integration, the VLSI Journal 47(1), 38–47 (2014). DOI https://doi.org/10.1016/j.vlsi.2013.06.003. URL http://www.sciencedirect.com/science/article/pii/S0167926013000308
11. Coron, J.S.: On the security of random sources. In: Public Key Cryptography, pp. 29–42. Springer Berlin Heidelberg, Berlin, Heidelberg (1999)
12. Davis, D., Ihaka, R., Fenstermacher, P.: Cryptographic randomness from air turbulence in disk drives. In: Y.G. Desmedt (ed.) Advances in Cryptology - CRYPTO ’94, pp. 114–120. Springer Berlin Heidelberg, Berlin, Heidelberg (1994)
13. Dodis, Y., Pointcheval, D., Ruhault, S., Vergniaud, D., Wichs, D.: Security analysis of pseudo-random number generators with input: /dev/random is not robust. In: Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security, CCS ’13, pp. 647–658. ACM, New York, NY, USA (2013). DOI 10.1145/2508859.2516653. URL http://doi.acm.org/10.1145/2508859.2516653
14. Gan, Z., Chai, X., Yuan, K., Lu, Y.: A novel image encryption algorithm based on lft based s-boxes and chaos. Multimedia Tools and Applications 77(7), 8759–8783 (2018). DOI 10.1007/s11042-017-4772-0. URL https://doi.org/10.1007/s11042-017-4772-0
15. Kanak, A., Ergun, S.: A practical biometric random number generator for mobile security applications. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E100.A(1), 158–166 (2017). DOI 10.1587/transfun.E100.A.158
16. Keuninckx, L., Soriano, M.C., Fischer, I., Mirasso, C.R., Nguimdo, R.M., der Sande, G.V.: Encryption key distribution via chaos synchronization. Sci. Rep. 7(43428) (2017). DOI 10.1038/srep43428
17. Marsaglia, G.: The marsaglia random number cdrom including the diehard battery of tests of randomness (1995). URL http://stat.fsu.edu/pub/diehard/
18. Oteo, J.A., Ros, J.: Double precision errors in the logistic map: Statistical study and dynamical interpretation. Phys. Rev. E 76, 036214 (2007). DOI 10.1103/PhysRevE.76.036214. URL https://link.aps.org/doi/10.1103/PhysRevE.76.036214
19. Sanguinetti, B., Martin, A., Zbinden, H., Gisin, N.: Quantum random number generation on a mobile phone. Phys. Rev. X 4, 031056 (2014). DOI 10.1103/PhysRevX.4.031056. URL https://link.aps.org/doi/10.1103/PhysRevX.4.031056
20. Schindler, W., Killmann, W.: Evaluation criteria for true (physical) random number generators used in cryptographic applications. In: Cryptographic Hardware and Embedded Systems - CHES 2002, Lecture Notes in Computer Science, vol. 2523, pp. 431–449. Springer Berlin Heidelberg (2003). DOI 10.1007/3-540-36400-5_31. URL http://dx.doi.org/10.1007/3-540-36400-5_31
21. Suciu, A., Lebu, D., Marton, K.: Unpredictable random number generator based on mobile sensors. In: 2011 IEEE 7th International Conference on Intelligent Computer Communication and Processing, pp. 445–448 (2011). DOI 10.1109/ICCP.2011.6047913
22. Teh, J.S., Samsudin, A., Akhavan, A.: Parallel chaotic hash function based on the shuffle-exchange network. Nonlinear Dynamics 81(3), 1067–1079 (2015). DOI 10.1007/s11071-015-2049-6. URL https://doi.org/10.1007/s11071-015-2049-6
23. Teh, J.S., Samsudin, A., Al-Mazrooie, M., Akhavan, A.: Gpus and chaos: a new true random number generator. Nonlinear Dynamics 82(4), 1913–1922 (2015). DOI 10.1007/s11071-015-2287-7. URL https://doi.org/10.1007/s11071-015-2287-7
24. Walker, J.: Pseudorandom number sequence test program (2008). URL http://www.fourmilab.ch/random/
25. Wallace, K., Moran, K., Novak, E., Zhou, G., Sun, K.: Toward sensor-based random number generation for mobile and iot devices. IEEE Internet of Things Journal 3(6), 1189–1201 (2016). DOI 10.1109/JIOT.2016.2572638
26. Wei, W., Guo, H.: Bias-free true random-number generator. Opt. Lett. 34(12), 1876–1878 (2009). DOI 10.1364/OL.34.001876. URL http://ol.osa.org/abstract.cfm?URI=ol-34-12-1876
27. Xingyuan, W., Xue, Q., Lin, T.: A novel true random number generator based on mouse movement and a one-dimensional chaotic map. Mathematical Problems in Engineering (2012)
28. Yoshizawa, Y., Kimura, H., Inoue, H., Fujita, K., Toyama, M., Miyatake, O.: Physical random numbers generated by radioactivity. Journal of the Japanese Society of Computational Statistics 2012 (1999). DOI 10.5183/jjscs1988.12.67
29. Zhang, X., Qi, L., Tang, Z., Zhang, Y.: Portable true random number generator for personal encryption application based on smartphone camera. Electronics Letters 50(24), 1841–1843 (2014). DOI 10.1049/el.2014.2870
30. Zhao, L., Liao, X., Xiao, D., Xiang, T., Zhou, Q., Duan, S.: True random number generation from mobile telephone photo based on chaotic cryptography. Chaos, Solitons & Fractals 42(3), 1692–1699 (2009). DOI https://doi.org/10.1016/j.chaos.2009.03.068. URL http://www.sciencedirect.com/science/article/pii/S0960077909001866