Design and Implementation of a Security Processor for Satellite Communication Systems
Stavroula Zouzoula, Nicolas Sklavos, Apostolos P. Fournaris
SCYTALE Research Group, Computer Engineering & Informatics Dept., University of Patras, Patra, Hellas
Abstract— Satellite communication security relies on cryptographic schemes that must remain unbroken for a considerable amount of time, since updating and patching are not easy during a satellite's long lifecycle. Progress in quantum computing indicates that practical quantum computers may appear in the near future and render many existing cryptographic algorithms obsolete. This highlights the need for post-quantum implementations that resist quantum-based attacks, and the specificities of satellite communication make a post-quantum update a necessity. In this work, a lattice-based cryptographic scheme that is believed to be quantum resistant is implemented in hardware. The encryption and decryption implementations are based on the Learning with Errors (LWE) post-quantum algorithm. The goal of the proposed implementation is high speed, in order to match satellite communication requirements. Compared with other similar works, our implementation needs fewer clock cycles per encryption and decryption.
Keywords— satellites; cryptography; lattice; learning with errors
I. INTRODUCTION
Satellite communication has developed rapidly in recent years as the number of stakeholders involved in space innovation expands. Satellite systems are the main means of communication over long distances and wherever terrestrial links are extremely difficult to establish, so their quality of service and speed are critical. Satellite networks can be considered a critical infrastructure, both because their health is very important for communication across the globe and because they are preferred when sensitive information, such as information about national security, is to be transmitted. They are also preferred for very long-distance data transmission because of the global availability, reliability and flexibility they offer. However, for secure message exchange, such networks need to provide information confidentiality (e.g. through encryption). For this reason, satellites are equipped with highly secure cryptographic algorithms. However, considering that
satellite life expectancy is very long, the cryptographic algorithms and implementations installed on a satellite should be chosen very carefully, since reprogramming in the field is very difficult. As computer technology has evolved rapidly, new computation approaches are being explored beyond the principles behind Moore's law. In this framework, considerable research has been invested in the past years in the development of computing systems based on quantum mechanics. Theoretically, such a machine, once it becomes a reality, will be able to solve many of the hard mathematical problems on which modern cryptographic algorithms are based, due to its computational capabilities. This scenario has raised cryptographers' awareness and motivated the construction of practical quantum-resistant (post-quantum) cryptographic algorithms. Satellite devices that are built nowadays will remain active in space long enough for practical quantum computers to become available, so quantum-based attacks on them are a real threat. This necessitates the integration of a strong cryptographic algorithm that is resilient to quantum attacks into satellite communication systems.
Post-quantum cryptography refers to cryptographic algorithms that are believed to resist attacks by quantum computers. Post-quantum cryptographic research focuses on five different families of schemes:
1. Code-based cryptography [1]. An example is McEliece's cryptosystem. It is based on error correcting codes and on the hardness of decoding a message with random errors.
2. Hash-based cryptography [2]. An example is Merkle's cryptographic scheme. It uses a cryptographic hash function, and its security relies on the collision resistance of that function.
3. Multivariate cryptography [3]. Its security relies on the hardness of solving a set of nonlinear equations over a finite field.
4. Supersingular elliptic curve cryptography [4]. It is a Diffie-Hellman type scheme, based on the difficulty of finding isogenies between supersingular elliptic curves.
5. Lattice-based cryptography [5]. It uses lattices in the construction of cryptographic schemes. Its hardness is related to the hardness of finding the shortest vector in a lattice.
In this paper, a hardware implementation of lattice-based cryptography is proposed, based on the learning with errors (LWE) problem. The goal of the proposed implementation is to be applicable in the satellite communication domain, where speed is a necessity. The implemented scheme is based on the work of J. Howe et al. [6]. The rest of the paper is organized as follows. In Section II, background information is provided on satellite communication security and on lattice-based LWE cryptography. In Section III, the proposed system is described, and in Section IV implementation results are presented and compared with other works. Finally, Section V concludes the paper.
II. BACKGROUND
A. Satellite Communications
Satellite communication, relying on artificial satellites, has two basic segments: the ground segment (transmission and reception equipment) and the space segment (artificial satellite antennas). The main components of satellites are the propulsion system, the power system, the communications system and the scientific instruments [7]. Satellite instruments collect different kinds of information, such as sensor, image and control data. These data, as well as satellite maintenance data, are passed to the satellite's computer. The computer turns the data into binary packets and transmits them to the transponder, which codes the binary digits into a radio signal. The satellite's antenna beams the radio signal toward Earth, where a large antenna receives it. A receiver on Earth decodes the signal back into digital data and sends it to a computer that recovers the data sent from the satellite. Typically, security mechanisms have been used only for military missions. Over the years, satellites have also come to transmit sensitive data for non-military purposes. Such data must keep their confidentiality against unauthorized access. Symmetric and asymmetric cryptography can be used to solve this issue [8].
Cryptographic algorithms such as DES, 3DES, IDEA and AES have been applied on satellites [9]. However, due to the threat of quantum computers, the development and integration of a quantum-resistant algorithm on satellites is necessary [10]. Lattice-based cryptography through the LWE algorithm can provide a solution to this problem.
B. Learning With Errors (LWE)
The first lattice-based cryptographic scheme was introduced by Ajtai in 1996 [11]. In 2005, Oded Regev introduced the LWE problem and, based on it, a lattice-based public key cryptographic scheme with proven security [12]. Regev's work was the milestone for subsequent research in the field.
The LWE problem has the following parameters: the dimension $n \geq 1$, which is the lattice's security parameter; an integer modulus $q \geq 2$; a message alphabet $\Sigma$; the message length $\ell \geq 1$; and an error distribution $\chi$ over $\mathbb{Z}$ or $\mathbb{Z}_q$. The error distribution $\chi$ is usually defined as the discrete Gaussian distribution $D_{\mathbb{Z},\sigma}$ with standard deviation $\sigma \in \mathbb{R}$. The problem's data are a desired number of vectors $a_i$, drawn uniformly at random from $\mathbb{Z}_q^n$; the inner products of the $a_i$ with a secret vector $s \in \mathbb{Z}_q^n$; and error terms $e_i$, which are drawn from $\chi$ and added to the inner products. The inner products and the additions are computed modulo $q$. In matrix and vector form, the data are a matrix $A \in \mathbb{Z}_q^{n \times n}$ whose columns are the vectors $a_i$, a vector $e \in \mathbb{Z}_q^n$ consisting of the error terms $e_i$, and a vector $b \in \mathbb{Z}_q^n$ whose elements are $b_i = \langle a_i, s \rangle + e_i \bmod q$.
The keys of the system consist of a uniformly random public matrix $A \in \mathbb{Z}_q^{n \times n}$, which is generated by a trusted source and used by all parties in the system; two matrices $R_1, R_2 \leftarrow D_{\mathbb{Z},\sigma}^{n \times \ell}$ containing Gaussian noise, where $R_2$ is the system's private key and is used during decryption; and the matrix $P \equiv R_1 - A \cdot R_2 \bmod q$, which is the public key of the system.
Depending on its values, the parameter set $(n, q, \sigma)$ offers a low, medium or high level of security. The medium parameter set $(n, q, \sigma) = (256, 4093, 3.33)$ is believed to be as secure as AES-128. The modulus $q$ does not need to be prime; it can instead be a power of 2, which allows a more efficient implementation of the modular reduction component [13]. The resulting parameter set is $(n, q, \sigma) = (256, 4096, 3.33)$.
III. PROPOSED SYSTEM DESIGN
In this work, an LWE cryptographic scheme implementation is proposed.
The implementation is based on the work of J. Howe et al. [6] and was designed to be time efficient, providing high computation speed. It consists of two independent systems, the encryption system and the decryption system.
A. Encryption System
The encryption system consists of the encryption keys, the Gaussian samples, the arithmetic module and the ciphertexts. Its high level architecture is depicted in Figure 1. The keys, the Gaussian samples and the ciphertexts are stored in BRAMs. The arithmetic module consists of multiply and accumulate (MAC) components, adders, an encoder and components for the modular reduction. A message to be encrypted is fed into the encoder unit. The LWE encryption operation uses the keys and Gaussian samples already stored in the BRAMs. After encryption, the results (ciphertexts) leave the arithmetic module and are stored in BRAMs.
The LWE encryption scheme is described in Algorithm 1. The SUM operations are the MAC components used in the proposed implementation. The variable $\bar{m}$ is the encoded message. To read the keys and Gaussian samples in the LWE encryption scheme, a one-time initialization stage must be performed on start-up.
Algorithm 1: Encryption ($A_1$, $A_2$, $P$, $m \in \{0,1\}^{\ell}$)
1: for $j = 0$ to $\ell - 1$ do
2:   for $i = 0$ to $n - 1$ do
3:     $\mathrm{SUM1} := \mathrm{SUM1} + e_1(i) \times A_1(i,j)$
4:     $\mathrm{SUM2} := \mathrm{SUM2} + e_1(i) \times A_2(i,j)$
5:     $\mathrm{SUM3} := \mathrm{SUM3} + e_1(i) \times P(i,j)$
6:   end for
7:   $c_1(j) = \mathrm{SUM1} + e_{2\_1}(j) \bmod q$
8:   $c_2(j) = \mathrm{SUM2} + e_{2\_2}(j) \bmod q$
9:   $c_3(j) = \mathrm{SUM3} + e_3(j) + \bar{m}(j) \bmod q$
10: end for
The implemented encryption scheme computes the ciphertexts in parallel to decrease the encryption time. For that purpose, three MAC components are used, which calculate the products $\langle e_1, A_1 \rangle$, $\langle e_1, A_2 \rangle$ and $\langle e_1, P \rangle$ respectively, where 256 clock cycles are required to compute one row-column product. For each MAC component, there are two adders that add the error terms and the encoded message. The result of each pair of adders is fed into a modular reduction component. The values calculated by the modular reduction components are elements of the ciphertexts and are stored in the corresponding memory ($c_1$, $c_2$, $c_3$).
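The loop structure of Algorithm 1 can be sketched in software as follows. This is only an illustrative model, not the hardware design: the function name `encrypt` and the cycle counter are hypothetical, the three accumulators are assumed to be cleared before each column, and the three multiply-accumulate statements in the inner loop stand for the three MAC components that work in parallel in hardware.

```python
def encrypt(A1, A2, P, e1, e2_1, e2_2, e3, m_bar, q):
    """Software model of Algorithm 1 (hypothetical helper, plain Python lists).

    A1, A2 and P are n x l matrices, e1 has length n, and the remaining
    vectors have length l. m_bar is the already encoded message.
    """
    n, l = len(e1), len(m_bar)
    c1, c2, c3 = [], [], []
    mac_cycles = 0  # one count per inner-loop step, i.e. per parallel MAC step
    for j in range(l):
        # Assumption: the accumulators are reset for every output element.
        sum1 = sum2 = sum3 = 0
        for i in range(n):
            sum1 += e1[i] * A1[i][j]
            sum2 += e1[i] * A2[i][j]
            sum3 += e1[i] * P[i][j]
            mac_cycles += 1
        c1.append((sum1 + e2_1[j]) % q)
        c2.append((sum2 + e2_2[j]) % q)
        c3.append((sum3 + e3[j] + m_bar[j]) % q)
    return c1, c2, c3, mac_cycles


# Tiny example with made-up values (n = 2, l = 2, q = 16).
example = encrypt(
    [[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]],
    [1, -1], [0, 1], [1, 0], [0, 0], [8, 0], 16,
)
print(example)
```

With n = 256 the inner loop runs 256 times per output element, which agrees with the 256 clock cycles per row-column product stated above. If the message length is 128 bits (a value the text does not state explicitly), the total is 256 × 128 = 32768 steps, the same as the encryption cycle count reported in Table 1.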
The key generation stage and the Gaussian sample generation stage are computed off-line, as a precomputation. The public matrix A is generated uniformly at random using MATLAB. The Gaussian samples needed for the matrices $R_1$ and $R_2$, where $R_2$ is the private key, as well as the Gaussian error terms needed for the encryption, are generated using Sage's discrete Gaussian sampler [14]. The public key P is calculated using the equation $P = R_1 - A \cdot R_2 \bmod q$. The A and $R_2$ matrices are expressed as $(A_1 \| A_2)$ and $(R_{2\_1} \| R_{2\_2})$ respectively for ease of computation.
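The off-line key generation described above can be sketched in software as follows. This is a minimal illustration and not the authors' MATLAB and Sage flow: the names `sample_noise` and `keygen` are hypothetical, and rounding a continuous Gaussian is assumed here as a simple stand-in for a true discrete Gaussian sampler.

```python
import random


def sample_noise(rows, cols, sigma, rng):
    """Return a rows x cols matrix of small integers.

    Assumption: a rounded continuous Gaussian approximates the discrete
    Gaussian distribution with standard deviation sigma.
    """
    return [[round(rng.gauss(0.0, sigma)) for _ in range(cols)]
            for _ in range(rows)]


def keygen(n, l, q, sigma, seed=None):
    """Generate (A, R1, R2, P) with P = R1 - A * R2 mod q."""
    rng = random.Random(seed)
    # Public matrix A, uniform over Z_q, shared by all parties.
    A = [[rng.randrange(q) for _ in range(n)] for _ in range(n)]
    R1 = sample_noise(n, l, sigma, rng)
    R2 = sample_noise(n, l, sigma, rng)  # private key
    # Public key P, reduced into the range [0, q).
    P = [[(R1[i][j] - sum(A[i][k] * R2[k][j] for k in range(n))) % q
          for j in range(l)]
         for i in range(n)]
    return A, R1, R2, P


# Small parameters keep the example fast; the paper uses n = 256, q = 4096.
A, R1, R2, P = keygen(n=8, l=4, q=4096, sigma=3.33, seed=1)
print(len(P), len(P[0]))
```

Because these values never change at run time, they can be computed once and stored, which is why the hardware only needs memory for them and no on-chip sampler.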
Figure 1: High level architecture of LWE encryption scheme
A description of the various components of the LWE encryption implementation follows.
1) BRAM: As mentioned before, the encryption keys, the Gaussian samples and the ciphertexts are stored in BRAMs. Both BRAM9 and BRAM18 blocks are used.
2) Encoder: The encoder encodes the binary message $m \in \{0,1\}^{\ell}$. Each bit of the message is multiplied by a factor of $\lfloor q/2 \rfloor$, so that $\mathrm{encode}(m) = \bar{m} := m \cdot \lfloor q/2 \rfloor$ [12]. The encoding is necessary because small noise terms remain after decryption; mapping the bits to 0 and $\lfloor q/2 \rfloor$ keeps the two values far enough apart to be told apart despite the noise.
3) Arithmetic module: The arithmetic module is the core of the encryption system and performs all the calculations needed for the encryption: the encoding of the message, the vector-matrix products, the addition of the error vectors and the addition of the encoded message. For the vector-matrix products, a MAC component is used, which on a Spartan 6 FPGA is a DSP48A1 unit.
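The encoder and the power-of-two modular reduction mentioned earlier can be illustrated with a short sketch. The helper names `encode` and `mod_pow2` are hypothetical; the only facts taken from the text are the encoding rule and the modulus q = 4096 = 2^12. For a power-of-two modulus, reduction amounts to keeping the 12 least significant bits, which is what the bit mask below does.

```python
Q = 4096  # q = 2**12, the power-of-two modulus of the chosen parameter set


def encode(bits, q=Q):
    """Map each message bit b to b * floor(q / 2)."""
    return [b * (q // 2) for b in bits]


def mod_pow2(x, q=Q):
    """Reduce x modulo q when q is a power of two, using a bit mask.

    Python integers behave like two's complement numbers under '&',
    so negative inputs are also mapped into the range [0, q).
    """
    if q <= 0 or q & (q - 1) != 0:
        raise ValueError("q must be a power of two")
    return x & (q - 1)


print(encode([0, 1, 1]))             # bits become 0 or 2048
print(mod_pow2(4101), mod_pow2(-1))  # same results as 4101 % 4096 and -1 % 4096
```

In hardware the mask costs no logic at all, since it corresponds to dropping the upper bits of the adder output.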
B. Decryption System
The LWE decryption system architecture follows a similar structure to the LWE encryption architecture and consists of the ciphertexts, the decryption keys, the arithmetic module and the plaintext [14-16]. The keys and the ciphertexts are stored on the device in BRAMs. The arithmetic module consists of MAC components, an adder, a component for the modular reduction and a decoder. Figure 2 depicts the high level architecture of the LWE decryption scheme. The ciphertext is processed by the LWE decryption algorithm inside the arithmetic unit, the result passes through the decoder unit, and the recovered plaintext is saved in the plaintext BRAMs.
The LWE decryption scheme implemented in the proposed decryption architecture is described in Algorithm 2. The SUM operations are the architecture's MAC components. To read the keys and ciphertexts in the LWE decryption scheme, a one-time initialization stage must be performed on start-up.
Algorithm 2: Decryption ($C_1$, $C_2$, $C_3$, $R_{2\_1}$, $R_{2\_2}$)
1: for $j = 0$ to $\ell - 1$ do
2:   for $i = 0$ to $\ell - 1$ do
3:     $\mathrm{SUM1} := \mathrm{SUM1} + C_1(i) \times R_{2\_1}(i,j)$
4:     $\mathrm{SUM2} := \mathrm{SUM2} + C_2(i) \times R_{2\_2}(i,j)$
5:   end for
6:   $\bar{m}(j) = \mathrm{SUM1} + \mathrm{SUM2} + C_3(j) \bmod q$
7:   $m(j) = \mathrm{decode}(\bar{m}(j))$
8: end for
A description of the various components of the LWE decryption implementation follows.
1) Decoder: The decoder decodes the encrypted message. If the input of the decoder is in the range $[-\lfloor q/4 \rfloor, \lfloor q/4 \rfloor)$, it returns a 0, defined as $\mathrm{decode}(\bar{m}) := 0$ if $\bar{m} \in [-\lfloor q/4 \rfloor, \lfloor q/4 \rfloor) \subset \mathbb{Z}_q$; otherwise it returns a 1.
2) Arithmetic module: The arithmetic module performs all the calculations needed for the decryption: the vector-matrix products, the addition of the ciphertext and the decoding of the encrypted message.
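The decoder rule and the loop structure of Algorithm 2 can be checked with a small software round trip. This is a sketch under stated assumptions, not the hardware design: all function names are hypothetical, rounded continuous Gaussians stand in for discrete Gaussian samples, the first and second halves of the rows of R2 are assumed to play the roles of R2_1 and R2_2, and small parameters (n = 16) are used so that the example runs quickly.

```python
import random


def decode(x, q):
    """Return 0 if x, centred around zero, lies in [-q//4, q//4), else 1."""
    centred = (x + q // 2) % q - q // 2
    return 0 if -(q // 4) <= centred < q // 4 else 1


def decrypt(c1, c2, c3, R2_1, R2_2, q):
    """Software model of Algorithm 2 with two accumulators (two MACs)."""
    m = []
    for j in range(len(c3)):
        sum1 = sum2 = 0  # assumption: accumulators are cleared per output bit
        for i in range(len(c1)):
            sum1 += c1[i] * R2_1[i][j]
            sum2 += c2[i] * R2_2[i][j]
        m.append(decode((sum1 + sum2 + c3[j]) % q, q))
    return m


def round_trip(n=16, q=4096, sigma=3.33, seed=0):
    """Generate keys, encrypt a random message and decrypt it again."""
    rng = random.Random(seed)
    h = l = n // 2

    def noise():
        return round(rng.gauss(0.0, sigma))

    A = [[rng.randrange(q) for _ in range(n)] for _ in range(n)]
    R1 = [[noise() for _ in range(l)] for _ in range(n)]
    R2 = [[noise() for _ in range(l)] for _ in range(n)]
    P = [[(R1[i][j] - sum(A[i][k] * R2[k][j] for k in range(n))) % q
          for j in range(l)] for i in range(n)]
    m = [rng.randrange(2) for _ in range(l)]
    e1 = [noise() for _ in range(n)]
    # c = e1^T A + e2 (length n) and c3 = e1^T P + e3 + encode(m) (length l).
    c = [(sum(e1[i] * A[i][k] for i in range(n)) + noise()) % q
         for k in range(n)]
    c3 = [(sum(e1[i] * P[i][j] for i in range(n)) + noise() + m[j] * (q // 2)) % q
          for j in range(l)]
    return m, decrypt(c[:h], c[h:], c3, R2[:h], R2[h:], q)


message, recovered = round_trip()
print(message == recovered)
```

Decryption works because the terms that contain A cancel, leaving the encoded message plus a small noise term that the decoder rounds away.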
Figure 2: High level architecture of LWE decryption scheme
The implemented decryption scheme computes the plaintext in parallel to decrease the decryption time. For that purpose, two MAC components are used, which calculate the products $\langle c_1, R_{2\_1} \rangle$ and $\langle c_2, R_{2\_2} \rangle$ respectively, where 128 clock cycles are required to compute one row-column product. One 3-input adder adds the results of the MAC components and the terms of the ciphertext $c_3$. The result of the adder is fed into a modular reduction component. The values calculated by the modular reduction component are encoded elements of the plaintext, which are then decoded to recover the plaintext.
IV. RESULTS
The implemented architectures were synthesized with Xilinx's ISE Design Suite 14.7. The target platform was a Spartan 6 XC6SLX45 FPGA. Table 1 presents the measurements of the proposed implementations: the FPGA area covered, in LUTs, flip-flops (FF), slices, DSPs and BRAMs, and the speed, as maximum frequency (in MHz) and clock cycles needed for one encryption or decryption. Table 1 also compares these figures with other similar works. It can be observed that the encryption and decryption delay of the implemented scheme is significantly reduced compared to the related works. More specifically, in clock cycles, the encryption delay of the proposed implementation is around 1/3 of that in [6] and the decryption delay is 1/2 of that in [6]. Also, the encryption and decryption delays of the proposed implementation are around 1/4 of those in [17]. Regarding hardware resource usage, the proposed implementation uses a high number of LUTs, FFs and BRAMs. This can be justified by the fact that a discrete Gaussian sampler was not implemented as a logic element. Instead, the needed Gaussian samples were drawn from a trusted source and stored on the device in BRAMs. The decryption implementation, as can be observed, uses more hardware resources than other works. This can be justified by the fact that it does not share common computation blocks with the encryption implementation, as in other works, but is implemented autonomously. This approach maximizes parallelism between encryption and decryption, since both operations work independently and in parallel. As a result, a lot of memory is needed to store the ciphertexts for decryption (the BRAMs cannot be reused between encryption and decryption), and additional components, such as DSPs and adders, are used to support this parallelism, which has an impact on the needed hardware resources.
V. CONCLUSIONS & OUTLOOK
In this paper we presented a cryptographic scheme that could be applied on satellites, learning with errors cryptography. The scheme was implemented in hardware in a way that matches the high speed needs of satellite communication. As shown by the implementation results, the proposed implementation is the fastest among the compared works. However, the high speed of the cryptographic process had an impact on the employed hardware resources, which are increased compared to other works. Although there is great progress in the field of telecommunications security, further research is still needed in view of the post-quantum era. Further investigation is needed on how cryptographic schemes could consume fewer resources while remaining time efficient [10], [15-16], for a great number of current and future applications such as Ubiquitous Computing, Mobile Computing and the Internet of Things [18], in addition to satellite communication systems.
REFERENCES
[1] R. Overbeck, N. Sendrier, “Code-based cryptography”, In Post-Quantum Cryptography, pp 95-146, 2009.
[2] J. Buchmann, E. Dahmen, M. Szydlo, “Hash-based Digital Signature Schemes”, In Post-Quantum Cryptography, pp 35-94, 2009.
[3] J. Ding, B. Yang, “Multivariate Public Key Cryptography”, In Post-Quantum Cryptography, pp 193-242, 2009.
[4] L. De Feo, D. Jao, J. Plût, “Towards Quantum-Resistant Cryptosystems from Supersingular Elliptic Curve Isogenies”, Journal of Mathematical Cryptology 8(3), pp 209-247, 2014.
[5] D. Micciancio, O. Regev, “Lattice-based Cryptography”, In Post-Quantum Cryptography, pp 147-192, 2009.
[6] J. Howe, C. Moore, M. O’Neill, F. Regazzoni, T. Güneysu and K. Beeden, “Lattice-based Encryption Over Standard Lattices in Hardware”, DAC '16, proceedings of the 53rd Annual Design Automation Conference (162), Austin, 2016.
[7] JSAT International, “Satellite Components”, 2018.
[8] W. Stallings, “Cryptography and Network Security”, 6th edition, Upper Saddle River, Pearson, ISBN: 0133354695, 2014.
[9] P. Banu, “Satellite On-Board Encryption”, Ph.D thesis, School of Electronics and Physical Sciences, University of Surrey, Guildford, UK, 2007.
[10] N. Sklavos, “On the Hardware Implementation Cost of Crypto-Processors Architectures”, Information Systems Security, The official journal of (ISC)2, A Taylor & Francis Group Publication, Vol. 19, Issue: 2, pp 53-60, 2010.
[11] M. Ajtai, “Generating hard instances of lattice problems (extended abstract)”, in Proceedings of the twenty-eighth annual ACM symposium on Theory of computing - STOC ’96, pp 99-108, 1996.
[12] O. Regev, “On Lattices, Learning with Errors, Random Linear Codes, and Cryptography”, Journal of the ACM (JACM) 56 (6), 2009.
[13] R. Lindner and C. Peikert, “Better key sizes (and attacks) for LWE-based encryption”, In CT-RSA, pp 319-339, 2011.
[14] M. Albrecht, “Discrete Gaussian samplers over lattices”, 2014.
[15] N. Sklavos, I. D. Zaharakis, A. Kameas, A. Kalapodi, “Security & Trusted Devices in the Context of Internet of Things (IoT)”, Euromicro Conference on Digital System Design (DSD), Vienna, Austria, 2017.
[16] N. Sklavos, I. D. Zaharakis, “Cryptography and Security in Internet of Things (IoTs): Models, Schemes, and Implementations”, IEEE proceedings of the 8th IFIP International Conference on New Technologies, Mobility and Security (NTMS’16), Larnaca, Cyprus, November 21-23, 2016.
[17] T. Pöppelmann and T. Güneysu, “Area optimization of lightweight lattice-based encryption on reconfigurable hardware”, In ISCAS, pp 2796-2799, 2014.
[18] I. D. Zaharakis, N. Sklavos, A. Kameas, “Exploiting Ubiquitous Computing, Mobile Computing and the Internet of Things to Promote Science Education”, IEEE proceedings of the 8th IFIP International Conference on New Technologies, Mobility and Security (NTMS’16), Larnaca, Cyprus, November 21-23, 2016.

TABLE 1. RESOURCE CONSUMPTION AND PERFORMANCE OF LWE CRYPTOGRAPHIC SCHEMES
Operation & Algorithm     Device     LUT/FF/Slices        BRAM/DSP   MHz   Cycles
LWE Encrypt (this work)   XC6SLX45   353 / 192 / 275      79 / 3     190   32768
LWE Decrypt (this work)   XC6SLX45   218 / 107 / 169      27 / 2     189   16384
LWE Encrypt [6]           S6LX45     6152 / 4804 / 1811   73 / 1     125   98304
LWE Decrypt [6]           S6LX45     63 / 58 / 32         13 / 1     144   32768
LWE Encrypt [17]          S6         282 / 238 / 95       2 / 1      144   136212
LWE Decrypt [17]          S6         94 / 87 / 32         1 / 1      189   66338
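The cycle counts and clock frequencies of Table 1 can be combined into a latency per operation, which makes the speed comparison of Section IV concrete. The sketch below only performs that arithmetic; the helper name `latency_us` and the row labels are hypothetical, the pairing of each frequency with each row follows the order of the table, and the labels [6] and [17] follow the works named in the comparison text.

```python
# Rows of Table 1 as (label, maximum frequency in MHz, clock cycles).
TABLE_1 = [
    ("LWE Encrypt (this work)", 190, 32768),
    ("LWE Decrypt (this work)", 189, 16384),
    ("LWE Encrypt [6]", 125, 98304),
    ("LWE Decrypt [6]", 144, 32768),
    ("LWE Encrypt [17]", 144, 136212),
    ("LWE Decrypt [17]", 189, 66338),
]


def latency_us(cycles, mhz):
    """Latency in microseconds: cycles divided by cycles per microsecond."""
    return cycles / mhz


for label, mhz, cycles in TABLE_1:
    print(f"{label:26s} {latency_us(cycles, mhz):8.2f} us")

# Ratio of cycle counts between [6] and this work for encryption.
print(98304 / 32768)
```

The ratios quoted in Section IV refer to cycle counts. Because the proposed design also reaches a higher clock frequency than [6], the ratio in absolute time at the maximum frequency would be lower still, assuming each design runs at the frequency listed for it.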