Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Shemal Shroff Shoaib Bhuria Yash Naik Peter Hall Introduction to Security Relevance to FPGA Design and Manufacture flow for an FPGA Things to secure and why? Types of Attack Prevention PUFs Provisions and policies adopted by a network administrator To prevent and monitor: Unauthorized access, Misuse, Modification, Denial of a computer network and network-accessible resources. Simmonds, A; Sandilands, P; van Ekert, L (2004). "An Ontology for Network Security Attacks". Lecture Notes in Computer Science. Lecture Notes in Computer Science 3285: 317–323 Research on “FPGA Security” has been active since the early 2000s. Several commercial and military applications employ programmable logic. This makes design security important for safety and national security. WP365, Solving Today’s Design Security Concerns, Xilinx White Paper. To learn the confidential cryptographic key. One-to-one copy or “cloning” together with its key. Reverse engineering of encryption algorithm. Execute certain cryptographic operation with presumably secret key. E.g. pay-tv and in-government communications Thomas Wollinger and Christoff Paar, Security Aspects of FPGAs in Cryptographic Applications in New Algorithms, Architectures and Applications for Reconfigurable Computing, Springer, 2005, Ch. 21, pp 265-278. Saar Drimer, Volatile FPGA Design Security – A Survey, v0.96, April 2008. Figure: Simplified depiction of the FPGA design, manufacturing, packaging, and testing processes. Saar Drimer, Volatile FPGA Design Security – A Survey, v0.96, April 2008. Figure: Development, manufacturing, and distribution of an FPGA-based system. Saar Drimer, Volatile FPGA Design Security – A Survey, v0.96, April 2008. B. Dipert. Cunning circuits confound crooks. http://www.e-insite.net/ednmag/contents/images/21df2.pdf, October 12 2000. Bitstream Configuration of the device Bitstream has all the configuration bits required for programming the FPGA. If the bitstream is compromised then your design can be cloned or reverse engineered. To protect the logic of FPGA To prevent manipulation of design using JTAG. Single Event Upset (SEU) or faults Verify that the application is trusted to be correct. Authenticate the application. Black box Attack Reverse engineering Bitstream Cloning of sRAM FPGAs Readback Attack Side Channel Attack Attacks Fault injection Hardware virus Configuration of the device Manipulating design through JTAG Voltage modification Temperature 1. Black Box Attack 2. Reverse-Engineering of the Bitstreams 3. Cloning of sRAM FPGAs 4. Readback Attack 5. Side Channel Attacks Step 1: The attacker inputs all possible combinations, while saving the corresponding outputs. Step 2: Develops a K-map to simplify the resulting tables Step 3: Extracts the logic of the FPGA. Thomas Wollinger and Christoff Paar, Security Aspects of FPGAs in Cryptographic Applications in New Algorithms, Architectures and Applications for Reconfigurable Computing, Springer, 2005, Ch. 21, pp 265-278. A C AB 00 01 11 10 B C Output (Y) 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 1 1 0 0 0 0 0 1 1 0 1 0 0 0 1 0 1 0 1 1 0 0 1 1 1 0 Y = (A.B)’.B.C’ = A’BC’ Not a real threat nowadays, due to: complexity of the designs size of state-of-the-art FPGAs. Common I/O pins which makes it difficult to connect to the right pin. An attacker has to connect to device’s pin of a known function like, Microprocessor interrupt input, And also, Figure out whether to: Drive a pin with a voltage, Sense its output state, or both isn’t a straightforward exercise. B. Dipert. Cunning circuits confound crooks. http://www.e-insite.net/ednmag/contents/images/21df2.pdf, October 12 2000. Thomas Wollinger and Christoff Paar, Security Aspects of FPGAs in Cryptographic Applications in New Algorithms, Architectures and Applications for Reconfigurable Computing, Springer, 2005, Ch. 21, pp 265-278. A = 32 bits B = 32 bits Adder Output We have, in total, 264 input combinations. Lets assume that latency for the adder is 10 ns. Therefore, time to apply all the combinations is 264 x10 ns. This takes approximately 5849 years which is equivalent to 5.849 x 1011 hours. Reconstructing the original circuit details Altering the design Incorporating it in other designs Reverse Engineering Saar Drimer, Volatile FPGA Design Security – A Survey, v0.96, April 2008. Thomas Wollinger and Christoff Paar, Security Aspects of FPGAs in Cryptographic Applications in New Algorithms, Architectures and Applications for Reconfigurable Computing, Springer, 2005, Ch. 21, pp 265-278. These are the toughest to crack.Why? Increase in gate counts w.r.t number of I/O pins Antifuse Encryption PUFs B. Dipert. Cunning circuits confound crooks. http://www.einsite.net/ednmag/contents/images/21df2.pdf, October 12 2000. Security implications of storing data unprotected and external to FPGA Non-volatile memory Transmitted during power up Vulnerability = can be easily eavesdropped Feasible Thomas Wollinger and Christoff Paar, Security Aspects of FPGAs in Cryptographic Applications in New Algorithms, Architectures and Applications for Reconfigurable Computing, Springer, 2005, Ch. 21, pp 265-278. Non-volatile + FPGA on one chip Battery-Backed RAM eFUSE Device DNA Encryption PUFs Thomas Wollinger and Christoff Paar, Security Aspects of FPGAs in Cryptographic Applications in New Algorithms, Architectures and Applications for Reconfigurable Computing, Springer, 2005, Ch. 21, pp 265-278. Battery-Backed RAM 256-bit key stored in volatile on-chip memory cells. Must receive continuous power from the external battery. eFUSE securely store bitstream decryption key. No BB-RAM and external battery. The OTP eFUSE links are permanently programmed. No need battery backup. Device DNA Virtex-6 has embedded, unique device identifier (Device DNA). unique 57-bit identifier is nonvolatile and permanently programmed Present in all FPGAs. For easy debugging. Read the configuration of FPGA through JTAG. Thomas Wollinger and Christoff Paar, Security Aspects of FPGAs in Cryptographic Applications in New Algorithms, Architectures and Applications for Reconfigurable Computing, Springer, 2005, Ch. 21, pp 265-278. A security bit can be used to prevent the readback functionality. Although, fault injection has proven successful to overcome these countermeasures in FPGA. PUFs side channel can leak important information. Side channel can be: power consumption Light Electromagnetic radiation. Power analysis of bitstream A. Bogdanov, A. Moradi et. Al, efficient and side-channel resistant authenticated encryption of FPGA Bitstreams, International Conference on Reconfigurable computing and FPGAs, 2012. Thomas Wollinger and Christoff Paar, Security Aspects of FPGAs in Cryptographic Applications in New Algorithms, Architectures and Applications for Reconfigurable Computing, Springer, 2005, Ch. 21, pp 265-278. Magnetic field surrounding FPGA Loop antenna to pick variations of field 160 bit EC point Multiplication Prior info of Encryption is must Power trace from an RSA operation Uses standard square and multiply Square and multiply operations have visibly different power profiles ‘1’ relates to squaring step followed by a multiplication step ‘0’ in the exponent involves only a squaring step CMOS transistors emit photons. Electrons gain energy when current flows. Emission energy is much higher for transition 0->1 than 1->0 To observe the light emitted, the chip needs to be opened either from its backside or front side, depending on its package type. Photons collected by high sensitivity photon sensor. InGaAs detectors have best quantum efficiency. J.Di. Battista, J. Courrege, B. Rouzeyre, L. Torres and P. Perdu, “When Failure Analysis meets Side-Channel Attacks”, CHES 2010, IACR, Santa Barbara, California, USA. First the light emission activity is localized by turning the cryptoprocessor is on/off. It is not necessary to know either the architecture of the algorithm, or its implementation. This technique is now less used because of the increasing number of metal layers which act as a light screen. There are two kinds of countermeasures: Hardware and software Software countermeasures refer to algorithmic changes, such as masking of secret keys with random values, More Complex Algorithms Hardware countermeasures often deal either with some form of power trace smoothing or with transistor-level changes of the logic. This technique is now less used because of the increasing number of metal layers which act as a light screen. Thomas Wollinger and Christoff Paar, Security Aspects of FPGAs in Cryptographic Applications in New Algorithms, Architectures and Applications for Reconfigurable Computing, Springer, 2005, Ch. 21, pp 265-278. Temperature Modification Voltage Modification Fault Injection/Single Event Upsets Hardware Virus Manipulating design through JTAG Modify operating voltages or temperatures of FPGA. Causes unintended behavior. Can be used to extract data or bypass certain security features. Monitor and correctly respond to fluctuations in the operating temperature and voltage. Virtex-6 FPGA System Monitor (SYSMON) CRC circuitry Zeroization of Device Thomas Wollinger and Christoff Paar, Security Aspects of FPGAs in Cryptographic Applications in New Algorithms, Architectures and Applications for Reconfigurable Computing, Springer, 2005, Ch. 21, pp 265-278. Hardware virus or a hardware Trojan Kill switch: thinning certain crucial wires, so that electro migration eventually eliminates part of a wire, creating an open connection Manipulating the design through JTAG Disable write feature in JTAG Don’t download untrusted designs. Known as Physical Uncloneable Function. Physical entity easy to manufacture but difficult to clone. PUFs implement a challenge- response authentication. Unpredictable response. This is because of the physical factors. PUFs generate different outputs for same inputs. Also, they can generate same outputs for different inputs. This randomness is due to the Challenge-Response Pairs. Ideal for cryptographic applications Arbiter PUFs Based on MUXes and Arbiter Ring Oscillator or RO-PUF Based on Delay Circuit and Counters Note: RO PUFs are more suitable for ASICs and FPGAs. Therefore, we will concentrate on it. Consists of N oscillators circuits. Each Oscillator has a unique frequency. At any instance two oscillators are picked by the MUXes. Every counter will counter number of cycles. Output will be 0 or 1 depending on counter values. Sensitive to temperature variations Limited number of Outputs Limited number of Challenge Response Pairs It’s not over yet… Reconfigurable hardware offers advantages for cryptographic applications. Algorithm flexibility Algorithm distribution Algorithm modification Resource efficiency Architecture efficiency Source: A. Coman, R. Frățilă. “Cryptographic Applications using FPGA Technology.” SECITC 2010. Algorithm flexibility Security protocols such as SSL or IPSec are algorithm agnostic. Encryption algorithm chosen when setting up session. A configuration controller can set up hardware implementation of algorithm for each session. Source: A. Dandalis, V. K. Prasanna. “An Adaptive Cryptographic Engine for IPSec Architectures.” FCCM. April 2000. Algorithm Distribution Infeasible to upgrade algorithm in ASIC deployment without new hardware. Reconfigurable hardware in field can update algorithm or add algorithm to protocol. Source: A. Coman, R. Frățilă. “Cryptographic Applications using FPGA Technology.” SECITC 2010. Algorithm Modification Adapting preexisting algorithm to application. Standard library algorithm is not sufficient. No off-the-shelf ASIC will implement your custom algorithm. Passwd: DES 25 times in a row + salt. Resource Efficiency With security protocols, public-key is used to send private session key. Unrelated symmetric-key session uses the transferred key. Run-time reconfiguration to repurpose resources. Architecture Efficiency Synthesize encrypt/decrypt hardware given the key. General purpose multipliers use more area. Fixed key for current configuration → Constant multiplier. Can be implemented in LUT as times-table. Non-constant multiplicand is input to LUT. http://www.andraka.com/multipli.htm Architecture Efficiency Field arithmetic on GF(2n) used for both block and stream ciphers. For example: GF(23) = {(000), (001), (010), (011), (100), (101), (110), (111)} Squaring is a common operation. Depending on basis used: Simply a bit shift. Simply in insertion of interlacing zeroes. For example, in GF(27) (1101)2 = (1010001) because (x3 + x2 + 1)2 = x6 + x4 + 1 No nice algorithm for interlacing zeroes between bits. Reconfigurable hardware – generate squaring register, depending on what field will be used for algorithm. Must perform FOIL of the polynomials. Followed by modulo reduction the primitive polynomial. General algorithm for GF(2w). int galois_shift_multiply(int x, int y, int w){ int prod; int i, j, ind; int k; int scratch[33]; prod = 0; for (i = 0; i < w; i++) { scratch[i] = y; if (y & (1 << (w-1))) { y = y << 1; y = (y ^ prim_poly[w]) & nwm1[w]; } else { y = y << 1; } } for (i = 0; i < w; i++) { ind = (1 << i); if (ind & x) { j = 1; for (k = 0; k < w; k++) { prod = prod ^ (j & scratch[i]); j = (j << 1); } } } return prod; } Galois.tar - Fast Galois Field Arithmetic Library in C/C++Copyright (C) 2007 James S. Plank Problems in software: No ISA-level GF(2n) multiplication instruction. Bit operations are required to examine one bit at-a-time. Need to perform modular reduction Computational bottleneck, especially if n is unknown at runtime. Must loop and XOR. A. Mahboob, N. Ikram. “A New Multiplication Technique for GF(2m) with Cryptographic Significance.” WISA 2004. Many hardware implementations to parallelize polynomial-basis finitefield multiplication. For GF(24). Came from general derivation for GF(2n). Reconfigurable hardware can synthesize multipliers needed, given fields to use. A. Reyhani-Masoleh, M. A. Hasan. “Low Complexity Bit Parallel Architectures for Polynomial Basis Multiplication over GF(2m).” IEEE Trans on Computers, August 2004. Block Cyphers Like AES, DES Stream Cyphers Acterbahn,A5/1 They divide the complete plaintext in to blocks . They work on symmetric key Cryptography The key used is same for all blocks. The number of bits per block is equal to number of bits in the key They require more memory and more resources than stream Cyphers. A one-time pad is theoretically unbreakable. Cryptanalysis tools have nothing to go on. Key as long as message. Key generated by true random process, such as particle radiation. Stream cyphers approach a one-time pad. Generate cyphertext with a keystream. Key as long as the plain-text message, or a very long period. Note: Stream Cyphers are ideal for FPGAs. Can be easily implemented with Feedback Shift Register. Small hardware footprint. Just XOR keystream with plain-text. Decryption is same. Recurrence relation: ak+4 = ak + ak+1, k≥0 Keystream exits at right. Characteristic poly: p(x) = 1 + x3 + x4 p(x) cannot be factored Only one cycle Source: https://en.wikipedia.org/wiki/Linear_feedback_shift_register Linear complexity can also be defined as the shortest LFSR that can generate the sequence. The feedback function of an LFSR consists only of XOR logic. The taps chosen give rise to a characteristic polynomial. Roots of this polynomial’s factors are in the cycles of states the register can take on. If polynomial is irreducible, only one cycle (can’t factor it!) If polynomial is primitive, all states are roots, so LFSR generates a maximum length sequence. LFSR/polynomial is non-singular, so all states are reachable. A non-linear feedback shift-register has more than addition operation in recurrence relation. Feedback logic may be XOR, AND and/or finite-field arithmetic. An NLFSR from our project: Source: K. Mandal, G. Gong. ”Probabilistic Generation of Good Span n Sequences from Nonlinear Feedback Shift Registers”, CACR Tech Report. 2012. A de Bruijn sequence has high linear complexity Defined as a sequence where each subsequence of a given length occurs only once. Primitive LFSR seen earlier generated a de Bruijn sequence Linear complexity was 4. Implementing for the first time a stream cypher based on “Cryptographically Strong de Bruijn Sequences with Large Periods” by Mandal and Gong. Presented at SAC 2012 conference. Using WGT as a kernel NLFSR. By changing length of NLFSR, and which kernel polynomial is used for WGT, can change period of keystream generated! Ideal for reconfig: synthesize different stream cypher given length of message to be encrypted. With 40-bit register, obtain keystream with: Period: 240 Linear Complexity: > 216 >> 40 if using a LFSR.