
Advanced Embedded Systems
Lecture 5: Embedded Systems Hardware

A classical design information flow for complex embedded systems (ESs) is shown in the figure.

Hardware for ESs is less standardized than hardware for personal computers. However, some hardware components are frequently used in ESs: keys, sensors, microcontrollers, DSPs, LCDs, LEDs, seven-segment displays, serial memories etc. Communication is mostly implemented through serial interfaces: RS232, I2C, CAN etc.

The figure shows a classical structure for an ES used in control applications.

It illustrates the reactive nature of an ES: it reads and monitors the external environment and executes an external operation based on the data read. There are variations of this scheme imposed by different components. For example, there are sensors which deliver digital data (in serial form), information processing units which include A/D converters, and execution elements (actuators) requiring digital data (also in serial form).

Sensors

There are sensors for almost every physical quantity.

Acceleration sensors: the sensor contains a small mass in its center; when accelerated, the mass is displaced from its initial position and changes the resistance of the tiny wires connected to it.

Rain sensors: the automotive industry has become an important application area for rain sensors; many cars contain them, and they set the speed of the wipers according to the amount of rain.

Artificial eyes: the application areas are robotics and medicine.
Medicine: a small camera is attached to glasses; it is connected to a computer which translates the patterns into electrical pulses; these pulses are sent directly to the brain through electrodes; the resolution obtained (2003) is in the order of 128 x 128 pixels, enabling a blind person to drive a car in controlled areas.
Robotics: cameras connected to computers.
Image sensors: there are two types, charge-coupled devices (CCDs) and CMOS sensors; in both cases, arrays of light sensors are used.
The architecture of CMOS sensor arrays is similar to that of standard memories: individual pixels can be randomly addressed and read at an array boundary.
CMOS sensors are made in CMOS technology, so they can be integrated on the same chip with the processing unit; they are smart sensors.
CMOS sensors require a single supply voltage and are easy to interface, so they are cheap.
In contrast, CCD sensors are suited to high-quality, expensive optical applications (video cameras, optical telescopes).
In CCD technology, charges have to be transferred from one pixel to the next until they can finally be read.
Images generated with CCDs have a low level of noise, so they are of higher quality than those generated by CMOS sensors, but interfacing is more complex, leading to higher costs.

Biometric sensors: they are used for security, more precisely for authentication.
Classical password-based authentication is limited.
Biometric authentication tries to identify a person by scanning parts of the body: face, iris, fingerprint.
Fingerprint sensors are fabricated in CMOS technology, and face recognition can be done with image sensors.
The hit rate is lower than in password-based authentication.

Proximity sensors: they indicate how close two moving objects are; one application area is cars with proximity sensors that help the driver to park in small spaces.

Wireless sensors: they include on the same chip the sensor, the processing unit and an interface for wireless communication.
They are connected in networks.
Low power consumption is mandatory.
There are many application areas: meteorology, medicine, smart houses, surveillance and tracking etc.

A/D converters

Information processing units work with digital values; if sensors deliver analog values, these must be converted using A/D converters.

First, the analog voltage must be sampled and held. The transistor operates like a switch: each time the switch is closed by the clock, the capacitor is charged to a value equal to the incoming voltage Ve; after the switch opens, the voltage remains essentially the same until the switch is closed again.
Each of the values stored on the capacitor can be considered an element of a discrete sequence of values Vx, obtained from the analog signal Ve; the values Vx are then converted to digital form.

A/D converters come in two forms: stand-alone circuits and converters included in other circuits.

Flash A/D converter:
Each comparator has 2 inputs, denoted + and -; if V+ > V-, the output is 1, and 0 otherwise.
In the A/D converter, all - inputs are connected to a voltage divider, and the input voltage Vx is applied to the + inputs.
If the input voltage Vx > Vref, the comparator at the top generates a 1; the encoder identifies the most significant 1 and encodes the case Vx > Vref as the largest output value.
If the input voltage Vx < Vref but still > 3/4 Vref, the comparator at the top generates a 0 and the next comparator generates a 1; the encoder encodes this case as the second largest value.
Similarly, the cases 2/4 Vref < Vx < 3/4 Vref, 1/4 Vref < Vx < 2/4 Vref and 0 < Vx < 1/4 Vref are encoded as the third largest, the fourth largest and the smallest values, respectively.
Advantage: high speed, no clock; it can be used in high-speed video applications.
Disadvantage: hardware complexity; n - 1 comparators are needed to distinguish between n values.

Successive approximation:
It is based on binary search; for that, a successive approximation register is necessary.
Initially, the most significant output bit of the successive
approximation register is set to 1 and all other bits are set to 0. This digital value is converted to an analog value corresponding to 0.5 x the maximum input voltage. If Vx is greater than the generated analog value, the most significant bit is kept at 1; otherwise it is reset to 0.
The same process is applied to the next bit: it remains 1 if the input value is within the second or the fourth quarter of the input value range, and it is reset otherwise.
The process is then repeated for all remaining bits of the approximation register.
Advantage: hardware efficiency; to distinguish n digital values, log2 n bits are needed in the approximation register and the D/A converter.
Disadvantage: low speed, since the number of steps needed grows with log2 n.
It is appropriate for applications where high-precision conversions at moderate speeds are required, for example audio applications.

Communication

Communication takes place over communication media (wireless, wires, optical etc.) through abstract entities called channels.

Communication requirements:

Real-time behavior: very important and must be taken into account from the design phase; some low-cost solutions (e.g. Ethernet) are not appropriate.

Efficiency: communication media can be quite expensive; for example, point-to-point connections in large buildings are a very expensive solution; the situation is worse if separate wires are provided for control, data and addresses; with separate wires it is almost impossible to add new modules; the weight of the wires must also be considered, for example
in cars; the most efficient solution is the bus.

Appropriate bandwidth and communication delay: the bandwidth requirements of ESs vary with the requirements of the application; high bandwidth means high cost, so it is important to provide only the necessary bandwidth.

Support for event-driven communication: communication with the external environment can be done by polling the sources or by interrupts; polling is simpler but the delay may be too large; interrupts are appropriate for event-oriented communication but they require specific software, and possibly hardware, support.

Robustness: reliable communication must be maintained even in harsh conditions: a wide temperature range (-20 °C to +180 °C in cars), proximity to major sources of electromagnetic radiation, mechanical vibrations, major light sources etc.; voltage levels and clock frequency may be affected.

Fault tolerance: ESs should keep working even after faults occur; classical solutions, such as the restarts used in general-purpose computers, cannot be accepted; retries are frequently used after communication errors, but retries may violate real-time requirements.

Maintainability, diagnosability: this concerns the possibility of repairing ESs within a reasonable time.

Privacy: solutions must be found for ensuring the privacy of confidential information.

Electrical robustness

There are two options: single-ended signaling and differential signaling.

Single-ended signaling:
Signals are represented by voltages relative to ground.
A single ground wire is sufficient for a certain number of signals.
It is susceptible to external noise (for example
from a motor which is switched on).
It is difficult to establish high-quality common ground signals between a large number of systems, due to the resistance and inductance of the ground wires.

Differential signaling:
Each signal is transferred on two wires, which are twisted.
If the voltage on the first wire is greater than the voltage on the second wire, the signal is decoded as a logical 1, otherwise as a logical 0.
Signals do not generate any currents on the ground wires, hence the quality of the ground wires becomes less important.
Noise is added to the two wires in the same way and the comparator removes it; that is why differential signaling allows much longer transmission distances (for example 1200 m in a differential serial interface compared with 30 m in an RS232 single-ended serial interface).
The logic value depends only on the polarity of the voltage between the two wires; the magnitude of the voltage can be affected by reflections or by the resistance of the wires, but the decoded value is not affected.
No common ground is necessary, hence there is no need to establish high-quality ground wires between a large number of communicating systems; this reduces cost.
Differential signaling allows a larger throughput than single-ended signaling.
Disadvantage: two wires are needed per signal, which dramatically increases the number of wires; the electronics for sending and receiving signals are also more complex.

Guaranteeing real-time behavior

Communication on buses such as Ethernet is affected by collisions, which can compromise real-time behavior.

Carrier-sense multiple access/collision detect (CSMA/CD) method: if a collision occurs, the systems must stop, wait for some time and retry; the waiting time is chosen randomly, and there may be
a new collision at retry; collisions can repeat a number of times, wasting time; this method is not appropriate when real-time constraints exist.

Carrier-sense multiple access/collision avoidance (CSMA/CA) method: collisions are avoided; priorities are assigned to the partners and the communication medium is allocated to partners during arbitration phases; when a system wants to communicate, it must wait for an arbitration phase and indicate its intention; if a system with higher priority also wants to communicate, the first system has to withdraw its indication and wait for another arbitration phase.
CSMA/CA guarantees predictable real-time behavior for the system with the highest priority, provided there is an upper bound on the time between arbitration phases; for the other systems, real-time behavior can be guaranteed only if the higher-priority partners do not access the medium continuously.

Processing units

Only a few types of processing units are appropriate for ESs: ASICs, reconfigurable logic and processors.
Efficiency, measured in operations per watt, is highest for ASICs; it is about one order of magnitude lower for reconfigurable logic and about two orders of magnitude lower for processors.
Flexibility is highest for processors, lower for reconfigurable logic and very low for ASICs; the flexibility of processors comes from their programmability.
Minimizing power and energy consumption is important.
Power consumption influences the size of the power supply, the design of the voltage regulators, the dimensions of the interconnections and the cooling.
Minimizing energy consumption is important especially in mobile applications, since battery technology is improving only slowly.
Energy consumption also affects reliability, since the lifetime of electronic circuits decreases at high temperatures.
The energy for a certain application is closely related to the power
required per operation, since energy is the integral of power over time: $E = \int P \, dt$.
According to this relationship, reducing the power consumption also reduces the energy consumption, but this is not always true: in some cases a slightly higher power consumption leads to a large reduction in execution time, resulting in lower energy.

Application-Specific Integrated Circuits (ASICs)

They ensure high performance (high speed, energy efficiency) but at high cost (for the masks of the chips); a price in the order of $10^5$ euros for a mask is quite common.
They are appropriate if the market accepts the costs or if the market is large.

Reconfigurable logic

It represents a compromise between the high costs of ASICs and the low speed and high energy consumption of processors.
The function it executes can be changed using configuration data.
Application areas: fast prototyping (in experimental phases) and low-volume applications.
Reconfigurable logic usually includes RAM to store configurations; since RAM is volatile, ROM or Flash memories are necessary for loading the configuration data into RAM at power-up.
Field Programmable Gate Arrays (FPGAs) are the most common form of reconfigurable logic; they consist of arrays of processing elements which can be programmed after fabrication.

Example: the Xilinx Virtex-II:
It contains up to 112 x 104 configurable logic blocks (CLBs) interconnected through a programmable interconnect structure.
It also contains up to 1108 input/output connections and special clock processing.
It also contains 168 18x18 bit multipliers and 3024 kbits of RAM.
Each CLB consists of 4 so-called slices.
Each slice contains two 16-bit memories, F and G; these memories can be used as look-up tables (LUTs) for implementing all $2^{16}$ Boolean functions of 4 variables.
Using multiplexers (MUXF5, MUXFx), several of these memories can also
be combined to create LUTs for up to 8 variables.
They can also serve as ordinary RAM or as shift registers (SRLs).
Each slice also includes two output registers and some special logic (ORCY, CY) for additions.
Configuration data determines the setting of the multiplexers, the clocking of the registers, the content of the RAM and the connections between CLBs.
Typically, the configuration data is generated from a high-level description of the functionality of the hardware, for example in VHDL.
Integration of reconfigurable logic with processors is possible.

Processors

Key advantage: flexibility.
Types: microcontrollers, DSPs, microprocessors.
Main requirement: efficiency, which has several aspects.

Energy efficiency:
Architectures have to be optimized for energy efficiency, and care must be taken not to lose efficiency in the software generation process; for example, compilers generating 50% overhead in terms of number of cycles are not desirable.
Energy efficiency must be considered from the design of the instruction set to the design of the manufacturing process.
Techniques for making processors energy efficient: gated clocking, dynamic power management, dynamic voltage scaling.

Gated clocking: parts of the processor are decoupled from the clock during idle periods.

Dynamic power management: processors have several low power modes in addition to the standard mode; each low power mode has a different power consumption and a different time for the transition into the normal operating mode. The figure
shows an example.
The greater the power saving in a low power mode, the fewer operations the processor can perform in that mode.

Dynamic voltage scaling: the energy consumption of CMOS processors increases quadratically with the supply voltage Vdd.
The power consumption of CMOS circuits is
$P = \alpha \, C_L \, V_{dd}^2 \, f$
where $\alpha$ is the switching activity, $C_L$ is the load capacitance, $V_{dd}$ is the supply voltage and $f$ is the clock frequency.
The delay of CMOS circuits is described by the relation
$\tau = k \, C_L \, \frac{V_{dd}}{(V_{dd} - V_t)^2}$
where $k$ is a constant and $V_t$ is the threshold voltage.
Vt has an impact on the transistor input voltage required to switch the transistor on; for example, for a maximum supply voltage of 3.3 V, Vt may be in the order of 0.8 V; consequently, the maximum clock frequency is a function of the supply voltage.
However, decreasing the supply voltage reduces the power quadratically, while the speed decreases only linearly.
Example: the Crusoe processor has 32 voltage levels between 1.1 V and 1.6 V, and the clock can be varied between 200 MHz and 700 MHz in increments of 33 MHz; the transition from one voltage/frequency pair to another requires about 20 ms.

Code-size efficiency: the capacity of the internal memory is limited and, typically, there is no external memory, so the code size must be minimized.
CISC machines are more efficient in code size than RISC machines; RISC machines are faster.

Compression techniques:
They reduce both the area of the memory on the chip and the energy necessary to fetch the instructions.
Due to the reduced bandwidth requirements, fetching can also be faster.
A decoder is necessary between the processor and the instruction memory for recreating the original instructions on the fly.
A variation of the compression technique is a second instruction set, as in the ARM processors: the original ARM instruction set is 32 bits wide but there is also a 16-bit wide set, called THUMB; during
execution, THUMB instructions are dynamically converted into full ARM instructions; the disadvantage is the software development cost.

Run-time efficiency: in order to meet time constraints without high clock frequencies, the architecture can be customized to a certain application domain, as in DSPs.

In digital signal processing, digital filtering is a very frequent operation. The next equation describes a digital filter generating an output sequence y = (y0, y1, ...) from an input sequence x = (x0, x1, ...):
$y_i = \sum_{j=0}^{n-1} x_{i-j} \cdot a_j$
An output element $y_i$ corresponds to a weighted average over the last n elements of x and can be computed iteratively using the following equations:
$y_{i,j} = y_{i,j-1} + x_{i-j} \cdot a_j$, where $y_{i,-1} = 0$ and $y_i = y_{i,n-1}$
DSPs are designed such that each iteration can be encoded as a single instruction.

Example: the internal architecture of a DSP.
D and P are two memories, accessed through a special address generation unit, AGU; there are separate units for additions and multiplications, each with its own argument registers, AX, AY, AF, MX, MY and MF; the multiplier is connected to a second adder for computing series of multiplications and additions quickly.
The update of the partial sum is essentially done in a single cycle; for that, the two memories are allocated to hold the two arrays x and a, and address registers are allocated such that the relevant pointers can be easily updated in the AGU; the partial sums $y_{i,j}$ are stored in MR.
The pipelined computation involves registers A1, A2, MX and MY, as in the following implementation of the filter:
MR:=0; A1:=1; A2:=n-2; MX:=x[n-1]; MY:=a[0];
for (j=1; j<=n; j++) {MR:=MR + MX * MY; MX:=x[A2]; MY:=a[A1]; A1++; A2--}

A single instruction encodes the loop body, comprising the following operations:
Reading two arguments from argument registers MX and MY, multiplying them and adding the
product to register MR, which stores the values $y_{i,j}$.
Fetching the next elements of arrays a and x from memories P and D and storing them in argument registers MX and MY.
Updating the pointers to the next arguments, stored in address registers A1 and A2.
Testing for the end of the loop.
This way, each iteration requires only one instruction, because several operations are done in parallel; this allows relatively low clock frequencies.
The registers in this architecture perform different functions; they are said to be heterogeneous; heterogeneous register files are a common characteristic of DSPs.
In order to avoid extra cycles for testing for the end of the loop, zero-overhead loop instructions exist in DSPs; with them, a single instruction or a small number of instructions can be executed a fixed number of times.

Microcontrollers: the classical 8051:
It is the core of a large family of 8-bit microcontrollers.
CMOS technology.
It includes 4 Kbytes of ROM and 128 bytes of RAM.
It includes an ALU and a Boolean processor.
It has 4 I/O ports which can be used as general-purpose ports but also have alternative functions.
It can address up to 64 Kbytes of external program memory and up to 64 Kbytes of external data memory.
It has 2 independent timers.
It includes a full-duplex serial UART.
The instruction set is oriented towards real-time applications.
The interrupt system can manage 5 external and internal sources, with 2 priority levels.
Low consumption: 16 mA in normal mode, 3.7 mA in Idle mode and 50 μA in Power Down mode.

DSPs: other application-oriented features:
Specialized addressing modes: modulo addressing is useful when the addressed elements are in a ring buffer; addresses can be incremented and decremented until the first or last element of the buffer is reached.
Separate address generation units: addresses are stored in dedicated address registers; this allows the indirect
addressing modes, saving machine instructions, cycles and energy.
Saturating arithmetic: it changes the way overflows and underflows are handled; in standard binary arithmetic, wrap-around is used for the values returned after an overflow or underflow; saturating arithmetic returns the representable value closest to the true result; it is used in video and audio applications.
Fixed-point arithmetic: floating-point hardware increases cost and power consumption; consequently, 80% of DSPs do not contain this hardware; however, many of them have support for fixed-point numbers.
Multiple memory banks or memories: this allows both arguments of an operation to be fetched at the same time.
Multiply/accumulate instructions: such an instruction performs multiplications followed by additions.

Memories

To increase run-time and energy efficiency, memory hierarchies should be used; the reason is that large memories require more energy per access and are also slower than small memories.
The gap between processor and memory speeds is increasing; while the speed of memory increases by a factor of about 1.07 per year, processor speed increases by a factor of 1.5 to 2 per year.
Therefore, it is efficient to use small and fast memories as buffers between the main memory and the processor.
The PC solution is the cache memory; however, the hardware checks require additional energy and caches cannot offer predictable real-time performance.
The alternative solution: small memories can be mapped into the address space; they are called scratch pad memories.
The compiler should allocate frequently used variables and instructions to that address space; no checking is necessary.
As a result, the energy per access is reduced.
The figure
compares the energy per access of cache memories and scratch pad memories.

Output devices

Displays: LEDs, LCDs, seven-segment displays, small touchscreens.
Electro-mechanical devices: they act on the environment through electrical motors, transforming rotation into movement.
They are generally called actuators, and there is a large spectrum of them, from very tiny ones in the μm range (a challenging application area is the human body) to very big ones, capable of moving tons of weight.

D/A converters:
The operational amplifier amplifies the voltage difference between its two inputs by a very large factor.
Due to resistor R1, the resulting output voltage is fed back to input -, reducing the input voltage; the differential voltage between the two inputs is reduced to zero, and since input + is connected to ground, the voltage between input - and ground is zero.
The main idea is to generate a current proportional to the value represented by a bit-vector x and to convert this current into a voltage.
Current I is the sum of the currents through the resistors.
The current through a resistor is 0 if the corresponding bit of bit-vector x is 0; if the bit is 1, the current corresponds to the weight of that bit, since the resistor values are chosen accordingly.
The equation for I follows from one of Kirchhoff's laws. Written for an n-bit vector x, with reference voltage $V_{ref}$ and resistors R, 2R, 4R, ... from the most significant bit downwards (the original slide equation is not preserved, so this general form is a reconstruction):
$I = \frac{V_{ref}}{R} \sum_{i=0}^{n-1} x_i \cdot 2^{i-(n-1)}$
The other Kirchhoff's law gives (due to the zero voltage at input -):
$V + R_1 \cdot I' = 0$
The current into the input of the operational amplifier is practically 0, so I = I'; hence:
$V + R_1 \cdot I = 0$
From the first and the last equations we obtain:
$V = -V_{ref} \frac{R_1}{R} \sum_{i=0}^{n-1} x_i \cdot 2^{i-(n-1)} = -V_{ref} \frac{R_1}{2^{n-1} R} \, nat(x)$
Here nat(x) denotes the natural number represented by bit-vector x.
The output voltage is proportional to the value represented by x.
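
The proportionality between the output voltage of the D/A converter and the value of x can be checked numerically. The following is only a minimal sketch under stated assumptions: an ideal operational amplifier, a reference voltage `v_ref`, and binary-weighted resistors `r`, `2*r`, `4*r`, ... starting at the most significant bit. The function names `nat` and `dac_output` and the default component values are illustrative and do not come from the lecture.

```python
def nat(bits):
    """Natural number represented by a bit-vector, most significant bit first."""
    value = 0
    for b in bits:
        value = 2 * value + b
    return value


def dac_output(bits, v_ref=1.0, r=1000.0, r1=1000.0):
    """Output voltage of an idealized binary-weighted D/A converter.

    Assumption: the most significant bit switches a resistor r, the next
    bit 2*r, then 4*r, and so on; the operational amplifier is ideal.
    """
    # Current I: sum of the currents through the resistors whose bit is 1.
    current = sum(b * v_ref / (r * 2 ** k) for k, b in enumerate(bits))
    # Ideal operational amplifier with feedback resistor R1: V + R1 * I = 0.
    return -r1 * current


if __name__ == "__main__":
    for x in ([0, 0, 0, 1], [0, 1, 0, 0], [1, 1, 1, 1]):
        print(x, nat(x), dac_output(x))
```

With these default values and 4 bits, the output is -nat(x)/8 volts, so doubling the value of x doubles the magnitude of the output voltage.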
