Advanced Embedded Systems
Lecture 5
Embedded Systems
A classical design information flow, for complex ESs, is shown in the figure;
Hardware for ESs is less standardized than hardware for personal
computers;
However, there are hardware components frequently used in ESs:
keys, sensors, microcontrollers, DSPs, LCDs, LEDs, seven-segment
displays, serial memories etc.
Communication is mostly implemented through serial interfaces:
RS232, I2C, CAN etc.
Fig. shows a classical structure for an ES used in control
Fig. shows the reactive feature of an ES: it reads and monitors the
external environment and executes an external operation based on
the data read;
There are variations of this scheme imposed by different
For example there are sensors which give digital data (in serial
form), there are information processing units which include A/D
converters and there are execution elements (actuators) requiring
digital data (also in serial form);
There are a lot of sensors for every physical quantity;
Acceleration sensors: such a sensor contains a small mass at its center;
when accelerated, the mass is displaced from its initial position and
changes the resistance of the tiny wires connected to it;
Rain sensors: the automotive industry became an important
application area for rain sensors; a lot of cars contain them,
commanding the speed of the wipers in accordance with the amount
of rain;
Artificial eyes: application areas: robotics and medicine;
Medicine: a little camera is attached to glasses; it is connected to a
computer which translates the patterns into electrical pulses; these
pulses are sent directly to the brain, through electrodes; the
resolution obtained (2003) is on the order of 128 x 128 pixels, enabling a
blind person to drive a car in controlled areas;
Robotics: cameras connected to computers;
Image sensors: two types: charge-coupled devices (CCDs) and
CMOS; in both cases, arrays of light sensors are used;
The architecture of CMOS sensor arrays is similar to that of standard
memories: individual pixels can be randomly addressed and read at an
array boundary;
CMOS sensors are made in CMOS technology and they can be
integrated on the same chip with the processing unit; they are smart
sensors;
CMOS sensors require a single power supply voltage and interfacing is
easy, so they are cheap;
In contrast, CCD sensors are adequate for high quality, expensive
optical applications (video cameras, optical telescopes);
In CCD technology, charges have to be transferred from one pixel to the
next until they can finally be read;
Images generated with CCDs have low level of noise, so they are of
higher quality than those generated by CMOS sensors; but interfacing is
more complex leading to higher costs;
Biometric sensors:
Are used for security, more exactly in authentication;
Classical password based authentication is limited;
Biometric authentication tries to identify a certain person by scanning
parts of their body: face, iris, fingerprint;
Fingerprint sensors are fabricated in CMOS technology and face
recognition can be done with image sensors;
The hit rate is lower than in password based authentication;
Proximity sensors: indicate how close two moving objects are; an
application area: cars with proximity sensors that help the driver
park in small spaces;
Wireless sensors:
Include on the same chip the sensor, the processing unit and an
interface for wireless communications;
Are connected in networks;
Low consumption is mandatory;
Many application areas: meteorology, medicine, smart houses,
surveillance and tracking etc.
A/D converters
Information processing units work with digital values; if sensors give
analog values, they must be converted using A/D converters;
First, the analog voltage must be sampled and held;
The transistor operates like a switch; each time the switch is closed
by the clock, the capacitor is charged to a value equal to the
incoming voltage Ve; after opening the switch, the voltage remains
essentially the same until the switch is closed again;
Each of the values stored on the capacitor can be considered as an
element of a discrete sequence of values Vx, obtained from an
analog signal Ve; the values Vx will be converted in digital form;
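The sample-and-hold step described above can be sketched as follows. This is a minimal idealized model, not a circuit simulation: the function names, the sine-wave input and the clock period are assumptions chosen for illustration, and the capacitor is assumed to hold its value perfectly while the switch is open.

```python
import math

def sample_and_hold(ve, clock_period, duration):
    """Return (t, Vx) pairs: Ve is sampled each time the clock closes the switch."""
    count = int(round(duration / clock_period))
    return [(k * clock_period, ve(k * clock_period)) for k in range(count)]

def held_value(samples, t):
    """Capacitor voltage at time t: the most recent sample (ideal hold)."""
    value = samples[0][1]
    for t_sample, v in samples:
        if t_sample <= t:
            value = v
        else:
            break
    return value

# Hypothetical input Ve: 1 Hz sine around 2.5 V, sampled every 0.125 s
ve = lambda t: 2.5 + 2.5 * math.sin(2 * math.pi * t)
samples = sample_and_hold(ve, 0.125, 1.0)
vx_at_0_3 = held_value(samples, 0.3)  # value captured at t = 0.25 s
```

The list `samples` plays the role of the discrete sequence Vx that is passed on to the A/D converter.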
Two types: independent and included in other circuits;
Flash A/D converter:
Each comparator has 2 inputs, denoted as + and -; if V+ > V-, the output
gives a 1 and 0 otherwise;
In the A/D converter, all inputs are connected to a voltage divider;
If input voltage Vx > Vref, the comparator at the top will generate a 1; the
encoder will identify the most significant 1 and will encode the case Vx >
Vref as the largest output value;
If input voltage Vx < Vref but still > 3/4 Vref, the comparator at the top will
generate a 0 and the next comparator will generate a 1; the encoder will
encode this value as the second largest value;
Similarly for the cases: 2/4 Vref < Vx < 3/4 Vref, 1/4 Vref < Vx < 2/4 Vref
and 0 < Vx < 1/4 Vref, which will be encoded as the third largest, the
fourth largest and the smallest values, respectively;
Advantage: high speed, no clock; it can be used in high-speed video
applications;
Disadvantage: hardware complexity: n – 1 comparators are needed to
distinguish between n values;
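The flash conversion described above can be modeled in a few lines. This is a sketch under stated assumptions: ideal comparators, a voltage divider with equal steps, and an encoder that simply reports the position of the most significant 1; the function name and the 4 V reference are hypothetical.

```python
def flash_adc(vx, vref, n_comparators=4):
    """Idealized flash converter: returns (comparator outputs, encoded value)."""
    # Voltage divider taps: 1/4, 2/4, 3/4 and 4/4 of Vref for 4 comparators
    thresholds = [vref * k / n_comparators for k in range(1, n_comparators + 1)]
    # Each comparator outputs 1 if V+ (the input Vx) exceeds V- (its tap)
    outputs = [1 if vx > th else 0 for th in thresholds]
    # Encoder: position of the most significant 1 (0 if no comparator fires)
    code = 0
    for position, bit in enumerate(outputs, start=1):
        if bit:
            code = position
    return outputs, code

top = flash_adc(4.5, 4.0)      # Vx > Vref: largest output value
second = flash_adc(3.5, 4.0)   # 3/4 Vref < Vx < Vref: second largest value
lowest = flash_adc(0.5, 4.0)   # 0 < Vx < 1/4 Vref: smallest value
```

All comparators work at the same time, which is why no clock is needed, and 4 comparators distinguish 5 values.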
Successive approximation:
It is based on binary search and on successive approximation; for that, a
register is necessary;
Initially, the most significant output bit of the successive approximation
register is set to 1 and all other bits are set to 0; this digital value is
converted to an analog value, corresponding to 0.5 x the maximum input
voltage; if Vx > the generated analog value, the m.s.b. is kept at 1,
otherwise it is reset to 0;
The same process is applied to the next bit; it will remain 1 if the input
value is within the second or the fourth quarter of the input value range;
it will be reset otherwise;
The same process is applied to all the remaining bits of the approximation
register;
Advantage: hardware efficiency: for distinguishing n digital values, log2 n
bits are needed in the approximation register and the D/A converter;
Disadvantage: low speed, since it needs O(log2 n) steps;
It is appropriate for applications where high precision conversions at
moderate speeds are required; ex.: audio applications;
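The binary search performed by the successive approximation register can be sketched as below. It assumes an ideal internal D/A converter and an input range from 0 to `vmax`; the function name and the example voltages are hypothetical.

```python
def sar_adc(vx, vmax, n_bits):
    """Successive approximation: one comparison per bit, m.s.b. first."""
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)              # tentatively set this bit to 1
        v_dac = vmax * trial / (1 << n_bits)   # ideal D/A conversion of the trial value
        if vx > v_dac:                         # keep the bit only if Vx is larger
            code = trial
    return code

code_mid = sar_adc(5.3, 8.0, 3)   # 5.3 V in a 0..8 V range, 3 bits
code_low = sar_adc(0.2, 8.0, 3)
code_high = sar_adc(7.9, 8.0, 3)
```

For 5.3 V the m.s.b. is kept (5.3 > 4.0), the second bit is reset (5.3 < 6.0, third quarter of the range) and the last bit is kept (5.3 > 5.0). The loop body runs once per bit, so 8 values need 3 steps.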
Communication is done on communication media: wireless, wires,
optical etc. through abstract entities called channels;
Communication requirements:
Real-time behavior: very important and must be taken into account from
the design phase; some low-cost solutions (e.g. Ethernet) are not
appropriate;
Efficiency: communication media can be quite expensive; for ex. point to
point connections in large buildings are a very expensive solution; the
situation is worse if separate wires are provided for control, data and
addresses; with separate wires it is almost impossible to add new
modules; the weight of the wires must also be considered, for ex. in cars;
the most efficient solution is the bus;
Appropriate bandwidth and communication delay: bandwidth
requirements of ESs may vary in accordance with the requirements of
the application; high bandwidth means high cost so it is important to
provide only the necessary bandwidth;
Support for event-driven communication: communication with the
external environment can be done by polling the sources or by
interrupts; the first solution is simpler but the delay may be too large;
interrupts are appropriate for event-oriented communication but they
require a specific software, and possibly hardware, support;
Robustness: reliable communication must be maintained even in harsh
conditions: wide temperature range (-20 °C to +180 °C in cars), close to
major sources of electromagnetic radiation, in presence of mechanical
vibrations, major light sources etc.; voltage levels and clock frequency
may be affected;
Fault tolerance: ESs should work even after faults occur; classical
solutions, as restarts in general purpose computers, cannot be accepted;
retries are frequently used after communications with errors; retries may
affect the real-time requirement;
Maintainability, diagnosability: it concerns the possibility to repair ESs in
a reasonable time;
Privacy: solutions must be found for ensuring privacy of confidential
information;
Electrical robustness
Single-ended signaling and
Differential signaling;
Single-ended signaling:
Signals are represented by voltages referenced to ground;
A single ground wire is sufficient for a certain number of signals;
It is susceptible to external noise (for ex. from a motor which is switched on);
It is difficult to establish high-quality common ground signals between a large
number of systems, due to the resistance and inductance of the ground wires;
Differential signaling:
Each signal is transferred on two wires; they are twisted;
If the voltage on the first wire is greater than the voltage on the second wire it
is encoded as a logical 1, otherwise as a logical 0;
Signals do not generate any currents on the ground wires, hence the quality
of the ground wires becomes less important;
Noise is added to the two wires in the same way and the comparator will
remove it, since only the difference is evaluated; that is why differential
signaling allows a much longer transmission distance (for example 1200 m in a
differential serial interface compared with 30 m in an RS232 single-ended
serial interface);
The logic value depends just on the polarity of the voltage between the two
wires; the magnitude of the voltage can be affected by reflections or by the
resistance of the wires but the decoded value will not be affected;
No common ground is necessary; hence there is no need to establish high
quality ground wires between a large number of communicating systems; this
reduces the cost;
Differential signaling allows a larger throughput than single-ended signaling;
Disadvantage: the need for two wires per signal; it dramatically increases the
number of wires; also, more complex electronics are needed for sending and
receiving signals;
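The noise argument for differential signaling can be illustrated with a small numeric sketch. The voltage levels (0 V and 3 V), the 1.5 V single-ended threshold and the noise values are assumptions chosen for illustration; the same noise is assumed to couple identically into both wires of the pair.

```python
def differential_decode(v_first, v_second):
    """Logical 1 if the first wire is at a higher voltage than the second."""
    return 1 if v_first > v_second else 0

def single_ended_decode(v, threshold=1.5):
    """Logical 1 if the voltage referenced to ground exceeds the threshold."""
    return 1 if v > threshold else 0

def transmit(bits, noise, high=3.0, low=0.0):
    """Send each bit both ways with the same additive noise on every wire."""
    diff_out, single_out = [], []
    for bit, n in zip(bits, noise):
        first = high if bit else low
        second = low if bit else high
        diff_out.append(differential_decode(first + n, second + n))
        single_out.append(single_ended_decode(first + n))
    return diff_out, single_out

bits = [1, 0, 1, 1, 0]
noise = [0.0, 2.0, -2.0, 0.5, 4.0]   # hypothetical common-mode noise in volts
diff_out, single_out = transmit(bits, noise)
```

Because the noise shifts both wires equally, the sign of the difference is unchanged and `diff_out` equals `bits`, while the single-ended receiver misreads three of the five bits.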
Guaranteeing real-time behavior
Communication on buses like Ethernet is affected by collisions,
which can compromise real-time behavior;
Carrier-sense multiple access/ collision detect (CSMA/CD) method: if a
collision occurs, the systems must stop, wait for some time and retry; the
waiting time is chosen randomly and a new collision may occur at
retry; collisions can repeat a number of times, wasting time;
this method is not appropriate when real-time constraints exist;
Carrier-sense multiple access/ collision avoidance (CSMA/CA) method:
collisions are avoided; priorities are assigned to partners and
communication media are allocated to partners during arbitration
phases; when a system wants to communicate it must wait for an arbitration
phase and announce its request; if a system with higher priority also wants to
communicate, the first system has to withdraw its request and wait for
another arbitration phase;
CSMA/CA guarantees a predictable real-time behavior for the system
with the highest priority, given an upper bound on the time
between arbitration phases; for the other systems, real-time behavior
can be guaranteed only if the higher priority partners do not access
the media continuously;
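Priority-based arbitration can be sketched as a simple loop over arbitration phases. This is an abstract model, not a bus protocol implementation: the node names are hypothetical, one frame is sent per phase, and a lower number is assumed to mean a higher priority.

```python
def run_arbitration(requests):
    """requests: list of (name, priority, frames) with frames >= 1.
    A lower priority number wins (an assumption of this sketch).
    Returns the order in which the medium is granted, one frame per phase."""
    pending = {name: [priority, frames] for name, priority, frames in requests}
    grants = []
    while pending:
        # Arbitration phase: every pending node announces its request;
        # all but the highest-priority node withdraw and wait.
        winner = min(pending, key=lambda name: pending[name][0])
        grants.append(winner)
        pending[winner][1] -= 1
        if pending[winner][1] == 0:
            del pending[winner]
    return grants

grants = run_arbitration([("brake", 0, 2), ("radio", 2, 1), ("window", 1, 1)])
```

The highest-priority node is served in every phase in which it has something to send, so its delay is bounded by the time between phases; the others wait for as long as higher-priority traffic continues.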
Processing units
Only a few types of processing units are appropriate for ESs:
ASICs, reconfigurable logic and processors;
The efficiency, measured in operations/Watt, is highest for ASICs,
one order of magnitude lower for reconfigurable logic and
two orders of magnitude lower for processors;
Flexibility is highest for processors, lower for reconfigurable logic and
very low for ASICs; the flexibility of the processors is given by their
programmability feature;
Minimization of power and energy consumption is important;
Power consumption influences the size of the power supply, the
design of the voltage regulators, the dimensions of the
interconnections and the cooling process;
Minimizing the energy consumption is important especially in mobile
applications, since battery technology is only slowly improving;
The energy consumption affects also the reliability, since the lifetime
of electronic circuits decreases at high temperatures;
The energy for a certain application is closely related to the power
required per operation, since energy is the integral of power over time:
$E = \int P(t)\,dt$;
According to this relationship, reducing the power consumption
usually also decreases the energy consumption, but this is not
always true;
In some cases a slightly increased power consumption may lead to
an important reduction in execution time, resulting in a decrease of the
energy consumption;
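A short numeric sketch of the last point, assuming constant power during execution so that energy is simply power multiplied by time; the two operating points are hypothetical.

```python
def energy_joules(power_watts, time_seconds):
    """Energy for constant power: E = P * t."""
    return power_watts * time_seconds

# Hypothetical operating points for the same task
slow = energy_joules(1.0, 10.0)   # lower power, longer execution time
fast = energy_joules(1.2, 6.0)    # 20% more power, 40% less time
saves_energy = fast < slow
```

Here the faster configuration draws more power but finishes so much earlier that the total energy is lower.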
Application-Specific Integrated Circuits (ASICs)
Ensure high performance (high speed, energy efficiency) but at a
high cost (for the masks of the chips); a price on the order of $10^5$ euros for
a mask is quite common;
Appropriate if the market accepts the costs or for a large market;
Reconfigurable logic: represents a compromise between the high
costs of ASICs and the low speed and high energy consumption of
processors;
The function it executes can be changed using configuration data;
Application areas:
Fast prototyping: in experimental phases;
Low volume applications;
Reconfigurable logic usually includes RAM to store configurations; since
RAM is volatile, ROM or Flash memories are necessary for providing the
configuration data to RAM at power-up;
Field Programmable Gate Arrays (FPGA) are the most common form of
reconfigurable logic; they consist of arrays of processing elements which
can be programmed after fabrication;
Example: the Xilinx Virtex-II:
It contains up to 112 x 104 configurable logic blocks (CLB) interconnected
through a programmable interconnect structure;
Contains also up to 1108 input/ output connections and special clock
Contains also 168 18x18 bit multipliers and 3024 kbits of RAM;
Each CLB consists of 4 so-called slices:
Each slice contains two 16 bit memories, F and G; these memories can be
used as look-up tables, LUT, for implementing all $2^{16}$ Boolean functions of 4
variables;
Using multiplexers, MUXF5, MUXFx, several of these memories can also be
combined for creating LUTs for up to 8 variables;
They can also serve as ordinary RAM or as shift registers, SRLs;
Each slice also includes two output registers and some special logic (ORCY,
CY) for additions;
Configuration data determines the setting of multiplexers, the clocking of
registers, the content of RAM and the connection between CLBs;
Typically, the configuration data is generated from a high-level description of
the functionality of the hardware, for ex. in VHDL;
Integration of reconfigurable logic with processors is possible.
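How a 16 bit memory implements any Boolean function of 4 variables can be sketched as follows. This is a generic model of a look-up table, not the actual Virtex-II configuration format; the function names and the bit ordering of the address are assumptions.

```python
def make_lut(func):
    """Build the 16 bit configuration word for a Boolean function of 4 variables."""
    config = 0
    for address in range(16):
        a, b, c, d = [(address >> i) & 1 for i in range(4)]
        if func(a, b, c, d):
            config |= 1 << address      # store a 1 at this memory address
    return config

def lut_read(config, a, b, c, d):
    """Evaluate the function: the 4 inputs form the address of the stored bit."""
    address = a | (b << 1) | (c << 2) | (d << 3)
    return (config >> address) & 1

and4 = make_lut(lambda a, b, c, d: a & b & c & d)   # 4-input AND
xor4 = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d)   # 4-input parity
```

Each of the 16 memory bits can be chosen freely, which gives the $2^{16}$ possible functions; changing the function only means loading a different configuration word.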
Key advantage: flexibility;
Microcontrollers, DSPs, microprocessors
Main requirement: efficiency; has different aspects;
Architectures have to be optimized for their energy-efficiency and care
must be taken not to lose efficiency in the software generation
process; for ex. compilers generating 50% overhead in terms of number
of cycles are not desirable;
Energy efficiency must be considered from the design of the instruction
set to the design of the manufacturing process;
Techniques for making processors energy efficient:
Gated clocking;
Dynamic power management;
Dynamic voltage scaling;
Gated clocking: parts of the processor are decoupled from the clock
during idle periods;
Dynamic power management: processors have several low power
modes in addition to the standard mode; each low power mode has a
different power consumption and a different time for transitions into the
normal operating mode; fig. shows an example:
The higher the power saving in a low power mode, the smaller the
number of operations done by the processor in that mode;
Dynamic voltage scaling: the energy consumption of CMOS processors
increases quadratically with the supply voltage Vdd;
The power consumption of CMOS:
$P = \alpha \, C_L \, V_{dd}^2 \, f$
where α is the switching activity, CL is the load capacitance, Vdd is the supply
voltage and f is the clock frequency;
The delay of CMOS circuits is described by the relation:
$\tau = k \, C_L \, \frac{V_{dd}}{(V_{dd} - V_t)^2}$
where k is a constant and Vt is the threshold voltage;
Vt has an impact on the transistor input voltage required to switch the
transistor on; for ex., for a maximum supply voltage of 3.3 V, Vt may be in the
order of 0.8 V; consequently, the maximum clock frequency is a function of
the supply voltage;
However, decreasing the supply voltage reduces the power quadratically,
while the speed is only linearly decreased;
Ex.: the Crusoe processor has 32 voltage levels, between 1.1 and 1.6 V, and
the clock can be varied between 200 MHz and 700 MHz, in increments of 33
MHz; transition from one voltage/ frequency pair to another one requires
about 20 ms;
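The trade-off behind dynamic voltage scaling can be checked numerically. The sketch assumes the usual first-order CMOS models, power = α · CL · Vdd² · f and delay = k · CL · Vdd / (Vdd − Vt)²; the parameter values are hypothetical and only the ratios are meaningful.

```python
def cmos_power(alpha, c_load, vdd, f):
    """Dynamic power: switching activity * load capacitance * Vdd^2 * frequency."""
    return alpha * c_load * vdd ** 2 * f

def cmos_delay(k, c_load, vdd, vt):
    """First-order gate delay; it grows as Vdd approaches the threshold Vt."""
    return k * c_load * vdd / (vdd - vt) ** 2

# Hypothetical values: halve the supply voltage from 3.3 V to 1.65 V, Vt = 0.8 V
p_high = cmos_power(0.5, 1e-9, 3.3, 1e8)
p_low = cmos_power(0.5, 1e-9, 1.65, 1e8)
power_ratio = p_low / p_high          # quadratic saving at the same frequency

d_high = cmos_delay(1.0, 1e-9, 3.3, 0.8)
d_low = cmos_delay(1.0, 1e-9, 1.65, 0.8)
slowdown = d_low / d_high             # gates switch more slowly at the lower voltage
```

The lower voltage cuts power to a quarter at a fixed frequency, but the longer gate delay lowers the maximum clock frequency, which is why voltage and frequency are changed together.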
Code-size efficiency: capacity of internal memory is limited and,
typically, there is no external memory; the code size must be
kept small;
CISC machines are more efficient, in code size, than RISC machines;
RISC machines are faster;
Compression techniques:
Reduces both the area of the memory in the chip and the energy necessary to
fetch the instructions;
Due to the reduced bandwidth requirements, fetching can also be faster;
A decoder is necessary between the processor and the instruction memory
for recreating the original instructions on the fly;
A variation of the compression technique is the use of a second
instruction set; ex.: the ARM processors; the original ARM instruction set
is 32 bit wide but there is also a 16 bit wide set, called THUMB; during
execution THUMB instructions are dynamically converted into full ARM
instructions; the disadvantage is the higher software development cost;
Run-time efficiency: in order to meet time constraints without high
clock frequencies, architectures can be customized to a certain
application domain; ex.: DSPs;
In digital signal processing, digital filtering is a very frequent
operation; the next equation describes a digital filter generating an
output sequence, y = (y0, y1, …) from an input sequence x = (x0, x1, …):
$y_i = \sum_{j=0}^{n-1} x_{i-j} \, a_j$
A certain output element, yi, corresponds to a weighted average over the
last n sequence elements of x and can be computed iteratively using the
following equations:
$y_{i,j} = y_{i,j-1} + x_{i-j} \, a_j$
where $y_{i,-1} = 0$
and $y_i = y_{i,n-1}$
DSPs are designed such that each iteration can be encoded as a single
instruction;
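The iterative computation of one filter output can be written out directly. The sketch assumes the recurrence y(i,j) = y(i,j-1) + x(i-j) · a(j) with y(i,-1) = 0 and requires i ≥ n − 1 so that all indices are valid; the function name and the sample data are hypothetical.

```python
def fir_output(x, a, i):
    """Compute y_i as a weighted sum over the last n elements of x."""
    n = len(a)
    y = 0.0                          # y_{i,-1} = 0
    for j in range(n):
        y = y + x[i - j] * a[j]      # one multiply/accumulate step per iteration
    return y                         # y_i = y_{i,n-1}

x = [1.0, 2.0, 3.0, 4.0]             # input sequence
a = [0.5, 0.25, 0.25]                # filter weights, n = 3
y3 = fir_output(x, a, 3)             # 4*0.5 + 3*0.25 + 2*0.25
y2 = fir_output(x, a, 2)             # 3*0.5 + 2*0.25 + 1*0.25
```

Each pass through the loop body is one multiply/accumulate step, which is the unit of work a DSP aims to finish in a single instruction.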
Ex.: the internal architecture of a DSP:
D and P are two memories, accessed through a special addressing unit,
AGU; there are separate units for additions and multiplications, each
with their own argument registers, AX, AY, AF, MX, MY and MF; the
multiplier is connected to a second adder for computing series of
multiplications and additions quickly;
The update of the partial sum is essentially done in a single cycle; for
that, the two memories are allocated to hold the two arrays x and a and
address registers are allocated such that relevant pointers can be easily
updated in the AGU; partial sums, yi,j, are stored in MR;
The pipelined computation involves registers A1, A2, MX and MY, like in
the following implementation of the filter:
MR:=0; A1:=1; A2:=n-2; MX:=x[n-1]; MY:=a[0];
for (j=1; j<=n; j++)
{MR:=MR + MX * MY; MX:=x[A2]; MY:=a[A1];
A1++; A2--}
A single instruction encodes the loop body, comprising the following
operations:
Reading two arguments, from argument registers MX and MY, multiplying
them and adding the product to register MR storing values yi,j;
Fetching the next elements of arrays a and x from memories P and D and
storing them in argument registers MX and MY;
Updating pointers to the next arguments, stored in address registers A1 and
A2;
Testing for the end of the loop;
This way, each iteration requires only one instruction and for that,
several operations are done in parallel; this leads to relatively low clock
frequencies;
The registers in this architecture perform different functions; they are
said to be heterogeneous; heterogeneous register files are a common
characteristic for DSPs;
In order to avoid extra cycles for testing for the end of the loop,
zero-overhead loop instructions exist in DSPs; with them, a single or a small
number of instructions can be executed a fixed number of times;
Microcontrollers: the classical 8051:
Is the core of a large family of 8 bit microcontrollers;
CMOS technology;
Includes 4 Kbytes ROM memory and 128 bytes RAM memory;
Includes an ALU and a boolean processor;
Has 4 I/O ports which can be used as general purpose ports but also
have alternative functions;
Can address up to 64 kbytes external program memory and up to 64
kbytes external data memory;
Has 2 independent timers;
Includes a full duplex serial UART;
The instruction set is oriented toward real-time applications;
The interrupt system can manage 5 external and internal sources, with 2
priority levels;
Low consumption: 16 mA in normal mode, 3.7 mA in Idle mode and 50
μA in Power Down mode;
DSPs: other application oriented features:
Specialized addressing modes: modulo addressing; it is useful when the
addressed elements are in a ring buffer; addresses can be incremented
and decremented until the first or last element of the buffer is reached,
after which they wrap around to the other end of the buffer;
Separate address generation units: addresses are stored in dedicated
address registers; this allows the indirect addressing modes, saving
machine instructions, cycles and energy;
Saturating arithmetic: changes the way overflows and underflows are
handled; in standard binary arithmetic, wrap-around is used for the
values returned after an overflow or underflow; saturating arithmetic returns
the representable value closest to the true result; it is used in video and
audio applications;
Fixed-point arithmetic: floating-point hardware increases the cost and
power consumption; consequently, 80% of DSPs do not contain this
hardware; however, many of them have support for fixed-point numbers;
Multiple memory banks or memories: this allows both arguments
of an operation to be fetched at the same time;
Multiply/ accumulate instructions: such an instruction performs
multiplications followed by additions.
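Two of the features listed above, saturating arithmetic and modulo addressing, can be illustrated in software. The sketch assumes 16 bit signed values and a ring buffer described by a base address and a length; the function names and the addresses are hypothetical, and a real DSP performs these operations in hardware.

```python
def wrap_add16(a, b):
    """Standard two's complement addition on 16 bits: overflow wraps around."""
    s = (a + b) & 0xFFFF
    return s - 0x10000 if s & 0x8000 else s

def sat_add16(a, b):
    """Saturating addition: clamp to the closest representable 16 bit value."""
    return max(-32768, min(32767, a + b))

def modulo_increment(addr, base, length):
    """Advance an address inside a ring buffer, wrapping at its end."""
    return base + (addr - base + 1) % length

wrapped = wrap_add16(30000, 10000)             # true result 40000 does not fit
saturated = sat_add16(30000, 10000)            # clamped to the largest value
next_addr = modulo_increment(0x103, 0x100, 4)  # last element wraps to the base
```

With wrap-around, a loud audio sample turns into a large negative value; with saturation it is only clipped, which is far less audible.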
For increasing the run-time and energy efficiency, memory hierarchies
should be used; the reason is that large memories require more energy
per access and are also slower than small memories;
The gap between the processor and memory speeds is increasing; while
the speed of memory is increasing by a factor of about 1.07/ year,
processor speed is increasing by a factor of 1.5 – 2/ year;
Therefore, it is efficient to use small and fast memories as buffers
between the main memory and the processor;
The PC solution: the cache memory; the hardware checks require
additional energy and caches cannot offer predictability of real-time
behavior;
The alternative solution: small memories can be mapped into the
address space:
They are called scratch pad memories;
The compiler should allocate frequently used variables and instructions
to that address space; no checking is necessary;
As a result the energy per access is reduced;
Fig. compares the energy/ access in case of cache memories and
scratch pad memories:
Output devices
Displays: LEDs, LCDs, seven segment displays, small touchscreens;
Electro-mechanical devices: action on the environment through
electrical motors, transforming rotation in movement;
They are generally called actuators and there is a large spectrum of
actuators from very tiny ones, in the μm area (a challenging
application area is the human body), to very big ones, capable of
moving tons of weight;
D/A converters:
The operational amplifier amplifies the voltage difference between the
two inputs by a very large factor;
Due to resistor R1, resulting output voltages are fed back to input -,
reducing the input voltage; the differential voltage between the two
inputs is reduced to zero and since input + is connected to gnd the
voltage between input – and gnd is zero;
The main idea is to generate a current proportional to the value
represented by a bit-vector x and convert this current into a voltage;
Current I is the sum of the currents through the resistors;
The current through a resistor is 0 if the corresponding bit of bit-vector x
is 0; if this is 1, the current corresponds to the weight of that bit since the
resistor values are chosen accordingly;
The equation for I is (one of Kirchhoff's laws):
The other Kirchhoff's law gives (due to the zero value at input -):
V + R1*I' = 0;
The current into the input of the operational amplifier is practically 0 and I
= I'; hence:
V + R1*I = 0;
From the first and the last equations we obtain:
It denotes the natural number represented by bit-vector x;
The output voltage is proportional to the value represented by x.
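The proportionality can be checked with a small model of a weighted-resistor converter. The network itself is an assumption, since the figure is not available here: a reference voltage `vref` drives resistors R, 2R, 4R, … (R for the most significant bit), so that $I = \sum_k x_k \, V_{ref} / (2^k R)$ with k = 0 for the m.s.b., and the output follows from V + R1*I = 0. The names `vref`, `r` and the component values are hypothetical.

```python
def dac_output(x_bits, vref, r, r1):
    """Output voltage of an idealized weighted-resistor D/A converter.
    x_bits lists the bit-vector x with the most significant bit first."""
    current = 0.0
    for k, bit in enumerate(x_bits):
        # A branch carries current only if its bit is 1; each less
        # significant bit uses twice the resistance of the previous one.
        current += bit * vref / (r * 2 ** k)
    return -r1 * current             # from V + R1*I = 0

v_ten = dac_output([1, 0, 1, 0], 8.0, 1000.0, 1000.0)   # x represents 10
v_one = dac_output([0, 0, 0, 1], 8.0, 1000.0, 1000.0)   # x represents 1
```

With these values one step of x corresponds to about 1 V of output magnitude, so the vector representing 10 gives about ten times the output of the vector representing 1.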