
Evolvable Hardware

EHW
• EHW: a bio-inspired technique for hardware design.
• Living beings: DNA constitutes the encoding of every living being on Earth.
  - ACTG strings.
• Reconfigurable logic: the bitstream determines the logic.
  - 01 strings.

Living Beings vs. Circuits
In DNA, the amount of guanine equals the amount of cytosine, and the amount of adenine equals the amount of thymine. The A:T and C:G pairs are structurally similar.

POE Model
• The space of artificial bio-inspired systems can be partitioned along three axes:
  1. Phylogeny: temporal evolution of genetic material in individuals and species.
     - Evolutionary algorithms (EAs): a simplified artificial counterpart of phylogeny in nature.
     - Mutation, crossover, ...
  2. Epigenesis: learning process during an individual's lifetime.
     - ANNs: the system's synaptic weights change through interactions with the environment.
  3. Ontogeny: development of a single individual from its own genetic material (without environmental interaction).
     - Self-replicating and self-repairing cellular automata.

Epigenesis
• Artificial neural network (ANN):
  - Massively parallel distributed computing units made up of very simple basic elements.
  - Feature: stores experiential knowledge and makes it available for future use.
  - Inspired by animal brains:
    - Benefits from a massively parallel cellular architecture.
    - A learning process allows knowledge to be acquired.
    - This knowledge is stored in the form of synaptic weights interconnecting neurons.
  - Able to compute nonlinear input-output functions.
  - Adaptable (adjustable synaptic weights and network topology let it adapt to its operating environment).

ANN
• Perceptron: the best-known neuron model: $\eta_i(t) = \sum_j w_{ij} x_j(t) + \beta_i$
  - $\eta_i(t)$: weighted sum for neuron $i$ at time $t$,
  - $x_j(t)$: the input value coming from neuron $j$,
  - $w_{ij}$: the weight of the synapse connecting neuron $j$ to neuron $i$,
  - $\beta_i$: the bias value for neuron $i$.
• Perceptron output: $y_i(t) = \left(1 + e^{-\eta_i(t)/T}\right)^{-1}$
  - $T$: sets the slope of the sigmoid function.

ANN Supervised Learning
• Artificial neural network supervised learning.

ANN Unsupervised Learning
• Unsupervised learning:
  - There is no information about the task to be performed; synaptic modifications depend on correlations among the input data.
• Applications:
  - Clustering,
  - Pattern recognition,
  - Reconstruction of corrupted data,
  - ...

Genetic Algorithms
• GA:
  - An iterative procedure applied to a constant-size population of individuals.
  - Each individual represents a possible solution.
    - Eventually one is chosen.
  - Each individual is coded by a finite string of symbols known as the genome.
  - Each genome gives rise to the individual's phenotype, which constitutes the actual solution (e.g., a circuit) to the problem at hand (e.g., a robot controller).
  - The individual receives a score (fitness) depending on the performance it exhibits during its evaluation.

GA Steps
1. Initialization: create an initial population of individuals
   - by defining a set of genomes in a random or heuristic manner.
2. Decoding: generate the phenotypes for the individuals in the current population by decoding (mapping) the genotypes.
3. Fitness evaluation: evaluate individuals according to some predefined quality criterion (fitness).
4. Genetic operators: apply genetically inspired operators to the current population.
5. Iterate: if a predefined convergence condition has not been met, go back to step 2 to evaluate a new generation. Otherwise, deliver the best individual evaluated.

Genetic Operators
• Selection: individuals are selected into a mating pool for reproduction according to their fitness.
  - Stochastic or deterministic selection.
• Crossover: two genomes are selected, split at a random position, and their parts swapped.
• Mutation: the genome is randomly changed.
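The five GA steps and the three genetic operators above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from the slides: the toy fitness (counting 1 bits), the identity decoding, the tournament selection, and every function name and parameter value below are assumptions chosen to keep the example small.

```python
import random

def decode(genome):
    # Step 2 (decoding): genotype -> phenotype. Identity mapping (assumption).
    return genome

def fitness(phenotype):
    # Step 3 (fitness evaluation): toy criterion, number of 1 bits (assumption).
    return sum(phenotype)

def select(population, scores, rng):
    # Selection: stochastic tournament between two distinct random individuals.
    a, b = rng.sample(range(len(population)), 2)
    return population[a] if scores[a] >= scores[b] else population[b]

def crossover(p1, p2, rng):
    # Crossover: split both genomes at one random position and swap the tails.
    point = rng.randrange(1, len(p1))
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(genome, rate, rng):
    # Mutation: flip each bit independently with probability `rate`.
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]

def run_ga(genome_len=32, pop_size=20, generations=200,
           mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # Step 1 (initialization): random genomes.
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    best, best_score = None, -1
    for _ in range(generations):
        scores = [fitness(decode(g)) for g in population]
        for genome, score in zip(population, scores):
            if score > best_score:
                best, best_score = list(genome), score
        # Step 5 (iterate): stop once the toy optimum is reached.
        if best_score == genome_len:
            break
        # Step 4 (genetic operators): build the next generation.
        next_population = []
        while len(next_population) < pop_size:
            c1, c2 = crossover(select(population, scores, rng),
                               select(population, scores, rng), rng)
            next_population.append(mutate(c1, mutation_rate, rng))
            if len(next_population) < pop_size:
                next_population.append(mutate(c2, mutation_rate, rng))
        population = next_population
    # Deliver the best individual evaluated.
    return best, best_score

if __name__ == "__main__":
    best, score = run_ga()
    print(score, best)
```

In an evolvable-hardware setting, `decode` would map the genome to a circuit description and `fitness` would test that circuit; both are stubbed out here.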
Conventional Circuit Design
• Circuit design:
  - A hard engineering task.
  - Vulnerable to human error.
  - For large circuits, the optimality of a solution cannot be guaranteed.
  - Design automation has become a challenge.
  - Increasing circuit complexity means higher abstraction levels are needed.
• EHW: a solution.

Evolutionary Circuit Design
• EHW:
  - From a given behavior specification of a circuit, an EA searches for a bitstream describing a circuit that satisfies it.
    - Most works: application of EAs to synthesis.
    - So "evolutionary circuit design" is a more descriptive term than EHW.

Evolutionary Circuit Design
• Major advantage:
  - The designer's job is reduced to constructing the evolutionary setup, specifying:
    1. Circuit requirements,
    2. Basic elements,
    3. A decoding mechanism,
    4. The testing scheme used to assign fitness (often the most difficult).
  - Automatic generation of the circuit.

EHW
• Two critical questions when setting up a system:
  1. How to map a phenotype from a genotype?
  2. How to compute the fitness of a circuit?

Low-Level Languages
• Low-level languages:
  - Directly incorporate the bit string representing the configuration of a programmable circuit within the genome.
• Genome encoding steps:
  - A set of basic logic gates must be chosen (e.g., AND, OR, and NOT)
  - and codified along with the interconnections between gates.
• Problems:
  - Genome length: on the order of tens of thousands of bits,
    - which makes evolution practically impossible.
  - Many circuits are invalid.
• Solutions by XC6200:
  - MUX-based.
  - Direct correspondence between the bit string of a cell and the actual logic circuit.
  - Separate configuration of each cell.
  - Remarkably faster.

Fitness Calculation
• Fitness calculation:
  - Off-chip:
    - High-level language for genome representation.
    - The encoded system has to be transformed to evaluate fitness.
    - Only the final solution is actually implemented in hardware.
  - On-chip:
    - Low-level language.
    - Direct configuration.
    - Real hardware can be used during the evolutionary process.

EHW Classification
• Classes according to the level of bio-inspiration:

Extrinsic Evolution
• Extrinsic evolution:
  - All operations are carried out in software.
  - The solution is possibly loaded into a real circuit.
    - Traditional evolutionary techniques for synthesis.
  - At different abstraction levels:
    - Scheduling and allocation,
    - Logic synthesis,
    - Placement and routing.
  - Not suitable for evolving at the bitstream level.

Intrinsic Evolution
• Intrinsic evolution:
  - A real circuit is used during the evolutionary process for output computation.
  - Most operations are still carried out in software.

Thompson Frequency Recognizer
• FPGA: Xilinx XC6216
  - A 10x10 corner of the 64x64 array was used.
  - No configuration can damage the device,
    - so the EA can manipulate the configuration without legality constraints or checking.
  - Configuration: 1800 bits.

Thompson Frequency Recognizer
• Circuit:
  - Discriminates between 1 kHz and 10 kHz tones.
• Aim:
  - Output goes to 5 V when one tone appears at the input.
  - Output goes to 0 V otherwise.
• GA:
  - Population size: 50
  - Individuals: 1800-bit strings
  - Initial population: random
  - Next generation:
    - Copy the fittest individual
    - Crossover rate: 70%
    - Number of mutations per genotype: 2.7

Thompson Frequency Recognizer
• PC:
  - runs the EA.
• Tone generator:
  - generates five 500 ms bursts of a 1 kHz square wave
  - and five 500 ms bursts of a 10 kHz square wave.

Thompson Frequency Recognizer
• Inputs to circuit:
  - 10 test tones, shuffled randomly.
  [Figure: sequence of ten 500 ms test tones, each 1 kHz or 10 kHz]
• FPGA:
  - takes the test tones,
  - generates outputs.

Thompson Frequency Recognizer
• Integrator:
  - integrates the FPGA output over 500 ms,
  - generates $i_t$ for test tone number $t$ ($t = 1, 2, \ldots, 10$).
• Fitness: computed from the integrator readings $i_t$, using:
  - $S_1$: the set of five 1 kHz tones,
  - $S_{10}$: the set of five 10 kHz tones,
  - $k_1 = 1/30730$, $k_2 = 1/30527$.

Thompson Frequency Recognizer
• Objective:
  - Maximizing the difference between
    - the average output voltage when a 1 kHz input is present and
    - the average output voltage when a 10 kHz input is present.

• Oscilloscope screen:
  - for the best individual in some generations.
• Experiment time:
  - 2-3 weeks,
  - no human time.

Final Circuit

Final Circuit

Intrinsic Evolution
• Problem:
  - Large genome size.
• Solutions:
  - Variable-length chromosome GAs (VGA):
    - The genome does not directly represent the configuration bit string but rather codifies the possible logical operations and interconnections.
  - Evolution at the function level:
    - Basic units are not elementary logic gates (e.g., AND, OR, and NOT) but rather higher-level functions (e.g., sine-wave generator, multiplier).
    - Problem: no such commercial FPGA exists.
    - Solution: [Murakawa96] proposed the F2PGA (Function-based FPGA).

Complete Evolution
• Complete evolution:
  - All operations (selection, crossover, mutation) and the fitness evaluation are carried out intrinsically, in hardware.
    - Different from biological evolution: it is not open-ended.
    - There is a predefined goal.
• Two types:
  1. Centralized
  2. Population-oriented

Complete Evolution
• Centralized evolution:
  - There is a single evolvable circuit and a single evolutionary algorithm computation:
    - The EA is executed in an on-chip processor.
  - Popular
    - because it greatly enhances the autonomy of the circuit.
    - EHW can adapt to a changing environment during its lifetime.

Complete Evolution
• Centralized evolution:
  - Implementations of EAs in general-purpose processors:
  - Disadvantage:
    - Lower performance.
  - Advantages:
    - More user-friendly interface for implementing chromosome manipulations, fitness evaluations, and memory access.
    - Easier algorithm upgrades.

Complete Evolution
• Population-oriented:
  - There is a hardware implementation of the full population (not only of one individual).
  - Usually based on the cellular automata model.

Complete Evolution
• CA:
  - A discrete dynamic system that performs computations in a distributed fashion on a spatially extended grid.
• Cellular automaton:
  - An array of cells (n-dimensional, n = 1, 2, 3).
• Cell:
  - can be in one of a finite number of possible states,
  - is updated synchronously with all other cells, in discrete timesteps, according to a local, identical interaction rule,
  - its state at the next timestep is determined by the current state of a surrounding neighborhood of cells.
• Transitions:
  - specified in the form of a rule table,
    - which shows the cell's next state for each possible neighborhood configuration.

Complete Evolution
• Population-oriented evolution based on the cellular programming EA:
  - Genetic operators are computed in a distributed way:
    - Each automaton modifies its own rule based on its own and its neighbors' fitness.
  - Each cell contains a genome that represents its rule table.
  - These genomes are initialized at random and are then subjected to evolution.

Example
• Andres Upegui, Eduardo Sanchez, "Evolving hardware with self-reconfigurable connectivity in Xilinx FPGAs," NASA/ESA Conference on Adaptive Hardware and Systems (AHS), 2006.
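The rule-table view of a cellular automaton in the Complete Evolution slides, with one genome (rule table) per cell, can be illustrated with a small Python sketch. The grid size, the binary states, the radius-1 periodic neighborhood, the Wolfram-style rule numbering, and all function names are assumptions made for this example; they are not taken from the slides or from the cited paper.

```python
import random

def rule_table(number):
    # Build an 8-entry rule table from a Wolfram-style rule number:
    # entry k is the next state for neighbourhood index k (assumption).
    return [(number >> k) & 1 for k in range(8)]

def random_genomes(n_cells, rng):
    # One genome per cell; each genome is that cell's own rule table.
    return [[rng.randint(0, 1) for _ in range(8)] for _ in range(n_cells)]

def step(states, rules):
    # Synchronous update: every next state is computed from the CURRENT
    # states of the left neighbour, the cell itself, and the right neighbour.
    n = len(states)
    next_states = []
    for i in range(n):
        left = states[(i - 1) % n]      # periodic boundary (assumption)
        centre = states[i]
        right = states[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right
        next_states.append(rules[i][index])
    return next_states

if __name__ == "__main__":
    rng = random.Random(0)
    n_cells = 16
    states = [rng.randint(0, 1) for _ in range(n_cells)]
    rules = random_genomes(n_cells, rng)   # non-uniform CA: one rule per cell
    for _ in range(5):
        print("".join(map(str, states)))
        states = step(states, rules)
```

Passing the same table for every cell gives a classic uniform CA; passing a different table per cell gives the non-uniform case in which each cell's rule can be evolved separately.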
Cellular Automata (CA)
• CA:
  - An array of identical computing cells.
  - A cell is defined by
    - a set of discrete states,
    - a rule for determining the transitions between states.
  - States are synchronously updated according to the rule.
    - The rule is a function of the current state of the cell itself and the states of the surrounding neighbors: $f_i(s_i, s_j)$, with $j \in$ neighbors of $i$.

Cellular Automata (CA)
• Cellular programming:
  - An algorithm that considers one genome per cell
    - (instead of one genome for the whole system, as in typical evolutionary algorithms).
  - Node rules are initialized at random.
  - Initial states are initialized at random.
  - The CA runs for M iterations.
  - This is repeated for a number of different initial states.
  - Fitness is assigned locally to each node.
  - Genetic operators (reproduction, crossover, and mutation) are applied to the genomes.
  - Evolutionary operators act locally:
    - they are limited to genomes from neighboring cells.

Cellular Automata (CA)
• Cellular programming: $nf_i$ is the number of fitter neighbors of cell $i$.
  - If $nf_i = 0$ (cell $i$ has no fitter neighbor), rule $i$ is left unchanged.
  - If $nf_i = 1$ (cell $i$ has one fitter neighbor), rule $i$ is replaced by the fitter neighbor's rule, followed by mutation.
  - If $nf_i \geq 2$ (cell $i$ has two or more fitter neighbors), rule $i$ is replaced by a crossover of the two fittest neighbors' rules, followed by mutation.

Random Boolean Networks (RBN)
• RBN:
  - A hardware architecture of a cellular system allowing a completely arbitrary connectionism.
• Differences with CA:
  - The RBN neighborhood is asymmetric:
    - if the state of A is an input to B, it does not imply that the state of B is an input to A.
  - The RBN neighborhood is non-uniform:
    - if $A_k$ is connected to $A_{k+1}$, it does not imply that $A_{k+1}$ is connected to $A_{k+2}$ (for $k+2 \leq N$).

RBN
• RBN architecture proposed in this paper:
  - Each cell contains:
    - a rule implemented in a LUT,
    - a flip-flop (FF) storing the cell state,
    - flexible routing resources implemented in the form of multiplexers.
  - A cell's state is updated by a rule,
    - which is a Boolean function.

RBN Architecture
• An output from the cell
  - can be driven by the cell's state or by any other input,
    - allowing the outputs to act as a bypass from distant cell states.
    - (In a typical 2-D CA, outputs would always be driven by the cell's state.)
• Rule inputs
  - can be driven by any input or by the cell's state.
• Rules with fewer inputs:
  - If two multiplexers select the same driver, the 4-input rule becomes a 3-input rule.
  - If all multiplexers select the same input, it becomes a 1-input rule.

RBN Architecture
• Points:
  - cell 3,1 has 4 inputs (N, S, E, and C),
  - cell 3,3 has just 2 (N and E),
  - cell 1,3 has only 1 input (C) and is completely isolated from the other nodes.
  - Driver-less net.

RBN Architecture
• Generating a random connectionism:
  - Randomly generate the values of the multiplexers' selections, while forcing random drivers onto driver-less nets.

Implementation
• MicroBlaze soft processor running on a Virtex-II.
• Hard macro for the RBN cell (4 slices in a CLB).
  - If synthesis tools were used, it would take 5 CLBs.

Implementation
• Self-reconfigurability in Virtex-II:
  - The ICAP (Internal Configuration Access Port) allows an on-chip processor to self-reconfigure the FPGA.
  - One can directly modify some portions of the configuration bitstream without depending on Xilinx tools such as XPART (a Xilinx internal tool) or JBits [Upegui05].
    - Even though the Virtex-II bitstream is not documented, LUT contents can be located in the configuration bitstream by comparing the bitstream changes after specific design modifications.

Implementation
• Implementing routing and MUXes:
  - The routing configuration of the Virtex-II FPGA is complicated and not documented at all.
  - Technically, it would be possible to use the FPGA's routing resources to multiplex the functions' inputs by activating the correct PIPs (programmable interconnection points).
  - However, reverse engineering the PIP configuration is too complex to be done just by comparing bitstream differences.

Implementation
• Implementing MUXes with LUTs:
• LUT contents:
  - 0000 0000 1111 1111 → sel = A1
  - 0000 1111 0000 1111 → sel = A2
  - 0011 0011 0011 0011 → sel = A3
  - 0101 0101 0101 0101 → sel = A4
    - Implementing larger multiplexers requires the use of extra LUTs.

Application: Firefly
• Firefly synchronization problem:
  - Synchronizing the firing of a set of 2-state nodes.
  - Nodes are initialized in a random state.
  - After a number of iterations, each node must swap from one state to the other, in synchrony with its neighbors.
  (Video: firefly_simulation_short.wmv)

Firefly
• Fitness computation:
  - The MicroBlaze reads the nodes' states.
  - Once the number of iterations has completed, we compute the phase of the majority of the nodes and then let the RBN execute four more iterations.
  - If the majority phase is 1, the fitness is 1 when the sequence is 0-1-0-1, and 0 otherwise.
  - If the majority phase is 0, the fitness is 1 when the sequence is 1-0-1-0, and 0 otherwise.

Firefly
• Simulations for 100 generations:
  - For 20 different initial states (individuals) do:
    - Randomly initialize the cell states.
    - Let the RBN run for 34 iterations.
    - Compute the partial fitness for each cell.
  - For each cell, compute the total fitness as the sum of its partial fitness values.
  - Update each cell's rule according to the cell's fitness.
• Deliver the best result: the one with the highest average fitness.
• In 1000 simulations, 3% managed to fully synchronize.

Open-Ended Evolution
• Open-ended evolution:
  - Admits no externally imposed fitness criterion,
    - but rather an implicit, emergent, dynamic one.
  - The only form of evolution known to produce such devices as
    - eyes, wings, and nervous systems.
  - Only open-ended evolution can truly be considered EHW.
    - Still an elusive goal at present.
• Application:
  - Autonomous robots:
    - Machines capable of operating in unknown environments without human intervention (e.g., in space).

Evolvable Hardware Platforms
• Usually a cellular structure of uniform or non-uniform components:
  - Sometimes we can evolve the components' functionality.
  - Sometimes we can evolve the connectivity.
  - Sometimes both.
• FPGAs fit well in the third category.

Evolvable Hardware Platforms
• Problem:
  - Huge search space to explore,
    - which prevents the EA from finding a solution.
• Solution:
  - Constrain the search space by
    - defining a set of logic cells (ANN, or more complex cells),
    - constraining the connectionism (to a certain neighborhood).

Evolvable Hardware Platforms
• An evolvable substrate can be implemented by:
  1. Exploiting the flexibility provided by the FPGA's configuration logic:
     - The configuration bitstream of the FPGA is directly generated.
     - Benefit: better use of FPGA resources.
     - Penalty: very low-level circuit descriptions may have illegal configurations (in genetically evolved bitstreams) that cause short circuits.
  2. Building a custom chip:
     - The user can define the configuration bitstreams (and so prevent illegal configurations).
     - Penalty: cannot benefit from advanced fabrication processes.
     - Penalty: cannot benefit from advanced CAD tools.

Xilinx XC6200 Family
• MUX-based connection architecture:
  - An arbitrary bitstream can be downloaded
    - with no risk.
• Cell-level partial reconfiguration.

References
  - [Hauck08] Scott Hauck, André DeHon, "Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation," Elsevier, 2008.
  - [Upegui05] A. Upegui and E. Sanchez, "Evolving hardware by dynamically reconfiguring Xilinx FPGAs," Evolvable Systems: From Biology to Hardware, LNCS, vol. 3637, pp. 56-65, 2005.
