Download Ant Colony Systems Data Mining

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Principal component analysis wikipedia , lookup

Nonlinear dimensionality reduction wikipedia , lookup

K-nearest neighbors algorithm wikipedia , lookup

Cluster analysis wikipedia , lookup

K-means clustering wikipedia , lookup

Transcript
MIL
FPGA Co-Processor Enhanced
Ant Colony Systems Data
Mining
Jason Isaacs and Simon Y. Foo
Machine Intelligence Laboratory
FAMU-FSU College of Engineering
Department of Electrical and Computer Engineering
MAPLD2005/P249
MIL








Isaacs
Presentation Outline
Introduction
Significance of Research
Concise Background on ACS
Summary of Data Mining focused on
Clustering
Discussion of ACS-based Data Mining
FPGA Co-processor Enhancement
Conclusions
Future Work
2
MAPLD2005/P249
MIL
Project Goal: to design and implement an Ant Colony
Systems toolbox for non-combinatorial problem solving.
This toolbox will comprise both hardware and software
based solutions.
Isaacs
3
MAPLD2005/P249
MIL
Ant Colony Systems Project Overview
 This work aims at advancing fundamental research
in Ant Colony Systems.
 The major objectives of this project are:
 Develop a set of behavior models
 Design ACS algorithms for solutions to noncombinatorial problems
 Analyze algorithms for hardware implementations
 Implement FPGA Modules – CURRENT
 Incorporate all modules into a cohesive toolbox
Isaacs
4
MAPLD2005/P249
MIL
Introduction to Ant Colony Systems
 Ants are model organisms for bio-simulations due to both their relative
individual simplicity and their complex group behaviors.
 Colonies have evolved means for collectively performing tasks that are
far beyond the capacities of individual ants. They do so without direct
communication or centralized control – Stigmergy.
 Previous Research: our use of simulated ants to generate random
numbers proved a novel application for ACS.
 Prior to 1992, ACS was used exclusively to study real ant behavior.
 However, in the last decade, beginning with Marco Dorigo’s 1992 PhD
Dissertation “Optimization, Learning and Natural Algorithms,” modeling
the way real ants solve problems using pheromones, ant colony
simulations have provided solutions to a variety of NP-hard combinatorial
optimization problems
Isaacs
5
MAPLD2005/P249
MIL
ACS Application Area: Data Mining
 Ant Colony real-world behaviors applicable
to Data Mining:






Isaacs
Ant Foraging
Cemetery Organization and Brood Sorting
Division of Labor and Task Allocation
Self-organization and Templates
Co-operative Transport
Nest Building
6
MAPLD2005/P249
MIL
Cemetery Organization and Brood Sorting
Isaacs
7
MAPLD2005/P249
MIL
Ant Colony Nest Examples
Isaacs
8
MAPLD2005/P249
MIL
Flowchart for the ACS Data Mining System
Data
Feature/Object
Update Cognitive M ap
NO
Store New Object
YES
ACS Data M ining
Recognize
NEST (Data Warehouse)
Classification
Clustering
Connection Topology
Isaacs
9
MAPLD2005/P249
Knowledge Discovery and
Data Mining
MIL
 What is Data Mining?
 “Discovery of useful summaries of data”
 Also, Data Mining refers to a collection of
techniques for extracting interesting
relationships and knowledge hidden in data.
 It is best described as “the nontrivial process of
identifying valid, novel, potentially useful, and
ultimately understandable patterns in data.”
(Fayyad, et al 1996)
Isaacs
10
MAPLD2005/P249
MIL
Knowledge Discovery in Databases
Cleaning
Integration
Selection
Transformation
Data
Warehouse
Data
Mining
Evaluation
Visualization
Prepared
data
Patterns
Knowledge
Knowledge
Base
Data
Isaacs
11
MAPLD2005/P249
MIL
Typical Tasks in Data Mining






Isaacs
Classification
Prediction
Clustering
Association Analysis
Summarization
…
12
MAPLD2005/P249
MIL
Clustering
 What is Clustering?
 Given points in some space, often a highdimensional space, group the points into a
small number of clusters, each cluster
consisting of points that are “near” in some
sense.
Isaacs
13
MAPLD2005/P249
The k-Means Algorithm
MIL
 k-means picks k cluster centroids and assigns points to the clusters by picking
the closest centroid to the point in question. As points are assigned to clusters,
the centroid of the cluster may migrate.
 For a very simple example of five points in two dimensions. Suppose we
assign the points 1, 2, 3, 4, and 5 in that order, with k = 2. Then the points 1
and 2 are assigned to the two clusters, and become their centroids for the
moment.
 When we consider point 3, suppose it is closer to 1, so 3 joins the cluster of 1,
whose centroid moves to the point indicated as a. Suppose that when we
assign 4, we find that 4 is closer to 2 than to a, so 4 joins 2 in its cluster, whose
center thus moves to b. Finally, 5 is closer to a than to b, so it joins the cluster
{1,3}, whose centroid moves to c.
Isaacs
14
MAPLD2005/P249
MIL
The k-Means Algorithm
Having located the centroids of the k clusters, we can reassign
all points, since some points that were assigned early may
actually wind up closer to another centroid, as the centroids
move about. If we are not sure of k, we can try different values
of k until we find the smallest k such that increasing k does not
much decrease the average distance of points to their centroids.
Isaacs
15
MAPLD2005/P249
MIL
ACS Notation and Heuristics
E = {Oi,…, On} Set of n data or objects collected.
Oi = {vi,…, vk} Each object is a vector of k numerical attributes.
Vector similarity is measured by Euclidean distance (can use
other: Minkowski, Hamming, or Mahalanobis).
Dmax = max D{Oi, Oj}, where Oi,Oj  E
Isaacs
16
MAPLD2005/P249
MIL
ACS Notation and Heuristics
•2-D search area, in general, must be at least m2  n, but experiments have
shown that m2  4n provides good results.
•A heap/pile H is considered to be a collection of two or more objects. This
collection is located on a given single cell rather than just spatially connected.
This limitation prevents overlaps.
O1
O3
O2
O4
O4
O1
O5
O2
O5
O3
Spatial pattern cluster
Isaacs
Single-cell ranked cluster
17
MAPLD2005/P249
MIL
ACS Distance Measures
• Dmax is the maximum distance between two objects of H:
Dmax ( H )  max D(Oi , O j )
O i ,O j H
• Ocenter is the center of mass of all objects in H: (not necessarily a real
object)
1
Ocenter ( H ) 
nH
 Oi
Oi H
• Odissim is the most dissimilar object in H, i.e. which maximizes
D(., Ocenter ( H ))
• Dmean is the mean distance between the objects of H and the center of
mass Ocenter :
1
Dmean ( H ) 
Isaacs
nH
 D(Oi , Ocenter ( H ))
Oi H
18
MAPLD2005/P249
MIL
ACS Unsupervised Learning and Clustering
Algorithm
 Initialize randomly the ant positions
 Repeat
 For each anti Do
 Move anti
 If anti does not carry any object Then look at 8-cell
neighborhood and pick up object according to pick-up
algorithm
 Else (anti is already carrying an object O) look at 8-cell
neighborhood and drop O according to drop-off
algorithm
 Until stopping criterion
Isaacs
19
MAPLD2005/P249
ACS Data Mining Algorithm
Top Level
1.
2.
3.
4.
5.
Isaacs
MIL
Load Database
Data Compression
Object Clustering
Clustering of Similar Groups
Reevaluate Objects in Groups
20
MAPLD2005/P249
ACS Data Mining Algorithm
Top Level


MIL
Load Database
Select Compression Method
 Wavelets
 Principle Component Analysis
 None

Repeat for Max_Iterations1 – Object Clustering
 Begin Ants Redistribute Objects
 K-means

Repeat for Max_Iterations2 – Clustering of Similar Groups
 Ants Redistribute Piles (Clusters) of Objects
 K-means

Repeat for Max_Iterations3 – Reevaluate Objects in Groups
 Ants Redistribute Objects in Clusters with a Probability based on Least Similar
Objects Distance from the Mean of the Cluster
 K-means
Isaacs
21
MAPLD2005/P249
MIL
ACS Object Pick-up Algorithm
1.
2.
Label 8-cell neighborhood as “unexplored”
Repeat
1.
2.
Consider the next unexplored cell c around anti with the following order: cell 1is
NW, cell 2 is N, cell 3 is NE, … N is the direction the ant is facing.
If c is not empty Then do one of the following:
1.
2.
3.
3.
3.
Isaacs
If c contains a single object O, Then load O with probability Pload, Else
If c contains a heap of two objects, Then remove one of the two with a probability
Pdestroy, Else
If c contains a heap H of more than 2 objects, Then remove the most dissimilar object
Odissim(H) from H provided that
Label c as “explored”
D(Odissim( H ), Ocenter ( H ))
 Tremove
Dmean ( H )
Until all 8 cells have been explored or one object has been loaded
22
MAPLD2005/P249
MIL
ACS Object Drop-off Algorithm
1.
2.
Label 8-cell neighborhood as “unexplored”
Repeat
1.
Consider the next unexplored cell c around anti with the following order: cell 1is
NW, cell 2 is N, cell 3 is NE, … N is the direction the ant is facing.
1.
2.
If c is empty Then drop O in cell with a probability Pdrop, Else
If c contains a single object O’, Then drop O to create a heap H provided that:
D(O, O ' )
 Tcreate
Dmax
3.
Else
If c contains a heap H, Then drop O on H provided that:
D(O, Ocenter ( H ))  D(Odissim( H ), Ocenter ( H ))
2.
3.
Isaacs
Label c as “explored”
Until all 8 cells have been explored or carried object has been dropped
23
MAPLD2005/P249
Parameter Table
Parameter
Isaacs
MIL
Role
Value (or Range)
Speed
Distance ant can travel in one time step
[1,10]
Pdirection
Probability to move in same direction
[0.5,1]
Maxcarry
Maximum object carry time
[20,200]
Pload
Probability to pick-up an object
[0.4,0.8]
Pdestroy
Probability to destroy an heap of 2
objects
Pdrop
Probability to drop an object
[0.4,0.8]
Tremove
Minimum dissimilarity for removing
an object from a heap
[0.1,0.2]
Tcreate
Maximum dissimilarity permitted for
creating a heap of two objects
[0.05,0.2]
24
[0,0.6]
MAPLD2005/P249
MIL
K-means Algorithm
1. Take as input the partition P of the data set found
by the ants in the form of k heaps: Hi,…,Hk
2. Repeat
1. Compute Ocenter(Hi),…, Ocenter(Hk)
2. Remove all objects from heaps,
3. For each object Oi E:
1. Let Hi, j [1, k] be the heap whose center is the closest to Oi,
2. Assign Oi to Hj,
4. Compute the resulting new partition P = H1,…,Hk’ by
removing all empty clusters,
3. Until stopping criterion
Isaacs
25
MAPLD2005/P249
MIL
Benchmark Databases
The following public domain data sets were obtained
from the UCI (University of California at Irvine) Machine Learning Repository. These have been used
extensively for classification tasks using different
paradigms. The main characteristics of each of these
domains are described in the three slides.
Isaacs
26
MAPLD2005/P249
MIL
Tested Databases
 Golf
 Very simple database, 4 attributes, 2 classes
 Balloons
 The influence of prior knowledge on concept acquisition, 4 data sets, 4
attributes, 2 classes
 Wine
 Well behaved class structure, 178 instances, 13 attributes, 3 classes
 Hepatitis
 Poorly distributed database, 155 instances, 19 attributes, 2 classes
 Iris (plant)
 Very popular database, 150 instances, 4 attributes, 3 classes.
 Wisconsin Breast Cancer
 High dimensional database, 198 instances, 32 attributes, 2 classes
Isaacs
27
MAPLD2005/P249
MIL
Golf Data Results
Outlook Temperature Humidity
Windy
Decision
sunny continuous continuous true/false play/don't play
overcast
rain
sunny
sunny
overcast
rain
rain
rain
overcast
sunny
sunny
rain
sunny
overcast
overcast
rain
85
80
83
70
68
65
64
72
69
75
75
72
81
71
85
90
78
96
80
70
65
95
70
80
70
90
75
80
FALSE
TRUE
FALSE
FALSE
FALSE
TRUE
TRUE
FALSE
FALSE
FALSE
TRUE
TRUE
FALSE
TRUE
Given Data
Isaacs
Don’t' Play
Don’t' Play
Play
Play
Play
Don’t' Play
Play
Don’t' Play
Play
Play
Play
Play
Play
Don’t' Play
1
1
2
3
3
3
2
1
1
3
1
2
2
3
85
80
83
70
68
65
64
72
69
75
75
72
81
71
85
90
78
96
80
70
65
95
70
80
70
90
75
80
0
1
0
0
0
1
1
0
0
0
1
1
0
1
0
0
1
1
1
0
1
0
1
1
1
1
1
0
Numerical Equivalent
28
0.3333
0.3333
0.6667
1
1
1
0.6667
0.3333
0.3333
1
0.3333
0.6667
0.6667
1
1
0.9412
0.9765
0.8235
0.8
0.7647
0.7529
0.8471
0.8118
0.8824
0.8824
0.8471
0.9529
0.8353
0.8854
0.9375
0.8125
1
0.8333
0.7292
0.6771
0.9896
0.7292
0.8333
0.7292
0.9375
0.7813
0.8333
0
1
0
0
0
1
1
0
0
0
1
1
0
1
0
0
1
1
1
0
1
0
1
1
1
1
1
0
Normalized
MAPLD2005/P249
MIL
Golf Data Results
Number in Cluster
Don’t Play
Play
Don’t Play
ERROR
4
3
1
0
1
0
0
0
1
0
0
0
0
1
0
0
1
8
4
3
0
0
1
1
1
0
1
0
1
1
0
1
1
0
2
2
4
1
0
0
0
0
0
0
1
0
0
0
0
0
0
Objects (1-14)
Position of Cluster
Isaacs
29
MAPLD2005/P249
MIL
Golf Data Results
Number in Cluster
No Errors
Play
9
3
3
0
0
1
1
1
0
1
0
1
1
1
1
1
0
Don’t Play
3
4
3
0
1
0
0
0
1
0
0
0
0
0
0
0
1
Don’t Play
2
5
5
1
0
0
0
0
0
0
1
0
0
0
0
0
0
Objects (1-14)
Position of Cluster
Isaacs
30
MAPLD2005/P249
MIL
Wine Database
Data is the results of a chemical analysis of wines grown in the same region
in Italy but derived from three different cultivars.
The attributes are
1) Alcohol
2) Malic acid
3) Ash
4) Alcalinity of ash
5) Magnesium
6) Total phenols
7) Flavanoids
8) Nonflavanoid phenols
9) Proanthocyanins
10)Color intensity
11)Hue
12)OD280/OD315 of diluted wines
13)Proline
Isaacs
Number of Instances
class 1 59
class 2 71
class 3 48
Error: 0.050562
5 class 1 mislabeled as class 2
3 class 2 mislabeled as class 3
1 class 3 mislabeled as class 2
31
MAPLD2005/P249
MIL
Iris (Plant) Database
This is perhaps the best known database to be found in the pattern recognition literature.
Attribute Information:
1. sepal length in cm
2. sepal width in cm
3. petal length in cm
4. petal width in cm
Errors:4 mislabeled as #2
Errors:3 mislabeled as #3
0.046667
Number of Instances: 150 (50 in each of
three classes)
-- Iris Setosa
-- Iris Versicolour
-- Iris Virginica
Errors:2 mislabeled as #3
Errors:4 mislabeled as #2
0.04
Errors: 0.047
4 mislabeled as type 2
3 mislabeled as type 3
Isaacs
Errors: 0.04
2 mislabeled as type 3
4 mislabeled as type 2
32
MAPLD2005/P249
MIL
ACS DM: Optimization of Parameters
 Number of Total Iterations
 Compression Method (PCA, Wavelet, None)
 Cluster Method
 Objects Only
 Objects and Groups of Objects
 Objects, Groups, then Objects again
 Number of Ants
 K-Means Iterations
 Distance Measure (Euclidean, Minkowski, Hamming, or
Mahalanobis)
 Others (RNG, Ants Movement Distance, Ant Carrying
Capacity)
Isaacs
33
MAPLD2005/P249
MIL
ACS DM: Object Grouping Only
Database
WPBC
WPBC
WPBC
WPBC
WPBC
WPBC
Wine
Wine
Wine
Wine
Wine
Wine
Hepatitis
Hepatitis
Hepatitis
Hepatitis
Hepatitis
Hepatitis
Iris
Iris
Iris
Iris
Iris
Iris
Isaacs
Compression
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
Vector
Length
33
12
9
33
12
9
13
7
7
13
7
7
19
8
13
19
8
13
4
4
1
4
4
1
Hybrid
Number of
Number of
Iterations Ant Count Max_ACS Max_K-Means Data Points Resultant Groups
50
80
30
15
198
34
50
80
30
15
198
29
50
80
30
15
198
47
100
80
30
15
198
32
100
80
30
15
198
18
100
80
30
15
198
34
50
80
30
15
178
36
50
80
30
15
178
25
50
80
30
15
178
33
100
80
30
15
178
20
100
80
30
15
178
22
100
80
30
15
178
23
50
80
30
15
80
22
50
80
30
15
80
18
50
80
30
15
80
24
100
80
30
15
80
18
100
80
30
15
80
16
100
80
30
15
80
14
50
80
30
15
150
5
50
80
30
15
150
14
50
80
30
15
150
6
100
80
30
15
150
5
100
80
30
15
150
14
100
80
30
15
150
6
34
Error
1.6854
7.3034
1.6854
2.809
7.3034
1.6854
8.75
13.75
11.25
8.75
16.25
13.75
4
3.3333
6
4
3.3333
6
Num ADM
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
MAPLD2005/P249
MIL
ACS DM: Object and Cluster Grouping Only
Database
WPBC
WPBC
WPBC
WPBC
WPBC
WPBC
Wine
Wine
Wine
Wine
Wine
Wine
Hepatitis
Hepatitis
Hepatitis
Hepatitis
Hepatitis
Hepatitis
Iris
Iris
Iris
Iris
Iris
Iris
Isaacs
Compression
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
Vector
Length
33
12
9
33
12
9
13
7
7
13
7
7
19
8
13
19
8
13
4
4
1
4
4
1
Hybrid
Number of
Number of
Iterations Ant Count Max_ACS Max_K-Means Data Points Resultant Groups
50
80
30
15
198
21
50
80
30
15
198
18
50
80
30
15
198
43
100
80
30
15
198
13
100
80
30
15
198
16
100
80
30
15
198
29
50
80
30
15
178
33
50
80
30
15
178
21
50
80
30
15
178
30
100
80
30
15
178
19
100
80
30
15
178
14
100
80
30
15
178
21
50
80
30
15
80
22
50
80
30
15
80
17
50
80
30
15
80
24
100
80
30
15
80
18
100
80
30
15
80
11
100
80
30
15
80
14
50
80
30
15
150
4
50
80
30
15
150
4
50
80
30
15
150
3
100
80
30
15
150
3
100
80
30
15
150
6
100
80
30
15
150
5
35
Error
1.6854
6.7416
1.6854
2.809
8.9888
1.6854
8.75
13.75
11.25
8.75
16.25
13.75
13.333
13.333
3.333
4
8
6
Num ADM
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
MAPLD2005/P249
MIL
ACS DM: Object, Cluster, and Object
Database
WPBC
WPBC
WPBC
WPBC
WPBC
WPBC
Wine
Wine
Wine
Wine
Wine
Wine
Hepatitis
Hepatitis
Hepatitis
Hepatitis
Hepatitis
Hepatitis
Iris
Iris
Iris
Iris
Iris
Iris
Isaacs
Compression
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
None
Wavelet DB3
PCA
Vector
Length
33
12
9
33
12
9
13
7
7
13
7
7
19
8
13
19
8
13
4
4
1
4
4
1
Hybrid
Number of
Number of
Iterations Ant Count Max_ACS Max_K-Means Data Points Resultant Groups
50
80
30
15
198
19
50
80
30
15
198
18
50
80
30
15
198
38
100
80
30
15
198
12
100
80
30
15
198
14
100
80
30
15
198
27
50
80
30
15
178
31
50
80
30
15
178
21
50
80
30
15
178
29
100
80
30
15
178
7
100
80
30
15
178
11
100
80
30
15
178
21
50
80
30
15
80
22
50
80
30
15
80
16
50
80
30
15
80
22
100
80
30
15
80
16
100
80
30
15
80
8
100
80
30
15
80
11
50
80
30
15
150
3
50
80
30
15
150
4
50
80
30
15
150
3
100
80
30
15
150
3
100
80
30
15
150
6
100
80
30
15
150
5
36
Error
1.6854
6.7416
1.6854
2.2472
6.7416
2.2472
8.75
12.5
11.25
10
16.25
16.25
4
13.333
3.3333
4
8
6
Num ADM
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
MAPLD2005/P249
MIL
Why Move to Hardware?
 For such large datasets the ACS classifier perform
remarkably well. However,
 Speed of classification is very limited in software.
 The computational bottlenecks lay in the number of multiply
and adds that must be performed for each object. In addition,
the requirement of a square root for each distance measurement
adds complexity.
Isaacs
37
MAPLD2005/P249
MIL
Target Hardware:
Avnet’s Virtex II Pro Board
 Uses Virtex II Pro XC2VP20
 Many Options for I/O.
 32 Bit PCI Bus has Data Throughput of Over 100 MB per Second.
Isaacs
38
MAPLD2005/P249
MIL
ACS-DM System Top-Level HW
K-Means
ACS
Cellular
Automata
Random Number
Generator
Ant Colony
Actions and Data
Module
Data
Comparison
Module
Database Actions and
Information Module
Isaacs
Data Actions
and
Information
39
MAPLD2005/P249
MIL
ACSDM Hardware Design
Isaacs
40
MAPLD2005/P249
MIL
K-Means Distance Calculator with
CORDIC Square Root
Isaacs
41
MAPLD2005/P249
MIL
Device Utilization Summary
Isaacs

Selected Device : 2vp20ff896-6




















Number of Slices:
6600 out of 9280 71%
Number of Slice Flip Flops:
8312 out of 18560 44%
Number of 4 input LUTs:
7661 out of 18560 41%
Number of bonded IOBs:
266 out of 556 48%
Number of BRAMs:
3 out of 88 3%
Number of MULT18X18s:
8 out of 88 9%
Number of GCLKs:
1 out of 16 6%
=========================================================================
TIMING REPORT
Clock Information:
-----------------------------------+------------------------+-------+
CORDIC Sqrt data
Clock Signal
| Clock buffer(FF name) | Load |
path is greatest
-----------------------------------+------------------------+-------+
bottleneck causing
clk
| BUFGP
| 1419 |
-----------------------------------+------------------------+-------+
high period
Timing Summary:
Minimum period: 16.499ns (Maximum Frequency: 60.611MHz)
Minimum input arrival time before clock: 4.491ns
Maximum output required time after clock: 6.087ns
Maximum combinational path delay: 5.102ns
42
MAPLD2005/P249
MIL
Hardware Euclidean Distance Result
V1
Isaacs
0.49655
0.89977
0.82163
0.64491
0.81797
0.66023
0.34197
0.28973
0.34119
0.53408
0.72711
0.30929
0.8385
0.56807
0.37041
0.70274
0.54657
0.44488
0.69457
0.62131
V2
0.83812
0.01964
0.68128
0.37948
0.8318
0.50281
0.70947
0.42889
0.30462
0.18965
0.19343
0.68222
0.30276
0.54167
0.15087
0.6979
0.37837
0.86001
0.85366
0.59356
Dist 
N
 V1
n 1



 V 2n
Result from Matlab = 1.5058
Result from Hardware = 1.5172
Vectors are Fix 8_7 on input





n
2
Then after add: Fix 9_7
Then after multi: Fix 18_14
Then after accum: Fix 20_14
Then after CORDIC Sqrt: Fix 42_36
Error is present in round-off and Cordic
Sqrt
43
MAPLD2005/P249
MIL
Ant Colony Actions: Movement
CARNG is a simple 32-bit rule 30 that is
user initialized for reproducibility
RNG Ant(N)
Isaacs
Ant Move-Direction Filter
RNG Ant(1)
RNG Ant(2)
Current Location Data
Ant Change
Location
Ant Colony
Data
Current Location
Last Location
Have Data Status
New Location Data
44
MAPLD2005/P249
MIL
Pheromone Trail Result from
Hardware Co-simlulation
A single ant is
simulated for clarity
and the Darker Red
is most recent
position
Isaacs
45
MAPLD2005/P249
Ant Colony Actions:
Object Load/Drop
MIL
Were Probabilities and Thresholds Met?
Enable Drop/Load Y/N
Current Location
Carried Status
Ant Change
Have Data
Status
Object
Information
Current Location
Carried Status
Current Have Data Status
Ant Colony
Data
Current Location
Last Location
Have Data Status
New Have Data Status
Isaacs
46
MAPLD2005/P249
MIL
ACS DM Hardware: Storage Requirements

Preprocessed Data (Number of Objects * Vector Length, 8- to 32-bit fixedpoint)
 Object Vectors
 Object Locations
 Object Status

Parameter Values (16 32-bit fixed-point)
 Probabilities
 Thresholds
 Limits


Max Distance (1 32-bit fixed-point)
Groups (Number of Objects * Number of Groups, 1-bit and 3*Number of
Groups 8-bit)
 Members
 Means (Object Vector Length * 32-bit fixed-point)
 Locations

Isaacs
Ant Locations and Have-Object Status (Number of Ants * 8-bit, plus 1-bit
status)
47
MAPLD2005/P249
MIL
Isaacs
48
MAPLD2005/P249
MIL
PCI Bridge
Isaacs
49
MAPLD2005/P249
MIL
Block Diagram
• Virtex-II Pro is focal point.
• Spartan acts as bridge to PCI
• On Board Memory
• 32 MB SDRAM
• 2 MB SRAM
• 16 MB FLASH
• 128 MB DDR SDRAM
• 64 MB Compact Flash
• Ethernet
• RS232
• 4 AvBus Connectors
• 2 PMC Connectors
Isaacs
50
MAPLD2005/P249
Conclusions/Future Work
MIL
 Continue to design the ACS Data Mining System
 Implement an improved Memory Manager
 Correct Errors associated with Round-off and the CORDIC Sqrt.
 Implement the Group Clustering Algorithm
 Optimize the PC/FPGA interfacing to create our own low-cost integrated
system.
 Our problems currently reside on the PCI interface design shipped with the Avnet
Development Board. We are working hard to resolve this issue, but in the end we
may have to consider another board. Also shown in presentation P248.
 We also need to improve the speed. 60Mhz is too slow.
 Optimize data through put and calculating efficiency of the distance metric
algorithm, i.e., consider a multi-stage pipeline or employ the use of more lookup tables.
 The ultimate goal is to demonstrate the ability of ACS algorithms to perform
as well as other well-know techniques allowing for computational speed-up
utilizing FPGAs as co-processors.
Isaacs
51
MAPLD2005/P249