Von Neumann to Bloom: Useful algorithmic techniques
with vector spaces.
W P Cockshott
Topics
• Vector spaces and image representation
• Vector spaces and Bloom filtering
• Vector spaces in economics
• Metrics used when comparing image patches
• Vector spaces and Noether’s principle - what space should we use when modelling a system?
Vector spaces and image representation
• use of basis spaces
– Cosine Transform, Wavelet Decomposition
• use of approximation vectors
– Shannon’s coding theorem
• clustering, Voronoi regions, analytic decomposition, hierarchical decomposition
Patches are vectors
A patch on an image can be unrolled to make a vector. Thus local areas of images can be treated as vectors.
In this example we have an 8 x 6 patch, which amounts to a 48-dimensional vector.
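A minimal sketch of the unrolling, assuming NumPy and a made-up 8 x 6 patch (6 rows of 8 pixels) whose pixel values are simply 0 to 47; the names `patch`, `v` and `restored` are illustrative only:

```python
import numpy as np

# Hypothetical 8 x 6 patch: 6 rows of 8 pixels, filled with 0..47 for illustration.
patch = np.arange(48, dtype=float).reshape(6, 8)

# Unroll the patch row by row into a single 48-dimensional vector.
v = patch.reshape(-1)

# The unrolling loses nothing as long as the patch shape is remembered.
restored = v.reshape(patch.shape)

print(v.shape)
```

Row-by-row order is a free choice here; any fixed ordering of the pixels would serve, provided every patch is unrolled the same way.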
use of basis spaces
• Systems of image representation such as
DCT or wavelets work by changing the basis
of the vector space to one that is more
amenable to certain operations such as
compression.
• The key to these transforms is the concept
of an orthogonal basis or coordinate
system in which areas of a picture can be
represented.
Orthogonal basis
• An orthogonal pair of vectors is a pair of vectors at right angles.
• For example, the vectors along the X and Y axes, [1,0] and [0,1], are at right angles and thus orthogonal.
• Similarly, the 45-degree vectors [1,1] and [1,-1] are at right angles and thus orthogonal.
[Figure: the vectors [1,0], [0,1], [1,1] and [1,-1]]
Vector Lengths
The length |v| of a vector v = [vx, vy] is given by |v|^2 = (vx)^2 + (vy)^2
Orthonormal vectors
If in addition to being at right angles the vectors are of length 1, then they are orthonormal.
For example the pairs of vectors ([0,1],[1,0]) and ([1/√2, 1/√2],[1/√2, −1/√2]) are both orthonormal pairs.
[Figure: the unit vectors [0,1] and [1,0]]
Inner product operations
• We can measure how far a given vector v extends along an axis provided by one of the basis vectors by using the inner product operator.
• Let v be defined in the x,y basis and let p,q be an alternative orthonormal basis. The point v in terms of p,q is given by the vector [p.v, q.v]
[Figure: the axes x, y and the alternative basis vectors p, q]
Extension to n dimensions
Transforming to and from a Basis
• Given v of dimension n and orthonormal basis vectors b1,b2,..,bn we can express v in this new basis as w = v [b1,b2,..,bn] = [v.b1, v.b2, .., v.bn] without loss of information
• Thus we can change the basis by multiplying by a matrix of the new basis vectors
• We can then reconstruct v from w as follows: v = w1 b1 + w2 b2 + .. + wn bn
This last stage involves forming v by a superposition of the basis vectors
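The change of basis and the reconstruction can be sketched as follows, assuming NumPy and a matrix `B` whose columns are the orthonormal basis vectors. The helper names `to_basis` and `from_basis` and the sample vector are illustrative, not taken from the slides:

```python
import numpy as np

def to_basis(v, B):
    """Coordinates of v in the orthonormal basis formed by the columns of B."""
    return v @ B              # w_i = v . b_i

def from_basis(w, B):
    """Rebuild v as the superposition w_1 b_1 + ... + w_n b_n."""
    return B @ w

a = 1 / np.sqrt(2)
B = np.array([[a,  a],
              [a, -a]])       # columns are the basis vectors [a, a] and [a, -a]

v = np.array([3.0, 1.0])
w = to_basis(v, B)            # [v.b1, v.b2]
v_back = from_basis(w, B)     # recovers v

print(w, v_back)
```

The round trip is exact (up to rounding) only because the basis is orthonormal; for a general basis the reconstruction would need the inverse of `B` rather than a plain superposition.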
Linear transform
• These transforms are rotation operations in an appropriate vector space.
• Consider the effect of a rotation by 45° on the unit vectors x = [1, 0], y = [0, 1]. The result is that
• x → [a, b] = [1/√2, 1/√2]
and
• y → [−a, b] = [−1/√2, 1/√2].
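As a small check of the numbers above, here is a sketch of a 2-D anticlockwise rotation applied to the two unit vectors. The function name `rotate` is an illustrative choice:

```python
import math

def rotate(v, degrees):
    """Rotate the 2-D vector v anticlockwise by the given angle."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    x, y = v
    return (c * x - s * y, s * x + c * y)

x_rot = rotate((1.0, 0.0), 45)   # approximately (1/sqrt(2), 1/sqrt(2))
y_rot = rotate((0.0, 1.0), 45)   # approximately (-1/sqrt(2), 1/sqrt(2))

print(x_rot, y_rot)
```

The rotated pair is still orthonormal, which is why a rotation can serve as a change of basis.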
Reasons for doing this
• We typically want to transform image vectors
onto a different basis space in which the
image energy is unevenly spread.
• If we take the basis formed by cosine waves
of different frequencies for example, we find
that the bases representing high frequencies
tend to have little energy and can be
discarded.
• This is the basis of the MPEG system used
in digital TV
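The uneven spread of energy can be illustrated with a one-dimensional cosine basis. This is only a sketch: it assumes NumPy, builds the orthonormal DCT-II basis directly, and uses a made-up smooth row of 8 pixel values. It is not the transform pipeline of any particular codec.

```python
import numpy as np

def dct_basis(n):
    """Rows are the n orthonormal DCT-II basis vectors (cosines of rising frequency)."""
    k = np.arange(n).reshape(-1, 1)      # frequency index
    i = np.arange(n).reshape(1, -1)      # sample index
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)              # rescale the constant vector to unit length
    return C

n = 8
C = dct_basis(n)
signal = np.linspace(10.0, 20.0, n)      # hypothetical smooth row of pixel values
coeffs = C @ signal                      # coordinates in the cosine basis

kept = coeffs.copy()
kept[4:] = 0.0                           # discard the four highest frequencies
approx = C.T @ kept                      # reconstruct from the remaining four

# Fraction of the total energy carried by the four lowest frequencies.
energy_low = np.sum(coeffs[:4] ** 2) / np.sum(coeffs ** 2)
print(energy_low, np.max(np.abs(approx - signal)))
```

For smooth data nearly all the energy sits in the low-frequency coordinates, so dropping the rest changes the reconstruction only slightly.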
Metrics used when comparing image patches
• Suppose we have a binocular pair of
cameras and we want to compute the
disparity field.
• We can take a patch on the left image and
search for a patch on the right image that
looks most similar to it.
• The inner product operator turns out to be
useful for detecting such similarity in
appearance, provided that we do two things
first.
Compensate for brightness and contrast
• Original images may differ in brightness and contrast
• First apply a difference of Gaussians filter to remove local mean brightness
• Then take local patches and normalise them as vectors (set |v|=1)
• Then compare patches using the inner product
Inner product and unit hyper-sphere
• By normalising our
vectors we cause them
to lie on the unit hypersphere
• The inner product then
computes the cosine of
the angle between
vectors.
• This turns out to be a
good visual similarity
metric.
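A minimal sketch of this similarity measure, assuming NumPy. Subtracting the patch mean is used here as a simplified stand-in for the difference of Gaussians step, and the patches are random test data rather than real images:

```python
import numpy as np

def normalise(patch):
    """Unroll a patch, remove its mean and scale it to unit length."""
    v = np.asarray(patch, dtype=float).reshape(-1)
    v = v - v.mean()                 # simplified stand-in for the difference of Gaussians step
    return v / np.linalg.norm(v)     # assumes the patch is not perfectly flat

def similarity(a, b):
    """Inner product of two normalised patches: the cosine of the angle between them."""
    return float(normalise(a) @ normalise(b))

rng = np.random.default_rng(0)       # hypothetical test data
left = rng.random((6, 8))
right = 1.5 * left + 0.2             # same patch with altered contrast and brightness
other = rng.random((6, 8))           # an unrelated patch

print(similarity(left, right), similarity(left, other))
```

The first score is 1 up to rounding, because a change of contrast and brightness disappears after mean removal and normalisation; the unrelated patch scores much lower.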
Vector spaces and Bloom filtering – an IR application
• The original use of Bloom filtering in Content
Addressable Filestore
• Recast this into vector space representation
• Compare with Dominic Widdows’ technique
of random projections
CAFS was an intelligent disk controller from ICL
• It could do content access of the data on disk, and using a clever vector space technique invented by Bloom it could do relational join.
• It had hardware match units that could recognise patterns on a track.
• It also had a number of bitmap RAMs for doing filtering.
[Figure: CAFS board; 3.5’’ disk]
Performing relational join
Department Table
DepartmentID   DepartmentName
31             Sales
33             Engineering
34             Clerical
35             Marketing
Employee Table
LastName    DepartmentID
Rafferty    31
Jones       33
Steinberg   33
Robinson    34
Smith       34
Jasper      36
Result table
LastName    DepartmentID   DepartmentName
Smith       34             Clerical
Jones       33             Engineering
Robinson    34             Clerical
Steinberg   33             Engineering
Rafferty    31             Sales
Filter buffers
Suppose we had 8 filter buffers on board, each in the form of a 64K bit RAM chip (this was the 1980s, remember).
• Pass through the Department table and for each department number generate 8 hash addresses, one for each buffer chip.
– Set the corresponding bit in each buffer
• Pass through the Employee table, and for each department number generate 8 hash addresses, one for each buffer chip.
– If all 8 bits are set then pull out the employee as a joining record.
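The two passes can be sketched in software as below. This is an illustration only: the CAFS hardware hashing is not described in the slides, so the use of SHA-256 to derive the 8 addresses is an assumption, as are the function names.

```python
import hashlib

NUM_BUFFERS = 8
BUFFER_BITS = 64 * 1024          # 64K bits per buffer

def hash_addresses(key):
    """One bit address per buffer, cut from a SHA-256 digest of the key.
    The choice of hash is an assumption made for illustration."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return [int.from_bytes(digest[2 * i:2 * i + 2], "big") % BUFFER_BITS
            for i in range(NUM_BUFFERS)]

def build_filter(keys):
    """First pass: set one bit in each buffer for every key."""
    buffers = [bytearray(BUFFER_BITS // 8) for _ in range(NUM_BUFFERS)]
    for key in keys:
        for buf, addr in zip(buffers, hash_addresses(key)):
            buf[addr // 8] |= 1 << (addr % 8)
    return buffers

def probably_contains(buffers, key):
    """Second pass test: True only if all 8 addressed bits are set."""
    return all(buf[addr // 8] & (1 << (addr % 8))
               for buf, addr in zip(buffers, hash_addresses(key)))

departments = {31: "Sales", 33: "Engineering", 34: "Clerical", 35: "Marketing"}
employees = [("Rafferty", 31), ("Jones", 33), ("Steinberg", 33),
             ("Robinson", 34), ("Smith", 34), ("Jasper", 36)]

filters = build_filter(departments)                      # pass over Department table
candidates = [e for e in employees
              if probably_contains(filters, e[1])]       # pass over Employee table
print(candidates)
```

Jasper (department 36) is filtered out because no department sets all 8 of its bits; a false match is possible in principle but extremely unlikely with so few keys.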
Analysis
• Suppose that we have 128 distinct field values occurring in the first relation.
• Then each RAM will have approx 128 bits set, and the probability of any one bit being set is 128/64K = 1/512
• The probability that a value absent from the first relation finds its bit set in all 8 RAMs is therefore (1/512)^8 = 2^-72, about 2 × 10^-22
• Thus after two passes of the data we will have found the matching entries to very high probability
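The arithmetic can be checked with exact fractions. The sketch assumes that the 8 hash addresses are independent and that no two of the 128 values collide within a RAM, so that exactly 128 bits are set in each:

```python
from fractions import Fraction

distinct_values = 128
bits_per_ram = 64 * 1024
rams = 8

# Chance that one given bit is set in one RAM.
p_bit = Fraction(distinct_values, bits_per_ram)

# Chance that a value absent from the first relation finds all 8 of its bits set.
p_false_match = p_bit ** rams

print(p_bit, float(p_false_match))
```

Under these assumptions the per-RAM probability is 1/512 and the chance of an accidental match on all 8 RAMs is 2^-72.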
Relation to vector spaces
• Consider concatenating the 8 buffers: we get a Boolean vector of length 512K; call this length m.
• In this we have set 8 random bits for each field that we have hashed; call this number k.
• Each of these operations thus creates an m element pseudo basis vector.
• Why pseudo-basis? Because the vectors created by distinct keys are almost certainly orthogonal, but not guaranteed to be.
• After a pass through the first relation we have in our buffers a superposition of these basis vectors
Iverson’s Generalised Inner Product
• Iverson included four operators to modify the action of functions.
• An operator has one or two functions as its arguments and produces a modified function as a result; the modified function then acts on the argument or arguments to the right (and maybe also to the left) of the operator expression. The operators are
– inner product (.),
– outer product (∘.),
– reduction (/) and
– scan (\).
• For example the plus dot times inner product (written +.×) is conventional matrix multiplication.
• (OR.AND) is then one possible Boolean inner product; if values are 0,1 its result is 1 exactly where the (+.×) result is non-zero.
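The idea of an operator that takes two functions and returns a new one can be imitated in Python. This is a sketch of the concept, not of APL itself; the name `inner_product` and the sample matrices are illustrative:

```python
from functools import reduce
import operator

def inner_product(f, g):
    """Return the function 'f . g': combine elementwise with g, then reduce with f."""
    def apply(A, B):
        cols = list(zip(*B))                     # columns of B
        return [[reduce(f, (g(a, b) for a, b in zip(row, col))) for col in cols]
                for row in A]
    return apply

plus_times = inner_product(operator.add, operator.mul)   # +.x : matrix multiplication
or_and = inner_product(operator.or_, operator.and_)      # OR.AND : Boolean inner product

A = [[1, 0, 1],
     [0, 1, 1]]
B = [[1, 0],
     [1, 1],
     [0, 1]]

print(plus_times(A, B))   # [[1, 1], [1, 2]]
print(or_and(A, B))       # [[1, 1], [1, 1]]
```

On 0,1 data the two results agree on which entries are zero, while `plus_times` additionally counts how many positions matched.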
Projection space
• In relational databases, the fields are typically strings, say up to 40 chars long.
• The cardinality of the set of all possible 40 character strings is vast.
• The Bloom filters project n keys from this huge set into the space spanned by pseudo basis vectors of length m, each of which has k non-zero elements.
• Since these are almost orthogonal we can superpose them using Boolean 1 bit arithmetic, and the inner product of this superposition, using Iverson’s generalised inner product, can then be used for set membership.
• Let b be our superposed filter buffer, and vk be the pseudo basis vector of key k; then using Iverson’s notation the approximate set membership test is b(×. ×)vk
• Algebraically this is just what Dominic Widdows described earlier this month, except that Bloom used Iverson’s generalised inner product and the values 0,1 rather than −1,0,1 as used by Widdows
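The vector space reading of the filter can be sketched with Boolean NumPy arrays. Several things here are assumptions made for illustration: the SHA-256 based hashing, the function names, and the formulation of the membership test as "the +.× inner product of b and the key's vector equals k".

```python
import hashlib
import numpy as np

M = 512 * 1024    # length of the concatenated buffer vector
K = 8             # non-zero elements per key

def pseudo_basis(key):
    """Boolean vector of length M with one set bit in each of K 64K-bit segments.
    The hash function is chosen for illustration only."""
    digest = hashlib.sha256(key.encode()).digest()
    v = np.zeros(M, dtype=bool)
    seg = M // K
    for i in range(K):
        offset = int.from_bytes(digest[2 * i:2 * i + 2], "big") % seg
        v[i * seg + offset] = True
    return v

def superpose(keys):
    """Superposition of the keys' vectors using 1-bit (Boolean) addition."""
    b = np.zeros(M, dtype=bool)
    for key in keys:
        b |= pseudo_basis(key)
    return b

def member(b, key):
    """Approximate membership: the inner product with the key's vector must reach K."""
    return int(np.sum(b & pseudo_basis(key))) == K

b = superpose(["Sales", "Engineering", "Clerical"])
print(member(b, "Sales"), member(b, "Marketing"))
```

Keys that were superposed always pass the test; a key that was not superposed passes only if all K of its bits happen to be set by other keys, which is very unlikely when the vectors are nearly orthogonal.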
Vector spaces in economics
• Work of von Neumann - limits to growth
• Work of Sraffa - modeling prices
• Metrics to use when comparing theories of
prices
– cosine, mean absolute deviation, correlation
Work of von Neumann
• 1932 Johann von Neumann, Mathematische Grundlagen der Quantenmechanik
• 1932 First presentation by von Neumann of the lecture eventually published as: J. von Neumann, A Model of General Economic Equilibrium, Review of Economic Studies, 1945
– The latter work brings to bear many of the techniques he develops in the former.
• 1945 "First Draft of a Report on the EDVAC" lays the grounds for the current architecture of computers, and makes large scale linear algebra calculations practical.
Starts field of matrix economics
Important results
• System of equations to determine
equilibrium prices
• Determination of profit rate
• Determination of maximal growth rate
• Turnpike theorem, composition of industrial
output required for maximal economic
growth.
Later impact of matrix economics
• Theory of prices: Piero Sraffa and the neo-Ricardian School
– empirical tests of Sraffian price theories are one of my interests
• Input output tables – Leontief
• Linear Programming – Kantorovich, Koopmans and Dantzig
– Theory of economic planning; this is a long lasting interest of mine
Leontief, Kantorovich and Koopmans got Nobel Prizes for this work.
It was argued by Kaldor, who knew von Neumann, (and more recently by Kurz and Salvadori) that the origins of von Neumann’s model should be seen in the 18th-19th century classical economists (Quesnay, Ricardo, Marx) rather than the later ‘neo-classical’ school which is now dominant.
Explanation of von Neumann model
• I will present a simple von Neumann
economic model as a Pascal program to give
an idea of how it works.
This is a very simple economy, with only 3 products, but
the principles remain the same however many products
there are.
Technology Matrix A
• The matrix A encodes the technology of the economy
• Element A[i,j] of the matrix specifies how much of the jth commodity is required to produce a unit of the ith commodity, e.g.:
        corn   coal   iron
corn    0.2    0.1    0.02
coal    0.0    0.2    0.1
iron    0.0    0.7    0.1
Labour and wage vectors
• We next introduce a labour vector L which specifies the labour needed to produce one unit of each output
        corn   coal   iron
L       0.2    0.1    0.02
• To survive, labour consumes a real wage vector
                                corn      coal      iron
real wage per unit of labour    0.50000   0.20000   0.00000
Other variables used in the example model
Iterate to stability
We then iterate to a stable state to derive the prices, profit rate
and wage. As is conventional in such models we express
prices as relative prices, taking the relative price of the first
commodity to be 1.
[Program listing not reproduced; its surviving annotations: set new prices; total labour; y = output]
Fix real wage and converge on prices and profit rate:
        n         y          x         u         s         p         r         w
corn    8.00000   10.00000   2.00000   9.00000   1.00000   1.00000   0.13742   0.60835
coal    3.20000   7.00000    3.80000   6.60000   0.40000   0.54880
iron    0.90000   2.00000    1.10000   1.10000   0.90000   0.68783
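The original Pascal program is not reproduced in this transcript, so the following is only a sketch, in Python with NumPy, of one standard way to iterate to equilibrium prices under a uniform profit rate: repeatedly cost each product at current prices and renormalise so the first price is 1. The function name and the augmented matrix M are my own formulation, and the inputs are the A, L and real wage values as printed on the slides. Whether those are exactly the inputs behind the table above is not confirmed, so the printed values should not be expected to reproduce it.

```python
import numpy as np

def equilibrium_prices(A, L, wage_bundle, iterations=500):
    """Iterate p <- M p, renormalising so that the first price is 1.
    M = A + outer(L, wage_bundle) adds workers' consumption to each industry's inputs."""
    M = A + np.outer(L, wage_bundle)
    p = np.ones(len(L))
    for _ in range(iterations):
        q = M @ p                 # unit cost of inputs plus wages at current prices
        p = q / q[0]              # relative prices, first commodity = 1
    cost = M @ p
    r = p[0] / cost[0] - 1.0      # uniform profit rate: p = (1 + r) * cost
    w = float(wage_bundle @ p)    # money wage: price of the real wage bundle
    return p, r, w

# Values as printed on the slides (rows: corn, coal, iron).
A = np.array([[0.2, 0.1, 0.02],
              [0.0, 0.2, 0.1],
              [0.0, 0.7, 0.1]])
L = np.array([0.2, 0.1, 0.02])
wage_bundle = np.array([0.5, 0.2, 0.0])

p, r, w = equilibrium_prices(A, L, wage_bundle)
print(p, r, w)
```

At convergence every industry's price equals (1 + r) times its unit cost, which is the profit rate equalisation condition; the price vector is the dominant eigenvector of M.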
What is important here
Technology and the real wage will, on the assumption of profit rate equalisation, fix:
1. All prices
2. The rate of profit
Kurz says that some of this was developed a year or two earlier by a colleague of von Neumann called Remak, who called these prices superposition prices, but Remak’s model was less general. Kurz says von Neumann was replying to Remak.
Maximal growth path
• Von Neumann was concerned with the maximal rate
of growth of the economy on the assumption that
capitalists reinvested all their profits.
• If this was to occur the mix of surplus outputs
produced had to match the inputs needed for steady
growth the next year. We can derive this by adding an
additional rule, which will cause the system to
converge on a maximal growth path.
Max Growth path
        y          u         s         r         w
corn    10.57631   9.29777   1.27854   0.13751   0.61019
coal    6.99289    6.14754   0.84535
iron    1.16904    1.02772   0.14132
Note that for all industries the ratio si/ui = r.
u is now an eigenvector of the technology:labour matrix and (1+r) is the corresponding eigenvalue.
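A sketch of how such a path can be found by power iteration, assuming NumPy. The matrix M (technology plus labour times real wage) and the function name are my own formulation, and the inputs are the values as printed on the earlier slides, so the output is not expected to reproduce the table above.

```python
import numpy as np

def max_growth_path(M, iterations=500):
    """Power iteration for output proportions y with y = (1 + g) * (y M)."""
    y = np.ones(M.shape[0])
    for _ in range(iterations):
        u = y @ M                 # inputs, including wage goods, used up by outputs y
        y = u / u.sum()           # fix the scale; only the proportions matter
    u = y @ M
    g = y[0] / u[0] - 1.0         # common growth rate
    return y, u, g

A = np.array([[0.2, 0.1, 0.02],
              [0.0, 0.2, 0.1],
              [0.0, 0.7, 0.1]])
L = np.array([0.2, 0.1, 0.02])
wage_bundle = np.array([0.5, 0.2, 0.0])
M = A + np.outer(L, wage_bundle)

y, u, g = max_growth_path(M)
s = y - u                         # surplus of each product
print(s / u, g)
```

In this formulation u is a left eigenvector of M with eigenvalue 1/(1 + g), so the ratio s/u is the same for every industry and equals g.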
Applications
Matrix economics of this sort is obviously very relevant to the theory of economic planning, and to understanding the processes going on in a rapidly industrialising economy like modern China or Russia in the 1930s.
In China today, national output is hugely directed towards the production of capital goods (around 50% of national output), which is necessary on a maximal growth path.
Meta-theory: Vector spaces and Noether’s principle
Put very loosely, Emmy Noether’s principle
says that in a physical system, a conserved
quantity at one level of abstraction
corresponds to a symmetry property of the
system at another level of abstraction.
– Translational symmetry implies conservation of
momentum
– Temporal symmetry implies conservation of
energy
Example
• Top shows points in
phase space traversed
by a projectile thrown
upward in a
gravitational field
• Bottom, points in the
space of (altitude,
velocity squared)
traversed by the particle
• Bottom diagram shows
conservation of energy
Find the symmetry
• If we transform to yet
another space we find
the path is a circle.
• Here we can see the
symmetry associated
with the conservation of
energy.
• The new representation
means that we can treat
the path as the result of
rotational symmetry in a
vector space.
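The slides do not say which space yields the circle. One choice that works, used here purely as an assumption, is the space (√(2gh), v): since v^2 + 2gh is constant for a projectile without air resistance, every point lies at the same distance from the origin. A sketch with made-up launch values:

```python
import math

g = 9.81          # gravitational acceleration, m/s^2
v0 = 20.0         # hypothetical launch speed, m/s

def state(t):
    """Altitude and velocity of the projectile at time t (no air resistance)."""
    return v0 * t - 0.5 * g * t * t, v0 - g * t

points = [state(0.1 * k) for k in range(41)]     # samples from t = 0 to t = 4 s

# In (altitude, velocity squared) space the points satisfy v^2 + 2 g h = constant.
energy_like = [v * v + 2 * g * h for h, v in points]

# Assumed choice of the other space: (sqrt(2 g h), v). The same conserved
# quantity is then the squared radius of a circle about the origin.
radii = [math.hypot(math.sqrt(max(2 * g * h, 0.0)), v) for h, v in points]

print(min(radii), max(radii))
```

Every sampled point has radius v0, so the trajectory traces an arc of a circle and the conservation law shows up as rotational symmetry.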
Another example is in quantum mechanics.
Why do we use amplitudes whose square gives
us probabilities?
• Because this vector space, which allows the
application of unitary rotation operators,
projects onto the space of probabilities which
are a scalar conserved property.
• This is the same as the change of
representation we had to illustrate
conservation of energy.
Conclusion
• Look for a change of representation that
enables you to detect symmetry in your
system.
• Vector space representations allow us to
model rotational and translational
symmetries.
• The best representation may not be the
obvious representation at first sight.