6CCS3PAL-7CCSPDA: Answers
Exercises for week 2:
1) Sum of n = 2^k numbers using n/2 processors. Input array A=[1,2,3,4,5,6,7,8]
1.1) Show each step to output
1.2) Show total time
1.3) Show total work
Answer 1:
Using algorithm P-Sum, shown on the first slide of 2.PRAM-CBtree/3.SumMatrix.pdf. An array of size n can be summed in k = log2(n) steps. For the given size
of n=8 this is k=3 steps.
1.1:
a) Copy elements of array A to an array B to retain original data.
B=[1,2,3,4,5,6,7,8]
b)
Step 1: for 1 ≤ i ≤ 8/2^1 (4 times) add up elements [2i-1] and [2i] and store the sum in B[i]
[1, 2, 3, 4, 5, 6, 7, 8]
B=[3, 7, 11, 15, …]
Step 2: for 1 ≤ i ≤ 8/2^2 (2 times) add up elements [2i-1] and [2i] and store the sum in B[i]
[3, 7, 11, 15, …]
B=[10, 26, …]
Step 3: for 1 ≤ i ≤ 8/2^3 (1 time) add up elements [2i-1] and [2i] and store the sum in B[i]
[10, 26, …]
B=[36, …]
c) Output answer at B[1] = 36
1.2:
The time for a) is constant (a single command). T(a) = 1
The time for b) is k=3 (one time unit per loop iteration). T(b) = k = log2(n)
The time for c) is constant (a single command). T(c) = 1
Total time for n elements is thus Ttotal = 1 + log2(n) + 1 = Θ(log2(n))
1.3:
The work for a) is n: the size of the copied array. W(a) = n
The work for b) is the number of additions performed (red arrow pairs).
This is determined by the line "for 1 ≤ i ≤ n/2^h" in the algorithm. Running this k
times and increasing h on every iteration, the total number of additions is n/2 + n/4 +
n/8 + … + 2 + 1 = n-1. For example, adding 3 numbers together takes 2 additions in total.
W(b) = n-1
The work for c) is constant (single, known location). W(c) = 1
Total work for n elements is thus Wtotal = n + (n-1) + 1 = 2n = Θ(n)
Estimated running time with n elements on p processors:
Tp(n) = Wtotal/p + Ttotal = Θ(n/p + log2(n))
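The steps, time and work above can be cross-checked with a short sequential simulation. This is only a sketch and not the P-Sum pseudocode from the slides: the name `p_sum` and the returned tuple are assumptions made for illustration, and each pass of the outer loop stands for one parallel step.

```python
def p_sum(a):
    """Simulate the pairwise sum on a list whose length n is a power of two.

    Returns (total, parallel_steps, additions).
    """
    n = len(a)
    k = n.bit_length() - 1            # k = log2(n) when n is a power of two
    assert 2 ** k == n, "n must be a power of two"
    b = list(a)                       # step a): copy A into B
    additions = 0
    for h in range(1, k + 1):         # step b): k parallel steps
        pairs = n // 2 ** h
        # 1-based B[i] = B[2i-1] + B[2i] is b[i] = b[2i] + b[2i+1] when 0-based.
        # The right-hand side is built completely before it is written back,
        # which models all processors reading before any of them writes.
        b[:pairs] = [b[2 * i] + b[2 * i + 1] for i in range(pairs)]
        additions += pairs
    return b[0], k, additions         # step c): the answer sits in B[1]


print(p_sum([1, 2, 3, 4, 5, 6, 7, 8]))  # (36, 3, 7)
```

The reported 7 additions match W(b) = n-1 and the 3 steps match T(b) = log2(n) for n = 8.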
Exercises for week 3:
1) Find the maximum member of array A=[3, 17, 27, 6] using a parallel EREW
solution.
1.1) Show each step to output
Answer 1:
1.1:
Using algorithm EREW Max-of-Array, shown on the fourth slide of 2.PRAM-CBtree/5.tree-algs.pdf.
The size of the given data is n=4. We construct an array A of size 2n-1 and place the
original data into the array from A[n] to A[2n-1]. Elements A[1] to A[n-1] are empty.
m = log2(n) = 2 and k = m – 1 = 1
For k = 1 down to 0 in steps of -1 (each step decreases the value of k until it reaches 0):
Step k=1) for all j such that 2^1 ≤ j ≤ 2^2 - 1, in parallel find the maximum of the pair
A[2j] and A[2j + 1] and place it into A[j].
j=2: A=[…,…,…,3,17,27,6] A[4] is 3, A[5] is 17. 17>3 thus A[2] = 17
j=3: A=[…,…,…,3,17,27,6] A[6] is 27, A[7] is 6. 27>6 thus A[3] = 27
Step k=0) for all j such that 2^0 ≤ j ≤ 2^1 - 1, in parallel find the maximum of the pair
A[2j] and A[2j + 1] and place it into A[j].
j=1: A=[…,17,27,3,17,27,6] A[2] is 17 and A[3] is 27. 27>17 thus A[1] = 27
End of array
Output A[1] = 27
2) Find the maximum member of array A =[10, 9, 13, 62, 6, 3, 1, 14] using a
parallel EREW solution.
2.1) Show each step to output
2.2) Show total time
2.3) Show total work
Answer 2:
2.1:
Using the same algorithm as above, an array can be presented as a tree with the root at
A[1] and the leaves in A[2^l] to A[2^(l+1) - 1], where l is the last level of the tree, counting
from 0 at the root.
n=8
m = log2(8) = 3
k=3-1=2
A =[…, 10, 9, 13, 62, 6, 3, 1, 14] can be presented as a tree with the input values in the leaves A[8] to A[15].
Step k = 2) for all j such that 2^2 ≤ j ≤ 2^3 - 1, in parallel find the maximum of the pair
A[2j] and A[2j + 1] and place it into A[j].
A[4] = 10, A[5] = 62, A[6] = 6, A[7] = 14
Step k = 1) for all j such that 2^1 ≤ j ≤ 2^2 - 1, in parallel find the maximum of the pair
A[2j] and A[2j + 1] and place it into A[j].
A[2] = 62, A[3] = 14
Step k = 0) for all j such that 2^0 ≤ j ≤ 2^1 - 1, in parallel find the maximum of the pair
A[2j] and A[2j + 1] and place it into A[j].
A[1] = 62
Output A[1] = 62
2.2:
Total time: the number of parallel pairwise comparison steps = Θ(log2(n))
2.3:
Total work: the size of the original array of elements = Θ(n)
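The tree walk used in Answers 1 and 2 can be replayed with the sketch below. It is an illustrative simulation, not the slide's EREW Max-of-Array pseudocode: the name `erew_max` and the returned list of levels are assumptions. The array layout (leaves in A[n] to A[2n-1], root in A[1]) follows the answers above.

```python
def erew_max(values):
    """Tree maximum over n = 2^m values, using the 1-based layout of the answers."""
    n = len(values)
    m = n.bit_length() - 1
    assert 2 ** m == n, "n must be a power of two"
    # a[0] is unused, a[1..n-1] start empty, the input sits in a[n..2n-1].
    a = [None] * n + list(values)
    levels = []
    for k in range(m - 1, -1, -1):               # k = m-1 down to 0
        js = range(2 ** k, 2 ** (k + 1))         # 2^k <= j <= 2^(k+1) - 1
        # Every j reads its own two children only, so reads are exclusive.
        new = {j: max(a[2 * j], a[2 * j + 1]) for j in js}
        for j, v in new.items():
            a[j] = v
        levels.append([a[j] for j in js])
    return a[1], levels


print(erew_max([3, 17, 27, 6]))
# (27, [[17, 27], [27]])
print(erew_max([10, 9, 13, 62, 6, 3, 1, 14]))
# (62, [[10, 62, 6, 14], [62, 14], [62]])
```

The number of entries in `levels` is the number of parallel steps, log2(n), and the total number of comparisons is n-1.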
3) Find the maximum member at the smallest index of array A=[16, 112, 8, 112,
112] using a CRCW parallel solution
Answer 3:
Using the algorithm CRCW Max-of-Array on slide 5 of 2.PRAM-CBtree/5.tree-algs.pdf.
As the algorithm compares each element with all the others, it can be presented graphically
as a complete graph.
1) Initialise array M with 0s, indicating that every member could potentially be the
largest value.
2) For all ordered pairs, compare their values in parallel and assign M = 1 to the smaller
member of each pair. This leaves M = [1, 0, 1, 0, 0].
3) In parallel, compare the indices of all members that have an M value
of 0. Assign 1 to the members that have a larger index. This leaves M = [1, 0, 1, 1, 1].
4) Output the member with the M value of 0: index 2 with value 112
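The four steps above can be simulated sequentially as follows. This is a sketch under the assumption that one virtual processor handles each ordered pair; the name `crcw_max_smallest_index` is made up for illustration and does not come from the slides. Concurrent writes are harmless here because every writer stores the same value 1.

```python
def crcw_max_smallest_index(a):
    """Return (1-based index, value) of the maximum at the smallest index."""
    n = len(a)
    m = [0] * n                                   # step 1: everyone is a candidate
    # Step 2: one virtual processor per ordered pair (i, j).
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                m[i] = 1                          # the smaller member is knocked out
    # Step 3: among the remaining candidates, knock out the larger index.
    survivors = [i for i in range(n) if m[i] == 0]
    for i in survivors:
        for j in survivors:
            if i > j:
                m[i] = 1
    winner = m.index(0)                           # step 4: the single 0 left in M
    return winner + 1, a[winner]


print(crcw_max_smallest_index([16, 112, 8, 112, 112]))  # (2, 112)
```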
4) Find if value x=5 is in array A=[2, -1, 5, 7, 33, 0, 5, 5] using an EREW parallel
solution.
4.1) What problems occur for this solution?
4.2) What three algorithms would a complete solution use to find the
smallest index of a value X in an array A?
Answer 4:
Using the algorithm EREW Is-X-In-Array from slide 7 of 2.PRAM-CBtree/5.tree-algs.pdf
Temp = [5, 5, 5, 5, 5, 5, 5, 5]
A = [2, -1, 5, 7, 33, 0, 5, 5]
In parallel, for each index i = 1…n, compare Temp[i] with A[i]: if the elements
are the same, assign value i to Temp[i], else assign value ∞.
After comparison Temp = [∞,∞,3,∞,∞,∞,7,8]
4.1) Two problems occur: initialising the array Temp with copies of x (under EREW the
processors may not all read x at the same time), and retrieving the final answer from Temp
after the comparison.
4.2) To resolve the two issues the final solution uses 3 consecutive algorithms
(from the slides):
a) EREW Broadcast of query item X to array Temp[1..n]. This initialises the
Temp array.
b) EREW Is-X-In-List for array L[1..n]. This marks the Temp entries where array
L is equal to X.
c) EREW Min-Binary-fan-in with array Temp[1..n]. This returns the smallest
marked Temp entry.
5) Find the minimum index of a solution for exercise 4) using a binary fan-in
solution.
Answer 5:
a) Using algorithm EREW Broadcast on slide 8 from 2.PRAM-CBtree/5.tree-algs.pdf
x=5
size of array = 8
k = log2(size) = 3
Temp[1] =5
Temp = [5]
For i = 0…2
  For 2^i + 1 ≤ j ≤ 2^(i+1)
    Temp[j] = Temp[j - 2^i]
i=0
2≤j≤2
Temp[2] = Temp[1]
Temp = [5, 5]
i=1
3≤j≤4
Temp[3] = Temp[1]
Temp[4] = Temp[2]
Temp = [5, 5, 5, 5]
i=2
5≤j≤8
Temp[5] = Temp[1]
Temp[6] = Temp[2]
Temp[7] = Temp[3]
Temp[8] = Temp[4]
Temp = [5, 5, 5, 5, 5, 5, 5, 5]
b) Using the algorithm EREW Is-X-In-Array from slide 7 of 2.PRAM-CBtree/5.tree-algs.pdf (same as Answer 4)
Temp = [5, 5, 5, 5, 5, 5, 5, 5]
A = [2, -1, 5, 7, 33, 0, 5, 5]
In parallel, for each index i = 1…n, compare Temp[i] with A[i]: if the elements
are the same, assign value i to Temp[i], else assign value ∞.
After comparison Temp = [∞,∞,3,∞,∞,∞,7,8]
c) Using the algorithm EREW Min-Binary-fan-in from slide 9 of 2.PRAM-CBtree/5.tree-algs.pdf
Temp = [∞,∞,3,∞,∞,∞,7,8]
n=8
k=3
for j = 1…3
  for all 1 ≤ i ≤ n/2^j
    Compare each pair Temp[2i-1] and Temp[2i] and assign the smaller value to Temp[i]
j=1
1≤i≤4
Temp[1] ≤ Temp[2] thus Temp[1] = Temp[1]
Temp[3] ≤ Temp[4] thus Temp[2] = Temp[3]
Temp[5] ≤ Temp[6] thus Temp[3] = Temp[5]
Temp[7] ≤ Temp[8] thus Temp[4] = Temp[7]
Temp = [∞,3,∞,7,…]
j=2
1≤i≤2
Temp[1] ≥ Temp[2] thus Temp[1] = Temp[2]
Temp[3] ≥ Temp[4] thus Temp[2] = Temp[4]
Temp = [3,7,…]
j=3
1≤i≤1
Temp[1] ≤ Temp[2] thus Temp[1] = Temp[1]
Temp = [3,…]
Output answer Temp[1] = 3. The smallest index at which x = 5 occurs in array A = [2, -1, 5, 7,
33, 0, 5, 5] is index 3.
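The three phases a), b) and c) can be chained in a short simulation. The sketch below is not the slides' pseudocode: the function names (`broadcast`, `mark_matches`, `min_fan_in`, `smallest_index`) are assumptions for illustration, and index 0 of each list is left unused so that the indices match the 1-based notation above.

```python
import math

INF = math.inf


def broadcast(x, n):
    """Phase a): fill Temp[1..n] by doubling, n a power of two."""
    temp = [None] * (n + 1)
    temp[1] = x
    for i in range(n.bit_length() - 1):               # i = 0 .. log2(n) - 1
        for j in range(2 ** i + 1, 2 ** (i + 1) + 1):  # 2^i + 1 <= j <= 2^(i+1)
            temp[j] = temp[j - 2 ** i]                 # each cell is read by one j only
    return temp


def mark_matches(temp, a):
    """Phase b): Temp[i] = i where Temp[i] equals A[i], otherwise infinity."""
    n = len(a)
    return [None] + [i if temp[i] == a[i - 1] else INF for i in range(1, n + 1)]


def min_fan_in(temp):
    """Phase c): binary fan-in minimum; the answer ends up in Temp[1]."""
    n = len(temp) - 1
    t = list(temp)
    for j in range(1, n.bit_length()):                 # j = 1 .. log2(n)
        # Ascending i is safe sequentially: t[i] is written only after
        # t[2i-1] and t[2i] (both at index >= i) have been read.
        for i in range(1, n // 2 ** j + 1):
            t[i] = min(t[2 * i - 1], t[2 * i])
    return t[1]


def smallest_index(x, a):
    return min_fan_in(mark_matches(broadcast(x, len(a)), a))


print(smallest_index(5, [2, -1, 5, 7, 33, 0, 5, 5]))  # 3
```

A result of infinity means that x does not occur in the array.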
6) Find the smallest index of occurrence x=3 in the array A=[3, 16, 3, 29, 57, 8, 3,
4] using 3 EREW parallel algorithms. Show work for all three major steps.
Answer 6:
a) Using algorithm EREW Broadcast on slide 8 from 2.PRAM-CBtree/5.tree-algs.pdf
x=3
size of array = 8
k = log2(size) = 3
Temp[1] = 3
Temp = [3]
For i = 0…2
  For 2^i + 1 ≤ j ≤ 2^(i+1)
    Temp[j] = Temp[j - 2^i]
i=0
2≤j≤2
Temp[2] = Temp[1]
Temp = [3, 3]
i=1
3≤j≤4
Temp[3] = Temp[1]
Temp[4] = Temp[2]
Temp = [3, 3, 3, 3]
i=2
5≤j≤8
Temp[5] = Temp[1]
Temp[6] = Temp[2]
Temp[7] = Temp[3]
Temp[8] = Temp[4]
Temp = [3, 3, 3, 3, 3, 3, 3, 3]
b) Using the algorithm EREW Is-X-In-Array from slide 7 of 2.PRAM-CBtree/5.tree-algs.pdf (same as Answer 4)
Temp = [3, 3, 3, 3, 3, 3, 3, 3]
A=[3, 16, 3, 29, 57, 8, 3, 4]
In parallel, for each index i = 1…n, compare Temp[i] with A[i]: if the elements
are the same, assign value i to Temp[i], else assign value ∞.
After comparison Temp = [1,∞, 3, ∞,∞,∞,7, ∞]
c) Using the algorithm EREW Min-Binary-fan-in from slide 9 of 2.PRAM-CBtree/5.tree-algs.pdf
Temp = [1,∞, 3, ∞,∞,∞,7, ∞]
n=8
k=3
for j = 1…3
  for all 1 ≤ i ≤ n/2^j
    Compare each pair Temp[2i-1] and Temp[2i] and assign the smaller value to Temp[i]
j=1
1≤i≤4
Temp[1] ≤ Temp[2] thus Temp[1] = Temp[1]
Temp[3] ≤ Temp[4] thus Temp[2] = Temp[3]
Temp[5] ≤ Temp[6] thus Temp[3] = Temp[5]
Temp[7] ≤ Temp[8] thus Temp[4] = Temp[7]
Temp = [1,3,∞,7,…]
j=2
1≤i≤2
Temp[1] ≤ Temp[2] thus Temp[1] = Temp[1]
Temp[3] ≥ Temp[4] thus Temp[2] = Temp[4]
Temp = [1,7,…]
j=3
1≤i≤1
Temp[1] ≤ Temp[2] thus Temp[1] = Temp[1]
Temp = [1,…]
Output answer Temp[1] = 1. The smallest index at which x = 3 occurs in array A=[3, 16, 3,
29, 57, 8, 3, 4] is index 1.
Exercises for week 4:
1) Find the prefix sums of array A=[3, 7, 8, 3, 9, 2, 3, 1] using an EREW binary tree
solution.
Answer 1:
Using algorithm EREW-Prefix-Sum from the last slide of 2.PRAM-CBtree/5.tree-algs.pdf
The array A can be represented as a tree with the input values at indices n…2n-1 as the leaves:
n=8
m=3
For k = m - 1 down to 0 (step -1) we assign each of the elements 1…n-1 the sum of
its children (see Answer 1 from week 2 above). This gives A[1…7] = [36, 21, 15, 10, 11, 11, 4].
We then initialise an array B such that B[1] = A[1] = 36.
We then consider the members of B at indices j = 2…2n-1. If the index is odd, we
assign the member the value of its parent at index (j-1)/2. For example, for j = 3: B[3] = B[1].
If the considered index is even, as is the case with j = 2, we assign the member
the value of its parent in array B minus the value of its sibling in array A. The
value for j = 2 is thus: B[2] = B[1] – A[3].
Calculating each level of the tree in parallel and moving away from the root we obtain
the tree with levels B[2…3] = [21, 36] and B[4…7] = [10, 21, 32, 36].
The answer is the array of members at indices n…2n-1 of array B, or the leaves of
the tree: [3, 10, 18, 21, 30, 32, 35, 36]
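The up-sweep and down-sweep described above can be checked with the following sketch. It follows the rules stated in this answer (odd index: copy the parent; even index: parent minus the sibling's sum in A) but is not the slide's pseudocode, and the name `erew_prefix_sums` is an assumption made for illustration.

```python
def erew_prefix_sums(values):
    """Inclusive prefix sums of n = 2^m values using two tree-shaped arrays."""
    n = len(values)
    m = n.bit_length() - 1
    assert 2 ** m == n, "n must be a power of two"
    a = [0] * n + list(values)              # leaves in a[n..2n-1], a[0] unused
    for k in range(m - 1, -1, -1):          # up-sweep: parents get the sum of their children
        for j in range(2 ** k, 2 ** (k + 1)):
            a[j] = a[2 * j] + a[2 * j + 1]
    b = [0] * (2 * n)
    b[1] = a[1]                             # the root holds the total sum
    for k in range(1, m + 1):               # down-sweep: one tree level per parallel step
        for j in range(2 ** k, 2 ** (k + 1)):
            if j % 2 == 1:
                b[j] = b[(j - 1) // 2]      # right child: same prefix as its parent
            else:
                b[j] = b[j // 2] - a[j + 1]  # left child: parent minus the sibling's sum
    return b[n:]


print(erew_prefix_sums([3, 7, 8, 3, 9, 2, 3, 1]))
# [3, 10, 18, 21, 30, 32, 35, 36]
```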
2) Find the prefix sums of array A=[7, 2, 3, 8, 7, 4, 6, 16] //This data is different
from the one in the original class.
Answer 2:
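Adding up the children of parents we obtain the tree with internal nodes
A[1…7] = [53, 20, 33, 9, 11, 11, 22].
Initialising array B, moving down its tree and assigning new values based on indices
we obtain the tree with B[1] = 53, B[2…3] = [20, 53] and B[4…7] = [9, 20, 31, 53].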
Output the answer of the leaves: [7, 9, 12, 20, 27, 31, 37, 53]
3) Show the structure of a linked list with the parent array P=[1, 3, 1, 2]
Answer 3:
4) Show the structure of a complete binary tree with the parent array
P=[1,1,1,2,2,3,3]
Answer 4:
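The structure encoded by a parent array, as in exercises 3 and 4, can be recovered mechanically by listing each node's children. The sketch below is an illustration only; the name `children_from_parents` is an assumption, and nodes are numbered from 1 as in the exercises.

```python
def children_from_parents(p):
    """p[k-1] is the parent of node k; a node that is its own parent is a root."""
    children = {k: [] for k in range(1, len(p) + 1)}
    roots = []
    for k, parent in enumerate(p, start=1):
        if parent == k:
            roots.append(k)
        else:
            children[parent].append(k)
    return roots, children


print(children_from_parents([1, 3, 1, 2]))
# ([1], {1: [3], 2: [4], 3: [2], 4: []})
print(children_from_parents([1, 1, 1, 2, 2, 3, 3]))
# ([1], {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []})
```

In the first case every node has at most one child, which gives a linked list; in the second every internal node has exactly two children, which gives a complete binary tree.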
Exercises for week 5:
1) Given the parent array P=[5, 1, 6, 2, 7, 4, 8, 8]:
a) show the structure of the underlying data structure
b) find the distance from the root of each member using a parallel
ranking algorithm. Show each step.
Answer 1:
a) The structure of list L starting from the root is: 8—7—5—1—2—4—6—3
b) Using algorithm List-Rank from slide 4 of 2.PRAM-CBtree/8.pointerjump.pdf
Notation:
k – element
P(k) – parent of k
PP(k) – parent of parent of k
dist – distance from root
_dist – new distance
_P(k) – new parent of k
rank – rank of element
Step 0: Initialise all dist values to 1, except where k = P(k), which has
dist=0
k   P(k)  dist  PP(k)
8   8     0     8
7   8     1     8
5   7     1     8
1   5     1     7
2   1     1     5
4   2     1     1
6   4     1     2
3   6     1     4
Step 1: Where P(k) does not equal PP(k) assign dist(k) the value
dist(k)+dist(P(k)). Assign P(k) the value of PP(k)
k   dist  P(k)  PP(k)  P(k)=PP(k)?  _dist  _P(k)
8   0     8     8      Y            0      8
7   1     8     8      Y            1      8
5   1     7     8      N            2      8
1   1     5     7      N            2      7
2   1     1     5      N            2      5
4   1     2     1      N            2      1
6   1     4     2      N            2      2
3   1     6     4      N            2      4
Step 2: Where P(k) does not equal PP(k) assign dist(k) the value
dist(k)+dist(P(k)). Assign P(k) the value of PP(k)
k   dist  P(k)  PP(k)  P(k)=PP(k)?  _dist  _P(k)
8   0     8     8      Y            0      8
7   1     8     8      Y            1      8
5   2     8     8      Y            2      8
1   2     7     8      N            3      8
2   2     5     8      N            4      8
4   2     1     7      N            4      7
6   2     2     5      N            4      5
3   2     4     1      N            4      1
Step 3: Where P(k) does not equal PP(k) assign dist(k) the value
dist(k)+dist(P(k)). Assign P(k) the value of PP(k)
k   dist  P(k)  PP(k)  P(k)=PP(k)?  _dist  _P(k)
8   0     8     8      Y            0      8
7   1     8     8      Y            1      8
5   2     8     8      Y            2      8
1   3     8     8      Y            3      8
2   4     8     8      Y            4      8
4   4     7     8      N            5      8
6   4     5     8      N            6      8
3   4     1     8      N            7      8
Output answer:
k   rank
8   0
7   1
5   2
1   3
2   4
4   5
6   6
3   7
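The tables above can be reproduced with a small pointer-jumping simulation. This is a sketch that follows the update rule stated in the steps, not the List-Rank pseudocode from the slides; the name `list_rank` and the returned round count are assumptions for illustration. New values are written into copies so that every element reads the state from the start of the round, as the parallel version would.

```python
def list_rank(p):
    """p[k-1] is the parent of element k. Returns (ranks for k = 1..n, rounds)."""
    n = len(p)
    parent = [0] + list(p)                          # shift to 1-based indexing
    dist = [0] + [0 if parent[k] == k else 1 for k in range(1, n + 1)]
    rounds = 0
    while any(parent[k] != parent[parent[k]] for k in range(1, n + 1)):
        new_dist, new_parent = list(dist), list(parent)
        for k in range(1, n + 1):                   # all k in parallel
            if parent[k] != parent[parent[k]]:
                new_dist[k] = dist[k] + dist[parent[k]]
                new_parent[k] = parent[parent[k]]
        dist, parent = new_dist, new_parent
        rounds += 1
    return dist[1:], rounds


ranks, rounds = list_rank([5, 1, 6, 2, 7, 4, 8, 8])
print(ranks)   # [3, 4, 7, 5, 2, 6, 1, 0]
print(rounds)  # 3
```

The printed ranks are ordered by element k = 1…8, so element 8 (the root) has rank 0 and element 3 has rank 7.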
2) Given the parent array P=[1,1,1,2,8,3,4,6] show graphically the steps of an
algorithm to find the root(s) of the underlying data structure.
Answer 2:
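Using the algorithm FOREST-ROOT from the last slide of 2.PRAM-CBtree/8.pointerjump.pdf.
Original structure: P = [1, 1, 1, 2, 8, 3, 4, 6]
For all elements in parallel, set P(k) to P(P(k)):
1) P = [1, 1, 1, 1, 6, 1, 2, 3]
2) P = [1, 1, 1, 1, 1, 1, 1, 1]
Every element now points to the root, element 1.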
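The repeated pointer jump P(k) = P(P(k)) can be simulated as below. This is an illustrative sketch rather than the FOREST-ROOT pseudocode from the slides; the name `forest_roots` and the returned history of parent arrays are assumptions.

```python
def forest_roots(p):
    """p[k-1] is the parent of element k. Returns (final parents, list of intermediate arrays)."""
    n = len(p)
    parent = [0] + list(p)                           # shift to 1-based indexing
    history = []
    while True:
        # All elements jump at once, reading the parents from the previous round.
        new_parent = [0] + [parent[parent[k]] for k in range(1, n + 1)]
        if new_parent == parent:
            break                                    # nothing changed: everyone points at a root
        parent = new_parent
        history.append(parent[1:])
    return parent[1:], history


final, history = forest_roots([1, 1, 1, 2, 8, 3, 4, 6])
print(history)  # [[1, 1, 1, 1, 6, 1, 2, 3], [1, 1, 1, 1, 1, 1, 1, 1]]
print(final)    # [1, 1, 1, 1, 1, 1, 1, 1]
```

The deepest element (5) is four links away from the root, so two doubling rounds are enough.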
3) For the given interconnected network M, show the steps of an algorithm that
finds the smallest member. What is the optimal running time of such an algorithm
and why?
M=
3    17   6    8
500  72   64   11
25   32   4    1
16   2    10   8
Answer 3:
Using the algorithm MIN-2D-MESH from slide 3 of 3.IC-networks/2.mesh.pdf.
q=4
a) For columns j = 0…3 in parallel:
For rows i = 2…0 sequentially:
Compare elements at [i, j] and [i + 1, j] and copy the smaller value to
[i, j]
b) For columns j = 2…0 sequentially:
Compare elements at [0, j] and [0,j+1] and copy the smaller value to [0,j]
Output answer at [0,0]
a) j = 0…3 in parallel
i = 2:
3    17   6    8
500  72   64   11
16   2    4    1
16   2    10   8
i = 1:
3    17   6    8
16   2    4    1
16   2    4    1
16   2    10   8
i = 0:
3    2    4    1
16   2    4    1
16   2    4    1
16   2    10   8
b) Row 0 after each step:
j = 2: 3  2  1  1
j = 1: 3  1  1  1
j = 0: 1  1  1  1
Output answer at [0,0] = 1
The optimal running time is (width – 1) + (height – 1) = width + height – 2 steps (here 4 + 4 – 2 = 6), as the
algorithm must traverse the height and the width of the mesh/matrix in order to sequentially
compare elements in pairs.
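The two phases and the step count can be checked with the following sketch. It assumes a square q×q mesh stored as a list of rows and is not the MIN-2D-MESH pseudocode from the slides; the name `min_2d_mesh` is made up for illustration.

```python
def min_2d_mesh(mesh):
    """Return (minimum, number of sequential steps) for a square q x q mesh."""
    q = len(mesh)
    m = [row[:] for row in mesh]                  # work on a copy
    steps = 0
    # Phase a): rows i = q-2 .. 0 sequentially, all columns j in parallel.
    for i in range(q - 2, -1, -1):
        m[i] = [min(m[i][j], m[i + 1][j]) for j in range(q)]
        steps += 1
    # Phase b): along row 0, columns j = q-2 .. 0 sequentially.
    for j in range(q - 2, -1, -1):
        m[0][j] = min(m[0][j], m[0][j + 1])
        steps += 1
    return m[0][0], steps


M = [[3, 17, 6, 8],
     [500, 72, 64, 11],
     [25, 32, 4, 1],
     [16, 2, 10, 8]]
print(min_2d_mesh(M))  # (1, 6)
```

The 6 steps are the (height – 1) + (width – 1) of the running-time argument above.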
4) Find the prefix sums of the 2D mesh M given below using a parallel algorithm.
Show each step.
M=
1   4   2
3   7   6
11  2   4
Answer 4:
Using the algorithm Prefix computation from slide 9 of 3.IC-networks/2.mesh.pdf
1) In parallel add up the members of each row such that:
For 1 ≤ j ≤ q-1
S[i,j] = S[i,j-1] + X[i,j]
1   4   2         1   5   7
3   7   6    →    3   10  16
11  2   4         11  13  17
2) Add up the members of the last column such that:
For 1 ≤ i ≤ q-1
S[i,q-1] = S[i,q-1] + S[i-1,q-1]
1   5   7         1   5   7
3   10  16   →    3   10  23
11  13  17        11  13  40
3) Factor the last column across the rows in parallel
For 1 ≤ i ≤ q-1 in parallel do
For 0 ≤ j ≤ q-2 do
S[i,j] = S[i,j] + S[i-1,q-1] (i.e. add the sum at the end of the previous row)
1   5   7         1   5   7
3   10  23   →    10  17  23
11  13  40        34  36  40
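The three phases can be replayed with the sketch below, which assumes a square q×q mesh stored as a list of rows. It is an illustration of the steps in this answer rather than the slide's pseudocode, and the name `mesh_prefix_sums` is an assumption.

```python
def mesh_prefix_sums(x):
    """Row-major prefix sums of a square q x q mesh, in three phases."""
    q = len(x)
    s = [row[:] for row in x]
    # Phase 1: prefix sums along each row (rows are independent, so in parallel).
    for i in range(q):
        for j in range(1, q):
            s[i][j] = s[i][j - 1] + x[i][j]
    # Phase 2: prefix sums down the last column (sequential).
    for i in range(1, q):
        s[i][q - 1] += s[i - 1][q - 1]
    # Phase 3: add the final total of the previous row to the rest of each row.
    for i in range(1, q):
        for j in range(q - 1):
            s[i][j] += s[i - 1][q - 1]
    return s


for row in mesh_prefix_sums([[1, 4, 2], [3, 7, 6], [11, 2, 4]]):
    print(row)
# [1, 5, 7]
# [10, 17, 23]
# [34, 36, 40]
```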
5) Find the prefix sums of the 2D mesh M given below using a parallel algorithm.
Show each step.
M=
7   1   3   4
16  9   11  23
8   7   13  10
5   6   4   0
Answer 5:
Using the algorithm Prefix computation from slide 9 of 3.IC-networks/2.mesh.pdf
1) In parallel add up the members of each row such that:
For 1 ≤ j ≤ q-1
S[i,j] = S[i,j-1] + X[i,j]
7   1   3   4          7   8   11  15
16  9   11  23    →    16  25  36  59
8   7   13  10         8   15  28  38
5   6   4   0          5   11  15  15
2) Add up the members of the last column such that:
For 1 ≤ i ≤ q-1
S[i,q-1] = S[i,q-1] + S[i-1,q-1]
7   8   11  15         7   8   11  15
16  25  36  59    →    16  25  36  74
8   15  28  38         8   15  28  112
5   11  15  15         5   11  15  127
3) Factor the last column across the rows in parallel
For 1 ≤ i ≤ q-1 in parallel do
For 0 ≤ j ≤ q-2 do
S[i,j] = S[i,j] + S[i-1,q-1] (i.e. add the sum at the end of the previous row)
7   8   11  15          7    8    11   15
16  25  36  74     →    31   40   51   74
8   15  28  112         82   89   102  112
5   11  15  127         117  123  127  127
6) Describe a parallel solution for matrix-vector multiplication and apply it to the
matrix-vector pair M-V given. Show each step.
M=
925  26  31
16   5   11
27   9   3
V=
1
2
3
Answer 6:
Using algorithm Matrix Vector Multiplication from page 1 of 3.IC-networks/3.processor-ring-mult.pdf
1) Split up the matrix into its columns (written here as rows):
M1 = [925, 16, 27]   M2 = [26, 5, 9]   M3 = [31, 11, 3]
2) Split up the vector:
V1 = 1  V2 = 2  V3 = 3
3) In parallel calculate matching pairs Mi * Vi
M1 * V1 = [925, 16, 27]
M2 * V2 = [52, 10, 18]
M3 * V3 = [93, 33, 9]
4) Add up all obtained column vectors:
[925, 16, 27] + [52, 10, 18] + [93, 33, 9] = [1070, 59, 54]
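The column-wise scheme above can be written out as a short sketch. The name `matvec_by_columns` is an assumption for illustration; each entry of `partials` corresponds to one product Mi * Vi that a separate processor would compute.

```python
def matvec_by_columns(m, v):
    """Multiply matrix m (list of rows) by vector v, one column per processor."""
    rows = len(m)
    # Step 3: processor j scales column j of the matrix by v[j].
    partials = [[m[i][j] * v[j] for i in range(rows)] for j in range(len(v))]
    # Step 4: add the scaled columns element by element.
    result = [sum(p[i] for p in partials) for i in range(rows)]
    return partials, result


M = [[925, 26, 31],
     [16, 5, 11],
     [27, 9, 3]]
partials, result = matvec_by_columns(M, [1, 2, 3])
print(partials)  # [[925, 16, 27], [52, 10, 18], [93, 33, 9]]
print(result)    # [1070, 59, 54]
```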
7) Given the matrix A and a network of 4 processors denoted by x, show the
illustrated steps of matrix multiplication on 1D networks.
A=
1  2  3  4
0  2  2  0
5  3  6  4
1  1  1  0
x=
3
3
4
2
Answer 7:
The explanation is given in the last slides of 3.IC-networks/2.mesh.pdf.
The columns of the original matrix are shifted (skewed), so that each column reaches its
processor one step later than the column before it. The procedure then considers the
matrix from bottom to top: each processor multiplies the matrix member that reaches it by
the network value (x) it holds, adds the partial result received from the previous
processor, and passes the sum on to the next processor. The final processor places its
overall result into the final answer.
Processor j holds x_j (x = 3, 3, 4, 2) and receives column j of A from bottom to top,
one step later than processor j-1. The output column is listed from top to bottom.
Step  P1 (x=3)    P2 (x=3)      P3 (x=4)      P4 (x=2)      Output
1     3*1 = 3
2     3*5 = 15    3*1+3 = 6
3     3*0 = 0     3*3+15 = 24   4*1+6 = 10
4     3*1 = 3     3*2+0 = 6     4*6+24 = 48   2*0+10 = 10
5                 3*2+3 = 9     4*2+6 = 14    2*4+48 = 56   10
6                               4*3+9 = 21    2*0+14 = 14   56, 10
7                                             2*4+21 = 29   14, 56, 10
8                                                           29, 14, 56, 10
The final answer is:
29
14
56
10
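The step-by-step schedule can be simulated with the sketch below. It is an illustration of the scheme described in this answer, not code from the slides: the name `ring_matvec` and the bookkeeping variables are assumptions. Rows are fed bottom row first, processor j starts j steps late, and each processor adds its product to the partial sum handed over by its left neighbour.

```python
from fractions import Fraction


def ring_matvec(a, x):
    """Compute a * x on a 1D array of len(x) processors; a is a list of rows."""
    rows, q = len(a), len(x)
    feed = list(range(rows - 1, -1, -1))      # the bottom row enters first
    partial = [None] * q                      # what each processor produced in the last step
    out = [None] * rows
    for t in range(rows + q - 1):             # one iteration per computing step
        new_partial = [None] * q
        for j in range(q):                    # all processors work in parallel
            idx = t - j                       # processor j is delayed by j steps
            if 0 <= idx < rows:
                r = feed[idx]
                incoming = partial[j - 1] if j > 0 else 0
                new_partial[j] = a[r][j] * x[j] + incoming
                if j == q - 1:
                    out[r] = new_partial[j]   # the last processor emits the finished entry
        partial = new_partial
    return out


A = [[1, 2, 3, 4],
     [0, 2, 2, 0],
     [5, 3, 6, 4],
     [1, 1, 1, 0]]
print(ring_matvec(A, [3, 3, 4, 2]))  # [29, 14, 56, 10]
```

The loop runs for rows + q - 1 = 7 computing steps; the eighth step in the table above only shows the completed output. The same function accepts fractional network values, for example `Fraction(1, 2)`.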
8) Given the matrix A and a network of 3 processors denoted by x, show the
illustrated steps of matrix multiplication on 1D networks.
A=
1  3  7
2  4  8
6  8  10
1  2  3
x=
½
¼
¼
Answer 8:
The final answer is:
3
4
15/2
7/4
Exercises for week 6:
1) Given a statement: A[i] = 2B[i] + 3, where A and B are arrays and A[i] is element
i of A:
1.1) write pseudocode for an algorithm that would use multithreading to
implement this statement.
1.2) draw a computation dag* for this algorithm given an input in the form
(B, i, j, A) where B is an array of size 4, indexed [1…4], i and j are indices and A is
the output in the form of an array.
Answer 1:
1.1:
Begin Parallel-Alg (B,i,j,A)
    if i = j
        A[i] = 2B[i] + 3
    else {
        m = ⌊(i + j) / 2⌋
        spawn Parallel-Alg (B,i,m,A)
        Parallel-Alg (B,m+1,j,A)
        sync
    }
end
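The pseudocode maps directly onto threads: spawn becomes starting a thread for the left half, and sync becomes joining it. The sketch below uses Python's standard `threading` module. The input values in B are made up for illustration, and index 0 of both lists is left unused so that the indices match [1…4].

```python
import threading


def parallel_alg(b, i, j, a):
    """Fill a[i..j] with 2*b[k] + 3 by recursive halving."""
    if i == j:
        a[i] = 2 * b[i] + 3
    else:
        m = (i + j) // 2
        # spawn: the left half runs in its own thread
        left = threading.Thread(target=parallel_alg, args=(b, i, m, a))
        left.start()
        # the right half runs in the current thread
        parallel_alg(b, m + 1, j, a)
        # sync: wait for the spawned half before returning
        left.join()


B = [None, 1, 2, 3, 4]      # hypothetical input, index 0 unused
A = [None] * 5
parallel_alg(B, 1, 4, A)
print(A[1:])  # [5, 7, 9, 11]
```

Each leaf call writes a different element of A, so no locking is needed.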
1.2: