Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Models 3 Professor Ioana Banicescu CSE 8843 The work-time presentation framework of parallel algorithms: WT paradigm: provides a 2 level top-down description of parallel algorithm Upper-level: suppresses details of the algorithm Lower-level: follows general scheduling principle and results in a full PRAM description Upper-level WT: Goal: describe the algorithm in terms of sequence of time units, where each time unit may include any number of concurrent operation Work: Total number of operations used Eg: For l <= i <= u par do, the statements corresponding to all values of I between l and u are executed concurrently. Example: Compute the sum S of n=2k numbers stored in an array A, processors Pi, 1<=i<=n Algorithm: (Sum) Input: n=2k numbers stored in an array A Output: S= ni=0 A(i) Begin 1. For 1<=i<=n Set B(i):= A(i) 2. For h= 1 to log n do, for 1 <= i <= n/2k Set B(i)= B(2i-1) + B(2i) 3. Set S= B(1) End This algorithm does not mention: • How many PEs are there? • How operations will be allocated to PEs? Algorithm specifies time units, where each time unit may include any number of concurrent operations. Time units: log n + 2 • n operations performed within the first time unit (step 1) • j-th time unit (iteration l= j-1 of step 2) has n/2j-1 operations for 2 <= j <= (log n + 1) • Last time unit: 1 operation (step 3) Work performed by the algorithm: W 𝑛 =𝑛+ = O(n) T(n)= O(log n) log 𝑛 𝑗=1 𝑛/2𝑗 + 1 Lower Level: • Suppose that WT presentation of the algorithm results in a parallel algorithm that runs in T(n) time units while performing W(n) work. • Using WT scheduling principle, we can almost always adapt this algorithm to run on a p processor PRAM in: <= W(n)/p + T(n) parallel steps The WT scheduling principle: a) Let Wi(n) be the number of operations performed in time unit i, 1 <= i <= T(n) b) Simulate each set of Wi(n) operations in Wi(n)/p parallel steps by the p processors for each 1<=i<=T(n) c) If the simulation is successful, the corresponding pprocessor PRAM algorithm takes ≤ 𝑖 (𝑐𝑒𝑖𝑙) 𝑊𝑖 (𝑛) 𝑝 ≤ 𝑖 (𝑓𝑙𝑜𝑜𝑟) parallel steps as desired WT scheduling principle: 𝑊𝑖 (𝑛) 𝑝 + 1 ≤ (floor) 𝑊𝑖 (𝑛) 𝑝 + T(n) • During each time unit i, the Wi(n) operations are scheduled as evenly as possible among available p processors. • During time units 1 and 2, each PE is scheduled to execute the same number of operations. • During time unit 3, the p-th PE executes one less operation than are executed by remaining PEs. • During time unit 4, there are only k possible concurrent operations.