Title of the research paper:
ALGORITHMS AND CONCERNED ISSUES IN PARALLEL SEARCHING AND
SORTING
Research Area: Rewriting algorithms to help parallel programming
Authors: Amar Krishna, Devendra Singh Dhami, Pooja Prasad, Thrupthi.N.Murthy
Faculty mentor: Srinivas.B.C
Name of the Institution: Sir M. Visvesvaraya Institute of Technology
Abstract:
The recent switch to parallel microprocessors is a milestone in the history of
computing. Industry has laid out a roadmap for multicore designs that preserves the
programming paradigm of the past via binary compatibility and cache coherence, and
conventional wisdom is now to double the number of cores on a chip with each
silicon generation. Parallel algorithms mainly involve reducing a problem to smaller
subproblems and computing them concurrently to obtain better time and space
efficiency. Parallel technologies are being developed to address the need for high
performance. In the computing field, searching and sorting are the two most
fundamental and important operations. We have designed parallel algorithms for
searching both sorted and unsorted arrays; parallelizing search over sorted arrays
helps us derive the best time complexity. For sorting, we have designed parallel
versions of the quicksort and merge sort algorithms. Merge sort requires the
services of a sub-process, the binary search algorithm, which is based on the
divide-and-conquer strategy, and it makes use of the CREW merge method for
effective parallelization. The algorithms rest on two basic assumptions: the
number of processors involved, and the amount of shared memory available for
concurrent read and write operations.
Background:
The background involves the general conception of parallel computing and
how to reduce the time involved in computation.
The idea is that some algorithms are easy to divide up into pieces. For example,
checking all of the numbers from one to a hundred thousand to see which are
primes can be done by assigning a subset of the numbers to each available
processor and then putting the lists of positive results back together. In this
work we use parallel computing to reduce processing time, and we measured the
time consumed for different input sizes and different numbers of processors.
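The prime-checking example above can be sketched in Python. This is an illustrative sketch, not code from the paper; the function names are ours, and a thread pool is used to keep the sketch portable, whereas a real CPU-bound job in CPython would use processes:

```python
from concurrent.futures import ThreadPoolExecutor

def is_prime(n):
    """Trial division; adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def primes_up_to(limit, workers=4):
    """Assign a contiguous chunk of 1..limit to each worker, check the
    chunks concurrently, then put the positive results back together.
    (For real speedup across cores in CPython, swap the thread pool
    for a process pool; the structure is identical.)"""
    chunk = -(-limit // workers)  # ceil(limit / workers)
    ranges = [range(i + 1, min(i + chunk, limit) + 1)
              for i in range(0, limit, chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda r: [n for n in r if is_prime(n)], ranges)
    return [p for sub in results for p in sub]
```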
Problem Statement:
The sorting problem is defined as follows: we are given a sequence S = {S1, S2, ...,
Sn} of n items whose elements are initially in random order. The purpose of
sorting is to arrange the elements of S into a new sequence S' = {S1', S2', ..., Sn'}
such that Si' <= S(i+1)' for i = 1, 2, ..., n-1. Any comparison-based sorting
algorithm requires Ω(n log n) operations in the worst case.
Methodology:
The design of the algorithms is based on a two-step process which first
requires two basic assumptions:
I. the number of processors involved;
II. the amount of shared memory used for concurrent read and write operations.
QUICK SORT:
Step 1: determination of the starting and ending points for each processor
In the first step, the starting and ending points for processor i (of N
processors, working on n elements) are determined by:
Start: (i - 1) * ceil(n/N) + 1
End: min{n, i * ceil(n/N)}
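A minimal sketch of this index computation (the function name `start_end` and the 1-based, inclusive index convention are ours, chosen to match the formulas above):

```python
import math

def start_end(i, n, N):
    """Start and end positions (1-based, inclusive) of processor i's
    slice of an n-element sequence shared among N processors."""
    chunk = math.ceil(n / N)
    start = (i - 1) * chunk + 1
    end = min(n, i * chunk)
    return start, end

# Example: 10 elements on 3 processors give slices of sizes 4, 4, 2.
```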
Step 2: development of the parallel algorithm using concepts from the
sequential algorithm
Each processor then sorts its sequence using the quick sort method.
Procedure: Quicksort (S)
if |S| = 2 and S2 < S1 then
    S1 <-> S2
else if |S| > 2 then
    (1) {determine m, the median element of S}
        Sequential Select (S, |S|/2)
    (2) {split S into two subsequences S1 and S2}
        (2.1) S1 <- {Si : Si <= m}, with |S1| <= ceil(|S|/2)
        (2.2) S2 <- {Si : Si >= m}, with |S2| <= ceil(|S|/2)
    (3) Quicksort (S1)
    (4) Quicksort (S2)
end if.
At each level of the recursion, the Quicksort procedure finds the median of a
sequence S and then splits S into two subsequences S1 and S2, of elements
smaller than or equal to and larger than or equal to the median, respectively.
The algorithm is then applied recursively to each of S1 and S2. This continues
until S consists of either one or two elements, in which case recursion is no
longer needed.
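The procedure above can be sketched in Python. This is our illustration, not the paper's code: `statistics.median_low` stands in for the Sequential Select subroutine (which the paper does not spell out), and, unlike the pseudocode, elements equal to the median are collected once in the middle so duplicates are not placed in both subsequences:

```python
import statistics

def quicksort(S):
    """Median-splitting quicksort following the described procedure:
    find the median m of S, split S around m, recurse on both parts."""
    if len(S) <= 1:
        return list(S)
    if len(S) == 2:                    # base case from the procedure
        return [min(S), max(S)]
    m = statistics.median_low(S)       # stands in for Sequential Select
    S1 = [x for x in S if x < m]       # step (2.1)
    S2 = [x for x in S if x > m]       # step (2.2)
    eq = [x for x in S if x == m]      # duplicates of the median
    return quicksort(S1) + eq + quicksort(S2)   # steps (3) and (4)
```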
MERGE SORT:
Step 1: determination of the starting and ending points for each processor
In the first step, the starting and ending points for processor i are
determined, as before, by:
Start: (i - 1) * ceil(n/N) + 1
End: min{n, i * ceil(n/N)}
Step 2: development of the parallel algorithm using concepts from the
sequential algorithm
For that, a procedure called Binary Search is used as a sub-process. It works
as follows:
Procedure: Binary Search (S, x, k)
Step 1: (1.1) i <- 1
        (1.2) h <- n
        (1.3) k <- 0
Step 2: while i <= h do
        (2.1) m <- floor((i + h)/2)
        (2.2) if x = Sm then (i) k <- m
                             (ii) i <- h + 1
              else if x < Sm then h <- m - 1
              else i <- m + 1
              end if
        end while.
This procedure takes as input a sequence S = {S1, S2, ..., Sn} of numbers sorted
in nondecreasing order and a number x. If x belongs to S, the procedure
returns the index k of an element Sk in S such that x = Sk; otherwise, it
returns zero. Binary search is based on the divide-and-conquer principle: at
each stage a comparison is performed between x and an element of S. Either
the two are equal and the procedure terminates, or half of the elements of
the sequence are discarded. The process continues until the number of
elements left is 0 or 1, and after at most one additional comparison the
procedure terminates.
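A runnable sketch of this procedure, keeping its 1-based indexing and its convention of returning 0 when x is absent (the function name is ours):

```python
def binary_search(S, x):
    """Return the 1-based index k with S[k-1] == x, or 0 if x is not
    in S. S must be sorted in nondecreasing order."""
    i, h, k = 1, len(S), 0
    while i <= h:
        m = (i + h) // 2
        if x == S[m - 1]:
            k = m
            i = h + 1          # force the loop to terminate
        elif x < S[m - 1]:
            h = m - 1          # discard the upper half
        else:
            i = m + 1          # discard the lower half
    return k
```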
For parallel merging, the CREW merge method is used.
Procedure: CREW Merge (A, B, C)
{A has length r and B has length s; processors P1, ..., PN are available.}
Step 1: {Select N-1 elements of A that subdivide that sequence into N
subsequences of approximately the same size. Call the subsequence formed
by these N-1 elements A'. A subsequence B' of N-1 elements of B is
chosen similarly. This step is executed as follows:}
for i = 1 to N-1 do in parallel
    Processor Pi determines a'i and b'i from
    (1.1) a'i <- A[i * ceil(r/N)]
    (1.2) b'i <- B[i * ceil(s/N)]
end for.
Step 2: {Merge A' and B' into a sequence of triples V = {v1, v2, ...,
v(2N-2)}, where each triple consists of an element of A' or B', its
position in A' or B', and the name of the sequence it came from. This is
done as follows:}
(2.1) for i = 1 to N-1 do in parallel
      (i) Processor Pi uses Binary Search on B' to find the smallest j
          such that a'i < b'j
      (ii) if j exists then v(i+j-1) <- (a'i, i, A)
           else v(i+N-1) <- (a'i, i, A)
      end if
      end for.
(2.2) for i = 1 to N-1 do in parallel
      (i) Processor Pi uses Binary Search on A' to find the smallest j
          such that b'i < a'j
      (ii) if j exists then v(i+j-1) <- (b'i, i, B)
           else v(i+N-1) <- (b'i, i, B)
      end if
      end for.
Step 3: {Each processor merges and inserts into C the elements of two
subsequences, one from A and one from B. The indices of the two elements
(one in A and one in B) at which each processor is to begin merging are first
computed and stored in an array Q of ordered pairs. This step is executed as
follows:}
(3.1) Q(1) <- (1, 1)
(3.2) for i = 2 to N do in parallel
      if v(2i-2) = (a'k, k, A) then processor Pi
          (i) uses Binary Search on B to find the smallest j such that
              bj > a'k
          (ii) Q(i) <- (k * ceil(r/N), j)
      else processor Pi
          (i) uses Binary Search on A to find the smallest j such that
              aj > b'k
          (ii) Q(i) <- (j, k * ceil(s/N))
      end if
      end for.
(3.3) for i = 1 to N do in parallel
      Processor Pi uses Sequential Merge and Q(i) = (x, y) to merge two
      subsequences, one beginning at ax and the other at by, and places
      the result of the merge in array C beginning at position x + y - 1.
      The merge continues until
      (i) an element larger than or equal to the first component of v(2i)
          is encountered in each of A and B (when i <= N-1)
      (ii) no elements are left in either A or B (when i = N)
      end for.
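The partitioning idea behind CREW Merge can be illustrated with a simplified, sequential simulation. This is our sketch, not the paper's procedure: it cuts A into N roughly equal blocks, locates each cut element's rank in B by binary search, and merges the resulting block pairs independently; in the actual CREW method the N pairs would be merged by N processors in parallel, and `sorted()` stands in for Sequential Merge:

```python
from bisect import bisect_left

def crew_style_merge(A, B, N):
    """Merge sorted arrays A and B by partitioning into N independent
    (A-block, B-block) pairs, each mergeable without coordination."""
    r = len(A)
    size = -(-r // N) if r else 0         # ceil(r / N)
    cuts_a = [min(i * size, r) for i in range(N + 1)]
    # Rank of each A-cut element in B, found by binary search.
    cuts_b = [bisect_left(B, A[c]) if c < r else len(B) for c in cuts_a]
    cuts_b[0] = 0                         # first block starts at B's front
    C = []
    for i in range(N):                    # each pair is independent
        C += sorted(A[cuts_a[i]:cuts_a[i + 1]] + B[cuts_b[i]:cuts_b[i + 1]])
    return C
```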
Analysis of CREW Merge:
Step 1: With all processors operating in parallel, each processor computes
two subscripts, so this step requires constant time.
Step 2: This step consists of two applications of the Binary Search procedure
to a sequence of length N-1, each followed by an assignment statement. This
takes O(log N) time.
Step 3: Step 3.1 is a constant-time assignment, and step 3.2 requires at most
O(log s) time. The Sequential Merge procedure takes at most O((r + s)/N)
time. In the worst case r = s = n; thus
t(2n) = O(n/N + log n)
(each processor merges at most two subsequences).
Design Issues: The memory requirement is of order O(n).
An advantage of this approach is that the number of processors is
independent of the length of the input sequences.
Some execution times obtained for various numbers of processors are given
below.
Input size 100000:
1) No. of processors: 1; Duration: 3.76
2) No. of processors: 2; Duration: 3.78
3) No. of processors: 4; Duration: 3.8
Input size 10000:
1) No. of processors: 1; Duration: 0.4
2) No. of processors: 2; Duration: 0.38
3) No. of processors: 4; Duration: 0.38
Key Results:
Derivation of efficient parallel algorithms for
1. Searching a sorted sequence
2. Sorting using the quick sort method
3. Sorting using the merge sort method
Discussion:
The discussion covered the basics of algorithms and their use. It involved learning
the basic concepts of parallelization and their application in deriving efficient
algorithms that help increase the performance of many operations in the computing
field, with the two basic operations, searching and sorting, as the nucleus. The
discussions aimed at developing better parallelized algorithms, analyzing them, and
penning down their advantages and disadvantages over sequential algorithms.
Scope for future work (if any): Parallel processing is a main focus in today's
computing world, and it can substantially enhance the performance of many processes.
The parallel algorithms here can serve as a basis for developing various parallel
processing devices and for designing and implementing threaded applications. The
above algorithms still need to optimize one more resource: the communication between
different processors. Parallel processors communicate in one of two ways, shared
memory or message passing. The above parallel algorithm has a serial part and so
has a saturation point (according to Amdahl's law); beyond that point, adding more
processors does not yield any more throughput but only increases the overhead and
cost.
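The saturation point that Amdahl's law predicts can be made concrete with a short calculation. The parallel fraction p below is an illustrative value of ours, not a figure measured in this work:

```python
def amdahl_speedup(p, N):
    """Amdahl's law: speedup with N processors when a fraction p of
    the program is parallelizable and (1 - p) is inherently serial."""
    return 1.0 / ((1.0 - p) + p / N)

# With 90% of the work parallelizable, speedup saturates below
# 1 / (1 - p) = 10 no matter how many processors are added.
```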
Conclusion: Algorithms are the basic building blocks of programming.
Parallelization of algorithms helps improve the time and space efficiency of
various operations. By improving the efficiency of the basic operations, search
and sort, through parallel algorithms, we look forward to achieving better
solutions for other problems that require the services of these algorithms.
References:
1) Akl, Selim G., The Design and Analysis of Parallel Algorithms.
2) Xavier, C., and Iyengar, S. S., Introduction to Parallel Algorithms.
3) Roosta, Seyed H., Parallel Processing and Parallel Algorithms: Theory and
Computation.
Web sites: www.wikipedia.org, www.eecs.berkeley.edu
Acknowledgements: Mr. Srinivas.B.C.