Data Structure and Algorithm Question and Answer Solved (2006)
Q: What is a data structure? What is the difference between primitive and non-primitive data structures?
A: In computer science, a data structure is a particular way of storing and organizing data in
a computer so that it can be used efficiently.
Different kinds of data structures are suited to different kinds of applications, and some are highly
specialized to specific tasks. For example, B-trees are particularly well-suited for implementation of
databases, while compiler implementations usually use hash tables to look up identifiers.
Data structures are used in almost every program or software system. Specific data structures are
essential ingredients of many efficient algorithms, and make possible the management of huge amounts
of data, such as large databases and internet indexing services. Some formal design methods and
programming languages emphasize data structures, rather than algorithms, as the key organizing factor
in software design.
Data structures are generally based on the ability of a computer to fetch and store data at any
place in its memory, specified by an address — a bit string that can be itself stored in memory
and manipulated by the program. Thus the record and array data structures are based on
computing the addresses of data items with arithmetic operations; while the linked data
structures are based on storing addresses of data items within the structure itself. Many data
structures use both principles, sometimes combined in non-trivial ways (as in XOR linking).
Primitive data structures are the basic data structures that are directly operated upon by
machine instructions. They cannot be further broken down into smaller data items, e.g.
int, float, char.
Nonprimitive data structures can be classified as arrays, lists, and files. An array is an ordered set which
contains a fixed number of objects. No deletions or insertions are performed on arrays. At best,
elements may be changed. A list, by contrast, is an ordered set consisting of a variable number of
elements to which insertions and deletions can be made, and on which other operations can be
performed. When a list displays the relationship of adjacency between elements, it is said to be linear;
otherwise it is said to be nonlinear. A file is typically a large list that is stored in the external memory of
a computer. Additionally, a file may be used as a repository for list items (records) that are accessed
infrequently.
Q: Explain the different operations performed on data structures.
A: The data appearing in our data structure is processed by means of certain operations. In fact, the particular
data structure that one chooses for a given situation depends largely on the frequency with which
specific operations are performed. The following four operations play a major role:
Traversing
Accessing each record exactly once so that certain items in the record may be processed. (This accessing or
processing is sometimes called "visiting" the records.)
Searching
Finding the location of the record with a given key value, or finding the locations of all records, which satisfy
one or more conditions.
Inserting
Adding new records to the structure.
Deleting
Removing a record from the structure.
Two additional operations are also performed in some situations:
Sorting
Arranging the records in some logical order, e.g. alphabetical order for names or numerical order for numbers.
Merging
Combining the records in two different sorted files into a single sorted file.
Sometimes two or more of these operations are used together in a given situation; e.g., we may want to
delete the record with a given key, which may mean we first need to search for the location of the record.
Q: What is an array? Explain the address calculation in single and multidimensional arrays.
A: An array is a data structure (a type of memory layout) that stores a collection of
individual values that are of the same data type. Arrays are useful because instead of
having to separately store related information in different variables (named memory
locations), you can store them—as a collection—in just one variable. It is more
efficient for a program to access and process the information in an array, than it is to
deal with many separate variables.
All of the items placed into an array are automatically stored in adjacent memory
locations. As the program retrieves the value of each item (or "element") of an array,
it simply moves from one memory location to the very next—in a sequential manner.
It doesn't have to keep jumping around to widely scattered memory locations in order
to retrieve each item's value.
Imagine if you had to store—and later retrieve—the names of all of the registered
voters in your city. You could create and name hundreds of thousands of distinct
variable names to store the information. That would scatter hundreds of thousands
of names all over memory. An alternative is to simply create one variable that stores
the same information, but in sequential memory locations.
For example, if you have a class with five students, and you want to store their test
grades, you will create an array of the integer data type. Since you have five students,
you will create a single array with five elements. This sets aside five sequential memory locations to hold
the five scores. Each score is stored as an "element" in the array. The first score will
be stored at location (or "index") zero. The second score will be stored at array index
one. The third score will be stored at index two, and so on.
Let's name the array "Scores."
The student grades are: 70, 75, 80, 85, and 90.
Let's store (or "assign") the first grade: Scores(0) = 70.
Now, the second grade: Scores(1) = 75.
Assign the third grade: Scores(2) = 80, and so on.
Now the computer code can access the array Scores to get the value of each score.
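As a minimal sketch of the Scores example in C, assuming the first five grades listed above: the array is declared once and each element is reached through its index. The names scores and sum_scores are illustrative only.

```c
#include <stddef.h>

/* Test grades for a class of five students; index 0 holds the first grade. */
static const int scores[5] = {70, 75, 80, 85, 90};

/* Add up the first n elements of an int array by stepping through
   consecutive indices (and therefore consecutive memory locations). */
int sum_scores(const int *values, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += values[i];
    }
    return total;
}
```

Calling sum_scores(scores, 5) walks the five elements in order, which is the sequential access pattern described above.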
Address Calculation in One-Dimensional Array
The address of an element of a single-dimension array can be easily calculated.
Consider an array A of 25 elements, and suppose we are required to find the address of element A[4].
If the first cell in the sequence A[1], A[2], ..., A[25] is at address 15, then A[4] is located
at 15 + (4 - 1) = 18, as shown in the figure below. We assume each element occupies one unit of memory.
Address:  15    16    17    18
Element:  A[1]  A[2]  A[3]  A[4]
Memory Cells
Therefore it is necessary to know the starting address of the space allocated to the array and the size of
each element, which is the same for all elements of an array. We call the starting address the base
address and denote it by B. Then the location of the Ith element is B + (I - 1) * S.
Note that this holds when the lower bound of the subscript is one.
Consider an array A[4], ..., A[10]. In this case the location of the Ith element is
B + (I - 4) * S.
In general, the location of the Ith element is B + (I - L) * S
where B = base address,
I = subscript of the element,
S = size of each element of the array,
L = lower bound of the subscript (may be positive, negative or zero).
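The general formula can be sketched as a small C function. The name element_address and its parameter names are assumptions made for illustration; addresses are treated as plain numbers, as in the worked example.

```c
/* Location of the element with subscript i in a one-dimensional array.
   base  = address of the first element (B)
   lower = lower bound of the subscript (L)
   size  = size of each element (S) */
long element_address(long base, long i, long lower, long size)
{
    return base + (i - lower) * size;
}
```

For the example above, element_address(15, 4, 1, 1) gives 18, the address worked out for A[4].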
The address calculation can also be illustrated with a program.
Suppose num is the name of a one-dimensional array declared as int num[10].
In C, the address of the element at index i is written (num + i); the compiler scales i by the element size.
The address calculation of a one-dimensional array is illustrated in the following program.
/* Address calculation for one-dimensional arrays */
#include <stdio.h>

int main(void)
{
    int a[10], i, n;

    printf("Enter the total no. of elements: ");
    if (scanf("%d", &n) != 1 || n < 1 || n > 10)
        return 1;                       /* a[] holds at most 10 elements */
    printf("Enter the elements one by one: ");
    for (i = 0; i < n; i++)
        if (scanf("%d", &a[i]) != 1)
            return 1;                   /* stop on invalid input */
    printf("The given elements are\n");
    for (i = 0; i < n; i++)
    {
        /* a + i is the address of a[i]; *(a + i) is its value */
        printf("Address of %d element=%p\t its contents a[%d]=%d\n", i, (void *)(a + i), i, *(a + i));
    }
    return 0;
}
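Sample input and output (the addresses shown come from a 16-bit compiler where an int occupies 2 bytes; actual addresses and their printed format vary from system to system)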
Enter the total no of elements: 5
Enter the elements one by one 45
34
23
12
78
The given elements are
Address of 0 element=65502 Its content a[0]=45
Address of 1 element=65504 Its content a[1]=34
Address of 2 element=65506 Its content a[2]=23
Address of 3 element=65508 Its content a[3]=12
Address of 4 element=65510 Its content a[4]=78
Formula of Address Calculation in Two-Dimensional Array (Row-Major Form)
Address of A[I][J] = B + (I - L1) * (U2 - L2 + 1) * S + (J - L2) * S
where
B = base address of the array
S = size of each element of the array
I = row subscript of the element
L1 = lower bound of row
L2 = lower bound of column
U1 = upper bound of row
U2 = upper bound of column
J = column subscript of the element
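A short sketch of the row-major calculation in C follows. In row-major order the elements of one row are adjacent, so the row offset (I - L1) is scaled by the number of columns (U2 - L2 + 1). The function name row_major_address is illustrative, not a standard routine.

```c
/* Address of A[i][j] when rows are stored one after another.
   base = B, size = S, l1/l2 = lower bounds of row/column,
   u2 = upper bound of column. */
long row_major_address(long base, long i, long j,
                       long l1, long l2, long u2, long size)
{
    long columns = u2 - l2 + 1;              /* elements in one row */
    return base + (i - l1) * columns * size  /* skip the full rows before row i */
                + (j - l2) * size;           /* then move along row i */
}
```

C itself stores two-dimensional arrays in row-major order with lower bounds of zero, so for int grid[3][4] the byte offset of grid[2][1] from the start equals row_major_address(0, 2, 1, 0, 0, 3, sizeof(int)).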
Formula of Address Calculation in Two-Dimensional Array (Column-Major Form)
Address of A[I][J] = B + (J - L2) * (U1 - L1 + 1) * S + (I - L1) * S
where
B = base address of the array
S = size of each element of the array
I = row subscript of the element
L1 = lower bound of row
L2 = lower bound of column
U1 = upper bound of row
U2 = upper bound of column
J = column subscript of the element
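For comparison, here is a sketch of the column-major calculation. In column-major order the elements of one column are adjacent, so the column offset (J - L2) is scaled by the number of rows (U1 - L1 + 1) and the row offset (I - L1) is added last. The function name column_major_address is an assumption for illustration.

```c
/* Address of A[i][j] when columns are stored one after another.
   base = B, size = S, l1/l2 = lower bounds of row/column,
   u1 = upper bound of row. */
long column_major_address(long base, long i, long j,
                          long l1, long l2, long u1, long size)
{
    long rows = u1 - l1 + 1;              /* elements in one column */
    return base + (j - l2) * rows * size  /* skip the full columns before column j */
                + (i - l1) * size;        /* then move down column j */
}
```

For an array A[1..3][1..4] with base address 100 and 2-byte elements, A[2][3] lands at 100 + 2*3*2 + 1*2 = 114 in column-major form.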
Q: Write an algorithm to insert, delete, traverse, sort and search an element in an
array.
A: Representation of linear arrays in memory
The elements of linear array are stored in successive memory cells. The number of memory cells required depends
on the type of data elements. Computer keeps track of the address of the first element of the array. This is known as
the base address of the array. Using this base address the computer can calculate the address of any element.
Traversing linear arrays
Let LB be the lower bound of the array and UB the upper bound of the array.
1. Initialize counter C = LB
2. While C <= UB repeat steps 3 and 4, else go to step 5
3. Read the Cth element of the array
4. Increment C by 1 (Set C = C + 1)
[End of loop]
5. Exit
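The traversal steps above can be sketched in C as follows. The visitor callback is one possible way to express "processing" each element; the names traverse, visit_fn and add_to_sum are hypothetical.

```c
/* A function that processes one element; context carries caller state. */
typedef void (*visit_fn)(int value, void *context);

/* Visit every element from index lb to ub (inclusive) exactly once.
   Returns the number of elements visited. */
int traverse(const int *ar, int lb, int ub, visit_fn visit, void *context)
{
    int visited = 0;
    for (int c = lb; c <= ub; c++) {
        visit(ar[c], context);
        visited++;
    }
    return visited;
}

/* Example visitor: adds each value to the long that context points to. */
void add_to_sum(int value, void *context)
{
    *(long *)context += value;
}
```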
Inserting and deleting in arrays
Inserting and deleting elements at the end of the array is easy. To insert anywhere else, we need to shift the
elements from that location onward one position toward the end of the array to create space for the new element.
Similarly, if we delete an element anywhere in the array, we need to move all the elements coming after it one position toward the start.
Insert an element in an array
Algorithm to insert an element
Let LB be the lower bound of the array AR, UB the upper bound and P the position where we want to add an
element. The array must have room for one more element at location UB + 1.
1. Initialize counter C = UB
2. While C >= P repeat steps 3 and 4, else go to step 5
3. Copy the value of the element stored in location C to location C+1 (AR[C+1] = AR[C])
4. Decrement C by 1 (Set C = C - 1)
[End of loop]
5. Set the value of the element stored at P to the element to be added to the array
6. Increment UB by 1 (Set UB = UB + 1)
7. Exit
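A minimal C sketch of insertion by shifting, assuming a zero-based array that has spare capacity for one more element. The name insert_at and the choice to pass the upper bound by pointer are illustrative.

```c
#include <stdbool.h>

/* Insert value at position p of ar, whose elements occupy indices 0..*ub.
   The caller must guarantee room for at least *ub + 2 elements.
   Returns false (and changes nothing) if p is out of range. */
bool insert_at(int *ar, int *ub, int p, int value)
{
    if (p < 0 || p > *ub + 1) {
        return false;
    }
    /* Shift from the last element backwards so nothing is overwritten. */
    for (int c = *ub; c >= p; c--) {
        ar[c + 1] = ar[c];
    }
    ar[p] = value;
    (*ub)++;          /* the array now holds one more element */
    return true;
}
```

Starting the shift at the upper bound and working backwards matters: copying forwards from P would overwrite each element before it had been moved.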
Delete an element from an array
Algorithm to delete an element
Let LB be the lower bound of the array AR, UB the upper bound and P the position where we want to delete an
element from.
1. Initialize counter C = P
2. While C < UB repeat steps 3 and 4, else go to step 5
3. Copy the value of the element stored in location C+1 to location C (AR[C] = AR[C + 1])
4. Increment C by 1 (Set C = C + 1)
[End of loop]
5. Set the value of the element stored at UB to NULL (AR[UB] = NULL)
6. Exit
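Deletion by shifting might look like this in C. An int array has no NULL value, so this sketch shrinks the upper bound instead of blanking the last cell; that substitution, and the name delete_at, are assumptions of the example.

```c
#include <stdbool.h>

/* Delete the element at position p of ar, whose elements occupy 0..*ub.
   Returns false (and changes nothing) if p is out of range. */
bool delete_at(int *ar, int *ub, int p)
{
    if (p < 0 || p > *ub) {
        return false;
    }
    /* Stop one short of the upper bound so ar[c + 1] stays inside the array. */
    for (int c = p; c < *ub; c++) {
        ar[c] = ar[c + 1];
    }
    (*ub)--;          /* the last cell is no longer part of the array */
    return true;
}
```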
Sorting arrays
Bubble Sort algorithm
Bubble sort is a simple sorting algorithm. It works by repeatedly stepping through the list to be sorted, comparing
each pair of adjacent items and swapping them if they are in the wrong order. The pass through the list is repeated
until no swaps are needed, which indicates that the list is sorted. The algorithm gets its name from the way smaller
elements "bubble" to the top of the list.
Bubble sort has worst-case and average complexity both O(n²), where n is the number of items being sorted.
Let LB be the lower bound of the array AR and UB the upper bound.
1. Initialize counter I = LB
2. While I < UB repeat steps 3 to 7
3. Set value of counter J = LB
4. While J < UB repeat steps 5 and 6
5. Compare the value of element at Jth and (J+1)th locations. If value of element at Jth location > that at (J+1)th location
then swap the elements
6. Increment value of J by 1
[End of inner loop]
7. Increment I by 1
[End of outer loop]
8. Exit
Step-by-step example
Let us take the array of numbers "5 1 4 2 8", and sort the array from lowest number to greatest number using bubble
sort algorithm. Each line below shows one comparison of an adjacent pair, moving from left to right.
First Pass:
( 5 1 4 2 8 ) -> ( 1 5 4 2 8 ), Here, algorithm compares the first two elements, and swaps them.
( 1 5 4 2 8 ) -> ( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 ) -> ( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 ) -> ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5), algorithm does not swap them.
Second Pass:
( 1 4 2 5 8 ) -> ( 1 4 2 5 8 )
( 1 4 2 5 8 ) -> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) -> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) -> ( 1 2 4 5 8 )
Now, the array is already sorted, but our algorithm does not know if it is completed. The algorithm needs one whole
pass without any swap to know it is sorted.
Third Pass:
( 1 2 4 5 8 ) -> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) -> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) -> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) -> ( 1 2 4 5 8 )
Finally, the array is sorted, and the algorithm can terminate.
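The walkthrough above corresponds to the variant of bubble sort that stops after a pass with no swaps. A sketch in C, where the function name bubble_sort and the returned pass count are additions made for illustration:

```c
#include <stdbool.h>

/* Sort ar[0..n-1] into ascending order.
   Returns the number of passes made, including the final swap-free pass. */
int bubble_sort(int *ar, int n)
{
    int passes = 0;
    bool swapped = true;
    while (swapped) {
        swapped = false;
        passes++;
        /* Compare every adjacent pair once per pass. */
        for (int j = 0; j + 1 < n; j++) {
            if (ar[j] > ar[j + 1]) {
                int tmp = ar[j];
                ar[j] = ar[j + 1];
                ar[j + 1] = tmp;
                swapped = true;
            }
        }
    }
    return passes;
}
```

On the array 5 1 4 2 8 this makes three passes, matching the example: two that swap and one that confirms the order.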
Searching in arrays
Linear Search
Linear search is one of the simplest algorithms to search for an element. In linear search the array is
traversed and the element is compared with each element of the array. In the worst case the number that
we are searching for could be at the end of the array and thus require n comparisons (where n is the length of the
array) before the number is found or before we can confidently say that the number is not part of the array.
Let LB be the lower bound of the array AR, UB the upper bound of the array and E the element to be found.
1. Initialize counter C = LB
2. While C <= UB repeat steps 3 and 4, else go to step 5
3. Compare the Cth element of the array with E. If the values match then display that the number was found at the Cth
location and go to step 5
4. Increment C by 1 (Set C = C + 1)
[End of loop]
5. Exit
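The linear search steps can be sketched in C as below. Returning -1 for "not found" is a convention chosen for this example and assumes the lower bound is not negative; the name linear_search is illustrative.

```c
/* Search ar[lb..ub] for e, checking each element in turn.
   Returns the index of the first match, or -1 if e is not present. */
int linear_search(const int *ar, int lb, int ub, int e)
{
    for (int c = lb; c <= ub; c++) {
        if (ar[c] == e) {
            return c;
        }
    }
    return -1;
}
```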
Binary Search
Binary search is an example of a divide and conquer algorithm. Here we use the knowledge that the array is sorted to
decrease the number of comparisons that we need to do before we find the number or can confidently say that the
number does not exist in the array. In the worst case binary search requires about log2(n) comparisons.
Let LB be the lower bound of the array AR sorted in ascending order, UB the upper bound of the array and E the
element to be found.
Consider two variables BEG and END. BEG will stand for the beginning (lower bound) and END for the end (upper
bound) of the current range of locations that we are working with in the program.
1. Set BEG = LB and END = UB.
2. While BEG <= END repeat steps 3 to 6, else go to step 7
3. Set MID = (BEG + END)/2, using integer division
4. Compare the number stored at MID to E.
If the values match then display that the number was found at the MIDth location and go to step 7
5. If the number stored at MID is greater than the element E then
the number can only be in the first half of the current range, hence set END = MID - 1
6. If the number stored at MID is less than the element E then the
number can only be in the second half of the current range, hence set BEG = MID + 1
[End of loop]
7. Exit
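A sketch of binary search in C, using a common formulation in which the loop runs while BEG <= END and the bounds move to MID - 1 or MID + 1, so the range shrinks on every step and the loop always ends. The name binary_search and the -1 "not found" result are assumptions of the example.

```c
/* Search the ascending array ar[lb..ub] for e.
   Returns the index of a match, or -1 if e is not present. */
int binary_search(const int *ar, int lb, int ub, int e)
{
    int beg = lb;
    int end = ub;
    while (beg <= end) {
        int mid = beg + (end - beg) / 2;   /* middle of the current range */
        if (ar[mid] == e) {
            return mid;
        }
        if (ar[mid] > e) {
            end = mid - 1;                 /* e can only be in the first half */
        } else {
            beg = mid + 1;                 /* e can only be in the second half */
        }
    }
    return -1;
}
```

Computing the middle as beg + (end - beg) / 2 gives the same index as (beg + end) / 2 but avoids overflow when both bounds are large.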
Q: What is the difference between sorting and searching? Explain the sorting
techniques and their complexity analysis.
Q: What is hashing? Explain three techniques often built into hash functions.
A: Hashing is the transformation of a string of characters into a usually shorter fixed-length
value or key that represents the original string. Hashing is used to index and retrieve items in
a database because it is faster to find the item using the shorter hashed key than to find it using
the original value. It is also used in many encryption algorithms.
As a simple example of the use of hashing in databases, a group of people could be arranged in a
database like this:
Abernathy, Sara
Epperdingle, Roscoe
Moore, Wilfred
Smith, David
(and many more sorted into alphabetical order)
Each of these names would be the key in the database for that person's data. A database search
mechanism would first have to start looking character-by-character across the name for matches
until it found the match (or ruled the other entries out). But if each of the names were hashed, it
might be possible (depending on the number of names in the database) to generate a unique four-digit key for each name. For example:
7864 Abernathy, Sara
9802 Epperdingle, Roscoe
1990 Moore, Wilfred
8822 Smith, David
(and so forth)
A search for any name would first consist of computing the hash value (using the same hash
function used to store the item) and then comparing for a match using that value. It would, in
general, be much faster to find a match across four digits, each having only 10 possibilities, than
across an unpredictable value length where each character had 26 possibilities.
The hashing algorithm is called the hash function (and probably the term is derived from the idea that
the resulting hash value can be thought of as a "mixed up" version of the represented value). In addition
to faster data retrieval, hashing is also used in creating and verifying digital signatures (used to
authenticate message senders). The sender applies the hash function to the message to produce a
hashed value (known as a message-digest), signs that digest, and sends the signature along with the
message. Using the same hash function as the sender, the receiver derives
a message-digest from the received message and compares it with the digest recovered from the signature. They
should be the same.
The hash function is used to index the original value or key and then used later each time the data
associated with the value or key is to be retrieved. Thus, hashing is always a one-way operation. There's
no need to "reverse engineer" the hash function by analyzing the hashed values. In fact, the ideal hash
function can't be derived by such analysis. A good hash function also should not produce the same hash
value from two different inputs. If it does, this is known as a collision. A hash function that offers an
extremely low risk of collision may be considered acceptable.
Here are some relatively simple hash functions that have been used:
The division-remainder method: The number of items in the table is estimated. That
number is then used as a divisor into each original value or key to extract a quotient and a
remainder. The remainder is the hashed value. (Since this method is liable to produce a number
of collisions, any search mechanism would have to be able to recognize a collision and offer an
alternate search mechanism.)
Folding: This method divides the original value (digits in this case) into several parts, adds the
parts together, and then uses the last four digits (or some other arbitrary number of digits that
will work ) as the hashed value or key.
Radix transformation: Where the value or key is digital, the number base (or radix) can be
changed resulting in a different sequence of digits. (For example, a decimal numbered key could
be transformed into a hexadecimal numbered key.) High-order digits could be discarded to fit a
hash value of uniform length.
Digit rearrangement: This is simply taking part of the original value or key such as digits in
positions 3 through 6, reversing their order, and then using that sequence of digits as the hash
value or key.
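Two of the methods above, division-remainder and folding, can be sketched in C as follows. The function names, the use of unsigned long keys, and the choice to fold in groups of four decimal digits taken from the right are assumptions made for this example.

```c
/* Division-remainder: divide the key by the estimated table size and
   use the remainder as the hash. table_size must not be zero. */
unsigned long hash_division(unsigned long key, unsigned long table_size)
{
    return key % table_size;
}

/* Folding: split the decimal digits of the key into four-digit parts
   (starting from the right), add the parts, keep the last four digits. */
unsigned long hash_folding(unsigned long key)
{
    unsigned long sum = 0;
    while (key > 0) {
        sum += key % 10000;   /* take the lowest four digits as one part */
        key /= 10000;         /* drop them and continue with the rest */
    }
    return sum % 10000;
}
```

For the key 123456789 the parts are 6789, 2345 and 1, which add up to 9135. Different keys can still produce the same value with either method, so a table built on them needs a way to handle collisions.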
A hash function that works well for database storage and retrieval might not work as well for cryptographic
or error-checking purposes. There are several well-known hash functions used in cryptography. These
include the message-digest hash functions MD2, MD4, and MD5, used for hashing digital signatures into
a shorter value called a message-digest, and the Secure Hash Algorithm (SHA), a standard algorithm
that makes a larger (160-bit) message digest and is similar to MD4.