AASTMT Engineering and Technology College
CC 215
DATA STRUCTURES
Lecture 1
Dr. Manal Helal - Fall 2014
Staff
• Instructor
  o Dr. Manal Helal, [email protected]
• TAs
  o Eng. Nour El Din S Eissa, [email protected]
Fall 2014
CC 215 Data Structures
Course Description

The course covers the difference between static and dynamic data types. Pointers and dynamic memory allocation are discussed so that students gain practical programming experience with dynamic structures.
Course Topics
• Introduction to static vs. dynamic data structures
• Stack data type
• Implementation of stack in different applications
• Queue data type
• Introduction to dynamic programming using pointers
• Linked lists
• Doubly and circular linked lists
• Introduction to tree structures
• Tree traversals
• Threaded tree
• Graph representation and traversals
• Graphs: minimum spanning tree and shortest path
Grading Scheme

Week 7
  Quizzes            5%
  Lab Submissions    2.5%
  Assignments        2.5%
  Midterm 1          20%
Week 12
  Quizzes            5%
  Lab Submissions    2.5%
  Assignments        2.5%
  Midterm 2          10%
Project              10%
Final Exam           40%
Course Rules


Website:
• http://moodle.manalhelal.com/course/view.php?id=4
• Sign up using your first and last names exactly as they appear in your AASTMT records, and use your AASTMT student ID as your student ID. Otherwise, your grades will not be transferred to your academy records.
• Log in often to follow up with assignment deadlines and announcements. It is your responsibility to keep up with the course. Failure to receive emails or notifications is no excuse. Everything is announced in class and then published on Moodle.
Academic Honesty:
• Please conform to the AASTMT policies regarding plagiarism and cheating. A zero grade will be given for the violating submission the first time, and it will then be reported to the department for further action.
• Please ask questions in the online forum to allow everyone to join and benefit from the discussions, and avoid private emails to the lecturers and TAs. Contact the teachers privately only about your personal grades or circumstances, not about the course content.
How to score A+ in this course?
Please do all practicals and assignments and
study regularly. In case of problems, please ask
questions. Accumulating problems will make
things worse as the semester goes by.
Office Hours
• Dr. Manal Helal – Room 308
  o Sunday and Monday, 2:00-4:00 p.m., or by appointment
• Eng. Nour – to be announced
Textbook
• Data Structures and Algorithm Analysis in C++, by Weiss
• See Web page (syllabus) for errata and source code
Class Overview
• Introduction to many of the basic data structures used in computer software
  o Understand the data structures
  o Basically analyze the algorithms that use them (more in the CC412 Algorithms course).
  o Know when to apply them
• Practice design and analysis of data structures.
• Practice using these data structures by writing programs.
• Data structures are the plumbing and wiring of programs.
Goal
• You will understand
  o what the tools are for storing and processing common data types
  o which tools are appropriate for which need
• So that you will be able to
  o make good design choices as a developer, project manager, or system customer
Data Structures: What?
• Need to organize program data according to problem being solved
• Abstract Data Type (ADT) - A data object and a set of operations for manipulating it
  o List ADT with operations insert and delete
  o Stack ADT with operations push and pop
• Note similarity to Java classes
  o private data structure and public methods
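The similarity between an ADT and a Java class can be sketched as follows. This is a minimal illustration, not code from the course: the class name IntListSketch and its method names are hypothetical, and the hidden representation (a java.util.ArrayList) is just one possible choice.

```java
import java.util.ArrayList;

// Hypothetical List ADT of ints: the representation is private,
// and callers only see the public operations insert, delete, get and size.
final class IntListSketch {
    // Hidden representation; it could be swapped without changing callers.
    private final ArrayList<Integer> items = new ArrayList<>();

    // insert: place value at position index (0 <= index <= size()).
    public void insert(int index, int value) {
        items.add(index, value);
    }

    // delete: remove and return the element at position index.
    public int delete(int index) {
        return items.remove(index);
    }

    // get: read the element at position index without removing it.
    public int get(int index) {
        return items.get(index);
    }

    public int size() {
        return items.size();
    }
}
```

A caller would write, for example, `list.insert(0, 10)` and `list.delete(0)` without ever knowing how the elements are stored.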
Data Structures: Why?
• Program design depends crucially on how data is structured for use by the program
  o Implementation of some operations may become easier or harder
  o Speed of program may dramatically decrease or increase
  o Memory used may increase or decrease
  o Debugging may become easier or harder
Terminology
• Abstract Data Type (ADT)
  o Mathematical description of an object with set of operations on the object. Useful building block.
• Algorithm
  o A high level, language independent, description of a step-by-step process
• Data structure
  o A specific family of algorithms for implementing an abstract data type.
• Implementation of data structure
  o A specific implementation in a specific language
Algorithm Analysis: Why?
• Correctness:
  o Does the algorithm do what is intended?
• Performance:
  o What is the running time of the algorithm?
  o How much storage does it consume?
• Different algorithms may correctly solve a given task
  o Which should I use? Answered in CC412.
Proof by Induction
• Basis Step: The algorithm is correct for the base case (e.g. n=0) by inspection.
• Inductive Hypothesis (n=k): Assume that the algorithm works correctly for the first k cases, for an arbitrary k.
• Inductive Step (n=k+1): Given the hypothesis above, show that the k+1 case will be calculated correctly.
Program Correctness by Induction
• Basis Step: sum(v,0) = 0.
• Inductive Hypothesis (n=k): Assume sum(v,k) correctly returns the sum of the first k elements of v, i.e. v[0]+v[1]+…+v[k-1]
• Inductive Step (n=k+1): sum(v,k+1) returns v[k]+sum(v,k), which is the sum of the first k+1 elements of v.
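The function sum(v,n) that the induction argument reasons about is not shown on the slide. A minimal Java sketch, assuming v is an int array and 0 <= n <= v.length (the class name RecursiveSum is hypothetical), could look like this:

```java
// Hypothetical recursive sum matching the shape used in the induction argument.
final class RecursiveSum {
    // Returns v[0] + v[1] + ... + v[n-1]; assumes 0 <= n <= v.length.
    static int sum(int[] v, int n) {
        if (n == 0) {
            return 0; // basis step: the sum of zero elements is 0
        }
        // inductive step with k = n - 1: sum(v, k+1) = v[k] + sum(v, k)
        return v[n - 1] + sum(v, n - 1);
    }
}
```

Each recursive call handles one smaller case, which is exactly why the inductive step only needs to trust sum(v,k).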
Algorithm Execution-Time Analysis

• (Simplified) definitions.
  o Best-Case Execution Time (BCET): The shortest time that the algorithm takes to solve the problem.
  o Worst-Case Execution Time (WCET): The longest time that the algorithm takes to solve the problem.
  o Average-Case Execution Time (ACET): The expected time that the algorithm takes to solve the problem on average.
    - The average execution time is the arithmetic mean of all execution times if all inputs are equally likely to occur; otherwise we use probability distributions to compute it.
    - We will not focus on probability distributions of execution times in this course; this is for your own knowledge.
• How do we compute the above?
  o Not easy.
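To make the three definitions concrete, here is a small sketch that counts comparisons instead of measuring time. Linear search is not part of these slides; it is used here only as an assumed example, and the names LinearSearchSteps, comparisons and averageComparisons are hypothetical. The average assumes a non-empty array of distinct values in which every position is equally likely to be the target.

```java
// Hypothetical step counter for linear search, used to illustrate best, worst and average case.
final class LinearSearchSteps {
    // Number of comparisons linear search performs before finding target in a
    // (a.length comparisons if target is absent).
    static int comparisons(int[] a, int target) {
        int count = 0;
        for (int x : a) {
            count++;
            if (x == target) {
                break; // found: stop searching
            }
        }
        return count;
    }

    // Arithmetic mean of the comparison counts when each element of a
    // (assumed distinct and non-empty) is equally likely to be the target.
    static double averageComparisons(int[] a) {
        double total = 0;
        for (int x : a) {
            total += comparisons(a, x);
        }
        return total / a.length;
    }
}
```

For a five-element array, the best case is the first element (1 comparison), the worst case is the last element (5 comparisons), and the mean over all positions is 3 comparisons.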
Algorithm Execution-Time Analysis

• Sum the natural numbers 1 to n.
  o Arithmetic series.
  o We know that SUM = 1+2+...+n = n(n+1)/2.

A1
Function SUM(n: ℕ) : ℕ
    result := n(n+1)/2
    return result

A2
Function SUM(n: ℕ) : ℕ
    result := 0
    for i := 1 to n do
        result := result + i
    end for
    return result

• Which algorithm is faster, A1 or A2?
• Note the use of pseudocode.
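The two pseudocode algorithms translate almost line by line into Java. The following is a sketch under assumptions: the class and method names are hypothetical, n is taken to be non-negative, and long is used so that moderately large n does not overflow (very large n still would).

```java
// Hypothetical Java translation of the pseudocode algorithms A1 and A2.
final class SumAlgorithms {
    // A1: closed-form formula for the arithmetic series, constant number of steps.
    static long sumFormula(long n) {
        long result = n * (n + 1) / 2;
        return result;
    }

    // A2: explicit loop, one addition per value from 1 to n.
    static long sumLoop(long n) {
        long result = 0;
        for (long i = 1; i <= n; i++) {
            result = result + i;
        }
        return result;
    }
}
```

Both methods return the same value for every n; they differ only in how many steps they take to get there.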
Pseudocode
• In the lectures algorithms will be presented in pseudocode.
  o This is very common in the computer science literature
  o Pseudocode is usually easily translated to real code.
  o This is programming language independent
Algorithm Execution-Time Analysis

• Algorithm A1
  o A1 computes the result in a constant number of steps at runtime.
  o What is the execution time T(A1) for A1?
    - T(A1) = β₁ (β₁ is some constant)
  o T(A1) does not depend on the input.
    - If input is 1, 10, 100, etc. T(A1) will not change (significantly).
• Algorithm A2
  o A2 computes the result in a variable number of steps at runtime depending on input size.
  o What is the time T(A2) for A2?
    - T(A2) = α₂n + β₂ (α₂, β₂ are some constants)
  o T(A2) depends on the input size
    - If input is 1, 10, 100, etc. T(A2) will change.
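The constants can be made tangible by counting steps while the algorithms run. The sketch below assumes a very simple cost model that is not from the slides: every executed statement (assignment, loop body, return) costs exactly one step. The class and method names are hypothetical.

```java
// Hypothetical step counter under an assumed cost model of one step per executed statement.
final class StepCounter {
    // Steps taken by A1 (closed-form formula) for input n.
    static long stepsA1(long n) {
        long steps = 0;
        long result = n * (n + 1) / 2;
        steps++; // result := n(n+1)/2
        steps++; // return result
        return steps;
    }

    // Steps taken by A2 (loop) for input n.
    static long stepsA2(long n) {
        long steps = 0;
        long result = 0;
        steps++; // result := 0
        for (long i = 1; i <= n; i++) {
            result += i;
            steps++; // one step per loop iteration
        }
        steps++; // return result
        return steps;
    }
}
```

Under this model A1 always takes 2 steps, while A2 takes n + 2 steps, i.e. a linear function of n with slope 1 and offset 2. A different cost model would change the constants but not the constant versus linear shape.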
Algorithm Execution-Time Analysis

• Before we start to formalise the asymptotic-complexity analysis of algorithms, we make sure of the following.
  o We are interested in computing the growth function of an algorithm, which shows how the number of steps of the algorithm varies in terms of the size of its input.
    - On the Y axis: the number of runtime steps.
    - On the X axis: the size of the input.
• The growth function of an algorithm has the form f(n).
  o n is the size of the input to the algorithm (X axis).
  o f(n) is the number of runtime steps to run the algorithm given input n (Y axis).
  o Saying that the asymptotic complexity of an algorithm with growth function f(n) is O(g(n)) means that the asymptotic growth of f is bounded from above by that of g. More in CC412 and on the Moodle website.
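The phrase "bounded from above" can be stated formally. The following is the standard textbook definition written in LaTeX; the constants c and n_0 are names introduced here and do not appear on the slides. The second line applies it to the growth function of A2, assuming both of its constants are positive.

```latex
f(n) \in O(g(n)) \iff \exists\, c > 0,\ \exists\, n_0 \ge 0 \ \text{such that}\ f(n) \le c \cdot g(n) \ \text{for all}\ n \ge n_0

\alpha_2 n + \beta_2 \le (\alpha_2 + \beta_2)\, n \ \text{for all}\ n \ge 1 \quad\Longrightarrow\quad T(A2) \in O(n) \ \text{with}\ c = \alpha_2 + \beta_2,\ n_0 = 1
```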

Algorithm Execution-Time Analysis

• Which of A1 and A2 is faster?
  o T(A1) = β₁
  o T(A2) = α₂n + β₂
  o When is this the case?
Algorithm Execution-Time Analysis

• We say that A1 has a constant growth.
  o Its execution time does not depend on the input value/size.
  o T(A1) = β₁
• We say that A2 has a linear growth in terms of the input.
  o Its execution time grows as a linear function in terms of the input value/size.
  o T(A2) = α₂n + β₂
• A1 is more efficient than A2.
  o When input is small, A2 might be more efficient.
    - Analysis of algorithms when input is small is generally not interesting, as they spend most of the time in these cases in start-up code.
  o When input is large, A1 is more efficient.
    - We are interested in the cases when input is large, also called the difficult instances.
• So A1 is more efficient than A2.
Asymptotic-Complexity Classes
Name            Symbol
Constant        O(1)
Logarithmic     O(log n)
Linear          O(n)
Log-linear      O(n log n)
Quadratic       O(n²)
Cubic           O(n³)
Polynomial      O(n^p)
Exponential     O(b^n)
Factorial       O(n!)
Incomputable    O(∞)
Algorithms vs Programs
• Proving correctness of an algorithm is very important
  o a well designed algorithm is guaranteed to work correctly and its performance can be estimated
• Proving correctness of a program (an implementation) is fraught with weird bugs
  o Abstract Data Types are a way to bridge the gap between mathematical algorithms and programs
Algorithms and Data
• Algorithms operate on data.
  o An algorithm is executed on the processor.
  o The data is stored in main memory.
• Main memory:
  o Organised into addressable cells/locations that store data.
Arrays

An array (data structure) is a collection of data items or elements
(variables, values) which are identified by integer indices. All the
elements in an array have the same type.
Arrays and Records

• The memory address of each array’s cell can be computed from its index using a very simple mathematical formula.
  o The address of the first cell of the array is:
    - address(A[0]) = s.
    - This is implementation dependent.
  o The address of the ith cell is:
    - address(A[i]) = s + i.
    - This assumes each cell occupies one addressable location; if each cell occupies w locations, the address is s + i·w.
• In order to access an element of the array:
  o The address of that element is computed in constant time.
    - This is true because only simple arithmetic is performed.
  o Array access is O(1).
• Records (also called Structures).
  o Similar to arrays but they can store items of different types.
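The address formula and the idea of a record can be sketched in Java. Java does not expose real memory addresses, so addressOf below only models the arithmetic; the class names, the cellWidth parameter and the fields of StudentRecord are all hypothetical and chosen for illustration.

```java
// Hypothetical model of array addressing: no real memory is touched.
final class ArrayAddressing {
    // Address of cell `index` for an array whose first cell is at address `base`
    // and whose cells each occupy `cellWidth` addressable locations.
    static long addressOf(long base, int index, int cellWidth) {
        return base + (long) index * cellWidth; // one multiplication and one addition: O(1)
    }
}

// A record groups items of different types under one name (hypothetical fields).
final class StudentRecord {
    final String name;
    final int id;
    final double grade;

    StudentRecord(String name, int id, double grade) {
        this.name = name;
        this.id = id;
        this.grade = grade;
    }
}
```

With cellWidth equal to 1 the method reduces to the slide's formula s + i; the work done is the same no matter how large the index is.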
Arrays and Records

• Arrays and records are contiguous data structures.
  o Their elements are located next to each other in main memory.
• Advantages of using arrays and records.
  o We can retrieve an array element from its index in constant time, O(1), meaning the cost of looking up an element of an array or record does not grow with the size of the array.
    - This is very important.
  o Consist solely of data, no space wasted on links.
    - Compare to linked lists later.
  o Physical continuity/memory locality: if we look up element i, there is a high probability we will look up element i+1 next – this is exploited by cache memory.
Arrays and Records

• Disadvantages of using arrays and records.
  o Static arrays are non-flexible static structures.
    - We have to decide in advance how much space we want when the array is allocated.
    - Once the block of memory for the array has been allocated, that’s it – we’re stuck with the size we’ve got.
  o Insertion/deletion is expensive.
    - Requires shifting.
    - Requires re-copying if the array is full.
  o We can compensate by always allocating arrays larger than we think we’ll need, but this wastes a lot of space.
• Dynamic arrays grow/shrink in size at runtime so they are relatively flexible.
  o E.g. ArrayList data structure in Java.
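The cost of insertion into an array, shifting and, when the array is full, copying into a bigger one, can be sketched as follows. This is an illustration under assumptions, not how ArrayList is actually implemented: the class name, the method signature and the doubling policy are choices made here.

```java
import java.util.Arrays;

// Hypothetical array insertion showing both shifting and re-copying.
final class ArrayInsertDemo {
    // Inserts value at position index among the first `size` used cells of a.
    // Returns the array that holds the result, which is a new, larger array if a was full.
    static int[] insertAt(int[] a, int size, int index, int value) {
        if (size > a.length || index < 0 || index > size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        if (size == a.length) {
            // Array is full: copy every element into a bigger block (assumed policy: double).
            a = Arrays.copyOf(a, Math.max(1, 2 * a.length));
        }
        // Shift the elements at index..size-1 one cell to the right.
        for (int i = size; i > index; i--) {
            a[i] = a[i - 1];
        }
        a[index] = value;
        return a;
    }
}
```

Inserting at the front shifts every stored element, so in the worst case a single insertion touches all n elements.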
Linked Lists

• Static arrays are used when we know a priori (i.e., before implementation) how many items we need to store.
• For example:
  o Ask the user to input 20 numbers to sort.
    - A static array is used.
    - We know the user will input 20 numbers.
  o Ask the user to enter some numbers to sort.
    - A dynamic array is used.
      Not all programming languages have efficient support for dynamic arrays.
      Programming a dynamic array based on dynamic resizing and copying of elements is not efficient.
    - A linked-list is used.
      We don’t know how many numbers the user will input.
      We can always program it efficiently.
Linked Lists

• Advantages of using linked lists:
  o Very flexible.
    - We don’t need to worry about allocating space in advance, can use any free space in memory. We only run out of space when the whole memory is actually full.
  o When doing insertion and deletion no shifting is required.
  o More efficient for moving large records (leave data in same place in memory, just change some pointers).
• Disadvantages of using linked lists:
  o Wasted space: We store both pointers and data.
  o To find the ith item, we must start at the beginning and follow pointers until we get there. In the worst case, if there are n items in a list and we want the last one, we have to do n lookups.
    - So retrieving an element from its position in the list is O(n).
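A minimal singly linked list makes the trade-off visible. This sketch is not the Node.java or SinglyLinkedList.java published for the course; the class name SinglyLinkedListSketch and its methods are hypothetical and restricted to int data.

```java
// Hypothetical minimal singly linked list of ints.
final class SinglyLinkedListSketch {
    // Each node stores its data plus one link, which is the "wasted space" mentioned above.
    private static final class Node {
        final int data;
        Node next;

        Node(int data, Node next) {
            this.data = data;
            this.next = next;
        }
    }

    private Node head; // first node, or null when the list is empty
    private int size;

    // Adds value at the front: no shifting, only one link changes.
    public void addFirst(int value) {
        head = new Node(value, head);
        size++;
    }

    // Returns the element at position index by following links from the head: O(n).
    public int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Node current = head;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        return current.data;
    }

    public int size() {
        return size;
    }
}
```

Nodes are allocated one at a time wherever memory is free, so no capacity has to be chosen in advance, but get(size() - 1) has to walk the whole chain.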
Performance



• We are interested in knowing how costly it is to:
  o Access an item in the data structure.
  o Traverse the data structure (e.g., to search).
  o Insert or delete an item from the data structure.
• Arrays
  o Access is constant.
  o Traversal is linear.
  o Insertion/Deletion
    - Requires shifting if array is not full.
    - Requires copying in a bigger array if array is full.
• Linked lists.
  o Access is linear.
  o Traversal is linear.
  o Insertion/Deletion are constant once the node at the position is known; reaching that position is linear.
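The constant cost of linked-list insertion and deletion applies when we already hold a reference to the neighbouring node. A sketch of that case, with hypothetical names (LinkSplicing, insertAfter, removeAfter, render) and int data:

```java
// Hypothetical helpers that insert or delete next to a known node in constant time.
final class LinkSplicing {
    static final class Node {
        int data;
        Node next;

        Node(int data) {
            this.data = data;
        }
    }

    // Inserts a new node holding value right after node: two link updates, O(1).
    static Node insertAfter(Node node, int value) {
        Node fresh = new Node(value);
        fresh.next = node.next;
        node.next = fresh;
        return fresh;
    }

    // Unlinks the node that follows node, if there is one: one link update, O(1).
    static void removeAfter(Node node) {
        if (node.next != null) {
            node.next = node.next.next;
        }
    }

    // Traversal visits every node once, so it is linear in the list length.
    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(n.data);
        }
        return sb.toString();
    }
}
```

Neither insertAfter nor removeAfter contains a loop, which is what separates them from the shifting needed in an array.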
Singly, Doubly, and Circularly Linked Lists



• The linked lists we have seen so far are called singly linked lists.
  o A node points to its successor in the list.
  o A node has one pointer only.
• There are also doubly linked lists.
  o A node points both to its successor and its predecessor in the list.
  o The node has two pointers.
  o Consume more space but allow easier traversal for some algorithms.
• There are also circularly singly linked lists.
  o The tail points back to head.
  o There is no “real” head or tail.
  o A special node called the cursor is used, e.g. to know where to start and finish traversing the circular list.
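The node shapes and the role of the cursor can be sketched as follows. All names here (DoublyNode, CircularListSketch, circularOf, sumFrom) are hypothetical, the data is restricted to int, and circularOf assumes it is given at least one value.

```java
// Hypothetical doubly linked node: one link to the predecessor, one to the successor.
final class DoublyNode {
    int data;
    DoublyNode prev;
    DoublyNode next;
}

// Hypothetical circular singly linked list traversed from a cursor.
final class CircularListSketch {
    static final class Node {
        final int data;
        Node next;

        Node(int data) {
            this.data = data;
        }
    }

    // Builds a circular list from values (at least one) and returns the first node as the cursor.
    static Node circularOf(int... values) {
        Node first = new Node(values[0]);
        Node last = first;
        for (int i = 1; i < values.length; i++) {
            last.next = new Node(values[i]);
            last = last.next;
        }
        last.next = first; // the tail points back to the head
        return first;
    }

    // Visits every node exactly once: start at the cursor, stop on returning to it.
    static int sumFrom(Node cursor) {
        int total = 0;
        Node current = cursor;
        do {
            total += current.data;
            current = current.next;
        } while (current != cursor);
        return total;
    }
}
```

Because there is no null link to stop at, the traversal compares against the cursor instead; starting from any node of the same list gives the same total.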
Example

• See the source code on Moodle (now in Java, and will add more in C).
  o A node of the list.
    - Node.java
  o A singly linked list.
    - SinglyLinkedList.java
  o An example usage of linked list.
    - SinglyLinkedListExample.java
  o An example usage of Java implementation of linked list.
    - SinglyLinkedListExampleUsingJavaLibraries.java
Concrete Data Structures and Abstract Data Structures/Types
• Concrete Data Structures
  o Arrays, records, and linked lists are concrete data types.
  o They are provided by the computer language.
  o They are stored at specific addresses in memory.
• Abstract Data Structures/Types (ADTs)
  o Offer a higher-level view of our interaction with data, and comprise
    - Data.
    - Operations on this data.
  o We describe the behaviour of our data structures in terms of abstract operations.
  o However, the way these operations are implemented will affect efficiency
    - There are different implementations of the same abstract operations
    - We want the ones we will use most commonly to be the most efficient
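The split between abstract operations and concrete implementations maps naturally onto a Java interface with more than one implementing class. The sketch below uses a stack of ints as the ADT; the names IntStack, ArrayIntStack and LinkedIntStack are hypothetical, and the doubling growth policy of the array version is an assumption.

```java
import java.util.Arrays;

// The ADT: only the abstract operations, no representation.
interface IntStack {
    void push(int value);
    int pop();
    boolean isEmpty();
}

// Concrete implementation 1: backed by an array that is re-copied when full.
final class ArrayIntStack implements IntStack {
    private int[] items = new int[4];
    private int size;

    public void push(int value) {
        if (size == items.length) {
            items = Arrays.copyOf(items, 2 * items.length); // occasional O(n) copy
        }
        items[size++] = value;
    }

    public int pop() {
        if (size == 0) {
            throw new IllegalStateException("pop from an empty stack");
        }
        return items[--size];
    }

    public boolean isEmpty() {
        return size == 0;
    }
}

// Concrete implementation 2: backed by linked nodes, one allocation per push.
final class LinkedIntStack implements IntStack {
    private static final class Node {
        final int data;
        final Node next;

        Node(int data, Node next) {
            this.data = data;
            this.next = next;
        }
    }

    private Node top;

    public void push(int value) {
        top = new Node(value, top);
    }

    public int pop() {
        if (top == null) {
            throw new IllegalStateException("pop from an empty stack");
        }
        int value = top.data;
        top = top.next;
        return value;
    }

    public boolean isEmpty() {
        return top == null;
    }
}
```

Code written against IntStack behaves the same with either class; only the time and space costs of the operations differ.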
Stacks
Stack of books
Stack of Plates
Queues
Please learn to queue in your everyday life, and teach/tell people around you to queue.
A queue of people
• Queuing is an important rule in life.
• By not queuing:
  o You take other people’s rights.
  o You show you are not civilised.
For Next Week:
• Read Chapter 3: particularly about Lists and Stack ADT.
• Exercise: In pseudocode, swap two adjacent elements in a list using:
  o An Array
  o A Linked List by adjusting only the links (and not the data) for:
    a. singly linked lists
    b. doubly linked lists