Download 1 Fundamental data structures

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Array data structure wikipedia , lookup

Transcript
1 Fundamental data structures
1.1 Introduction
•
The modern digital computer was invented and intended as a device that should facilitate
and speed up complicated and time-consuming computations.
• In the majority of applications, the speed, working frequency and the capability to
store and access large amounts of information plays the dominant part and are
considered to be the primary characteristics of a computing system
• The ability to compute, i.e., to calculate, to perform arithmetic, has in many cases
become almost irrelevant.
• The large amount of information that is to be processed by a computing system in some
sense represents an abstraction of a part of reality.
• The information that is available to the computer consists of a selected set of data
about the actual problem to be solved
• That set is considered relevant to the problem at hand, and it is believed that
from this set of data, the desired results can be derived.
• In this context, the data represent an abstraction of reality
• What is an abstraction?
• An abstraction is a simplification of facts in the sense that certain properties and
characteristics of the real objects are ignored because they are peripheral and
irrelevant to the particular problem.
• In solving a problem with a computer it is necessary to choose an convenient
abstraction of reality
• This can be achieved in two steps:
• (1) Define a set of data that is representative for the real situation. This choice
must be guided by the problem to be solved.
• (2) Then follows a choice of representation of this information. This choice
is guided by the tool that is to solve the problem, i.e., by the facilities offered
by the computer.
• In most cases these two steps are not entirely separable.
• (1) The choose of abstraction is usually determined by the nature of the problem to be
solved.
• This activity is in its turn determined by the tools which will be used in problem
solving, for example the features and facilities offered by a certain computing
system
• (2) The choice of representation of data is often a fairly difficult one, and it is not
uniquely determined by the facilities available. It must always be taken in the light of the
operations that are to be performed on the data.
• It is well known that the internal representation of information inside a computing
system is based on binary digits (bits)
• This representation is unsuitable for human beings because of the usually large
number of digits involved
• But this representation is most suitable for electronic circuits because the two values
0 and 1 can be represented conveniently and reliably by the presence or absence of
electric currents, electric charge, or magnetic fields.
• For this reason, the choice of data representation in the field of computers can be
achieved at different levels of abstraction, depending on the desired objective and the
programming language that is used.
• In fact, the programming language represents the support or the medium which
supports the abstraction
• In this context, a programming language represents:
• An abstract computer capable of interpreting sentences which are build using the
terms of this language
• The terms defined by the programming language may embody a certain level of
abstraction from the objects used by the actual machine.
• Thus, the programmer who uses such a higher-level language will be freed (and barred)
from questions of number representation, if the number is an elementary object in the
realm of this language.
• The usage of a programming language offering a fundamental set of abstractions, which
are applicable in the most part of problems related to data processing, is reflected mainly by
the reliability of the resulted programs.
• It’s much simpler to conceive a program based on reasoning with familiar notions
of numbers, sets, sequences, and repetitions than on bits, storage units, and jumps.
• Of course, an actual computer represents all data, whether numbers, sets, or
sequences, as a large mass of bits.
• But this fact is irrelevant to the programmer as long as he or she does not have to
worry about the details of representation of the chosen abstractions.
• The corresponding representation chosen by the computer (or compiler) is
reasonable for the stated purposes, and is transparent from the programmer point
of view.
• The programmer can operate with abstract notions, which are easier to
understand and to manipulate by him, because there are specialized programs
named translators (compilers), which automatically convert the abstract
notions in fundamental terms of the target computing system.
• The closer the abstractions are to a given computer:
• (1) The easier it is to make a representation choice for the engineer or
implementer of the language,
• (2) The lower is the probability that a single choice will be suitable for all (or
almost all) conceivable applications.
• This fact sets definite limits on the degree of abstraction from a given real computer, or
for a given programming language.
• The area of the programming languages considered in this course includes languages as C
and C++
• These languages cover in a representative manner a large domain including:
• (1) Machine-oriented and machine-dependent language, which offer
support for solving representation problems in an open manner
• (2) High level language involving an high degree of abstraction, which
solve automatically the representation problems
• In this context, the programmer can use the convenient abstraction level in
accordance with the application to be solved.
• In other words, the process of abstraction and representation of data used in
programming activity is supported by the programming language.
• How is this achieved?
1.2 Data Types. Abstract Data Types
1.2.1 The Concept of Data Type
• In data processing, a clear distinction is made between:
• (1) Real numbers and variable
• (2) Complex numbers and variables
• (3) Logical variables
• (4) Variables representing individual values, or sets of values, or sets of sets, or
between functions, sets of functions, and so on.
• In this sense, we will adhere to the principle that every constant, variable, expression, or
function is of a certain type.
• A type is essentially characterized by:
• (1) The set of values to which a constant belongs, or which can be assumed by a
variable or expression, or which can be generated by a function belonging to this
type
• (2) A certain degree of structuring (organizing) of the information
• (3) A set of specific operators
• In mathematical texts, the type of a variable is usually deducible from the typeface,
respectively from the shape of the fonts used to represent it, without consideration of
context;
• This is not feasible in computer programs where usually there are available a
reduced set of fonts
• That is the reason for in practice, is therefore widely accepted the rule that the associated
type is made explicit in a declaration of the constant, variable, or function, and that this
declaration textually precedes the application of that constant, variable, or function.
• This method simplify the translator activity, permitting an automatic verification
of the legality of using the associated entities of the considered type
• The primary characteristics of the concept of type are the following:
• (1) A data type determines the set of values to which a constant belongs, or which
may be assumed by a variable or an expression, or which may be generated by an
operator or a function.
• (2) The type of a value denoted by a constant, variable, or expression may be
derived from its form or its declaration without the necessity of executing the
computational process.
• (3) Each operator or function expects arguments of a fixed type and yields a
result of a fixed type.
• If an operator admits arguments of several types (e.g., + is used for addition
of both integers and real numbers), then the type of the result can be
determined from specific language rules.
• (4) A type presume a certain degree of structuring of the information
• As a consequence, obeying these rules, a compiler may use this information on types to
check the legality of various language constructs, during compilation phase, without the
necessity of execution of the generated code.
• This kind of redundancy in the program text is extremely useful as an aid in the
development of programs, and it must be considered as the primary advantage
of good high level languages over machine code (or symbolic assembly code).
• From the computing system point of view, the store is a homogeneous mass of bits
without apparent structure.
• But it is exactly the abstract data structure, which alone are enabling human
programmers to recognize meaning in the monotonous landscape of a computer
store.
1.2.2 Abstract Data Types
• Definition:
• An Abstract Data Type (ADT) is defined as a mathematical model to which, a
set of specific operators is added
• The main characteristics of an abstract data type are
• (1) An ADT generalizes the primitive data types (integer, real, etc.), as
functions and procedures are generalizations of the primitive operators (+, -, etc.).
• (2) An ADT conceptual encapsulates a data type in the sense that from the
logical and physical point of view, type definition and all the related operations
concerning this type, are located in a single section (zone) of the program.
• If the necessity of the ADT implementation modification appears, the
programmer:
• (2.1) Knows precisely the place where the modifications must be
done
• (2.2) Can be sure that the modification of the specific section, doesn’t
affect the rest of the program code, if the format of the ADT’s
operators is not modified.
• (3) In addition, outside of the definition section, the target ADT can be treated as a
primitive type
• In this case, a user can treat an ADT defined in this manner, in terms of the
associated operators, being not concerned in any way about their
implementation.
• A problem: certain operators can refer different ADTs. In this case, the
access to those operators must be done in a specific manner in
concordance with the definition section of the each ADT.
• The major disadvantage of using ADT resides in the necessity of a rigorous access
discipline:
• All the references to the encapsulated data of an ADT, must be done only through the
specific ADT’s operators
• This requirement is not verifiable at the compilation level, and must be accepted as
an imposed constraint of the programming discipline.
--------------------------------------------------------------------------------------------------------------------Example 1.2.2
• An List abstract data type is considered:
• The nodes of the list belong to a defined type TypeNode.
• Let be:
• L - an instance of List ADT (L: List),
• v -a node variable (v: TypeNode)
• c - a reference to a list’s node (c: TypeNodeReference).
• A possible set of operators for the ADT List is the following:
1. EmptyList(L:List); - operator which initializes L as an empty list;
2. c:= First(L:List); - operator which returns the reference to the first node of the list,
or nil indicator if the list is empty.
3. c:= Next(L:List); - operator which returns the reference to the next node of the list, or
nil indicator if such a node doesn’t exist.
4. Insert(v:TypeNode, L:List); - operator which inserts node v in list L after the
current node.
5. Retrieve(v:TypeNode, L:List); - operator which supplies in v the content of the
current node of the list L.
---------------------------------------------------------------------------------------------------------------------
• Important remarks over List ADT particularities.
1) Once List ADT was defined and implemented, all lists’ processing can be realized in
terms of the List ADT operators, declaring an convenient number of corresponding
instances of the ADT, respectively one specific instance for each list.
2) From the programmer point of view, the operators shape remain the same as they were
initially defined, doesn’t matter what concrete manner was used for ADT
implementation.
• Even accepting that the ADT implementation is modified in the current
application, for example for a better efficiency, the user programs remain
unchanged. Modifications are supported only by the section that includes
the implementation of the ADT.
3) There isn’t any limit referring to the number of operations that can be applied to the
instances of a specified mathematical model.
• Is necessary to underline that each set of operators is defined for, and
refers a distinct ADT.
4) The efficiency, reliability and correctness of using List ADT is guaranteed only if the
user realizes accesses to the declared lists, only by using the specific operators
associated to the corresponding type. The programmer must respect strictly the
imposed restrictions.
1.2.2.1 Defining an Abstract Data Type (ADT)
• As it was mentioned, an ADT is conceived as a mathematical model to which is
attached a set of operators, defined on this model.
• As result, defining an ADT presumes:
• (1) The mathematical model specification
• (2) The definition of the associate operators
• Programming languages include specific methods for data types definition
• In all the cases the starting point is a fundamental set of types named Primitive Types
• Primitive Types plays the role of bricks in the process of data structures
development
• There are two categories of Primitive Types:
• (1) Standard or Predefined Data Types
• (2) Primitive Data Types defined by user
• (1) Standard or Predefined Types include usually numbers and logical values
• These types are based on well-known standard mathematical models (set of
integer, real or logical numbers) and implement the specific operators defined
over this categories of numbers
• If the individual values of the type are subject of an order relation, then the type
is named ordered or scalar
• (2) Primitive Data Types defined by user
• A well-known method to built new primitive data types is to enumerate the
constitutive values of the type
• For example, in a program concerned with plane geometric figures, we may
introduce a primitive type called ShapeType, whose values may be denoted by the
identifiers rectangle, square, ellipse, and circle.
• In most cases new data types are defined in terms of previously defined data types.
• Values of such a type are usually conglomerates of the previously defined
constitutive types, and they are named structured data types.
• Structured Data Types are usually conglomerates of component values. If there is
only one constituent type, that is, if all components are of the same constituent type, then it
is known as the base type.
• The number of distinct values belonging to a type T is called its cardinality. The
cardinality provides a measure for the amount of storage needed to represent a
variable x of the type T, denoted by x: T.
• For structured data type definition, are used methods for structuring.
• Based on these methods, an entire hierarchy of data types (structures) can be
built.
• The structuring methods can be:
• Static: array, record, set, and sequence.
• Dynamic: list, tree, graph
--------------------------------------------------------------------------------------------------------------------Standard (predefined): integer, real, boolean,…
Primitive Types
(unstructured)
Defined by user: enumeration
DATA
TYPES
Static: array, record, set, sequence
Structured Types
Dynamics: list, tree, graph
----------------------------------------------------------------------------------------------------------------
• In order to use a defined data type, is necessary to declare variables belonging to this
type. This process is known as type instantiation
• The declared variables can be processed using the operators associated to the type.
• There are different kinds of operators:
• (1) Homogenous – can have as operands only instances of the
respective ADT
• (2) Mixed - can have as operands instances of the other types,
respectively instances of the compatible data types.
• (3) Internal – the result of the operator execution produce an instance
of the respective type.
• (4) External - the result of the operator execution produce an instance
that belongs to a different type.
• Categories of operators:
• (1) Primitive or atomic operators – this are usually implemented in
hardware and involve specific build instructions of the computing
system.
• (2) Defined by programming language – this operators usually
include the structuring methods and the access methods to components
• (3) Defined and implemented by the user
• The basic operators are:
• Comparison and assignment - i.e., the test for equality (and for order
in the case of ordered types), and the command to enforce equality.
• The fundamental difference between these two operations is
emphasized by the clear distinction in their denotation
throughout this text.
• Test for equality: x = y (an expression with value
TRUE or FALSE)
• Assignment to x: x:= y (a statement making x equal
to y)
• These fundamental operators are defined for most data types,
but it should be noted that their execution may involve a
substantial amount of computational effort, if the data are large
and highly structured.
• Transfer operators -operators, which organize (constitute) data types
in other data types (generate structures). They are essential for
structured types
• Constructor operators – generate structured values starting from their
components
• Selector operators – extract the component values from a structure
•
Constructors and selectors are in fact transfer operators which organize (constitute) the
constitutive types in structured types and vice-versa
• Each structuring method has its particular pair of constructor-selector
operators which are clear distinguished by their representing manner.
• The transfer operators can be:
• Implicit operators – which are direct implemented by the programming
language (for example the access to a field of a record structure or the access to
an element of an array structure)
• Explicit operators – which are built by the programmer using the facilities of
the programming language (for example the operators processing the nodes of a
list structure)
• Conclusion:
• (1) A type is a manner of structuring information
• (2) For unstructured types is more relevant the set of values (constants) which
constitute the type’s domain than the structuring degree.
• (3) As the type becomes more complex, the structuring degree becomes more an
more relevant
1.2.2.2 Implementation of an ADT
• Implementation of an ADT is in fact a translation: that means that ADT is expressed in
terms of an specific programming language
•
Implementation of an ADT presume two steps:
• (1) To specify the desired structure (mathematical model) in terms of a
programming language.
• (2) To specify the procedures that implement the specific operators, using the
same programming language
• The implementation of an ADT is realized only for the types built by the user.
• For the other types, the specific implementation elements are usually included in
the programming language
• A structured type is built starting from the fundamental types that are defined in the
language, using the available structuring facilities
• The specification of the procedures that implement type’s operators depends on
the manner that was chosen for type implementation
• There is possible to develop a hierarchical implementation of a type.
• That presumes to implement an ADT using the existing implementation of other
ADTs.
• Ideally, the programmers want to built their programs, using languages whose primitive
data types are as much as possible in correspondence with the models and operators of
the desired ADTs
• From this point of view, in many senses, the usual programming languages have
neither enough facilities for developing different ADTs nor to define the specific
operators.
1.2.3 Abstract Data Type. Data Type. Data Structure
• What is the significance of the following terms that are usually confused?
• (1) Abstract Data Type (ADT)
• (2) Data Type (DT) or simply type
• (3) Data Structure (DS)
• (4) Elementary Data or simply data (D)
• (1) As it was specified, an Abstract Data Type is a mathematical model together with the
totality of operations defined on this model.
• At the conceptual level, the algorithms can be designed and expressed in terms of
some abstract data types.
• In order to implement such an algorithm in a programming language, is absolute
necessary to represent those ADT in terms of the data types and of the operators
defined in the respective programming language.
• (2) A Data Type is an implementation of an ADT in a programming language.
• A data type (DT) can’t be used as such.
• Therefore, in the program, the user has to declare variables belonging to the
respective type. These variables can be actually processed by the algorithms
• This process is named instantiation and it leads to different results:
• The instantiation of a unstructured data type produces an elementary data
(ED)
• The instantiation of a structured data type produces a data structures (DS)
• (3) An elementary data is an instance of an unstructured data type
• (4) A data structure is an instance of a structured data type.
• As conclusion, the relationship between this notions is presented in the next
formula:
---------------------------------------------------------------------------
ED (Elementary Data)
ADT
(implementation)
DT
(instantiation)
DS (Data Structure)
----------------------------------------------------------------------------
• Is necessary to be specified that the above formula is valid only for primitive data defined
by the user and for structured data types.
• By implementation we understand type definition (specification)
• By instantiation we understand declaration of a variable associated to the
respective type
• In the case of predefined types, the implementation phase is missing, it being intrinsic
included in the programming language. As result for these types the next formula is valid:
----------------------------------------------------------------------------
DT
(instantiation)
ED
---------------------------------------------------------------------------
1.3 Unstructured Types
1.3.1 Enumeration Type
• In big majority of programs, integers are used even when numerical values are not
directly implied, and the integers represent in fact abstraction of some real entities
• To avoid such situations, the programming languages introduce a new primitive
unstructured data type specified by enumerations of all its possible values.
• Such a type is called an enumeration type. It is definable by enumerating the distinct
values belonging to it: c1,c2,....,cn .
•
Its definition has the form [1.3.1.a.
-------------------------------------------------------------// Enumeration Type definition - Variants
enum typeEnumeration {c1,c2,c3,...,cn} variable;
typedef enum {c1,c2,c3,...,cn} typeName;
[1.3.1.a]
-------------------------------------------------------------• Enumeration type is an ordered type, its values are considered ordered in an
increasing manner in the order in which they were defined, based on the convention:
(ci < cj) ⇔ (i < j)
•
The type cardinality is card(tipEnumerare) = n .
•
Enumeration type is an unstructured type defined by the user
o As consequence, its use presumes both the implementation phase (type
declaration) and instantiation phase (declaration of the variables belonging to
the type)
-----------------------------------------------------------// Example of implementation Enumeraration Type
/*1.3.1.b*/
typedef enum {rectangle, square, circle, ellipse} typeFigure;
typedef enum {white, red, orange, yellow, green, blue, brown,
black} typeColor;
typedef enum {true, false} typeBoolean;
typedef enum {Monday, Tuesday, Wednesday, Thursday, Friday,
Saturday, Sunday} typeWeekDay;
typedef enum{soldier, caporal, sergeant, sublieutenent, lieutenant,
captain, major, colonel, general} typeArmyDegree;
typedef enum {constant, type, variable, procedure, function}
typeObject;
typedef enum {sequence, array, record, set} typeDataStructure;
------------------------------------------------------------------------
• The definition of such types introduces not only a new type identifier, but at the same
time the set of identifiers denoting the values of the new type (in fact the domain of the
type’s values)
• These identifiers may then be used as constants throughout the source program, and they
enhance its understandability considerably.
-----------------------------------------------------------/* Example of instantiation of an enumeration type */
/*1.3.1.c */
typeColor c;
typeWeekDay d;
typeArmyDegree g;
typeBoolean b;
c
d
g
b
=
=
=
=
white;
wednesday;
captain;
false;
/*c:=
/*d:=
/*g:=
/*b:=
0;*/
2;*/
5;*/
1;*/
-----------------------------------------------------------• As example, based on the types implemented in [1.3.1.b], the variables c,z,g,b
are declared in [1.3.1.c].
• In an evident manner, at the code level, this representation is more intuitive. For
example:
c=0;
d=2;
g=5 ;
b=1,
• This representation is based on the assumption that c,d,g,b are defined as
integers, and in their quality of constants belonging to the respective enumeration
type, they are mapped onto increasing natural numbers in the order in which they
have been declared (enumerated).
• As a consequence, a translator can easy verify the using of unspecific operators
for this kind of non-numerical types. For example the programming statements
c = c+1; or g = white; will be signalized as errors.
• The enumeration type set includes as operators the classical assignment and comparison
• Since the enumeration type is ordered, it is sensible to introduce in its set, operators that
generate the successor and predecessor of their argument.
• The following operators have this effect: SUCC(x) and PRED(x)
1.3.2 Predefined (Standard) Primitive Types
• Predefined (standard) primitive types are those types that are available on most
computers as built-in hardware features.
• They include the whole numbers, the logical truth values, and a set of printable
characters. On many computers fractional numbers are also incorporated, together with
the standard arithmetic operations.
• These types are denoted by the identifiers: INTEGER, REAL, BOOLEAN, CHAR,
SET
• (1) The type INTEGER comprises a subset of the whole integer numbers defined in
mathematical sense
• The size of represented integer number may vary among individual computer
systems. If a computer uses n bits to represent an integer in two's complement
notation, then the admissible values x must satisfy
-2n-1 < x < 2n-1
• It is assumed that all operations on data of this type are exact and correspond to
the ordinary laws of arithmetic
• The computation process will be interrupted in the case of a result lying outside
the representable subset. This event is called overflow (depăşirea capacităţii
registrelor - DCR).
• Over the set of integer numbers, beside the classical operators comparison and
assignment, a number of standard operators are defined:
• The standard operators are: addition (+), subtraction (-), multiplication
(*), integer division (/) and modulo (%)
m - n < (m / n)* n <= m
(m / n)* n + (m % n) = m
--------------------------------------------------------------
Integer ADT
Mathematical model: scalar elements with values in the set of
integer numbers: {..., -3, -2, -1, 0, 1, 2, 3,....}.
Notations:
i,j,k inz
inn
e
b
-
integers;
[1.3.2.a]
nonzero integer;
nonnegative integer;
integer value (constant or variabile);
Boolean value.
Operators:
i= IntegerAssignement(i,e) – stores the value of e in
variable i;
k= IntegerAddition(i,j)
k= IntegerSubtraction(i,j)
k= IntegerMultiplication(i,j)
k= IntegerDivision (i,inz)
inn= Modulo(i,inz) – return the positive integer that
represents the reminder of dividing i with inz
b= EqualToZero(i)
b= BigerThanZero(i)
------------------------------------------------------------
• (2) The type REAL denotes a subset of the real numbers.
• Whereas arithmetic with operands of the types INTEGER is assumed to yield exact
results, arithmetic on values of type REAL is permitted to be inaccurate within the
limits of round-off errors caused by computation on a finite number of digits.
• The real numbers are represented with a specified precision expressed as a number
of decimal digits.
• This is the principal reason for the explicit distinction between the types INTEGER
and REAL, as it is made in most programming languages.
• The standard operators are the four basic arithmetic operations of addition (+),
subtraction (-), multiplication (*), and division (/).
• The operations that produce values which are outside of the representable domain
of the specific real implementation, cause overflow (DCR).
• The current implementations of the programming languages include many
categories of integer types (int, long int, short int, unsigned int, etc.)
respectively more categories of real types (float, double, long double, etc) which
differs by the number of digits used in representation.
• (3) The type Boolean has two values that are denoted by the identifiers TRUE and
FALSE
• In the language C these values are substituted by integer values:
• 1 or <> 0 means true
• 0 means false
• The Boolean operators are the logical conjunction, disjunction, and negation
whose values are defined in Table.3.2.a
P
TRUE
TRUE
FALSE
FALSE
P && Q
TRUE
FALSE
FALSE
FALSE
Q
TRUE
FALSE
TRUE
FALSE
P || Q
TRUE
TRUE
TRUE
FALSE
!P
FALSE
FALSE
TRUE
TRUE
Table 1.3.2.a. Logical operators
• (4) The standard type char comprises a set of printable characters.
o The ASCII codification (American Standard Code for Information
Interchange)
-------------------------------------------------------------Capitals:
A
B
C
...
X
Y
Z
hexadecimal: x’41' x'42' x'43'
x'58' x'59' x'5A'
decimal:
65
66
67
88
89
90
Decimal digits: 0
hexazecimal: x'30'
decimal:
48
1
x 31'
49
2
x 32'
50
...
'
Lower-case letters:
hexadecimal:
decimal:
a
x'61'
97
b
x'62'
98
c
x'63'
99
'
9
x 39'
57
'
...
z
x'7A'
122
Blanc character: hexadecimal: x’20’
decimal: 32
--------------------------------------------------------------
• Usually, the programming languages define primitive type char
• The nature of a char x:
('A' <= x) && (x <= 'Z') - x is capital letter;
('a' <= x) && (x <= 'z') - x is a lower-cased letter;
('0' <= x) && (x <= '9') - x is a decimal digit.
• In C language, types char and int are equivalent, thought they have different
interpretation
• Two standard transfer functions are defined between char and integer types:
• f('5')=5 transforms a character in the corresponding integer
• g(8)='8' transforms an integer in the corresponding character
• f and g are inverse functions
f(g(i))=i
g(f(c))=c
(0 < i < 9)
('0' < c < '9').
-------------------------------------------------------------// Implementation of the transfer functions integer-character
char c; int n;
[1.3.2.b]
n= c-'0'
//f(c)-> i
c= n+'0';
//g(i)-> c
--------------------------------------------------------------
1.4 Structured Types
1.4.1 Array Structure. Abstract Data Type Array
• The array is probably the most widely used data structure; in some languages it is even
the only one available.
• An array consists of components, which are all of the same type, called its base type; it
is therefore called a homogeneous structure.
• The array is a random-access structure, because all components can be selected at
random and are equally quickly accessible.
• In order to denote an individual component, the name of the entire structure is
augmented by the index which selects positional the component.
• This index is to be an integer between 0 and n-1, where n is the number of elements, the
size, of the array.
• Defining an array presumes to specify:
• (1) The structuring method, which is directly implemented in the language
• (2) The name of the resulting data type (array_type)
• (3) The array base type (element_type)
• (4) The index type (index_type).
• In C there are some particularities:
• The structuring method isn’t explicitly present. It is indicated by the presence of the
right brackets in the syntax of the declaration.
• The index type is always integer (int)
• The domain of the index values is restraint to the set of positive integer numbers:
0,1,2,3,...
• The size of the array (the total number of elements) is specified between the right
brackets in the declaration
--------------------------------------------------------------------
/*Definition of an array */
/*1.4.1.a*/
element_type array_name[number_of_elements];
-------------------------------------------------------------/* Definition of an array type */
/*1.4.1.b*/
typedef element_type array_type[number_of_elements];
array_type array_name;
-----------------------------------------------------------------
/* Definition of an array - examples */
float array[10];
/*1.4.1.c*/
char string[10];
-----------------------------------------------------------------
/* Definition of an array type - example */
typedef float array_type[100];
array_type a,b;
-----------------------------------------------------------------
• The array is a static data structure, for which the memory allocation is realized
in the compilation phase of the program
• As consequence when an array structure is defined, is absolutely necessary to
specify the exactly number of elements in other words, the array’s size.
• A structured value (an instance) of an array data type array_type having the
component values c1, c2..., cn can be initialized by an array constructor and a
assignment operation:
element_type array_name[number_of_elements]=
{expression_const0, expression_const1,...};
-------------------------------------------------------------/* Construction of an array */
int vect[5]={0,1,33,4,8};
int mat[2][3]={{11,12,13},{1,2,3}};
[1.4.1.d]
char str[9]="Turbo C++";
--------------------------------------------------------------
• The reverse operator of a constructor is the selector. This selects an individual
component of an array
• For an array variable vector , an array selector is specified adding to the name of the
structure, the index i of the selected component: vector[i].
• Instead of a constant index, an index expression can be used
• Evaluation of the index expression during the program execution leads to the
selected component as the algorithm requires.
• If the base type is an ordinal type, then over the corresponding array type, a natural
order relationship can be defined.
• The natural order of two arrays having the same base type is determined by those
corresponding components, which are different and have the smallest index.
(2,3,5,8,9)>(2,3,5,7,11)
'CAPAC' < 'COPAC'
-------------------------------------------------------------/* The natural order of two arrays */
typedef element_type array_type[n];
array_type x,y;
/*1.4.1.e*/
(x<y) <=> Ek:0≤k<n:((Ai:0≤i<k:x[i]=y[i])&(x[k]<y[k]))
--------------------------------------------------------------
• A classical example in this sense is the lexicographic order applicable to the arrays of
characters.
• Since all the components of an array array_type belongs to the same base type
element_type:
Card (array_type) = Card (element_type) Card (index_type) .
• The arrays having one dimension are named linear arrays
•
The access to any element of a linear array is realized using a single index in
the selection mechanism.
• element_type specified in the an array definition can be in its turn an array
• Using this mechanism, arrays with more dimensions can be defined
•
Such an array is called matrix
•
The access to a matrix element is realized using a number of indexes equal to
the number de dimensions of the matrix.
mat[i,j,k,… ,m,n].
--------------------------------------------------------------------
/* Defininition of a matrix type */
/*1.4.1.f*/
typedef element_type array_type [number_of_elements];
typedef array_type matrix_type[number_of_elements_1];
-------------------------------------------------------------•
A matrix can be also defined as a multidimensional array.
-------------------------------------------------------------/* Definition of a matrix */
element_type matrix_name[number_of_elements]
[number_of_elements_1];
/*1.4.1.g*/
-------------------------------------------------------------/* Definition of a multidimensional array (matrix) */
typedef element_type array_type [number_of_elements_1]
[number_of_elements_2]... [number_of_elements_n];
array_type matrix;
/*1.4.1.h*/
-------------------------------------------------------------•
The abstract data type array is summarized in [1.4.1.i]
•
In [1.4.1.j] is presented an example of implementation of an array ADT.
-------------------------------------------------------------ADT Array
Model matematic: Sequence of elements belonging to the same
type (base type). The associate index belongs to the
domain [0,1,2,...). There is a biunivoc correspondence
between indexes and array’s elements.
Notations:
element_type – the type of array’s elements (base type);
a – mono dimensional array;
i - index;
[1.4.1.i]
e – value of element_type.
Operators:
StoreInArray(a,i,e) – operator that stores the e value in
the i-th element (position) of the array a;
e= RetrieveFromArray(a,i) – operator that returns the value
of the i-th element (position) of the array a;
-------------------------------------------------------------/* Example of implementation
/*1.4.1.j*/
#define max_elem_number integer_value
typedef element_type array_type[max_elem_number];
int i;
array_type a;
element_type e;
a[i]=e; /*StoreInArray(a,i,e)*/
e=a[i]; /*RetrieveFromArray(a,i)*/
---------------------------------------------------------------------------
1.4.2 Array Searching
• The task of searching is one of most frequent operations in computer programming.
• It also provides an ideal ground for application of the data structures so far encountered.
• There exist several basic variations of the theme of searching, and many different
algorithms have been developed on this subject.
• The basic assumption in this presentation is:
• We have a collection of data, containing N elements, among which a given
element is to be searched.
• We shall assume that this set of N elements is represented as an linear array a
element_type a[n];
• Typically, the type item has a record structure with a field that acts as a key.
• The task of search consists of finding an element of whose key field is equal to a given
search argument x. The result of the search is an index i.
• The resulting index i, satisfying a[i].key = x, then permits access to the
other fields of the located element.
• Since we are here interested in the task of searching only, and do not care about the data
for which the element was searched in the first place, we shall assume that the type
element_type consists of the key only, i.e. is the key.
• In other words we will search in an array of keys.
1.4.2.1 Linear Search
• When no further information is given about the searched data, the obvious approach is to
proceed sequentially through the array in order to increase step by step the size of the
section, where the desired element is known not to exist.
• This approach is called linear search and is presented [1.4.2.1.a].
-------------------------------------------------------------/*Linear search - variant while */
/*1.4.2.1.a*/
element_type a[n];
element_type x;
int i=0;
while ((i<n-1) && (a[i]!=x))
i++;
if (a[i]!=x ){
/*the array doesn’t contain the searched element*/
}
else{
/*there is a coincidence at index i*/
}
--------------------------------------------------------------------
• There are two conditions which terminate the search
• (1) The element is found, i.e. a[i] = x
• (2) The entire array has been scanned, and no match was found; i=n-1 and
a[i]≠x
• The invariant, i.e the condition satisfied before each incrementing of the index i, is
(0<= i < n-1)&(Ak:0 <= k < i:a[k]<> x)
• The invariant express that for all values of k < i , no match exists
• The search termination condition is:
((i = n-1)OR(a[i]= x))&(Ak:0 <= k< i:a[k]<> x)
• This condition can be used in a search algorithm based on do cycle [1.4.2.1.b]
---------------------------------------------------------------------------
/*Linear search
- variant do*/
/*1.4.2.1.b*/
element_type a[n];
element_type x;
int i=-1;
do{
i++;
}while ((i<n-1) && (a[i]!=x))
if (a[i]!=x ){
/*the array doesn’t contain the searched element*/
}
else{
/*there is a coincidence at index i*/
}
--------------------------------------------------------------------
• To be noticed:
• When the search process is finished, there is one more comparison to be made
between a[i] and x in order to establish the existence or nonexistence of the
searched element.
• When the algorithm did find a match, it found the one with the least index, i.e. the
first one if they are more elements with the same key.
• The classical linear search is implemented using the for statement [1.4.2.1.c]
-------------------------------------------------------------/*Linear search - variant for */
/*1.4.2.1.c*/
/*returns value n if x is not found*/
/*returns a value i<n for x found on position i */
int linear_search(int x,int a[],int n)
{int i;
for(i=0;i<n && a[i]-x;i++);
return i;
}
-------------------------------------------------------------• As is to be observed, for each cycle of the algorithm, the Boolean expression which
contains two factors is evaluated and the index is incremented
• This process can be accelerated, by simplifying the Boolean expression
• A solution is to replace the two Boolean factors by only one, which implies the both of
them.
• This is possible only by guaranteeing that a match will be found always
• For this purpose a supplementary element a[n] is added to the array, whose value is
initially assigned with value x (the searched element). The array becomes:
element_type a[n+1];
• The auxiliary element x is called sentinel, and the search - sentinel technique
• It’s evident, if when the algorithm finishes i = n , then the searched element
doesn’t exist in array.
• The new shape of the search algorithm appears in [1.4.2.1.d].
---------------------------------------------------------------------------
/*Linear search - sentinel technique (while) */
/*1.4.2.1.d*/
element_type a[n+1];
element_type x;
int i=0;
a[n]=x;
while (a[i]!=x)
i++;
if (i == n){
/*the array doesn’t contain the searched element*/
}
else{
/*there is a coincidence at index i*/
}
--------------------------------------------------------------------
• The ending condition deducted from the cycle invariant is:
(a[i]= x)&(Ak:0 <= k < i:a[k]<> x)
• In [1.4.2.1.e] the same algorithm appears implemented using do cycle and in [1.4.2.1.f]
using for cycle
-------------------------------------------------------------/* Linear search - sentinel technique (do) */
/*1.4.2.1.e*/
element_type a[n+1];
element_type x;
int i=-1;
a[n]=x;
do{
i++;
}while (a[i]!=x)
if (i == n){
/*the array doesn’t contain the searched element*/
}
else{
/*there is a coincidence at index i*/
}
-------------------------------------------------------------/* Linear search - sentinel technique (for)*/
/*returns value n if x is not found*/
/*returns a value i<n for x found on position i */
int sentinel_search(int x,int a[],int n)
/*1.4.1.f*/
{int i;
a[n]=x;
for(i=0;a[i]-x;i++);
return i;
}
-------------------------------------------------------------•
The linear search performance is O(n/2), that means an element is found in average,
after half of the array’s elements have been tested.
1.4.2.2 Binary Search
• The search process can be accelerated, if more information is available about the
searched data.
• It is well known that a search can be made much more effective, if the data are ordered
(for example, a telephone directory in which the names were not alphabetically listed)
• We shall therefore present an algorithm which makes use of the knowledge that array a
is ordered, i.e., it satisfies the predicate Ak:
-------------------------------------------------------------/* The ordered array definition*/
element_type a[n];
Ak:0<k<=n-1:a[k-1]<= a[k]
[1.4.2.2.a]
--------------------------------------------------------------
• The basic idea is:
• Inspect an element picked at random, say a[m], and compare it with the search
argument x
• If a[m] is equal to x, the search terminates;
• If a[m] is less than x, we infer that all elements with indices less or equal to m
can be eliminated from further searches. The search interval can be restraint to
elements having indexes grater than m.
• If a[m] is greater than x, all elements with index greater or equal to m can be
eliminated. The search interval can be restraint to elements having indexes
smaller than m.
• This results in the following algorithm called binary search;
• The variables left and right mark the left and at the right end of the section
of a in which an element may still be found.
--------------------------------------------------------------------
/*binary search */
/*1.4.2.2.b*/
element_type a[n]; /*n>0*/
element_type x;
index_type left,right,m;
int found;
left=0; right=n-1; found=0;
while((left<=right)&&(!found))
{
m=(left+right)/2; /* or any values between l and r */
if(a[m]==x)
found=1;
else
if(a[m]<x)
left=m+1;
else
right=m-1;
}
if(found) /* there is a match at index I */
-------------------------------------------------------------------•
The loop invariant, i.e. the condition satisfied before each step, is
(left<=right)&(Ak:0 <= k <left:a[k]<x)&(Ak:right<k <=n-1:a[k]> x)
•
The end condition is:
found OR((left>right)&(Ak:0<=k<left:a[k]<x)&
(Ak:right<k<=n-1:a[k]>x))
• The choice of m is arbitrary in the sense that correctness does not depend on it, but it
does influence the algorithm's effectiveness.
• The optimal solution is to choose the middle element, because this eliminates half of
the array in any case.
• As a result, the maximum number of steps is O(log 2 n),
• Hence, this algorithm offers a drastic improvement over linear search, where the
expected number of comparisons is n/2.
• A more compact version of binary search is presented in [1.4.2.2.c]
-------------------------------------------------------------/* binary search 1 */
int binary_search(int x,int a[],int n){
int left=0,right=n-1,m;
do{
(x>a[m=(left+right)/2])?(left=m+1):(right=m-1); [1.4.2.2.c]
}while(a[m]-x && left<=right);
return(a[m]-x)?n:m;
}
--------------------------------------------------------------
• The efficiency of binary search can be somewhat improved if – as in the case of linear search
– a solution could be found that allows a simpler condition for termination.
• This objective can be attaint, if we abandon the naive wish to terminate the search as
soon as a match is established.
• This seems unwise at first glance, but on closer inspection we realize that the gain in
efficiency at every step is greater than the loss incurred in comparing a few extra
elements.
• Remember that the number of steps is at most log 2 n .
• The faster solution is based on the following invariant:
(Ak:0<=k<left:a[k]<x)&(Ak:right<k<=n-1:a[k]>x)
• The search process continues until the search interval becomes trivial (0 or 1 element).
• This technique is presented in [1.4.2.2.d].
-----------------------------------------------------------/* ameliorated binary search */
/*1.4.2.2.d*/
element_type a[n]; /*n>0*/
element_type x;
index_type left,right,m;
left=0; right=n;
while(left<right){
m=(left+right)/2;
if(a[m]<x)
left=m+1;
else
right=m;
}
if(right>n)
/*the array doesn’t contain the searched element*/
if(right<=n)
if(a[right]==x)
/*the element was found*/;
else /*the array doesn’t contain the searched element*/;
--------------------------------------------------------------
• In this case the end condition is left = right.
• Even though, the equality l = r doesn’t automatically presume that the searched
element was found
• If right > n doesn’t exist any coincidence. It was searched an element greater
than any other element of the array
• If right<=n the element a[right] corresponding to the last index right
was not compared with the key x
• Hence, an additional test for equality a[right]=x is necessary in order to
establish the existence of the coincidence.
• In contrast to the first solution, this algorithm -- like linear search -- finds the matching
element with the least index.
• Another version of the ameliorated binary search appears in [1.4.2.2.e].
-----------------------------------------------------------/* ameliorated binary search - variant 2 */
int ameliorated_binary_search (int x,int a[],int n){
int left=0,right=n,m;
do{
(x>a[m=(left+right)/2])?(left=m+1):(right=m); /*1.4.2.2.e*/
}while(left<right);
return
((right>=n)||(right<n && a[right]-x))?n:right;
}
-----------------------------------------------------------1.4.2.3 Interpolation Search Technique
• Another possible improvement of the binary search presumes to determine with more
precision the place in current interval in which the search is made, instead of access
always the element placed in the middle.
•
Thus, if the search is made in an ordered structure, for example in phone book,
when is searched a name starting with B is normal to search somewhere at the
beginning of the book, instead a name starting with V, somewhere at the end of
the book
• This method is named search by interpolation, and it requires a simple modification
reported to the binary search
• Thus, in binary search, the new place in which the access is realized is always in the
middle of the current search interval.
1
* (right − left )
2
• This place is determined based on the relation:
m = left +
where left and right are the limits of the current search interval
• The next evaluation of m, named interpolation search, is considered to be more efficient
m = left +
x − a[left ]
* (right − left )
a[right ] − a[left ]
• This evaluation takes care about the effective values of the keys which bounds the
search interval, when the index of the next search is established
• This estimation however presumes a normal distribution of the keys’ values over
the target domain.
--------------------------------------------------------------/* interpolation search */
int interpolation_search(element a[],int last,int x){
long m,left,right;
left=0;
right=last-1;
[1.4.2.1.m]
do{
m=left+((x-a[left].key)*(right-left))/
(a[right].key-a[left].key);
if (x>a[m].key) left=m+1;
else right=m-1;
}
while((left<=right)&&(a[m].key!=x)&&(a[right].key!=a[left].key)
&&(x>=a[left].key)&&(x<=a[right].key));
if (a[m].key==x) return m;
else return last;
}
---------------------------------------------------------------
• Property. Interpolation search requires either is successfully or unsuccessfully less than
lg (lg (n) +1) comparisons [Se 88].
• The function lg (lg n) is growing extremely slow:
• Thus, for example, for values of n around 109, lg(lg n)< 5.
1.4.3 Record Structure. Abstract Data Type Record
• The most general method to obtain structured types is to join elements of arbitrary types
that are possibly themselves structured types, into a compound type.
• Examples from mathematics are:
• Complex numbers, composed of two real numbers,
• Coordinates of points composed of two or more numbers according to the
dimensionality of the space spanned by the coordinate system.
• In mathematics, such a compound type is named the Cartesian product of its constitutive
types.
• The set of values defined by this compound type consists of all possible
combinations of values, taken one from each set defined by each constituent type.
• Such a combination is also called n-tuple
• Thus, the number of such possible combinations is the product of the number of
elements in each constituent set, that is, the cardinality of the compound type is the
product of the cardinalities of the constituent types.
• The term that describes a compound data structure in programming is record or struct (in
C).
• In [1.4.3.a.] a generic definition mode of a record type in C is presented.
-------------------------------------------------------------/*Struct (record) definition */
struct name_structure{
type1 list_nameField1;
[1.4.3.a]
type2 list_numeField2;
...
typen list_nameFieldn;
} list_var_structure;
--------------------------------------------------------------
• Examples of record data types definition are presented in [1.4.3.b].
-------------------------------------------------------------/* Examples of record structures defining */
struct data {
int day, month, year;
} birth_data, enrolment_data
struct {
int day, month, year;
} birth_data, enrolment_data
struct data{
int day, month, year;
}
struct data birth_data, enrolment_data
typedef struct data {
int day, month, year;
}calendar_data;
[1.4.3.b]
calendar_data data birth_data, enrolment_data;
typedef struct person_structure {
char[12] name;
char[12] surnume;
calendar_data birth_data;
enum state(unmarried, married, divorced, widow)
social_situation;
enum sex(male, female)sex;
}person;
--------------------------------------------------------------
• The identifiers nameField1, nameField2 ,..., nameFieldn specified in the type
definition, are names given by the user to the individual components of the type.
• They are used in record selectors applied to record structured variables.
• Given a variable x of the type name_structure, its i-th field can be denoted by
using the “dot” notation:
x.nameFieldi
• Selective updating or assignment of x is achieved by using the same selector denotation
on the left side in an assignment statement:
x.numeFieldi:= xi
where xi is a value (expression) of type typei
• In [1.4.3.c] are presented some examples of the above defined record types’ instantiation
-----------------------------------------------------------/* Examples of record structures instantiation */
calendar_data d;
person p;
[1.4.3.c]
d.month=9;
p.name=”merriam”;
p.birth_data.day=24;
p.social_situation=unmarried;
-----------------------------------------------------------
• The example of the type person_structure shows that a constituent of a record type
may itself be structured. Thus, selectors may be concatenated.
• Naturally, different structuring types may also be used in a nested fashion.
• For example, the i-th component of an array a being a component of a record
variable r is denoted by:
r.a[i]
• The component with the selector name nameFieldi of the i-th record
structured component of the array a is denoted by:
a[i].nameFieldi
• From the formal point of view, abstract data type record is presented in [1.4.3.d.].
-------------------------------------------------------------TDA Record
[1.4.3.d]
Mathematical model:
A finite collection of elements named fields, which can
belong to different types. There is a biunivoc
correspondence between the identifiers of the fields end
the elements collection.
Notations:
a - object of type record;
id - identifier name of a field;
e - object of the same type as field id of record a
Operators:
StoreInRecord(a,id,e) – stores the e value in the field
id belonging to a
e= RetrieveFromRecord(a,id) – returns the value of field id
belonging to a
--------------------------------------------------------------
• In [1.4.3.e.] are presented some examples of implementation of ADT Record.
• As can be noticed, the definition and instantiation phases can be separated or can
be cumulated in different modes, the C programming language being very flexible
in this sense
-------------------------------------------------------------/* Examples of implementation */
struct point {
int x;
int y;
}
struct rectangle {
struct point corner1;
struct point corner2;
}
[1.4.3.e]
struct point p1; double dist;
struct rectangle window;
-------------------------------------------------------------//Example of usage
dist=sqrt((double)(p1.x*p1.x)+(double)(p1.y*p1.y));
window.corner1.x=188;
-----------------------------------------------------------------
1.4.4 Union Structure
•
The C language defines union structure
•
A union is in fact a variable that can memorize at different moments of time objects
belonging to different types and having different sizes. The compiler keeps the evidence
of the objects and aligns in a correspondent manner the stored data.
•
The unions have been conceived in the idea to manipulate different data types in a unique
location (zone) of memory.
-----------------------------------------------------------/* Example of defining a union structure */
union union_indicator {
int integer_value
;
[1.4.4.a]
float real_value;
char *char_string;
} union_variable
-----------------------------------------------------------
• In [1.4.4.a] union_variable has a size that depends on the type of variable that was
assigned to it. This is transparent from the user point of view.
• In other words, for any correct utilization of the type, the actually stored value
belongs to last memorized.
• The access to the structure component is realized using dot notation.
union_variable.member
or union_pointer.member
1.4.5 Sequence Structure. Abstract Data Type Sequence
• The common characteristic of the presented data structures (arrays, records, sets) is their
finite cardinality; respectively the cardinality of the all their component types is finite.
• That is the reason why, their implementation raise no problems.
• The most part of the so called advanced data structures (sequences, lists, trees, graphs)
are characterized by non-finite cardinality
• This difference vis-à-vis the classical data structure is very important and has
significant practical consequences.
• For example, the sequence structure, having as base type T0 is defined as follows:
-----------------------------------------------------------S0=< >
(null sequence)
Si=<Si-1,si>
where 0 < i and si∈ T0
-----------------------------------------------------------• In other words, a sequence with base type T0 , is or:
• (1) Null sequence,
• (2) A concatenation of a sequence (with base type T0) with a value (si) of type
T0 .
• The recursive definition of the sequence data type has as result a non-finite cardinality
• Each value of the sequence data type contains in fact a finite number of
components belonging to the type T0 ,
• This number is boundless because starting with any sequence is possible to build
a longer one.
• Important consequences:
• (1) The necessary memory storage space for an advanced data structures cant’ be
known at the compilation moment
• (2) A dynamic allocation memory scheme is necessary, based on which the
memory is allocated to the “growing” structures and is free by the “decreasing”
structures "
• To implement such a requirement, the used high level programming language has to
contain system functions allowing:
• Dynamic Allocation/Dispose of memory
• Dynamic Link/Reference of components
• Under these circumstances advanced data structures can be described and used, with
the help of programming language instructions.
• In the next chapters of this course, the techniques for generating and processing such
advanced data structures are presented.
• The sequence structure is from this point of view an intermediary structure
•
It is an advanced data structure from the point of view of its cardinality which
is non-finite
•
It is so currently used that its inclusion in the set of fundamental data structures is
well established.
•
This situation is due to the fact that the selection of a suitable operators set for
this structure, permit to implementers to choose adequate and efficient
representations of this structure.
•
As result, the dynamic memory allocation mechanisms become simple enough
to allow an efficient implementation, not affected by details at a high level
programming language.
1.4.5.1 Abstract Data Type Sequence
• The appurtenance of the sequence structure to the fundamental data structures, presume
the restraint of its operators set in such a manner, that only the sequential access to its
components is allowed.
• This structure is known under the name of sequential file or for short file.
• Its formal definition is:
s = <s0, s1, s2,..., sn-1>
• The sequence structure is affected by the restriction that all its components belong to the
same base type, so it is a homogeneous data structure.
• The components number n, named also length of the sequence, is presumed to be
unknown during the compilation as well as during the code execution phase.
• More than that, this number is not a constant; it can be modified during the
program execution.
• Although every sequence has at any time a specific, finite length, we must consider the
cardinality of a sequence type as non-finite, because there is no fixed limit to the
potential length of sequence variables.
• Depending on the implementation manner, the sequence structures are stored on different
kinds of storage media, where sometime, data are to be moved from one medium to
another, such as from disk or tape to primary store or vice-versa.
• The usual storage devices for sequence structures are magnetic tapes and disks
in sequential allocation.
• Sequence is an ordered structure. The order of structure elements is established by the
time order of their generation.
• Due to its sequential access mechanism, index selection of the individual components of a
sequence becomes inadequate.
• As result, at the declaration of a type sequence, only the base type is specified:
TYPE SequenceType = FILE OF BaseType;
• In a sequence structure defined like above, in any moment is accessible only one
component, named current component, specified by an indicator (pointer) implicit
associated to the sequence.
• This indicator advances in a sequential manner at the next element, practically
after execution of any operator over the current element.
• In order to determine the end of a sequence structure, a specific Boolean operator is
introduced: end of file specified by Eof(f: SequenceType)
• The effective access manner to sequence components, as well as the actually processing
possibilities, depends on the implementation, and result from the semantic of the
operators defined for the abstract data type sequence.
• In [1.4.5.1.b] is presented a C implementation of the sequence structure.
-----------------------------------------------------------ADT Sequence
Mathematical model: sequence of elements belonging to the same
type. An indicator indicates the current accessible
element of the sequence. The access is strictly
sequential.
Notations:
ElementType – the type of the sequence element. It can’t
be sequence type;
s - variable sequence;
e - variable of ElementType;
[1.4.5.1.a]
b – boolean value;
DiscFileName – characters string.
Operators:
Assign(s, FileDiscName) – assigns to variable sequence s
the name of the disc file;
Rewrite(s) – operator which poses the sequence indicator at
the beginning of the f and opens the file f in writing
mode. If s doesn’t exist it is created. If s already
exists, its old variant is lost and a new empty file s
is created;
ResetSequence(s) - operator which poses the sequence
indicator at the beginning of the s and opens the file s
in reading mode. If s doesn’t exist an execution error
is signalized;
OpenSequence(s) – operator which in some implementations
play the role of rewrite or reset sequence;
b:= Eof(s) – function which returns the value true if the
sequence indicator is positioned on the marker of the
end of file s;
RetreiveSequence(f,e) – operator acting in reading mode.
While Eof(f) is false, the operator supplies in
variable e the current element indicated by sequence
indicator and advances at the next element;
StoreSequence(s,e) - for the sequence s opened in writing
mode, store value e in the current element indicated
by the sequence indicator and advances to the next
element;
Add(s) – operator which opens sequence s in writing mode,
poses the sequence indicator at its end, with the
possibility to add new elements at the end of the
sequence s;
CloseSequence(s) – closes sequence s .
-----------------------------------------------------------{Implementation of Type Sequence – Pascal variant}
TYPE ElementType = ... ;
SequenceType = TEXT;
VAR
s: SequenceType;
[1.4.5.1.b]
e: ElementType;
DiscFileName: string;
{Assign(s, DiscFileName)}
assign(s, DiscFileName)
{Rewrite(s)}
rewrite(s)
{StoreSequence(s,e)}
write(s,e) ; writeln(...)
{ResetSequence(s)}
reset(s)
{Eof(s)}
eof(s)
{RetreiveSequence(s,e)}
read(s,e); readln(...)
{Add(s)}
append(s)
{CloseSequence(s)}
close(s)
------------------------------------------------------------
1.4.5.2. Abstract Data Type Direct Access Sequence
• The main disadvantage of the sequence structure is the fact that the access to the
sequence components is realized in a strictly sequential manner, through an indicator
whose positioning is not accessible to the user in a direct manner.
• That is the reason for starting from abstract data type sequence, a new more flexible
structure where developed namely direct access sequence.
• Thus, a direct access sequence has an indicator which allows the access in direct manner to
any of the sequence’s component.
• For this purpose, each sequence’s component is identified by an index which indicates its
position in a similar manner with array structure.
• Components
numbering starts
SequenceDimension-1.
with
0,1,2,...
and
continues
until
• But, the creation and the modification of the direct access sequence are realized in
sequential manner too.
• In other words we can add or modify records, but we cant’ insert or delete the existing
records.
• The abstract data type direct access sequence is synthetically presented in [1.4.5.2.a] and
in [1.4.5.2.b] a Pascal implementation example.
-----------------------------------------------------------ADT Direct Acces Sequence
Mathematical model: a sequence of elements belonging to the
same type. An indicator indicates the current accessible
element of the sequence. The indicator can indicate any
element of the sequence. Each sequence component is
uniquely identified by an index. The counting of
components starts with 0,1,2,... and continues until
SequenceDimension-1.
Notations:
ElementType - the type of the sequence element. It can’t
be sequence type;
das – direct access sequence;
e
- variable of ElementType;
[1.4.5.2.a]
i
- index in das.
Operators:
Assign(das, DiscFileName) - assigns to variable direct
access sequence das the name of the disc file;
Rewrite(das) – operator which poses the direct access
sequence indicator at the beginning of the das and opens
the sequence das in writing mode. If das doesn’t exist
it is created. If das already exists, its old variant is
lost and a new empty direct access sequence is created.
In both cases, the associate sequence indicator takes
the value 0;
Reset(das) - operator which poses the sequence indicator at
the beginning of the das and opens the sequence in
reading mode. The indicator takes the value 0. If das
doesn’t exist an execution error is signalized;
Open(das) – operator which in some implementations play
the role of rewrite or reset sequence;
i:= SequenceDimension(das) – operator which returns the
current total number of components stored in sequence
das;
i:= SequencePosition(das) – the operator returns the
number of the current record which is specified by the
sequence indicator. When a sequence is opened,
SequencePosition takes the value 0;
Search(das,i) – selects the component specified by i,
posing the sequence indicator to point the this
component;
Retreive(das,e) - operator which supplies in variable e
the current element indicated by sequence indicator
and advances at the next element;
Store(das,e) – stores the value e in the current element
indicated by the sequence indicator in das and
advances to the next element;
Close(das) – closes sequence das.
-----------------------------------------------------------{Implementation of the type direct access sequence - Pascal
Variant}
TYPE ElementType = ... ;
DirectAccessSequenceType = FILE OF ElementType;
IndexType = integer;
VAR
das: DirectAccessSequenceType;
e: ElementType;
[1.4.5.2.b]
i: IndexType;
{Assign(das, DiscFileName)}
assign(das, DiscFileName)
{Rewrite(das)}
rewrite(das)
{Reset(das)}
reset(das)
{i:= SequenceDimension(das)}
i:= filesize(das)
{i:= SequencePosition(das)}
i:= filepos(das)
{Search(das,i)}
seek(das,i)
{Store(das,e)}
write(das,e); writeln(...);
{Retreive(das,e)}
read(das,e); readln(...);
{Close(das)}
close(das)
------------------------------------------------------------
• Similar implementation variants of ADT Sequence exist for C and C++ programming
languages.
• If at principle level the presented concepts remain similar, at the implementation level
important differences may appear.