Download 1 Fundamental data structures

1 Fundamental data structures 1.1 Introduction • The modern digital computer was invented and intended as a device that should facilitate and speed up complicated and time-consuming computations. • In the majority of applications, the speed, working frequency and the capability to store and access large amounts of information plays the dominant part and are considered to be the primary characteristics of a computing system • The ability to compute, i.e., to calculate, to perform arithmetic, has in many cases become almost irrelevant. • The large amount of information that is to be processed by a computing system in some sense represents an abstraction of a part of reality. • The information that is available to the computer consists of a selected set of data about the actual problem to be solved • That set is considered relevant to the problem at hand, and it is believed that from this set of data, the desired results can be derived. • In this context, the data represent an abstraction of reality • What is an abstraction? • An abstraction is a simplification of facts in the sense that certain properties and characteristics of the real objects are ignored because they are peripheral and irrelevant to the particular problem. • In solving a problem with a computer it is necessary to choose an convenient abstraction of reality • This can be achieved in two steps: • (1) Define a set of data that is representative for the real situation. This choice must be guided by the problem to be solved. • (2) Then follows a choice of representation of this information. This choice is guided by the tool that is to solve the problem, i.e., by the facilities offered by the computer. • In most cases these two steps are not entirely separable. • (1) The choose of abstraction is usually determined by the nature of the problem to be solved. • This activity is in its turn determined by the tools which will be used in problem solving, for example the features and facilities offered by a certain computing system • (2) The choice of representation of data is often a fairly difficult one, and it is not uniquely determined by the facilities available. It must always be taken in the light of the operations that are to be performed on the data. • It is well known that the internal representation of information inside a computing system is based on binary digits (bits) • This representation is unsuitable for human beings because of the usually large number of digits involved • But this representation is most suitable for electronic circuits because the two values 0 and 1 can be represented conveniently and reliably by the presence or absence of electric currents, electric charge, or magnetic fields. • For this reason, the choice of data representation in the field of computers can be achieved at different levels of abstraction, depending on the desired objective and the programming language that is used. • In fact, the programming language represents the support or the medium which supports the abstraction • In this context, a programming language represents: • An abstract computer capable of interpreting sentences which are build using the terms of this language • The terms defined by the programming language may embody a certain level of abstraction from the objects used by the actual machine. • Thus, the programmer who uses such a higher-level language will be freed (and barred) from questions of number representation, if the number is an elementary object in the realm of this language. • The usage of a programming language offering a fundamental set of abstractions, which are applicable in the most part of problems related to data processing, is reflected mainly by the reliability of the resulted programs. • It’s much simpler to conceive a program based on reasoning with familiar notions of numbers, sets, sequences, and repetitions than on bits, storage units, and jumps. • Of course, an actual computer represents all data, whether numbers, sets, or sequences, as a large mass of bits. • But this fact is irrelevant to the programmer as long as he or she does not have to worry about the details of representation of the chosen abstractions. • The corresponding representation chosen by the computer (or compiler) is reasonable for the stated purposes, and is transparent from the programmer point of view. • The programmer can operate with abstract notions, which are easier to understand and to manipulate by him, because there are specialized programs named translators (compilers), which automatically convert the abstract notions in fundamental terms of the target computing system. • The closer the abstractions are to a given computer: • (1) The easier it is to make a representation choice for the engineer or implementer of the language, • (2) The lower is the probability that a single choice will be suitable for all (or almost all) conceivable applications. • This fact sets definite limits on the degree of abstraction from a given real computer, or for a given programming language. • The area of the programming languages considered in this course includes languages as C and C++ • These languages cover in a representative manner a large domain including: • (1) Machine-oriented and machine-dependent language, which offer support for solving representation problems in an open manner • (2) High level language involving an high degree of abstraction, which solve automatically the representation problems • In this context, the programmer can use the convenient abstraction level in accordance with the application to be solved. • In other words, the process of abstraction and representation of data used in programming activity is supported by the programming language. • How is this achieved? 1.2 Data Types. Abstract Data Types 1.2.1 The Concept of Data Type • In data processing, a clear distinction is made between: • (1) Real numbers and variable • (2) Complex numbers and variables • (3) Logical variables • (4) Variables representing individual values, or sets of values, or sets of sets, or between functions, sets of functions, and so on. • In this sense, we will adhere to the principle that every constant, variable, expression, or function is of a certain type. • A type is essentially characterized by: • (1) The set of values to which a constant belongs, or which can be assumed by a variable or expression, or which can be generated by a function belonging to this type • (2) A certain degree of structuring (organizing) of the information • (3) A set of specific operators • In mathematical texts, the type of a variable is usually deducible from the typeface, respectively from the shape of the fonts used to represent it, without consideration of context; • This is not feasible in computer programs where usually there are available a reduced set of fonts • That is the reason for in practice, is therefore widely accepted the rule that the associated type is made explicit in a declaration of the constant, variable, or function, and that this declaration textually precedes the application of that constant, variable, or function. • This method simplify the translator activity, permitting an automatic verification of the legality of using the associated entities of the considered type • The primary characteristics of the concept of type are the following: • (1) A data type determines the set of values to which a constant belongs, or which may be assumed by a variable or an expression, or which may be generated by an operator or a function. • (2) The type of a value denoted by a constant, variable, or expression may be derived from its form or its declaration without the necessity of executing the computational process. • (3) Each operator or function expects arguments of a fixed type and yields a result of a fixed type. • If an operator admits arguments of several types (e.g., + is used for addition of both integers and real numbers), then the type of the result can be determined from specific language rules. • (4) A type presume a certain degree of structuring of the information • As a consequence, obeying these rules, a compiler may use this information on types to check the legality of various language constructs, during compilation phase, without the necessity of execution of the generated code. • This kind of redundancy in the program text is extremely useful as an aid in the development of programs, and it must be considered as the primary advantage of good high level languages over machine code (or symbolic assembly code). • From the computing system point of view, the store is a homogeneous mass of bits without apparent structure. • But it is exactly the abstract data structure, which alone are enabling human programmers to recognize meaning in the monotonous landscape of a computer store. 1.2.2 Abstract Data Types • Definition: • An Abstract Data Type (ADT) is defined as a mathematical model to which, a set of specific operators is added • The main characteristics of an abstract data type are • (1) An ADT generalizes the primitive data types (integer, real, etc.), as functions and procedures are generalizations of the primitive operators (+, -, etc.). • (2) An ADT conceptual encapsulates a data type in the sense that from the logical and physical point of view, type definition and all the related operations concerning this type, are located in a single section (zone) of the program. • If the necessity of the ADT implementation modification appears, the programmer: • (2.1) Knows precisely the place where the modifications must be done • (2.2) Can be sure that the modification of the specific section, doesn’t affect the rest of the program code, if the format of the ADT’s operators is not modified. • (3) In addition, outside of the definition section, the target ADT can be treated as a primitive type • In this case, a user can treat an ADT defined in this manner, in terms of the associated operators, being not concerned in any way about their implementation. • A problem: certain operators can refer different ADTs. In this case, the access to those operators must be done in a specific manner in concordance with the definition section of the each ADT. • The major disadvantage of using ADT resides in the necessity of a rigorous access discipline: • All the references to the encapsulated data of an ADT, must be done only through the specific ADT’s operators • This requirement is not verifiable at the compilation level, and must be accepted as an imposed constraint of the programming discipline. --------------------------------------------------------------------------------------------------------------------Example 1.2.2 • An List abstract data type is considered: • The nodes of the list belong to a defined type TypeNode. • Let be: • L - an instance of List ADT (L: List), • v -a node variable (v: TypeNode) • c - a reference to a list’s node (c: TypeNodeReference). • A possible set of operators for the ADT List is the following: 1. EmptyList(L:List); - operator which initializes L as an empty list; 2. c:= First(L:List); - operator which returns the reference to the first node of the list, or nil indicator if the list is empty. 3. c:= Next(L:List); - operator which returns the reference to the next node of the list, or nil indicator if such a node doesn’t exist. 4. Insert(v:TypeNode, L:List); - operator which inserts node v in list L after the current node. 5. Retrieve(v:TypeNode, L:List); - operator which supplies in v the content of the current node of the list L. --------------------------------------------------------------------------------------------------------------------- • Important remarks over List ADT particularities. 1) Once List ADT was defined and implemented, all lists’ processing can be realized in terms of the List ADT operators, declaring an convenient number of corresponding instances of the ADT, respectively one specific instance for each list. 2) From the programmer point of view, the operators shape remain the same as they were initially defined, doesn’t matter what concrete manner was used for ADT implementation. • Even accepting that the ADT implementation is modified in the current application, for example for a better efficiency, the user programs remain unchanged. Modifications are supported only by the section that includes the implementation of the ADT. 3) There isn’t any limit referring to the number of operations that can be applied to the instances of a specified mathematical model. • Is necessary to underline that each set of operators is defined for, and refers a distinct ADT. 4) The efficiency, reliability and correctness of using List ADT is guaranteed only if the user realizes accesses to the declared lists, only by using the specific operators associated to the corresponding type. The programmer must respect strictly the imposed restrictions. 1.2.2.1 Defining an Abstract Data Type (ADT) • As it was mentioned, an ADT is conceived as a mathematical model to which is attached a set of operators, defined on this model. • As result, defining an ADT presumes: • (1) The mathematical model specification • (2) The definition of the associate operators • Programming languages include specific methods for data types definition • In all the cases the starting point is a fundamental set of types named Primitive Types • Primitive Types plays the role of bricks in the process of data structures development • There are two categories of Primitive Types: • (1) Standard or Predefined Data Types • (2) Primitive Data Types defined by user • (1) Standard or Predefined Types include usually numbers and logical values • These types are based on well-known standard mathematical models (set of integer, real or logical numbers) and implement the specific operators defined over this categories of numbers • If the individual values of the type are subject of an order relation, then the type is named ordered or scalar • (2) Primitive Data Types defined by user • A well-known method to built new primitive data types is to enumerate the constitutive values of the type • For example, in a program concerned with plane geometric figures, we may introduce a primitive type called ShapeType, whose values may be denoted by the identifiers rectangle, square, ellipse, and circle. • In most cases new data types are defined in terms of previously defined data types. • Values of such a type are usually conglomerates of the previously defined constitutive types, and they are named structured data types. • Structured Data Types are usually conglomerates of component values. If there is only one constituent type, that is, if all components are of the same constituent type, then it is known as the base type. • The number of distinct values belonging to a type T is called its cardinality. The cardinality provides a measure for the amount of storage needed to represent a variable x of the type T, denoted by x: T. • For structured data type definition, are used methods for structuring. • Based on these methods, an entire hierarchy of data types (structures) can be built. • The structuring methods can be: • Static: array, record, set, and sequence. • Dynamic: list, tree, graph --------------------------------------------------------------------------------------------------------------------Standard (predefined): integer, real, boolean,… Primitive Types (unstructured) Defined by user: enumeration DATA TYPES Static: array, record, set, sequence Structured Types Dynamics: list, tree, graph ---------------------------------------------------------------------------------------------------------------- • In order to use a defined data type, is necessary to declare variables belonging to this type. This process is known as type instantiation • The declared variables can be processed using the operators associated to the type. • There are different kinds of operators: • (1) Homogenous – can have as operands only instances of the respective ADT • (2) Mixed - can have as operands instances of the other types, respectively instances of the compatible data types. • (3) Internal – the result of the operator execution produce an instance of the respective type. • (4) External - the result of the operator execution produce an instance that belongs to a different type. • Categories of operators: • (1) Primitive or atomic operators – this are usually implemented in hardware and involve specific build instructions of the computing system. • (2) Defined by programming language – this operators usually include the structuring methods and the access methods to components • (3) Defined and implemented by the user • The basic operators are: • Comparison and assignment - i.e., the test for equality (and for order in the case of ordered types), and the command to enforce equality. • The fundamental difference between these two operations is emphasized by the clear distinction in their denotation throughout this text. • Test for equality: x = y (an expression with value TRUE or FALSE) • Assignment to x: x:= y (a statement making x equal to y) • These fundamental operators are defined for most data types, but it should be noted that their execution may involve a substantial amount of computational effort, if the data are large and highly structured. • Transfer operators -operators, which organize (constitute) data types in other data types (generate structures). They are essential for structured types • Constructor operators – generate structured values starting from their components • Selector operators – extract the component values from a structure • Constructors and selectors are in fact transfer operators which organize (constitute) the constitutive types in structured types and vice-versa • Each structuring method has its particular pair of constructor-selector operators which are clear distinguished by their representing manner. • The transfer operators can be: • Implicit operators – which are direct implemented by the programming language (for example the access to a field of a record structure or the access to an element of an array structure) • Explicit operators – which are built by the programmer using the facilities of the programming language (for example the operators processing the nodes of a list structure) • Conclusion: • (1) A type is a manner of structuring information • (2) For unstructured types is more relevant the set of values (constants) which constitute the type’s domain than the structuring degree. • (3) As the type becomes more complex, the structuring degree becomes more an more relevant 1.2.2.2 Implementation of an ADT • Implementation of an ADT is in fact a translation: that means that ADT is expressed in terms of an specific programming language • Implementation of an ADT presume two steps: • (1) To specify the desired structure (mathematical model) in terms of a programming language. • (2) To specify the procedures that implement the specific operators, using the same programming language • The implementation of an ADT is realized only for the types built by the user. • For the other types, the specific implementation elements are usually included in the programming language • A structured type is built starting from the fundamental types that are defined in the language, using the available structuring facilities • The specification of the procedures that implement type’s operators depends on the manner that was chosen for type implementation • There is possible to develop a hierarchical implementation of a type. • That presumes to implement an ADT using the existing implementation of other ADTs. • Ideally, the programmers want to built their programs, using languages whose primitive data types are as much as possible in correspondence with the models and operators of the desired ADTs • From this point of view, in many senses, the usual programming languages have neither enough facilities for developing different ADTs nor to define the specific operators. 1.2.3 Abstract Data Type. Data Type. Data Structure • What is the significance of the following terms that are usually confused? • (1) Abstract Data Type (ADT) • (2) Data Type (DT) or simply type • (3) Data Structure (DS) • (4) Elementary Data or simply data (D) • (1) As it was specified, an Abstract Data Type is a mathematical model together with the totality of operations defined on this model. • At the conceptual level, the algorithms can be designed and expressed in terms of some abstract data types. • In order to implement such an algorithm in a programming language, is absolute necessary to represent those ADT in terms of the data types and of the operators defined in the respective programming language. • (2) A Data Type is an implementation of an ADT in a programming language. • A data type (DT) can’t be used as such. • Therefore, in the program, the user has to declare variables belonging to the respective type. These variables can be actually processed by the algorithms • This process is named instantiation and it leads to different results: • The instantiation of a unstructured data type produces an elementary data (ED) • The instantiation of a structured data type produces a data structures (DS) • (3) An elementary data is an instance of an unstructured data type • (4) A data structure is an instance of a structured data type. • As conclusion, the relationship between this notions is presented in the next formula: --------------------------------------------------------------------------- ED (Elementary Data) ADT (implementation) DT (instantiation) DS (Data Structure) ---------------------------------------------------------------------------- • Is necessary to be specified that the above formula is valid only for primitive data defined by the user and for structured data types. • By implementation we understand type definition (specification) • By instantiation we understand declaration of a variable associated to the respective type • In the case of predefined types, the implementation phase is missing, it being intrinsic included in the programming language. As result for these types the next formula is valid: ---------------------------------------------------------------------------- DT (instantiation) ED --------------------------------------------------------------------------- 1.3 Unstructured Types 1.3.1 Enumeration Type • In big majority of programs, integers are used even when numerical values are not directly implied, and the integers represent in fact abstraction of some real entities • To avoid such situations, the programming languages introduce a new primitive unstructured data type specified by enumerations of all its possible values. • Such a type is called an enumeration type. It is definable by enumerating the distinct values belonging to it: c1,c2,....,cn . • Its definition has the form [1.3.1.a. -------------------------------------------------------------// Enumeration Type definition - Variants enum typeEnumeration {c1,c2,c3,...,cn} variable; typedef enum {c1,c2,c3,...,cn} typeName; [1.3.1.a] -------------------------------------------------------------• Enumeration type is an ordered type, its values are considered ordered in an increasing manner in the order in which they were defined, based on the convention: (ci < cj) ⇔ (i < j) • The type cardinality is card(tipEnumerare) = n . • Enumeration type is an unstructured type defined by the user o As consequence, its use presumes both the implementation phase (type declaration) and instantiation phase (declaration of the variables belonging to the type) -----------------------------------------------------------// Example of implementation Enumeraration Type /*1.3.1.b*/ typedef enum {rectangle, square, circle, ellipse} typeFigure; typedef enum {white, red, orange, yellow, green, blue, brown, black} typeColor; typedef enum {true, false} typeBoolean; typedef enum {Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday} typeWeekDay; typedef enum{soldier, caporal, sergeant, sublieutenent, lieutenant, captain, major, colonel, general} typeArmyDegree; typedef enum {constant, type, variable, procedure, function} typeObject; typedef enum {sequence, array, record, set} typeDataStructure; ------------------------------------------------------------------------ • The definition of such types introduces not only a new type identifier, but at the same time the set of identifiers denoting the values of the new type (in fact the domain of the type’s values) • These identifiers may then be used as constants throughout the source program, and they enhance its understandability considerably. -----------------------------------------------------------/* Example of instantiation of an enumeration type */ /*1.3.1.c */ typeColor c; typeWeekDay d; typeArmyDegree g; typeBoolean b; c d g b = = = = white; wednesday; captain; false; /*c:= /*d:= /*g:= /*b:= 0;*/ 2;*/ 5;*/ 1;*/ -----------------------------------------------------------• As example, based on the types implemented in [1.3.1.b], the variables c,z,g,b are declared in [1.3.1.c]. • In an evident manner, at the code level, this representation is more intuitive. For example: c=0; d=2; g=5 ; b=1, • This representation is based on the assumption that c,d,g,b are defined as integers, and in their quality of constants belonging to the respective enumeration type, they are mapped onto increasing natural numbers in the order in which they have been declared (enumerated). • As a consequence, a translator can easy verify the using of unspecific operators for this kind of non-numerical types. For example the programming statements c = c+1; or g = white; will be signalized as errors. • The enumeration type set includes as operators the classical assignment and comparison • Since the enumeration type is ordered, it is sensible to introduce in its set, operators that generate the successor and predecessor of their argument. • The following operators have this effect: SUCC(x) and PRED(x) 1.3.2 Predefined (Standard) Primitive Types • Predefined (standard) primitive types are those types that are available on most computers as built-in hardware features. • They include the whole numbers, the logical truth values, and a set of printable characters. On many computers fractional numbers are also incorporated, together with the standard arithmetic operations. • These types are denoted by the identifiers: INTEGER, REAL, BOOLEAN, CHAR, SET • (1) The type INTEGER comprises a subset of the whole integer numbers defined in mathematical sense • The size of represented integer number may vary among individual computer systems. If a computer uses n bits to represent an integer in two's complement notation, then the admissible values x must satisfy -2n-1 < x < 2n-1 • It is assumed that all operations on data of this type are exact and correspond to the ordinary laws of arithmetic • The computation process will be interrupted in the case of a result lying outside the representable subset. This event is called overflow (depăşirea capacităţii registrelor - DCR). • Over the set of integer numbers, beside the classical operators comparison and assignment, a number of standard operators are defined: • The standard operators are: addition (+), subtraction (-), multiplication (*), integer division (/) and modulo (%) m - n < (m / n)* n <= m (m / n)* n + (m % n) = m -------------------------------------------------------------- Integer ADT Mathematical model: scalar elements with values in the set of integer numbers: {..., -3, -2, -1, 0, 1, 2, 3,....}. Notations: i,j,k inz inn e b - integers; [1.3.2.a] nonzero integer; nonnegative integer; integer value (constant or variabile); Boolean value. Operators: i= IntegerAssignement(i,e) – stores the value of e in variable i; k= IntegerAddition(i,j) k= IntegerSubtraction(i,j) k= IntegerMultiplication(i,j) k= IntegerDivision (i,inz) inn= Modulo(i,inz) – return the positive integer that represents the reminder of dividing i with inz b= EqualToZero(i) b= BigerThanZero(i) ------------------------------------------------------------ • (2) The type REAL denotes a subset of the real numbers. • Whereas arithmetic with operands of the types INTEGER is assumed to yield exact results, arithmetic on values of type REAL is permitted to be inaccurate within the limits of round-off errors caused by computation on a finite number of digits. • The real numbers are represented with a specified precision expressed as a number of decimal digits. • This is the principal reason for the explicit distinction between the types INTEGER and REAL, as it is made in most programming languages. • The standard operators are the four basic arithmetic operations of addition (+), subtraction (-), multiplication (*), and division (/). • The operations that produce values which are outside of the representable domain of the specific real implementation, cause overflow (DCR). • The current implementations of the programming languages include many categories of integer types (int, long int, short int, unsigned int, etc.) respectively more categories of real types (float, double, long double, etc) which differs by the number of digits used in representation. • (3) The type Boolean has two values that are denoted by the identifiers TRUE and FALSE • In the language C these values are substituted by integer values: • 1 or <> 0 means true • 0 means false • The Boolean operators are the logical conjunction, disjunction, and negation whose values are defined in Table.3.2.a P TRUE TRUE FALSE FALSE P && Q TRUE FALSE FALSE FALSE Q TRUE FALSE TRUE FALSE P || Q TRUE TRUE TRUE FALSE !P FALSE FALSE TRUE TRUE Table 1.3.2.a. Logical operators • (4) The standard type char comprises a set of printable characters. o The ASCII codification (American Standard Code for Information Interchange) -------------------------------------------------------------Capitals: A B C ... X Y Z hexadecimal: x’41' x'42' x'43' x'58' x'59' x'5A' decimal: 65 66 67 88 89 90 Decimal digits: 0 hexazecimal: x'30' decimal: 48 1 x 31' 49 2 x 32' 50 ... ' Lower-case letters: hexadecimal: decimal: a x'61' 97 b x'62' 98 c x'63' 99 ' 9 x 39' 57 ' ... z x'7A' 122 Blanc character: hexadecimal: x’20’ decimal: 32 -------------------------------------------------------------- • Usually, the programming languages define primitive type char • The nature of a char x: ('A' <= x) && (x <= 'Z') - x is capital letter; ('a' <= x) && (x <= 'z') - x is a lower-cased letter; ('0' <= x) && (x <= '9') - x is a decimal digit. • In C language, types char and int are equivalent, thought they have different interpretation • Two standard transfer functions are defined between char and integer types: • f('5')=5 transforms a character in the corresponding integer • g(8)='8' transforms an integer in the corresponding character • f and g are inverse functions f(g(i))=i g(f(c))=c (0 < i < 9) ('0' < c < '9'). -------------------------------------------------------------// Implementation of the transfer functions integer-character char c; int n; [1.3.2.b] n= c-'0' //f(c)-> i c= n+'0'; //g(i)-> c -------------------------------------------------------------- 1.4 Structured Types 1.4.1 Array Structure. Abstract Data Type Array • The array is probably the most widely used data structure; in some languages it is even the only one available. • An array consists of components, which are all of the same type, called its base type; it is therefore called a homogeneous structure. • The array is a random-access structure, because all components can be selected at random and are equally quickly accessible. • In order to denote an individual component, the name of the entire structure is augmented by the index which selects positional the component. • This index is to be an integer between 0 and n-1, where n is the number of elements, the size, of the array. • Defining an array presumes to specify: • (1) The structuring method, which is directly implemented in the language • (2) The name of the resulting data type (array_type) • (3) The array base type (element_type) • (4) The index type (index_type). • In C there are some particularities: • The structuring method isn’t explicitly present. It is indicated by the presence of the right brackets in the syntax of the declaration. • The index type is always integer (int) • The domain of the index values is restraint to the set of positive integer numbers: 0,1,2,3,... • The size of the array (the total number of elements) is specified between the right brackets in the declaration -------------------------------------------------------------------- /*Definition of an array */ /*1.4.1.a*/ element_type array_name[number_of_elements]; -------------------------------------------------------------/* Definition of an array type */ /*1.4.1.b*/ typedef element_type array_type[number_of_elements]; array_type array_name; ----------------------------------------------------------------- /* Definition of an array - examples */ float array[10]; /*1.4.1.c*/ char string[10]; ----------------------------------------------------------------- /* Definition of an array type - example */ typedef float array_type[100]; array_type a,b; ----------------------------------------------------------------- • The array is a static data structure, for which the memory allocation is realized in the compilation phase of the program • As consequence when an array structure is defined, is absolutely necessary to specify the exactly number of elements in other words, the array’s size. • A structured value (an instance) of an array data type array_type having the component values c1, c2..., cn can be initialized by an array constructor and a assignment operation: element_type array_name[number_of_elements]= {expression_const0, expression_const1,...}; -------------------------------------------------------------/* Construction of an array */ int vect[5]={0,1,33,4,8}; int mat[2][3]={{11,12,13},{1,2,3}}; [1.4.1.d] char str[9]="Turbo C++"; -------------------------------------------------------------- • The reverse operator of a constructor is the selector. This selects an individual component of an array • For an array variable vector , an array selector is specified adding to the name of the structure, the index i of the selected component: vector[i]. • Instead of a constant index, an index expression can be used • Evaluation of the index expression during the program execution leads to the selected component as the algorithm requires. • If the base type is an ordinal type, then over the corresponding array type, a natural order relationship can be defined. • The natural order of two arrays having the same base type is determined by those corresponding components, which are different and have the smallest index. (2,3,5,8,9)>(2,3,5,7,11) 'CAPAC' < 'COPAC' -------------------------------------------------------------/* The natural order of two arrays */ typedef element_type array_type[n]; array_type x,y; /*1.4.1.e*/ (x<y) <=> Ek:0≤k<n:((Ai:0≤i<k:x[i]=y[i])&(x[k]<y[k])) -------------------------------------------------------------- • A classical example in this sense is the lexicographic order applicable to the arrays of characters. • Since all the components of an array array_type belongs to the same base type element_type: Card (array_type) = Card (element_type) Card (index_type) . • The arrays having one dimension are named linear arrays • The access to any element of a linear array is realized using a single index in the selection mechanism. • element_type specified in the an array definition can be in its turn an array • Using this mechanism, arrays with more dimensions can be defined • Such an array is called matrix • The access to a matrix element is realized using a number of indexes equal to the number de dimensions of the matrix. mat[i,j,k,… ,m,n]. -------------------------------------------------------------------- /* Defininition of a matrix type */ /*1.4.1.f*/ typedef element_type array_type [number_of_elements]; typedef array_type matrix_type[number_of_elements_1]; -------------------------------------------------------------• A matrix can be also defined as a multidimensional array. -------------------------------------------------------------/* Definition of a matrix */ element_type matrix_name[number_of_elements] [number_of_elements_1]; /*1.4.1.g*/ -------------------------------------------------------------/* Definition of a multidimensional array (matrix) */ typedef element_type array_type [number_of_elements_1] [number_of_elements_2]... [number_of_elements_n]; array_type matrix; /*1.4.1.h*/ -------------------------------------------------------------• The abstract data type array is summarized in [1.4.1.i] • In [1.4.1.j] is presented an example of implementation of an array ADT. -------------------------------------------------------------ADT Array Model matematic: Sequence of elements belonging to the same type (base type). The associate index belongs to the domain [0,1,2,...). There is a biunivoc correspondence between indexes and array’s elements. Notations: element_type – the type of array’s elements (base type); a – mono dimensional array; i - index; [1.4.1.i] e – value of element_type. Operators: StoreInArray(a,i,e) – operator that stores the e value in the i-th element (position) of the array a; e= RetrieveFromArray(a,i) – operator that returns the value of the i-th element (position) of the array a; -------------------------------------------------------------/* Example of implementation /*1.4.1.j*/ #define max_elem_number integer_value typedef element_type array_type[max_elem_number]; int i; array_type a; element_type e; a[i]=e; /*StoreInArray(a,i,e)*/ e=a[i]; /*RetrieveFromArray(a,i)*/ --------------------------------------------------------------------------- 1.4.2 Array Searching • The task of searching is one of most frequent operations in computer programming. • It also provides an ideal ground for application of the data structures so far encountered. • There exist several basic variations of the theme of searching, and many different algorithms have been developed on this subject. • The basic assumption in this presentation is: • We have a collection of data, containing N elements, among which a given element is to be searched. • We shall assume that this set of N elements is represented as an linear array a element_type a[n]; • Typically, the type item has a record structure with a field that acts as a key. • The task of search consists of finding an element of whose key field is equal to a given search argument x. The result of the search is an index i. • The resulting index i, satisfying a[i].key = x, then permits access to the other fields of the located element. • Since we are here interested in the task of searching only, and do not care about the data for which the element was searched in the first place, we shall assume that the type element_type consists of the key only, i.e. is the key. • In other words we will search in an array of keys. 1.4.2.1 Linear Search • When no further information is given about the searched data, the obvious approach is to proceed sequentially through the array in order to increase step by step the size of the section, where the desired element is known not to exist. • This approach is called linear search and is presented [1.4.2.1.a]. -------------------------------------------------------------/*Linear search - variant while */ /*1.4.2.1.a*/ element_type a[n]; element_type x; int i=0; while ((i<n-1) && (a[i]!=x)) i++; if (a[i]!=x ){ /*the array doesn’t contain the searched element*/ } else{ /*there is a coincidence at index i*/ } -------------------------------------------------------------------- • There are two conditions which terminate the search • (1) The element is found, i.e. a[i] = x • (2) The entire array has been scanned, and no match was found; i=n-1 and a[i]≠x • The invariant, i.e the condition satisfied before each incrementing of the index i, is (0<= i < n-1)&(Ak:0 <= k < i:a[k]<> x) • The invariant express that for all values of k < i , no match exists • The search termination condition is: ((i = n-1)OR(a[i]= x))&(Ak:0 <= k< i:a[k]<> x) • This condition can be used in a search algorithm based on do cycle [1.4.2.1.b] --------------------------------------------------------------------------- /*Linear search - variant do*/ /*1.4.2.1.b*/ element_type a[n]; element_type x; int i=-1; do{ i++; }while ((i<n-1) && (a[i]!=x)) if (a[i]!=x ){ /*the array doesn’t contain the searched element*/ } else{ /*there is a coincidence at index i*/ } -------------------------------------------------------------------- • To be noticed: • When the search process is finished, there is one more comparison to be made between a[i] and x in order to establish the existence or nonexistence of the searched element. • When the algorithm did find a match, it found the one with the least index, i.e. the first one if they are more elements with the same key. • The classical linear search is implemented using the for statement [1.4.2.1.c] -------------------------------------------------------------/*Linear search - variant for */ /*1.4.2.1.c*/ /*returns value n if x is not found*/ /*returns a value i<n for x found on position i */ int linear_search(int x,int a[],int n) {int i; for(i=0;i<n && a[i]-x;i++); return i; } -------------------------------------------------------------• As is to be observed, for each cycle of the algorithm, the Boolean expression which contains two factors is evaluated and the index is incremented • This process can be accelerated, by simplifying the Boolean expression • A solution is to replace the two Boolean factors by only one, which implies the both of them. • This is possible only by guaranteeing that a match will be found always • For this purpose a supplementary element a[n] is added to the array, whose value is initially assigned with value x (the searched element). The array becomes: element_type a[n+1]; • The auxiliary element x is called sentinel, and the search - sentinel technique • It’s evident, if when the algorithm finishes i = n , then the searched element doesn’t exist in array. • The new shape of the search algorithm appears in [1.4.2.1.d]. --------------------------------------------------------------------------- /*Linear search - sentinel technique (while) */ /*1.4.2.1.d*/ element_type a[n+1]; element_type x; int i=0; a[n]=x; while (a[i]!=x) i++; if (i == n){ /*the array doesn’t contain the searched element*/ } else{ /*there is a coincidence at index i*/ } -------------------------------------------------------------------- • The ending condition deducted from the cycle invariant is: (a[i]= x)&(Ak:0 <= k < i:a[k]<> x) • In [1.4.2.1.e] the same algorithm appears implemented using do cycle and in [1.4.2.1.f] using for cycle -------------------------------------------------------------/* Linear search - sentinel technique (do) */ /*1.4.2.1.e*/ element_type a[n+1]; element_type x; int i=-1; a[n]=x; do{ i++; }while (a[i]!=x) if (i == n){ /*the array doesn’t contain the searched element*/ } else{ /*there is a coincidence at index i*/ } -------------------------------------------------------------/* Linear search - sentinel technique (for)*/ /*returns value n if x is not found*/ /*returns a value i<n for x found on position i */ int sentinel_search(int x,int a[],int n) /*1.4.1.f*/ {int i; a[n]=x; for(i=0;a[i]-x;i++); return i; } -------------------------------------------------------------• The linear search performance is O(n/2), that means an element is found in average, after half of the array’s elements have been tested. 1.4.2.2 Binary Search • The search process can be accelerated, if more information is available about the searched data. • It is well known that a search can be made much more effective, if the data are ordered (for example, a telephone directory in which the names were not alphabetically listed) • We shall therefore present an algorithm which makes use of the knowledge that array a is ordered, i.e., it satisfies the predicate Ak: -------------------------------------------------------------/* The ordered array definition*/ element_type a[n]; Ak:0<k<=n-1:a[k-1]<= a[k] [1.4.2.2.a] -------------------------------------------------------------- • The basic idea is: • Inspect an element picked at random, say a[m], and compare it with the search argument x • If a[m] is equal to x, the search terminates; • If a[m] is less than x, we infer that all elements with indices less or equal to m can be eliminated from further searches. The search interval can be restraint to elements having indexes grater than m. • If a[m] is greater than x, all elements with index greater or equal to m can be eliminated. The search interval can be restraint to elements having indexes smaller than m. • This results in the following algorithm called binary search; • The variables left and right mark the left and at the right end of the section of a in which an element may still be found. -------------------------------------------------------------------- /*binary search */ /*1.4.2.2.b*/ element_type a[n]; /*n>0*/ element_type x; index_type left,right,m; int found; left=0; right=n-1; found=0; while((left<=right)&&(!found)) { m=(left+right)/2; /* or any values between l and r */ if(a[m]==x) found=1; else if(a[m]<x) left=m+1; else right=m-1; } if(found) /* there is a match at index I */ -------------------------------------------------------------------• The loop invariant, i.e. the condition satisfied before each step, is (left<=right)&(Ak:0 <= k <left:a[k]<x)&(Ak:right<k <=n-1:a[k]> x) • The end condition is: found OR((left>right)&(Ak:0<=k<left:a[k]<x)& (Ak:right<k<=n-1:a[k]>x)) • The choice of m is arbitrary in the sense that correctness does not depend on it, but it does influence the algorithm's effectiveness. • The optimal solution is to choose the middle element, because this eliminates half of the array in any case. • As a result, the maximum number of steps is O(log 2 n), • Hence, this algorithm offers a drastic improvement over linear search, where the expected number of comparisons is n/2. • A more compact version of binary search is presented in [1.4.2.2.c] -------------------------------------------------------------/* binary search 1 */ int binary_search(int x,int a[],int n){ int left=0,right=n-1,m; do{ (x>a[m=(left+right)/2])?(left=m+1):(right=m-1); [1.4.2.2.c] }while(a[m]-x && left<=right); return(a[m]-x)?n:m; } -------------------------------------------------------------- • The efficiency of binary search can be somewhat improved if – as in the case of linear search – a solution could be found that allows a simpler condition for termination. • This objective can be attaint, if we abandon the naive wish to terminate the search as soon as a match is established. • This seems unwise at first glance, but on closer inspection we realize that the gain in efficiency at every step is greater than the loss incurred in comparing a few extra elements. • Remember that the number of steps is at most log 2 n . • The faster solution is based on the following invariant: (Ak:0<=k<left:a[k]<x)&(Ak:right<k<=n-1:a[k]>x) • The search process continues until the search interval becomes trivial (0 or 1 element). • This technique is presented in [1.4.2.2.d]. -----------------------------------------------------------/* ameliorated binary search */ /*1.4.2.2.d*/ element_type a[n]; /*n>0*/ element_type x; index_type left,right,m; left=0; right=n; while(left<right){ m=(left+right)/2; if(a[m]<x) left=m+1; else right=m; } if(right>n) /*the array doesn’t contain the searched element*/ if(right<=n) if(a[right]==x) /*the element was found*/; else /*the array doesn’t contain the searched element*/; -------------------------------------------------------------- • In this case the end condition is left = right. • Even though, the equality l = r doesn’t automatically presume that the searched element was found • If right > n doesn’t exist any coincidence. It was searched an element greater than any other element of the array • If right<=n the element a[right] corresponding to the last index right was not compared with the key x • Hence, an additional test for equality a[right]=x is necessary in order to establish the existence of the coincidence. • In contrast to the first solution, this algorithm -- like linear search -- finds the matching element with the least index. • Another version of the ameliorated binary search appears in [1.4.2.2.e]. -----------------------------------------------------------/* ameliorated binary search - variant 2 */ int ameliorated_binary_search (int x,int a[],int n){ int left=0,right=n,m; do{ (x>a[m=(left+right)/2])?(left=m+1):(right=m); /*1.4.2.2.e*/ }while(left<right); return ((right>=n)||(right<n && a[right]-x))?n:right; } -----------------------------------------------------------1.4.2.3 Interpolation Search Technique • Another possible improvement of the binary search presumes to determine with more precision the place in current interval in which the search is made, instead of access always the element placed in the middle. • Thus, if the search is made in an ordered structure, for example in phone book, when is searched a name starting with B is normal to search somewhere at the beginning of the book, instead a name starting with V, somewhere at the end of the book • This method is named search by interpolation, and it requires a simple modification reported to the binary search • Thus, in binary search, the new place in which the access is realized is always in the middle of the current search interval. 1 * (right − left ) 2 • This place is determined based on the relation: m = left + where left and right are the limits of the current search interval • The next evaluation of m, named interpolation search, is considered to be more efficient m = left + x − a[left ] * (right − left ) a[right ] − a[left ] • This evaluation takes care about the effective values of the keys which bounds the search interval, when the index of the next search is established • This estimation however presumes a normal distribution of the keys’ values over the target domain. --------------------------------------------------------------/* interpolation search */ int interpolation_search(element a[],int last,int x){ long m,left,right; left=0; right=last-1; [1.4.2.1.m] do{ m=left+((x-a[left].key)*(right-left))/ (a[right].key-a[left].key); if (x>a[m].key) left=m+1; else right=m-1; } while((left<=right)&&(a[m].key!=x)&&(a[right].key!=a[left].key) &&(x>=a[left].key)&&(x<=a[right].key)); if (a[m].key==x) return m; else return last; } --------------------------------------------------------------- • Property. Interpolation search requires either is successfully or unsuccessfully less than lg (lg (n) +1) comparisons [Se 88]. • The function lg (lg n) is growing extremely slow: • Thus, for example, for values of n around 109, lg(lg n)< 5. 1.4.3 Record Structure. Abstract Data Type Record • The most general method to obtain structured types is to join elements of arbitrary types that are possibly themselves structured types, into a compound type. • Examples from mathematics are: • Complex numbers, composed of two real numbers, • Coordinates of points composed of two or more numbers according to the dimensionality of the space spanned by the coordinate system. • In mathematics, such a compound type is named the Cartesian product of its constitutive types. • The set of values defined by this compound type consists of all possible combinations of values, taken one from each set defined by each constituent type. • Such a combination is also called n-tuple • Thus, the number of such possible combinations is the product of the number of elements in each constituent set, that is, the cardinality of the compound type is the product of the cardinalities of the constituent types. • The term that describes a compound data structure in programming is record or struct (in C). • In [1.4.3.a.] a generic definition mode of a record type in C is presented. -------------------------------------------------------------/*Struct (record) definition */ struct name_structure{ type1 list_nameField1; [1.4.3.a] type2 list_numeField2; ... typen list_nameFieldn; } list_var_structure; -------------------------------------------------------------- • Examples of record data types definition are presented in [1.4.3.b]. -------------------------------------------------------------/* Examples of record structures defining */ struct data { int day, month, year; } birth_data, enrolment_data struct { int day, month, year; } birth_data, enrolment_data struct data{ int day, month, year; } struct data birth_data, enrolment_data typedef struct data { int day, month, year; }calendar_data; [1.4.3.b] calendar_data data birth_data, enrolment_data; typedef struct person_structure { char[12] name; char[12] surnume; calendar_data birth_data; enum state(unmarried, married, divorced, widow) social_situation; enum sex(male, female)sex; }person; -------------------------------------------------------------- • The identifiers nameField1, nameField2 ,..., nameFieldn specified in the type definition, are names given by the user to the individual components of the type. • They are used in record selectors applied to record structured variables. • Given a variable x of the type name_structure, its i-th field can be denoted by using the “dot” notation: x.nameFieldi • Selective updating or assignment of x is achieved by using the same selector denotation on the left side in an assignment statement: x.numeFieldi:= xi where xi is a value (expression) of type typei • In [1.4.3.c] are presented some examples of the above defined record types’ instantiation -----------------------------------------------------------/* Examples of record structures instantiation */ calendar_data d; person p; [1.4.3.c] d.month=9; p.name=”merriam”; p.birth_data.day=24; p.social_situation=unmarried; ----------------------------------------------------------- • The example of the type person_structure shows that a constituent of a record type may itself be structured. Thus, selectors may be concatenated. • Naturally, different structuring types may also be used in a nested fashion. • For example, the i-th component of an array a being a component of a record variable r is denoted by: r.a[i] • The component with the selector name nameFieldi of the i-th record structured component of the array a is denoted by: a[i].nameFieldi • From the formal point of view, abstract data type record is presented in [1.4.3.d.]. -------------------------------------------------------------TDA Record [1.4.3.d] Mathematical model: A finite collection of elements named fields, which can belong to different types. There is a biunivoc correspondence between the identifiers of the fields end the elements collection. Notations: a - object of type record; id - identifier name of a field; e - object of the same type as field id of record a Operators: StoreInRecord(a,id,e) – stores the e value in the field id belonging to a e= RetrieveFromRecord(a,id) – returns the value of field id belonging to a -------------------------------------------------------------- • In [1.4.3.e.] are presented some examples of implementation of ADT Record. • As can be noticed, the definition and instantiation phases can be separated or can be cumulated in different modes, the C programming language being very flexible in this sense -------------------------------------------------------------/* Examples of implementation */ struct point { int x; int y; } struct rectangle { struct point corner1; struct point corner2; } [1.4.3.e] struct point p1; double dist; struct rectangle window; -------------------------------------------------------------//Example of usage dist=sqrt((double)(p1.x*p1.x)+(double)(p1.y*p1.y)); window.corner1.x=188; ----------------------------------------------------------------- 1.4.4 Union Structure • The C language defines union structure • A union is in fact a variable that can memorize at different moments of time objects belonging to different types and having different sizes. The compiler keeps the evidence of the objects and aligns in a correspondent manner the stored data. • The unions have been conceived in the idea to manipulate different data types in a unique location (zone) of memory. -----------------------------------------------------------/* Example of defining a union structure */ union union_indicator { int integer_value ; [1.4.4.a] float real_value; char *char_string; } union_variable ----------------------------------------------------------- • In [1.4.4.a] union_variable has a size that depends on the type of variable that was assigned to it. This is transparent from the user point of view. • In other words, for any correct utilization of the type, the actually stored value belongs to last memorized. • The access to the structure component is realized using dot notation. union_variable.member or union_pointer.member 1.4.5 Sequence Structure. Abstract Data Type Sequence • The common characteristic of the presented data structures (arrays, records, sets) is their finite cardinality; respectively the cardinality of the all their component types is finite. • That is the reason why, their implementation raise no problems. • The most part of the so called advanced data structures (sequences, lists, trees, graphs) are characterized by non-finite cardinality • This difference vis-à-vis the classical data structure is very important and has significant practical consequences. • For example, the sequence structure, having as base type T0 is defined as follows: -----------------------------------------------------------S0=< > (null sequence) Si=<Si-1,si> where 0 < i and si∈ T0 -----------------------------------------------------------• In other words, a sequence with base type T0 , is or: • (1) Null sequence, • (2) A concatenation of a sequence (with base type T0) with a value (si) of type T0 . • The recursive definition of the sequence data type has as result a non-finite cardinality • Each value of the sequence data type contains in fact a finite number of components belonging to the type T0 , • This number is boundless because starting with any sequence is possible to build a longer one. • Important consequences: • (1) The necessary memory storage space for an advanced data structures cant’ be known at the compilation moment • (2) A dynamic allocation memory scheme is necessary, based on which the memory is allocated to the “growing” structures and is free by the “decreasing” structures " • To implement such a requirement, the used high level programming language has to contain system functions allowing: • Dynamic Allocation/Dispose of memory • Dynamic Link/Reference of components • Under these circumstances advanced data structures can be described and used, with the help of programming language instructions. • In the next chapters of this course, the techniques for generating and processing such advanced data structures are presented. • The sequence structure is from this point of view an intermediary structure • It is an advanced data structure from the point of view of its cardinality which is non-finite • It is so currently used that its inclusion in the set of fundamental data structures is well established. • This situation is due to the fact that the selection of a suitable operators set for this structure, permit to implementers to choose adequate and efficient representations of this structure. • As result, the dynamic memory allocation mechanisms become simple enough to allow an efficient implementation, not affected by details at a high level programming language. 1.4.5.1 Abstract Data Type Sequence • The appurtenance of the sequence structure to the fundamental data structures, presume the restraint of its operators set in such a manner, that only the sequential access to its components is allowed. • This structure is known under the name of sequential file or for short file. • Its formal definition is: s = <s0, s1, s2,..., sn-1> • The sequence structure is affected by the restriction that all its components belong to the same base type, so it is a homogeneous data structure. • The components number n, named also length of the sequence, is presumed to be unknown during the compilation as well as during the code execution phase. • More than that, this number is not a constant; it can be modified during the program execution. • Although every sequence has at any time a specific, finite length, we must consider the cardinality of a sequence type as non-finite, because there is no fixed limit to the potential length of sequence variables. • Depending on the implementation manner, the sequence structures are stored on different kinds of storage media, where sometime, data are to be moved from one medium to another, such as from disk or tape to primary store or vice-versa. • The usual storage devices for sequence structures are magnetic tapes and disks in sequential allocation. • Sequence is an ordered structure. The order of structure elements is established by the time order of their generation. • Due to its sequential access mechanism, index selection of the individual components of a sequence becomes inadequate. • As result, at the declaration of a type sequence, only the base type is specified: TYPE SequenceType = FILE OF BaseType; • In a sequence structure defined like above, in any moment is accessible only one component, named current component, specified by an indicator (pointer) implicit associated to the sequence. • This indicator advances in a sequential manner at the next element, practically after execution of any operator over the current element. • In order to determine the end of a sequence structure, a specific Boolean operator is introduced: end of file specified by Eof(f: SequenceType) • The effective access manner to sequence components, as well as the actually processing possibilities, depends on the implementation, and result from the semantic of the operators defined for the abstract data type sequence. • In [1.4.5.1.b] is presented a C implementation of the sequence structure. -----------------------------------------------------------ADT Sequence Mathematical model: sequence of elements belonging to the same type. An indicator indicates the current accessible element of the sequence. The access is strictly sequential. Notations: ElementType – the type of the sequence element. It can’t be sequence type; s - variable sequence; e - variable of ElementType; [1.4.5.1.a] b – boolean value; DiscFileName – characters string. Operators: Assign(s, FileDiscName) – assigns to variable sequence s the name of the disc file; Rewrite(s) – operator which poses the sequence indicator at the beginning of the f and opens the file f in writing mode. If s doesn’t exist it is created. If s already exists, its old variant is lost and a new empty file s is created; ResetSequence(s) - operator which poses the sequence indicator at the beginning of the s and opens the file s in reading mode. If s doesn’t exist an execution error is signalized; OpenSequence(s) – operator which in some implementations play the role of rewrite or reset sequence; b:= Eof(s) – function which returns the value true if the sequence indicator is positioned on the marker of the end of file s; RetreiveSequence(f,e) – operator acting in reading mode. While Eof(f) is false, the operator supplies in variable e the current element indicated by sequence indicator and advances at the next element; StoreSequence(s,e) - for the sequence s opened in writing mode, store value e in the current element indicated by the sequence indicator and advances to the next element; Add(s) – operator which opens sequence s in writing mode, poses the sequence indicator at its end, with the possibility to add new elements at the end of the sequence s; CloseSequence(s) – closes sequence s . -----------------------------------------------------------{Implementation of Type Sequence – Pascal variant} TYPE ElementType = ... ; SequenceType = TEXT; VAR s: SequenceType; [1.4.5.1.b] e: ElementType; DiscFileName: string; {Assign(s, DiscFileName)} assign(s, DiscFileName) {Rewrite(s)} rewrite(s) {StoreSequence(s,e)} write(s,e) ; writeln(...) {ResetSequence(s)} reset(s) {Eof(s)} eof(s) {RetreiveSequence(s,e)} read(s,e); readln(...) {Add(s)} append(s) {CloseSequence(s)} close(s) ------------------------------------------------------------ 1.4.5.2. Abstract Data Type Direct Access Sequence • The main disadvantage of the sequence structure is the fact that the access to the sequence components is realized in a strictly sequential manner, through an indicator whose positioning is not accessible to the user in a direct manner. • That is the reason for starting from abstract data type sequence, a new more flexible structure where developed namely direct access sequence. • Thus, a direct access sequence has an indicator which allows the access in direct manner to any of the sequence’s component. • For this purpose, each sequence’s component is identified by an index which indicates its position in a similar manner with array structure. • Components numbering starts SequenceDimension-1. with 0,1,2,... and continues until • But, the creation and the modification of the direct access sequence are realized in sequential manner too. • In other words we can add or modify records, but we cant’ insert or delete the existing records. • The abstract data type direct access sequence is synthetically presented in [1.4.5.2.a] and in [1.4.5.2.b] a Pascal implementation example. -----------------------------------------------------------ADT Direct Acces Sequence Mathematical model: a sequence of elements belonging to the same type. An indicator indicates the current accessible element of the sequence. The indicator can indicate any element of the sequence. Each sequence component is uniquely identified by an index. The counting of components starts with 0,1,2,... and continues until SequenceDimension-1. Notations: ElementType - the type of the sequence element. It can’t be sequence type; das – direct access sequence; e - variable of ElementType; [1.4.5.2.a] i - index in das. Operators: Assign(das, DiscFileName) - assigns to variable direct access sequence das the name of the disc file; Rewrite(das) – operator which poses the direct access sequence indicator at the beginning of the das and opens the sequence das in writing mode. If das doesn’t exist it is created. If das already exists, its old variant is lost and a new empty direct access sequence is created. In both cases, the associate sequence indicator takes the value 0; Reset(das) - operator which poses the sequence indicator at the beginning of the das and opens the sequence in reading mode. The indicator takes the value 0. If das doesn’t exist an execution error is signalized; Open(das) – operator which in some implementations play the role of rewrite or reset sequence; i:= SequenceDimension(das) – operator which returns the current total number of components stored in sequence das; i:= SequencePosition(das) – the operator returns the number of the current record which is specified by the sequence indicator. When a sequence is opened, SequencePosition takes the value 0; Search(das,i) – selects the component specified by i, posing the sequence indicator to point the this component; Retreive(das,e) - operator which supplies in variable e the current element indicated by sequence indicator and advances at the next element; Store(das,e) – stores the value e in the current element indicated by the sequence indicator in das and advances to the next element; Close(das) – closes sequence das. -----------------------------------------------------------{Implementation of the type direct access sequence - Pascal Variant} TYPE ElementType = ... ; DirectAccessSequenceType = FILE OF ElementType; IndexType = integer; VAR das: DirectAccessSequenceType; e: ElementType; [1.4.5.2.b] i: IndexType; {Assign(das, DiscFileName)} assign(das, DiscFileName) {Rewrite(das)} rewrite(das) {Reset(das)} reset(das) {i:= SequenceDimension(das)} i:= filesize(das) {i:= SequencePosition(das)} i:= filepos(das) {Search(das,i)} seek(das,i) {Store(das,e)} write(das,e); writeln(...); {Retreive(das,e)} read(das,e); readln(...); {Close(das)} close(das) ------------------------------------------------------------ • Similar implementation variants of ADT Sequence exist for C and C++ programming languages. • If at principle level the presented concepts remain similar, at the implementation level important differences may appear.

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download 1 Fundamental data structures