Strings

Advanced Programming Languages
Lecture week 7

Data Structures

A data structure is a data object that contains other data objects as its elements or components.
Common types: arrays, records, strings, stacks, lists, pointers, sets, files
Some are programmer-defined, some system-defined.

Specification
  Major attributes
    1. # of components (fixed, as in structs and arrays, vs. variable, as in lists and strings)
    2. type of each component (homogeneous or heterogeneous)
    3. selection of components (subscript, identifier)
    4. maximum # of components (for variable-size structures)
    5. organization of components (linear vs. multi-dimensional)
  Operations
    1. component selection (random vs. sequential)
    2. whole-data-structure ops (r1 = r2 or A U B)
    3. insert/delete components
    4. create/destroy structures

Implementation
  Storage representations
    1. sequential (fixed size): a single contiguous block of storage holds both descriptor and components
    2. linked (variable size): stored in several noncontiguous blocks, linked by pointers
  Operation implementation
    Sequential
      1. access 1st (base address + offset)
      2. access next (add size of component)
    Random
      1. access ith element (base address + i * component size); fixed-size components only
      2. hash coding
    Linked
      1. access 1st (base address)
      2. access next (follow chain of pointers)

Storage Management & Data Structures

Lifetime: begins when the object is bound to a location; ends when the binding is dissolved.
Individual elements of a variable-sized structure have individual lifetimes: they are inserted into and deleted from the structure.

Access path: created at the start of the object's lifetime
  - by association of the data object with a name in some referencing environment
  - by storage of a pointer in some other existing structure; the new object is then a component of the older one
New access paths may be created during the lifetime
  - passing the object as an argument to a subprogram
  - creating new pointers to it
So several access paths can exist to one data object.
Access paths are destroyed in various ways
  - return from a subprogram
  - assigning a new value to a pointer variable

Two major problems in storage management
  - Garbage: all access paths to an object are destroyed, but the object still exists
      o The object is no longer accessible from the program, but its storage can't be reused
  - Dangling references: access paths to objects that no longer exist
      o May compromise the integrity of the run-time structure during execution

Structured Data Types

2 ways to join data so that it is treated as a unit:
  1. array: elements identified by position (all the same type)
  2. record: elements identified by name (can be different types)

Array

Specification: group of homogeneous elements with one name, identified by position (all the same type)
Implementation: stored in contiguous memory locations
  C++/Java: referenced by pointer
  May be stored in row-major order (Pascal, Ada, C++, Java) or column-major order (Fortran)
Examples:
  ADA:  Table : array (1..10, 1..N+1) of float;
  C++:  char a[2*n];
  Java: int i = 15;
        byte TwoDArray[][] = new byte[256][i+1];

Array Issues (10)
  1. syntax: use [ ] vs. ( ); is the array delimiter overloaded?
  2. dimensions (lower bound?, upper bound)
  3. dynamic bounds (can you use expressions and functions?)
  4. type of the index set (any ordinal type or only integer)
  5. range (can bounds be left undefined?)
     Example:
       ADA:  type sequence is array (integer range <>) of float;
             P : sequence (1..1000);
       Java: int a[][] = new int[10][];
             (at least the 1st specifier must give the # of elements)
  6. multidimensional
  7. nonrectangular
     Example:
       Java: int[][] TwoDim = { {1, 2}, {3, 4, 5}, {5, 6, 7, 8} };
  8. array slices
     Example:
       PL/1:  W(3, *)
       ALGOL: W[3, ]
       Java:  int a[][] = new int[10][15];  then use a[1]
  9. array initialization
     Example:
       Java: see issue 7's example
       ADA:  type Tmatrix is array (integer range 1..2, integer range 1..2) of real;
             A : Tmatrix := ((10, 20), (20, 40));
             or
             A : Tmatrix := (1 => (1 => 10, 2 => 20), 2 => (1 => 20, 2 => 40));
  10. operations
       FORTRAN, ALGOL60: elements act as scalars
       APL, PL/1, ALGOL68: assignment of arrays and subarrays with the same size and element type
       ADA, PASCAL: assign entire arrays of the same size and type (same type, not just same size and structure!)

  ADA:
    1. uses ( ), overloaded
    2. LB, UB
    3. dynamic
    4. any ordinal type
    5. yes
    6. yes
    7. no
    8. yes (if array of array)
    9. yes
    10. yes (:=)
  C++:
    1. uses [ ], not overloaded
    2. UB only (0 is LB)
    3. dynamic
    4. integer
    5. no
    6. yes
    7. no
    8. no
    9. yes
    10. no (arrays are referenced by pointers)
  Java:
    1. uses [ ], not overloaded
    2. UB only (0 is LB)
    3. dynamic
    4. integer
    5. yes
    6. yes
    7. yes
    8. yes (Java uses arrays of arrays)
    9. yes
    10. yes (clone and == test)

Records (structs)

Specification: structured type composed of heterogeneous data; attributes and their types are described in a declaration
Implementation: stored in contiguous memory locations
Examples:
  ADA:  type complex is record
          Rpart, Ipart : float;
        end record;
        c : complex;
        c.Ipart := 5.2;
  C++:  struct complex { float rpart, ipart; };
        complex c;
        c.ipart = 5.2;
  Java: n/a

Record Issues (6):
  1. how to reference components of a record
  2. distinguishing field ids from ids declared outside the record ("scope" of a record)
  3. initialization
  4. a with statement, to avoid continually qualifying fields with the record name
  5. variant records (a parameter to the record declaration, called a discriminant)
  6. name equivalence vs. structure equivalence

  ADA:
    1. dot notation
    2. "scope" of record
    3. yes
         ibm : stock;
         ibm := (name => "IBM", price => 5.25);
    4. ?
    5. discriminants change only by changing the entire record
       Example:
         type TMonth (length : integer range 1..31) is
           record
             Name : string (1..3);
             Days : array (1..length) of float;
           end record;
         Jan : TMonth (31);
    6. name equivalence (passing and assignment)
  C++:
    1. dot notation
    2. "scope" of record
    3. yes
         struct TCircle { float radius; };
         TCircle c = {5};
    4. no
    5. no, but it does have templates
         template <class item>
         struct Node { item data; Node *link; };
    6. name equivalence

Strings

History
  Early FORTRAN: strings used in FORMAT statements for output
  Later: A-format included, allowing I/O of strings
  ALGOL60: string constants enclosed in quotes
  Later (mid-60s):
    - variables could hold character strings
    - relational operators on strings (assuming a collating sequence)
    - set of operations (length, index, concat, substr)
    - implementation required dynamic (de)allocation, since length is not determinable at compile time (the reason strings were not in FORTRAN)

Specification: sequence of characters
Operations: assignment, concat, relational ops, subscripting, ...
Implementation: contiguous memory locations (like arrays)
Examples:
  ADA:  type string is array (positive range <>) of character;   (predefined)
        MyString : string (1..30);
  C++:  char *s;
        char s[10];

String Issues (3):
  1. ways to declare (constant length, varying length, COBOL's PICture clause)
  2. is a char variable compatible with a string of 1 char?
  3. operations

  ADA:
    1. varying
    2. implementation dependent
    3. yes
  C++:
    1. varying and constant
    2. no
    3. library
  Java:
    1.
    2. no
    3. yes (java.lang.String)

Pointers

Specification: a pointer is a reference to some object
  A pointer variable is an identifier whose value is a reference
Implementation: hardware (address)
Examples:
  ADA:  type handle is access string;
        Moniker : handle := new string'("TOXIC AVENGER");
  C++:  int *p;
        p = new int;

Pointer Issues (7):
  1. aliasing
  2. reclamation of garbage
  3. syntactical denotations
  4. pointers point only to objects of a single type
  5. operations
  6. dispose/delete
  7. pointer arithmetic

  ADA:
    1. yes
    2. no
    3. context (dereferencing is automatic for Rvalue)
    4. yes
    5. assignment, = test, dereferencing, new
    6. no (too time consuming) (no dispose, no garbage collection!)
    7. no (unless you disable type checking)
  C++:
    1. yes
    2. no
    3. p is the pointer, *p is the object p points to
    4. yes
    5. assignment, == test, dereferencing, new
    6. yes (delete)
    7. yes
  Java (references):
    1. yes
    2. yes
    3. context
    4. yes
    5. fetch
    6. automatic garbage collection
    7. no

Defining new data types

Type definition (C, Pascal, ADA) is not a complete ADT: it doesn't include operations.

2 Design Issues
  1. when are two types "the same"?
  2. when are two objects of the same type "equal"? (deals with Rvalue)

1. Name equivalence vs. structural equivalence (Pascal, Ada, C++)
   Disadvantages of name equivalence:
     - var w : array[1..10] of real; has an anonymous type, so w can't be passed as a parameter and its type can't be redeclared in a subprogram
   Disadvantages of structural equivalence:
     - costly to determine
     - types may be inadvertently equal, so combining one with another is an error that is not checked
     - criteria are unclear: identical field ids (or only types and order)? same # of components vs. same subscript range?

2. Data object equality (determine the conditions for equality)
   x, y : stack;
   Equality:
     1. x.top = y.top
     2. for every I with 0 <= I <= top-1 : x.data[I] = y.data[I]
   EASY!
   A, B : set;
   Equality:
     1. A.size = B.size
     2. A.data[0]..A.data[size-1] is a permutation of B.data[0]..B.data[size-1]
   NOT EASY TO DO!
   Ada supplies a default = if the programmer does not provide one.

Type definitions with parameters
  Ada:
    type Section (MaxSize : integer) is
      record
        Rm             : integer;
        InstructorCode : integer;
        ClassSize      : integer range 0..MaxSize;
        ClassRoll      : array (1..MaxSize) of studId;
      end record;
    X : Section (100);
    Y : Section (25);
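
The two data object equality tests in the notes (stack: easy, set: not easy) can be sketched in C++ roughly as follows. This is only an illustration under assumptions: the Stack and Set layouts, the kCapacity bound, and the function names stackEqual and setEqual are hypothetical and are not taken from the lecture; they just mirror its x.top / x.data[I] and A.size / A.data[I] notation.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Assumed fixed capacity for both structures (not from the notes).
constexpr std::size_t kCapacity = 8;

// Hypothetical stack layout: data[0..top-1] are the live elements.
struct Stack {
    std::array<int, kCapacity> data{};
    std::size_t top = 0;
};

// Hypothetical set layout: data[0..size-1] holds distinct elements
// in no particular order.
struct Set {
    std::array<int, kCapacity> data{};
    std::size_t size = 0;
};

// Stack equality: same top, and the live elements match position by position.
// Slots at index >= top are ignored.
bool stackEqual(const Stack& x, const Stack& y) {
    if (x.top != y.top) return false;
    return std::equal(x.data.begin(), x.data.begin() + x.top, y.data.begin());
}

// Set equality: same size, and one element sequence is a permutation of the
// other. std::is_permutation may compare every pair in the worst case, which
// is one way to see why the notes call this test "not easy".
bool setEqual(const Set& a, const Set& b) {
    if (a.size != b.size) return false;
    return std::is_permutation(a.data.begin(), a.data.begin() + a.size,
                               b.data.begin());
}
```

For example, two sets holding 1, 2, 3 and 3, 1, 2 compare equal under setEqual, while two stacks with the same values in a different order do not compare equal under stackEqual. Sorting both sets first, or using a hashed representation, would be alternative ways to make the set test cheaper.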
