Advanced Programming Languages
Lecture week 7
Data structure
is a data object containing other data objects as elements or components
Common types: arrays, records, strings, stacks, lists, pointers, sets, files
Some are programmer-defined/some system-defined
Specification
Major Attributes
1. # components (fixed as in structs and arrays vs. variable as in lists and strings)
2. type of each component (homogeneous or heterogeneous)
3. selection of components (subscript, identifier)
4. maximum # of components (variable size)
5. organization of components (linear vs. multi-dimensional)
Operations
1. component selection (random vs. sequential)
2. whole-data-structure ops (r1 = r2 or A ∪ B)
3. insert/delete components
4. create/destroy structures
Implementation
Storage representations
1. sequential (fixed size): single contiguous block of storage for both descriptor & components
2. linked (variable size): stored in several noncontiguous blocks, linked by pointers
Operation implementation
Sequential
1. access 1st (base address + offset)
2. access next (add size of component)
Random
1. access ith element (base address + i * component size); fixed-size components only
2. hash coding
Linked
1. access 1st (base address)
2. access next (follow chain of pointers)
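The access rules above can be sketched in C++. This is only an illustration under simple assumptions (a flat address space, fixed-size components); the names element_address, Cell, and nth are invented for the example and do not come from the lecture.

```cpp
#include <cstddef>
#include <cstdint>

// Random access into a sequential (contiguous) structure:
// address of element i = base address + i * component size.
std::uintptr_t element_address(std::uintptr_t base, std::size_t i,
                               std::size_t component_size) {
    return base + i * component_size;
}

// Linked representation: each cell stores a pointer to the next one.
struct Cell {
    int value;
    Cell* next;
};

// Reaching the i-th cell means following i pointers from the first cell.
// Returns nullptr if the chain has fewer than i + 1 cells.
const Cell* nth(const Cell* first, std::size_t i) {
    while (first != nullptr && i > 0) {
        first = first->next;
        --i;
    }
    return first;
}
```

Random access costs one multiplication and one addition regardless of i, while the linked version takes time proportional to i.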
Storage Management & Data Structures
Lifetime: begins when object is bound to location; ends when binding is dissolved
Individual elements of a variable sized structure have individual lifetimes
- inserted into/deleted from structure
Access path: created at start of object’s lifetime
- by association of data object with a name in some referencing environment
- by storage of a pointer in some other existing structure; then new object is a component of older one
New access paths may be created during lifetime
- passing object as argument to subprogram
- creating new pointers to it
So several access paths can exist to a data object
Access paths are destroyed in various ways
- return from subprogram
- assigning new value to pointer variable
Two major problems in storage mgmt
- Garbage: all access paths to object are destroyed, but object still exists
o Object no longer accessible from program, but its storage can’t be reused
- Dangling references: access paths to objects that no longer exist
o May compromise integrity of run time structure during execution
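A small C++ sketch may help make access paths concrete. It uses std::shared_ptr and std::weak_ptr, which are an assumption of this example and not part of the lecture: an owning pointer stands for an access path that keeps the object alive, and a weak pointer stands for a path that can outlive the object. With raw pointers the same situations arise silently: overwriting the last pointer to a new'ed object leaves garbage, and deleting an object that another pointer still names leaves a dangling reference.

```cpp
#include <memory>

// Counts the owning access paths to one heap object.
long count_paths_after_copy() {
    std::shared_ptr<int> p = std::make_shared<int>(42); // first access path
    std::shared_ptr<int> q = p;                         // second access path
    return p.use_count();                               // two owners exist here
}

// Destroys every owning path, then asks the remaining non-owning path
// whether the object it refers to still exists.
bool observer_sees_dead_object() {
    std::weak_ptr<int> observer;
    {
        std::shared_ptr<int> owner = std::make_shared<int>(7);
        observer = owner;      // non-owning access path
    }                          // owner goes out of scope: the int is freed
    return observer.expired(); // true: following this path would dangle
}
```

The function names are hypothetical. The point is that the object's lifetime ends when its last owning path is destroyed, so no garbage is left, and the weak path can detect that its target is gone instead of dereferencing freed storage.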
Structured Data Types
2 ways to join data so treated as unit:
1. array – elements identified by position (all same type)
2. record – elements identified by name (can be different types)
Array
Specification:
group of homogeneous elements with one name identified by position (all same type)
Implementation
stored in contiguous memory locations
C++/Java: referenced by pointer
Stored in row-major order (Pascal, Ada, C++, Java) or column-major order (Fortran)
Examples:
ADA: Table: array (1..10, 1..N+1) of float;
C++: char a [2*n];
Java: int i = 15;
byte TwoDArray[][] = new byte[256][i+1];
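The two storage orders mentioned above (rows first, as in C++, or columns first, as in Fortran) differ only in how a pair of subscripts is turned into an offset from the base address. A minimal sketch, assuming 0-based subscripts for both orders to keep the arithmetic simple (Fortran's default lower bound is 1); the function names are invented for the example.

```cpp
#include <cstddef>

// Offset, counted in elements, of a[i][j] in a rows x cols array.

// Row-major: the rows are stored one after another.
std::size_t row_major_offset(std::size_t i, std::size_t j,
                             std::size_t /*rows*/, std::size_t cols) {
    return i * cols + j;
}

// Column-major (Fortran-style): the columns are stored one after another.
std::size_t col_major_offset(std::size_t i, std::size_t j,
                             std::size_t rows, std::size_t /*cols*/) {
    return j * rows + i;
}
```

For a 3 x 4 array, element (1, 2) lies at offset 1 * 4 + 2 = 6 in row-major order and at offset 2 * 3 + 1 = 7 in column-major order.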
Array Issues (10)
1. syntax use [ ] vs ( ) - overload array delimiter?
2. dimensions (lower bound?, upperbound)
3. bounds dynamic (can you use expressions and functions)
4. type for index set (ordinal or only integer)
5. range (can bounds be undefined)?
Example: ADA: type sequence is array (integer range <>) of float;
P : sequence (1..1000);
Java:
int a [ ] [ ] = new int [10] [ ];
at least 1st specifier must have #elements
6. multidimensional
7. nonrectangular
Example: Java:
int [ ] [ ] TwoDim = { {1,2}, {3, 4, 5}, {5, 6, 7, 8}};
8. array slices
Example: PL/1: W(3, *)
ALGOL: W[3, ]
Java: int a[][] = new int[10][15];
use a[1]
9. array initialization
Example: java – see issue 7’s example
ADA type Tmatrix is array (integer range 1..2, integer range 1..2) of real;
A: Tmatrix := ((10,20), (20, 40));
Or
A: Tmatrix := (1 => (1 => 10, 2 => 20),
2 => (1 => 20, 2 => 40));
10. operations
FORTRAN, ALGOL60 – elements act as scalars
APL, PL/1, ALGOL68 – assignment of arrays and subarrays of same size and type elements
ADA, PASCAL – assign entire arrays of same size and type (not same size and structure!)
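Issues 7, 8, and 10 can be tried out in C++ if std::vector is used in place of built-in arrays. That substitution is an assumption of this sketch: built-in C++ arrays are rectangular and cannot be assigned as a whole, whereas a vector of vectors behaves like Java's array of arrays. The function names are made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// A nonrectangular ("jagged") table: each row is its own vector, so rows
// may have different lengths (same data as the Java TwoDim example).
std::vector<std::vector<int>> make_jagged() {
    return {{1, 2}, {3, 4, 5}, {5, 6, 7, 8}};
}

// A row slice is simply one of the inner arrays.
const std::vector<int>& row_slice(const std::vector<std::vector<int>>& t,
                                  std::size_t row) {
    return t.at(row); // at() checks the bound and throws std::out_of_range
}

// Whole-array operations: vectors can be copied and compared as units.
bool copy_equals_original(const std::vector<std::vector<int>>& t) {
    std::vector<std::vector<int>> copy = t; // element-wise (deep) copy
    return copy == t;                       // element-wise comparison
}
```

Only row slices come for free with this representation; a column slice would have to be assembled element by element.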
ADA:
1. use ( ) overloaded
2. LB, UB
3. dynamic
4. any ordinal type
5. yes
6. yes
7. no
8. yes (if array of array)
9. yes
10. yes (:=)
C++:
1. Use [ ] not overloaded
2. UB only (0 is LB)
3. Dynamic
4. Integer
5. No
6. Yes
7. No
8. No
9. Yes
10. No (arrays ref. by pointers)
Java:
1. Use [ ] not overloaded
2. UB only (0 is LB)
3. dynamic
4. Integer
5. Yes
6. Yes
7. Yes
8. Yes (Java uses array of arrays)
9. Yes
10. Yes, clone and = test
Records (structs)
Specification
– structured type composed of heterogeneous data
– describe attributes and their types in a declaration
Implementation
stored in contiguous memory locations
Examples:
ADA: type complex is record
Rpart, Ipart : float;
end record;
c : complex;
c.Ipart := 5.2;
C++: struct complex
{ float rpart, ipart;
};
complex c;
c.ipart = 5.2;
Java: n/a
Record Issues (6):
1. how to reference components of a record
2. distinction of field ids w/ ids declared outside record (“scope” of a record)
3. initialization
4. a with statement to avoid continually referring to fields by qualifying record name
5. variant records (parameter to record declaration which is called a discriminant)
6. name equivalence vs. structure equivalence
ADA:
1. dot notation
2. “scope” of record
3. yes
ibm : stock;
ibm := (name => "IBM", price => 5.25);
4. ?
5. discriminants can be changed only by assigning the entire record
Examples:
type TMonth (length: integer range 1..31) is
Record
Name: string(1..3);
Days : array (1..length) of float;
End record;
Jan : TMonth(31);
6. name equivalence (passing and assignment)
C++:
1. Dot notation
2. “scope” of record
3. Yes
struct TCircle
{ float radius;};
TCircle c = {5};
4. no
5. No, but does have templates
template <class item>
struct Node
{ item data;
Node * link;
};
6. Name equivalence
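The two C++ record examples above can be combined into a compilable sketch. The length helper is a hypothetical addition, included only to show fields being selected with dot and arrow notation; it is not from the lecture.

```cpp
#include <cstddef>

// A record with one field; it can be initialized with an aggregate.
struct TCircle {
    float radius;
};

// A generic record: the component type is a template parameter.
template <class Item>
struct Node {
    Item data;
    Node* link; // pointer to the next node, or nullptr at the end
};

// Counts the nodes reachable from head by following the link fields.
template <class Item>
std::size_t length(const Node<Item>* head) {
    std::size_t n = 0;
    for (; head != nullptr; head = head->link) {
        ++n;
    }
    return n;
}
```

A declaration such as Node<int> a{1, nullptr}; initializes the fields in declaration order, just as TCircle c = {5}; does.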
Strings
Early FORTRAN – strings used in FORMAT statements for output
Later – A-format included, so I/O of strings
ALGOL60 – string constants enclosed by quotes
Later mid-60s
variable could hold character strings
relational operators on strings (assuming collating sequence)
set of operations (length, index, concat, substr)
implementation required dynamic (de)allocation because length is not determinable at compile time (the reason strings were not in FORTRAN)
Specification – sequence of characters
Operations: assignment, concat, relational ops, subscripting,…
Implementation – contiguous memory locations (like arrays)
Examples:
ADA: type string is array (positive range <>) of character; -- predefined
MyString : string (1..30);
C++: char *s;
char s[10];
String Issues (3):
1. ways to declare (constant length, varying length, COBOL’s PICture clause)
2. is a char variable compatible with a string of 1 char
3. operations
ADA:
1. varying
2. implementation dependent
3. yes
C++:
1. Varying and constant
2. No
3. Library
Java:
1.
2. No
3. Yes (java.lang.String)
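The string operations listed earlier (length, index, concat, substr, and the relational operators) are all available from the C++ library type std::string. A brief sketch follows; the wrapper function names are invented so that each operation can be seen in isolation.

```cpp
#include <cstddef>
#include <string>

// Concatenation with +.
std::string concat(const std::string& a, const std::string& b) {
    return a + b;
}

// Substring of len characters starting at 0-based position pos.
std::string substring(const std::string& s, std::size_t pos, std::size_t len) {
    return s.substr(pos, len);
}

// Relational operators compare lexicographically by character code,
// which is the collating sequence assumed here.
bool comes_before(const std::string& a, const std::string& b) {
    return a < b;
}

// Index of the first occurrence of pattern, or std::string::npos if absent.
std::size_t index_of(const std::string& s, const std::string& pattern) {
    return s.find(pattern);
}
```

Because std::string manages its own storage, the dynamic allocation that varying-length strings require happens inside the library rather than in user code.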
Pointers
Specification – a pointer is a reference to some object
Pointer variable is an identifier whose value is a reference
Implementation – hardware (address)
Examples:
ADA: type handle is access string;
Moniker : handle := new string'("TOXIC AVENGER");
C++: int *p;
p = new int;
Pointer Issues (7):
1. aliasing
2. reclamation of garbage
3. syntactical denotations
4. pointers point to only objects of a single type
5. operations
6. dispose/delete
7. pointer arithmetic
ADA:
1. yes
2. no
3. context (dereferencing automatic for Rvalue)
4. yes
5. assignment, = test, dereferencing, new
6. no (too time consuming)
(no dispose, no garbage collection!)
7. no (unless you disable type checking)
C++:
1. Yes
2. No
3. p is pointer, *p is object p points to
4. Yes
5. Assignment, == test, dereferencing, new
6. Yes (delete)
7. yes
Java: (references)
1. yes
2. yes
3. context
4. yes
5. fetch
6. auto garbage collection
7. no
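Three of the C++ answers above (aliasing, delete, and pointer arithmetic) are easy to observe directly. A minimal sketch with hypothetical function names:

```cpp
// Aliasing: two pointers name the same object, so a write through one
// is visible through the other.
int write_through_alias() {
    int* p = new int(1);
    int* q = p;     // q is a second access path to the same int
    *q = 99;
    int seen = *p;  // reads the value written through q
    delete p;       // explicit reclamation; p and q both dangle after this
    p = nullptr;    // clear both paths so they cannot be followed by mistake
    q = nullptr;
    return seen;
}

// Pointer arithmetic: first + 2 advances by two elements, not two bytes.
int third_element(const int* first) {
    return *(first + 2);
}
```

Setting the pointers to nullptr after delete is a convention, not something the language enforces; any alias that is forgotten remains a dangling reference.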
Defining new data types
Type definition (C, Pascal, ADA) but not complete ADT, doesn’t include operations
2 Design Issues
1. when are two types “the same”
2. when are two objects of the same type “equal” (deals with Rvalue)
1. name equivalence vs. structural equivalence
(Pascal, Ada, C++)
Disadvantages of name equivalence:
- var w: array[1..10] of real; gives w an anonymous type, so w cannot be passed as a parameter and its type cannot be redeclared in a subprogram
Disadvantages of structural equivalence:
- costly to determine
- types may be inadvertently equivalent, so mixing them is an error that is not checked
- the rule is ambiguous: identical field ids (or just types and order)? same # of components vs. same subscript range?
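Name equivalence in C++ can be checked at compile time. In this sketch the types Meters, Feet, and Distance and the function in_meters are hypothetical; std::is_same is the standard type trait from <type_traits>.

```cpp
#include <type_traits>

struct Meters { double value; };
struct Feet   { double value; }; // same structure as Meters, different name

// Structs follow name equivalence: identical fields do not make one type.
static_assert(!std::is_same<Meters, Feet>::value,
              "Meters and Feet are distinct types");

// An alias introduces a new name for the same type, not a new type.
using Distance = Meters;
static_assert(std::is_same<Distance, Meters>::value,
              "Distance is just another name for Meters");

// Accepts only Meters (or its alias); passing a Feet does not compile.
double in_meters(Meters m) { return m.value; }
```

Under structural equivalence a Feet value could be passed to in_meters without complaint, which is the inadvertent-equality problem noted above.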
2. data object equality (determine conditions for equality)
x, y : stack;
Equality 1. x.top = y.top
2. For every 0 ≤ I ≤ top-1 : x.data[I] = y.data[I] EASY!
A, B : set;
Equality 1. A.size = B.size
2. A.data[0]..A.data[size-1] is a permutation of B.data[0]..B.data[size-1]
NOT EASY TO DO! Ada supplies a default = if none is provided.
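Both equality tests can be written compactly in C++ with the standard algorithms std::equal and std::is_permutation. The sketch assumes array-based representations with a fixed capacity and a set that holds no duplicates; the struct layouts and the constant kCapacity are invented for the example.

```cpp
#include <algorithm>
#include <cstddef>

const std::size_t kCapacity = 8; // hypothetical fixed capacity

struct Stack {
    int data[kCapacity];
    std::size_t top; // number of elements in use
};

struct Set {
    int data[kCapacity];
    std::size_t size; // number of elements in use (no duplicates)
};

// Stacks: same height and the same element at every used position.
bool equal(const Stack& x, const Stack& y) {
    return x.top == y.top &&
           std::equal(x.data, x.data + x.top, y.data);
}

// Sets: same size and the elements of one are a permutation of the other.
bool equal(const Set& a, const Set& b) {
    return a.size == b.size &&
           std::is_permutation(a.data, a.data + a.size, b.data);
}
```

A field-by-field default comparison would give the wrong answer for sets, because it treats {1, 2, 3} and {3, 1, 2} as different and also compares the unused slots. The permutation test is correct but can take quadratic time in the worst case.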
Type definitions with parameters
Ada: type Section (maxSize: integer) is
Record
Rm : integer;
InstructorCode: integer;
ClassSize : integer range 0..MaxSize;
ClassRoll : array (1..maxSize) of studId;
End record;
X : section (100);
Y: section(25);
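A rough C++ counterpart of the parameterized Section type uses a template with a non-type parameter. This is an analogy under stated assumptions, not an equivalent: the template argument must be known at compile time and becomes part of the type, whereas an Ada discriminant is a value carried by the object. StudId and the enroll member are hypothetical additions.

```cpp
#include <array>
#include <cstddef>

using StudId = int; // hypothetical student id type

// MaxSize plays the role of the Ada discriminant, but it is fixed at
// compile time and is part of the type.
template <std::size_t MaxSize>
struct Section {
    int rm;
    int instructorCode;
    std::size_t classSize;                 // intended range: 0..MaxSize
    std::array<StudId, MaxSize> classRoll;

    // Adds a student if there is room; returns false when the section is full.
    bool enroll(StudId id) {
        if (classSize >= MaxSize) return false;
        classRoll[classSize++] = id;
        return true;
    }
};
```

With this definition, Section<100> x{}; and Section<25> y{}; correspond to X and Y above, but they are two distinct types, so one cannot be assigned to the other.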