Chapter 1
Preliminaries
ISBN 0-321-33025-0
Purpose of This Book
• To examine carefully the underlying concepts of the various constructs
and capabilities of programming languages
1-2
Chapter 1 Topics
• Reasons for Studying Concepts of
Programming Languages
• Programming Domains
• Language Evaluation Criteria
• Influences on Language Design
• Language Categories
• Language Design Trade-Offs
• Implementation Methods
• Programming Environments
1-3
Reasons for Studying Concepts of
Programming Languages (1)
• Increased ability to express ideas
– The language in which programmers develop software places
limits on the kinds of control structures, data structures, and
abstraction they can use
• Awareness of a wider variety of programming language
features can reduce such limitations in software
development
– Language constructs can be simulated in other languages
that do not support those constructs directly;
• however, the simulation is often
– less elegant
– more cumbersome
– less safe
1-4
Reasons for Studying Concepts of
Programming Languages (2)
• Improved background for choosing
appropriate languages
• Increased ability to learn new languages
– According to TIOBE, Java, C, and C++ were
the three most popular languages in use in
Feb. 2017.
1-5
Reasons for Studying Concepts of
Programming Languages (3)
• Better understanding of significance of
implementation
– Program bugs
– Performance
• Better use of languages that are already
known
• Overall advancement of computing
– Those in positions to choose languages were
not sufficiently familiar with programming
language concepts
1-6
Programming Domains (1)
• Scientific applications
– Large number of floating point computations
– Simple data structures
– Fortran
• Originally developed by IBM in the 1950s
• Business applications
– Facilities for
• producing elaborate reports,
• storing decimal numbers and character data,
• specifying decimal arithmetic operations
– COBOL
• The initial version appeared in 1960
• Artificial intelligence
– Symbols, consisting of names rather than numbers, are
manipulated
– LISP
• Appeared in 1959
1-7
Programming Domains (2)
• Systems programming
– The operating system and all of the
programming support tools of a computer
system are collectively known as its systems
software.
• Systems software is used almost continuously
and so must be efficient.
– A language for this domain must provide
• fast execution
• low-level features that allow the software
interfaces to external devices to be written
– C
• The Unix OS is written almost entirely in C
1-8
Programming Domains (3)
• Web Software
– Eclectic collection of languages: markup (e.g.,
XHTML), scripting (e.g., PHP), general-purpose
(e.g., Java)
1-9
Language Evaluation Criteria
• Readability: the ease with which
programs can be read and understood
• Writability: the ease with which a
language can be used to create programs
• Reliability: conformance to specifications (i.e., a program
performs to its specifications)
• Cost: the ultimate total cost
1-10
Why Readability Is Important
• Maintenance was recognized as a major
part of the software life cycle, particularly
in terms of cost.
• Ease of maintenance is determined in large
part by the readability of programs.
1-11
Characteristics Contributing to the
Readability - Simplicity
• Overall simplicity
– A manageable set of features and constructs
• Readability problems occur whenever the program’s author has
learned a different subset from that subset with which the
reader is familiar.
– Minimal feature multiplicity (more than one way of doing the same operation)
• For example, in Java the following ways could be used to
increment an integer variable
count = count + 1
count += 1
count++
++count
– Minimal operator overloading (a single operator symbol has
more than one meaning)
• Overloading may simplify a language by reducing the number
of operators; however, it can lead to reduced readability if users
are allowed to create their own overloading and do not do it
sensibly.
1-12
Excessive Simplicity
• Simplicity improves readability; however,
excessive simplicity may also reduce
readability.
– For example:
• The form and meaning of most assembly language
statements are models of simplicity.
• This very simplicity, however, makes assembly
language programs less readable. Because they lack
more complex control statements, program
structure is less obvious.
1-13
Characteristics Contributing to the
Readability – Control Statements
• The presence of well-known control structures (e.g., while
statement)
• A program that can be read from top to bottom is much
easier to understand than a program that requires the
reader to jump from one statement to some nonadjacent
statement in order to follow the execution order.
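To make the top-to-bottom point concrete, here is a minimal C sketch that computes the same sum twice: once with explicit jumps and once with a while statement. The function names are hypothetical and chosen only for this illustration.

```c
/* Sum 1..n using only goto: the reader must trace the jumps to find the loop. */
int sum_with_goto(int n)
{
    int i = 1, total = 0;
top:
    if (i > n) goto done;
    total += i;
    i++;
    goto top;
done:
    return total;
}

/* The same computation with a while statement reads from top to bottom. */
int sum_with_while(int n)
{
    int i = 1, total = 0;
    while (i <= n) {
        total += i;
        i++;
    }
    return total;
}
```

Both functions return the same value for any n; only the second makes the loop structure visible at a glance.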
1-14
Characteristics Contributing to the
Readability - Data Types and Structures
• The presence of adequate facilities for defining data
structures
• Example:
– If a language doesn’t have a Boolean type, then it may need to
use a numeric type as an indicator flag
timeOut =1
– In a language that provides a Boolean type, the
following statement is much more readable
timeOut = true
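A small C sketch of the same contrast, assuming a C99 compiler (which provides bool through stdbool.h). The function names are hypothetical.

```c
#include <stdbool.h>

/* Numeric flag: the reader must know the convention
   (does 1 mean "timed out", or is it a count of seconds?). */
int should_abort_numeric(int time_out)
{
    return time_out == 1;
}

/* Boolean flag: the type itself says that only true and false are meaningful. */
bool should_abort_boolean(bool time_out)
{
    return time_out;
}
```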
1-15
Characteristics Contributing to the
Readability - Syntax Considerations (1)
• Identifier forms: flexible composition
– Restricting identifiers to very short lengths detracts from
readability
1-16
Characteristics Contributing to the
Readability - Syntax Considerations (2)
• Special words
– Program appearance and thus program readability are strongly
influenced by the forms of a language’s special words
– An important issue is whether the special words of a language can be
used as names for program variables
• Methods of forming compound statements
– C and its descendants use braces to specify compound
statements.
– All of these languages suffer because statement groups are
always terminated in the same way, which makes it difficult to
determine which group is being ended when an end or }
appears.
1-17
Characteristics Contributing to the
Readability - Syntax Considerations (3)
• Form and meaning: self-descriptive constructs, meaningful
keywords
1-18
Evaluation Criteria: Writability
• Writability is a measure of how easily a
language can be used to create programs
for a chosen problem domain.
• Most of the language characteristics that
affect readability also affect writability.
– This follows directly from the fact that the
process of writing a program requires the
programmer frequently to reread the part of the
program that is already written
1-19
Writability Comparison between Two
Different Languages
• It is simply not reasonable to compare the
writability of two languages in the realm of
a particular application when one was
designed for that application and the other
was not.
1-20
Characteristics Contributing to the
Writability - Support for Abstraction
• Abstraction - the ability to define and
use complex structures or operations in
ways that allow details to be ignored
• Programming languages can support two
distinct categories of abstraction:
– Process
– Data
1-21
Process Abstraction
• A simple example of process abstraction is
the use of a subprogram to implement a
sort algorithm that is required several
times in a program.
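As an illustration of process abstraction, the sketch below wraps a sort algorithm in a C subprogram. The name sort_ints and the choice of insertion sort are assumptions made for this example; callers only need to know that the array ends up in ascending order.

```c
#include <stddef.h>

/* Sorts values[0..count-1] into ascending order (insertion sort).
   Callers can ignore how the sorting is done. */
void sort_ints(int *values, size_t count)
{
    for (size_t i = 1; i < count; i++) {
        int key = values[i];
        size_t j = i;
        /* Shift larger elements one slot to the right. */
        while (j > 0 && values[j - 1] > key) {
            values[j] = values[j - 1];
            j--;
        }
        values[j] = key;
    }
}
```

Every place in a program that needs sorting can call sort_ints instead of repeating the loops, and the algorithm can later be replaced without touching the callers.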
1-22
Data Abstraction (1)
[Wikipedia]
• Data abstraction enforces a clear separation
between the abstract properties of a data type and
the concrete details of its implementation.
• The abstract properties are those that are visible to
client code that makes use of the data type—the
interface to the data type—while the concrete
implementation is kept entirely private, and indeed
can change, for example to incorporate efficiency
improvements over time.
• The idea is that such changes are not supposed to
have any impact on client code, since they involve
no difference in the abstract behavior.
1-23
Data Abstraction (2)
• For example, one could define an abstract data
type called lookup table which uniquely associates
keys with values, and in which values may be
retrieved by specifying their corresponding keys.
• Such a lookup table may be implemented in
various ways: as
– a hash table
– a binary search tree
or
– even a simple linear list of (key:value) pairs.
• As far as client code is concerned, the abstract
properties of the type are the same in each case.
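The lookup table idea can be sketched in C with an opaque type. Everything here (the names LookupTable, table_put, table_get, and the use of string keys with int values) is an assumption made for illustration; the implementation shown is the simple linear list, and client code that uses only the interface would not change if it were swapped for a hash table or a binary search tree.

```c
#include <stdlib.h>
#include <string.h>

/* ---- Interface: all that client code is meant to rely on ---- */
typedef struct LookupTable LookupTable;   /* opaque: layout is hidden */
LookupTable *table_create(void);
int  table_put(LookupTable *t, const char *key, int value);  /* 0 on success, -1 on failure */
int  table_get(const LookupTable *t, const char *key, int *value_out);  /* 1 if found, else 0 */
void table_destroy(LookupTable *t);

/* ---- One possible implementation: a linear list of (key:value) pairs ---- */
typedef struct Entry {
    char *key;
    int value;
    struct Entry *next;
} Entry;

struct LookupTable { Entry *head; };

LookupTable *table_create(void)
{
    return calloc(1, sizeof(LookupTable));   /* head starts as NULL */
}

int table_put(LookupTable *t, const char *key, int value)
{
    for (Entry *e = t->head; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {      /* key already present: overwrite */
            e->value = value;
            return 0;
        }
    }
    Entry *e = malloc(sizeof *e);
    if (e == NULL) return -1;
    e->key = malloc(strlen(key) + 1);
    if (e->key == NULL) { free(e); return -1; }
    strcpy(e->key, key);
    e->value = value;
    e->next = t->head;                       /* insert at the front */
    t->head = e;
    return 0;
}

int table_get(const LookupTable *t, const char *key, int *value_out)
{
    for (const Entry *e = t->head; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            *value_out = e->value;
            return 1;
        }
    }
    return 0;
}

void table_destroy(LookupTable *t)
{
    if (t == NULL) return;
    Entry *e = t->head;
    while (e != NULL) {
        Entry *next = e->next;
        free(e->key);
        free(e);
        e = next;
    }
    free(t);
}
```

In a larger program the interface part would typically live in a header file and the implementation in a separate source file, so that clients cannot depend on the Entry layout.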
1-24
Data Abstraction (3)
• A binary tree
– Fortran 77 – use integer arrays to implement
– C++ and Java – use a class with two pointers (or
references) and an integer
1-25
Characteristics Contributing to the
Writability - Expressivity
• A set of relatively convenient ways of
specifying operations
– Example:
• the inclusion of the for statement in many modern
languages makes writing counting loops easier
than with the use of while.
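A short C sketch of the comparison, with hypothetical function names: the same counting loop written with while and with for.

```c
/* Counting loop with while: initialization, test, and update
   are spread over three different places. */
int sum_squares_while(int n)
{
    int total = 0;
    int i = 1;
    while (i <= n) {
        total += i * i;
        i++;
    }
    return total;
}

/* Counting loop with for: all three parts sit together in the loop header. */
int sum_squares_for(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i * i;
    }
    return total;
}
```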
1-26
Evaluation Criteria: Reliability
• A program is said to be reliable if it
performs to its specifications under all
conditions.
1-27
Characteristics Contributing to the
Reliability – Type Checking
• Testing for type errors in a given program, either by the compiler or during
program execution.
– Run-time type checking is expensive
– Compile-time type checking is more desirable
– The earlier errors in programs are detected, the less expensive it is to make the
required repairs
– Example
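#include <stdio.h>

/* a is unsigned, so in a < b the int b is converted to unsigned int as well. */
void greater_than(unsigned int a, int b)
{
    if (a < b)
        printf("a<b\n");
    else
        printf("a>b\n");
}

void bar(void)
{
    /* -1 is converted to a very large unsigned value, so this prints "a>b". */
    greater_than(-1, 2);
}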
1-28
Characteristics Contributing to the
Reliability – Exception Handling
• Exception handling
– Intercept run-time errors
and
– take corrective measures
and
– then continue the corresponding program’s
execution
1-29
Characteristics Damaging the Reliability –
Aliasing
• Presence of two or more distinct
referencing methods for the same memory
location
• It is now widely accepted that aliasing is a
dangerous feature in a programming
language
• Most programming languages allow some
kind of aliasing –
– for example, two pointers set to point to the
same variable.
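A minimal C sketch of why aliasing is considered dangerous. The function add_twice is hypothetical; its author presumably assumes that dst and src refer to different variables.

```c
/* Intended effect: *dst becomes *dst + 2 * (*src). */
void add_twice(int *dst, const int *src)
{
    *dst += *src;
    *dst += *src;   /* if dst and src are aliases, *src has already changed here */
}
```

With distinct variables x = 1 and y = 10, add_twice(&x, &y) leaves x at 21 as intended. With add_twice(&x, &x) and x = 1, the first statement makes x equal to 2 and the second makes it 4, not the 3 the intended formula would give.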
1-30
Characteristics Contributing to the
Reliability – Readability and Writability
• Readability and writability
– A language that does not support “natural” ways
of expressing an algorithm will necessarily use
“unnatural” approaches, and hence reduced
reliability
1-31
Evaluation Criteria: Cost
• Training programmers to use the language
• Writing programs
• Compiling programs
• Executing programs
• Language implementation system:
availability of free compilers
• Reliability: poor reliability leads to high
costs
• Maintaining programs
1-32
Evaluation Criteria: Others
• Portability
– The ease with which programs can be moved
from one implementation to another
• Generality
– The applicability to a wide range of applications
• Well-definedness
– The completeness and precision of the
language’s official definition
1-33
Influences on Language Design
• Computer Architecture
– Languages are developed around the prevalent
computer architecture, known as the von
Neumann architecture
• Programming Methodologies
– New software development methodologies (e.g.,
object-oriented software development) led to
new programming paradigms and by extension,
new programming languages
1-34
Von Neumann Architecture
• Most of the popular languages of the past 50 years
have been designed around the prevalent
computer architecture: Von Neumann architecture
• These languages are called imperative languages.
– Data and programs are stored in the same memory
– Memory is separate from CPU
– Instructions and data are transmitted from memory to
CPU
– Results of operations in the CPU must be moved back to
memory
• Nearly all digital computers built since the 1940s
have been based on the von Neumann architecture
1-35
The Motherboard of a Computer
1-36
The von Neumann Architecture
1-37
Central Features of Imperative Languages
• Variables: model memory cells
• Assignment statements: model piping
• Iteration is fast on von Neumann computers
because instructions are stored in adjacent
cells of memory and repeating the
execution of a section of code requires only
a simple branch instruction
1-38
Program Execution on a Von Neumann
Computer
• The execution of a machine code program
on a von Neumann architecture computer
occurs in a process called the
fetch-execute cycle.
• Each instruction to be executed must be
moved from memory to the processor.
• The address of the next instruction to be
executed is maintained in a register called
the program counter.
1-39
Fetch-Execute Cycle (on a von
Neumann Architecture)
initialize the program counter
repeat forever
    fetch the instruction pointed to by the counter
    increment the counter
    decode the instruction
    execute the instruction
end repeat
P.S.: the "decode the instruction" step in the algorithm
means the instruction is examined to determine what
action it specifies.
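The cycle can be sketched as a tiny C simulator. The three-instruction machine below (load, add, halt, with a single accumulator) is invented for this illustration and does not correspond to any real instruction set; a halt instruction is added so that the "repeat forever" loop can end.

```c
#include <stddef.h>

/* Hypothetical toy instruction set, invented for this sketch. */
enum { OP_HALT, OP_LOAD_IMMEDIATE, OP_ADD_IMMEDIATE };

typedef struct {
    int opcode;
    int operand;
} Instruction;

/* Runs the program stored in memory and returns the final accumulator value. */
int run(const Instruction *memory)
{
    int accumulator = 0;
    size_t program_counter = 0;                         /* initialize the program counter */

    for (;;) {                                          /* repeat forever */
        Instruction current = memory[program_counter];  /* fetch */
        program_counter++;                              /* increment the counter */
        switch (current.opcode) {                       /* decode */
        case OP_LOAD_IMMEDIATE:                         /* execute */
            accumulator = current.operand;
            break;
        case OP_ADD_IMMEDIATE:
            accumulator += current.operand;
            break;
        case OP_HALT:
        default:                                        /* unknown opcodes also stop the machine */
            return accumulator;
        }
    }
}
```

A program consisting of "load 2", "add 40", "halt" leaves 42 in the accumulator.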
1-40
Functional Language Programs
Executed on a Von Neumann Machine
• A functional language is one in which the primary
means of computation is applying functions to
given parameters.
• Programming can be done in a functional language
– without the kind of variables that are used in imperative
languages
– without assignment statements
and
– without iteration.
• Although many computer scientists have
expounded on the myriad benefits of functional
languages, it is unlikely that they will displace the
imperative language until a non-von Neumann
computer is designed that allows efficient
execution of programs in functional languages
1-41
Evolution of Programming Methodologies (1)
• 1950s and early 1960s:
– simple applications
– worry about machine efficiency
• 1970s:
– hardware costs decreased
– programmer costs increased
– larger and more complex problems were being solved by
computers
– Emphasis:
• structured programming
• top-down design and step-wise refinement
– Deficiency:
• Incompleteness of type checking
1-42
Evolution of Programming Methodologies (2)
• Late 1970s:
– shift from procedure-oriented to data-oriented
– emphasize data design, focusing on the use of abstract
data types to solve problems
– most languages designed since the late 1970s support
data abstraction
• Middle 1980s: Object-oriented programming
– data abstraction
• encapsulates processing with data objects
• controls access to data
– Inheritance
• enhances the potential reuse of existing software, thereby
providing the possibility of significant increases in
software development productivity
– dynamic method binding
• allows more flexible use of inheritance
• overloaded method
• overridden method
1-43
All of the evolutionary steps in software development
methodologies led to new language constructs to
support them.
1-44
Programming Language Categories
• Imperative
– Central features are variables, assignment statements, and
iteration
– Examples: C, Pascal
• Functional
– Main means of making computations is by applying functions to
given parameters
– Examples: LISP, Scheme
• Logic
– Rule-based (rules are specified in no particular order)
– Example: Prolog
• Object-oriented
– Data abstraction, inheritance, late binding
– Examples: Java, C++
1-45
Should Languages That Support Object-oriented
Programming Form a Separate Language Category?
• The author of this book does not consider
languages that support
object-oriented programming to form a
separate category of languages, because
both imperative languages and functional
languages support object-oriented
programming.
1-46
Subcategories of Imperative Languages
• Visual languages:
– e.g. Visual BASIC and Visual BASIC .NET
– These languages include capabilities for drag-and-drop generation of code segments.
– Once called fourth-generation languages
– Provide a simple way to generate graphical user
interfaces to programs.
• Scripting Languages
– e.g. Perl, JavaScript, and Ruby
1-47
A Typical Session in Microsoft Visual
Basic 6
1-48
Execution Order of Programs
• In an imperative language,
– an algorithm is specified in great detail
and
– the specific order of execution of the
instructions or statements must be included.
• In a rule-based language, however, rules
are specified in NO particular order
– The language implementation system must
choose an execution order that produces the
desired result.
1-49
Markup/Programming Hybrid Languages
• Markup languages are not programming languages, but are used to
specify the layout of information in Web
documents
– examples: XHTML, XML
– However, some programming capability has
crept into some extensions to XHTML and XML
1-50
Benefits of Modular Design
• Modular design brings with it great
productivity improvements.
– First of all, small modules can be coded quickly
and easily.
– Secondly, general purpose modules can be reused, leading to faster development of
subsequent programs.
– Thirdly, the modules of a program can be tested
independently, helping to reduce the time spent
debugging.
1-51
Language Design Trade-offs
• The programming language evaluation
criteria provide a framework for language
design; however, that framework is self-contradictory.
1-52
Instances of Language Design Trade-Offs
• Reliability vs. cost of execution
– Conflicting criteria
– Example: Java demands that all references to array elements
be checked to ensure that the index is within its legal range,
but that leads to increased execution costs
• Readability vs. writability
– Another pair of conflicting criteria
– Example: APL provides many powerful operators (and a
large number of new symbols), allowing complex
computations to be written in a compact program but at
the cost of poor readability
• Writability (flexibility) vs. reliability
– Another pair of conflicting criteria
– Example: C++ pointers are powerful and very flexible but
are difficult to use reliably
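The reliability vs. cost-of-execution trade-off can be sketched in C, which does not check array indices itself. The two helper functions are hypothetical; the checked version mimics, in spirit only, the range check that Java performs on every array reference.

```c
#include <stddef.h>

/* Unchecked access, as in plain C: no extra work, but an
   out-of-range index leads to undefined behavior. */
int get_unchecked(const int *array, size_t index)
{
    return array[index];
}

/* Checked access: one extra comparison on every reference.
   Returns 1 and stores the element on success, 0 if index is out of range. */
int get_checked(const int *array, size_t length, size_t index, int *value_out)
{
    if (index >= length) {
        return 0;            /* report the error instead of reading out of bounds */
    }
    *value_out = array[index];
    return 1;
}
```

The extra comparison is the execution cost; the guarantee that an illegal index is always detected is the reliability gained.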
1-53
Primary Components of a Computer
• Internal Memory
– Used to store data and program
• Processor
– a collection of circuits that provides a
realization of a set of primitive operations, or
machine instructions, such as those for
arithmetic and logic operations.
1-54
The Machine Language of a Computer
• Is its set of instructions.
• Is the ONLY language that the hardware of
the computer can understand directly.
• Provides the most commonly needed
primitive operations.
• Programs written in high-level languages
require system software (language
implementation systems) to translate them
into corresponding machine language
versions.
1-55
Operating Systems
• Supply higher-level primitives than those
of the machine language.
• These primitives provide
– system resource management
– input and output operations
– a file management system
– text and/or program editors
– a variety of other commonly needed functions
1-56
Language Implementation Systems and
Operating Systems
• Because language implementation systems need
many of the operating system facilities,
they utilize the operating system to do their
work rather than develop their own code to
interact with the hardware directly.
1-57
Implementation Methods
• Compilation
– Programs are translated into machine
language, which can be executed directly
on the computer
• Pure Interpretation
– Programs are interpreted by another
program known as an interpreter
• Hybrid Implementation Systems
– A compromise between compilers and
pure interpreters
1-58
Compilation
• Translate high-level program (source
language) into machine code (machine
language)
• Slow translation, fast execution
1-59
Phases of Compilation Process
• lexical analysis: gathers the characters of the source
program into lexical units.
– lexical units: identifiers, special words, operators and
punctuation symbols
• syntax analysis: transforms lexical units into parse
trees which represent the syntactic structure of
program
• intermediate code generation: translates the source
program into an intermediate-language version
– semantic analysis: checks for errors that are difficult if not
impossible to detect during syntax analysis, such as type
errors.
• code generation: machine code is generated
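The first phase, lexical analysis, can be sketched in C as a function that returns one lexical unit at a time. The token categories and the name next_token are assumptions for this illustration; a real lexical analyzer would also recognize special words, multi-character operators, comments, and literals.

```c
#include <ctype.h>
#include <stddef.h>

typedef enum { TOKEN_END, TOKEN_IDENTIFIER, TOKEN_NUMBER, TOKEN_SYMBOL } TokenKind;

/* Reads one lexical unit starting at *cursor, copies its text into text
   (at most capacity - 1 characters), and advances *cursor past it. */
TokenKind next_token(const char **cursor, char *text, size_t capacity)
{
    const char *p = *cursor;
    size_t length = 0;
    TokenKind kind;

    while (isspace((unsigned char)*p)) p++;           /* skip white space */

    if (*p == '\0') {
        kind = TOKEN_END;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        kind = TOKEN_IDENTIFIER;                      /* letters, digits, underscores */
        while (isalnum((unsigned char)*p) || *p == '_') {
            if (length + 1 < capacity) text[length++] = *p;
            p++;
        }
    } else if (isdigit((unsigned char)*p)) {
        kind = TOKEN_NUMBER;                          /* a run of decimal digits */
        while (isdigit((unsigned char)*p)) {
            if (length + 1 < capacity) text[length++] = *p;
            p++;
        }
    } else {
        kind = TOKEN_SYMBOL;                          /* single-character operator or punctuation */
        if (length + 1 < capacity) text[length++] = *p;
        p++;
    }

    if (capacity > 0) text[length] = '\0';
    *cursor = p;
    return kind;
}
```

Calling next_token repeatedly on the text count = count + 1; yields an identifier, a symbol, an identifier, a symbol, a number, a symbol, and then the end marker. A syntax analyzer would consume this stream to build a parse tree.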
1-60
Optimization
• Optimization, which improves programs (usually in their intermediate
code version) by making them smaller or faster or
both, is often an optional part of compilation.
• Some compilers are incapable of doing any
significant optimization.
• Optimization may
– omit some code in your program
– change the execution order of code in your program
• P.S.: Sometimes, especially when synchronization
between processes is required, these changes may
introduce bugs in your programs that cannot be
detected by just checking the source code.
1-61
Optimization vs. Reliability
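[Figure: process 1 and process 2 both access the variable a in shared memory]
Before optimization:
    int a;
    int foo()
    {
        a = 1;
        if (a > 0)
            a = 3;
        else
            a = -1;
        return a;
    }
After optimization:
    int a;
    int foo()
    {
        a = 3;
        return a;
    }
If a cannot be changed by another process, the optimization
improves performance. Otherwise (a is volatile, in the sense that it may
change at any time), the optimized code can behave differently from the
source, which introduces a race-condition problem. In C, declaring a with
the volatile qualifier prevents the compiler from removing these accesses.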
1-62
Symbol Table
• The symbol table serves as a database for
the compilation process.
• The primary contents of the symbol table
are
– the type and attribute information of each user-defined name in the program.
• P.S.: This information is placed in the symbol table
by the lexical and syntax analyzers and is used by
the semantic analyzer and the code generator.
1-63
The Compilation Process
1-64
User Program Supporting Code
• The machine language generated by a
compiler can be executed directly on the
hardware; however, it must nearly always
be run along with some other code.
• Most user programs also require functions
from the OS.
• Among the most common of these are
functions for input and output.
1-65
Linking Operation
• Before the machine language programs
produced by a compiler can be executed,
the required functions from the OS must be
found and linked to the user program.
• The linking operation connects the user program
to the system functions by placing the
addresses of the entry points of the system
functions in the calls to them in the user
program.
1-66
Combine a User Program and All
Supporting Functions Together
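[Figure: compilation turns the source main() { printf() } into machine code
in which main: contains a call to printf; linking places the address of
printf in that call (call add_of_printf); loading places main: and printf:
in the address space of a process (the figure marks the address 0x40ffffff)]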
1-67
Linking Operation
• Load module (executable image): the user
and system code together
• Linking and loading (linking): the operation
of collecting system functions and linking
them to user programs
– Accomplished by a systems program called a linker
1-68
Libraries
• In addition to system functions, user programs
must often be linked to previously
compiled user functions that reside in
libraries.
• The linker not only links a given program to
system functions, it may also link it to
other user functions.
1-69
Von Neumann Bottleneck
• Connection speed between a computer’s
memory and its processor determines the
speed of a computer
• Program instructions often can be executed
a lot faster than the above connection
speed; the connection speed thus results in
a bottleneck
• Known as von Neumann bottleneck; it is the
primary limiting factor in the speed of
computers
1-70
Interpreter
• Programs are interpreted by another
program called an interpreter, with no
translation whatever.
• The interpreter program acts as a software
simulation of a machine whose fetch-execute cycle deals with high-level
language program statements rather than
machine instructions.
• This software simulation obviously provides
a virtual machine for the language.
1-71
Advantages of Interpretation
• Allows easy implementation of many
source-level debugging operations, because all
run-time error messages can refer to
source-level units.
– For example, if an array index is found to be out of
range, the error message can easily indicate the
source line and the name of the array.
1-72
Disadvantages of Interpretation (1)
• Slower execution (10 to 100 times slower
than compiled programs)
– The decoding of high-level language
statements is far more complex than the decoding of machine
language instructions.
– Regardless of how many times a statement is
executed, it must be decoded every time.
– Therefore, statement decoding, rather than the
connection between the processor and memory,
is the bottleneck of a pure interpreter.
1-73
Disadvantages of Interpretation (2)
• Often requires more space.
– In addition to the source program, the symbol
table must be present during interpretation
– The source program may be stored in a form
designed for easy access and modification
rather than one that provides for minimal size
1-74
Popularity of Interpretation
• Some simple early languages of the 1960s
(APL, SNOBOL, and LISP) were purely
interpreted.
• By the 1980s, the approach was rarely used
on high-level languages.
• In recent years, pure interpretation has
made a significant comeback with some
Web scripting languages, such as
JavaScript and PHP, which are now widely
used.
1-75
Pure Interpretation Process
1-76
Hybrid Implementation Systems
• A compromise between compilers and pure
interpreters
• A high-level language program is
translated to an intermediate language that
allows easy interpretation
• Faster than pure interpretation
1-77
Example (1)
• Perl programs
– are partially compiled to detect errors before
interpretation to simplify the interpreter.
1-78
Example (2)
• Initial implementations of Java
– were all hybrid
– the intermediate form, byte code, provides portability to
any machine that has a byte code interpreter and the Java class
library.
– There are now systems that translate Java byte code into
machine code for faster execution.
1-79
Java Bytecode Example
[wikipedia]
[Figure: Java code (*.java) is translated by a Java compiler (javac) into Java bytecode (*.class)]
1-80
Java Virtual Machine
[Wikipedia]
• A Java virtual machine (JVM) [Wikipedia][zhebel]
is an abstract computing machine.
– p.s.: computing machine ≡ computer
• There are three notions of the JVM:
– specification,
– implementation,
– and instance.
1-81
Java Virtual Machine Specification
[Wikipedia]
• The specification is a book that formally
describes what is required of a JVM
implementation.
• Having a single specification ensures all
implementations are consistent.
1-82
Java Virtual Machine Implementation
[Wikipedia]
• A JVM implementation is a computer
program that implements requirements of
the JVM specification.
1-83
Java Virtual Machine Instance[Wikipedia]
• An instance of the JVM is a process that
executes a computer program compiled
into Java bytecode.
1-84
Java Runtime Environment
[Wikipedia]
• The Oracle Corporation owns the Java
trademark.
• Oracle distributes the Java Virtual Machine
implementation HotSpot together with an
implementation of the Java Class Library.
• Together, the JVM and the Java Class Library are
called the Java Runtime Environment (JRE).
1-85
The java Command
[oracle]
• The java command starts a Java application.
– It does this by starting a Java runtime
environment, loading a specified class, and
calling that class's main method.
1-86
Java Class Library[Wikipedia]
• The Java Class Library (JCL) is a set of
dynamically loadable libraries that Java
applications can call at run time.
• Because the Java Platform is not dependent
on a specific operating system, applications
cannot rely on any of the platform-native
libraries.
• Instead, the Java Platform provides a
comprehensive set of standard class libraries,
containing the functions common to
modern operating systems.
1-87
Hybrid Implementation Process
1-88
Just-in-Time (JIT) Implementation
Systems
• Initially translate programs to an intermediate
language
• Then, during execution, compile intermediate
language methods into machine code when they
are called
• Machine code version is kept for subsequent calls
• JIT systems are widely used for Java programs
• .NET languages are implemented with a JIT system
1-89
Preprocessors
• A preprocessor is a program that processes
a program immediately before the program
is compiled.
1-90
Preprocessor Instructions
• Preprocessor instructions are embedded in
programs.
• Preprocessor instructions are commonly
used to specify that code from another file
is to be included.
– For example, the C preprocessor
instruction #include "myLib.c" causes the
preprocessor to copy the contents of myLib.c
into the program at the position of the
#include.
1-91
More Preprocessor instructions
• Other preprocessor instructions are used to
define symbols to represent expressions.
– For example, one could use
#define max(A, B) ((A) > (B) ? (A) : (B))
to determine the larger of two given expressions.
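Because the preprocessor substitutes text rather than values, an argument with a side effect can be evaluated more than once. The C sketch below illustrates this with a macro of the same shape, renamed MAX_MACRO here, and a hypothetical function max_function for comparison.

```c
#define MAX_MACRO(A, B) ((A) > (B) ? (A) : (B))

static int max_function(int a, int b)
{
    return a > b ? a : b;
}

/* Returns how many times i is incremented when i++ is passed to the macro. */
int increments_with_macro(void)
{
    int i = 5, j = 3;
    int result = MAX_MACRO(i++, j);    /* expands to ((i++) > (j) ? (i++) : (j)) */
    (void)result;
    return i - 5;                      /* i++ appears twice in the expansion */
}

/* Returns how many times i is incremented when i++ is passed to the function. */
int increments_with_function(void)
{
    int i = 5, j = 3;
    int result = max_function(i++, j); /* the argument is evaluated exactly once */
    (void)result;
    return i - 5;
}
```

With these values the macro version increments i twice and the function version once, which is one reason such macros are usually called only with side-effect-free arguments.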
1-92
Programming Environments
• The collection of tools used in software
development
• This collection may consist of only
– a file system
– a text editor
– a linker
– a compiler
• Or a programming environment may
include a large collection of integrated
tools, each accessed through a uniform
user interface.
1-93
Programming Environment Examples
• UNIX
– Provides a wide array of powerful support tools for
software production and maintenance in a variety of
languages.
– Nowadays often used through a GUI (e.g., CDE, KDE, or
GNOME) that runs on top of UNIX
• Borland JBuilder
– An integrated development environment for Java
• Microsoft Visual Studio.NET
– A large and elaborate collection of software development
tools, all used through a windowed interface.
– Used to program in C#, Visual BASIC.NET, JScript, J#,
or C++
1-94
Summary
• The study of programming languages is valuable for a number
of reasons:
– It increases our capacity to use different constructs
– It enables us to choose languages more intelligently
– It makes learning new languages easier
• Most important criteria for evaluating programming languages
include:
– Readability, writability, reliability, cost
• Major influences on language design have been machine
architecture and software development methodologies
• The major methods of implementing programming languages
are:
– compilation,
– pure interpretation,
– hybrid implementation
1-95