Introduction
Programming Language Concepts
1
Objective of the Course
1. Understand concepts of programming languages.
2. Become familiar with different language paradigms
and the kinds of problems they are suited for.
3. Learn to use useful tools.
4. Learn the basics of some widely used languages

C/C++

Scheme (functional language like Lisp)

Prolog

Python

other language, depending on your interest
2
Evaluation Criteria
- Class participation and attendance
- Homework
- Programming assignments
- Quizzes
- Exams

3
Your Responsibility

Perform the role of a student at a level suitable for a
first-tier university.

Behavior and activities:
1. Read the assigned text before each lecture
2. Do homework assignments early
3. Participate in class
4. Self-directed learning
5. Analyze your own understanding
4
Instructor's Responsibility

Provide clear guidance on the course of study

Make the requirements clear and unambiguous

Facilitate discussion, answer your questions

Provide helpful evaluation of your understanding

Listen to and respond to feedback
5
Why Tools?
"If the only tool you have is a hammer, then everything
looks like a nail."
6
What is a Programming Language?

A vocabulary, set of grammar rules, and associated
meanings for communication between people and
computers, or communication between computers.

Another definition: A programming language is a
notational system for describing computation in
machine-readable and human-readable form [K.
Louden].
7
Instructions to the CPU

A CPU executes instructions, and uses data, in binary
format. An executable computer program may look
like this:
0010110010100010    First instruction to CPU
0001011110000000    Second instruction
0010110000011000    an address for 2nd instruction
1101100001011001    3rd instruction
0011001010001011    4th instruction
0000011101001111    an argument (data) for 4th instruction
1101100001100001    another argument for 4th instruction
0010110100010001    5th instruction
0001011100001110    6th instruction
…etc…               … more instructions …
8
Running A Program
1. The program and data are loaded into main memory.
2. The address of the program's first instruction is placed
in a special register in the CPU: the program
counter (PC).
3. The CPU is told to fetch and execute that instruction.
[Figure: main memory and the CPU's PC register]
9
What instructions does the CPU
understand?
CPU instructions cause the CPU to carry out logic
which is designed (hardwired) into the CPU.
Most CPU instructions are very simple, such as:

LOAD memory_address TO some register
SAVE some register TO memory_address
MOVE some register TO another register
ADD some register TO another register
MULT some register TO another register
COMPARE register TO another register
TEST some register's value, jump to new instruction if true
JUMP TO new instruction (unconditional branch)
10
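A minimal sketch of how a CPU steps through such instructions, written in Python. The tuple format, the two registers A and B, the HALT instruction, and the `run` function are all made up for illustration; a real CPU decodes binary words, not Python tuples.

```python
def run(program, memory):
    """Execute a list of (opcode, operand, operand) tuples on a toy CPU."""
    regs = {"A": 0, "B": 0}
    pc = 0                       # program counter: index of the next instruction
    while pc < len(program):
        op, x, y = program[pc]   # fetch the instruction the PC points to
        pc += 1                  # by default, continue with the next one
        if op == "LOAD":         # LOAD memory address x TO register y
            regs[y] = memory[x]
        elif op == "SAVE":       # SAVE register x TO memory address y
            memory[y] = regs[x]
        elif op == "ADD":        # ADD register x TO register y
            regs[y] += regs[x]
        elif op == "MULT":       # MULT register x TO register y
            regs[y] *= regs[x]
        elif op == "JUMP":       # unconditional branch: overwrite the PC
            pc = x
        elif op == "HALT":       # hypothetical instruction to stop the toy CPU
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory

# y = 2 * x, with x at address 0, the constant 2 at address 1, y at address 2
memory = [10, 2, 0]
program = [
    ("LOAD", 0, "A"),
    ("LOAD", 1, "B"),
    ("MULT", "A", "B"),          # B = B * A
    ("SAVE", "B", 2),
    ("HALT", None, None),
]
result = run(program, memory)
print(result)
```

Note that a single high-level statement (y = 2*x) needs four of these simple instructions.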
Generations of Computer Languages

Machine language:
program as a series of binary instructions (0's and 1's)

Assembly language

"High level" or procedural languages:
this is the category we will study

4GL: application-specific languages
11
Machine Language

These simple, binary instructions are called machine
language instructions.

Writing machine language is slow and difficult, so
people invented a symbolic notation for machine
language, called assembly language.
Assembly Language
MOV AX,1F80H
PUSH AX
INT 21H
POP AX
Machine Language
0010110010100010
0001111110000000
0010110000011000
1101100001011001
0011001000011000
12
Assembly Language

Simple assembly language instructions have a 1-to-1
translation to machine language.

A program called an assembler converts assembly
language into machine language:
unix: as -o output_program input_program.s
Assembly Language
MOV AX,1F80H
PUSH AX
INT 21H
POP AX
assembler
Machine Language
0010110010100010
0001111110000000
0010110000011000
1101100001011001
0011001000011000
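The 1-to-1 translation can be sketched in a few lines of Python. This is a toy: the 4-bit opcode table and the "opcode + 12-bit operand" word format are invented for this example and are not the real x86 encoding.

```python
# Hypothetical 4-bit opcodes; real x86 encodings are different and more complex.
OPCODES = {"MOV": 0b0001, "PUSH": 0b0010, "POP": 0b0011, "INT": 0b0100}

def assemble(lines):
    """Translate each assembly line into one 16-bit word (as a bit string)."""
    words = []
    for line in lines:
        mnemonic, _, operand = line.partition(" ")
        value = int(operand, 16) if operand else 0   # operand is written in hex
        word = (OPCODES[mnemonic] << 12) | (value & 0xFFF)
        words.append(format(word, "016b"))
    return words

machine_code = assemble(["PUSH 1F", "INT 21", "POP"])
for word in machine_code:
    print(word)
```

Each input line produces exactly one output word, which is why an assembler is much simpler than a compiler.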
13
Viewing Assembly Code on a PC
1. Find a small .exe or .com file, e.g. diskcopy.com
2. Open a "Command Prompt" (DOS) window.
3. Enter "debug filename".
4. Enter "u" to unassemble and view the program.
5. Enter "q" to quit debug.
C:\WINDOWS\system32> debug diskcopy.com
- u
0B30:0000 0E        PUSH  CS
0B30:0001 1F        POP   DS
0B30:0002 BA0E00    MOV   DX,000E
0B30:0005 B409      MOV   AH,09
- q
(second column: machine code in hex; remaining columns: assembly language)
14
Macro Assembly Language

Simple assembly language instructions are still very
tedious and time consuming to input (“code”).

Macro assembly language adds the use of symbolic
variable names and compound instructions. A
compound instruction is an instruction that is converted
into several machine level instructions.
15
Macro Assembly Example
Here is a simple “Hello, World” program in macro
assembly language:
.stack
.data
message db "Hello world, I'm a program.", "$"
.code
main proc
mov ax,seg message
mov ds,ax
mov ah,09
lea dx,message
int 21h
mov ax,4c00h
int 21h
main endp
end main
16
Problems with Assembly Language
Programming is still difficult in macro assembly
language:
- the language does not match the way people think
- programs are long and difficult to understand
- program logic errors are common and difficult to correct
- it is difficult to divide a large problem into small problems
using assembly language
- assembly language is machine dependent: it won't run
on a different type of computer!
- you have to rewrite all your software whenever you
change to a different hardware platform
17
Low-level and high-level languages
Higher, more "abstract", languages aid programming
and thinking about algorithms.
- Separate program logic from hardware implementation

Which one is easier to understand?

Add 2 numbers in C:
int sum( int x, int y ) {
    int z = x + y;
    return z;
}

In Assembly Language:
        .globl sum
        .type  sum,@function
sum:
        pushl  %ebp
        movl   %esp, %ebp
        subl   $4, %esp
        movl   12(%ebp), %eax
        addl   8(%ebp), %eax
        movl   %eax, -4(%ebp)
        movl   -4(%ebp), %eax
        movl   %eax, %eax
        leave
        ret
18
High Level Languages
Higher level languages (also called third generation
languages) provide these features:
- English-like syntax
- descriptive names to represent data elements
- concise representation of complex logic
- use of standard math symbols:
total = quantity * price * (1 + taxrate);
- conditional execution and looping constructs:
if ( total > 0 ) then writeln('The total is ', total)
else writeln('No sale.');
- functional units, such as "functions" or "classes":
radius = sqrt( x*x + y*y ); // call sqrt function
19
High Level Language Benefits

programming is faster and easier

abstraction helps programmer think at a higher level

standardized languages enable code to be machine
independent, and reduces training effort

complex problems can be divided into smaller
problems, each coded and tested separately.

code re-use

provides standard programming interface for hardware
interactions, such as input and output.
20
Some higher level languages
There are several broad categories:

Imperative: FORTRAN, COBOL, BASIC, C, Pascal

Functional: Scheme, Lisp

Logic Languages: Prolog

Object-oriented:

Pure object-oriented: Java, Python

Object-oriented & imperative: C++, Perl, Visual Basic
21
Fourth Generation Languages
Fourth Generation Languages (4GL) are application
specific. They are tailored for a particular application.

SQL (structured query language) for databases

Postscript page description language for printers

PDF (portable document format) for online documents

HTML and PHP for World Wide Web content

Mathematica for symbolic math on a computer
Mathematica is a proprietary language; the others are
industry standards.
22
4GL Example: SQL
SQL is one of the most widely used 4GLs. Here are
some examples:
- Insert data into a database table:

INSERT INTO employees (id, lastName, firstName, Job)
VALUES (1445, 'Smith', 'John', 'manager');

- Choose data from a table based on criteria:
SELECT id, lastName, salary FROM employees
WHERE ( Job = 'manager' );
SQL is declarative: statements say what you want
done but not how to do it.
- Learn more about SQL at the Open Directory Project:

http://dmoz.org/Computers/Programming/Languages/SQL
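To try statements like these without installing a database server, one option is Python's built-in sqlite3 module. This is only a sketch: the table definition, the salary column values, and the second employee row are assumptions added so the example can run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")    # throwaway in-memory database
conn.execute(
    "CREATE TABLE employees (id INTEGER, lastName TEXT, firstName TEXT, "
    "Job TEXT, salary REAL)"
)
# Sample rows; the salaries and the second employee are invented for the demo.
conn.executemany(
    "INSERT INTO employees (id, lastName, firstName, Job, salary) "
    "VALUES (?, ?, ?, ?, ?)",
    [(1445, "Smith", "John", "manager", 50000.0),
     (1446, "Jones", "Ann", "clerk", 30000.0)],
)
# Declarative query: we say which rows we want, not how to search for them.
rows = conn.execute(
    "SELECT id, lastName, salary FROM employees WHERE Job = 'manager'"
).fetchall()
conn.close()
print(rows)
```

The program never says how to scan the table or in what order; the database engine decides that.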
23
Factors Influencing Language
Many factors influence the nature of a programming
language. Some major factors are:
- Environment: the capabilities of the machine that we
are communicating with.
  - early computers had very few instructions, no
  operating system, no memory management.
- Application Domain: language is influenced by the
kind of information we want to communicate.
  - most languages are used to specify algorithms.
  - a language may contain only statements for the kind
  of problem we are interested in solving, e.g., Fortran
  for scientific/numerical computing.
24
Factors Influencing Language (2)
- Methodology: an evolving discipline.
  - new approaches to constructing software are
  developed based on experience and need.
  - O-O emerged as a response to limitations of
  imperative languages in handling complexity.
- Preference, Economics, and Patronage:
  - languages have been influenced by the backing of
  large companies (Fortran by IBM, Java by Sun)
  - government patronage: Ada by U.S. DoD.
  - preferred by experts: "structured programming" and
  modularity made popular by C.S. professors
25
Language Paradigms
Imperative (procedural): traditional sequential
programming: program statements operate on
variables.

a variable represents data in a memory location.

characterized by variables, assignment, and loops.

the basic unit of an imperative program is the procedure
or function.
Examples: Algol, C, Pascal, Ada, FORTRAN
26
Language Paradigms
Functional: functions are first-class entities (can be used
the same as other forms of data); all execution is by
function evaluation.

characterized by recursion and functions as data

execution operates on values, not memory locations:
variables are not necessary!

a function can dynamically define and return a new
function: self-evolving code.
Examples: Lisp, Scheme, ML, Haskell
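"Functions as data" and "a function that returns a new function" can be tried in Python as well, even though Python is not a functional language. A small sketch; the names `make_multiplier` and `compose` are just illustrative.

```python
def make_multiplier(factor):
    """Return a new function that multiplies its argument by factor."""
    def multiply(x):
        return x * factor
    return multiply

def compose(f, g):
    """Return the function x -> f(g(x)); functions are passed in as data."""
    return lambda x: f(g(x))

double = make_multiplier(2)           # a function created at run time
triple = make_multiplier(3)
times_six = compose(double, triple)   # built from two other functions
values = list(map(times_six, [1, 2, 3]))
print(values)
```

No variable is reassigned anywhere: every result comes from evaluating a function on values.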
27
Language Paradigms
Logic: the program is declarative; it specifies what must be
true but not how to compute it.

logical inference is the basic control mechanism

no sequential operation

non-deterministic: a query may have many solutions or
none
Example: Prolog
28
Language Paradigms
Object-oriented: an object contains its own state (data)
and the functions that operate on that state.

program logic consists of instantiating objects and sending
messages between objects, with encapsulation and protection.

the computing model is mostly an extension of imperative.
Examples: C++, C#, Java, Smalltalk, Python
29
More Language Paradigms (1)

Declarative: state what needs computing, not how to
compute it (algorithm).

Many 4GLs, such as SQL and Mathematica, share this
property.

Prolog is also declarative
30
More Language Paradigms (2)
Concurrent or Parallel: programming to utilize multiple
CPUs or multiple threads of execution.
- Requires attention to task management,
synchronization, and data conflicts.
- The sequence of execution may not be predictable.
- Parallel features are often added to existing
programming languages.
- Examples: threads in Java, C#, and other languages;
the MPI (Message Passing Interface) library for cluster and
grid computing.
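A small sketch of threads and synchronization using Python's standard threading module. The shared counter and the `add_many` function are made up for this example; the lock is what prevents the data conflict.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    """Increment the shared counter n times."""
    global counter
    for _ in range(n):
        with lock:               # synchronization: one thread updates at a time
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()                    # the order in which threads run is not predictable
for t in threads:
    t.join()                     # wait for every thread to finish
print(counter)
```

The interleaving of the four threads differs from run to run, but with the lock in place the final total is always the same.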
31
Programming Languages used in Open Source projects at SourceForge.net
Source: http://www.cs.berkeley.edu/~flab/languages.html
32
Example: Euclid’s gcd algorithm
Compute the greatest common divisor of two integers.
For example, the gcd of 90 and 24 is 6.
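Worked by hand: 90 mod 24 = 18, 24 mod 18 = 6, 18 mod 6 = 0, so the gcd is 6. The same steps as a short Python sketch (an iterative version, with the absolute value added so negative inputs give a non-negative answer):

```python
def gcd(u, v):
    """Greatest common divisor by Euclid's algorithm (iterative)."""
    while v != 0:
        u, v = v, u % v    # replace (u, v) by (v, u mod v)
    return abs(u)

print(gcd(90, 24))
```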
33
C
/* "functional" implementation of gcd uses recursion */
#include <stdio.h>

int gcd(int u, int v)
{
    if (v == 0) return u;
    else return gcd(v, u % v);   /* "tail" recursion */
}

int main(void)   /* test of gcd */
{
    int x, y;
    printf("Input two integers: ");
    /* note: pass the addresses of x and y so scanf can store the input */
    scanf("%d%d", &x, &y);
    printf("The gcd of %d and %d is %d\n", x, y, gcd(x, y));
    return 0;
}
34
Java: imperative style GCD
public class AnyClassAtAll {
    /** compute greatest common divisor.
     * @return the g.c.d. of u and v,
     *         or 1 if u and v are both zero.
     */
    private static long gcd(long u, long v) {
        long remainder;
        if (v < 0) v = -v;
        while ( v != 0 ) {
            remainder = u % v;
            u = v;
            v = remainder;
        }
        if ( u == 0 ) return 1;       // gcd(0,0) is defined as 1 here
        else return (u>0)? u : -u;    // absolute value
    }
    // ... remainder of the class is irrelevant
}
35
Java: Object-oriented GCD
/** This class finds the GCD of one value (the state of
 * the object) with any other value given as parameter.
 */
public class GCD {
    // attribute: state of the object (immutable)
    private final int value;

    /** constructor */
    public GCD( int value ) { this.value = value; }

    /** compute GCD of private state and param v */
    public int gcd ( int v ) {
        int u = value; // don't modify object's state
        while ( v != 0 ) {
            int t = u % v;
            u = v;
            v = t;
        }
        return u;
    }
}
36
Scheme
; functional implementation of gcd
; uses recursion
(define (gcd u v)
(if (= v 0) u
(gcd v (modulo u v) )
)
)
Scheme syntax for defining a function:
( define ( function-name param1 param2 ... )
body of function definition
)
37
Scheme application
; using the gcd: perform I/O, invoke gcd
(define (euclid)
(display "enter two integers:")
; use of variables not really necessary
; and not "functional" style
(let ( (u (read)) (v (read)) )
(display "the gcd of ")
(display u)
(display " and ")
(display v)
(display " is ")
(display (gcd u v))
(newline)
)
)
38
Prolog
/* conditions for GCD */
gcd(U, V, U) :- V = 0.
gcd(U, V, X) :- \+ V = 0,
                Y is U mod V,
                gcd(V, Y, X).
/* Goal: compute the GCD of 288 and 60. */
?- gcd(288, 60, X).
In Prolog, a clause is an assertion that can succeed (be true) or
fail. A clause of the form:
consequence :- a, b, c.
means: consequence is true if a, b, and c are true.
39
FORTRAN 77
C     Greatest common divisor for real programmers
      INTEGER FUNCTION IGCD(U,V)
      INTEGER U, V, A, B, TMP
C     Work on copies: arguments are passed by reference
      A = U
      B = V
      DO WHILE ( B .NE. 0 )
         TMP = B
         B = MOD(A,B)
         A = TMP
      END DO
C     Assign returned value
      IGCD = A
      RETURN
      END
      PROGRAM MAIN
      WRITE(6,*) "Input two integers:"
C     I, J implicitly integer
      READ(5,*) I, J
      WRITE(6,100) I, J, IGCD(I,J)
  100 FORMAT("GCD of ",I4," and ",I4," is ",I4)
      STOP
      END
40
Paradigm use is rarely “pure”

The C gcd() example defines gcd in a functional style,
even though C is mainly imperative.

Java can be used to write purely imperative style
programs (all static, no objects)
- also, in Java primitive data types aren't objects

Scheme uses I/O operations, which depend on
sequence and external effects (imperative style)
- in a "pure" functional language, the result of a
function depends only on the parameters
- this isn't true of I/O operations
41
Language Design
Some conflicting objectives, criteria, and goals
42
Goals for language design

Power

Simplicity

Flexibility

Clarity

Expressiveness


Writability

Consistency (orthogonality)

Efficient implementation

Readability

Support for abstraction

Applicability to problem domain

Portability
43
Readability or writability?

Should programming languages promote the writing of
programs or the reading of programs?

Many people (including the writer!) may need to read a
program after it is written.
44
Readability or writability?
Q: What does this Perl script do?
#!/usr/bin/perl
foreach $FILE ( @ARGV ) {
open(FILE) || die "Couldn't open $FILE";
while($_ = <FILE>) { print $_; }
close(FILE);
}
45
Language definition
Syntax: defines the grammar of a language.
- what are valid statements, what is a valid program.
- given in a formal notation such as BNF or EBNF.
Semantics: the meaning of the elements of a
language.
- usually defined in human language
- formal notations exist, but are not widely used
- can have a static component: type checking,
definition checking, other consistency checks prior
to execution.
- dynamic component: run-time checking of array indices,
run-time type determination.

46
Syntax
Defines symbols and grammar of a language.
 Usually given in Backus-Naur Form or its extensions.

if-statement ::= if ( expression ) statement-block
[ else statement-block ]
statement-block ::= statement ';'
| '{' statement ';' [...] '}'
statement ::= if-statement | assignment-statement |
while-statement | ...etc...
47
Language implementation strategies

Compiler:
multi-step process that translates source code into
target code; then the user executes the target code.

Interpreter:
one-step process in which the source code is executed
directly.

Hybrids:
"just in time" compilers - Perl
"virtual machine language" - Java, Microsoft .NET
languages.
48
Compiler versus Interpreter
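[Figure: Interpreter: Source Program and Input go to the Interpreter
(executing on the machine), which produces Output.
Compiler: Source Program goes to the Compiler, which produces a
Target Program; the Target Program (executing on the machine)
reads Input and produces Output.]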
49
Language processing: Interpreted
Interpreted: BASIC, Postscript, Scheme, Matlab
- The interpreter reads the source program and executes
each command as it reads.
- The interpreter "knows" how to perform each instruction
in the language.

[Figure: Source Program -> Interpreter -> Execution]
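The "read a command, execute it, read the next" loop can be sketched in Python. The three-command language here (set, add, print) is invented for this example and is not any real language.

```python
def interpret(source):
    """Read one command per line and execute it immediately."""
    variables = {}
    output = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue                       # skip blank lines
        command, args = parts[0], parts[1:]
        if command == "set":               # set NAME NUMBER
            variables[args[0]] = int(args[1])
        elif command == "add":             # add NAME NUMBER
            variables[args[0]] += int(args[1])
        elif command == "print":           # print NAME
            output.append(str(variables[args[0]]))
        else:
            raise ValueError(f"unknown command: {command}")
    return output

program = """
set x 10
add x 5
print x
"""
result = interpret(program)
print(result)
```

No target program is ever produced: the interpreter itself contains the code that carries out each command.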
50
Language processing: Compiled
Compiled: C/C++, Pascal, Fortran
- The compiler converts source code into machine
language to create an object code file.
- A linker combines object code files and pre-compiled
libraries to produce an executable program (machine
language).

51
Compiling a Program
[Figure: Source Code -> Compiler -> Object Code -> Linker -> Executable Program]

Source code (file.c):
main() {
    printf("hello");
    exit(0);
}

Object code (file.obj):
.sym printf
FE048C7138
029845AAAF
...

Libraries of object code (printf.obj):
<obj. code for printf function>

Executable program (file.exe):
<hardware instructions>
52
Typical Phases of a Compiler
Source Program
-> Lexical Analyzer
-> Syntax Analyzer
-> Semantic Analyzer
-> Intermediate Code Generator
-> Code Optimizer
-> Code Generator
-> Target Program
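The first phase, the lexical analyzer, breaks the source text into tokens. A minimal sketch in Python using the standard re module; the token names and the small set of operators are assumptions chosen for this example, not the rules of any particular language.

```python
import re

TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[-+*/=()]"),
    ("SEMI",   r";"),
    ("SKIP",   r"\s+"),
    ("ERROR",  r"."),            # any other character is a lexical error
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def tokenize(source):
    """Split source text into (kind, text) pairs, skipping whitespace."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(f"illegal character {match.group()!r}")
        tokens.append((kind, match.group()))
    return tokens

tokens = tokenize("total = quantity * (1 + taxrate);")
for token in tokens:
    print(token)
```

The syntax analyzer would then work on this token list instead of on individual characters.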
53
Interpreted versus Compiled
Interpreted
- Flexible
- More interactive
- More dynamic behavior
- Rapid development
- Can run program immediately after writing
or changing it
- Portable to any machine that has the interpreter
Compiled
- More efficient execution
- Extensive data checking
- More structured
- Usually more scalable (can develop large
applications)
- Must (re-)compile program each time a
change is made
- Must recompile for new hardware or OS
54
Java: A Hybrid Strategy
Java Compiler: compiles the program to create
machine independent byte code.
Java Virtual Machine (interpreter): executes the byte
code.

[Figure: Hello.java (Java source program) -> javac (Java compiler)
-> Hello.class (byte code) -> Java VM (byte checker, class loader,
interpreter) together with Libraries, inside the Java Runtime
Environment (JRE) -> program execution: "Hello, World!"]
55
Error classification

Lexical: token-level error, such as illegal character
(hard to distinguish from syntax errors).

Syntax: error in grammar (missing semicolon or
keyword).

Static semantic: non-syntax error detectable prior to
execution (e.g., undefined variables, type errors).

Dynamic semantic: non-syntax error that may be detected
during execution (e.g., division by 0, array bounds).

Logic: error in the algorithm or a logical error in its
implementation; the program is valid, but does not do what was intended.
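When each class of error shows up can be observed in Python, which separates a translation step (`compile`) from an execution step (`exec`). A sketch only; the `classify` helper and its messages are made up for this example, and Python happens to report undefined names at run time rather than before execution.

```python
def classify(source):
    """Report when (if ever) Python rejects the given source text."""
    try:
        code = compile(source, "<example>", "exec")   # translation step
    except SyntaxError:
        return "syntax error (before execution)"
    try:
        exec(code, {})                                # execution step
    except ZeroDivisionError:
        return "dynamic semantic error (during execution)"
    except NameError:
        return "undefined name (during execution in Python)"
    return "no error reported"

results = [
    classify("x = (1 + "),           # incomplete expression
    classify("x = 1 / 0"),           # division by zero
    classify("y = undefined + 1"),   # use of an undefined variable
    classify("area = 3 * 3 + 4"),    # a logic error if we meant 3 * (3 + 4)
]
for r in results:
    print(r)
```

The last case runs without complaint: no translator can know what the programmer intended.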
56
Notes on error reporting

A compiler will report lexical, syntax, and static
semantic errors. It cannot report dynamic semantic
errors.

A compiler must recover after finding an error so it can
continue to check for more errors. Not easy!

An interpreter will often only report lexical and syntax
errors when loading the program. Static semantic
errors may not be reported until just prior to execution.
Indeed, most interpreted languages (e.g. Lisp,
Smalltalk) do not define any static semantic errors.

No translator will report a logic error.
57
Sample Errors (Java):
public int gcd ( int v# )    // lexical error
{   int z = value            // syntax error: missing ;
    y = v;                   // static semantic: y undefined
    while ( y >= 0 )         // dynamic semantic: division by zero
    {   int t = y;
        y = z % y;
        z = t;
    }
    return y;                // logic: should return z
}
58
Identify the errors (Java)
// Compute ratio of a / b
public ratio ( long a; long b )
{ int result;
if ( b => 0 )
Result == a / b;
else
result = 0.0;
return Result;
}
59
Semantic error detected by the linker
/* program contains 2 semantic errors */
int main( ) {
int now;
now = getcurrenttime( );
}
To compile a program without linking it, on Linux use:
gcc -c filename.c
GNU cc (an excellent compiler) doesn't report any errors!
To detect one error, compile and link using:
gcc filename.c
60
The Archetypical semantic/logic error
1. In C an assignment expression yields a value equal to
the value that was assigned:
x = 2;
results in a value of "2". This makes "x = y = z = 2;" possible.
2. In C, a numeric value can be used as an "if" condition.
0 means false
anything else is true
int n = 0;
if ( n = 1 ) printf("n equals 1");
Always prints "n equals 1"
61
The Archetypical semantic/logic error
This error is not detected at all!
It's a perfectly legal use of the C language.
/* sum input data until a zero is read */
int main( ) {
    int x, sum;
    sum = 0;
    while ( 1 ) {
        scanf("%d", &x);        // read an integer
        if ( x = 0 ) break;     // stop if 0 found
        sum += x;               // else add x to sum
    }
}
62
Wake up!
Abstraction is a key to good software.
63
Abstraction
Abstraction: using one thing to represent another;
usually to omit (hide) unimportant details or group
similar cases together.
Why Abstraction? Control complexity.
In Daily Life:
- words and language are abstractions for concepts.
- money is an abstraction for value, enabling exchange.
- walk: an abstraction for the process of using legs to travel.
- lock: an abstraction of concept, technology, & process!

64
Abstraction in Programming Languages

In Programming Languages:
everything is an abstraction
x = 10     store 10 in a memory location
y = 2*x    load the value of x (memory location)
           into a register,
           load 2 into another register,
           multiply the values together,
           save the result in a memory location
           (called "y")
65
Data Abstraction
Basic Abstraction: data types such as integer, float, double
(hides detail of how or where the data is stored)
Structured Abstraction:
Structures:
struct node {
    int id;
    char name[80];
    struct node *next_node; /* point to next */
};
Unit Abstraction:
Program divided into files for separate compilation,
Tables in a database, classes in Java
URL (uniform resource locator) - file://c/temp/junk.txt,
ftp://somewhere.com/downloads/junk.txt
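For comparison, the same structured abstraction can be sketched in Python. The `Node` class mirrors the fields of the C struct node; the `names` helper and the sample data are made up for this example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    """Python counterpart of the C struct node."""
    id: int
    name: str
    next_node: Optional["Node"] = None   # link to the next node, or None

def names(head):
    """Walk the list from head and collect each node's name."""
    result = []
    node = head
    while node is not None:
        result.append(node.name)
        node = node.next_node
    return result

tail = Node(2, "second")
head = Node(1, "first", tail)
collected = names(head)
print(collected)
```

In both languages the user of a node works with named fields and never needs to know how or where the bytes are stored.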
66
Control or Process Abstraction
Basic Abstraction:
assignment (y = a*x + b),
abstracts notion of storing values in memory,
"goto" and "break" statements
Structured Abstraction:
if - else if - else
loops (for, while), switch-case statement
statement blocks (scope of variables or process)
functions and subroutines (procedures)
Unit Abstraction:
threads - semi-independent execution units
processes - C fork() to start "child" processes
67
Abstractions
                    Basic         Structured          Unit
Data                int, char     String,             file, package,
                                  class, struct       class (for data hiding)
Control or Process  goto, =       if - then - else,   package, API,
                                  while { },          threads, Ada tasks
                                  procedure
68
Abstraction is Key to Programming

Object-Oriented Programming - success is due to a
useful abstraction (classes, objects as entities)

World Wide Web - information on the Internet as an
interconnected web (plus a good interface :-) and
extensible. Processes look like data, too.

Spreadsheet - the original "killer app" for PCs.
Useful abstraction of data and its organization.
Q: what useful abstractions contribute to the simplicity (and
success) of Microsoft Windows and Mac OS?
69
Abstraction and Progress
By relieving the brain of all unnecessary work, a good
notation sets it free to concentrate on more advanced
problems . . . Civilization advances by extending the
number of important operations which we can perform
without thinking about them.
[A. N. Whitehead, 1911]
70
Abstraction and Proficiency

An abstraction is only useful if it reduces mental effort.

You must understand the abstraction and be proficient
in using it to benefit from it.

Intellectual progress depends on this.

O-O Programming is a good example.
Therefore, please study everything (not just this
course) with the goal of understanding and proficiency - not getting a grade.
71
Can we Compare Languages?

With so many languages, how do you choose one?

Is one language able to solve problems that can't be
solved in another language?

Theory of Computing seeks to answer these questions.
72
Questions (1)

What is the syntax of a language?

What is meant by the semantics of a language?

There are two strategies for how to process a computer
program (source code) so it can be run on a computer.
Describe the 2 strategies.
73
Questions (2)

Name the 4 major categories of computer languages.
74
Questions (3)

Name one language from each of these categories:
Imperative
Functional
Logic
Object-oriented
75