Introduction: Programming Language Concepts

Objective of the Course
1. Understand concepts of programming languages.
2. Become familiar with different language paradigms and the kinds of problems they are suited for.
3. Learn to use useful tools.
4. Learn the basics of some widely used languages:
   - C/C++
   - Scheme (a functional language like Lisp)
   - Prolog
   - Python
   - another language, depending on your interest

Evaluation Criteria
- Class participation and attendance
- Homework
- Programming assignments
- Quizzes
- Exams

Your Responsibility
Perform the role of a student at a level suitable for a first-tier university.
Behavior and activities:
1. Read the assigned text before each lecture
2. Do homework assignments early
3. Participate in class
4. Self-directed learning
5. Analyze your own understanding

Instructor's Responsibility
- Provide clear guidance on the course of study
- Make the requirements clear and unambiguous
- Facilitate discussion, answer your questions
- Provide helpful evaluation of your understanding
- Listen to and respond to feedback

Why Tools?
"If the only tool you have is a hammer, then everything looks like a nail."

What is a Programming Language?
A vocabulary, set of grammar rules, and associated meanings for communication between people and computers, or communication between computers.
Another definition: A programming language is a notational system for describing computation in machine-readable and human-readable form [K. Louden].

Instructions to the CPU
A CPU executes instructions, and uses data, in binary format.
An executable computer program may look like this:

    0010110010100010    first instruction to CPU
    0001011110000000    second instruction
    0010110000011000    an address for 2nd instruction
    1101100001011001    3rd instruction
    0011001010001011    4th instruction
    0000011101001111    an argument (data) for 4th instruction
    1101100001100001    another argument for 4th instruction
    0010110100010001    5th instruction
    0001011100001110    6th instruction
    ...etc...           ... more instructions ...

Running A Program
1. The program and data are loaded into main memory.
2. The address of the program's first instruction is placed in a special register in the CPU: the program counter.
3. The CPU is told to fetch and execute that instruction.
(Figure: main memory and the PC register.)

What instructions does the CPU understand?
CPU instructions cause the CPU to implement logic which is designed (hardwired) into the CPU. Most CPU instructions are very simple, such as:

    LOAD memory_address TO some register
    SAVE some register TO memory_address
    MOVE some register TO another register
    ADD some register TO another register
    MULT some register TO another register
    COMPARE register TO another register
    TEST some register's value, jump to new instruction if true
    JUMP TO new instruction (unconditional branch)

Generations of Computer Languages
- Machine language: program as a series of binary instructions (0s and 1s)
- Assembly language
- "High level" or procedural languages: this is the category we will study
- 4GL: application-specific languages

Machine Language
These simple, binary instructions are called machine language instructions. Writing machine language is slow and difficult, so people invented a symbolic notation for machine language, called assembly language.

Assembly Language:

    MOV AX,1F80H
    PUSH AX
    INT 21H
    POP AX

Machine Language:

    0010110010100010
    0001111110000000
    0010110000011000
    1101100001011001
    0011001000011000

Assembly Language
Simple assembly language instructions have a 1-to-1 translation to machine language.
A program called an assembler converts assembly language into machine language:

    unix: as -o output_program input_program.s

(Figure: Assembly Language -> assembler -> Machine Language)

Assembly Language:

    MOV AX,1F80H
    PUSH AX
    INT 21H
    POP AX

Machine Language:

    0010110010100010
    0001111110000000
    0010110000011000
    1101100001011001
    0011001000011000

Viewing Assembly Code on a PC
1. Find a small .exe or .com file, e.g. diskcopy.com
2. Open a "Command Prompt" (DOS) window.
3. Enter "debug filename".
4. Enter "u" to unassemble and view the program.
5. Enter "q" to quit debug.

    C:\WINDOWS\system32> debug diskcopy.com
    - u
    0B30:0000 0E      PUSH CS
    0B30:0001 1F      POP DS
    0B30:0002 BA0E00  MOV DX,000E
    0B30:0005 B409    MOV AH,09
    - q

The middle column is machine code (in hex); the right column is assembly language.

Macro Assembly Language
Simple assembly language instructions are still very tedious and time consuming to input ("code").
Macro assembly language adds the use of symbolic variable names and compound instructions.
A compound instruction is an instruction that is converted into several machine level instructions.

Macro Assembly Example
Here is a simple "Hello, World" program in macro assembly language:

    .stack
    .data
    message db "Hello world, I'm a program.", "$"
    .code
    main proc
        mov ax,seg message
        mov ds,ax
        mov ah,09
        lea dx,message
        int 21h
        mov ax,4c00h
        int 21h
    main endp
    end main

Problems with Assembly Language
Programming is still difficult in macro assembly language:
- language does not match the way people think
- programs are long and difficult to understand
- program logic errors are common and difficult to correct
- difficult to divide a large problem into small problems using assembly language
- assembly language is machine dependent -- won't run on a different type of computer!
- you have to rewrite all your software whenever you change to a different hardware platform

Low-level and high-level languages
Higher, more "abstract", languages aid programming and thinking about algorithms.
Separate program logic from hardware implementation.

Add 2 numbers in C:

    int sum( int x, int y )
    {
        int z = x + y;
        return z;
    }

In Assembly Language:

    .globl sum
    .type sum,@function
    sum:
        pushl %ebp
        movl %esp, %ebp
        subl $4, %esp
        movl 12(%ebp), %eax
        addl 8(%ebp), %eax
        movl %eax, -4(%ebp)
        movl -4(%ebp), %eax
        movl %eax, %eax
        leave
        ret

Which one is easier to understand?

High Level Languages
Higher level languages (also called third generation languages) provide these features:
- English-like syntax
- descriptive names to represent data elements
- concise representation of complex logic
- use of standard math symbols:
      total = quantity * price * (1 + taxrate);
- conditional execution and looping expressions:
      if ( total > 0 ) then writeln("The total is ", total)
      else writeln("No sale.");
- functional units, such as "functions" or "classes":
      radius = sqrt( x*x + y*y ); // call sqrt function

High Level Language Benefits
- programming is faster and easier
- abstraction helps the programmer think at a higher level
- standardized languages enable code to be machine independent, and reduce training effort
- complex problems can be divided into smaller problems, each coded and tested separately
- code re-use
- provides a standard programming interface for hardware interactions, such as input and output

Some higher level languages
There are several broad categories:
- Imperative: FORTRAN, COBOL, BASIC, C, Pascal
- Functional: Scheme, Lisp
- Logic Languages: Prolog
- Object-oriented:
  - Pure object-oriented: Java, Python
  - Object-oriented & imperative: C++, Perl, Visual Basic

Fourth Generation Languages
Fourth Generation Languages (4GL) are application specific. They are tailored for a particular application.
- SQL (structured query language) for databases
- Postscript page description language for printers
- PDF (portable document format) for online documents
- HTML and PHP for World Wide Web content
- Mathematica for symbolic math on a computer
Mathematica is a proprietary language; the others are industry standards.
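To make the difference between a general-purpose language and an application-specific 4GL more concrete, here is a minimal sketch in Python (one of the course languages). It answers the same question twice: once with an explicit loop, and once with a SQL query run through Python's standard sqlite3 module. The employees table layout, the sample rows, and the function names are assumptions made up for this illustration.

```python
import sqlite3

# Hypothetical sample data: (id, lastName, firstName, job, salary).
EMPLOYEES = [
    (1445, "Smith", "John", "manager", 5200),
    (1446, "Jones", "Mary", "engineer", 4800),
    (1447, "Brown", "Anne", "manager", 5500),
]


def managers_imperative(rows):
    """General-purpose style: spell out HOW to find the managers."""
    result = []
    for emp_id, last_name, _first_name, job, salary in rows:
        if job == "manager":
            result.append((emp_id, last_name, salary))
    return sorted(result)  # tuples sort by id first


def make_database(rows):
    """Load the sample rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees "
        "(id INTEGER, lastName TEXT, firstName TEXT, job TEXT, salary INTEGER)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def managers_declarative(conn):
    """4GL style: state WHAT is wanted; the database decides how."""
    query = (
        "SELECT id, lastName, salary FROM employees "
        "WHERE job = 'manager' ORDER BY id"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = make_database(EMPLOYEES)
    print(managers_imperative(EMPLOYEES))
    print(managers_declarative(conn))
```

Both functions return the same rows. The SQL version contains no loop and no test written by the programmer, which is what "application specific" buys in this domain.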

4GL Example: SQL
SQL is one of the most widely used 4GLs. Here are some examples:

Insert data into a database table:

    INSERT INTO employees (id, lastName, firstName, Job)
    VALUES (1445, 'Smith', 'John', 'manager');

Choose data from a table based on criteria:

    SELECT id, lastName, salary FROM employees
    WHERE ( Job = 'manager' );

SQL is declarative: statements say what you want done but not how to do it.
Learn more about SQL at the Open Directory Project: http://dmoz.org/Computers/Programming/Languages/SQL

Factors Influencing Language
Many factors influence the nature of a programming language. Some major factors are:
- Environment: the capabilities of the machine that we are communicating with.
  - early computers had very few instructions, no operating system, no memory management.
- Application Domain: language is influenced by the kind of information we want to communicate.
  - most languages are used to specify algorithms.
  - a language may contain only statements for the kind of problem we are interested in solving, e.g., Fortran for scientific/numerical computing.

Factors Influencing Language (2)
- Methodology: an evolving discipline.
  - new approaches to constructing software are developed based on experience and need.
  - O-O emerged as a response to limitations of imperative languages in handling complexity.
- Preference, Economics, and Patronage:
  - languages have been influenced by the backing of large companies (Fortran by IBM, Java by Sun)
  - government patronage: Ada by U.S. DoD.
  - preferred by experts: "structured programming" and modularity made popular by C.S. professors

Language Paradigms
Imperative (procedural):
- traditional sequential programming: program statements operate on variables.
- a variable represents data in memory locations.
- characterized by variables, assignment, and loops.
- the basic unit of imperative programs is the procedure or function
- Examples: Algol, C, Pascal, Ada, FORTRAN

Language Paradigms
Functional:
- functions are first-class entities (can be used the same as other forms of data); all execution is by function evaluation
- characterized by recursion and functions as data
- execution on values, not memory locations -- variables are not necessary!
- a function can dynamically define and return a new function: self-evolving code.
- Examples: Lisp, Scheme, ML, Haskell

Language Paradigms
Logic:
- the program is declarative: it specifies what must be true but not how to compute it.
- logic inference is the basic control mechanism
- no sequential operation
- non-deterministic: may have many solutions or none
- Example: Prolog

Language Paradigms
Object-oriented:
- an object contains its own state (data) and the functions that operate on that state.
- program logic is instantiation of objects, messages between objects, encapsulation, and protection.
- the computing model is mostly an extension of imperative.
- Examples: C++, C#, Java, Smalltalk, Python

More Language Paradigms (1)
Declarative:
- state what needs computing, not how to compute it (algorithm).
- Many 4GLs, like SQL and Mathematica, share this property.
- Prolog is also declarative

More Language Paradigms (2)
Concurrent or Parallel:
- Programming to utilize multiple CPUs or multiple threads of execution.
- Requires attention to task management, synchronization, and data conflicts
- the sequence of execution may not be predictable.
- parallel features are often added to existing programming languages.
- Examples: threads in Java, C#, and other languages; MPI (Message Passing Interface) library for cluster and grid computing.

(Figure: programming languages used in Open Source projects at SourceForge.net. Source: http://www.cs.berkeley.edu/~flab/languages.html)

Example: Euclid's gcd algorithm
Compute the greatest common divisor of two integers.
For example, the gcd of 90 and 24 is 6.
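Before comparing implementations across languages, it may help to see the algorithm's intermediate values. The following is a small Python sketch, not part of the original slides; the function name gcd_steps is an assumption chosen for this illustration. It records each (u, v) pair that Euclid's algorithm visits until v reaches 0.

```python
def gcd_steps(u, v):
    """Return the list of (u, v) pairs visited by Euclid's algorithm.

    The last pair has v == 0, and its u is the greatest common divisor.
    """
    steps = [(u, v)]
    while v != 0:
        # Replace (u, v) by (v, u mod v); the gcd is unchanged by this step.
        u, v = v, u % v
        steps.append((u, v))
    return steps


if __name__ == "__main__":
    for u, v in gcd_steps(90, 24):
        print(u, v)
```

For 90 and 24 the pairs are (90, 24), (24, 18), (18, 6), (6, 0), so the gcd is 6, matching the example above.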

C

    /* "functional" implementation of gcd uses recursion */
    #include <stdio.h>

    int gcd(int u, int v)
    {
        if (v == 0) return u;
        else return gcd(v, u % v);   // "tail" recursion
    }

    int main()   /* test of gcd */
    {
        int x, y;
        printf("Input two integers: ");
        // note: pass the addresses of x and y so scanf can store the input
        scanf("%d%d", &x, &y);
        printf("The gcd of %d and %d is %d\n", x, y, gcd(x,y));
        return 0;
    }

Java: imperative style GCD

    public class AnyClassAtAll {
        /** compute greatest common divisor.
         *  @return the g.c.d. of u and v.
         *          1 if u and v are both zero.
         */
        private static long gcd(long u, long v) {
            long remainder;
            if (v < 0) v = -v;
            while ( v != 0 ) {
                remainder = u % v;
                u = v;
                v = remainder;
            }
            if ( u == 0 ) return 1;        // gcd(0,0) is defined as 1 here
            else return (u>0)? u : -u;     // absolute value
        }
        // ... remainder of the class is irrelevant
    }

Java: Object-oriented GCD

    /** This class finds the GCD of one value (the state of
     *  the object) with any other value given as parameter.
     */
    public class GCD {
        // attribute: state of the object (immutable)
        private final int value;

        /** constructor */
        public GCD( int value ) { this.value = value; }

        /** compute GCD of private state and param v */
        public int gcd ( int v ) {
            int u = value;   // don't modify object's state
            while ( v != 0 ) {
                int t = u % v;
                u = v;
                v = t;
            }
            return u;
        }
    }

Scheme

    ; functional implementation of gcd
    ; uses recursion
    (define (gcd u v)
      (if (= v 0)
          u
          (gcd v (modulo u v))))

Scheme syntax for defining a function:

    ( define ( function-name param1 param2 ... )
      body of function definition )

Scheme application

    ; using the gcd: perform I/O, invoke gcd
    (define (euclid)
      (display "enter two integers:")
      ; use of variables not really necessary
      ; and not "functional" style
      (let ( (u (read)) (v (read)) )
        (display "the gcd of ")
        (display u)
        (display " and ")
        (display v)
        (display " is ")
        (display (gcd u v))
        (newline)))

Prolog

    /* conditions for GCD */
    gcd(U, V, U) :- V = 0.
    gcd(U, V, X) :- not (V = 0), Y is U mod V, gcd(V, Y, X).
    /* Goal: compute the GCD of 288 and 60. */
    gcd(288, 60, X).

In Prolog, a clause is an assertion that can succeed (be true) or fail. A clause of the form

    consequence :- a, b, c.

means: consequence is true if a, b, and c are true.

FORTRAN 77

    C     Greatest common divisor for real programmers
          INTEGER FUNCTION IGCD(U,V)
          INTEGER U, V, TMP
          DO WHILE ( V .NE. 0 )
             TMP = V
             V = MOD(U,V)
             U = TMP
          END DO
    C     Assign returned value (V is 0 here; U holds the gcd)
          IGCD = U
          RETURN
          END

          PROGRAM MAIN
          WRITE(6,*) "Input two integers:"
    C     I, J implicitly integer
          READ(5,*) I, J
          WRITE(6,100) I, J, IGCD(I,J)
      100 FORMAT("GCD of ",I4," and ",I4," is ",I4)
          STOP
          END

Paradigm use is rarely "pure"
- The C gcd() example defines gcd in a functional style, even though C is mainly imperative.
- Java can be used to write purely imperative style programs (all static, no objects)
  - also, in Java primitive data types aren't objects
- Scheme uses I/O operations, which depend on sequence and external effects (imperative style)
  - in a "pure" functional language, the result of a function depends only on the parameters
  - this isn't true of I/O operations

Language Design
Some conflicting objectives, criteria, and goals

Goals for language design
- Power
- Simplicity
- Flexibility
- Clarity
- Expressiveness
- Writability
- Consistency (orthogonality)
- Efficient implementation
- Readability
- Support for abstraction
- Applicability to problem domain
- Portability

Readability or writability?
Should programming languages promote the writing of programs or the reading of programs?
Many people (including the writer!) may need to read a program after it is written.

Readability or writability?
Q: What does this Perl script do?

    #!/usr/bin/perl
    foreach $FILE ( @ARGV ) {
        open(FILE) || die "Couldn't open $FILE";
        while($_ = <FILE>) { print $_; }
        close(FILE);
    }

Language definition
Syntax: defines the grammar of a language.
- what are valid statements, what is a valid program.
- given in formal notation such as BNF or EBNF.
Semantics: the meaning of the elements of a language.
- usually defined in human language
- formal notations exist, but are not widely used
- can have a static component: type checking, definition checking, other consistency checks prior to execution.
- dynamic: run-time checking of array indices, run-time type determination.

Syntax
Defines symbols and grammar of a language.
Usually given in Backus-Naur Form or its extensions.

    if-statement ::= if ( expression ) statement-block
                     [ else statement-block ]
    statement-block ::= statement ';'
                      | '{' statement ';' [...] '}'
    statement ::= if-statement | assignment-statement
                | while-statement | ...etc...

Language implementation strategies
- Compiler: multi-step process that translates source code into target code; then the user executes the target code.
- Interpreter: one-step process in which the source code is executed directly.
- Hybrids:
  - "just in time" compilers: Perl
  - "virtual machine language": Java, Microsoft .NET languages.

Compiler versus Interpreter
(Figure, interpreter: Source Program + Input -> Interpreter (executes on machine) -> Output)
(Figure, compiler: Source Program -> Compiler -> Target Program; then Input -> Target Program (executes on machine) -> Output)

Language processing: Interpreted
Interpreted: BASIC, Postscript, Scheme, Matlab
The interpreter reads the source program and executes each command as it reads.
The interpreter "knows" how to perform each instruction in the language.
(Figure: Source Program -> Interpreter -> Execution)

Language processing: Compiled
Compiled: C/C++, Pascal, Fortran
The compiler converts source code into machine language to create an object code file.
A linker combines object code files and pre-compiled libraries to produce an executable program (machine language).

Compiling a Program
(Figure: Source Code -> Compiler -> Object Code -> Linker -> Executable Program)
- Source code, file.c:
      main() { printf("hello"); exit(0); }
- Object code, file.obj: symbol table entry (.sym) for printf, followed by machine code (FE048C7138 029845AAAF ...)
- Libraries (of object code), printf.obj: <obj. code for printf function>
- Executable program, file.exe: <hardware instructions>

Typical Phases of a Compiler
Source Program
-> Lexical Analyzer
-> Syntax Analyzer
-> Semantic Analyzer
-> Intermediate Code Generator
-> Code Optimizer
-> Code Generator
-> Target Program

Interpreted versus Compiled
Interpreted:
- Flexible
- More interactive
- More dynamic behavior
- Rapid development
- Can run program immediately after writing or changing it
- Portable to any machine that has the interpreter
Compiled:
- More efficient execution
- Extensive data checking
- More structured
- Usually more scalable (can develop large applications)
- Must (re-)compile program each time a change is made
- Must recompile for new hardware or OS

Java: A Hybrid Strategy
Java Compiler: compiles the program to create machine independent byte code.
Java Virtual Machine (interpreter): executes the byte code.
(Figure: Hello.java (Java source program) -> javac (Java compiler) -> Hello.class (byte code) -> Java VM (byte code checker, class loader, interpreter), which together with the libraries forms the Java Runtime Environment (JRE) -> program execution: "Hello, World!")

Error classification
- Lexical: token-level error, such as an illegal character (hard to distinguish from syntax errors).
- Syntax: error in grammar (missing semicolon or keyword).
- Static semantic: non-syntax error detectable prior to execution (e.g., undefined variables, type errors).
- Dynamic semantic: non-syntax error that may be detected during execution (e.g., division by 0, array bounds).
- Logic: error in algorithm or logical error in its implementation, program not at fault.

Notes on error reporting
- A compiler will report lexical, syntax, and static semantic errors. It cannot report dynamic semantic errors.
- A compiler must recover after finding an error so it can continue to check for more errors. Not easy!
- An interpreter will often only report lexical and syntax errors when loading the program. Static semantic errors may not be reported until just prior to execution. Indeed, most interpreted languages (e.g. Lisp, Smalltalk) do not define any static semantic errors.
- No translator will report a logic error.

Sample Errors (Java):

    public int gcd ( int v# )     // lexical error
    {
        int z = value             // syntax error: missing ;
        y = v;                    // static semantic: y undefined
        while ( y >= 0 )          // dynamic semantic: division by zero
        {
            int t = y;
            y = z % y;
            z = t;
        }
        return y;                 // logic: should return z
    }

Identify the errors (Java)

    // Compute ratio of a / b
    public ratio ( long a; long b )
    {
        int result;
        if ( b => 0 ) Result == a / b;
        else result = 0.0;
        return Result;
    }

Semantic error detected by the linker

    /* program contains 2 semantic errors */
    int main( )
    {
        int now;
        now = getcurrenttime( );
    }

To compile a program without linking it, on Linux use:

    gcc -c filename.c

GNU cc (an excellent compiler) doesn't report any errors!
To detect one error, compile and link using:

    gcc filename.c

The Archetypical semantic/logic error
1. In C an assignment statement resolves to a value equal to the value that was assigned: x = 2; results in a value of "2". This makes "x = y = z = 2;" possible.
2. In C, a numeric value can be used as an "if" condition: 0 means false; anything else is true.

    int n = 0;
    if ( n = 1 ) printf("n equals 1");

This always prints "n equals 1".

The Archetypical semantic/logic error
This error is not detected at all! It's a perfectly legal use of the C language.

    /* sum input data until a zero is read */
    int main( )
    {
        int x, sum;
        sum = 0;
        while ( 1 ) {
            scanf("%d", &x);       // read an integer
            if ( x = 0 ) break;    // stop if 0 found
            sum += x;              // else add x to sum
        }
    }

Wake up!
Abstraction is a key to good software.

Abstraction
Abstraction: using one thing to represent another; usually to omit (hide) unimportant details or group similar cases together.
Why Abstraction? Control complexity.
In Daily Life:
- words and language are abstractions for concepts.
- money is an abstraction for value, enabling exchange.
- walk: an abstraction for the process of using legs to travel.
- lock: an abstraction of concept, technology, and process!

Abstraction in Programming Languages
In Programming Languages: everything is an abstraction

    x = 10     store 10 in a memory location
    y = 2*x    load the value of x (memory location) into a register,
               load 2 into another register,
               multiply the values together,
               save the result in a memory location (called "y")

Data Abstraction
Basic Abstraction:
- Data types: integer, float, double (hides detail of how or where the data is stored)
Structured Abstraction:
- Structures:

      struct node {
          int id;
          char name[80];
          struct node *next_node;   /* point to next */
      };

Unit Abstraction:
- Program divided into files for separate compilation, tables in a database, classes in Java
- URL (uniform resource locator): file://c/temp/junk.txt, ftp://somewhere.com/downloads/junk.txt

Control or Process Abstraction
Basic Abstraction:
- assignment (y = a*x + b), abstracts the notion of storing values in memory
- "goto" and "break" statements
Structured Abstraction:
- if - else if - else
- loops (for, while), switch-case statement
- statement blocks (scope of variables or process)
- functions and subroutines (procedures)
Unit Abstraction:
- threads: semi-independent execution units
- processes: C fork() to start "child" processes

Abstractions

                          Basic        Structured                     Unit
    Data                  int, char    String class, struct           file, package, class (for data hiding)
    Control or Process    goto, =      if - then - else, while { },   package, API, threads, Ada tasks
                                       procedure

Abstraction is Key to Programming
- Object-Oriented Programming: success is due to a useful abstraction (classes, objects as entities)
- World Wide Web: information on the Internet as an interconnected web (plus a good interface :-) and extensible. Processes look like data, too.
- Spreadsheet: the original "killer app" for PCs. Useful abstraction of data and its organization.
Q: what useful abstractions contribute to the simplicity (and success) of Microsoft Windows and Mac OS?
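As a rough illustration of the data and control abstraction levels listed above, the sketch below rewrites the C struct node as a Python class (a structured data abstraction) and hides the pointer-chasing loop behind a function (a structured control abstraction). This is not from the original slides; the function name names and the sample values are assumptions chosen for this example.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    """Structured data abstraction: groups related fields under one name."""
    id: int
    name: str
    next_node: Optional["Node"] = None  # link to the next node, or None at the end


def names(head: Optional[Node]) -> Iterator[str]:
    """Control abstraction: callers iterate without seeing the link-following loop."""
    node = head
    while node is not None:
        yield node.name
        node = node.next_node


if __name__ == "__main__":
    # Build a two-node list: first -> second
    second = Node(id=2, name="second")
    first = Node(id=1, name="first", next_node=second)
    for name in names(first):
        print(name)
```

A caller of names() only needs to know that it yields each name in order; how the nodes are linked could change without affecting the calling code.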

Abstraction and Progress
By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems . . . Civilization advances by extending the number of important operations which we can perform without thinking about them. [A. N. Whitehead, 1911]

Abstraction and Proficiency
An abstraction is only useful if it reduces mental effort.
You must understand the abstraction and be proficient in using it to benefit from it.
Intellectual progress depends on this. O-O Programming is a good example.
Therefore, please study everything (not just this course) with the goal of understanding and proficiency - not getting a grade.

Can we Compare Languages?
With so many languages, how do you choose one?
Is one language able to solve problems that can't be solved in another language?
Theory of Computing seeks to answer these questions.

Questions (1)
- What is the syntax of a language?
- What is meant by the semantics of a language?
- There are two strategies for how to process a computer program (source code) so it can be run on a computer. Describe the 2 strategies.

Questions (2)
- Name the 4 major categories of computer languages.

Questions (3)
Name one language from each of these categories:
- Imperative
- Functional
- Logic
- Object-oriented