Programming Languages
On completion of this chapter, you will be able to:

Distinguish between low-level and high-level programming languages.

Differentiate between an assembler, an interpreter, and a compiler.

Differentiate among the five different generations of programming languages.
Introduction
Many programming languages have been developed for coding programs. At the time of writing,
I have counted in excess of 560, and some of them are now extinct. Although there are so many,
these languages can be grouped into two categories: low-level languages and high-level languages.
This classification gives rise to the different generations of programming languages. As we will
see, low-level languages comprise two generations, the first and the second. High-level languages
give rise to what are described as third generation languages. There are also fourth and fifth
generation languages. Each generation, as you might suspect by now, has features that distinguish
it from the others. This document will help you to understand the similarities and differences
between one generation of programming languages and another.
Low Level Languages
These types of programming languages are machine dependent. That is, the code written for one
type of machine or processor cannot be understood by a different type of processor. This is
because every class of processor is designed with its own set of primitive instruction codes.
First Generation language
The first type of low-level language is called first generation language, sometimes called
machine language. In every first generation language, the code must be written in binary
digits. As you know, a computer (the processor) can only understand binary information. As such,
first generation languages do not need a compiler or interpreter to run. The processor for which
the language was written is able to run the binary code directly. The following section takes a
closer look at machine language programs in general.
Machine Language Instruction
Let us design a hypothetical computer with a four-field instruction format. That is, each
instruction for this machine is contained in four fields, as shown in Figure 1.
Operation code    Register number    Register number    Memory address
4 bits            4 bits             4 bits             4 bits
Figure 1 Format for one machine instruction code
The field, operation code, represents some primitive operation of the processor, such as an
arithmetic operation. The field, register number, represents one of the registers of the
processor; and the field, memory address, represents an address of primary memory (RAM). We
will make each field four bits long. Also, let us design this machine with one accumulator register
within the ALU, so that when certain operations are performed, the result of the operation is
placed in the accumulator. The accumulator register will be implied within the instruction. Let us
design this machine with another feature. It has seven basic operations that are indigenous to it:
load, copy, add, subtract, multiply, divide, and store. Figure 2 shows the binary equivalent of
each operation.
Basic operation    Binary code representing operation
Load               0000
Copy               0001
Add                0010
Subtract           0011
Multiply           0100
Divide             0101
Store              0110
Figure 2 Some basic operations
We will also design this machine with eight registers. See Figure 3.
Register number    Binary equivalent
0                  0000
1                  0001
2                  0010
3                  0011
4                  0100
5                  0101
6                  0110
7                  0111
Registers 0 through 4: programmers' use
Registers 5 through 7: systems use
Figure 3 Register numbers and their binary equivalents
Registers 0 through 4 are designed for programmers’ use, while registers 5 through 7 are set
aside for certain systems operations.
Let us assume that this machine has 256 bytes of RAM. The address number will go from
0 through 255 base 10 (0000 0000 – 1111 1111 base 2). See Figure 4.
Primary Memory (RAM)
Address of RAM
:
0000 1100
0000 1101
0000 1110
0000 1111
0001 0000
0001 0001
0001 0010
0001 0011
:
1111 1111
Figure 4
Remember that the memory address of an instruction can only take four-bit codes 0000 through
1111. In this case only the four rightmost bits in the instruction can be used for memory address.
The complete list of these memory addresses is shown in Figure 5
Address    Binary    Address    Binary
0          0000      8          1000
1          0001      9          1001
2          0010      10         1010
3          0011      11         1011
4          0100      12         1100
5          0101      13         1101
6          0110      14         1110
7          0111      15         1111
Figure 5
As part of the specification of this machine, an address field containing 0000 will always mean
that the field holds no information pertaining to the instruction. In short, the instruction does
not involve this field. The operation code field, if it contains 0000, keeps its meaning, load.
Based on the above specification for this hypothetical computer, a typical instruction would be:
0010    0010                         0100                            0000
Add     the content of Register 2    to the content of Register 4    (not used)
Figure 6 One machine instruction
In our example we have separated each field, but as far as the processor is concerned, it sees
only one string of binary digits, as in: 0010001001000000. In this example, let us suppose that
register 2 contains the value 20, and register 4 contains 5; after the execution of this instruction
the accumulator will contain the value 25.
Example
Given the following pseudocode algorithm, write the equivalent machine language version.

Load the accumulator with the value stored at memory location 12.

Add the content of register 2

Multiply the content of register 3

Store the result at memory location 12.
See Figure 2 for the binary equivalent of each operation code, Figure 3 for the binary equivalent
of each register, and Figure 5 for the binary equivalent of the memory address. The result of
these combinations is shown below. For readability, a space is placed between the groups of
digits that represent the fields of each instruction.
0000 0000 0000 1100    Load the accumulator from memory location 12
0010 0010 0000 0000    Add the contents of register 2
0100 0011 0000 0000    Multiply by the contents of register 3
0110 0000 0000 1100    Store the result at memory location 12
Figure 7
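The translation in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names OPCODES, encode, reg_a and reg_b are hypothetical, the operation codes are those of Figure 2, and a field value of 0 stands for a field that the instruction does not use.

```python
# Operation codes of the first hypothetical machine, taken from Figure 2.
OPCODES = {
    "load": 0b0000, "copy": 0b0001, "add": 0b0010, "subtract": 0b0011,
    "multiply": 0b0100, "divide": 0b0101, "store": 0b0110,
}


def encode(operation, reg_a=0, reg_b=0, address=0):
    """Return one instruction as four 4-bit fields separated by spaces.

    Field order follows Figure 1: operation code, register number,
    register number, memory address. A value of 0 marks an unused field.
    """
    fields = (OPCODES[operation], reg_a, reg_b, address)
    for value in fields:
        if value not in range(16):
            raise ValueError("each field must fit in four bits")
    return " ".join(format(value, "04b") for value in fields)


# The four steps of the example, in order.
program = [
    encode("load", address=12),
    encode("add", reg_a=2),
    encode("multiply", reg_a=3),
    encode("store", address=12),
]
for word in program:
    print(word)
```

Each call builds one 16-bit instruction, so the four calls correspond to the four steps of the pseudocode.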
As we have stated earlier, each type of machine has its own machine instruction set. If
we have another type of computer whose arithmetic operation codes are different from the
current example, then this other machine cannot execute the instructions designed for the first
machine. Figure 8 shows the instruction set for a different type of machine.
Basic instruction    Binary equivalent
Load                 0001
Copy                 0010
Add                  0011
Subtract             0100
Multiply             0101
Divide               0110
Store                0111
Figure 8
If we use the previous machine instruction code on this machine, by now you should see that it
will not work, since the codes for the operation fields are not compatible. In the first machine the
operation code for load is 0000, and in the second it is 0001.
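To see the incompatibility concretely, the short Python sketch below reads the operation field of the same instruction using the tables of Figure 2 and Figure 8. The names MACHINE_1, MACHINE_2 and decode_operation are hypothetical and serve only this illustration.

```python
# Operation code tables: Figure 2 (first machine) and Figure 8 (second machine).
MACHINE_1 = {"0000": "Load", "0001": "Copy", "0010": "Add", "0011": "Subtract",
             "0100": "Multiply", "0101": "Divide", "0110": "Store"}
MACHINE_2 = {"0001": "Load", "0010": "Copy", "0011": "Add", "0100": "Subtract",
             "0101": "Multiply", "0110": "Divide", "0111": "Store"}


def decode_operation(instruction, opcode_table):
    """Return the operation named by the first four bits of an instruction."""
    opcode = instruction.replace(" ", "")[:4]
    return opcode_table.get(opcode, "undefined")


word = "0110 0000 0000 1100"  # written for the first machine
print(decode_operation(word, MACHINE_1))  # Store
print(decode_operation(word, MACHINE_2))  # Divide
```

The same bit pattern means store on one machine and divide on the other, and the code 0000 is not defined at all on the second machine.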
The major advantage of first generation language is that the code runs very fast and
efficiently because it is directly executed by the CPU. That is, the code does not need any
translation. The major disadvantages are: the code is machine dependent, and the series of 1s
and 0s make it tedious for the programmer to write, especially for long programs. This is a recipe
for making typographical errors.
Second Generation language
The second type of low-level language is called second generation language, popularly
called assembly language. In second generation languages, instructions are not written in
binary digits as we saw with first generation language. Instead, operation codes are written as
mnemonic codes (short abbreviated words); registers are written as Rn, where n denotes a
number; and memory addresses are written as hexadecimal numbers, or are sometimes
referenced by register names. This makes programming much easier than trying to program
using binary numbers.
Assembly Language
As you know, a computer (the processor) can only understand binary information. Mnemonic
codes must therefore be translated into binary codes before they can run. Every type of computer
has its associated second generation programming language, called its assembly language. The
assembly language code that you type is called the source code, sometimes called the source
program. The source code needs a program to translate it into machine language instructions.
The translated version of the source code is called object code, or object program. A program
that translates assembly language source code into machine instructions is called an assembler.
Two of the most popular assemblers for the personal computer are Borland Turbo Assembler
(TASM) and Microsoft Macro Assembler (MASM). Figure 9 shows the flow of programming
activities necessary when using second generation language.
Assembly language source code that you typed using a text editor:
    LDA 12
    ADD R2
    MUL R3
    STO 12
        |
        v
TASM/MASM Assembler: an assembler such as Borland Turbo Assembler (TASM), or Microsoft
Assembler (MASM), converts source code to machine language
        |
        v
Machine language (the assembler takes the source as input and translates it into machine language):
    0000 0000 0000 1100
    0010 0010 0000 0000
    0100 0011 0000 0000
    0110 0000 0000 1100
Figure 9
As in the case of machine language instructions, the object code that is generated by an
assembler can only run on machines of the same kind, because the object code is indeed
machine language instructions. This is so because assembly language is a machine-specific
programming language: there is a one-to-one correspondence between each of its statements
and the computer's indigenous machine language instructions.
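The one-to-one correspondence can be illustrated with a toy assembler written in Python. This is a minimal sketch, not how TASM or MASM work: the names MNEMONICS, assemble_line and assemble are hypothetical, only the four mnemonics that appear in Figure 9 are handled, operation codes follow Figure 2, and a numeric operand is read as a decimal memory address, as in the figure.

```python
# Mnemonics used in Figure 9, mapped to the operation codes of Figure 2.
MNEMONICS = {"LDA": "0000", "ADD": "0010", "MUL": "0100", "STO": "0110"}


def assemble_line(line):
    """Translate one assembly statement into one machine instruction."""
    mnemonic, operand = line.split()
    register, address = 0, 0
    if operand.upper().startswith("R"):
        register = int(operand[1:])   # operand such as R2 names a register
    else:
        address = int(operand)        # operand such as 12 names a memory location
    fields = [MNEMONICS[mnemonic.upper()], format(register, "04b"),
              "0000", format(address, "04b")]
    return " ".join(fields)


def assemble(source):
    """Translate every non-blank line; one statement yields one instruction."""
    return [assemble_line(line) for line in source.splitlines() if line.strip()]


source = """
LDA 12
ADD R2
MUL R3
STO 12
"""
object_code = assemble(source)
for word in object_code:
    print(word)
```

Four source statements produce exactly four machine instructions, which is the property that distinguishes an assembler from the translators discussed later in the chapter.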
High Level Languages
A high level programming language is any computer programming language in which
instructions are written in a language that resembles human (natural) language. That is, its
code is further removed from the machine language of the first generation than assembly
language is. See Figure 10.
High level language
Assembly language
Machine language
Hardware
Figure 10
Because of the difficulties encountered using low-level languages, high-level languages were
developed to make programs easier to write and to read. These languages use words that more
clearly describe the task being performed. In general, the main advantages of high-level
languages over low-level languages are that they are easier to read, write, and maintain. High
level languages span the third, fourth, and fifth generations, and beyond. The rest
of the chapter describes these generations of programming languages.
Third Generation Languages
Third generation languages are largely procedural. That is, they concentrate on how to do
something, step by step, rather than on simply stating what is to be done. Procedural
programming languages follow a pattern similar to the way solution algorithms are designed, as
described in Chapter 4. Another feature of third generation languages is that the programmer's
code, called the source program or source code, must be translated into machine language. This
requires special translator programs to convert the source code into machine code.
There are two kinds of translators used to translate and execute a program: the compiler
and the interpreter. Both translators have one thing in common: they convert the source code
into machine language. The process by which they carry out the translation of the source code
differs. An interpreter translates and executes the code one line at a time. Thus if the program
has a syntax error (a violation of the rules of the language) lower down in the code, you never
know until the interpreter reaches that statement. A compiler, on the other hand, makes sure that
there is no syntax error anywhere in the program before the program starts to execute. As a
result, many programmers prefer a compiler to an interpreter. Generally an interpreter is easier
to learn to use than a compiler, but a compiler gives a better trade-off. That is, once the source
code is compiled, the compiled version, called the executable code, can be executed repeatedly
without having to re-compile the source code. Interpreters do not behave this way. Each time
the program is to be executed, the source code must be re-translated. Figure 11 shows the
flow of activities carried out by an interpreter, from creating the program, to translating the
program, to executing the program.
1. Create/Edit source code.
2. Another line of code?
      No: program ends (normal termination).
      Yes: go to step 3.
3. Interpreter (check syntax). Line of code has error?
      Yes: program aborts (abnormal termination).
      No: go to step 4.
4. Execution (perform task), then return to step 2.
Figure 11
The programming process begins the time you create or type the program, as shown in
Figure 11. The code you type is called the source code. When the interpreter is supplied a
program, it checks whether there are more lines of code in the file. If the answer is no, this
signifies that the source code has ended, and the program terminates normally. If there is another
line of code in the source file, it is interpreted. If that line has a syntax error, the program halts,
but this time abnormally. The programmer then has the option of fixing the error and re-submitting
the entire code to be interpreted again. If there is no error in the line, that line of code gets
executed, and the process starts over for the next line of code, until the entire program is
interpreted and executed.
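The loop just described can be sketched in Python, using the built-in functions compile and exec as a stand-in for the translate and execute steps. This is an illustration only; the function name interpret is hypothetical, and we assume that every statement fits on a single line.

```python
def interpret(source_lines):
    """Translate and execute one line at a time, stopping at the first bad line."""
    variables = {}
    for number, line in enumerate(source_lines, start=1):
        try:
            # Check the syntax of this line only.
            code = compile(line, "line-%d" % number, "exec")
        except SyntaxError:
            return variables, "aborted at line %d" % number
        exec(code, {}, variables)  # perform the task of this line
    return variables, "normal termination"


program = ["x = 2", "y = x * 10", "z = = y", "w = 1"]
state, status = interpret(program)
print(status)
print(state)
```

The first two lines are executed before the error on the third line is discovered, and the fourth line is never reached.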
Figure 12 shows the flow of activities carried out by a compiler, from creating the program,
to translating the program, to executing the program. As opposed to the interpreter, the compiler
makes sure that there are no syntax errors in the entire source code before the program starts
to execute.
1. Create/Edit source code.
2. Compile (entire program). Syntax error?
      Yes: return to step 1.
      No: output object code.
3. Link. The linker combines object code and library code (output = load module). Link successful?
      No: return to step 1.
      Yes: go to step 4.
4. Load (place load module in memory).
5. Execute program. Runtime error (exception)?
      Yes: return to step 1.
      No: go to step 6.
6. Logic error? (Is the answer correct?)
      No, the answer is incorrect: return to step 1.
      Yes: it works!
Figure 12
The process of compiling and executing a program is more complicated than interpreting
a program. First we will look at the similarities, and then the differences. As in the case of the
interpreter, the programming process begins the time you create or type the program, as shown in
Figure 12. The compiler examines all of the source code to make sure that all lines of code are
free of syntax errors. If there is any syntax error, the error must be fixed, and the
program is re-submitted for compilation. Once it is determined that the source code has no syntax
error, the compiler produces output code in binary, called object code.
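The difference from the interpreter can be sketched with the same Python built-ins, this time handing the whole source to compile at once. Note the assumptions: Python's compile produces a code object rather than true object code, so this is only an analogy, and the function name compile_program is hypothetical.

```python
def compile_program(source):
    """Check the whole source first; return (code, None) or (None, bad line number)."""
    try:
        return compile(source, "program.src", "exec"), None
    except SyntaxError as error:
        return None, error.lineno


source = "x = 2\ny = x * 10\nz = = y\nw = 1\n"
code, bad_line = compile_program(source)
print(code, bad_line)  # nothing has been executed at this point

# After the error is fixed, the program is translated once and can run repeatedly.
fixed = source.replace("z = = y", "z = y")
code, bad_line = compile_program(fixed)
namespace = {}
exec(code, namespace)
exec(code, namespace)  # second run, no re-translation needed
print(namespace["z"], namespace["w"])
```

Here the error on the third line is reported before the first line has run, which is the behaviour the paragraph above attributes to a compiler.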
When a programmer writes a program, it is rare that he/she writes all the code
necessary for the program to work. For example, the code for input and output is difficult and
long to write, so the compiler designers usually write it for us and store it in file(s)
commonly called a library. The next step in the compilation process is to combine some of that
code with your object code to form another file of code called a load module. The compiler has
another module called the linker, which links the object code and the library code. The linker
typically stores this new version of your program in a file with the extension “.exe”. If the linking
is unsuccessful, the necessary corrections have to be made, and the compilation process begins
again. If the linking is successful, a third module of the compiler places the load module into
primary memory for the program to be executed. It is at this stage that the program is supplied
data, and that we get output.
Although the program may compile and link, this does not guarantee that it will execute
successfully; another type of error could occur. This kind of error is called an exception. It occurs
only during runtime; hence it is also called a runtime error. When an exception occurs, the program
is aborted abruptly for some unforeseen reason. For instance, if a program attempts to
divide by zero, or read from an empty file, or read from a file that does not exist, the computer
would abort the program instantly. Essentially, exceptions are impossible tasks that the program
is requesting the machine to perform. The machine's response therefore is to terminate the
execution of the program abnormally. When this happens, the programmer must once again fix
the problem and re-submit the program for re-compilation. Lastly, if the solution is incorrect, due
to a logic error, then the correction has to be made, and the compilation process begins all over
again.
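Two of the runtime errors mentioned above can be reproduced in Python. In this sketch the helper run, the task functions, and the file name are all hypothetical; the helper simply reports the name of the exception that stopped a task, so that the example itself can finish.

```python
def run(task):
    """Run a task and report the name of the exception that stopped it, if any."""
    try:
        task()
    except Exception as error:
        return type(error).__name__
    return "completed"


def divide_by_zero():
    return 10 / 0


def read_missing_file():
    # Hypothetical file name that is assumed not to exist.
    with open("no_such_file_for_this_example.txt") as handle:
        return handle.read()


def add_numbers():
    return 10 + 5


print(run(divide_by_zero))
print(run(read_missing_file))
print(run(add_numbers))
```

All three tasks are free of syntax errors, yet two of them cannot be carried out when the program actually runs.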
Another feature of third generation languages is that one statement (instruction) in any of
these languages generally generates several machine language instructions. This feature is true
for all high level programming languages. For instance, the algebraic statement:
Y=(Y+X)*Z
is coded just as you see it here in Java, C and Basic. In Pascal it is coded as:
Y := ( Y + X ) * Z. Notice that it is only the assignment symbol that is different. When this
statement is compiled/interpreted, however, the assembly code generated is similar to what is
shown in Figure 13.
Operation                                  Assembly language    Machine language
Load accumulator from memory location Y    LDA Y                0000 0000 0000 xxxx
Copy memory location X into register R2    MOV R2, X            0001 0010 0000 . . . .
Add contents of accumulator and R2         ADD R2               0010 0010 0000 0000
Copy memory location Z into register R2    MOV R2, Z            0001 0010 0000 . . . .
Multiply accumulator by R2                 MUL R2               0100 0010 0000 0000
Save value in accumulator in memory Y      STO Y                0110 0000 0000 xxxx
Figure 13
In Figure 13 we see that the single high level language statement y = ( y + x ) * z, when
compiled, generates six assembly language statements, which in turn generate six machine
instructions.
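To check that these six instructions really compute ( y + x ) * z, we can step through them with a small simulator in Python. This is a sketch under stated assumptions: the function name run and the sample values are hypothetical, and memory is indexed by variable name instead of by address, because the addresses are left unspecified in Figure 13.

```python
def run(program, memory):
    """Execute the six-instruction program on a toy accumulator machine."""
    accumulator = 0
    registers = [0] * 8
    for operation, *operands in program:
        if operation == "LDA":      # load accumulator from memory
            accumulator = memory[operands[0]]
        elif operation == "MOV":    # copy memory into a register
            register, name = operands
            registers[register] = memory[name]
        elif operation == "ADD":    # accumulator = accumulator + register
            accumulator += registers[operands[0]]
        elif operation == "MUL":    # accumulator = accumulator * register
            accumulator *= registers[operands[0]]
        elif operation == "STO":    # save accumulator to memory
            memory[operands[0]] = accumulator
        else:
            raise ValueError("unknown operation: " + operation)
    return memory


program = [
    ("LDA", "Y"), ("MOV", 2, "X"), ("ADD", 2),
    ("MOV", 2, "Z"), ("MUL", 2), ("STO", "Y"),
]
memory = run(program, {"X": 3, "Y": 4, "Z": 5})
print(memory["Y"])  # (4 + 3) * 5
```

With X = 3, Y = 4 and Z = 5, the value stored back in Y is 35, the same result the single high level statement would give.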
Some of the best-known third generation languages are C, C++, Pascal, FORTRAN,
COBOL, and Basic. All but Basic are usually compiled; Basic is usually interpreted.
Fourth Generation Languages
Fourth-generation languages are so named because they show a direct departure from the
previous generations. Fourth-generation language instructions are not written in binary, nor
are they written in assembly format. We know from the previous sections that third-generation
languages are procedural in nature. That is, they concentrate on how things get done.
Fourth-generation languages do not fit this model either. Instead, fourth-generation languages
describe what is to be done in a more or less natural language format. That is, they state the
goals to be achieved, but they do not list the steps to achieve the goals. The three most typical
features of fourth-generation languages are:

They are non-procedural. As mentioned, they do not focus on how the task gets done; rather,
their instructions focus on what needs to be done.

They use English-like phrases and sentence formats to issue instructions.

As a result of the two previous points, fourth-generation languages are favoured in
industries which depend on data retrieval and queries. Employees in this environment
do not necessarily have to be knowledgeable in computer science; instead, they are
usually trainable individuals who can write commands. Fourth-generation languages are
widely regarded as increasing productivity in the workplace.
One broad area of application where fourth-generation languages are used is database
querying with Structured Query Language (SQL). In a relational database, data are stored in a
table as shown in Figure 14. The first row of the table shows the attributes of the table: Id,
LastName, etc. Let's say the name of the table is Employee, and you want to see the first name,
last name, and position of all employees who earn $50,000.00 or more. In a fourth-generation
language, a typical command would be somewhat like this:
SELECT FirstName, LastName, position FROM Employee WHERE salary >= 50000 ;
A command like this does not require programming knowledge to understand. The individual can
be trained to construct queries such as this.
Id        LastName    FirstName     Salary    Position
M-0010    Smith       James         50000     Manager
F-1000    Richards    Mary          75000     Manager
F-2000    Hammond     Berrisford    40000     Staff
M-0010    Smith       Maureen       28000     Part-Time
M-0030    Harvey      Val           55000     Staff
Figure 14
In this case the required information would be:
James Smith Manager
Mary Richards Manager
Val Harvey Staff
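If you would like to try the query yourself, the sketch below loads the rows of Figure 14 into a temporary in-memory database using Python's standard sqlite3 module and runs the same SELECT command. The column types are our own assumption, since the chapter does not specify them.

```python
import sqlite3

connection = sqlite3.connect(":memory:")  # temporary database, nothing is saved
connection.execute(
    "CREATE TABLE Employee "
    "(Id TEXT, LastName TEXT, FirstName TEXT, Salary INTEGER, Position TEXT)"
)
rows = [
    ("M-0010", "Smith", "James", 50000, "Manager"),
    ("F-1000", "Richards", "Mary", 75000, "Manager"),
    ("F-2000", "Hammond", "Berrisford", 40000, "Staff"),
    ("M-0010", "Smith", "Maureen", 28000, "Part-Time"),
    ("M-0030", "Harvey", "Val", 55000, "Staff"),
]
connection.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?, ?)", rows)

# The command from the text: it states what is wanted, not how to find it.
query = "SELECT FirstName, LastName, position FROM Employee WHERE salary >= 50000 ;"
result = connection.execute(query).fetchall()
for first_name, last_name, position in result:
    print(first_name, last_name, position)
connection.close()
```

Notice that the Python code only sets up the table; the searching and filtering are left entirely to the SQL command.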
Fifth Generation Languages and Beyond
As opposed to first, second, third and fourth generation languages, fifth generation
languages are built around the concept of solving problems using constraints, rather than using
an algorithm. By this we mean that a problem is solved by stating the constraints (conditions,
properties) which must be satisfied by a solution to the problem. This way, the programmer
only needs to worry about what problems need to be solved and what conditions need to be met,
without worrying about how to implement any algorithm to solve them.
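The flavour of constraint solving can be suggested with a small Python sketch. Python is not a fifth generation language, so this only imitates the idea: the problem is stated as two conditions, and a general-purpose search (the hypothetical function solve) looks for values that satisfy them. The sample problem is our own.

```python
from itertools import product


def solve(variables, domain, constraints):
    """Return every assignment of domain values that satisfies all constraints."""
    solutions = []
    for values in product(domain, repeat=len(variables)):
        candidate = dict(zip(variables, values))
        if all(constraint(candidate) for constraint in constraints):
            solutions.append(candidate)
    return solutions


# The problem is stated only as conditions that a solution must meet.
constraints = [
    lambda v: v["x"] + v["y"] == 10,
    lambda v: v["x"] - v["y"] == 2,
]
solutions = solve(["x", "y"], range(10), constraints)
print(solutions)
```

The person stating the problem writes only the two conditions; the same solve function could be reused unchanged for a different set of conditions.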
Artificial Intelligence (AI)
Fifth-generation languages are used mainly in the area of artificial intelligence (AI). The
field of artificial intelligence focuses on areas such as:

Deductive reasoning - Like human beings, this branch of AI tends to solve most of
its problems using fast, intuitive judgments rather than a conscious, step-by-step
deductive approach.

Knowledge representation - This is a method whereby knowledge about a topic or an
object is stored in an expert system. The knowledge is typically a series of
IF condition THEN take action rules.

Machine learning - This area of AI focuses on finding patterns in data. The idea behind
machine learning is to replace the writing of code with the supplying of data to the
computer, and then to let the computer figure out what information is needed from the data
by looking at some examples that have also been fed to it. The main idea, after all of this,
is to have the computer supply a generalized solution that goes beyond just the examples
that were given to it.

Natural language processing - This branch of study is concerned with the interactions
between computers and human natural languages. That is, such systems can convert
information from computer databases into readable human language. They can even
convert samples of human language into more formal representations that are easier for
computer programs to manipulate.

Motion navigation - The field of robotics is one such area of AI. That is, intelligence is
required for robots to be able to handle such tasks as manipulating objects and navigating
among them: knowing where the objects are, learning what is around them, and figuring
out how to get an object from one location to another.
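The IF condition THEN take action rules mentioned under knowledge representation can be illustrated with a few lines of Python. This is only a sketch of the idea, not a real expert system; the function name infer and the sample rules are hypothetical.

```python
def infer(facts, rules):
    """Apply IF-THEN rules repeatedly until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # IF every condition is known THEN add the conclusion.
            if conditions.issubset(known) and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


rules = [
    ({"engine will not start", "lights are dim"}, "battery is weak"),
    ({"battery is weak"}, "recharge the battery"),
]
derived = infer({"engine will not start", "lights are dim"}, rules)
print(sorted(derived))
```

The knowledge lives in the list of rules, so new knowledge can be added by appending a rule without changing the infer function.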
Beyond Fifth Generation Languages
Beyond fifth generation language (AI) are two new programming paradigms – visual
programming language (VPL) and object oriented programming (OOP).
Visual Programming Language
Visual programming languages, as the term suggests, are those programming languages
which let you build programs using icons rather than text. That is, a VPL is based on the
idea of using buttons, text boxes, check boxes, command buttons, labels and connecting arrows
to create graphical user interface (GUI) programs on your computer screen.
Visual Basic (VB) is one of the forerunners of VPL. Its main attraction, unlike many
other languages, is the ease with which it allows the programmer to create appealing
graphical user interface programs with little coding. Other programming languages may require
hundreds of lines of code and several hours of programming. The way that most VPLs work is
that as the programmer lays out the buttons, labels, arrows, etc., on the GUI form area, much of
the program code is generated automatically. The language provides us with a tool box,
from which you can select the various items to build your GUI. Figure 16 shows a typical Visual
Basic tool box. Most VPLs, including VB, follow this three-step generic format when developing
an application:
1. Design the appearance of your GUI application before setting it up.
2. Assign property settings to the objects of your GUI program.
3. Write any necessary code to direct specific tasks at runtime.
Tool box items: Picture, Label, Text box, Button, Check box, Combo box, Radio button,
Vertical Scrollbar, Timer, Shape, Line, Image, Data control
Figure 16
Figure 17
Object Oriented Programming (OOP)
Object oriented programming (OOP), which is also beyond fifth generation programming,
is a type of programming that defines both the data and any operations that can be performed
on the data as a complete program unit. The entire unit, both data and operations, is
referred to as the object. In this paradigm the data declarations are called fields, and the
operations are called methods. Both the fields and the methods are defined together as one unit,
typically called a class.
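As a brief illustration of fields and methods held together in one class, consider the following sketch. It is written in Python rather than Java, purely to keep it short, and the class name Rectangle and its members are hypothetical.

```python
class Rectangle:
    """One program unit that holds both the data and the operations on it."""

    def __init__(self, length, width):
        self.length = length  # field
        self.width = width    # field

    def area(self):           # method that uses the fields
        return self.length * self.width

    def perimeter(self):      # another method on the same data
        return 2 * (self.length + self.width)


room = Rectangle(4, 3)  # an object created from the class
print(room.area(), room.perimeter())
```

The object room carries its own length and width, and the operations area and perimeter travel with it.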
Writing object-oriented programs requires an object-oriented programming language
(OOPL). Three of the more popular object oriented languages are Java, C++ and Smalltalk. The
following running example gives an understanding of some of the features of OOP. We will use
the Java programming language to highlight these features. This section does not intend to teach
Java; it is only to demonstrate a few features of object oriented programming. In the following
running example we will limit our discussion to finding the area of some types of two dimensional
surfaces, and the volume of some three dimensional figures as well.
After this brief introduction to object oriented programming, we discover that the major
advantages of OOP are:

Modularity - each object forms a separate entity with its own set of data and its own set of
operations. This concept, called encapsulation, protects the data by making it difficult, if
not impossible, for objects outside of the entity to access the data.

Modifiability - it is easy to make minor changes to the entity in terms of the data
representation or operations. In addition, changes inside a class generally do not affect
any other part of the program.

Extensibility - the concept of inheritance allows you to create new entities from existing
ones by adding new features. The extended entity is a specialized version of the original
entity.

Maintainability - because each object is distinct, it is easier to maintain an entire
system by modifying only those entities that require changes. Changes made to an
entity usually have little or no effect on existing modules, thereby reducing
programming time, maintenance costs, and program development time.

Re-usability - a given entity can be reused in different programs at any time.

Simplicity - object oriented programming models real world objects. The programming is
built around the concept of what fields are involved in the object and what operations are
required on these fields. This concept reduces program complexity and makes the
program structure clear and easily understood.
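Two of these advantages, encapsulation and extensibility through inheritance, can be seen in the short Python sketch below. The class names Circle and Cylinder are hypothetical. Note also that the leading underscore is only a Python naming convention for internal data; unlike some other languages, Python does not enforce it.

```python
import math


class Circle:
    def __init__(self, radius):
        self._radius = radius  # internal data, reached through methods

    def area(self):
        return math.pi * self._radius ** 2


class Cylinder(Circle):
    """Extends Circle: reuses its data and area(), adds a height and volume()."""

    def __init__(self, radius, height):
        super().__init__(radius)
        self._height = height

    def volume(self):
        return self.area() * self._height  # area() is inherited from Circle


can = Cylinder(1, 2)
print(can.area(), can.volume())
```

Cylinder did not have to repeat the code for the area of a circle, and a later change to how Circle stores its radius would not require any change to volume.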