Download Lecture 1: Introduction to CS 18

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
CS18
Integrated Introduction to Computer Science
Nelson, Greenwald
Lecture 1: Introduction to CS 18
Jan 25, 2017
Contents
1 Introduction to CS 18
1
2 Getting Started
3
2.1
Key Features of Java and Scala . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3
2.2
The Java Virtual Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4
3 Hello, World!
5
4 Getting Started with Java
6
A Why Java?
11
Objectives
By the end of this lecture, you will know:
• our goals for the semester
• key features of Java and Scala
• what the Java virtual machine (JVM) is
By the end of this lecture, you will be able to:
• write a program that prints "Hello world!" in Java
1
Introduction to CS 18
Welcome to CS 18! This course builds on CS 17 and CS 19. The mechanics of CS 18 are very similar
to those of CS 17. Lectures, homeworks, labs, projects, and exams all work in pretty much the same
way. The only minor distinction that the TAs hold weekly review sessions, instead of workshops.
Content-wise, here’s a rundown of some of what we will cover this semester:
• Programming Paradigms:
– Imperative Programming: computations are expressed in terms of state—a program’s
current information. Statements can change that state. The key word here is mutation.
CS18
Lecture 1: Introduction to CS 18
Jan 25, 2017
– Object-Oriented Programming (OOP): computations are expressed in terms of
classes and their instances—so-called objects—and how they compose, inherit from,
and interface with one another.
• Data Structures:
– Arrays: collections of data, indexed by integers, with constant time access to individual
elements
– Lists: implemented as linked lists and dynamic arrays
– Hash Tables: collections of data, indexed by non-integers, with (average) constant time
access to individual elements
– Heaps: tree-based data structures such that the greatest (or least) element is always
stored at the root
– Self-Balancing Binary Search Trees: BSTs that automatically rearrange themselves
as necessary to keep their depth small
• Algorithms:
– Dynamic Programming: faster algorithms for some problems we solved last semester
using exhaustive search
– Sorting Algorithms, in addition to those already covered in CS 17
– Graph Algorithms: algorithms that manipulate connected data
– Parsing: algorithms that parse text based on a given specification
• Tools for proving program correctness:
– Mathematical Induction: a technique for proving properties of the natural numbers
– Program (Loop) Invariant: a condition that is initially true, and remains true,
throughout the execution of a program (or loop)
• Programming Languages:
– Java
– Scala
• Programming Models:
– Client-Server Architecture: a network architecture in which work is divided between
service providers, known as servers, and service requesters, known as clients
– Web Programming: programming that generates web pages and their content
• Programming Environments and Libraries:
– Eclipse: an integrated development environment (IDE)
– JavaFX: a library for developing graphical user interfaces (GUIs) for Java and Scala
programs
2
CS18
2
Lecture 1: Introduction to CS 18
Jan 25, 2017
Getting Started
We will start off this semester with Java, an object-oriented programming language that is widely
used in industry today. About halfway through the semester, we will transition to Scala, an
object-oriented programming language that supports both functional and imperative programming.1
Hence, in CS 18, you’ll be learning how to program in both Java and Scala. But, more generally,
you’ll learn how to structure programs using the object-oriented and imperative paradigms—the
focus of CS 18—together with the functional programming paradigm of CS 17.
2.1
Key Features of Java and Scala
• Imperative: Some would say that imperative programming is more “natural” than functional,
because we often think “imperatively.” Imperative programs are sequences of steps, much like
recipes. As each step executes, the program’s state changes. This is called mutation.
• Object-Oriented: Scala and Java include fundamental object-oriented design capabilities
such as inheritance, encapsulation, and subtype polymorphism, which facilitate code re-use
and extensibility. We will cover these principles in detail over the course of the semester.
• Platform Independence: Java’s motto is “Write Once, Read Anywhere,” meaning: write a
program once, compile it once, and run it everywhere.2 Both Java and Scala achieve this ideal
via the JVM (Java virtual machine).3
• Extensive Libraries: Although one of the CS 17 / 18 mottos is “no magic,” both Java
and Scala actually enable a whole lot of magic, making it possible for programmers to write
program quickly. Q: How do they achieve this? A: With extensive libraries.
A library (in computer science) is a collection of programs that are organized to be easily
invoked by other programs. There exist Java and Scala libraries that do pretty much everything
(e.g., networking, graphics, parsing, security, etc.).
That said, in CS 18, we work from the ground up. You’ll be learning how to implement many
fundamental data structures—like the lists you used in Racket (but with mutation!)—all on
your own. So, in short, hands off the libraries unless we tell you otherwise.
Like Racket, which is an offspring of Lisp, Java derives from and extends earlier languages. Java
syntax is closest to that of C++, which, in turn, is based on C. Its execution model, the JVM,
although popularized by Java, dates back to Smalltalk.
Scala runs on the JVM and can leverage all code written in Java, including Java’s API. Hence, Scala
is said to interoperate with Java.
1
A few of the companies who are using Scala at the time of writing include: LinkedIn, Twitter, FourSquare, Netflix,
Tumblr, AirBnB, and Apple.
2
In reality, “Write Once, Run Anywhere” is often more like “Write Once, Debug Everywhere” due to minor
incompatibilities and inconsistencies among JVM implementations.
3
And so do a number of other languages, like Groovy, Clojure, JRuby, and Jython.
3
CS18
2.2
Lecture 1: Introduction to CS 18
Jan 25, 2017
The Java Virtual Machine
People write source code, but machines read “machine code,” or, more specifically, platformdependent native code. There are two ways (and combinations thereof) to run source code: it
can be compiled into native code, so that it can be directly executed, or it can be interpreted,
meaning that it is indirectly executed by another program, called an interpreter.
A compiler is a program that translates source code into native code, which is then directly
executed on the intended platform. Hence, a compiler is specific to a given machine architecture
(e.g., there is a C compiler for Macs, and another one for IBMs). Using a compiler (only), running a
program is a two-step process:
1. Compilation: Translate the source code into native code.
2. Execution: Run the native code.
An interpreter is a program that executes code. Historically, the code executed by an interpreter
was source code, written by humans (e.g., the first Lisp interpreter). Today it is more common
for there to be an additional layer of indirection, so that the code executed by an interpreter is
not source code, but virtual machine code, output by a compiler. Indeed, Java and Scala,4 are
compiled into virtual machine code (also called bytecode), which is then interpreted by the JVM,
as illustrated below:
Compilers are difficult to write, but they are well-optimized so that compiled code executes very
fast. Interpreters are easy to write,5 but they tend to execute code very slowly. On the JVM, byte
code, not source code, is interpreted. Byte code interpreters are relatively easy to write and they
execute code relatively quickly. So this additional layer of indirection helps achieve a sweet spot
between writing correct code and generating code that executes quickly.
Furthermore, the JVM facilitates platform independence. A developer can code on a MAC, and
then compile to byte code, which they can then release, and that byte code should run reliably
on a Linux machine. This approach is preferable to either releasing their source code for users to
attempt to compile and run on their own (this would both give away intellectual property and be a
burden for users), and to compiling their source code for every imaginable platform. According to
Sun Microsystems, the creators of Java, there are over 5.5 billion JVM-enabled devices!6 Imagine a
developer attempting to release their code for even a small fraction of that many devices.
4
and Racket, and optionally OCaml
Indeed you wrote one in CS 17—during your very first semester of computer science.
6
And that statistic dates back to 2012!
5
4
CS18
3
Lecture 1: Introduction to CS 18
Jan 25, 2017
Hello, World!
When you set out to learn a new programming language, 90% of the time, the first program you
write is “Hello, World!” This program does one thing, and one thing only. It outputs “Hello, World”
to standard output (i.e., on the console).
Here it is, folks: the “Hello, World!” program in Java.
/**
* A class that prints "Hello, world!" to standard output.
*/
public class HelloWorld {
public static void main(String[] args) {
System.out.println("Hello, World!"); // print ‘‘Hello, World!"
}
}
Object-oriented code is structured in terms of classes and objects. Objects are instances of classes;
likewise, classes are abstractions of objects. Each object stores and manipulates data. The latter is
achieved via methods, which specify behavior relevant to an object. What we called functions or
procedures in CS 17 will be called methods in CS 18.
Our “Hello, World” program consists of one class definition, the HelloWorld class. That class is
declared to be public. The fact that classes can be declared public, private, etc., is called access
control. We will undertake a detailed discussion of access control in a couple of weeks. For now,
we will simply declare all our classes to be public.
Within the HelloWorld class, there is one function definition, main. Every Java program has exactly
one main method,7 which is the program’s entry point: i.e., the point where the flow of execution
begins. The main method is always declared to be both public and static. When a method is
static, it means that the method is accessible even if an instance of the class in which it resides is
not created (which it is not in this program). The main method is required to be static so that the
JVM can access the entry point of a Java program without having to instantiate any objects. It
leaves the instantiation of objects to the programmer (and we leave a discussion of instantiating
objects for a later lecture).
The signature of the main method in a Java program is void main(String[] args). That is, the
input to the main method is String[] args, and its output is void. The void keyword means
“nothing.” So, main returns nothing. Why invoke a method that returns nothing? Such a method is
executed for its side effects only. An example of a side effect is printing to standard output.
The syntax String[] means an array of strings. We will learn about arrays soon. In the meantime,
you can just think of them as finite lists. This means that the main method takes as input a
(finite) list of strings, which, by convention, is always named args in Java programs. The strings
in the list args are called command-line arguments. They are useful when you want your
7
Even though we will use the word function, not method, for the first couple of weeks of class, until we dive into
OOP, it seems anachronistic to refer to the main method as the main function, even though that is, of course, what it
was originally called when it was introduced in C.
5
CS18
Lecture 1: Introduction to CS 18
Jan 25, 2017
program’s performance to vary at run time. For example, the unix program ‘ls’ takes in a variety
of command-line arguments, such as ‘-a’ and ‘-l’. Try them. This single program can function very
differently (without recompiling) depending on its input.
Within the main method, the line System.out.println("Hello, World!"); is a call to the
System class in the Java library to print the "Hello, World!" message to standard output.
When we start programming in Scala, you will be relieved to learn that it is sufficient to type
println("Hello World!");. But Java requires that you type the entire path to the method in the
Java library.
Compiling and running the “Hello, World!” program is fairly straightforward. To compile source
code into bytecode, use ‘javac’, the Java compiler. To run bytecode, use ‘java’, the JVM.
cslab1a ~ $ javac HelloWorld.java
cslab1a ~ $ java HelloWorld
Hello, World!
By way of comparison, here is the “Hello, World” program in Racket.
"Hello, World!"
By way of another comparison, here is the “Hello, World” program in Basic, the first programming
language that Professors Greenwald and Nelson learned.
10 PRINT "Hello, World!"
It’s pretty easy to figure out what this program does, even if you don’t know any Basic. It has
one line, and that one line does one action, PRINT, which means print the string that follows the
PRINT command to standard output.
4
Getting Started with Java
We’re now going to take a look at the components of Java programs. At first, you might be a little
confused about how the pieces fit together as a program, but try to suspend your confusion and
focus on understanding the pieces alone for now.
Lexical Structure
• Java is a case-sensitive language (e.g., “if” is different than “If”)
• Comments (which are necessary for the design recipe!) take two forms:
// This is a single line comment
/*
* This is a
* multiple line
* comment
*/
6
CS18
Lecture 1: Introduction to CS 18
Jan 25, 2017
• Keywords with predefined meanings include: if, else, true, false, class, interface,
abstract, public, private, protected, static, final, void, this, etc.
• Literals (i.e., constants) are numbers, characters in single quotes, strings in double quotes,
true, false, etc..
• Identifiers are names that are given to a part of a program (e.g., a variable or a function).
Their meaning is known to Java through their declarations (e.g., int i = 18 declares an
identifier named i with the integer value 18). Identifiers are comprised of arbitrary numbers
of alphanumeric characters, and the underscore character, beginning with a lowercase letter
(by convention) (e.g., i, x18, foo, myObject).
• Punctuation Marks: {, }, (, ), ;, ., etc.
• Arithmetic Operators: +, -, *, /, %, +=, -=, etc.
• Comparison Operators: <, <=, >, >=, ==, !=, etc.
• Logical Operators: &&, ||, !
Primitive Data Types The following are primitive (i.e., built-in) data types in Java. They vary
in size, as noted. Size is measured in bytes, where 1 byte equals 8 bits. A bit is a binary digit,
meaning a 0 or a 1. The bit is the smallest unit of computer storage.
• a character type: char, whose size is 2 bytes.
• four integer types: byte, short, int, and long. These vary in size, measuring 1, 2, 4, and 8
bytes, respectively.
A char differs from a short in that the former is unsigned, so its two bytes range in value
from 0 to 216 = 65, 536; whereas the latter is signed, so its two bytes range in value from
−215 = −32, 768 to 215 = 32, 767—the 16th bit stores the sign.
Generalizing from shorts, an byte ranges in value from −27 to 27 ; an int, from −231 to 231 ;
and a long, from −263 to 263 .
• two floating-point types: float and double. These also vary in size, measuring 4 and 8
bytes, respectively. They differ from ints and longs in that much of their storage is reserved
for significant decimal digits: floats store approximately 7 significant decimal digits, and
doubles, approximately 15.
• a boolean type: boolean. A boolean stores one bit of information (true or false).
Collectively, the integer types are called integral types, and the integer and floating-point types
together are called numeric types.
Here are some examples of declaring literals of these types in a Java REPL:
java> boolean myBool = true;
java> myBool
true
7
CS18
Lecture 1: Introduction to CS 18
java> char my_favorite_letter = ’a’
java> my_favorite_letter
’a’
java> int a = 18;
java> a
18
java> int b = 018;
Invalid variable declaration
java> int b = 017;
java> b
15
java> int c = 0x18;
java> c
24
java> long l = 12345678;
java> l
12345678
java> short s = 12345678;
Static Error: Bad types in assignment: from int to short
java> short s = 1234;
java> s
1234
java> byte b = 1234;
Static Error: Bad types in assignment: from int to byte
java> byte b = 12;
java> b
12
Here are some examples of operating on literals:
java> 17 + 18;
35
java> 17.0 + 18.0;
35.0
java> 17 / 18;
0
8
Jan 25, 2017
CS18
Lecture 1: Introduction to CS 18
Jan 25, 2017
java> 17.0 / 18.0;
0.9444444444444444
java> 17 % 18;
17
java> 17.0 % 18.0;
17.0
All of that was as expected, hopefully. But now for something new and different:
java> 17 + 18.0;
35.0
java> 17.0 / 18;
0.9444444444444444
Wow! What just happened? OCaml would have BLURGHed had we attempted to add or divide
an int and a float. But Java doesn’t. Instead it does something called implicit conversion,
whereby it converts from one type to another, so as to preserve precision. That is, a byte in this
situation would be converted to a short, a short to an int, an int to a long, etc., and all of the
above to a float or double, as in these two examples.
In short, if you want to perform some arithmetic operation on two integers and produce a floatingpoint number as the result, just make one of the operands a floating-point value. Adding just a
single decimal point (writing 17. instead of 17) does the trick:
java> 17. / 18;
0.9444444444444444
String is also a common (though not primitive) type in Java. Non-primitive types are called
reference types (see Lecture 8, or so). Syntax-wise, primitive types are spelled with lower-case
letters, while references types begin with capitals.
Flow of Control Flow of control in imperative programs is more complicated than in functional
programs. Among other things, there are more control flow constructs. For today, however, we stick
with two which are already familiar from CS 17: conditionals and functions.
Conditionals
We use if/else statements to incorporate conditional logic into our Java programs:
if (hcondition1 i) {
hthenExpn1 i;
else if (hcondition2 i) {
hthenExpn2 i;
else if (hcondition3 i) {
...
else
9
CS18
Lecture 1: Introduction to CS 18
Jan 25, 2017
helseExpni;
}
This template tests a condition to see if it is true of our data, and then proceeds in one direction or
another depending on the result. For example, assuming a variable called temp:
if (temp > 100) {
System.out.println("too hot");
} else if (temp < 0) {
System.out.println("too cold");
} else {
System.out.println("ahhh...just right");
}
Functions Functions are called methods in Java and other object-oriented programming languages. But we will stick with the name function for the first few weeks, until we move into the
OOP world.
Here is the Java syntax for a one-line function:
hReturnTypei functionName(hformalArgumentsi) {
return hexpressionOfTypeReturnTypei;
}
Before the name of the function, the function’s return type is given. Then the formal parameters
are listed, again with each type preceding each name. The code appears between left and right
curly braces, delimiting the start and end of the function. The only line of a one-line function is
a return statement. This statement ‘‘returns,’’ meaning it passes flow of control back to the
function that invoked this function, and along with that flow of control it also passes an expression
whose type is a value of type ReturnType.
For example, the following addition function adds two integers and returns their sum.
/**
* Returns the sum of two numbers
*
* @param num1 - an argument
* @param num2 - another argument
*
* @return the sum of num1 and num2
*/
int add (int num1, int num2) {
return num1 + num2;
}
The general template for writing a function in Java is as follows:
hReturnTypei hfunctionNamei(hformalArgumentsi) {
hReturnTypei hreturnValuei;
// Code that computes the return value
10
CS18
Lecture 1: Introduction to CS 18
Jan 25, 2017
return hreturnValuei;
}
For example:
/**
* Returns the sum of two numbers
*
* @param num1 - an argument
* @param num2 - another argument
*
* @return the sum of num1 and num2
*/
int add(int num1, int num2) {
int sum = num1 + num2;
return sum;
}
A
Why Java?
Obviously, many programming languages existed before Java was created. There was Lisp, C, C++,
Scheme, and Perl, to name a few. So why Java? According to the Java Language Environment
white paper, Java was designed with the following five key objectives in mind:
• Simple, Object-Oriented, and Familiar: Java was meant to be accessible and easy to use.
The C family of languages had a well-known and accepted language syntax so Java leveraged
it. The object-oriented model was emphasized to aid in the development of large and complex
software from smaller and concise components (component-based design). The release also
included access to a large and extensible class library to provide fundamental building blocks
for software design.
• Robust and Secure: Java provides a layer of abstraction between the programmer and the
hardware. This safeguards the programmer from problems that arise from manual memory
management (e.g., memory leaks) and error-prone arithmetic (e.., memory corruption) that
had caused endless amounts of frustration to C/C++ programmers in the past.
• Architecture Neutral and Portable: Java code is compiled into an architecture-neutral,
intermediate byte-code and executed on the JVM. Java programs can be executed on any
architecture for which a JVM has been written. This has resulted in Java code being nearly
universally portable. Accomplishing this with other programming languages can be nothing
short of a nightmare.
• High Performance: The creators of Java wanted to make sure that the interpreted environment provided by the JVM was optimized to minimize its overhead and provide short
response times – it is a large layer of indirection after all! Initially, Java was significantly
slower than other programming languages, but they have closed the performance gap over
the last decade. Also, garbage collection (memory management) threads were intended to be
scheduled intelligently to ensure with high probability that memory would be available when
required.
11
CS18
Lecture 1: Introduction to CS 18
Jan 25, 2017
• Interpreted, Threaded, and Dynamic: Java provides integrated thread and synchronization facilities with thread-safe libraries. In addition, the interpreter is able to dynamically link
to other classes referred to in the code. This allows classes to only be linked on demand and
to be easily interchangeable and also provides quick and light-weight development cycles.
While few of these bullet points are unique to Java, the collocation of all ideas into a single package
provided programmers with a significant reason to explore Java. The marketing department at Sun
Microsystems (the company behind Java) really pushed and emphasized these capabilities of Java
and the language ended up seeing a lot of use in industry.
Also, Java debuted with Javadoc, which was one of the very first automatic documentation
generators for translating comments in source code directly into HTML. The Java class library is
incredibly well documented—take a look at the ‘‘javadocs’’ whenever you are in doubt of how
something in the Java libraries works. Also, be sure to leverage the benefits of Javadoc by using
it to present your own documentation in a nice and easy to use format for you and any others
who may use your code.
Please let us know if you find any mistakes, inconsistencies, or confusing language in this or any
other CS18 document by filling out the anonymous feedback form:
http://cs.brown.edu/courses/cs018/feedback.
12