Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
CS18 Integrated Introduction to Computer Science Nelson, Greenwald Lecture 1: Introduction to CS 18 Jan 25, 2017 Contents 1 Introduction to CS 18 1 2 Getting Started 3 2.1 Key Features of Java and Scala . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 2.2 The Java Virtual Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 3 Hello, World! 5 4 Getting Started with Java 6 A Why Java? 11 Objectives By the end of this lecture, you will know: • our goals for the semester • key features of Java and Scala • what the Java virtual machine (JVM) is By the end of this lecture, you will be able to: • write a program that prints "Hello world!" in Java 1 Introduction to CS 18 Welcome to CS 18! This course builds on CS 17 and CS 19. The mechanics of CS 18 are very similar to those of CS 17. Lectures, homeworks, labs, projects, and exams all work in pretty much the same way. The only minor distinction that the TAs hold weekly review sessions, instead of workshops. Content-wise, here’s a rundown of some of what we will cover this semester: • Programming Paradigms: – Imperative Programming: computations are expressed in terms of state—a program’s current information. Statements can change that state. The key word here is mutation. CS18 Lecture 1: Introduction to CS 18 Jan 25, 2017 – Object-Oriented Programming (OOP): computations are expressed in terms of classes and their instances—so-called objects—and how they compose, inherit from, and interface with one another. • Data Structures: – Arrays: collections of data, indexed by integers, with constant time access to individual elements – Lists: implemented as linked lists and dynamic arrays – Hash Tables: collections of data, indexed by non-integers, with (average) constant time access to individual elements – Heaps: tree-based data structures such that the greatest (or least) element is always stored at the root – Self-Balancing Binary Search Trees: BSTs that automatically rearrange themselves as necessary to keep their depth small • Algorithms: – Dynamic Programming: faster algorithms for some problems we solved last semester using exhaustive search – Sorting Algorithms, in addition to those already covered in CS 17 – Graph Algorithms: algorithms that manipulate connected data – Parsing: algorithms that parse text based on a given specification • Tools for proving program correctness: – Mathematical Induction: a technique for proving properties of the natural numbers – Program (Loop) Invariant: a condition that is initially true, and remains true, throughout the execution of a program (or loop) • Programming Languages: – Java – Scala • Programming Models: – Client-Server Architecture: a network architecture in which work is divided between service providers, known as servers, and service requesters, known as clients – Web Programming: programming that generates web pages and their content • Programming Environments and Libraries: – Eclipse: an integrated development environment (IDE) – JavaFX: a library for developing graphical user interfaces (GUIs) for Java and Scala programs 2 CS18 2 Lecture 1: Introduction to CS 18 Jan 25, 2017 Getting Started We will start off this semester with Java, an object-oriented programming language that is widely used in industry today. About halfway through the semester, we will transition to Scala, an object-oriented programming language that supports both functional and imperative programming.1 Hence, in CS 18, you’ll be learning how to program in both Java and Scala. But, more generally, you’ll learn how to structure programs using the object-oriented and imperative paradigms—the focus of CS 18—together with the functional programming paradigm of CS 17. 2.1 Key Features of Java and Scala • Imperative: Some would say that imperative programming is more “natural” than functional, because we often think “imperatively.” Imperative programs are sequences of steps, much like recipes. As each step executes, the program’s state changes. This is called mutation. • Object-Oriented: Scala and Java include fundamental object-oriented design capabilities such as inheritance, encapsulation, and subtype polymorphism, which facilitate code re-use and extensibility. We will cover these principles in detail over the course of the semester. • Platform Independence: Java’s motto is “Write Once, Read Anywhere,” meaning: write a program once, compile it once, and run it everywhere.2 Both Java and Scala achieve this ideal via the JVM (Java virtual machine).3 • Extensive Libraries: Although one of the CS 17 / 18 mottos is “no magic,” both Java and Scala actually enable a whole lot of magic, making it possible for programmers to write program quickly. Q: How do they achieve this? A: With extensive libraries. A library (in computer science) is a collection of programs that are organized to be easily invoked by other programs. There exist Java and Scala libraries that do pretty much everything (e.g., networking, graphics, parsing, security, etc.). That said, in CS 18, we work from the ground up. You’ll be learning how to implement many fundamental data structures—like the lists you used in Racket (but with mutation!)—all on your own. So, in short, hands off the libraries unless we tell you otherwise. Like Racket, which is an offspring of Lisp, Java derives from and extends earlier languages. Java syntax is closest to that of C++, which, in turn, is based on C. Its execution model, the JVM, although popularized by Java, dates back to Smalltalk. Scala runs on the JVM and can leverage all code written in Java, including Java’s API. Hence, Scala is said to interoperate with Java. 1 A few of the companies who are using Scala at the time of writing include: LinkedIn, Twitter, FourSquare, Netflix, Tumblr, AirBnB, and Apple. 2 In reality, “Write Once, Run Anywhere” is often more like “Write Once, Debug Everywhere” due to minor incompatibilities and inconsistencies among JVM implementations. 3 And so do a number of other languages, like Groovy, Clojure, JRuby, and Jython. 3 CS18 2.2 Lecture 1: Introduction to CS 18 Jan 25, 2017 The Java Virtual Machine People write source code, but machines read “machine code,” or, more specifically, platformdependent native code. There are two ways (and combinations thereof) to run source code: it can be compiled into native code, so that it can be directly executed, or it can be interpreted, meaning that it is indirectly executed by another program, called an interpreter. A compiler is a program that translates source code into native code, which is then directly executed on the intended platform. Hence, a compiler is specific to a given machine architecture (e.g., there is a C compiler for Macs, and another one for IBMs). Using a compiler (only), running a program is a two-step process: 1. Compilation: Translate the source code into native code. 2. Execution: Run the native code. An interpreter is a program that executes code. Historically, the code executed by an interpreter was source code, written by humans (e.g., the first Lisp interpreter). Today it is more common for there to be an additional layer of indirection, so that the code executed by an interpreter is not source code, but virtual machine code, output by a compiler. Indeed, Java and Scala,4 are compiled into virtual machine code (also called bytecode), which is then interpreted by the JVM, as illustrated below: Compilers are difficult to write, but they are well-optimized so that compiled code executes very fast. Interpreters are easy to write,5 but they tend to execute code very slowly. On the JVM, byte code, not source code, is interpreted. Byte code interpreters are relatively easy to write and they execute code relatively quickly. So this additional layer of indirection helps achieve a sweet spot between writing correct code and generating code that executes quickly. Furthermore, the JVM facilitates platform independence. A developer can code on a MAC, and then compile to byte code, which they can then release, and that byte code should run reliably on a Linux machine. This approach is preferable to either releasing their source code for users to attempt to compile and run on their own (this would both give away intellectual property and be a burden for users), and to compiling their source code for every imaginable platform. According to Sun Microsystems, the creators of Java, there are over 5.5 billion JVM-enabled devices!6 Imagine a developer attempting to release their code for even a small fraction of that many devices. 4 and Racket, and optionally OCaml Indeed you wrote one in CS 17—during your very first semester of computer science. 6 And that statistic dates back to 2012! 5 4 CS18 3 Lecture 1: Introduction to CS 18 Jan 25, 2017 Hello, World! When you set out to learn a new programming language, 90% of the time, the first program you write is “Hello, World!” This program does one thing, and one thing only. It outputs “Hello, World” to standard output (i.e., on the console). Here it is, folks: the “Hello, World!” program in Java. /** * A class that prints "Hello, world!" to standard output. */ public class HelloWorld { public static void main(String[] args) { System.out.println("Hello, World!"); // print ‘‘Hello, World!" } } Object-oriented code is structured in terms of classes and objects. Objects are instances of classes; likewise, classes are abstractions of objects. Each object stores and manipulates data. The latter is achieved via methods, which specify behavior relevant to an object. What we called functions or procedures in CS 17 will be called methods in CS 18. Our “Hello, World” program consists of one class definition, the HelloWorld class. That class is declared to be public. The fact that classes can be declared public, private, etc., is called access control. We will undertake a detailed discussion of access control in a couple of weeks. For now, we will simply declare all our classes to be public. Within the HelloWorld class, there is one function definition, main. Every Java program has exactly one main method,7 which is the program’s entry point: i.e., the point where the flow of execution begins. The main method is always declared to be both public and static. When a method is static, it means that the method is accessible even if an instance of the class in which it resides is not created (which it is not in this program). The main method is required to be static so that the JVM can access the entry point of a Java program without having to instantiate any objects. It leaves the instantiation of objects to the programmer (and we leave a discussion of instantiating objects for a later lecture). The signature of the main method in a Java program is void main(String[] args). That is, the input to the main method is String[] args, and its output is void. The void keyword means “nothing.” So, main returns nothing. Why invoke a method that returns nothing? Such a method is executed for its side effects only. An example of a side effect is printing to standard output. The syntax String[] means an array of strings. We will learn about arrays soon. In the meantime, you can just think of them as finite lists. This means that the main method takes as input a (finite) list of strings, which, by convention, is always named args in Java programs. The strings in the list args are called command-line arguments. They are useful when you want your 7 Even though we will use the word function, not method, for the first couple of weeks of class, until we dive into OOP, it seems anachronistic to refer to the main method as the main function, even though that is, of course, what it was originally called when it was introduced in C. 5 CS18 Lecture 1: Introduction to CS 18 Jan 25, 2017 program’s performance to vary at run time. For example, the unix program ‘ls’ takes in a variety of command-line arguments, such as ‘-a’ and ‘-l’. Try them. This single program can function very differently (without recompiling) depending on its input. Within the main method, the line System.out.println("Hello, World!"); is a call to the System class in the Java library to print the "Hello, World!" message to standard output. When we start programming in Scala, you will be relieved to learn that it is sufficient to type println("Hello World!");. But Java requires that you type the entire path to the method in the Java library. Compiling and running the “Hello, World!” program is fairly straightforward. To compile source code into bytecode, use ‘javac’, the Java compiler. To run bytecode, use ‘java’, the JVM. cslab1a ~ $ javac HelloWorld.java cslab1a ~ $ java HelloWorld Hello, World! By way of comparison, here is the “Hello, World” program in Racket. "Hello, World!" By way of another comparison, here is the “Hello, World” program in Basic, the first programming language that Professors Greenwald and Nelson learned. 10 PRINT "Hello, World!" It’s pretty easy to figure out what this program does, even if you don’t know any Basic. It has one line, and that one line does one action, PRINT, which means print the string that follows the PRINT command to standard output. 4 Getting Started with Java We’re now going to take a look at the components of Java programs. At first, you might be a little confused about how the pieces fit together as a program, but try to suspend your confusion and focus on understanding the pieces alone for now. Lexical Structure • Java is a case-sensitive language (e.g., “if” is different than “If”) • Comments (which are necessary for the design recipe!) take two forms: // This is a single line comment /* * This is a * multiple line * comment */ 6 CS18 Lecture 1: Introduction to CS 18 Jan 25, 2017 • Keywords with predefined meanings include: if, else, true, false, class, interface, abstract, public, private, protected, static, final, void, this, etc. • Literals (i.e., constants) are numbers, characters in single quotes, strings in double quotes, true, false, etc.. • Identifiers are names that are given to a part of a program (e.g., a variable or a function). Their meaning is known to Java through their declarations (e.g., int i = 18 declares an identifier named i with the integer value 18). Identifiers are comprised of arbitrary numbers of alphanumeric characters, and the underscore character, beginning with a lowercase letter (by convention) (e.g., i, x18, foo, myObject). • Punctuation Marks: {, }, (, ), ;, ., etc. • Arithmetic Operators: +, -, *, /, %, +=, -=, etc. • Comparison Operators: <, <=, >, >=, ==, !=, etc. • Logical Operators: &&, ||, ! Primitive Data Types The following are primitive (i.e., built-in) data types in Java. They vary in size, as noted. Size is measured in bytes, where 1 byte equals 8 bits. A bit is a binary digit, meaning a 0 or a 1. The bit is the smallest unit of computer storage. • a character type: char, whose size is 2 bytes. • four integer types: byte, short, int, and long. These vary in size, measuring 1, 2, 4, and 8 bytes, respectively. A char differs from a short in that the former is unsigned, so its two bytes range in value from 0 to 216 = 65, 536; whereas the latter is signed, so its two bytes range in value from −215 = −32, 768 to 215 = 32, 767—the 16th bit stores the sign. Generalizing from shorts, an byte ranges in value from −27 to 27 ; an int, from −231 to 231 ; and a long, from −263 to 263 . • two floating-point types: float and double. These also vary in size, measuring 4 and 8 bytes, respectively. They differ from ints and longs in that much of their storage is reserved for significant decimal digits: floats store approximately 7 significant decimal digits, and doubles, approximately 15. • a boolean type: boolean. A boolean stores one bit of information (true or false). Collectively, the integer types are called integral types, and the integer and floating-point types together are called numeric types. Here are some examples of declaring literals of these types in a Java REPL: java> boolean myBool = true; java> myBool true 7 CS18 Lecture 1: Introduction to CS 18 java> char my_favorite_letter = ’a’ java> my_favorite_letter ’a’ java> int a = 18; java> a 18 java> int b = 018; Invalid variable declaration java> int b = 017; java> b 15 java> int c = 0x18; java> c 24 java> long l = 12345678; java> l 12345678 java> short s = 12345678; Static Error: Bad types in assignment: from int to short java> short s = 1234; java> s 1234 java> byte b = 1234; Static Error: Bad types in assignment: from int to byte java> byte b = 12; java> b 12 Here are some examples of operating on literals: java> 17 + 18; 35 java> 17.0 + 18.0; 35.0 java> 17 / 18; 0 8 Jan 25, 2017 CS18 Lecture 1: Introduction to CS 18 Jan 25, 2017 java> 17.0 / 18.0; 0.9444444444444444 java> 17 % 18; 17 java> 17.0 % 18.0; 17.0 All of that was as expected, hopefully. But now for something new and different: java> 17 + 18.0; 35.0 java> 17.0 / 18; 0.9444444444444444 Wow! What just happened? OCaml would have BLURGHed had we attempted to add or divide an int and a float. But Java doesn’t. Instead it does something called implicit conversion, whereby it converts from one type to another, so as to preserve precision. That is, a byte in this situation would be converted to a short, a short to an int, an int to a long, etc., and all of the above to a float or double, as in these two examples. In short, if you want to perform some arithmetic operation on two integers and produce a floatingpoint number as the result, just make one of the operands a floating-point value. Adding just a single decimal point (writing 17. instead of 17) does the trick: java> 17. / 18; 0.9444444444444444 String is also a common (though not primitive) type in Java. Non-primitive types are called reference types (see Lecture 8, or so). Syntax-wise, primitive types are spelled with lower-case letters, while references types begin with capitals. Flow of Control Flow of control in imperative programs is more complicated than in functional programs. Among other things, there are more control flow constructs. For today, however, we stick with two which are already familiar from CS 17: conditionals and functions. Conditionals We use if/else statements to incorporate conditional logic into our Java programs: if (hcondition1 i) { hthenExpn1 i; else if (hcondition2 i) { hthenExpn2 i; else if (hcondition3 i) { ... else 9 CS18 Lecture 1: Introduction to CS 18 Jan 25, 2017 helseExpni; } This template tests a condition to see if it is true of our data, and then proceeds in one direction or another depending on the result. For example, assuming a variable called temp: if (temp > 100) { System.out.println("too hot"); } else if (temp < 0) { System.out.println("too cold"); } else { System.out.println("ahhh...just right"); } Functions Functions are called methods in Java and other object-oriented programming languages. But we will stick with the name function for the first few weeks, until we move into the OOP world. Here is the Java syntax for a one-line function: hReturnTypei functionName(hformalArgumentsi) { return hexpressionOfTypeReturnTypei; } Before the name of the function, the function’s return type is given. Then the formal parameters are listed, again with each type preceding each name. The code appears between left and right curly braces, delimiting the start and end of the function. The only line of a one-line function is a return statement. This statement ‘‘returns,’’ meaning it passes flow of control back to the function that invoked this function, and along with that flow of control it also passes an expression whose type is a value of type ReturnType. For example, the following addition function adds two integers and returns their sum. /** * Returns the sum of two numbers * * @param num1 - an argument * @param num2 - another argument * * @return the sum of num1 and num2 */ int add (int num1, int num2) { return num1 + num2; } The general template for writing a function in Java is as follows: hReturnTypei hfunctionNamei(hformalArgumentsi) { hReturnTypei hreturnValuei; // Code that computes the return value 10 CS18 Lecture 1: Introduction to CS 18 Jan 25, 2017 return hreturnValuei; } For example: /** * Returns the sum of two numbers * * @param num1 - an argument * @param num2 - another argument * * @return the sum of num1 and num2 */ int add(int num1, int num2) { int sum = num1 + num2; return sum; } A Why Java? Obviously, many programming languages existed before Java was created. There was Lisp, C, C++, Scheme, and Perl, to name a few. So why Java? According to the Java Language Environment white paper, Java was designed with the following five key objectives in mind: • Simple, Object-Oriented, and Familiar: Java was meant to be accessible and easy to use. The C family of languages had a well-known and accepted language syntax so Java leveraged it. The object-oriented model was emphasized to aid in the development of large and complex software from smaller and concise components (component-based design). The release also included access to a large and extensible class library to provide fundamental building blocks for software design. • Robust and Secure: Java provides a layer of abstraction between the programmer and the hardware. This safeguards the programmer from problems that arise from manual memory management (e.g., memory leaks) and error-prone arithmetic (e.., memory corruption) that had caused endless amounts of frustration to C/C++ programmers in the past. • Architecture Neutral and Portable: Java code is compiled into an architecture-neutral, intermediate byte-code and executed on the JVM. Java programs can be executed on any architecture for which a JVM has been written. This has resulted in Java code being nearly universally portable. Accomplishing this with other programming languages can be nothing short of a nightmare. • High Performance: The creators of Java wanted to make sure that the interpreted environment provided by the JVM was optimized to minimize its overhead and provide short response times – it is a large layer of indirection after all! Initially, Java was significantly slower than other programming languages, but they have closed the performance gap over the last decade. Also, garbage collection (memory management) threads were intended to be scheduled intelligently to ensure with high probability that memory would be available when required. 11 CS18 Lecture 1: Introduction to CS 18 Jan 25, 2017 • Interpreted, Threaded, and Dynamic: Java provides integrated thread and synchronization facilities with thread-safe libraries. In addition, the interpreter is able to dynamically link to other classes referred to in the code. This allows classes to only be linked on demand and to be easily interchangeable and also provides quick and light-weight development cycles. While few of these bullet points are unique to Java, the collocation of all ideas into a single package provided programmers with a significant reason to explore Java. The marketing department at Sun Microsystems (the company behind Java) really pushed and emphasized these capabilities of Java and the language ended up seeing a lot of use in industry. Also, Java debuted with Javadoc, which was one of the very first automatic documentation generators for translating comments in source code directly into HTML. The Java class library is incredibly well documented—take a look at the ‘‘javadocs’’ whenever you are in doubt of how something in the Java libraries works. Also, be sure to leverage the benefits of Javadoc by using it to present your own documentation in a nice and easy to use format for you and any others who may use your code. Please let us know if you find any mistakes, inconsistencies, or confusing language in this or any other CS18 document by filling out the anonymous feedback form: http://cs.brown.edu/courses/cs018/feedback. 12