COS 441: Final Exam
January, 2006
This final should be the individual work of each student in the class. Please do not talk to
anyone other than the professor (David Walker) or TA (Aquinas Hobor) about the
questions on this exam. Talking to anyone else about this exam between the 18th and
20th of January, 2006 constitutes a violation of Princeton's code of academic integrity.
You may consult your lecture notes, or any of the textbooks listed as required or
recommended on the course Web pages, but do not search for the answers on the
general Web. If you need clarification on a question, please e-mail me or come to see
me.
Details:
- You must complete the exam in a 24-hour period. Write down the time you
  download the exam and the time you hand it in at the top of the exam itself.
- You will work on the exam between the 18th and 19th of January, 2006. Students
  who requested a different time period in advance have been granted special
  consideration.
- Total points = 75
Reminders:
- Read questions completely before beginning your response. This exam has quite
  a lot of reading because part of the test is whether or not you are able to read and
  understand formal inference rules (typing rules and operational rules). Take your
  time reading the question. You should have plenty of time so do not feel the need
  to jump right in right away.
- Always state your proof methodology clearly and explicitly before writing down
  the details of a proof. Points are automatically deducted from anyone who does
  not do this. Points will also be deducted for proofs that are unclear or poorly
  structured.
- Always use the exact syntax of expressions, types, judgments, etc. that you are
  given in a question, or clearly define the abbreviations that you're using. When in
  doubt, avoid abbreviations. Definitely do not just start using some new, informal
  notation without defining it -- graders will not be able to figure out your intent.
- When defining a new type system, set of operational rules, etc., be sure to state
  the form of the new judgments in question before laying out the details of the
  rules (unless, of course, the question specifies the form of the judgment you
  should use).
- If some aspect of the question is confusing or underspecified and I am
  unavailable, make some reasonable assumption and write down the assumption
  clearly before beginning the question.
- Even if you can't come up with the correct solution for a question, write down as
  clear and concise an explanation of your thought process and partial work as you
  can to get partial credit. Do not leave a question entirely blank unless you know
  nothing about the topic.
- When writing out your proofs, use plenty of space (either electronically or on
  paper) to make them easy to read. One of the best ways is to format each line
  with one true statement (judgment) on the left and the justification for that
  statement on the right (in terms of earlier statements and inference rules, etc.).
Good luck!
1. [20] Answer each question below concisely. Use no more than a sentence or two, a
picture, a simple equation, a judgement or a typing rule, etc.
a) Give one brief reason or situation where strongly typed programming languages like
ML or Java have an advantage over “weakly” typed languages like C.
b) Give one brief reason or situation where strongly typed programming languages like
ML or Java have an advantage over languages that are not statically (i.e., compile-time)
typed at all like Scheme or Lisp or scripting languages like Perl, Python, Ruby, PHP, and
JavaScript.
c) What is the difference between a covariant subtyping rule and a contravariant
subtyping rule?
d) When performing the type inference algorithm discussed in class, what part of the
algorithm fails when you try to do type inference for this function: (fun id (x) = x x). Be
as specific as possible. The best answer has this form: “During _______, ______ fails”
e) In ML, we can write down the following data type definition:
datatype tree = Leaf of int | Node of int * tree * tree
Using only int, recursive, sum and product types, encode this type.
f) Concisely describe the difference between an operational and a denotational semantics
for a language.
g) Why might one choose to write down an operational semantics using “evaluation
contexts” as opposed to the more standard way we did things at the beginning of class?
h) Is this a reasonable subtyping rule:
------------------------ (for any well-formed type t’)
forall a. t <= t[t’/a]
Yes or no? Why or why not -- briefly? Hint: it might help to think about a specific
example where t is a function type.
i) Is this a reasonable subtyping rule:
------------------------ (for any well-formed type t’)
t[t’/a] <= forall a. t
Yes or no? Why or why not? Is it more or less reasonable than the rule in question h?
j) What’s your favorite type?
2. In class, we have studied type systems for high-level programming languages like C,
Java and ML. However, it is possible to typecheck low-level languages, like those
generated by compilers, to ensure they are safe as well. Java bytecode is an example of a
relatively low-level, but type-safe language. However, it turns out that it is possible to
develop type systems for even lower-level languages. In fact, even the assembly or machine
language output of a compiler may be ascribed a safe type system! In this question, we
will develop a simple type system for a very simple, idealized assembly language and
prove it safe.
Our simple assembly language will only allow programmers to compute with integers (n)
and code pointers (l -- l is a meta-variable that ranges over the “locations” where code
blocks are stored). Programs will do so by moving integers and code pointers in and out
of registers (r) and jumping from one code block to the next. Here is a summary of the
basic syntax of our assembly language:
instruction operands
v ::= n | l | r            (integer or code pointer or register)

instructions
i ::= mov r, v             (move operand v into register r)
    | add rd, rs, v        (add rs to v and put contents in rd)
    | jz r, v              (conditional jump: if r is 0 then jump to v,
                            else execute the following instruction)

instruction sequences/code blocks
I ::= i; I                 (single instruction followed by a sequence)
    | j v                  (unconditional jump to v)
    | halt r               (halt execution, print contents of r,
                            which must be an integer)

programs
P ::= {l1: I1,...,lk: Ik}  (a collection of code blocks I, each associated
                            with labels/code addresses l)
Here is a simple example program (PROG1) that computes the product of registers r1 and
r2, placing the final result in r3 before jumping to a return address assumed to be in r4:
// Assume:
// r1 : int, r2 : int
// r4: return address
// result produced in r3
prod: mov r3, 0          // initialize result
      j loop
loop: jz r1, done        // if r1 = 0 then goto done
      add r3, r2, r3     // result = r2 + result
      add r1, r1, -1     // r1 = r1 - 1
      j loop
done: j r4
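As a reading aid, the control flow of PROG1 can be sketched as an ordinary loop. This is only an illustrative Python rendering, not part of the assembly language: the function name `prod` and the use of `return` for the jump through r4 are assumptions made for the sketch.

```python
def prod(r1: int, r2: int) -> int:
    """High-level reading of PROG1; the jump through r4 becomes `return`."""
    r3 = 0                # prod: mov r3, 0
    while r1 != 0:        # loop: jz r1, done
        r3 = r2 + r3      #       add r3, r2, r3
        r1 = r1 + -1      #       add r1, r1, -1
    return r3             # done: j r4


print(prod(2, 5))  # 10
```

Note that, like the assembly code, this sketch does not terminate when r1 starts out negative.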
Next, as usual, we will define the operational semantics for the language. In this case, we
will specify execution of our machine using a triple (P, R, I) where P is the program
being executed, R is a register file that assigns values to registers (see below), and I is the
sequence of instructions to be executed next. Intuitively, I represents the “program
counter”, but in an idealized way, so that execution of these programs looks a little bit
more like execution of high-level expressions in the lambda calculus or MinML.
Register files
R ::= {r1 = v1, ..., rk = vk}    (vi’s are not themselves registers. They may
                                  only be values: integers n or locations l)
Where appropriate, we will use R and P as if they were functions from registers to values
and code locations to instruction sequences respectively. R(ri) will be a value (provided ri
does indeed contain a value in the current register file R) and P(li) will be an instruction
sequence (provided li is indeed a code location in the current program P). I.e.:
{r1=v1,...,rk=vk} (ri) = vi      (lookup contents of register ri in register file)
{l1: I1,...,lk: Ik} (li) = Ii    (lookup contents of code location li in program)
For the sake of convenience, we also write R(v) (“R underlined”) when v may be a
register or may be some other non-register value like n or l. On non-register values,
R(n) = n and R(l) = l (i.e., R does not really do anything in these two cases, but using
this notation helps us write down the operational semantics in a very clean and elegant
style). On a register r, R(r) is the ordinary lookup: extract the value of r from the
register file R.
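The operand lookup just described can be sketched in a few lines. This is a minimal Python illustration under assumed encodings that are not part of the exam: registers and locations are the hypothetical classes `Reg` and `Label`, integers are Python ints, and a register file is a dict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reg:
    name: str


@dataclass(frozen=True)
class Label:
    name: str


def lookup_operand(R, v):
    """Underlined R(v): dereference registers, return n and l unchanged."""
    if isinstance(v, Reg):
        return R[v]  # raises KeyError if the register holds no value
    return v


R = {Reg("r1"): 3, Reg("r2"): Label("loop")}
print(lookup_operand(R, 7))              # 7
print(lookup_operand(R, Label("done")))  # Label(name='done')
print(lookup_operand(R, Reg("r2")))      # Label(name='loop')
```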
One last operation we need is register file update: R[r1 = v] updates the contents of
register r1 to v. For example, the register file update {r1 = 3; r2 = 17}[r2 = 44] gives us
this resulting register file: {r1 = 3; r2 = 44}. Using these operations, we define the
operational semantics:
P(R(v)) = I
---------------------------- (jump)
(P, R, j v) ---> (P, R, I)

------------------------------------------------ (move)
(P, R, mov r, v; I) ---> (P, R[r = R(v)], I)

R(r2) = n2    R(v) = n3    n1 = n2 + n3
----------------------------------------------------- (add)
(P, R, add r1, r2, v; I) ---> (P, R[r1 = n1], I)

R(r) = 0    P(R(v)) = I2
------------------------------------ (cond jump)
(P, R, jz r, v; I) ---> (P, R, I2)

R(r) ≠ 0
----------------------------------- (cond fall thru)
(P, R, jz r, v; I) ---> (P, R, I)
Notice that programs can easily “get stuck” or “crash.” For example, in the add
instruction, if r2 does not contain an integer value in R (maybe R does not even associate
register r2 with anything at the current point in execution), the program will get stuck.
Also, if v is some register, but that register does not contain an integer, the program will
also get stuck.
As a second example, consider execution of a jump instruction: (P, R, j l3). This machine
will get stuck if l3 is not a label in the program P. (Note, this may not be exactly how a
real machine actually works -- the real machine might continue computing for a little
while before it does something terrible like seg faulting or trying to read from an illegal
address and getting a bus error, but as we normally do, we model this by pretending the
machine gets stuck right away.)
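The five rules and the stuck states described above can be sketched as a single step function. This is a hedged Python illustration, not the ML implementation the exam asks for. The encodings are assumptions: instructions are tuples such as `("add", r1, r2, v)`, instruction sequences are lists, `Reg` and `Label` are hypothetical helper classes, and `None` means "no rule applies" (which covers both stuck states and `halt`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reg:
    name: str


@dataclass(frozen=True)
class Label:
    name: str


def operand(R, v):
    """Underlined R(v): look up registers (None if unset), pass n and l through."""
    return R.get(v) if isinstance(v, Reg) else v


def jump(P, R, v):
    """Shared by (jump) and (cond jump): requires P(R(v)) = I."""
    target = operand(R, v)
    return (R, P[target]) if isinstance(target, Label) and target in P else None


def step(P, R, I):
    """Return (R', I') if some rule applies to (P, R, I), else None."""
    head, rest = I[0], I[1:]
    op = head[0]
    if op == "j":                                   # (jump)
        return jump(P, R, head[1])
    if op == "mov":                                 # (move)
        _, r, v = head
        val = operand(R, v)
        return None if val is None else ({**R, r: val}, rest)
    if op == "add":                                 # (add)
        _, r1, r2, v = head
        n2, n3 = R.get(r2), operand(R, v)
        if isinstance(n2, int) and isinstance(n3, int):
            return ({**R, r1: n2 + n3}, rest)
        return None                                 # stuck: a non-integer operand
    if op == "jz":
        _, r, v = head
        if r not in R:
            return None                             # stuck: R(r) is undefined
        if R[r] == 0:                               # (cond jump)
            return jump(P, R, v)
        return (R, rest)                            # (cond fall thru)
    return None                                     # halt r: no step rule


start = Label("start")
P = {start: [("mov", Reg("r1"), 5),
             ("add", Reg("r1"), Reg("r1"), 2),
             ("halt", Reg("r1"))]}

state = ({}, P[start])
while (nxt := step(P, *state)) is not None:
    state = nxt
print(state[0][Reg("r1")])                          # 7

print(step(P, {}, [("j", Label("l3"))]))            # None: l3 is not a label in P
print(step(P, {}, [("add", Reg("r1"), Reg("r2"), 1),
                   ("halt", Reg("r1"))]))           # None: r2 holds no value
```

A fuller interpreter would distinguish a proper `halt` from a stuck state; the sketch deliberately does not, to stay close to the rules as written.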
Of course, we will be able to prevent the machine from ever getting stuck by type
checking. Here are the types we will use:
types
t ::= int          (type of an integer)
    | code(G)      (type of a code pointer: in order to be allowed to jump to
                    this code pointer, either directly or indirectly, the
                    current register file must have type G)

register file types
G ::= {r1 : t1, ..., rk : tk}
                   (registers r1,...,rk have types t1,...,tk respectively)

whole-program type
H ::= {l1: code(G1), ..., lk : code(Gk)}
                   (code locations l1,...,lk hold code blocks with
                    types code(G1) ... code(Gk) respectively)
Here is a partial specification of the typing rules for these machines and their programs.
You will have to fill in missing typing rules.
Judgement 1: H |-- v : t (value, not register, v has type t)

-------------- (int)
H |-- n : int

H(l) = code(G)
----------------------- (loc)
H |-- l : code(G)
Judgement 2: H |-- R : G (register file has type G)
For all r in the domain of G, H |-- R(r) : G(r)
------------------------------------------------------- (regfile)
H |-- R : G
(note: since we only type check the registers r in the domain of G -- NOT necessarily all r
in the domain of R -- register files R may contain more things than appear in their type G)
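Judgements 1 and 2 can be read as a small checker. The following is a minimal Python sketch under assumed encodings that are not part of the exam: the type int is the string `"int"`, code types are a hypothetical `Code` class wrapping a dict for G, and H, R, and G are dicts. Types are compared by exact equality, since the rules given so far contain no subsumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reg:
    name: str


@dataclass(frozen=True)
class Label:
    name: str


@dataclass
class Code:
    G: dict  # register file type: Reg -> type


INT = "int"


def value_type(H, v):
    """Judgement 1: the t with H |-- v : t, or None if no rule applies."""
    if isinstance(v, int):
        return INT            # (int)
    if isinstance(v, Label):
        return H.get(v)       # (loc)
    return None


def regfile_ok(H, R, G):
    """Judgement 2 (regfile): each r in Dom(G) holds a value of type G(r)."""
    return all(r in R and value_type(H, R[r]) == G[r] for r in G)


r1, r3, r4 = Reg("r1"), Reg("r3"), Reg("r4")
H = {Label("done"): Code({r3: INT})}
R = {r1: 3, r3: 0, r4: Label("done")}

print(regfile_ok(H, R, {r1: INT, r4: Code({r3: INT})}))  # True: r3 in R is ignored
print(regfile_ok(H, R, {r4: INT}))                       # False: r4 holds a code pointer
```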
Judgement 3: |-- P : H (whole-program P has type H)
P = {l1: I1,...,lk: Ik}
H = {l1 : code(G1), ..., lk : code(Gk)}
H |-- I1 : code(G1)
...
H |-- Ik : code(Gk)
--------------------------------------------------------------------------- (prog)
|-- P : H
Judgement 4: H; G |-- v : t (operand v has type t)
H |-- v : t
--------------- (op-val)
H; G |-- v : t
G(r) = t
--------------- (op-reg)
H; G |-- r : t
Judgment 5: H |-- i : G1 => G2 (instruction i requires input register file typed by G1
and after execution, produces register file typed by G2)
H; G1 |-- v : t
-------------------------------------- (mov)
H |-- mov r, v : G1 => G1[r : t]
(note: G1[r : t] updates the register file typing with r mapped to new type t;
r could have had a completely different type before executing this move instruction.)
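The (mov) rule, together with the operand typing of Judgement 4, can be sketched as a function from an input register file type to an output one. This is an illustrative Python sketch with assumed encodings (`"int"` for int, a hypothetical `Code` class for code types, dicts for H and G); `type_mov` is an invented name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reg:
    name: str


@dataclass(frozen=True)
class Label:
    name: str


@dataclass
class Code:
    G: dict  # register file type: Reg -> type


def operand_type(H, G, v):
    """Judgement 4: the t with H; G |-- v : t, or None if no rule applies."""
    if isinstance(v, Reg):
        return G.get(v)        # (op-reg)
    if isinstance(v, int):
        return "int"           # (op-val) with (int)
    return H.get(v)            # (op-val) with (loc)


def type_mov(H, G1, r, v):
    """(mov): return G1[r : t] when H; G1 |-- v : t, else None."""
    t = operand_type(H, G1, v)
    return None if t is None else {**G1, r: t}


r1, r2 = Reg("r1"), Reg("r2")
G1 = {r1: Code({})}

print(type_mov({}, G1, r1, 0))           # {Reg(name='r1'): 'int'}
print(type_mov({}, G1, r2, Reg("r9")))   # None: r9 has no type in G1
```

The first call shows the point made in the note above: r1 had a code type before the move and has type int afterwards.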
(there are missing rules for the other instructions)
Judgement 6: H |-- I : code(G) (this code block has type code(G);
in other words, before jumping to or otherwise
entering this code block, the register file must
have type G)
H; G1 |-- v : code(G2)
G1 <= G2
----------------------------------------------------------- (jump)
H |-- j v : code(G1)
H |-- i : G1 => G2
H |-- I : code(G2)
------------------------------------------------------ (I seq)
H |-- i; I : code(G1)
Judgement 7: G1 <= G2
(register files with type G1 are
subtypes of register files with type G2)
(missing rules; uses Judgement 8)
Judgement 8: t1 <= t2
(t1 is a subtype of t2)
(missing rules; uses Judgement 7)
Judgement 9: |-- (P, R, I) ok (machine state (P, R, I) executes safely and does not
get stuck)
|-- P : H    H |-- R : G    H |-- I : code(G)
-----------------------------------------------
|-- (P, R, I) ok
Questions follow [55 points]. Keep in mind that you do not have to do these questions in
the order that they are given. If you find it more convenient you may certainly do them
in the order that pleases you.
a) [5] Implement the (abstract) syntax of machine states, programs, register files, etc. in
ML. It should be clear which ML definitions implement which sorts of things. In other
words, implement the formal theory as directly as possible using datatypes where
appropriate. Do not worry about optimizing your representation in any way.
b) [10] Implement the operational semantics of the assembly language. Implement the
operational judgment as a function as directly as possible.
c) [3] Define PROG2 in ML and execute it using your interpreter. What does it return?
PROG2 =
main: mov r1, 4
      mov r2, 3
      mov r4, exit
      j prod
exit: halt r3
prod: mov r3, 0          // initialize result
      j loop
loop: jz r1, done        // if r1 = 0 then goto done
      add r3, r2, r3     // result = r2 + result
      add r1, r1, -1     // r1 = r1 - 1
      j loop
done: j r4
(you should also do your own testing)
d) [5] Give the rest of the (sound) typing rules in judgement 5, one rule per instruction.
You will have to prove your rules are sound in a second. If you find a mistake in your
rules when you do your proof, of course you will come back and fix your answer.
e) [5] Give the missing subtyping rules in judgements 7 and 8. Use an algorithmic
subtyping definition as opposed to a declarative subtyping judgment.
f) [5] Let G1 = {r1 : int; r2 : int; r3 : int; r4 : code({r3 : int})}.
Let H1 = {prod : code(G1), loop : code(G1), done : code(G1)}
Show that PROG1 (the example program given earlier) has the type H1. In other words, give a full
typing derivation for:
|-- PROG1 : H1
g) [2] State an incorrect subtyping rule -- one that involves an incorrect variance (co-,
contra- or in-) -- and demonstrate that this subtyping rule is incorrect by giving a program
that type checks and also crashes due to that rule.
h) [10] Assume the following lemmas are true (you don’t need to prove them but may
use them in any later proofs):
Lemma 1 [Register Lookup Typing] If |-- P : H, H |-- R : G and H; G |-- v : t
then H; G |-- R(v) : t.
Lemma 2 [Canonical Values] If |-- P : H and H |-- v : t then
1. If t = int then v = n for some n.
2. If t = code(G) then v = l for some l and l in Dom(H) and H |-- P(l) : code(G).
Lemma 3 [Canonical Operands] If |-- P : H and H |-- R : G and H; G |-- v : t then
1. If t = int then R(v) = n for some n.
2. If t = code(G) then R(v) = l for some l and l in Dom(H) and H |-- P(l) : code(G).
Lemma 4 [Register Update] If H |-- R : G and H |-- v : t then H |-- R[r = v] : G[r : t]
Now, prove Progress:
If |-- (P, R, I) ok then either
1. there exists R’ and I’ such that (P, R, I) ---> (P, R’, I’), or
2. I is halt r and R(r) = n for some integer n.
If you need other lemmas to prove Progress, state those lemmas and prove them. Your
proof will be graded both on correctness and on clarity and structure.
i) [10] Assuming the lemmas given in part h, prove Preservation:
If |-- (P, R, I) ok and (P, R, I) ---> (P, R’, I’) then |-- (P, R’, I’) ok
Once again, if you need other lemmas state and prove them.