Formalising Java Safety – An overview
Pieter H. Hartel
[email protected]
2003.10.02 (Thu), 박숙영, HPCC Lab

Contents
- Introduction
- Methodology
- Java Semantics
- The compiler
- Java extensions
- Small footprint devices
- Conclusions

Introduction 1/2
- Java is a safe programming language
  - Type safe and memory safe
- The two main features
  - Java does not offer pointer arithmetic
  - Java offers references to objects
  - Unused objects are automatically garbage collected
  - Java is a strongly typed language
  - Java performs runtime checks to avoid array index errors

Introduction 2/2
- Class loader
  - Accepts and loads JVM programs into the Java runtime environment
- Byte code verifier
  - Another type checker, operating on the JVM byte codes
- Both do their work before execution of the code from a newly loaded class starts

Methodology 1/4
- Formal specifications
  - The semantics of Java
  - The semantics of the JVM language
  - The Java to JVM compiler
  - The runtime support, that is, parts of the Java API, including all java.* classes

Methodology 2/4
- The methodology to build these specifications
  - Construct clear and concise formal specifications of the relevant components
  - Validate the specifications by animating them, and by stating and proving relevant properties of the components
  - Refine the specifications into implementations
  - Create all specifications in machine-readable form

Methodology 3/4
- Principal difficulties
  - Multi-threading, exception handling, object orientation and garbage collection
  - Careful consideration
  - Ambiguous, inconsistent, incomplete
  - Reference implementation is complex

Methodology 4/4
- Popular assumptions
  - Unlimited memory
  - Individual storage locations can hold all primitive data types
  - Individual JVM program locations can hold all byte code instructions

Java and JVM language features
- IM: Imperative core, consisting of basic data, expressions and statements
- OO: Object orientation, i.e. objects, classes, interfaces, and arrays
- TY: The Java type system, or byte code verification in the JVM
- CL: Class loading
- EH: Exception handling
- MT: Multi-threading, monitors, synchronisation
- GC: Garbage collection

Java Semantics
- Table 1

Object Orientation
- Alves-Foss and Lam [1]
  - Denotational semantics of most of Java
  - Detail on the various basic data types in Java
  - Better understanding

The type system 1/2
- Based on simple subtyping
- One novel feature
  - Java offers interfaces as a way of creating multiple inheritance
- Drossopoulou and Eisenbach [24]
  - Static semantics and dynamic semantics of a relatively small subset of Java
- Drossopoulou et al. [23]
  - Extend their subset to include exception handling
- Syme [55]
  - DECLARE system, gives proofs
  - Uncovered 40 errors made during the translation
  - Found two non-trivial errors in the hand-written proofs of Drossopoulou and Eisenbach

The type system 2/2
- Nipkow and von Oheimb [45]
  - Prove type soundness of a subset similar to that of Drossopoulou et al.
  - Use Isabelle/HOL to machine-check the proofs from the outset
  - Higher degree of confidence in the correctness of the specifications and the proofs
  - Not able to validate the specifications, due to the lack of support for generating executable semantics [58]
- Glesner and Zimmermann [26]
  - Specify the type system for a small fragment of Java

Class Loader
- Wragg et al. [62]
  - Offer a model of class loading for a relatively small subset of Java, to study one of Java's more experimental features (binary compatibility)

Multi-threading
- Borger and Schulte [10] and Cenciarelli et al. [13]
  - Multi-threading at the Java level
  - The study of the issues left open by the official SUN documentation

The compiler
- Diehl [21]
  - Compilation schemes for a subset of Java that excludes exception handling, multi-threading and garbage collection, to the corresponding subset of the JVM
  - Operational semantics of this JVM subset
- Rose [50]
  - Natural semantics of a subset of Java
  - Static type systems for both (Java, JVM)
  - A specification of the compiler for the subsets

The Abstract State Machine approach 1/3
- Borger and Schulte
  - Working on formal specifications of Java, the JVM and the compiler
  - Based on the Abstract State Machine formalism
    - Full semantic account: in Gurevich [29]
  - Specify a modular semantics of a subset of the JVM [11] and of a subset of Java [10]
    - Modular approach
    - The two subsets do not entirely coincide

The Abstract State Machine approach 2/3
- [7]
  - Reduces the subsets of Java and the JVM to omit multi-threading, class loading and arrays
  - Main result: informal theorem stating the correctness of the compiler
- Two papers revisit exception handling and object initialisation
  - [8]: on problems with the initialisation of objects
  - [9]: exception handling mechanism of Java, the JVM, and the compiler
    - Main result: formulation of the correctness of compiling exception handling, with a full proof

The Abstract State Machine approach 3/3
- Stark [53]
  - The specification of Java and the JVM from Borger and Schulte [11,10]
  - Presents a compiler from the imperative core of Java
  - Gives a correctness proof of the compiler
- A forthcoming book [6]
  - More complete specification of Java, the JVM, the compiler, the byte code [9]
  - Mechanical checking of the specification
- Wallace [60]
  - Includes multi-threading, exception handling
  - Excludes class loading and garbage collection

Java extensions
- The safety of Java programs
  - By using program verification techniques
  - Fewer design and implementation problems
  - Smart cards

Model checking 1/3
- Demartini et al. [18], Havelund et al. [31]
  - How core features of Java can be mapped onto the Promela language of the SPIN model checker
  - Multi-threading and objects (Havelund et al. also model exceptions)
  - The objects are modelled using Promela's arrays (one array element per instance of the class)
  - The resulting models quickly grow too large to model check effectively
  - Only check for safety properties (assertions, deadlock)
  - Do not provide support for the checking of liveness properties

Model checking 2/3
- One of the most useful features of the SPIN model checker
  - Its ability to display scenarios leading to problems (deadlock)
- Demartini et al.
  - Relate these scenarios back to the original Java sources
  - More user friendly than that of Havelund et al.
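
The array encoding mentioned under Model checking 1/3 (one array element per instance of the class) can be pictured in Java itself. The sketch below is only an illustration of the idea, not the translation used by Demartini et al. or Havelund et al.; the class `CounterModel`, the bound `MAX_INSTANCES` and the method names are all hypothetical. A reference becomes an index, each field becomes an array, and the number of instances must be bounded in advance, which hints at why such models grow quickly.

```java
// Hypothetical illustration: CounterModel, MAX_INSTANCES and the method names
// are assumptions made for this sketch, not taken from the cited papers.
final class CounterModel {
    // Arrays have a fixed size, so the model bounds the number of objects.
    static final int MAX_INSTANCES = 4;

    // One array per field of the modelled class; element i belongs to instance i.
    private final int[] value = new int[MAX_INSTANCES];
    private int allocated = 0;

    // Models "new Counter()": the returned index stands in for the reference.
    int newCounter() {
        if (allocated == MAX_INSTANCES) {
            throw new IllegalStateException("model bound exceeded");
        }
        value[allocated] = 0;
        return allocated++;
    }

    // Models "c.increment()" on the instance identified by ref.
    void increment(int ref) {
        checkRef(ref);
        value[ref]++;
    }

    // Models reading the field of the instance identified by ref.
    int get(int ref) {
        checkRef(ref);
        return value[ref];
    }

    // Rejects indices that do not correspond to an allocated instance.
    private void checkRef(int ref) {
        if (ref < 0 || ref >= allocated) {
            throw new IllegalArgumentException("invalid reference " + ref);
        }
    }

    public static void main(String[] args) {
        CounterModel heap = new CounterModel();
        int a = heap.newCounter();
        int b = heap.newCounter();
        heap.increment(a);
        heap.increment(a);
        heap.increment(b);
        System.out.println(heap.get(a) + " " + heap.get(b));
    }
}
```

Every extra field or instance adds another array element to the state that a model checker has to track, so the bound has to stay small for checking to remain practical.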

Model checking 3/3
- Jensen et al. [33]
  - Use model checking to verify properties of Java programs, with a more abstract approach
  - Static analysis techniques
    - To reduce a Java program to a control flow graph
    - Method calls, method returns, assertions
  - Defines the state transitions of the abstract Java program
  - Example [38]
    - How the system can be used to model Java's sandbox
    - The stack inspection introduced by Java 2

Theorem proving 1/3
- Detlefs et al.: Modula 3 [20], Java [52]
  - Requires the programmer to annotate programs with pre- and post-conditions
  - The compiler is able to generate and prove the verification conditions
- The system of Detlefs et al.
  - Does not require the programmer to annotate programs with loop invariants and variants
  - Derives loop invariants automatically
  - Assumes that loops are executed at most once
  - Powerful
  - The type checker < the system < full verification

Theorem proving 2/3
- The LOOP project of Jacobs et al.
  - Full verification of Java programs
  - Use a denotational semantics based tool to translate Java into the higher order logic of widely used theorem provers (PVS [32], Isabelle/HOL [57])
  - Properties
    - Termination of a method
    - Invariants on the fields of a class

Theorem proving 3/3
- Poetzsch-Heffter and Muller [47]
  - An operational/axiomatic semantics of a subset of Java
  - Prove the soundness of the axiomatic semantics with respect to the operational semantics
  - Embedded in HOL
  - Mechanical checking of the soundness proof would be feasible
- Moore [39]
  - A new version of a small subset of Cohen's specification [15] of the JVM
  - How the ACL2 theorem prover is capable

Controlling type casts
- Java's lack of polymorphism
  - Requires programmers to insert type casts in their programs
- Example
  - When storing an object of a user class MyObject, one must remember to cast the raw object back into MyObject when retrieving the information
  - Erroneous type casts cause unexpected runtime exceptions
- Pizza [46] and Generic Java [12]
  - Automatically insert the required type casts
  - Generic Java: no cast inserted by the compiler will fail

Controlling execution time
- If Java safety could guarantee that computations terminate (within certain bounds), denial of service attacks would be prevented
- Execution time is one of the most difficult resources to control

Code certification 1/2
- Necula and Lee [40]: proof carrying code (PCC)
  - Automatic verification technique (assembly level programs)
- The producer
  - Expresses a safety property in terms of pre- and post-conditions on the program
  - Annotates the program with loop invariants etc.
  - Generates a proof of the safety property (by hand or using a mechanical proof assistant)
- The consumer
  - Receives the code and the proof
  - Mechanically checks that the proof is consistent with the program
  - The program satisfies the safety property
  - Does not need to trust the producer
  - Relies only on a small trusted infrastructure (type checker)

Code certification 2/2
- The problems of the PCC approaches
  - The size of a proof: exponential in the size of the program [42]
  - The amount of redundancy
- Necula and Lee [41]
  - Reduce a proof of size n to a proof of size √n by avoiding some redundancy
- Program verification requires special skills
  - To formulate properties
  - To discover appropriate loop invariants
  - To drive mechanical theorem provers etc.
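
To make the producer-side annotations of the code certification slides concrete (pre- and post-conditions, loop invariants), here is a minimal Java sketch. It is not proof carrying code: the conditions are merely checked at run time, whereas a PCC producer would prove them once and ship the proof. The class `InvariantSketch` and its helpers `check` and `prefixSum` are hypothetical names chosen for this illustration.

```java
// Hypothetical illustration: InvariantSketch, check and prefixSum are names
// invented for this sketch. The conditions are tested at run time here; a
// verifier would instead prove them statically.
final class InvariantSketch {
    // Fails loudly when a stated condition does not hold.
    static void check(boolean condition, String what) {
        if (!condition) {
            throw new AssertionError(what);
        }
    }

    // Specification helper: the sum of a[0] .. a[n-1].
    static long prefixSum(int[] a, int n) {
        long s = 0;
        for (int k = 0; k < n; k++) {
            s += a[k];
        }
        return s;
    }

    static long sum(int[] a) {
        // Pre-condition: the array exists.
        check(a != null, "pre-condition: a != null");
        long total = 0;
        int i = 0;
        while (i < a.length) {
            // Loop invariant: i is within bounds and total covers a[0..i).
            check(0 <= i && i <= a.length && total == prefixSum(a, i),
                  "loop invariant");
            total += a[i]; // safe: the invariant and the guard give 0 <= i < a.length
            i++;
        }
        // Post-condition: follows from the invariant once i == a.length.
        check(total == prefixSum(a, a.length), "post-condition");
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {3, 1, 4}));
    }
}
```

The invariant is the part that usually has to be discovered by hand: it must be strong enough to imply both the bounds safety of `a[i]` and the post-condition at loop exit.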
It is essential that tools are automatic, or at least require as little programmer intervention as possible.

Small footprint devices
- Small footprint devices
  - Mobile phones, PDAs
  - K Virtual Machine: 128KB of RAM
- Smart card
  - A few hundred bytes of RAM and a dozen or so KB of EEPROM
  - Java-Card VM (JCVM)
- 3 disadvantages
  - The full potential and flexibility of client server software development cannot be realised
  - Java applets running on the smallest embedded controllers cannot be verified appropriately before they are run
  - The freedom of code migration is restricted
- Based on the Split VM concept
  - Pushes part of the byte code verification from the loading to the compilation/linking phase
  - JVM byte code ☞ JCVM format
  - The conversion verifies the byte code, optimises it, and prepares the code for loading into the device

Byte code compression
- Clausen et al. [14]
  - Retain JVM byte codes
  - Propose to compress them for the benefit of embedded systems
- The compression technique
  - Commonly occurring sequences of instructions ☞ a new 'macro' instruction
  - 30% loading time increase ☞ 30% space saving

Class file conversion 1/3
- Hartel et al. [30]: the Java Secure Processor (JSP)
  - Provide a complete specification of an early version of the JCVM
  - Excludes multi-threading, garbage collection and exception handling
  - Validated using the letos tool
- Methodological point [56]
  - Earlier JSP: the full JVM ☞ cutting back unwanted features
  - Newer KVM: scratch ☞ adding features as required
- The developers of the picoPERC version of the JVM [44]
  - Offer a core VM (64KB)
  - Provide tools to add further functionality to the core VM

Class file conversion 2/3
- Lanet and Requet [35]
  - B-method
  - To study one particular aspect of the conversion from JVM to JCVM code
- Their results include
  1. A specification of the constraints imposed by the byte code verifier for a small subset of the JVM
  2. A specification of the semantics of this subset of the JVM byte codes
  3. A specification of the semantics of the corresponding subset of the JCVM byte codes
  4. A proof that the specification of the JCVM subset is a data refinement of the JVM subset

Class file conversion 3/3
- Denney and Jensen [19]
  - Complementary to that studied by Lanet and Requet
  - The conversion of JVM class files to JCVM class files by a 'tokenisation'
    - Replaces names in the class files
    - Reducing the size of the class files
    - Speeding up the loading process
  - Use the Coq theorem prover to mechanically check their proofs
  - Use an elegant method to parameterise their operational semantics over name resolution

Byte code verification revisited 1/2
- Split VM concept
  - Off-line verification: signing the results digitally (signature)
- Posegga and Vogt [49,48]
  - Use a model checker (SMV) to perform off-line byte code verification for smart cards
- Posegga et al. [27]
  - Propose to implement a tiny proof checker on a smart card

Byte code verification revisited 2/2
- Rose and Rose [51]
  - Use Necula and Lee's proof carrying code (PCC) method to 'split' the byte code verifier
  1. The verification: reconstructs the types associated with all local variables and stack locations of JVM code
  2. The certification: checks, based on the reconstructed types, that each instruction is correctly typed
- Advantage
  1. The certification process is simple
  2. Only the certification needs to be trusted, not the verification

Conclusions
- On modelling garbage collection, and the Java API
- On building more appropriate theories for programming language semantics modelling
- On simplifying and modularising the individual components of Java implementations
- On reducing the size of the trusted computing base, so that flaws are less likely to compromise the security of the system as a whole
- On considering formal specification, validation and provably correct implementation as a whole, rather than in separation
- On presenting clear and concise formalisations of systems, which are accessible to the designers and implementors of these systems
- On using machine-readable specifications