Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Bytecode Verification on Java Smart Cards Xavier Leroy Presentation(day 1) -Nithya Overview JavaCard Architecture Sandbox model Sun’s Bytecode verification Our Verification Algorithm Performance Analysis of both Differences Overview of JVM Stack based abstract machine - Instructions pop their arguments off the stack - Push their results on the stack Set of registers (local variables) can be accessed via “load” and “store” - push the value of a register on the stack - store the top of the stack in the register Stack + register -> activation record for a method Type-correctness eg. of JVM Iadd : - Stack initially contains at least 2 elements of type int and it pushes back result of int Getfield C.f. τ : - Top of the stack contains a reference to an instance of class C - it then pops it and pushes back a value of type (the value of the field f) τ Conditions for JVM’s proper operation Type Correctness No stack overflow or underflow Code containment Register initialization Object initialization Smart Cards What are Smart Cards? Security tokens - applications Traditional Smart Cards - C or Assembler Challenged by Open Architectures - Multos, Java Card Java Card Architecture Applications written in Java and so are portable across all Java cards Java Cards can run multiple applications which can communicate thru shared objects New applications called applets can be downloaded on the card post issuance Security Issues (by malicious applet) Leaking confidential information outside (eg: PINS and secret cryptographic keys) Modifying sensitive information (eg: balance of an electonic purse) Interfering with other honest application on the card (Trojan Attack) Solution Put forward by Java Programming Environment Execute the applets in a so called “Sandbox” Its an insulation layer preventing direct access to the hardware resources Also implements a suitable access control policy Sandbox Model-3 components Applets --> Bytecode VM manipulates secure abstractions of data than hardware processor No direct access to hardware resources but to a set of API classes Upon Downloading, applet subject to static analysis called Bytecode Verification Bytecode Verification- Purpose 1. 2. 3. 4. To check ->code is well typed Doesn’t bypass protections 1 & 2 by performing ill typed operations at runtime: Forging object references from integers Illegal casting of obj ref from one class to another Calling private methods of API Jumping in middle of API method or jumping to data as if it were code Java Card Arch & Sandbox Component 1: Applets executed by JVM Component 2: Java Card runtime environment provides access control through its “firewall” Component 3: Bytecode Verifier missing! - complex and expensive process requires large amounts of working memory Approach 1 Relying on off-card tools Attesting a cryptographic signature on well-typedness of the applet Oncard downloading restricted to signed applets Drawbacks: - to extend the trusted computing base to off card components - practical issues: (how to deploy the signature keys?) - legal issues: (who takes liability for a buggy applet?) Defensive VM Approach To type-check dynamically during applet execution VM computes bytecode instructions Keeps track of all data it manipulates Arguments- correct types?stack overflow or underflow? Are class member accesses allowed? Drawback: - Dynamic Type-checks expensive in terms of execution speed and memory Traditional Bytecode Verifn For Straight line code? -Verifier checks every instruction of the method code - Checks whether stack has enough entries - Checks whether the entries are of correct types - Effect of the instruction on Operand stack and registers Intialisation: - Stack ->empty - Reg (0…n-1) -> method parameters - Reg(n…m-1)-> T (uninitialised reg) Method invocations ? Traditional Bytecode Verifn For Branches- Forks and Joins Forks? - must propagate inferred stack & reg type to all successors Joins? - makes sure the types of the stack and registers along all paths agree Handling joins Dictionary A data structure associating a stack and register type to each program point that is the target of a branch or exception handler During analysis, Updates ->type associated with target of branch Replacing by -> LUB(prev type,type inferred for the instruction) If changed -> the corr.instrn & successors are reanalysed until a fixpoint is reached This way, the dictionary entry for a branch target contains LUB inferred on all branches of control that lead to this instrn. Errors 1. Stack heights may differ An instruction can be reached through several paths with inconsistent operand stacks 1. Types for a particular stack entry or register may be incompatible eg: branch 1: short branch 2 : Obj Ref Type set to T in dictionary Semi-Lattice Interfaces and LUB Property: Every pair of types possesses a smallest common supertype (LUB) Does not hold -> Interfaces are types and classes can implement several interfaces Example interface I { ... } interface J { ... } class C1 implements I, J { ... } class C2 implements I, J { ... } Subtyping relation Approach Solution to Interface problem Bytecode verification ignores interfaces Treats all interface types as class type “Object” Verifier contains proper classes and subtyping between them is simply inheritance relation Subtyping relation is tree shaped ->forms a semilattice. LUB of 2 classes ->closest common ancestor in inheritance tree Performance Analysis Straight line Code: Time: as fast as executing instruction Space: 1 stack and 1 set of reg types (if each type-3 bytes, then memory=3S+3N bytes) Similar for method invocation: Space for method invocation = space for verification Performance Analysis 1. 2. 3. 4. Branches- costly since instructions are analyzed several times to reach fixpoint Real issue: Memory space for storing dictionary If B ->no of branch targets, memory =(3S+3N+3)*B bytes [3 bytes of overhead for each Dic. Entry] Drawbacks: No. of branch targets proportional to method size -> dictionary size increases linearly Dictionary entries quite often redundant Merging several methods into a larger one result in nonverifiable code Storing the dictionary in EEPROM not an option. Questions? Part II – Novel Bytecode Verification Algorithm Background for Our Verification Algorithm Operand stack is empty at the beginning and end of every statement JCVM for e1 ? e2 : e3 : code to evaluate e1 ifeq lbl1 code to evaluate e2 goto lbl2 lbl1: code to evaluate e3 Lbl2: … (Branch to lbl2 occurs with a non-empty operand stack) Background… In Java source, a local variable has only one type throughout the method: τ In JCVM bytecode, register with type T acquires the type τ at 1st store in reg. For Java code, A x; if (cond) x = new B(); // B is a subclass of A else x = new C(); // C is another subclass of A Register x has type A (ie,LUB(B,C)) when two arms merge Background { short x; ... } { C y; ... } Scopes of x & y disjoint ->store x & y in same register Requirements R1: Operand stack is empty at all branch and branch target instructions R2: Each register has only one type throughout the method code R3: On method entry,the VM initialises all registers that are not parameters to the bit pattern representing null object reference The Algorithm Global variables: Nr,Ns,r[Nr],s[Ns],sp,chg Initialisation: Set sp <- 0 Set r[0],…, r[n - 1] to the types of the method arguments Set r[n],…, r[Nr - 1] to ┴ Set chg <- true Algorithm While chg: Set chg -> false For each instruction i of the method, in code order: If i is the target of a branch instruction: If sp != 0 and the previous instruction falls through, error Set sp -> 0 If i is the target of a JSR instruction: If the previous instruction falls through, error Set s[0] -> retaddr and sp -> 1 If i is a handler for exceptions of class C: If the previous instruction falls through, error Set s[0] -> C and sp -> 1 If two or more of the cases above apply, error Algorithm Determine the types a1,…, an of the arguments of i If sp < n, error (stack underflow) For k = 1,…,n: If s[sp - n + k - 1] is not subtype of ak, error Set sp -> sp - n Determine the types r1,…, rm of the results of i If sp + m > Ns, error (stack overflow) For k = 1,…,m: Set s[sp + k - 1] -> rk Set sp -> sp + m If i is a store to register number k: Determine the type t of the value written to the register Set r[k] -> lub(T, r[k]) If r[k] changed, set chg -> true If i is a branch instruction and sp ! = 0, error Differences with straight line code verification Stack empty? When checking Store,reg type ->LUB Typechecking until a fixpoint Initialisation: Stack -> Empty reg(0..n-1) -> method parameter types reg(n..m-1)-> ┴ & not T (REQ R3) Next class Subroutines Object Initialisation Off card code transformations Bytecode Transformation on small computers