UC Irvine – project transprose: transporting programs securely

New Approaches to Mobile Code: Reconciling Execution Efficiency with Provable Security
Michael Franz
University of California at Irvine

Technical Objective (1)
• design the third line of defense in a mobile-code system
  – first line of defense: access control (physical, logical)
  – second line of defense: authentication
  – new third line of defense: prevent execution unless provably secure
[Diagram: the threats shown are intrusion, false authentication, and a malicious mobile program]

Technical Objective (2)
• make this “third line of defense” a pervasive property of every computer system, not just a luxury good afforded by only a few expensive ultra-secure high-end installations
• rather than simply demonstrating the viability of mobile-code security, also make it practical across a wide spectrum of applications
• in this context, practical means scalable to large applications, with excellent final code quality, at reasonable just-in-time compilation speed and cost

Existing Practice: Java
• “Java” is the de-facto standard format for distributing mobile programs
• when we speak of “distributing mobile programs using Java”, we in fact usually mean “using the Java Virtual Machine”
• the JVM has an instruction set that has been designed specifically for representing Java programs
  – interestingly enough, there still are JVM programs for which no legal equivalent Java source program exists

Existing Practice: Java Security
• although the Java programming environment is typesafe, programs compiled from Java into JVM-code must be re-checked upon arrival because they may have been corrupted in transit

    class MyClient { import MyLibrary; {...MyLibrary.NoSecret();…} }
    class MyLibrary { public void NoSecret(); private void ASecret(); }

    JVM-code stream:            call MyLibrary.NoSecret ...
    corrupted JVM-code stream:  call MyLibrary.ASecret ...

Existing Practice: Java Security
• Java’s byte-code security model requires time-consuming static verification and/or dynamic checking while the code is being executed

    IF THEN ... ELSE MyLibrary.ASecret()

• systematic study of security issues is still in its infancy

Existing Practice: JVM Performance
• upon arrival at a target machine, most JVM code is translated into the appropriate native code “just-in-time”
• performance resulting from “just-in-time” compilation is not competitive with off-line compilers
  – compilation systems such as Sun’s HotSpot are incredibly complex and haven’t delivered on their promise
• JVM approach is unlikely to scale to large programs requiring top-level performance

Raising JVM Performance
• raising the performance of JVM-code has been addressed by “annotating” the byte-code stream with compiler back-end related information
• “annotated” class-files run much faster if an annotation-aware byte-code compiler is available on the target platform
• security is lost: the “annotations” are not optional to the annotation-aware compiler; if an adversary falsifies them, the compiler will create a program that may be unsafe!
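
The annotation problem described above can be illustrated with a toy model. This is a minimal sketch, not taken from the slides: the class `AnnotationTrustDemo`, the type `AnnotatedLoad`, and both method names are hypothetical, and the load uses a constant index so that the claim can be checked directly. It swaps in plain re-validation at the consumer for the proof-based guarding that the deck proposes, purely to show why an unguarded annotation cannot be trusted.

```java
// Toy model of a falsifiable compiler annotation. All names are hypothetical.
final class AnnotationTrustDemo {

    /** An array load with a constant index, plus the producer's claim that no bounds check is needed. */
    static final class AnnotatedLoad {
        final int index;
        final boolean claimsInBounds; // transported annotation; an adversary can flip this bit in transit

        AnnotatedLoad(int index, boolean claimsInBounds) {
            this.index = index;
            this.claimsInBounds = claimsInBounds;
        }
    }

    /** Models a compiler that trusts the annotation. Returns true if a bounds check is emitted. */
    static boolean emitsCheckWhenTrusting(AnnotatedLoad load) {
        return !load.claimsInBounds;
    }

    /** Models a consumer that re-validates the claim first. Returns true if a bounds check is emitted. */
    static boolean emitsCheckWhenValidating(AnnotatedLoad load, int arrayLength) {
        boolean claimHolds = load.index >= 0 && load.index < arrayLength;
        // The check is dropped only if the claim was made AND the consumer could confirm it.
        return !(load.claimsInBounds && claimHolds);
    }
}
```

With a falsified annotation such as `new AnnotatedLoad(10, true)` against an array of length 4, the trusting compiler drops the check and would produce unsafe code, while the validating consumer keeps it. In a realistic setting the claim would not be checkable this cheaply, which is where a transported proof becomes attractive.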

Emerging Practice: PCC
• ship a native program along with a “proof” that it doesn’t violate a given security policy
• although more general security policies are imaginable, current PCC systems essentially use type safety (and concomitant memory safety) as their security policy (our approach does the same)
• PCC drastically reduces the size of the trusted computing base

Emerging Practice: PCC - Problems
• PCC is based on native code
  – (otherwise the trusted computing base would become larger again, defeating the main advantage of PCC)
• PCC has the performance advantages of fully optimized code, but requires multiple versions for multiple platforms
• also, in the long run, dynamically generated code (using feedback from dynamic profiling) will generally outperform statically compiled native code

Our Technical Approach
• study the interaction of security-related information, optimization-enhancing information, and compression, rather than considering them separately
  – use syntax-directed compression as a means of obtaining guaranteed referential integrity
  – transport compiler-related annotations to obtain top-level performance on the eventual target machine
  – use a proof-based approach to guard the compiler-related annotations from falsification in transit

Our Technical Approach (2)
• no single focus on security, code quality, or encoding density, but attempt to study their interaction and make progress along all three dimensions
• preliminary evidence suggests that these three topics are strongly interrelated and that representations based on adaptive compression of syntax trees are ideally suited for transporting mobile programs
• this research is orthogonal and complementary to work on authentication and security policies

Our Policy Assumptions
• type safety using the typing model of the source language
  – all of the host’s library routines are guaranteed to be called with parameters of the correct types
  – capabilities (object pointers) owned by the host can be manipulated by the mobile client application only as specified in the host’s interface definition (private, protected, …) and cannot be forged
• type safety is guaranteed by our mobile code transportation scheme

Compression vs. Security
• code compression and security may often be complementary
• idea: choose an encoding that can express only legal programs
• example: int i, j, k, l; float r, s, t, u; {i=j}

    :=   operator
    i    first operand (1 out of 8)
    j    second operand (1 out of 3 or 4!)

• higher-level encodings: enumerate all legal assignments = at most 2 * (n choose 2) possibilities

Virtual Machines vs. Graphs
• information is lost when compiling to the “flat” representation of virtual machines
• many native code optimizations require this information to be re-discovered
[Figure: a graph-based representation next to the corresponding virtual machine representation, which contains branch instructions such as “bra +2”]
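
The encoding idea from the Compression vs. Security slides can be sketched in a few lines of Java. This is an illustration under stated assumptions rather than the project's actual encoder: the class `TypedOperandCodec` and its methods are hypothetical, only same-type assignments are treated as legal (no int-to-float widening), and self-assignment is excluded, which corresponds to the “1 out of 3” reading of the slide. The second operand is coded as an index into the list of legal candidates, so an ill-typed assignment has no code word at all.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: an assignment "target = source" is coded as two small indices.
final class TypedOperandCodec {
    // Declarations from the slide example: int i, j, k, l; float r, s, t, u;
    static final String[] NAMES = {"i", "j", "k", "l", "r", "s", "t", "u"};
    static final String[] TYPES = {"int", "int", "int", "int", "float", "float", "float", "float"};

    static int indexOf(String name) {
        for (int n = 0; n < NAMES.length; n++) {
            if (NAMES[n].equals(name)) return n;
        }
        return -1;
    }

    /** Variables that may legally be assigned to the target (assumption: same type, not the target itself). */
    static List<String> legalSources(int targetIndex) {
        List<String> candidates = new ArrayList<>();
        for (int n = 0; n < NAMES.length; n++) {
            if (n != targetIndex && TYPES[n].equals(TYPES[targetIndex])) candidates.add(NAMES[n]);
        }
        return candidates;
    }

    /** Returns {target index (1 out of 8), source index (1 out of 3)}; ill-typed assignments are rejected. */
    static int[] encode(String target, String source) {
        int t = indexOf(target);
        if (t < 0) throw new IllegalArgumentException("unknown variable: " + target);
        int s = legalSources(t).indexOf(source);
        if (s < 0) throw new IllegalArgumentException("not expressible: " + target + " = " + source);
        return new int[] {t, s};
    }

    /** Every in-range code word decodes to a well-typed assignment. */
    static String decode(int[] code) {
        return NAMES[code[0]] + " = " + legalSources(code[0]).get(code[1]);
    }
}
```

Here `encode("i", "j")` yields the pair `{0, 0}` and decodes back to `i = j`, while `encode("i", "r")` throws because a float source is not among the candidates for an int target. The sketch stores the second index in a full `int`; a real encoder would spend only as many bits as three candidates require, which is where the compression gain comes from.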

Performance-Enhancing Information
• compiler-related information intended for improving code quality re-introduces redundancy that can be exploited by an adversary
• for example, a program can be encoded with guaranteed referential integrity using a grammar close to the semantics of the source language
• but in order to allow optimizations, the grammar needs to be relaxed
• the “holes” in the relaxed grammar need to be guarded by other means based on proof-carrying code concepts

General Approach Taken
• use encoding-inherent security wherever possible (a well-formedness property of the encoding itself)
• use proof-based security where necessary to support optimizations
  – transporting results of alias analysis
  – removing range or type checks
• this approach applies regardless of the semantic level on which the program is being transported
• but the correct choice of such a semantic level must also be considered!

Highest-Level Encoding
• simple and easily understood security policy based on type safety
• ultra-compact representation using grammar-based compression
• guaranteed referential integrity provided essentially “for free” by the encoding
  – relatively small amount of proof-based security required only for additional performance-enhancing annotations
  – e.g., exceptions, alias analysis, escape analysis, dynamic type safety
• time required for dynamic compilation may be a problem

Project Workflow “High Level Thread”
1. guarantee complete static semantics through encoding
2. compression of Java programs
3. reduced verification effort due to abstract grammar encoding
[Workflow diagram with the columns Compression, Encoding, and Proofs. Labels: P2K, JAG, arithmetic encoding, dictionary encoding, combination heuristics, Java abstract grammar, (enhanced) static semantics, annotated JAG, efficient annotated JAG, theorem prover, well-formedness]

Lowest-Level Encoding
• compiler-oriented intermediate representation
• goal is to provide much better code quality with far less effort at the code consumer’s site
• requires more proof-based security than the “high-level” approach, but still far less than the “original PCC idea” where the goal is to reduce the TCB
• more voluminous transportation format
• could be more difficult to reason about safety because further removed from the source language

Project Workflow “Low Level Thread”
1. universal (source-language neutral) abstract syntax tree representation
2. UAST after performing all target-machine independent optimizations
3. encoding for the proofs required to guard the TASSA
4. provably secure target-machine independent low-level representation
[Workflow diagram with the columns Compression, Encoding, and Proofs. Labels: SSA-directed encoding, annotation encoding, typed SSA, annotated typed SSA, secure annotated typed SSA, theorem prover]

Third Way: Core Calculus
• two-stage mapping of the mobile code
  – source constructs are mapped to the core calculus
  – mapping may be transported as well, or assumed global shared knowledge
• simple and easily understood security policy
• only approach that is easily extensible even by third parties
• not clear if this approach will yield adequate native code quality at the consumer’s site
• the relative trade-offs are as yet unknown

Current Status and Rationale
• developed a comprehensive library of stream compressors in Java
• “high-level” encoding prototype is up and running
  – working on a contribution to PLDI 2001 on Java compression
• “low-level” encoding and “core calculus” prototypes will be operational over the summer
• the relative trade-offs (encoding density vs. decoding/dynamic compilation speed vs. code quality) can only be determined by collecting experience with actual prototypes

Quantitative Metrics
• security
  – publish complete design specification and rationale and open the design to public scrutiny and external validation
• efficiency
  – measure by comparing generated code quality with that of existing on-the-fly compilers
• code density
  – measure by comparing with competing proof-carrying code and mobile-code distribution formats

Expected Major Achievements
• demonstrate that graph-based encoding formats are superior to virtual machines
• explore the relative trade-offs that can only be determined by building an actual prototype
  – encoding density/network transfer speed vs.
  – decoding/dynamic compilation speed
  – code quality, especially when using the core-calculus approach
• publish a design rationale that can form the basis of a subsequent standardization effort

Long-Term Impact
• enable an educated choice of a replacement technology at the end of the Java Virtual Machine’s life-cycle
• royalty-free and free of particular proprietary intellectual property claims
• developed under the scrutiny of and in dialogue with the security community

Task Schedule
Timeline: 1999, 2000, 2001, 2002
Y1 Milestones:
• source-level representation => Java compression
• low-level representation
• core calculus representation
investigate:
• multiple source languages
• graph-based encoding schemes
• proof-carrying code
Y2 Milestones:
• 3 system prototypes
• trade-off analysis
• encoding format comprehensive definition
investigate:
• requirements of optimizing code generators
• integration of security vs. compiler-related data
End of Project:
• system deliverable
• comprehensive documentation
investigate:
• mutual interaction of security, efficiency, and compression density
• security of system

Transition of Technology
• the final design rationale document will provide enough detail that unrelated third parties will be able to replicate our code-transportation scheme(s)
• our prototype implementation(s) will be made available in source form
• the graduate students involved in this work are likely to transfer into the industrial sector

Thank You