Secure Cloud Computing
and Cloud Forensics
Dr. Bhavani Thuraisingham
The University of Texas at Dallas (UTD)
October 2010
Cloud Computing: NIST Definition
• Cloud computing is a pay-per-use model for enabling available, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and comprises five key characteristics, three delivery models, and four deployment models.
• Key Characteristics: On-demand self-service, Location-independent resource pooling, Rapid elasticity, Pay per use.
• Delivery Models: Cloud Software as a Service (SaaS), Cloud Platform as a Service (PaaS), Cloud Infrastructure as a Service (IaaS).
• Deployment Models: Private cloud, Community cloud, Public cloud, Hybrid cloud.
• Our goal is to demonstrate policy-based assured information sharing on clouds.
Security Challenges for Clouds
• Policy
– Access Control and Accountability
• Data Security and Privacy Issues
– Third-party publication of data; security challenges associated with data outsourcing
– Data at the different sites have to be protected, with the end results being made available; querying encrypted data
– Secure Query Processing/Updates in the Cloud
• Secure Storage
• Security Related to Virtualization
• Cloud Monitoring
• Protocol and Network Security for Clouds
• Identity Management
• Cloud Forensics
Layered Framework
[Figure 2: Layered Framework for Assured Cloud. Layers, top to bottom: Policies (XACML); Application (Law Enforcement); QoS / Resource Allocation; HIVE/SPARQL/Query; Hadoop/MapReduce/Storage; XEN/Linux/VMM. Cross-cutting components: Risks/Costs, Cloud Monitors, Secure Virtual Network Monitor.]
Approach: Study the problem with current principles and technologies and then develop principles for secure cloud computing.
Secure Query Processing with
Hadoop/MapReduce
• We have studied Clouds based on Hadoop
• Query Rewriting and Optimization Principles defined and
implemented for two types of data
• (i) Relational data: Secure query processing with HIVE
• (ii) RDF Data: Secure query processing with SPARQL
• Demonstrated with XACML Policies (content, temporal, association)
• Joint demonstration with King's College London and the University of Insubria
– First demo (2010): Each party submits their data and policies
– Our cloud will manage the data and policies
– Second demo (2011): Multiple clouds
Principles of Secure Query
Optimization
• Query optimization principles defined and strategies implemented in the 1970s and 1980s for relational data (IBM System R, DB2, Ingres)
– Query Rewriting (a minimal sketch follows this slide), Query Evaluation Procedures, Search strategies, Cost functions
• Secure query optimization principles defined and strategies
implemented in the 1980s and 1990s (Honeywell, MITRE)
• Extended secure query optimization for cloud environment
– Query optimization for RDF data
– Secure query optimization for RDF data
– Secure query optimization for RDF data in a cloud environment
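To make the query-rewriting step concrete, here is a minimal sketch (not the UTD implementation) that appends policy-derived filter predicates to a relational query before it reaches the optimizer; the table, columns, and predicates are hypothetical.

```java
import java.util.List;

// Minimal sketch of policy-based query rewriting (hypothetical schema/policies).
public class SecureQueryRewriter {

    // Rewrite "SELECT ... FROM t" into "SELECT ... FROM t WHERE <policy predicates>"
    // so that rows the subject may not see are filtered before evaluation.
    public static String rewrite(String query, List<String> policyPredicates) {
        if (policyPredicates.isEmpty()) {
            return query;
        }
        String guard = String.join(" AND ", policyPredicates);
        // Naive handling: assumes the original query has no WHERE clause;
        // a real rewriter would operate on the parsed query tree.
        return query + " WHERE " + guard;
    }

    public static void main(String[] args) {
        // Content-dependent and time-dependent policies as predicates (hypothetical).
        List<String> predicates = List.of(
                "classification <= 'SECRET'",        // content-dependent access control
                "record_date >= DATE '2009-01-01'"   // time-dependent access control
        );
        System.out.println(rewrite("SELECT name, location FROM incidents", predicates));
        // -> SELECT name, location FROM incidents WHERE classification <= 'SECRET' AND ...
    }
}
```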
Fine-grained Access Control with Hive
Hive is a data warehouse infrastructure built on top of Hadoop that provides tools to enable easy data summarization, ad hoc querying, and analysis of large datasets stored in Hadoop files. It provides a mechanism to put structure on this data, and it also provides a simple query language called HiveQL, which is based on SQL and enables users familiar with SQL to query this data.
 Policies include content-dependent access control, association-based access control, and time-dependent access control
 Table/View definition and loading: Users can create tables as well as load data into tables. Further, they can also upload or create XACML policies for the tables they are creating.
 Users can define views only if they have permissions for all tables specified in the query used to create the view. They can also either specify or create XACML policies for the views they are defining. (A client-side sketch follows this slide.)
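As a rough illustration of this workflow from a client's perspective, the sketch below creates and loads a table through Hive's JDBC interface (the modern HiveServer2 URL is shown; the 2010-era prototype would have used the original Hive server). The table name, file path, and the policy-registration step are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Sketch of table creation and loading through Hive's JDBC interface
// (hypothetical table/paths; policy upload shown only as a comment).
public class HiveTableSetup {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "analyst", "");
        try (Statement stmt = conn.createStatement()) {
            stmt.execute("CREATE TABLE incidents (id INT, name STRING, "
                    + "classification STRING) ROW FORMAT DELIMITED "
                    + "FIELDS TERMINATED BY ','");
            stmt.execute("LOAD DATA INPATH '/user/analyst/incidents.csv' "
                    + "INTO TABLE incidents");
            // In the access-control design described above, an XACML policy
            // for this table would be registered with the policy store here,
            // and later queries would be rewritten/checked against that policy.
        }
        conn.close();
    }
}
```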
Fine-grained Access Control with Hive: System Architecture
[Figure: system architecture diagram of the fine-grained access control prototype]
SPARQL Query Optimizer for Secure
RDF Data Processing
• Developed a secure query optimizer and query rewriter for RDF data with XACML policies, implemented on top of JENA (a minimal query sketch follows this slide)
• Storage Support
– Built a storage mechanism for very large RDF graphs
for JENA
– Integrated the system with Hadoop for the storage of
large amounts of RDF data (e.g. a billion triples)
– Need to incorporate secure storage strategies
developed in FY09
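For reference, a minimal (unsecured) SPARQL query over an RDF model using the Jena API looks like the sketch below; in the system described above, the query string would first pass through the XACML-driven validator and rewriter. Current Apache Jena package names are used (the 2010 prototype predates them), and the data file and predicate URI are hypothetical.

```java
import org.apache.jena.query.*;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

// Minimal Jena SPARQL example (hypothetical data); a secure optimizer would
// rewrite the query against XACML policies before this point.
public class SparqlDemo {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        model.read("incidents.rdf");  // hypothetical RDF file

        String sparql = "SELECT ?s ?name WHERE { "
                + "?s <http://example.org/name> ?name }";
        Query query = QueryFactory.create(sparql);
        try (QueryExecution qexec = QueryExecutionFactory.create(query, model)) {
            ResultSet results = qexec.execSelect();
            while (results.hasNext()) {
                System.out.println(results.next());
            }
        }
    }
}
```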
System Architecture
[Figure: MapReduce-based architecture for secure RDF query processing. A web interface accepts new data and queries and returns answers. New data passes through a data preprocessor (N-Triples converter, prefix generator, predicate-based splitter, predicate-object-based splitter) into the MapReduce framework. Queries pass through a parser, a query validator & rewriter, the XACML PDP, a policy-based query rewriter, a plan generator, and a plan executor on the server backend.]
Security for Amazon S3
• Many organizations are using cloud services like Amazon S3 for data storage. A few important questions arise here:
– Can we use S3 to store the data sources used by Blackbook? Is the data we store on S3 secure? Is it accessible by any user outside our organization? How do we restrict access to files to the users within the organization?
– BLACKBOOK is a semantic-web based tool used by analysts within the Intelligence Community. The tool federates queries across data sources. These data sources are databases or applications located either locally or remotely on the network. BLACKBOOK allows analysts to make logical inferences across the data sources, add their own knowledge, and share that knowledge with other analysts using the system.
• We use Amazon S3 to store the data sources used by Blackbook.
• To keep our data secure, we encrypt the data using AES (Advanced Encryption Standard) before uploading the data files to Amazon S3 (a sketch follows this slide).
• To restrict access to the files to users within the organization, we implemented RBAC policies using XACML.
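Here is a minimal sketch of the encrypt-then-upload step, assuming the AWS SDK for Java (v1) and ignoring key management; the bucket and object names are hypothetical, and this is not the Blackbook code itself.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.io.ByteArrayInputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ObjectMetadata;

// Encrypt a data file with AES, then upload the ciphertext to S3
// (hypothetical bucket/object names; key management omitted).
public class EncryptAndUpload {
    public static void main(String[] args) throws Exception {
        byte[] plaintext = Files.readAllBytes(Paths.get("datasource.rdf"));

        // In practice the key would come from a key-management service,
        // not be generated fresh on every upload.
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();
        // "AES" alone defaults to ECB mode; shown for brevity only. A real
        // deployment would use an authenticated mode such as AES/GCM with an IV.
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] ciphertext = cipher.doFinal(plaintext);

        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        ObjectMetadata meta = new ObjectMetadata();
        meta.setContentLength(ciphertext.length);
        s3.putObject("blackbook-datasources", "datasource.rdf.enc",
                new ByteArrayInputStream(ciphertext), meta);
    }
}
```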
XACML Design Implementation
in Hadoop
• Until July 2010, there was little security in Hadoop
• We have designed XACML-based access control for Hadoop
• Use of the In-line Reference Monitor concept is being explored (sketched below)
• Examining current Hadoop security (released July 2010); the XACML implementation will be completed by December 2010
• Also examining accountability for Hadoop (with Purdue)
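To make the In-line Reference Monitor idea concrete, here is a minimal sketch (not the UTD design) that interposes a policy check in front of Hadoop's FileSystem.open(); the PolicyDecisionPoint interface stands in for a real XACML PDP and is hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch of an in-line reference monitor for HDFS reads: every open()
// is mediated by a policy decision point before the file is returned.
public class GuardedFileSystem {

    /** Stand-in for a real XACML PDP (hypothetical interface). */
    public interface PolicyDecisionPoint {
        boolean permits(String subject, String action, Path resource);
    }

    private final FileSystem fs;
    private final PolicyDecisionPoint pdp;

    public GuardedFileSystem(Configuration conf, PolicyDecisionPoint pdp)
            throws Exception {
        this.fs = FileSystem.get(conf);
        this.pdp = pdp;
    }

    public FSDataInputStream open(String subject, Path path) throws Exception {
        if (!pdp.permits(subject, "read", path)) {
            throw new SecurityException("Access denied to " + path);
        }
        return fs.open(path);
    }
}
```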
Secure VMM: Xen Architecture
 Xen Hypervisor: The hypervisor runs just on top of the hardware and traps all calls by VMs to access the hardware.
 Domain 0 (Dom0): Domain 0 is a modified version of Linux that is used to manage the other VMs.
 Domain U (DomU): Domain U is the user domain in Xen. DomU is where all of the untrusted guest OSs reside.
Virtual Machines
DomU is broken into two parts: Para-Virtualized Domains (PV) and Hardware-Assisted Virtualized Domains (HVM).
 Para-virtualized Domain (PV): A para-virtualized domain runs a modified operating system that is aware that it is a virtual machine. It can achieve near-native performance.
 Hardware-Assisted Virtualized Machine Domain (HVM): HVMs are VMs that run operating systems that have not been modified to work with Dom0. This allows closed-source operating systems like Windows.
Memory: PVs are given read-only access to their page tables, and any updates are controlled by the hypervisor. HVMs are given a shadow page table because they do not know how to work with non-contiguous physical address spaces.
I/O Management: I/O management is controlled by Dom0. PVs share memory with Dom0 through which they can pass messages to it. Dom0 runs the QEMU daemon to emulate devices for the HVMs.
Security Issues
Access Control: At the moment access control is discretionary. Fine-grained multilevel controls are needed (Integrity Lock architecture).
Secure Boot: The boot process needs to be secured. Proper attestation methods need to be developed.
Component Isolation: Dom0 supports networking, disk I/O, VM boot loading, hardware emulation, workload balancing, etc. Dom0 needs to be decomposed into components.
Logging: More robust logging is needed to develop a clear view of the chain of events.
Introspection: Introspection is a security technique where a virtual machine running security software is allowed to look inside the memory of another VM. Software such as IPSs and antiviruses using introspection should be safe from tampering if the monitored VM is exploited.
Overall Architecture of Accountable Grid Systems (Purdue)
• Accountability agents
• Strategies for accountability data collection
• Exchange of information among accountability agents to generate alarms
Data Collection Approaches
• Job-flow based approach
– Jobs flow across different organizational units
– Long computations are often divided into many sub-jobs to be run in parallel
– A possible approach is to employ point-to-point agents which collect data at each node that the job traverses (sketched below)
• Grid node based approach
– Focuses on a given location in the flow and at a given instant of time for all jobs
– The viewpoint is fixed
• The combination of the two approaches allows us to collect complementary information
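As a small illustration of the job-flow approach (not Purdue's actual implementation), the sketch below shows an agent appending an accountability record each time a sub-job finishes on its node; all field names are hypothetical.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Sketch of job-flow accountability data: each agent appends a record
// as the job traverses its node (all field names hypothetical).
public class AccountabilityAgent {

    public record JobRecord(String jobId, String nodeId, Instant timestamp,
                            double cpuSeconds, long bytesWritten) {}

    private final String nodeId;
    private final List<JobRecord> log = new ArrayList<>();

    public AccountabilityAgent(String nodeId) { this.nodeId = nodeId; }

    // Called when a (sub-)job finishes on this node.
    public void collect(String jobId, double cpuSeconds, long bytesWritten) {
        log.add(new JobRecord(jobId, nodeId, Instant.now(), cpuSeconds, bytesWritten));
    }

    // Records are later exchanged among agents to reconstruct the job's flow.
    public List<JobRecord> records() { return List.copyOf(log); }
}
```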
Detection at the Victim Node (e.g., gatekeeper, head node)
• From the data obtained with the grid-node based strategy, the agent detects anomalies in resource consumption using methodologies such as statistical modeling and entropy-based approaches
• However, such approaches are often not accurate and result in a high rate of false detections
• Using the data about a job's flow collected with the job-flow based strategy, agents cooperate to raise one of three alarm levels (light, moderate, and critical) to further detect attacks (a scoring sketch follows this slide)
• Upon receiving a critical alarm, the agent takes proper actions such as increasing the priority of jobs identified as legal or killing malicious jobs, including jobs that may potentially perform bad operations
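As an illustration of the statistical-detection step (not the actual Purdue method), the sketch below scores a node's resource consumption as a z-score against its historical mean and maps the score onto the three alarm levels; the thresholds are hypothetical.

```java
// Sketch of statistical anomaly scoring for resource consumption,
// mapped to the three alarm levels (hypothetical thresholds).
public class AnomalyDetector {

    public enum Alarm { NONE, LIGHT, MODERATE, CRITICAL }

    // z-score of the current observation against historical mean/stddev.
    public static double zScore(double observed, double mean, double stddev) {
        return stddev == 0 ? 0 : Math.abs(observed - mean) / stddev;
    }

    public static Alarm classify(double z) {
        if (z > 6) return Alarm.CRITICAL;   // e.g., kill malicious jobs
        if (z > 4) return Alarm.MODERATE;
        if (z > 2) return Alarm.LIGHT;
        return Alarm.NONE;
    }

    public static void main(String[] args) {
        // Historical CPU usage of a head node: mean 35%, stddev 5%.
        double z = zScore(78.0, 35.0, 5.0);
        System.out.println("z = " + z + ", alarm = " + classify(z)); // z = 8.6 -> CRITICAL
    }
}
```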
Cloud Forensics
(Keyun Ruan, University College Dublin)
• Forensic Readiness
– General Forensic Readiness; Synchronization of Data; Location Anonymity; Identity
Management; Encryption and Key Management; Log Format
• Challenges Unique to Cloud
– Multi-tenancy and Resource Sharing; Multiple Jurisdictions; Electronic Discovery
• Challenges Exacerbated by the Cloud
– The Velocity of Attack Factor; Malicious Insider; Data Deletion; Hypervisor-level Investigation;
Proliferation of Endpoints
• Opportunities with Cloud
– Cost-effectiveness; Robustness; Scalability and Flexibility; Forensics as a Cloud Service;
Standards and Policies
Current and Future Research
• Secure VMM (Virtual Machine Monitor)
– Exploring XEN VMM and examining security issues
• Demonstration using the Secure Cloud with North Central Texas
Fusion System Data (with ADB Consulting)
• Coalition demonstration (with London and Italy)
• Integrate Secure Storage Algorithms into the Storage System
Developed (2011)
• Identity Management (2011 with Purdue)
• Secure Virtual Network Monitor (Future, 2012)
• Cloud Forensics
Education Program
• We offer a course in Cloud Computing (taught by an industry adjunct professor; Spring 2009)
• A course planned for Summer 2011 will incorporate the research results (Building and Securing the Cloud)
– Topics covered will include technologies, techniques,
tools and trends