Homework 1: Common Mistakes
* Memory leak.
* Storing memory pointers instead of data.

Memory Leak
* A program that uses "new" ("malloc" in C) without "delete" ("free") suffers from a memory leak. Why?
* C++ "new" (C "malloc") allocates space from the heap. If the program loses all references to this space, the memory remains unusable until the program stops executing.
* Example pseudo-code:
      For cntr ranging from 1 to 100 do
      BEGIN
          Record = new byte[1024];
          Set Record fields
          Insert the record into the database
      END

Memory Leak
* Why is a memory leak bad? As the program executes, the available heap space shrinks. This space is allocated from the virtual memory managed by the operating system. If the virtual address space exceeds the available memory and the program starts to thrash, it becomes very, very, ..., very slow!
* Correct pseudo-code (shown in C++ below):
      For cntr ranging from 1 to 100 do
      BEGIN
          Record = new byte[1024];
          Set Record fields
          Insert the record into the database
          delete [] Record;
      END
* The delete returns the 1 kilobyte back to the heap. (Note that memory allocated with new[] must be released with delete [], not plain delete.)

Bad Design: Store Pointers
* Insertion of record 1 inserts pointers, not data. [Figure: record 1's Name field holds the memory address of the string "Shahram" rather than the string itself.]
* Why is this design bad? When the program stops executing, all the memory addresses (pointers) stored in the database become invalid. Even while it runs, the operating system may move "Shahram" around, invalidating the stored memory address.

Good Design: Store Data
* Serialize the record to generate data (a serialization sketch follows below). [Figure: record 1 carries its id, Age, and Name ("Shahram") values inline.]
* Insertion of record 1 inserts the data itself into the DBMS. This is the right design!
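To make the fix concrete, here is a minimal C++ rendering of the corrected loop above. setFields and insertIntoDatabase are hypothetical stand-ins for the homework's record set-up and DBMS insert, not part of any assigned API:

    #include <cstddef>
    #include <cstring>

    // Hypothetical stand-ins for setting the record's fields and the DBMS insert.
    static void setFields(unsigned char* record, std::size_t size) { std::memset(record, 0, size); }
    static void insertIntoDatabase(const unsigned char*, std::size_t) { /* DBMS insert happens here */ }

    int main() {
        for (int cntr = 1; cntr <= 100; ++cntr) {
            unsigned char* record = new unsigned char[1024];  // 1 KB from the heap
            setFields(record, 1024);
            insertIntoDatabase(record, 1024);
            delete [] record;  // returns the 1 KB; omitting this line leaks 100 KB over the loop
        }
        return 0;
    }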
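And a sketch of the serialization step from the Good Design slide above, assuming an illustrative record layout (4-byte id, 4-byte Age, 4-byte Name length, then the Name bytes); a real DBMS would dictate its own format:

    #include <cstdint>
    #include <cstring>
    #include <string>
    #include <vector>

    // Pack (id, Age, Name) into one self-contained byte array. The DBMS stores
    // these bytes, which stay meaningful across executions; a pointer would not.
    std::vector<unsigned char> serialize(int32_t id, int32_t age, const std::string& name) {
        std::vector<unsigned char> buf(12 + name.size());
        const int32_t len = static_cast<int32_t>(name.size());
        unsigned char* p = buf.data();
        std::memcpy(p,      &id,  4);
        std::memcpy(p + 4,  &age, 4);
        std::memcpy(p + 8,  &len, 4);
        std::memcpy(p + 12, name.data(), name.size());
        return buf;
    }

    int main() {
        std::vector<unsigned char> record = serialize(1, 25, "Shahram");
        (void)record;  // insert record's bytes, not its address, into the DBMS
        return 0;
    }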
Homework 2
* Posted on the 585 web site (http://dblab.usc.edu/csci585) and due on Feb 24th.
* Objectives: use of primary and secondary indexes; highlighting a few limitations of BDB when configured in main memory; use of transactions to maintain the integrity of a main-memory database.
* Read the homework description for the in-class review this Thursday.

Gamma DBMS (Part 3): Function Shipping versus Data Shipping, Evaluation
Shahram Ghandeharizadeh
Computer Science Department, University of Southern California

Data Shipping
* The client retrieves data from the node and performs the computation locally.
* Limitation: dumb servers, and heavy use of the limited network bandwidth. [Figure: the node transmits its data to the client, which processes f(x).]

Function Shipping
* The client ships the function f(x) to the node for processing; only the relevant output is sent back to the client.
* f(x) should produce less data than the original data stored in the database; this minimizes demand for network bandwidth. [Figure: the node processes f(x) and transmits only the output of f(x).]

Gamma
* Gamma is based on function shipping.
* Hybrid-hash join partitions the referenced tables across the nodes of the shared-nothing architecture (see the declustering sketch below). Data does not leave the realm of the shared-nothing hardware.

Service Time
* Focus on query service time (only one request executing in the system) as a function of input table size.
* Hash partition the table.
* Store the results of each query back in the database. Why?
* Seek time is a function of the distance traveled by the disk head.

Join Queries
* Join tables A and Bprime; A is 10X the size of Bprime.
* The join produces the same number of records as Bprime. Why?
* Note that re-partitioning the table is not that expensive.

How to Evaluate?
* Focus on the use of parallelism and the scalability of the system. How?
* Speedup: given a table with r rows and a query, if the service time of the system is X with one node, does it speed up by a factor of n with n nodes?
* Scaleup: if the service time of a query referencing a table with r rows on a system with n nodes is X, does the service time remain X with a table of m*r rows and m*n nodes?
* Both metrics measure service time because only one request is submitted to the system at a time. (The two metrics are written out as formulas below.)

Selection Predicates: Speedup
* Super-linear speedup with the 1% non-clustered index and 10% clustered index selections. The referenced table consists of 1 million rows.

Selection Predicates: Scaleup
[Figure: scaleup results for the selection predicates.]

Join Predicates: Speedup
* 1 bucket, starting with 5 nodes. The results would have been super-linear if Bprime did not fit in the main memory of 5 nodes.

Join Predicates: Scaleup
* Overhead of parallelism: the scheduler coordinating the activation, coordination, and de-activation of the different operators.

2009: Evolution of Gamma
* A shared-nothing architecture consisting of thousands of nodes! A node is an off-the-shelf, commodity PC.
* Divide & conquer via a layered stack: Yahoo's Pig Latin, Google's Map/Reduce framework, Google's Bigtable data model, the Google File System, ...
* Source code for Pig and Hadoop is available for free download.

References
* Pig Latin: Olston et al. Pig Latin: A Not-So-Foreign Language for Data Processing. SIGMOD 2008.
* Map/Reduce: Dean and Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. Communications of the ACM, Vol. 51, No. 1, January 2008.
* Bigtable: Chang et al. Bigtable: A Distributed Storage System for Structured Data. OSDI 2006.
* GFS: Ghemawat et al. The Google File System. SOSP 2003.

Overview: Pig Latin
* A high-level program that specifies a query execution plan.
* Example: for each sufficiently large category, retrieve the average pagerank of the high-pagerank urls in that category.
* SQL, assuming a table urls(url, category, pagerank):
      SELECT category, AVG(pagerank)
      FROM urls
      WHERE pagerank > 0.2
      GROUP BY category
      HAVING COUNT(*) > 1000000
* Pig Latin:
      Good_urls = FILTER urls BY pagerank > 0.2;
      Groups = GROUP Good_urls BY category;
      Big_groups = FILTER Groups BY COUNT(Good_urls) > 1000000;
      Output = FOREACH Big_groups GENERATE category, AVG(Good_urls.pagerank);

Overview: Map/Reduce (Hadoop)
* A programming model that makes parallelism transparent to the programmer.
* The programmer specifies a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key.
* Divides the problem into smaller "intermediate key/value" sub-problems and solves each sub-problem. Final results might be stored across R files.
* The run-time system takes care of partitioning the input data across nodes, scheduling the program's execution, node failures, and coordination among multiple nodes. (A toy sketch of the model follows.)
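To illustrate the model, here is a toy, single-process word count in C++; the in-memory std::map stands in for the shuffle that Hadoop's run-time performs across nodes, so this is a sketch of the programming model, not of Hadoop's API:

    #include <iostream>
    #include <map>
    #include <sstream>
    #include <string>
    #include <utility>
    #include <vector>

    using KV = std::pair<std::string, int>;

    // map: one input document -> intermediate (word, 1) pairs.
    std::vector<KV> mapFn(const std::string& text) {
        std::vector<KV> out;
        std::istringstream in(text);
        std::string word;
        while (in >> word) out.push_back({word, 1});
        return out;
    }

    // reduce: merge all intermediate values that share one intermediate key.
    int reduceFn(const std::vector<int>& values) {
        int sum = 0;
        for (int v : values) sum += v;
        return sum;
    }

    int main() {
        const std::vector<std::string> inputs = {"a rose is", "a rose"};
        std::map<std::string, std::vector<int>> groups;  // the "shuffle" phase
        for (const std::string& doc : inputs)
            for (const KV& kv : mapFn(doc)) groups[kv.first].push_back(kv.second);
        for (const auto& g : groups)
            std::cout << g.first << " " << reduceFn(g.second) << "\n";  // a 2, is 1, rose 2
        return 0;
    }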
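The speedup and scaleup metrics from the "How to Evaluate?" slide above, written out with $T(\text{nodes}, \text{rows})$ denoting service time:

    \mathrm{Speedup}(n) = \frac{T(1,\, r)}{T(n,\, r)} \quad (\text{ideal: } n)
    \qquad
    \mathrm{Scaleup}(m) = \frac{T(n,\, r)}{T(mn,\, mr)} \quad (\text{ideal: } 1)

Super-linear speedup, as in the selection experiments, means the ratio exceeds n, e.g., when each node's smaller partition shortens disk seeks or fits in main memory.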
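And a small sketch of the declustering step that Gamma's hybrid-hash join relies on (referenced in the Gamma slide above); the node count and hash function are illustrative assumptions:

    #include <cstdio>
    #include <functional>

    // Decluster rows by hashing the join attribute: every A row and Bprime row
    // with the same join value lands on the same node, so the nodes can compute
    // their portions of the join independently and in parallel.
    std::size_t nodeFor(long joinValue, std::size_t numNodes) {
        return std::hash<long>{}(joinValue) % numNodes;
    }

    int main() {
        const std::size_t nodes = 8;  // an assumed configuration
        for (long v : {42L, 42L, 7L})
            std::printf("join value %ld -> node %zu\n", v, nodeFor(v, nodes));
        // the two rows with value 42 are guaranteed to map to the same node
        return 0;
    }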
Overview: Bigtable
* A data model (a schema): a sparse, distributed, persistent, multi-dimensional sorted map.
* Data is partitioned across the nodes seamlessly.
* The map is indexed by a row key, a column key, and a timestamp; the value is an un-interpreted array of bytes (sketched below, after the GFS slides):
      (row: byte[ ], column: byte[ ], time: int64) -> byte[ ]
* Used in many different applications supported by Google.

Overview: GFS
* A highly available, distributed file system built from inexpensive commodity PCs. Node failures are treated as the norm rather than the exception.
* Stores and retrieves multi-GB files.
* Assumes files are append-only (rather than updated in place by writes that modify existing data); an atomic append operation enables multiple clients to append to a file with minimal synchronization.
* A relaxed consistency model simplifies the file system and enhances performance.

How to Start?
* Bottom-up, starting with GFS, the layer beneath Bigtable, Map/Reduce, and Pig Latin.

Google File System: Assumptions
Google File System: Assumptions (Cont...)

GFS: Interfaces
* Create, delete, open, close, read, and write files.
* Snapshot a file: create a copy of the file.
* Record append: allows multiple clients to append data to the same file concurrently, while guaranteeing the atomicity of each individual client's append.

GFS: Architecture
* 1 master and multiple chunkservers.
* A file is partitioned into fixed-size (64 MB) chunks. Each chunk has a 64-bit chunk handle that is globally unique.
* Each chunk is replicated on several chunkservers. The degree of replication is application specific; the default is 3.
* The master maintains all file system metadata: the namespace, access control information, the mapping from files to chunks, and the current location of chunks.
* A GFS client caches metadata about the file system, chooses one of a chunk's replicas, and receives data from that chunkserver directly (see the offset-to-chunk sketch below). Neither clients nor chunkservers cache file data.

GFS Master
* 1 master simplifies the software design, but 1 master is also a single point of failure.
* The master's metadata comprises the file and chunk namespaces, the mapping from files to chunks, and the location of each chunk's replicas.
* The master monitors the availability of chunkservers using heart-beat messages.
* The master does not store chunk location information persistently: when the master starts, it asks each chunkserver about its chunks (and does so again whenever a chunkserver joins).

Mutation = Update
* A mutation is an operation that changes the contents or the metadata of a chunk.
* A content mutation is performed on all of a chunk's replicas. The master grants a chunk lease to one of the replicas, the primary, and the primary picks a serial order for all mutations to the chunk.
* Lease: granted by the master, typically for 60 seconds; the primary may request extensions.
* If the master loses communication with a primary, it can safely grant a new lease to another replica after the current lease expires.
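A sketch of the Bigtable signature above as nested sorted maps; this is a single-node illustration of the data model only, not of Bigtable's distributed implementation (the example cells echo the webtable example in the Bigtable paper):

    #include <cstdint>
    #include <functional>
    #include <map>
    #include <string>

    // row key -> column key -> timestamp -> un-interpreted bytes.
    // std::greater orders each cell's versions newest-first.
    using Cell  = std::map<int64_t, std::string, std::greater<int64_t>>;
    using Row   = std::map<std::string, Cell>;
    using Table = std::map<std::string, Row>;

    int main() {
        Table webtable;
        webtable["com.cnn.www"]["anchor:cnnsi.com"][6] = "CNN";
        webtable["com.cnn.www"]["contents:"][5] = "<html>...";
        webtable["com.cnn.www"]["contents:"][3] = "<html>..";
        // The most recent version of a cell is the first entry of its map.
        const std::string& latest = webtable["com.cnn.www"]["contents:"].begin()->second;
        (void)latest;  // "<html>..." (timestamp 5)
        return 0;
    }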
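And the client-side translation mentioned in the GFS architecture slide: using the fixed 64 MB chunk size, a client converts a byte offset into a chunk index before asking the master for that chunk's handle and replica locations (the function name here is illustrative):

    #include <cstdint>
    #include <cstdio>

    const int64_t kChunkSize = 64LL * 1024 * 1024;  // GFS's fixed chunk size: 64 MB

    // (file name, byte offset) -> chunk index; the client sends the file name and
    // this index to the master, which replies with the chunk handle and the
    // locations of its replicas.
    int64_t chunkIndexFor(int64_t byteOffset) { return byteOffset / kChunkSize; }

    int main() {
        // An offset of 200 MB falls in chunk index 3 (the fourth chunk).
        std::printf("%lld\n", static_cast<long long>(chunkIndexFor(200LL * 1024 * 1024)));
        return 0;
    }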
Updates