DB2 – Triggers
This chapter describes triggers, their types, and how triggers are created and dropped.
Introduction
A trigger is a set of actions that are performed in response to an INSERT, UPDATE, or DELETE operation on
a specified table in the database. Triggers are stored in the database once, where they help govern the data, and they
can be accessed and shared by multiple applications. The advantage of using triggers is that when a change needs to
be made, it is made once in the trigger instead of in every application that accesses the table.
Triggers are easy to maintain and they enable faster application development. Triggers are defined using the SQL
statement "CREATE TRIGGER".
Types of triggers
There are two types of triggers:
1. BEFORE triggers
They are executed before the triggering SQL operation.
2. AFTER triggers
They are executed after the triggering SQL operation.
Creating a BEFORE trigger
First, let us create a sequence that the trigger will use to generate values:
Syntax:
db2 create sequence <seq_name>
Example: Creating a sequence for the table shopper.sales1
db2 create sequence sales1_seq as int start with 1 increment by 1
Syntax:
db2 create trigger <trigger_name> no cascade before insert on
<table_name> referencing new as <table_object> for each row set
<table_object>.<col_name>=nextval for <sequence_name>
Example: Creating a trigger on the shopper.sales1 table to insert primary key numbers automatically
db2 create trigger sales1_trigger no cascade before insert on
shopper.sales1 referencing new as obj for each row set
obj.id=nextval for sales1_seq
Now try inserting values:
db2 insert into shopper.sales1(itemname, qty, price)
values('bicks', 100, 24.00)
Retrieving values from table
Let us see how to retrieve values from a table:
Syntax:
db2 select * from <tablename>
Example:
db2 select * from shopper.sales1
Output:

ID          ITEMNAME       QTY
----------- -------------- -----------
3           bicks          100
2           bread          100

2 record(s) selected.
Creating an AFTER trigger
Let us see how to create an AFTER trigger:
Syntax:
db2 create trigger <trigger_name> after <operation> on
<table_name> for each row mode db2sql begin atomic
<SQL statements>; end
Example: [Create an AFTER trigger, then insert and retrieve values]
db2 create trigger sales1_tri_after after insert on shopper.sales1
for each row mode db2sql begin atomic update shopper.sales1
set price=qty*price; end
Output:
//inserting values in shopper.sales1
db2 insert into shopper.sales1(itemname,qty,price)
values('chiken',100,124.00)
//output

ID          ITEMNAME       QTY         PRICE
----------- -------------- ----------- -----------
3           bicks          100         2400.00
4           chiken         100         12400.00
2           bread          100         2400.00

3 record(s) selected.
Dropping a trigger
Here is how a database trigger is dropped:
Syntax:
db2 drop trigger <trigger_name>
Example:
db2 drop trigger sales1_trigger
DBMS - Joins
We understand the benefits of taking a Cartesian product of two relations, which gives us all the possible tuples
paired together. However, taking a Cartesian product may not be feasible when we encounter huge relations with
thousands of tuples and a considerably large number of attributes.
Join is a combination of a Cartesian product followed by a selection process. A Join operation pairs two tuples from
different relations, if and only if a given join condition is satisfied.
We will briefly describe various join types in the following sections.
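To see "a Cartesian product followed by a selection" in running code, here is a minimal Python sketch, added here as an illustration and not part of the original tutorial. Relations are modeled as lists of dictionaries, and the sample data anticipates the Student/Subjects tables used below.

from itertools import product

# A join forms the Cartesian product of two relations and then keeps
# only the pairs that satisfy the join condition theta.
def join(r1, r2, theta):
    return [{**t1, **t2} for t1, t2 in product(r1, r2) if theta(t1, t2)]

student = [{"SID": 101, "Name": "Alex", "Std": 10},
           {"SID": 102, "Name": "Maria", "Std": 11}]
subjects = [{"Class": 10, "Subject": "Math"},
            {"Class": 11, "Subject": "Music"}]

# Join condition: Student.Std = Subjects.Class
rows = join(student, subjects, lambda s, c: s["Std"] == c["Class"])
# -> Alex paired with Math, Maria paired with Music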
Theta (θ) Join
Theta join combines tuples from different relations provided they satisfy the theta condition. The join condition is
denoted by the symbol θ.
Notation
R1 ⋈θ R2
R1 and R2 are relations with attributes (A1, A2, ..., An) and (B1, B2, ..., Bn) such that the attributes have
nothing in common, that is, R1 ∩ R2 = Φ.
Theta join can use all kinds of comparison operators.
Student
SID    Name    Std
101    Alex    10
102    Maria   11

Subjects
Class  Subject
10     Math
10     English
11     Music
11     Sports
Student_Detail −
STUDENT ⋈(Student.Std = Subject.Class) SUBJECT
Student_detail
SID    Name    Std   Class   Subject
101    Alex    10    10      Math
101    Alex    10    10      English
102    Maria   11    11      Music
102    Maria   11    11      Sports
Equijoin
When a theta join uses only the equality comparison operator, it is called an equijoin. The above example
corresponds to an equijoin.
Natural Join (⋈)
Natural join does not take an explicit comparison operator. It does not concatenate the way a Cartesian product
does. We can perform a natural join only if there is at least one common attribute between the two relations. In
addition, the attributes must have the same name and domain.
Natural join acts on those matching attributes where the values of the attributes in both relations are the same.
Courses
CID    Course        Dept
CS01   Database      CS
ME01   Mechanics     ME
EE01   Electronics   EE

HoD
Dept   Head
CS     Alex
ME     Maya
EE     Mira
Courses ⋈ HoD
Dept   CID    Course        Head
CS     CS01   Database      Alex
ME     ME01   Mechanics     Maya
EE     EE01   Electronics   Mira
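The matching-on-common-attribute-names rule can be sketched the same way. The following Python fragment is an illustration assumed for this rewrite, not the tutorial's own code; it reproduces the Courses ⋈ HoD result above.

# Natural join: match on every attribute name the two relations share
# (here only "Dept") and emit each shared attribute once.
def natural_join(r1, r2):
    common = set(r1[0]) & set(r2[0])   # shared attribute names (assumes non-empty relations)
    return [{**t1, **t2} for t1 in r1 for t2 in r2
            if all(t1[a] == t2[a] for a in common)]

courses = [{"CID": "CS01", "Course": "Database", "Dept": "CS"},
           {"CID": "ME01", "Course": "Mechanics", "Dept": "ME"},
           {"CID": "EE01", "Course": "Electronics", "Dept": "EE"}]
hod = [{"Dept": "CS", "Head": "Alex"}, {"Dept": "ME", "Head": "Maya"},
       {"Dept": "EE", "Head": "Mira"}]

print(natural_join(courses, hod))   # one row per matching Dept value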
Outer Joins
Theta Join, Equijoin, and Natural Join are called inner joins. An inner join includes only those tuples with matching
attributes and the rest are discarded in the resulting relation. Therefore, we need to use outer joins to include all the
tuples from the participating relations in the resulting relation. There are three kinds of outer joins − left outer join,
right outer join, and full outer join.
Left Outer Join (R ⟕ S)
All the tuples from the Left relation, R, are included in the resulting relation. If there are tuples in R without any
matching tuple in the Right relation S, then the S-attributes of the resulting relation are made NULL.
Left
A      B
100    Database
101    Mechanics
102    Electronics

Right
A      B
100    Alex
102    Maya
104    Mira
Left ⟕ Right
A      B             C      D
100    Database      100    Alex
101    Mechanics     ---    ---
102    Electronics   102    Maya
Right Outer Join (R ⟖ S)
All the tuples from the Right relation, S, are included in the resulting relation. If there are tuples in S without any
matching tuple in R, then the R-attributes of resulting relation are made NULL.
Left ⟖ Right
A      B             C      D
100    Database      100    Alex
102    Electronics   102    Maya
---    ---           104    Mira
Full Outer Join (R ⟗ S)
All the tuples from both participating relations are included in the resulting relation. If there are no matching tuples
for both relations, their respective unmatched attributes are made NULL.
Left ⟗ Right
A      B             C      D
100    Database      100    Alex
101    Mechanics     ---    ---
102    Electronics   102    Maya
---    ---           104    Mira
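As a rough sketch of the padding behavior just described, the following illustrative Python (an assumption of this rewrite, with None standing in for NULL) reproduces the outer-join tables above.

# Left outer join keeps every tuple of r1, padding unmatched ones with
# NULLs; a full outer join adds the r2 tuples that matched nothing.
def left_outer(r1, r2, cond, pad):
    out = []
    for t1 in r1:
        matches = [{**t1, **t2} for t2 in r2 if cond(t1, t2)]
        out += matches if matches else [{**t1, **pad}]
    return out

def full_outer(r1, r2, cond, pad_left, pad_right):
    out = left_outer(r1, r2, cond, pad_right)
    out += [{**pad_left, **t2} for t2 in r2
            if not any(cond(t1, t2) for t1 in r1)]
    return out

left  = [{"A": 100, "B": "Database"}, {"A": 101, "B": "Mechanics"},
         {"A": 102, "B": "Electronics"}]
right = [{"C": 100, "D": "Alex"}, {"C": 102, "D": "Maya"}, {"C": 104, "D": "Mira"}]
cond  = lambda t1, t2: t1["A"] == t2["C"]

rows = full_outer(left, right, cond,
                  {"A": None, "B": None}, {"C": None, "D": None})
# Yields the four A/B/C/D rows of the full outer join table above.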
DBMS - Indexing
We know that data is stored in the form of records. Every record has a key field, which enables it to be identified
uniquely.
Indexing is a data structure technique to efficiently retrieve records from the database files based on some attributes
on which the indexing has been done. Indexing in database systems is similar to what we see in books.
Indexing is defined based on its indexing attributes. Indexing can be of the following types −
• Primary Index − A primary index is defined on an ordered data file. The data file is ordered on a key field. The key field is generally the primary key of the relation.
• Secondary Index − A secondary index may be generated from a field which is a candidate key and has a unique value in every record, or from a non-key field with duplicate values.
• Clustering Index − A clustering index is defined on an ordered data file. The data file is ordered on a non-key field.
Ordered indexing is of two types −
• Dense Index
• Sparse Index
Dense Index
In a dense index, there is an index record for every search-key value in the database. This makes searching faster but
requires more space to store the index records themselves. An index record contains the search-key value and a
pointer to the actual record on the disk.
Sparse Index
In a sparse index, index records are not created for every search key. An index record here contains a search key and
a pointer to the actual data on the disk. To search a record, we first follow the index to reach an actual location of
the data. If the data we are looking for is not where the index leads us directly, the system starts a sequential
search from there until the desired data is found.
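The difference can be sketched as follows. This is an illustrative Python fragment (the sorted data file and the choice of one sparse entry per four records are assumptions made for the example, not part of the text).

import bisect

data = [(k, f"record-{k}") for k in range(0, 100, 5)]   # sorted data file

# Dense index: one entry per search-key value, pointing at its position.
dense_index = {k: pos for pos, (k, _) in enumerate(data)}

# Sparse index: one entry per "block" (here, every 4th record).
sparse_keys = [data[pos][0] for pos in range(0, len(data), 4)]
sparse_pos  = {k: pos for pos, (k, _) in enumerate(data) if k in sparse_keys}

def sparse_lookup(key):
    """Find the nearest index entry <= key, then scan sequentially."""
    i = bisect.bisect_right(sparse_keys, key) - 1
    pos = sparse_pos[sparse_keys[i]]
    while pos < len(data) and data[pos][0] < key:
        pos += 1                      # sequential search from the anchor
    if pos < len(data) and data[pos][0] == key:
        return data[pos][1]
    return None

print(data[dense_index[35]][1], sparse_lookup(35))   # both find 'record-35'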
Multilevel Index
Index records comprise search-key values and data pointers. A multilevel index is stored on the disk along with the
actual database files. As the size of the database grows, so does the size of the indices. There is an immense need to
keep the index records in the main memory so as to speed up search operations. If a single-level index is used,
then a large index cannot be kept in memory, which leads to multiple disk accesses.
A multi-level index helps in breaking down the index into several smaller indices in order to make the outermost level
so small that it can be saved in a single disk block, which can easily be accommodated anywhere in the main
memory.
B+ Tree
A B+ tree is a balanced search tree that follows a multi-level index format. The leaf nodes of a B+ tree hold the
actual data pointers. A B+ tree ensures that all leaf nodes remain at the same depth, thus staying balanced. Additionally,
the leaf nodes are linked in a linked list; therefore, a B+ tree can support random access as well as sequential access.
Structure of B+ Tree
Every leaf node is at an equal distance from the root node. A B+ tree is said to be of order n, where n is fixed for
the tree.
Internal nodes −
• Internal (non-leaf) nodes contain at least ⌈n/2⌉ pointers, except the root node.
• At most, an internal node can contain n pointers.
Leaf nodes −
• Leaf nodes contain at least ⌈n/2⌉ record pointers and ⌈n/2⌉ key values.
• At most, a leaf node can contain n record pointers and n key values.
• Every leaf node contains one block pointer P to point to the next leaf node, forming a linked list.
B+ Tree Insertion
• B+ trees are filled from the bottom, and each entry is made at a leaf node.
• If a leaf node overflows (see the sketch after this list) −
  o Split the node into two parts.
  o Partition at i = ⌊(m+1)/2⌋.
  o The first i entries are stored in one node.
  o The rest of the entries (i+1 onwards) are moved to a new node.
  o The i-th key is duplicated at the parent of the leaf.
• If a non-leaf node overflows −
  o Split the node into two parts.
  o Partition the node at i = ⌈(m+1)/2⌉.
  o Entries up to i are kept in one node.
  o The rest of the entries are moved to a new node.
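A minimal sketch of the leaf-split rule above, assuming a plain list of keys and the ⌊(m+1)/2⌋ partition convention just listed (illustrative Python, not a complete B+ tree):

def split_leaf(keys, m):
    """keys has overflowed (m+1 entries); return (left, right, up_key)."""
    i = (m + 1) // 2                   # partition point i = floor((m+1)/2)
    left, right = keys[:i], keys[i:]   # first i entries stay; rest move
    return left, right, left[-1]       # the i-th key is duplicated upward

left, right, up = split_leaf([5, 10, 15, 20, 25], m=4)
# left = [5, 10], right = [15, 20, 25]; key 10 is copied into the parent.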
B+ Tree Deletion
• B+ tree entries are deleted at the leaf nodes.
• The target entry is searched and deleted.
  o If it is in an internal node, delete it and replace it with the entry from the left position.
• After deletion, underflow is tested.
  o If underflow occurs, distribute the entries from the node to its left.
• If distribution is not possible from the left −
  o Distribute from the node to its right.
• If distribution is not possible from the left or from the right −
  o Merge the node with the nodes to its left and right.
DBMS - Hashing
For a huge database structure, it can be nearly impossible to search all the index values through all its levels
and then reach the destination data block to retrieve the desired data. Hashing is an effective technique to calculate
the direct location of a data record on the disk without using an index structure.
Hashing uses hash functions with search keys as parameters to generate the address of a data record.
Hash Organization
• Bucket − A hash file stores data in bucket format. A bucket is considered a unit of storage. A bucket typically stores one complete disk block, which in turn can store one or more records.
• Hash Function − A hash function, h, is a mapping function that maps the set of all search-keys K to the addresses where the actual records are placed. It is a function from search keys to bucket addresses.
Static Hashing
In static hashing, when a search-key value is provided, the hash function always computes the same address. For
example, if a mod-4 hash function is used, it generates only 4 values. The output address is always the
same for a given key. The number of buckets provided remains unchanged at all times.
Operation
• Insertion − When a record is required to be entered using static hashing, the hash function h computes the bucket address for search key K, where the record will be stored.
Bucket address = h(K)
• Search − When a record needs to be retrieved, the same hash function is used to compute the address of the bucket where the data is stored.
• Delete − This is simply a search followed by a deletion operation.
Bucket Overflow
The condition of bucket-overflow is known as collision. This is a fatal state for any static hash function. In this case,
overflow chaining can be used.
• Overflow Chaining − When buckets are full, a new bucket is allocated for the same hash result and is linked after the previous one. This mechanism is called Closed Hashing (a sketch follows this list).
• Linear Probing − When a hash function generates an address at which data is already stored, the next free bucket is allocated to it. This mechanism is called Open Hashing.
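A minimal sketch of static hashing with overflow chaining follows. This is illustrative Python; the bucket count N, the capacity of two entries per block, and the record values are assumptions made for the example.

N, CAPACITY = 4, 2
buckets = [[] for _ in range(N)]       # each bucket: a list of chained blocks

def h(key):
    return key % N                     # static hash function, mod-N

def insert(key, record):
    chain = buckets[h(key)]
    for block in chain:                # find a chained block with free space
        if len(block) < CAPACITY:
            block.append((key, record))
            return
    chain.append([(key, record)])      # overflow: chain a new block

def search(key):
    for block in buckets[h(key)]:      # same hash, walk the chain
        for k, record in block:
            if k == key:
                return record
    return None

for k in (3, 7, 11, 15):               # all hash to bucket 3, forcing chaining
    insert(k, f"rec{k}")
print(search(11))                      # 'rec11'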
Dynamic Hashing
The problem with static hashing is that it does not expand or shrink dynamically as the size of the database grows or
shrinks. Dynamic hashing provides a mechanism in which data buckets are added and removed dynamically and
on-demand. Dynamic hashing is also known as extended hashing.
In dynamic hashing, the hash function is made to produce a large number of values, of which only a few are used
initially.
Organization
A prefix of the entire hash value is taken as the hash index. Only a portion of the hash value is used for computing
bucket addresses. Every hash index has a depth value that signifies how many bits are used for computing the hash
function. These bits can address 2^n buckets. When all these bits are consumed − that is, when all the buckets are full
− the depth value is increased linearly and twice as many buckets are allocated.
Operation
• Querying − Look at the depth value of the hash index and use those bits to compute the bucket address.
• Update − Perform a query as above and update the data.
• Deletion − Perform a query to locate the desired data and delete the same.
• Insertion − Compute the address of the bucket (see the sketch after this list).
  o If the bucket is already full −
    ▪ Add more buckets.
    ▪ Add additional bits to the hash value.
    ▪ Re-compute the hash function.
  o Else, add the data to the bucket.
  o If all the buckets are full, perform the remedies of static hashing.
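A simplified sketch of this scheme in Python follows. It is an illustration only: real extendible hashing also tracks a per-bucket local depth and splits a single bucket at a time, details omitted here for brevity (the global rebuild below is a simplification).

BUCKET_SIZE = 2

def prefix(key, depth):
    """Use `depth` bits of the hash value as the bucket index (low-order
    bits here, for simplicity)."""
    return hash(key) & ((1 << depth) - 1)

depth = 1
directory = [[] for _ in range(1 << depth)]   # 2**depth buckets

def insert(key):
    global depth, directory
    bucket = directory[prefix(key, depth)]
    if len(bucket) < BUCKET_SIZE:
        bucket.append(key)
        return
    # Bucket full: use one more bit, double the directory, re-hash all keys.
    depth += 1
    old = [k for b in directory for k in b] + [key]
    directory = [[] for _ in range(1 << depth)]
    for k in old:
        directory[prefix(k, depth)].append(k)

for k in range(10):
    insert(k)
print(depth, directory)   # depth grew as buckets filled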
Hashing is not favorable when the data is organized in some order and the queries require a range of data; hashing
performs best when the data is discrete and random. Hashing algorithms are more complex to implement than plain
index lookups, but all hash operations are done in constant time.
DBMS - Transaction
ACID Properties
A transaction is a very small unit of a program and it may contain several low-level tasks. A transaction in a database
system must maintain Atomicity, Consistency, Isolation, and Durability − commonly known as the ACID properties −
in order to ensure accuracy, completeness, and data integrity.
• Atomicity − This property states that a transaction must be treated as an atomic unit, that is, either all of its operations are executed or none (a sketch follows this list). There must be no state in the database where a transaction is left partially completed. States should be defined either before the execution of the transaction or after its execution/abortion/failure.
• Consistency − The database must remain in a consistent state after any transaction. No transaction should have any adverse effect on the data residing in the database. If the database was in a consistent state before the execution of a transaction, it must remain consistent after the execution of the transaction as well.
• Durability − The database should be durable enough to hold all its latest updates even if the system fails or restarts. If a transaction updates a chunk of data in the database and commits, then the database will hold the modified data. If a transaction commits but the system fails before the data can be written to disk, the data will be updated once the system springs back into action.
• Isolation − In a database system where more than one transaction is being executed simultaneously and in parallel, the property of isolation states that all the transactions will be carried out and executed as if each is the only transaction in the system. No transaction will affect the existence of any other transaction.
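As a rough illustration of atomicity (a sketch assumed for this chapter, not how a real database engine works), a transaction can be applied to a shadow copy of the state and installed only if every operation succeeds:

def run_transaction(db, operations):
    working = dict(db)                 # shadow copy; db itself is untouched
    try:
        for op in operations:
            op(working)                # any failure aborts the whole batch
    except Exception:
        return db                      # abort: original state preserved
    return working                     # commit: all-or-nothing install

accounts = {"A": 100, "B": 50}
transfer = [lambda s: s.__setitem__("A", s["A"] - 30),
            lambda s: s.__setitem__("B", s["B"] + 30)]
accounts = run_transaction(accounts, transfer)   # {'A': 70, 'B': 80}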
Serializability
When multiple transactions are being executed by the operating system in a multiprogramming environment, there
are possibilities that instructions of one transaction are interleaved with some other transaction.
• Schedule − A chronological execution sequence of transactions is called a schedule. A schedule can have many transactions in it, each comprising a number of instructions/tasks.
• Serial Schedule − A schedule in which transactions are aligned in such a way that one transaction is executed first. When the first transaction completes its cycle, the next transaction is executed. Transactions are ordered one after the other. This type of schedule is called a serial schedule, as transactions are executed in a serial manner.
In a multi-transaction environment, serial schedules are considered a benchmark. The execution sequence of the
instructions within a transaction cannot be changed, but two transactions can have their instructions interleaved in
any fashion. This interleaving does no harm if the two transactions are mutually independent and working on different
segments of data; but if the two transactions are working on the same data, the results may vary. This
ever-varying result may bring the database to an inconsistent state.
To resolve this problem, we allow parallel execution of a transaction schedule only if its transactions are either
serializable or have some equivalence relation among them.
Equivalence Schedules
An equivalence schedule can be of the following types −
Result Equivalence
If two schedules produce the same result after execution, they are said to be result equivalent. They may yield the
same result for some values and different results for another set of values. That is why this equivalence is not
generally considered significant.
View Equivalence
Two schedules are view equivalent if the transactions in both schedules perform similar actions in a similar manner.
For example −
• If T reads the initial data in S1, then it also reads the initial data in S2.
• If T reads the value written by J in S1, then it also reads the value written by J in S2.
• If T performs the final write on a data value in S1, then it also performs the final write on that data value in S2.
Conflict Equivalence
Two operations are said to be conflicting if they have the following properties −
• Both belong to separate transactions.
• Both access the same data item.
• At least one of them is a "write" operation.
Two schedules having multiple transactions with conflicting operations are said to be conflict equivalent if and only
if (see the sketch after this list) −
• Both schedules contain the same set of transactions.
• The order of conflicting pairs of operations is maintained in both schedules.
Note − View equivalent schedules are view serializable and conflict equivalent schedules are conflict serializable.
All conflict serializable schedules are view serializable too.
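The two conditions can be checked mechanically. The following illustrative Python sketch (with schedules encoded as (transaction, operation, item) triples, an assumed representation) tests conflict equivalence:

def conflicting_pairs(schedule):
    """schedule: list of (txn, op, item) with op in {'R', 'W'}."""
    pairs = []
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                pairs.append(((t1, op1, x1), (t2, op2, x2)))
    return pairs

def conflict_equivalent(s1, s2):
    same_txns = {t for t, _, _ in s1} == {t for t, _, _ in s2}
    return same_txns and set(conflicting_pairs(s1)) == set(conflicting_pairs(s2))

s1 = [("T1", "R", "x"), ("T2", "R", "x"), ("T1", "W", "x"), ("T2", "W", "x")]
s2 = [("T2", "R", "x"), ("T1", "R", "x"), ("T1", "W", "x"), ("T2", "W", "x")]
print(conflict_equivalent(s1, s2))   # True: only the non-conflicting reads swapped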
States of Transactions
A transaction in a database can be in one of the following states −
• Active − In this state, the transaction is being executed. This is the initial state of every transaction.
• Partially Committed − When a transaction executes its final operation, it is said to be in a partially committed state.
• Failed − A transaction is said to be in a failed state if any of the checks made by the database recovery system fails. A failed transaction can no longer proceed further.
• Aborted − If any of the checks fails and the transaction has reached a failed state, the recovery manager rolls back all its write operations on the database to bring the database back to its original state from before the execution of the transaction. Transactions in this state are called aborted. The database recovery module can select one of two operations after a transaction aborts −
  o Re-start the transaction
  o Kill the transaction
• Committed − If a transaction executes all its operations successfully, it is said to be committed. All its effects are now permanently established on the database system.
DBMS - Storage System
Databases are stored in file formats, which contain records. At the physical level, the actual data is stored in
electromagnetic format on some device. These storage devices can be broadly categorized into three types −
• Primary Storage − The memory storage that is directly accessible to the CPU comes under this category. The CPU's internal memory (registers), fast memory (cache), and main memory (RAM) are directly accessible to the CPU, as they are all placed on the motherboard or CPU chipset. This storage is typically very small, ultra-fast, and volatile. Primary storage requires a continuous power supply in order to maintain its state. In case of a power failure, all its data is lost.
• Secondary Storage − Secondary storage devices are used to store data for future use or as backup. Secondary storage includes memory devices that are not a part of the CPU chipset or motherboard, for example, magnetic disks, optical disks (DVD, CD, etc.), hard disks, flash drives, and magnetic tapes.
• Tertiary Storage − Tertiary storage is used to store huge volumes of data. Since such storage devices are external to the computer system, they are the slowest in speed. These storage devices are mostly used to back up an entire system. Optical disks and magnetic tapes are widely used as tertiary storage.
Memory Hierarchy
A computer system has a well-defined hierarchy of memory. A CPU has direct access to its main memory as well as
its inbuilt registers. Main memory access is obviously slower than the CPU's processing speed. To minimize this
speed mismatch, cache memory is introduced. Cache memory provides the fastest access time and it contains the data
most frequently accessed by the CPU.
The memory with the fastest access is the costliest. Larger storage devices offer slower speeds and are less
expensive; however, they can store huge volumes of data as compared to CPU registers or cache memory.
Magnetic Disks
Hard disk drives are the most common secondary storage devices in present-day computer systems. These are called
magnetic disks because they use the concept of magnetization to store information. Hard disks consist of metal disks
coated with magnetizable material. These disks are placed vertically on a spindle. A read/write head moves in
between the disks and is used to magnetize or de-magnetize the spot under it. A magnetized spot can be recognized
as 0 (zero) or 1 (one).
Hard disks are formatted in a well-defined order to store data efficiently. A hard disk plate has many concentric
circles on it, called tracks. Every track is further divided into sectors. A sector on a hard disk typically stores 512
bytes of data.
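As a small worked example with assumed figures (the surface, track, and sector counts below are illustrative, not from the text), total capacity follows directly from this geometry:

# capacity = surfaces x tracks per surface x sectors per track x 512 bytes
surfaces, tracks_per_surface, sectors_per_track = 16, 1024, 64
sector_bytes = 512
capacity = surfaces * tracks_per_surface * sectors_per_track * sector_bytes
print(capacity // 2**20, "MiB")   # 512 MiB for these assumed figures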
Redundant Array of Independent Disks
RAID, or Redundant Array of Independent Disks, is a technology to connect multiple secondary storage devices and
use them as a single storage medium.
RAID consists of an array of disks in which multiple disks are connected together to achieve different goals. RAID
levels define the use of disk arrays.