Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Relational Algebra “The beginning…” What is a query? Why use a theoretical basis for designing ‘queries’? What is a relation? What is relational algebra (RA)? Start with some unary operators in RA Then the basic binary operators A query What do we mean by a query to a database? A query What do we mean by a query to a database? Given one or more relations, apply operations that ultimately return a new relation. Usually we assume the new relation will have different information than the original input relations and will combine the original relations in meaningful ways. We can write a query accurately in English We can accurately name the resulting relation in English We can draw queries as a way of planning them out SQL is the language used for talking to a relational database management system Allows But the querying part of SQL isn't very good changing the schema (“data definition”), changing the data (“data modification”), and querying Confusing syntax that is hard to read Major differences from one DBMS to the next We teach you relational algebra (RA) for querying We use a graphical representation of RA It's easy to convert RA into SQL A relational database is a set of relations A relation is a set of rows and a set of columns A relation has a name A relation has an identifier which is a set of columns We consistently underline columns in the identifier Rules: A row has a value for each column in the identifier (A row may or may not have values for columns that are not in the identifier) No two rows have the same values in all the identifying columns Is this a relation? First name Last name Award Movie Jeff Bridges Crazy Heart Sandra Bullock Kathryn Bigelow Best Actor Best Actress Director The Blind Side The Hurt Locker Is this a relation? Oscar First name Jeff Last Award name Bridges Best Actor Sandra Bullock Best Actress Kathryn Bigelow Director Movie Crazy Heart The Blind Side The Hurt Locker Is this a relation? Oscar First name Last name Award Movie Jeff Sandra Bridges Bullock Crazy Heart The Blind Side Kathryn Paul Bigelow Ottosson Paul Ottosson Best Actor Best Actress Director Sound Editing Sound Mixing The Hurt Locker The Hurt Locker The Hurt Locker Is this a relation? Oscar First name Last name Award Movie Jeff Bridges Best Actor Crazy Heart Sandra Bullock Kathryn Bigelow Paul Ottosson Paul Ottosson Best Actress The Blind Side Director The Hurt Locker Sound The Hurt Mixing Locker Sound Spider Man Mixing Are these two relations the same (both named Oscar)? First name Award Movie Jeff Last name Bridges Best Actor Crazy Heart Sandra Bullock Best Actress The Blind Side Kathryn Bigelow Director Last name Award Bigelow Director Bridges Bullock The Hurt Locker Movie The Hurt Locker Best Actor Crazy Heart Best Actress The Blind Side First name Kathryn Jeff Sandra More about relations No two rows of data are duplicated implied by our rule about identifying columns No two columns have the same name No two relations in a given database have the same name Relational Algebra Inputs: one or two relations Operators: work on those relations (or more explicitly columns or rows) Output: one relation RA Operators We'll look at unary operators first These take a relation and return a relation project filter reduce Group Project Project removes and adds nonidentifying columns Adds computed columns – arithmetic, string manipulation, etc. result: as wide as number of columns named, which must include the identifying columns exactly as tall as original relation identifier is same SQL: Select columns from relation; Project example First name Jeff Sandra Kathryn Last name Bridges Bullock Bigelow Award Best Actor Best Actress Director Movie Crazy Heart The Blind Side The Hurt Locker oscar Project Id: ??? Carry: Movie Compute: age = currentYear - BirthYear ????? BirthYear 1949 1964 1951 Project example First name Jeff Sandra Kathryn Last name Bridges Bullock Bigelow Award Best Actor Best Actress Director Movie Crazy Heart The Blind Side The Hurt Locker BirthYear 1949 1964 1951 oscar Project Id: First_name, Last_name, Award Carry: Movie Compute: age = currentYear - BirthYear First_name, Last_name, Award, Movie, Actor_age of Oscar Filter Filter removes rows based on a condition result: as wide as the original relation shorter or exactly as tall as the original relation identifier is same SQL: Select * from relation where condition; First name Last name Award Jeff Bridges Best Actor Sandra Bullock Best Actress Kathryn Bigelow Director Movie Crazy Heart The Blind Side The Hurt Locker Birth year 1949 1964 1951 Filter example: Find each oscar award for the movie “The Blind Side” What algorithm do you think Filter uses? What algorithm do you think Filter uses? Trick question: depends on the DBMS and the configuration! We often combine project and filter An activity you could try on your own Go to sqlzoo.net Try tutorial 0 Reverse engineer the example queries into relational algebra Reduce Reduce changes the identifier, result: SQL: narrower or the same width as the original relation possibly shorter than the original relation unsafe! Select distinct new_identifier from original relation Let’s try it with the tiny database: Practice Query #3 Reduce example First name Last name Award Movie Jeff Bridges Best Actor Crazy Heart Sandra Bullock Best Actress The Blind Side Kathryn Bigelow Director The Hurt Locker Paul Paul Ottosson Ottosson Sound Editing Sound Mixing The Hurt Locker The Hurt Locker Group Group changes the identifier, combining duplicate rows using an aggregate function result: narrower, the same width, or wider than the original relation possibly shorter than the original relation unsafe! SQL (2 basic types): Select distinct new_identifier, aggregate_function from relation group by new_identifier Select aggregate_function from relation Group example First name Last name Award Movie Birth year Jeff Bridges Best Actor Crazy Heart 1949 Sandra Bullock Best Actress The Blind Side 1964 Kathryn Bigelow Director The Hurt Locker 1951 Group example 2 First name Last name Award Birth year Jeff Bridges Best Actor 1949 Sean Sandra Penn Bullock Best Actor Best Actress 1960 1964 Kate Winslet Best Actress 1975 Tiny database SQL (2 basic types): Select distinct new_identifier, aggregate_function from relation group by new_identifier Select aggregate_function from relation Practice queries 12, 16