Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Foundations of Preferences in Database Systems Author: Werner Kieling Institute of Computer Science University of Augsburg Germany Presented by: Haoyuan Wang Contents Introduction Practical Work Preference Queries Preference Algebra Preference Engineering Basics of Preferences An Online Algorithm for Skyline Queries Introduction (Motivation) We have the Exact World Delivering exact dream objects. Wishes satisfied completely or not at all. Uses hard constraints. Already investigated, such as: SQL, E-R modeling, XML. We Need the Real World If not perfect match, worse alternative acceptable. Requires best possible match making. Requires soft constraints. Preference-driven choice lagged behind SELECT * FROM car WHERE make = 'Opel' PREFERRING (category = 'roadster' ELSE category <> 'passenger' AND price AROUND 40000 AND HIGHEST(power)) CASCADE color = 'red' CASCADE LOWEST(mileage); Introduction Approaches to cope these two worlds: Cooperative Database System: Not comprehensive. Exact world Real world Preference Database Systems: Intuitive semantics. Concise mathematical foundation. Constructive and extensible preference model. Conflicts of preferences do not cause system failure. Declarative preference query languages. Basics of Preferences What is preference ? • Intuitively: “ I like A better than B”. • Mathematically: relation which is in strict partial order. irreflexive, transitive a b c d • Our definition: Preference P=(A, <P). A: a set of attribute names. <p dom(A) dom(A) a strict partial order referring to attribute names A. <P A={A1, A2, …, Ak} A={A1, A2, …, Ak} Basics of Preferences Better-than graph G Level 1: val1 val3 Level 2: val2 val4 Level 3: val5 val6 val7 x <P y, if y is a predecessor of x in G. Values in G without predecessor are maximal, being at level 1. x is on level j if the longest path from x to maxmal has j-1 edges. if no directed path between x and y, x and y are not ranked. Basics of Preferences Special cases of preferences Chain preference If for all x, y∈dom(A), x≠y: x <P y ∨ y <P x. All values are ranked in chain preference. Anti-Chain preference S = (S, ), given any set of values S. Any set, including dom(A) can be converted into an anti-chain. Dual preference P=(A, P) reverse the order on P, x <P y iff y<P x. Any P can be completely reversely ranked Subset preference Given P=(A, P), every Sdom(A) induces a subset reference P =(S, < P ), if for any x, yS: x P y iff x<P y. For any subset of A, <P can apply. Preference Engineering Inductive Construction of Preference-Nonnumerical base preference POS preference: POS(A, POS-set) P is a POS preference, if: x <P y iff x POS-set yPOS-set dom(A) POS-set Example POS(transmission, {automatic}) Intuitive definition: A desired value should be in a finite set of favorites POSdom(A). If this infeasible, better than getting nothing, any other value from dom(A) is acceptable. Implication: all value in POS-set are maximal, all value not in POS-set are at level 2 and worse than all POS-set values. Preference Engineering Inductive Construction of Preference-Nonnumerical base preference NEG preference: NEG(A, NEG-set) Intuitive definition: P is a NEG preference, if: x <P y iff y NEG-set xNEG-set dom(A) A desired value should not be in a finite set of dislikes NEG-set. If this infeasible, better than getting nothing, any other value from NEGset is acceptable. Implication: NEG-set Example NEG(make, {Ferrari}) all value not in NEG-set are maximal, all value in NEGset are at level 2 and worse than all maximal values. Preference Engineering Inductive Construction of Preference-Nonnumerical base preference POS/NEG preference: POS/NEG(A, POS-set, NEG-set) P is a POS preference, if: x <P y iff ( x NEG-set y NEGset)(x NEG-set x POS-set yPOS-set) dom(A) NEG-set POS-set Intuitive definition: A desired value should be one in a finite set of favorites. Otherwise it should not be any from a finite set of disjoint dislikes. If this infeasible, better than getting nothing, any other value from dislikes is acceptable. Example POS/NEG(color, {yellow}; {gray}) Preference Engineering Inductive Construction of Preference-Nonnumerical base preference POS/POS preference: POS/POS(A, POS1-set, POS2-set) x <P y iff ( x POS2-set y POS1-set)(x POS1-set x POS2-set yPOS2-set) (x POS1-set x POS2-set yPOS1-set) Intuitive definition: A desired value should be in a finite set of favorites POS1-set. Otherwise, it should be from a disjoint finite set of alternatives POS2-set. If this infeasible, better than getting nothing, any other value is acceptable. Example dom(A) POS2-set POS1-set POS/POS(category, {cabrio}; {roadster}) Preference Engineering Inductive Construction of Preference-Numerical base preference AROUND preference: AROUND(A, z) Given z dom(A), for all v dom(A) . Define distance(v,z) :=abs(v-z) Intuitive definition: The desired value should be z. If this infeasible, values with shortest distance from z are acceptable. P is called AROUND preference, if: x <P y iff distance(x,z)>distance(y,z) z v Example AROUND(price, 40000) Preference Engineering Inductive Construction of Preference-Numerical base preference BETWEEN preference: BETWEEN(A,[low, up]) Given z[low, up]dom(A)dom(A), for all v dom(A) . Define distance(v,[low, up]) := if v [low, up] then 0 else if v <low then low-v else v-up P is called BETWEEN preference, if: x <P y iff: distance(x, [low,up] ) > distance(y, [low, up]) Intuitive definition: The desired value should be between the bounds of an interval. If this infeasible, values with shortest distance from the bounds are acceptable. Example v v low up BETWEEN(mileage, [20000, 30000]) Preference Engineering Inductive Construction of Preference-Numerical base preference LOWEST, HIGHEST preference: LOWEST(A), HIGHEST(A) P is called LOWEST preference, if x <P y iff x > y P is called HIGHEST preference, if x <P y iff x < y Intuitive definition: The desired value should be as low (high) as possible. Example LOW(price) HIGHEST(power) Preference Engineering Inductive Construction of Preference-Numerical base preference SCORE preference: SCORE(A, f) P is called SCORE preference, if for some x, y dom(A): x <P y iff f(x)<f(y) Note: <p can be applied after the score function f gives a value. Preference Engineering Inductive Construction of Preference-Complex preference Pareto preference: P=(A1A2, <P1 P2) Given P1=(A1, <P1) and P2=(A2, <P2) and P2=(A2, <P2), for x, y dom(A1)dom(A2), define: x < P1 P2 y iff (x1 <P1 y1 (x2 <p y2 x2=y2)) (x2 <P2 y2 (x1 <P1 y1 x1=y1)) Intuitive definition: P1 and P2 are considered as equally important preferences. In order for x=(x1, x2) to be better than y=(y1, y2), it is not tolerable that x is worse than y in any x i. Preference Engineering Inductive Construction of Preference-Complex preference Pareto preference example For dom(A1)=dom(A2)=dom(A3)=integer and P1 :=AROUND(A1, 0), P2 :=LOWEST(A2), P3 :=HIGHEST(A3), P4 :=({A1, A2, A3}), <P4) :=(P1P2)P3 let’s study a subset preference of P4 for the following set: R(A1, A2, A3)={val1:(-5, 3, 4), val2:(-5, 4, 4), val3:(5, 1, 8), val4:(5, 6, 6), val5:(-6, 0, 6), val6:(-6, 0, 4), val7:(6, 2, 7)} The ‘better-than’ graph of P4 for subset R can be obtained by performing exhaustive ‘better-than’ checks: Level 1: val1 val3 Level 2: val2 val4 val5 val7 val6 Preference Engineering Inductive Construction of Preference-Complex preference Prioritized preference: Intuitive definition: P=(A1A2, <P1& P2) Given P1=(A1, <P1) and P2=(A2, <P2) and P2=(A2, <P2), for x, y dom(A1)dom(A2), define: P2 is respected only when P1 does not mind. Example: Let’s revisit Example 1, now studying: P8 = ({A1, A2}, <P8) := P1&P2 R(A1, A2, A3)={val1:(-5, 3), val2:(-5, 4), val3:(5, 1), val4:(5, 6), val5:(-6, 0), val6:(-6, 0), val7:(6, 2)} The ‘better-than’ graph of P8 for subset R is this: Level 1: val1 val3 (x1 <P1 y1 (x1=y1 Level 2: x2 <P y2) Level 3: val2 val4 x < P1& P2 y iff val5 val6 val7 Preference Engineering Inductive Construction of Preference-Complex preference Writing a preference query - a used-car scenario 1. Write wish-list Julia wants to buy a used car for herself and her friend Leslie, she wishes: “My favorite car is Cabrio, but roadster is also good, I like an automatic car and it should have a horsepower of about 100, these issues are equally important to me, but color is the most important, it should not be gray, I do not care too much about price, but since it is a used car, the lower, the better”. Julia goes to vendor Michael. Michael wishes : “ Clients’ wishes are always more important than mine, I like to sell older cars, the have higher commission” . Julia also needs to ask Leslie, Leslie wishes: “I agree with Julia, I convinced Julia money should matter as much as color, I like blue, if not available, please not gray and red” Preference Engineering Inductive Construction of Preference-Complex preference 2. Convert wish-list to preference query terms Julia: P1:=POS/POS(category, {cabrio};{roadster}) P2:=POS(transmission, {automatic}) P3:=AROUND(horsepower, 100) P4:=LOWEST(price) P5:=NEG(color, {gray}) Michael: P6:=HIGHEST(year-of-construction) P7:=HIGHEST(commission) Leslie: P8:=POS/NEG(color, {blue};{gray, red}) 3. Use Pareto and Prioritization to add each query terms Q1=({color, category, transmission, horsepower, price}, <Q1) :=P5&((P1P2P3)&P4) Q2=({color, category, transmission, horsepower, price, year-ofconstruction, commission},<Q2) :=(Q1&P6)&P7 =((P5&((P1P2P3)&P4))&P6)& P7 Q2=({color, category, transmission, horsepower, price, year-ofconstruction, commission},<Q2) :=(Q1&P6)&P7= (((P5 P8 P4)&(P1P2P3))&P6))&P7 Evaluation of Preference Queries How a preference query like the following is evaluated: Decomposition of queries Q2=({color, category, transmission, horsepower, price, year-of-construction, commission},<Q2) :=(Q1&P6)&P7=(((P5 P8 P4)&(P1P2P3))&P6))&P7 Decomposition of ‘&’ and ‘’ [P1&P2](R)= [P1] [P2 groupby A1](R) [P1P2](R)= ([P1](R) [P2 groupby A1](R)) ([P2](R) [P1 groupby A2](R)) YY(P1&P2, P2& P1)R Decomposition of ‘+’ and ‘’ [P1+P2](R)= [P1](R) [P2](R) [P1P2](R)= [P1](R) [P2](R) YY(P1, P2)R Evaluation of Preference Queries BMO query model R Database preference P Assume P=(A, <P), where A=(A1, A2, …, Ak). a) Each R[A] dom(A) defines a subset preference, called a database preference and denoted by: PR=(R[A], <P) b) Tuple tR is a perfect match in a database set R, if: t[A] max(P) t[A] R. Note: preference query performs a match-making between the stated preference and the database preference. [P](R)={tR | t[A] max(PR )} Note: [P](R) evaluates P against database set R by retrievingall maximal R values from P Shooting Stars in the Sky: An Online Algorithm for Skyline Queries Authors: Donald Kossmann Frank Ramsak Steen Rost Technische Universit. at M. unchen Orleansstr. 34 81667 Munich Germany A Online Algorithm for Skyline Queries Contents: Skyline Queries The NN Algorithm for 2-Dimensional Skylines An Example for 2-Dimensional Skylines The NN Algorithm for d-Dimensional Skylines Skyline Queries Retrieves all interesting points. Helps user get a big picture of interesting options. If users moves, skyline should be re-computed, user’s choice based on user’s location. The NN Algorithm for 2- Dimensional Skylines Input: Data set D Distance function f (Euclidean distance) /* Initialization: the whole data space needs to be inspected*/ T ={(, )} /* Loop: iterate until all regions have been investigated*/ WHILE (T 0) Do (mx, my)=takeElement(T) IF( boundedNNSearch(O, D, (mx, my), f)) THEN ( nx, ny)= boundedNNSearch(O, D, (mx, my), f)) T=T{(nx, my), (mx, ny)} OUTPUT n ENDIF ENDWHILE An Example for 2-Dimensional Skylines The NN Algorithm for d-Dimensional Skylines Approaches to Deal With Duplicates Laisser-faire propagate Merge Fine-grained Partitioning Hybrid Approaches Foundations of Preferences in Database Systems Shooting Stars in the Sky: An Online Algorithm for Skyline Queries Authors: Donald Kossmann Frank Ramsak Steen Rost Technische Universit. at M. unchen Orleansstr. 34 81667 Munich Germany