Using Visual Symptoms for Debugging Presentation Failures in Web Applications
Sonal Mahajan, Bailan Li, Pooyan Behnamghader, William G. J. Halfond
University of Southern California, Los Angeles, California, USA
Work supported by NSF Grant CCF-1528163

Background Information
• What do we mean by presentation?
  – “Look and feel” of the website in a browser
• What is a presentation failure?
  – Web page rendering ≠ expected appearance
• Importance of presentation
  – Aesthetics impact users’ evaluation [Tractinsky et al. 2006]
  – Impacts trustworthiness and usability [Lindgaard et al. 2011]

Usage Scenario
Regression debugging – modify the current version of the web page to correct a bug or refactor the HTML structure.
[Figure: the same web page (Menu | Contact, News, Username, Password, Sign in, About us | Feedback | FAQ) shown with a table-based layout (<table>, <tr>, <td>) and refactored into a div-based layout (<div>).]

Usage Scenario – Difficulties
[Figure: oracle (previous version) next to the test web page, with two visible differences marked Problem 1 and Problem 2.]
Developer:
• Analyze the observed differences
• Explore the UI to find the fault (here: background color of the “Sign in” button)
Manual debugging is difficult
1. Complex interaction between HTML, CSS, and JS
2. Hundreds of HTML elements + CSS properties
3. Makes debugging labor intensive and error prone
Prior user study [Mahajan et al. ICST 2015]
• Correct fault identified in only 36% of test cases!

Limitations of Existing Techniques
• DOM comparison techniques (e.g., XBT)
  – Not effective if DOM has changed significantly
• Invariant specification techniques (e.g., Selenium)
  – Not practical, since all correctness properties need to be provided
• Fighting layout bugs
  – Checks app-independent problems only
Our approach – Automate debugging of presentation failures

Three Key Insights
1. Visual differences can help diagnosis
[Figure: “Sign in” button in the oracle vs. in the test page; the visual symptoms point to a color-related presentation failure.]

Definition
Visual symptom – boolean predicate describing the visual difference; clues to the fault
Visual symptoms
1. Almost matched element
   – only position changed
   – e.g.: margin-top
   – sub-image searching
2. Shift bottom element
   – moved downwards
   – e.g.: margin-top
   – analyze diff. pixels in top of the element
3. Page size changed
4. Added color
...
CSS properties (groups shown on the slide)
– margin-top, padding-top, etc.
– margin-top, margin-left, etc.
– height, width, padding, etc.
– background-color, color, etc.
– padding-top, border-top-width, etc.

Three Key Insights
2. Probabilistic correlations can help identify faults
Root cause: <button, background-color>

  Color symptom   Size symptom   Faulty = T   Faulty = F
  F               F              0.0          1.0
  F               T              0.2          0.8
  T               F              0.95         0.05
  T               T              0.7          0.3

Three Key Insights
• Building probabilistic model – Approaches
  ✗ Pool of known presentation failures
    – Differences depend on page layout. Not generalizable.
  ✗ Historical data for the page
    – Available only for mature pages
    – Manual extraction from bug-tracking system
3. Probabilistic models can be automatically generated from the faulty test page ✔

Our Approach
The goal is to automatically identify the fault of a presentation failure observed in a test page.
Input
1. Test page
2. Oracle image (previous version screenshot, mockup, etc.)
Phases
1. Detect presentation failures
2. Build the probabilistic model
3. Identify the most likely faults
Output
1. Ranked list of likely faults

P1. Detect Presentation Failures
• Use WebSee [Mahajan et al. ICST 2015]
• Visual comparison of the oracle image and the test web page yields the presentation failures
• Computer vision technique: Perceptual Image Differencing (PID)

P2. Build the Probabilistic Model
• Model based on conditional probability
• Fault = root cause <e, p>, where e = HTML element and p = CSS property
• P(r | S) = probability that a potential root cause, r, is faulty given the observed set of visual symptoms, S
1. Generate data samples
   – Inject faults into the test page
     • Assign different values to potential root causes
   – Observe true visual symptoms
   – Build truth table
2. Calculate probabilities
   – Individual symptoms and conditional probability
   – Learn correlation between the root cause and visual symptoms

Truth Table – Example
Each row is one data sample.

  Root cause          Injected value   Added color   Almost matched element   Shift top element   Shift bottom element   Page size changed
  <p, color>          blue             T             F                        F                   F                      F
  <div, margin-top>   0px              F             T                        T                   F                      F
  <div, margin-top>   50px             F             T                        F                   T                      T

Probabilities Calculation
Conditional probability, Bayes’ theorem:

$$P(r \mid S) = \frac{P(S \mid r)\,P(r)}{P(S)}$$

r = root cause, S = set of visual symptoms

P(S | r) = probability of the status of visual symptoms S given r is the faulty root cause
• Assumes visual symptoms are conditionally independent given the root cause
• Advantages
  – Easier to calculate
  – Parallelizable
• Measure P(s | r) in data samples
• Observe visual symptoms for a seeded root cause

Conditional Probability Table – Example

  Root cause          Injected values   Added color   Almost matched element   Shift top element   Shift bottom element   Page size changed
  <p, color>          blue              1.0           0.0                      0.0                 0.0                    0.0
  <div, margin-top>   0px, 50px         0.0           1.0                      0.5                 0.5                    0.5

P(r) = relative probability of r being the faulty root cause
• Assume developers cause faults with uniform probability
• r = <e, p>, where e = HTML element and p = CSS property
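The conditional probability table in the example can be reproduced from the truth table by counting, for each root cause, the fraction of its data samples in which each symptom was observed. The following is a minimal sketch of that counting step, not FieryEye's actual code; the names `SYMPTOMS`, `TRUTH_TABLE`, and `build_cpt` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Symptom columns, in the order used by the truth table example.
SYMPTOMS = [
    "added_color",
    "almost_matched_element",
    "shift_top_element",
    "shift_bottom_element",
    "page_size_changed",
]

# One entry per data sample: (root cause, injected value, observed symptom flags).
TRUTH_TABLE = [
    (("p", "color"), "blue", [True, False, False, False, False]),
    (("div", "margin-top"), "0px", [False, True, True, False, False]),
    (("div", "margin-top"), "50px", [False, True, False, True, True]),
]


def build_cpt(truth_table):
    """Estimate P(s = T | r) as the fraction of r's samples that show symptom s."""
    counts = defaultdict(lambda: [0] * len(SYMPTOMS))
    totals = defaultdict(int)
    for root_cause, _value, flags in truth_table:
        totals[root_cause] += 1
        for i, observed in enumerate(flags):
            counts[root_cause][i] += int(observed)
    return {
        r: dict(zip(SYMPTOMS, (c / totals[r] for c in counts[r])))
        for r in totals
    }


cpt = build_cpt(TRUTH_TABLE)
```

Because each root cause is seeded and measured separately, the per-root-cause counting can be split across machines, which is consistent with the "parallelizable" advantage noted for this model.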
P(p) Computation – Example
Total 2 properties in the page: color, margin-top

Probabilities Calculation
P(S) = probability of symptoms in S being T/F for a given page
• P(S) is independent of r
• Values of s ∈ S are given
P(e), P(S) = constants

$$P(r \mid S) = \frac{\left[\prod_{s \in S} P(s \mid r)\right] P(e)\,P(p)}{P(S)}$$

P3. Identify Most Likely Root Causes
• for r ∈ R = {<e1, p1>, …, <en, pn>}
  1. calculate P(p) for r = <e, p>
  2. determine visual symptoms, S
  3. for s ∈ S, look up P(s | r) in the model
  4. calculate P(r | S)
• Rank root causes by their probabilities

Empirical Evaluation
• RQ1: How accurate is our approach in identifying root causes of presentation failures?
• RQ2: What are the computational resources needed to run our approach?

Implementation
• Approach implemented in FieryEye (火眼)
• Building the probabilistic model
  – Parallelized over 200 Amazon EC2 c4.large instances
• Identifying visual symptoms
  – Used OpenCV to compare screenshots, extract color information, perform sub-image searching, etc.

Experiment Protocol
Regression debugging activity: refactoring of web pages
1. Migrate HTML 4 to 5 (<div id=‘head’> to <header>)
2. Convert table-based layout to div-based
3. Replace deprecated tags (<font> to CSS font)
For each subject, generate test cases
– Download page (H), take screenshot = oracle
– Refactor H to get H’
– Seed presentation failure in H’ to create a variant
– Run FieryEye on oracle and variant
Performance comparison
– WebSee, XPERT, Text Diff Tool (TDT) – diff

Subjects
Random URL generator (http://www.uroulette.com)

  Subject     Size (total RC)   # generated test cases
  Perl        1,592             36
  GTK         1,121             30
  Konqueror   6,779             39
  Amulet      88                22
  UCF         2,415             47

RQ1: Accuracy
Ranking of the correct root cause in the result set (effort required to find the correct root cause)
• Other techniques do not rank root causes
• Adapted other techniques to report rank
Quantify a range in the way developers may use the results
– Ranking U = upper bound on effort
– Ranking L = lower bound on effort

RQ1: Accuracy Results

  Technique   Avg. median rank   Recall (%)
  FieryEye    7.9                100
  WebSee-U    38.9               65.6
  WebSee-L    10.2               65.6
  XPERT-U     244.3              57.6
  XPERT-L     29.7               57.6
  TDT-U       549.1              100
  TDT-L       64.3               100

• FieryEye rank = 7.9, WebSee rank-L = 10.2
• FieryEye recall = 100%, WebSee recall = 65.6%

RQ1: Accuracy Results
[Figure: cumulative frequency (%) vs. ranking for FieryEye, WebSee-U/L, XPERT-U/L, and TDT-U/L. A point (X, Y) means that in Y% of cases the correct root cause is ranked in the top X.]
Correct root cause in top 5
• FieryEye: 45% of cases
• WebSee: 5% (U), 10% (L) of cases
• XPERT, TDT: 1% (U and L)

RQ2: Computational Resources
[Figure: bar chart of time (sec) for FieryEye, WebSee, XPERT, and TDT, with the annotation “Fast but imprecise”.]
• FieryEye prediction: 17 sec
• FieryEye model building: 3 min
• 200 Amazon EC2 instances, 1 c4.large = $0.11 per hour
• Cost = 200 * $0.0018/min * 3 min
• Model building cost = $1

Summary
• Technique for finding root cause of presentation failure
  – Image processing to find visual symptoms
  – Probabilistic models to predict root causes
• Empirical evaluation shows positive results
  – Avg. median correct root cause rank = 7.9
  – Prediction time = 17 sec
  – Model building cost = $1

Thank you

Ranking U and L
• Techniques report HTML elements
  – Add defined CSS properties to obtain the set of root causes
• Pseudocode:

    for e ∈ reported faulty HTML elements {
      if (e == incorrect faulty element) {
        rankingU = rankingU + e.getProps()
        rankingL = rankingL + 1
      } else {
        rankingU = rankingU + e.getProps() / 2
        rankingL = rankingL + e.getProps() / 2
      }
    }

• WebSee: ranked list of HTML elements
• XPERT, TDT: unsorted

    rankingU = rankingU / 2
    rankingL = rankingL / 2

Definitions
Root cause – tuple <e, p>, where e = HTML element and p = CSS property
[Figure: oracle image vs. test web page; root cause <div, margin-top>.]
Visual symptom – boolean predicate describing the visual difference; clues to the root cause
1. Almost matched element
   – only position changed
   – e.g.: margin-top
   – sub-image searching
2. Shift bottom element
   – moved downwards
   – e.g.: margin-top
   – analyze diff. pixels

Full Running Example
[Figure: oracle (previous version) and test web page, each showing Menu | Contact, News, Username, Password, Sign in, About us | Feedback | FAQ.]

P2. Generate Data Samples – Example
• 2 HTML elements in the test web page: <p> (color), <div> (margin-top)
• 2 potential root causes: <p, color>, <div, margin-top>
• Inject values to obtain 3 data samples: <p, color>: blue; <div, margin-top>: 0px, 50px
  – Data sample 1: <p, color = blue>, 1 visual symptom
    • Added color
  – Data sample 2: <div, margin-top = 0px>, 2 visual symptoms
    • Almost matched element
    • Shift top element
  – Data sample 3: <div, margin-top = 50px>, 3 visual symptoms
    • Almost matched element
    • Shift bottom element
    • Page size changed

P3. – Example
• All root causes, R = {<p, color>, <div, margin-top>}
• S = {almost matched element, shift bottom element}
• r = <p, color>
  1. P(p) = 0.5
  2. S as above
  3. s1 = almost matched element, P(s1|r) = 0.0
     s2 = shift bottom element, P(s2|r) = 0.0
     product = 0.0
  4. P(<p, color> | S) = 0.0 * 0.5 = 0.0
• r = <div, margin-top>
  1. P(p) = 0.5
  2. S as above
  3. s1 = almost matched element, P(s1|r) = 1.0
     s2 = shift bottom element, P(s2|r) = 0.5
     product = 1.0 * 0.5 = 0.5
  4. P(<div, margin-top> | S) = 0.5 * 0.5 = 0.25
• Rank root causes by their probabilities
  ✔ 1. P(<div, margin-top>) = 0.25
    2. P(<p, color>) = 0.0
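The P3 example can be replayed in a few lines of code. This is a hedged sketch rather than the tool's implementation: it multiplies P(s | r) over the observed symptoms only, as the worked example does, takes P(p) as uniform over the distinct CSS properties on the page, and drops the constants P(e) and P(S). The names `CPT` and `rank_root_causes` are hypothetical.

```python
from math import prod

# P(s = T | r) values copied from the conditional probability table example.
CPT = {
    ("p", "color"): {
        "added_color": 1.0,
        "almost_matched_element": 0.0,
        "shift_top_element": 0.0,
        "shift_bottom_element": 0.0,
        "page_size_changed": 0.0,
    },
    ("div", "margin-top"): {
        "added_color": 0.0,
        "almost_matched_element": 1.0,
        "shift_top_element": 0.5,
        "shift_bottom_element": 0.5,
        "page_size_changed": 0.5,
    },
}


def rank_root_causes(cpt, observed_symptoms):
    """Score each root cause as prod(P(s | r)) * P(p) and sort best first."""
    # Assumption: P(p) is uniform over the distinct CSS properties in the page.
    properties = {css_property for (_element, css_property) in cpt}
    p_property = 1.0 / len(properties)
    scores = {
        root_cause: prod(table[s] for s in observed_symptoms) * p_property
        for root_cause, table in cpt.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_root_causes(
    CPT, ["almost_matched_element", "shift_bottom_element"]
)
for root_cause, score in ranking:
    print(root_cause, score)
```

With the two observed symptoms from the example, `<div, margin-top>` scores 0.25 and `<p, color>` scores 0.0, matching the ranking on the slide. Since P(e) and P(S) are omitted, the scores are only meaningful relative to each other.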