Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Correlated subqueries Tutorial 3, Q2 d) Produce a table showing the total population for each LA and the area it covers. Compute the population density. Order your output from highest to lowest density. e) Extend the previous query to include the total number of crimes in each LA. Part e) is not a simple extension of d). Bringing in violent_crime in a 3 way join will not work, and this will mean the information about population density and ward are is repeated multiple times. The answer is to use a subquery: SELECT LA_code, sum(population)/sum(wardarea) AS density, (SELECT count(*) FROM wards_by_LA WLA2, violent_crime WHERE WLA2.ward_code =violent_crime.ward_code AND WLA2.LA_code= WLA1.LA_code ) AS n_crimes FROM ward_profile, wards_by_LA WLA1 WHERE ward_profile.ward_code= WLA1.ward_code GROUP BY LA_code ORDER BY sum(population)/sum(wardarea) DESC; Note the use of the subquery as an output field. This type of subquery is called a correlated subquery because it makes an external reference (WLA1) to the main query, and has to be re-evaluated for each iteration of the main query, i.e. each value of LA_code More SQL All the joins we have been considering so far are called inner joins. SQL provides an alternative syntax: SELECT * FROM violent_crime INNER JOIN ward_profile ON violent_crime.ward_code= ward_profile.ward_code Inner joins only include combinations of rows when there is a match in both tables. This can be problematic – the above join, for instance, would not produce any rows for wards where no crimes were committed. Sometimes we may want to include all the rows from one table even if there is no matching row in the other table. This is called an outer join. For example: SELECT ward_profile.ward_code, count(crime_id) AS n_crime FROM ward_profile LEFT OUTER JOIN violent_crime ON ward_profile.ward_code = violent_crime.ward_code GROUP BY ward_profile.ward_code ORDER BY count(crime_id) DESC LEFT just means all rows from the first table are included (vice versa for RIGHT) FINALLY – note that once you have stored an Access query, you can query it just like a table, e.g. SELECT * from Query1. This allows very complex queries to be built up. You can think of the query as providing a particular VIEW of the database. Chen is not the only notation MORE ERM Relationships can be a) three way and b) recursive. SUPPLY SUPPLIER PROJECT a) three way PART EMPLOYEE N SUPERVISOR 1 b) Recursive relationship. Note that the double line means every employee has to have a supervisor. This is called TOTAL PARTICIPATION in the relationship. ERM – finale: Wozzie’s Gym USER ID NAME This notation is used to indicate subclasses of an entity. Note that the subclasses “inherit” all the attributes of the “parent class”. BOOKS USER d CASUAL MEMBER FULL MEMBER BANK DETAILS FACILITY Two implementation options: 1) Combine casual and full members in same table, with an extra attribute to indicate status. Alternatively, have a separate table for each and then use the UNION operator to combine them: SELECT user_id FROM full_member UNION SELECT user_id FROM casual_member Business Intelligence (BI) David Wastell The production of timely, accurate, high value and actionable information • A rational approach to continuous improvement based on: – Gathering/analyzing operational (transactional) data – Making decisions & taking actions based on that data – Measuring results according to predetermined metrics (Key performance indicators – KPIs) – Feeding lessons from one decision into the next Data warehousing A Data Warehouse is a centrally managed and integrated database containing structured data from operational sources in an organization. Data extracts are validated, cleansed and transformed into a common, stable, relatable view of data. ETL = Extract Transform Load Dashboards and KPIs Business dashboards provide a “control panel” for monitoring the vital functions of the business, supplying immediate information, indicating when and where performance is lagging….. BI in Action: Crime Policy MADE Police Monthly Reports County Council Fire service Ambulance Service Probation Other ETL Datamining (policy research) 3000 no of serious crimes 2500 2000 alcohol involved alc not involved total 1500 1000 500 0 time of day 600 serious violent crime Data mining example: the aetiology of alcoholrelated violence 500 400 Town 1 Town 2 300 Rural 1 Rural 2 Town 3 200 100 0 time of day Correlation = 0.67, very sig. Evidence-based Policy: crime control % of total in target zone 25 20 15 Guess what….. Street drinking ban 10 5 0 Month % total serious violent crime committed within target zone (reduced 14.8% to 12.3%) BUT … any validity concerns, alternative explanations?? Ambulance Incidents (monthly) Target zone County demand Before ban 9.1 516.2 After ban 8.3 541.1 Change -9% +5% All effects stat. sig. OLAP: Online analytical processing Crime data: relational view Crime as a multi-dimensional “cube” Dimensions are typically hierarchies of categories Slicing and dicing – and drilling down