Evaluating assessment performance
Mikko Pohjola, THL
23/05/2017

Contents
• Eye opener
• Purpose <-> performance
• Contemporary conventions
  – Uncertainty analysis
  – Quality assurance/control
• Properties of good assessments
• Evaluation of performance

Analogy
• What is/would be a good mobile phone for you?
  – What features are important?

Analogy
• Suppose you are a mobile phone manufacturer (say, Nokia): what would you consider a good mobile phone then?

Where do mobile phones come from?
• Factory -> warehouse -> store -> use
  – Assembly (components + code + covers)
  – Mass production, mass customization
  – Packaging
  – Logistics
  – Marketing (advertising, market analysis, …)
  – Use environment
  – Design
• What does this have to do with the topic of this lecture?

Assessment
• Dual purpose of assessments:
  – Meeting the needs of use (societal decision making, policy)
  – Striving for truth (science)
• Both requirements must be met simultaneously: not easy, but possible
• A business of creating understanding about reality
  – Asking the right questions
  – Providing good answers
  – Delivering the answers (and questions) to use

Performance (goodness)
• The performance of something can only be evaluated according to its purpose!
• What is the purpose of assessments?
  – Satisfying the information need of the intended use
    • Meeting the need
    • Truthlikeness of the information

Contemporary conventions
• Uncertainty analysis
  – Basically: how exact is the answer provided as the outcome of an assessment?
  – Multiple applications and extensions of the concept
  – Product-oriented
• Quality assurance/control
  – Through which steps should an assessment be conducted in order to get good outcomes?
  – Process-oriented

Environmental health assessment
[Diagram; recoverable labels: process requirement, product requirement, assessment, assessment process, assessment product, knowledge need, use, decision making]

Properties of good assessment
• Quality of content
• Applicability
• Efficiency

Properties of good assessments
• How good is the information (Quality of content)?
  – Informativeness and calibration
  – Relevance
• How well does it transfer into use (Applicability)?
  – Usability
  – Availability
  – Acceptability
• How much effort is spent (Efficiency)?
• What would these mean in the context of making/using mobile phones?

Evaluation of performance
• informativeness and calibration – result
• relevance – scope (vs. need)
• usability – organization, appearance, implementation
• availability – (observable) access to information
• acceptability of premises – (assumed) acceptance of premises by intended users, or others interested, or affected
• acceptability of process – evaluation of definition → peer review = outsourced evaluation
• efficiency – measurement of spent effort (given outcome)

Properties of good assessments
• Uncertainty analysis + quality assurance/control + functionality in use + efficiency of production
• Reviewing past assessments (a posteriori)
• Guiding design and execution of on-going assessments (a priori)
• Applicability secondary to quality of content
• Efficiency depends on quality of content and applicability

Properties of good assessments
• Different points of reference
  – Which relate to need/use?
  – Which relate to truth?

Evaluation of performance
• How much of the framework is built into open assessment and Opasnet?
• What are the methods to do evaluation in practice?
  – Jouni, anything you want to say about this?
  – Comments, anyone?
  – Consider this in the exercises and discussion!
• uncertainty analysis, discrepancy analysis between the result and another estimate (assumed to be a gold standard)
• (assumed) intended user opinion, participant rating for the (technical) quality of information objects
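
As a rough illustration of the discrepancy analysis and of the "informativeness and calibration" criterion named above, the following Python sketch compares assessment results against reference values treated as a gold standard. It is only one possible reading, not a method given in the lecture: the `Estimate` class, the three function names, the choice of interval coverage as a calibration measure, interval width as an informativeness measure, and all numbers are assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Estimate:
    """One assessment result: a central value with an uncertainty interval.

    Hypothetical structure, introduced only for this illustration.
    """
    central: float
    low: float
    high: float


def calibration(estimates, references):
    """Share of reference values that fall inside the stated intervals."""
    hits = [e.low <= r <= e.high for e, r in zip(estimates, references)]
    return sum(hits) / len(hits)


def mean_interval_width(estimates):
    """Average interval width; narrower intervals are more informative."""
    return mean(e.high - e.low for e in estimates)


def mean_abs_discrepancy(estimates, references):
    """Average absolute gap between central estimates and reference values."""
    return mean(abs(e.central - r) for e, r in zip(estimates, references))


# Made-up numbers: four assessment results and the reference values
# assumed to be the gold standard.
estimates = [
    Estimate(central=10.0, low=8.0, high=12.0),
    Estimate(central=5.0, low=4.0, high=6.0),
    Estimate(central=20.0, low=15.0, high=25.0),
    Estimate(central=2.0, low=1.0, high=3.0),
]
references = [11.0, 7.0, 18.0, 2.5]

print("calibration:", calibration(estimates, references))
print("mean interval width:", mean_interval_width(estimates))
print("mean absolute discrepancy:", mean_abs_discrepancy(estimates, references))
```

The two interval measures pull in opposite directions: very wide intervals are almost always calibrated but say little, so they are best read together. None of this covers relevance, applicability or efficiency, which need the user opinion and rating methods listed above.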