MUSiC for Applications Projects: A Summary

Please note: This document is a partial summary of the results of the MUSiC ESPRIT project, extracting the parts relevant to the MUSiC Applications Projects proposal. Readers interested in the full range of products from the MUSiC project are invited to contact Mr Mike Kelly of Brameur Ltd, UK (addresses on the last page).

Product Quality Concerns

The economic climate of the early 1990s has led to crises in most sectors of industry, with many organisations unable to cope with the resultant changes imposed on them. The Information Technology and software systems industry was particularly hard hit. The dependence of most organisations on software, or on software-embedded systems, means that the quality of that software now has a far greater effect on the business than at any time in the past: unreliable systems can cause severe, in some cases catastrophic, financial damage or, in the extreme case, loss of life.

Why Quality Of Use?

Quality is a key issue of the 1990s in every aspect of our existence. We now talk about Quality of Life instead of standard of living, and of Quality Time in our personal lives; manufacturers state Quality of Product as a key selling point in all sectors of industry (often undefined as well as unjustified!). Ask any manager to justify how much is spent on overheads such as offices and equipment, on utilities and on people, and how necessary and how good all these are: they will generally have the facts and figures immediately to hand. Ask a senior manager how much their IT department spends and what they get for it and, even today, most cannot answer, even though these budgets often run into millions. This is because people are still not measuring correctly and accurately in the IT development domain.
Quality is not measured in general, and Quality of Use (usability) in particular is not assessed, even though it is the most critical factor in the successful take-up (marketing, sales and operation) of any system or item of software. Note: we use the terms Quality of Use and usability almost interchangeably here, and the reader is advised for the time being to treat them as equal, although Quality of Use in fact encompasses a broader domain than the traditional definitions of usability.

Usability is increasingly becoming the most talked-about and significant attribute of any software system. Since usability is so closely related to Quality (and on a par with it!), managers must address the following questions:

How do you know your product is better than the competition without being able to measure the competitors' products?
How do you know what potential and actual users think about your products without being able to measure them?
How do you know you are using or purchasing the right product if you cannot measure its acceptability?
How do you know you are conforming to a particular standard without being able to measure your product against it?

Compliance with standards and directives can only be assessed using measurement. MUSiC is the only usability assessment method which is metrics (i.e. measurement) based, and thus the only one which:

can be used to help organisations throughout Europe check compliance to standards from a common standpoint
can be used by assessors and auditors to check compliance using a common European method.

Standards

One method of ensuring that software usability is of consistently high quality is the setting of standards against which measures of a product's usability can be compared. More and more standards stress the need for measurement as part of the development process, in particular ISO 9000, now becoming mandatory for many government contractors (in the UK and spreading through Europe).
Others include ISO 9126 'Software quality characteristics and guidance for their use', ISO 9241 Part 11 'Guidance on usability specification and measures' and, of more immediate significance, the increasing enforcement of compliance to the European Directive on the use of computers in the workplace1. Such standards and measures (validated as metrics) serve two purposes:

Protecting the user: they act as a safeguard for prospective users, who can be assured of the quality of software certified as meeting the standards.
Supporting the developers: the measurement of usability as an integral part of the design cycle should enable developers to recognise shortcomings in the design at a stage when improvements can still be made relatively cheaply and easily.

The need for a method to help developers and users ensure that systems comply with usability-related standards has never been more pressing. MUSiC provides the means by which developers, users and auditors can assess compliance to this wide range of usability-oriented standards.

What do we actually mean by Usability?

Usability is defined by MUSiC as:

The extent to which a product can be used effectively, efficiently and with satisfaction by specified users, for specified tasks in specified environments.

This is very similar to the definition found in ISO 9241 Part 11 mentioned previously; the emphasis is on the user and the system's ability to help them achieve their specified goals. It becomes possible to measure against this definition when a working prototype or an implemented system is available. There are further facets of usability which it may be desirable to evaluate, such as:

How easy is it for the specified users to learn to use the system?
How much in control of the system are the specified users?
How does the system affect the user - how do they like it?
How much mental effort do the users expend?
How much help does the system provide to the user?
Again, MUSiC can provide answers to all these questions.

Efficiency, Effectiveness and Satisfaction

In a more quality-aware industry, producers face increased competition, decreasing sales and more demanding customers. Software users, on the other hand, need more reliable systems that allow their staff to perform tasks more effectively, allow them to achieve their goals more efficiently, and motivate them by being more satisfying to use. It may be argued that many managers, finding themselves in a situation where unemployment is high, have little trouble filling vacancies and would not worry about whether their staff are 'satisfied' in their use of a system. In fact the opposite is true: managers, having to cope with reduced staff levels, need to ensure that these people are satisfied so that they are motivated and hence productive. Efficiency, Effectiveness and Satisfaction go hand in hand; they are not alternatives.

1 CEC DSE Directive 90/270/EEC 1992 on Health and Safety in the Workplace provides relatively straightforward standards for equipment, ergonomics, etc., but only general principles on software quality requirements and expectations.

Which Usability Measurements?

It should now be clear that the use of interactive computer software, as a normal part of everyday work, has brought increasing awareness in leading sectors of industry of the importance of usability as a software quality. Usability has become a prime factor in determining user acceptance of a computer product or system. However, while the importance of designing for usability has been recognised for some time, specifying usability requirements accurately and testing for their implementation remains a complex task. MUSiC now provides practical tools and methods to help test and improve usability.
MUSiC classifies usability measures into three types, each of which requires a product or prototype that representative users can operate. These measures are:

Performance Measurement
The performance of users is measured through objective, rule-based observations. Users are observed as they carry out tasks representative of the final system, under conditions reproducing the major features of the normal working context. Performance measurements are important for certification. Performance metrics are used when an operational simulation or prototype of the system is available, and can be used to iteratively refine the product design throughout the development cycle.

User Perceived Quality Measures
Users spend a period of time using the system before completing questionnaires which provide measures of their attitude towards it. User satisfaction metrics are used whenever a working version of the software product is available (e.g. a previous version, a prototype or a test version). These metrics include efficiency, learnability, control, affect (likeability) and helpfulness.

Measures of Cognitive Workload
Measures of cognitive workload assess the mental effort required by users, who are measured as they perform tasks on the system. The ratio of performance to effort is used as an indicator of usability, with a high ratio indicating a high level of usability. Measures of mental effort are used once a prototype of the system is available, and can be used to iteratively refine the system throughout the design cycle.

The set of MUSiC products provides a comprehensive, proven package of usability measures within a practical framework for industrial application.

Where Do We Start?

A problem for many software producers and in-house development teams is knowing where to start.
"We want to improve usability; we know that we should be evaluating and testing products against the competition and previous products, but how do we go about it?"

A reputable Human Factors or HCI (Human Computer Interaction) consultant can provide a qualitative evaluation service and offer sound recommendations, but there can be significant advantages in keeping the evaluation work in-house. This gives continuity, allows quick access to accumulated expertise, and avoids any worries over commercial confidentiality where this is an issue. Until recently, however, there has been a paucity of tried and tested methods for would-be evaluators of usability to adopt. Happily, things are changing. There is now a range of proven methods and tools for usability evaluation which can be used cost-effectively in a commercial setting. Several of these are products of the MUSiC project. A major thrust has been the use of the methods and tools in commercial contexts, and the development of training packages and support services for companies wishing to adopt the methods.

Put IT In Context

Usability Context Analysis (UCA) is the prerequisite for any evaluation. It is a paper-based questionnaire and a guide to ensuring that the circumstances in which a prototype or product is evaluated match the intended circumstances of eventual use. An evaluation which studies how well users perform unrepresentative tasks in artificial circumstances is unlikely to give a worthwhile indication of final system usability. The UCA Guide first provides a simple method for describing key features of:

Users: for whom a system is designed
Tasks: which a system is designed to help users achieve
Environments: in which a system is designed to operate - technical, physical and organisational

all of which contribute to the "Context of Use".
Using a questionnaire format, it gives the evaluator a structured method for describing the users, tasks and environment in which the system is to be evaluated, and for documenting how accurately this matches the intended context of use. This approach should underpin all user-based evaluations, since it provides a clear indication of how much weight should be given to the results of an evaluation.

Tools For Usability Testing

Most of the MUSiC tools are paper based, with some computer software included, particularly in the more sophisticated tools. To assist the initial application of usability measures, a minimal selection of tools, and sub-sets of other tools, have been tailored and packaged as the "MUSiC Toolbox" (contact Brameur, UK, for more details about the MUSiC Toolbox). This enables simpler, but still complete and accurate, usability measures to be made. It allows for subsequent upgrades to more sophisticated measurements that may require the assistance of more specialised staff, or facilities not currently available within the company. The most comprehensive measurements require the facilities usually found only in test houses or more costly usability laboratories, a service that is provided by some of the MUSiC partners. The range of MUSiC products covers a broad spread of measurement techniques, with methods to meet different user needs for effective usability measurement. Note that some tools are alternatives: some are included in the MUSiC Toolbox, while others are for use in controlled conditions and in various combinations. This allows the measurement plan to be tailored to meet the differing needs of test houses, development companies, software systems procurers and others.
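The core idea of Usability Context Analysis described above, that an evaluation is only as informative as the match between its context of measurement and the intended context of use, can be sketched in code. The structure, field names and matching score below are illustrative assumptions for this summary, not part of the MUSiC tools:

```python
from dataclasses import dataclass


@dataclass
class ContextOfUse:
    """A simplified context description in the spirit of UCA: key
    features of the users, tasks and environments for which a system
    is designed (the fields and their granularity are hypothetical)."""
    users: set         # e.g. {"clerical staff"}
    tasks: set         # e.g. {"enter customer order"}
    environments: set  # technical, physical and organisational features


def context_match(intended: ContextOfUse, evaluated: ContextOfUse) -> float:
    """Crude indicator of how well the context of measurement matches
    the intended context of use: the fraction of intended context
    features reproduced in the evaluation setting."""
    intended_features = intended.users | intended.tasks | intended.environments
    evaluated_features = evaluated.users | evaluated.tasks | evaluated.environments
    if not intended_features:
        return 1.0
    return len(intended_features & evaluated_features) / len(intended_features)


# Example: a lab evaluation that reproduces the intended users and one
# of the two intended tasks, but not the intended working environment.
intended = ContextOfUse({"clerical staff"},
                        {"enter order", "print invoice"},
                        {"open-plan office"})
evaluated = ContextOfUse({"clerical staff"},
                         {"enter order"},
                         {"usability lab"})
print(context_match(intended, evaluated))  # 0.5
```

A low score is a warning that the evaluation results should be read with caution, which is exactly the role the UCA Context Report plays in documenting how accurately the test setting matches the intended context of use.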
Usability Testing in the Product Development Cycle

At the earliest stages, before a product is designed, previous versions of the product or competing products may be evaluated to test their strengths and weaknesses vis-a-vis the market. During an iterative design cycle, rapid prototypes may be built which mirror (some of) the functionality, as well as the look and feel, of the proposed system. These prototypes enable the evaluator to conduct expert walk-throughs or user interactions. Later in the design cycle, high-fidelity prototypes and early versions of the system enable (alpha and beta) user-based evaluations to be carried out.

Practical Methods

A complete package of evaluation methods should incorporate all facets of usability testing, so that evaluators can choose those aspects which they (or the developers or procurers) consider most important. The MUSiC Project provides just such a package, from which evaluators can choose to adopt methods individually or in combination. The methods give quantitative data, so that accurate comparisons can be made between different systems, versions and user groups.

User-Based Testing

Testing with representative users can give a far more complete picture of the usability of a system, although it requires a working prototype or implemented system. The MUSiC project has developed user-based methods for measuring the core factors of usability: efficiency, effectiveness and user satisfaction.

MUSiC Usability Measurement Sequence:
1. Define the product to be tested.
2. Define the context of use.
3. Specify and prepare the context of measurement.
4. Perform the user tests.
5. Analyse the test data to derive the metrics.
6. Produce a usability report.
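To illustrate the kind of performance-based metrics derived in step 5, the sketch below computes effectiveness and efficiency figures in the style of the Performance Measurement Method. The exact formulas and names here are an assumption for illustration; the authoritative definitions are given in the PMM handbook:

```python
def task_effectiveness(quantity: float, quality: float) -> float:
    """Task effectiveness as the product of the proportion of the task
    completed (quantity, in %) and the quality of the result (in %),
    in the Quantity x Quality style of performance metric."""
    return quantity * quality / 100.0


def user_efficiency(effectiveness: float, task_time_min: float) -> float:
    """Efficiency relates effectiveness to the resource spent on the
    task, here task time in minutes (% effectiveness per minute)."""
    return effectiveness / task_time_min


def relative_user_efficiency(user_eff: float, expert_eff: float) -> float:
    """A user's efficiency as a percentage of an expert's efficiency,
    useful for comparing versions or competing products."""
    return 100.0 * user_eff / expert_eff


# Example: a user completes 90% of a task at 80% quality in 12 minutes,
# against an expert efficiency of 8.0 (% per minute).
te = task_effectiveness(90, 80)           # 72.0 (%)
eff = user_efficiency(te, 12)             # 6.0 (% per minute)
rel = relative_user_efficiency(eff, 8.0)  # 75.0 (% of expert efficiency)
```

Figures of this kind are what make the quantitative comparisons between systems, versions and user groups mentioned above possible.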
PRODUCT LIFE CYCLE AND TESTING SCENARIOS

Stage in PLC: Feasibility
Scenario: Earliest development stage, where only previous versions or competitive products are available for testing.

Stage in PLC: Low fidelity prototype
Scenario: A rapid simulation of limited aspects of the system's functionality is available. It supports only a scripted walk-through of the system, and is generally intended for demonstrating concepts and eliciting feedback on proposed functionality.

Stage in PLC: High fidelity prototype
Scenario: A rapid simulation of the essential functionality of the proposed system is available. This is robust and comprehensive enough to allow users free access to perform specific tasks, though outputs may not be available. The prototype should reflect the intended interface in design detail.

Stage in PLC: Alpha & Beta test
Scenario: A first release of the system is made available to specific alpha test sites. This may be a partial system for incremental delivery.

Stage in PLC: In use
Scenario: The product is actually in use in the field, by real users performing real tasks in the real environment.

COST-BENEFIT TRADE-OFF

Measure: Cognitive Workload - Objective
Measures cognitive workload using physiological monitors. Use objective measures when excessive mental effort is a critical factor, e.g. in situations such as nuclear power, air traffic control and the process industries.
Advantages: objective; both detailed and global; precise.
Disadvantages: high costs relative to subjective measures; rigid experimental evaluation design required; may be intrusive.

Measure: Cognitive Workload - Subjective
Measures opinions of task complexity. Subjective Mental Effort Questionnaire (SMEQ): statement selection, empirically scored. Task Load Index (TLX): mental, physical and temporal demands; also performance, effort and frustration.
Advantages: low cost; hardly any methodological constraints; valid and reliable indicators of mental effort and related environmental factors.
Disadvantages: only global indicators; subject to conscious control of individuals.

Measure: User Perceived Quality
SUMI (Software Usability Measurement Inventory) profiles Affect, Helpfulness, Learnability, Efficiency and Control, as well as providing a Global indicator of usability.
Advantages: Global score for target setting and quick comparisons; about 10 minutes to complete the questionnaire; scored by hand or computer assisted; not constrained to measuring IT systems.
Disadvantages: only global indicators; subject to conscious control of individuals; representative product required.

Measure: Performance Measurement Method
Measures effectiveness and efficiency; productivity; difficulty analysis.
Advantages: automated support for video analysis (DRUM); simplified version available.
Disadvantages: requires well-defined tasks and a representative product (some value from early prototypes); DRUM software needed for efficient analysis; can be intrusive (Hawthorne effect).

SOME COMMON QUESTIONS ANSWERED

Q: How much does it cost to run a MUSiC evaluation?
A: Cost varies depending on the approach and type of evaluation performed (see Section 3). Assuming purchase of the MUSiC Toolbox and the presence of experienced usability evaluators in house, inexpensive questionnaire-based evaluations could be run that would involve only the costs of selecting users, performing trials involving questionnaire completion, and scoring the data. In such situations, the major cost component is the evaluators' and users' time. On the other hand, formal, laboratory-based evaluations of emerging products, utilising DRUM and the full PMM, would involve either employing outside consultants to perform the evaluation or investing in training evaluators in the application of these tools. While this would be more expensive than the first scenario, it should be remembered that such an evaluation not only provides far more detailed information to the design team on effectiveness and efficiency, but the initial investment in training is a one-off cost: the trained evaluators would be capable of carrying out similar evaluations in the future without further training.

Q: Can anyone use MUSiC, or is specialist knowledge required?
A: Usability evaluation is a specialist skill and, like all such skills, is best performed by those with training and knowledge of the relevant tools, techniques and procedures. However, MUSiC tools have been developed to enable individuals trained in their use to perform valid and reliable usability evaluations. Although it is possible to purchase, and therefore use, any or all of the MUSiC tools without taking the training, the MUSiC consortium will accept no responsibility for evaluations and conclusions drawn in this manner.

Q: How long does a usability evaluation take to perform?
A: As the present GUIDE has sought to convey, evaluations range from quick, half-day tests to lengthy field trials, which may take several months to perform reliably and validly. MUSiC tools can be applied across all these types of evaluation. Obviously, given very limited time and resources, or the absence of sophisticated evaluation facilities, it would be more appropriate to use, for example, SUMI rather than DRUM. Beyond the obvious constraints of resources, however, which only the evaluator can determine reliably, MUSiC methods are flexible. It should be noted that use of MUSiC tools requires some planning and interpretation time, but the resulting output is likely to be more reliable and valid than any 'quick-and-dirty' approach.

Q: Do I need a usability laboratory to use the MUSiC methods?
A: No. A usability laboratory is only required for evaluations involving very detailed data capture of user-product interaction with video cameras, e.g. using the full Performance Measurement Method with DRUM. Even then, such facilities can be set up where needed. Basic versions of all tools can be used with minimal facilities.
Q: Can I get help from the developers if I decide to use their method?
A: If you or your company purchase MUSiC tools and training, flexible help is always available.

Q: What does the MUSiC method provide that other evaluation procedures do not?
A: As the GUIDE has outlined, MUSiC offers the first valid and reliable means of quantifying the usability of an interactive product. In so doing, it supports the derivation of context-specific measures of usability that can be employed to compare competitive products, or versions of the same product undergoing modification. Regular use of MUSiC enables the development of an in-house database of evaluation results that can be used to inform both the development of future products and the comparison of future versions of the same product.

THE MUSIC PRODUCTS

General Guide

The General Guide describes the complete MUSiC system and introduces the tools and methods. Detailed information is contained in the specific Tool Handbooks. The following pages introduce some of the tools and techniques which could be used within the MUSiC Application Projects. It is stressed that, in practice, only those tools and techniques which are identified as strategic and which suit the participating company are used. The relationships to software engineering procedures and methods are elaborated where required.

Usability Context Analysis

The Practical Guide to Usability Context Analysis describes a method by which:

the context in which a product is, or will be, used can be specified
factors in the context that are likely to affect usability can be identified, and
the context in which the usability of the product will be measured can be defined.

Part 1 of the guide explains the importance of context in the measurement of usability, and explains some of the terminology and concepts relating to context.
This part of the handbook is written for a wide audience, including procurers, designers, usability analysts and consumers of IT products. Part 2 of the guide describes the questionnaires and provides guidance on describing a product's context of use and specifying an appropriate context of measurement. It also explains how to produce a clear Measurement Plan. A Context of Use Questionnaire is included, intended for use by procurers, designers and usability analysts; a Context Report table, also included, is used for specifying the context of measurement and is intended only for usability analysts with suitable training.

Performance Measurement Method

The Performance Measurement Method (PMM) is a technique for measuring usability that relies on user testing of the product. MUSiC provides a handbook describing the PMM, by which the usability of a product can be measured quantitatively. The method uses measures based on performance. It provides data on the effectiveness and efficiency of users' interaction with a product, thus enabling comparisons with similar products, or with previous versions of the product under development. It can also highlight areas in which a product can be modified to improve usability. The method provides reliable results and is efficient to use. The handbook is in seven parts:

Part 1 - explains how to get started and provides useful background
Part 2 - explains how to apply the method
Part 3 - provides an overview of how to analyse usability sessions
Part 4 - describes in detail how to analyse the output from evaluation tasks
Part 5 - describes in detail how to analyse video records of usability sessions
Part 6 - describes the software tool that supports the method ("DRUM")
Part 7 - provides interpretation guidance.
The handbook also contains four appendices:

A1 - provides a format for describing problems
A2 - provides information on Hierarchical Task Analysis
A3 - is a directory to the performance-based metrics
A4 - provides technical information about hardware requirements.

A basic version is also available, separately and as part of the MUSiC Toolbox.

DRUM

DRUM (Diagnostic Recorder for Usability Measurement) is a software tool which provides assistance throughout the process of usability evaluation, where the evaluation is based on the analysis of user performance. DRUM greatly speeds up the analysis of video recordings, and where possible activities are automated entirely. It helps the analyst build up a time-stamped log of each evaluation session, and calculates measures and metrics. The tool delivers evaluation data in a format compatible with spreadsheets and statistical packages, for further analysis and graphical display. DRUM assists many aspects of the usability analyst's work:

Management of data
Task analysis
Video control
Analysis of data

DRUM requires:

Apple Macintosh II (or LC) computer with a 13-inch or larger monitor
System 7 (or 6.0.5 or later)
HyperCard 2.0 or later, allocated at least 1.3 Mbyte of RAM
A VCR with an RS232C serial interface and appropriate time or frame coding

User Perceived Quality: SUMI

Perhaps the simplest way to test the usability of a system is to ask the user, and the most conveniently structured way to do this is with a questionnaire. Properly conducted and analysed, this approach can provide valid and reliable measures of user satisfaction. The Software Usability Measurement Inventory (SUMI) is a fifty-item, internationally standardised questionnaire for quantitative measurement of how usable a product is in the eyes of the user. SUMI was developed within the MUSiC Project at University College Cork, Ireland.
It gives information about five factors: Efficiency, Affect, Helpfulness, Control and Learnability, plus a Global measure of usability, together with diagnostic data specifically relating to the product being investigated. Its validity and reliability have been established in field trials, and SUMI offers the prospect of inexpensive collection of trustworthy data about the usability of a product. A handbook supports the correct use of the questionnaire. It is of course important that people filling in a SUMI questionnaire are representative users who have been performing representative tasks in a suitable setting. The tool also contains computer administration software and the supporting scoring program SUMISCO, which enables the user to carry out the analysis against a database of standardised samples, providing more accurate scores than would otherwise be possible.

Measures of Cognitive Workload

As well as an introduction to the measurement of cognitive workload, MUSiC provides methods of two basic types:

Objective Measures
Subjective Measures

Objective Measures of Cognitive Workload

In some circumstances (for example, where use of a system is continuous, or where its correct use is safety critical) it is also desirable to measure the mental effort (more correctly, cognitive workload) imposed upon the user. Work at the Technical University of Delft (Netherlands) has produced methods for the measurement of cognitive workload. Objective measures are obtained by monitoring variations in heart rate; subjective measures are elicited by the use of questionnaires. MUSiC objective cognitive workload measures are based on heart rate variability, which is in turn dependent on blood pressure regulation, temperature regulation and respiration. Heart rate and respiration are measured.
As heart rate may be affected by other factors such as stress, movement, etc., it is not always a valid measure of mental effort.

Subjective Measures of Cognitive Workload

There are two basic methods:

The Subjective Mental Effort Questionnaire (SMEQ)
The SMEQ contains just one scale and has been carefully designed so that individuals are supported in rating the amount of effort invested during task performance. The support is provided by nine scale anchors: verbal statements such as "No effort at all" or "Very much effort". The choice of statements and their scale locations are empirically derived. The SMEQ has been administered in various laboratory and field studies with high validity and reliability values.

The Task Load Index (TLX)
The TLX is a widely used and acknowledged technique. It is a multidimensional rating: the weighted average of six sub-scales that relate both to the individual and to the task he or she is to perform. The sub-scales are:

The demands imposed on the individual: mental, physical, temporal
The interaction of the individual with the task: performance, effort, frustration

THE MUSIC PARTNERS

The MUSiC partners for MUSiC for Applications Projects are:

In the UK:
BRAMEUR Limited
Clark House
2 Kings Road
Fleet
Hampshire GU13 9AD
Tel: +44 252 812252
Fax: +44 252 815702

National Physical Laboratory
Queens Road
Teddington
Middlesex TW11 0LW
Tel: +44 81 943 6097
Fax: +44 81 977 7091

In Ireland:
Human Factors Research Group
University College Cork
Cork
Tel: +353 21 276871
Fax: +353 21 270439

In The Netherlands:
Delft University of Technology
De Vries van Heystplantsoen 2
2628 RZ Delft
Tel: +31 1578 3720
Fax: +31 1578 2950

In Italy:
Data Management Spa
Laboratorio di Ricerca Applicata
Via Bassi, 10
Ospedaletto (PI)
Tel: +39 50 985 330
Fax: +39 50 985 358

Please note: This is not the complete list of partners of ESPRIT Project P5429, Measuring Usability in Context.
For information on this project and up-to-date information on exploitation activities arising from it, please contact Mr M Kelly, of Brameur Ltd, at the address given above.