
…continued…

Part III. Performing the Research
3 Initial Research
4 Research Approaches:
  4.1 Experience
  4.2 Modeling and Simulation
  4.3 Experimental Design
  4.4 Qualitative Approaches
  4.5 Quantitative Approaches
  4.6 Collecting Data from Respondents (surveys, etc…)
  4.7 Observing System Behavior
5 Hypotheses
6 Data Collection
7 Data Analysis

4.1 Experience

The main role for experience in research is:
• effective design of a research project:
  – selecting a do-able and valuable project
  – selecting and designing appropriate methods and tools:
    • eg: designing a survey to extract information (everything relevant, and without introducing bias)
• effective execution of the research:
  – careful execution of experiments/surveys/etc... without introducing errors
  – not missing any new insights that may not have been expected
    • eg: the discovery of penicillin
• thorough and insightful analysis of results, and drawing of conclusions:
  – through use of appropriate tools
  – patience, and creativity

4.2 Modeling and Simulation

What is a model?:
• a representation of an actual or designed system/object, used to:
  – describe the represented system (for education)
  – gain insights into the represented system (for education, or inference)
  – compare the behavior of alternative versions of the represented system (to optimize design):
    • eg: try different numbers of trucks in an earthmoving system, to minimize costs
  – predict how the system will behave under different scenarios (to plan for future events, or evaluate what happened in the past or present):
    • eg: how will the energy consumption of a building change with different weather conditions

What is simulation?:
• the running of a model that is dynamic (typically one that varies with time):
  – involves a class of models that need to be advanced through time in a step-wise manner, either because:
    • there is no known direct solution to the problem,
      – eg: simulation of a construction process
      – eg: simulation of building acoustics
    • or, the user needs to interact with the model through time:
      – eg: equipment training simulators
      – eg: management gaming (to develop individual and group skills in project management)

Classifications of models
• Rosenblueth and Wiener (1945):
  – Material models:
    • transformations of original physical objects
  – Formal models:
    • logical, symbolic assertions of situations
• Churchman et al. (1957):
  – iconic models:
    • visual or pictorial representations of aspects of a real system
    • eg: design drawings
    • eg: simulation model diagrams
    [Figure: CYCLONE Excavator-Truck Based Earthmoving Model. Excavator cycle: dig, slew, load, slew back, wait for dump truck. Dump trucks cycle: haul, dump, then a branch with p = 0.95 and p = 0.05 (return / repair), wait for excavator. 1 excavator (1.5 cu yd bucket); 5 dump trucks at start (15 cu yd capacity each).]
  – analog models:
    • adopt a system with a set of properties that have an appropriate correspondence with the system under investigation
    • eg: electrical circuit representing heat flow in a building (the behavior of the two systems follows the same mathematical principles)
    • eg: hydraulic analog of the UK economy (MONIAC) http://en.wikipedia.org/wiki/MONIAC_Computer
  – symbolic models:
    • involve logical or mathematical representations
    • eg: linear regression model
    [Figure: house price (y) plotted against square footage (x), with the fitted line $y = m \cdot x + c$ and intercept c marked on the y-axis.]
• Sayre and Crosson (1963):
  – replications:
    • have significant physical similarity to the reality
  – formalisations:
    • symbolic models
    • eg: linear regression model
  – simulations:
    • require stepwise evaluation to generate results
    • eg: CYCLONE simulation
• Fellows and Liu (1997) (synthesis of other classifications):
  – iconic
  – replications
  – analogs
  – symbolic

• The previous divisions classify modeling methods by their means of representing a problem.
• There are other ways of dividing up models. A common dichotomy is:
  – Stochastic:
    • these models recognize that some aspects of a problem are uncertain, and build this uncertainty into the model:
      – eg: how long it takes to perform a construction activity
      – eg: how much concrete will cost per cu yd
      – eg: will the HVAC equipment pass or fail its inspection
    • methods that take uncertainty into account include:
      – eg: Monte Carlo sampling
      – eg: PERT
      – eg: Markovian models
  – Deterministic:
    • these models ignore uncertainty
      – eg: only use the expected duration to perform a construction activity
      – eg: use the expected cost rate
      – eg: assume the HVAC equipment will pass inspection

• The modeling process (from Fellows and Liu (1997), and Mihram (1972)):
  1. Determine model’s objectives: its purpose(s); who will use it?
  2. Study the reality: the system, process, or object to be modeled.
  3. Synthesize: combine components into model(s).
  4. Verify model(s): check model(s) for bugs or discrepancies from what you had intended.
  5. Validate model(s): assess accuracy relative to the system being represented (overall accuracy, and consistency across all of the problem).
  6. Select most appropriate model(s): such as that which produces the most accurate results, or that with the most useful scope of application.
  7. Apply model(s): for predictions, for making comparisons, for planning, for making inferences, etc...
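The stochastic versus deterministic dichotomy described above can be illustrated with a minimal sketch. It assumes three hypothetical construction activities performed in sequence, each described by an invented (optimistic, most likely, pessimistic) duration, and it uses a triangular distribution purely for convenience; none of these numbers or choices come from the lecture. The deterministic model adds up one expected duration per activity, while the Monte Carlo model samples every duration many times to show the spread around that figure.

```python
import random
import statistics

# Hypothetical activities: (optimistic, most likely, pessimistic) durations in days.
ACTIVITIES = [(4, 5, 9), (2, 3, 4), (6, 8, 13)]


def deterministic_total(activities):
    """Deterministic model: add up one expected duration per activity."""
    # The mean of a triangular distribution is (low + mode + high) / 3.
    return sum((low + mode + high) / 3 for low, mode, high in activities)


def monte_carlo_totals(activities, trials=10_000, seed=1):
    """Stochastic model: sample every activity duration, many times over."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = []
    for _ in range(trials):
        # random.triangular takes (low, high, mode) in that order.
        total = sum(rng.triangular(low, high, mode) for low, mode, high in activities)
        totals.append(total)
    return totals


expected = deterministic_total(ACTIVITIES)
totals = monte_carlo_totals(ACTIVITIES)
print("deterministic total (days):", expected)
print("simulated mean (days):", round(statistics.mean(totals), 2))
print("simulated std dev (days):", round(statistics.stdev(totals), 2))
print("share of runs longer than the deterministic total:",
      sum(t > expected for t in totals) / len(totals))
```

The deterministic answer is a single number; the stochastic run additionally reports how often the project would overrun that number, which is the kind of information a deterministic model cannot provide.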
• Example 1:
  – Multivariate Linear Regression Modeling (estimating the expected annual energy costs of a house)
    • (see Assignment 2)

• Example 2:
  – Construction Simulation Modeling (field application of an earthmoving system, comprising 1 excavator and ‘n’ dump-trucks):
    • Objectives: find the number of trucks that balances the excavator, thereby minimizing construction costs
    • Study reality: determine the activities, determine the activity durations from historic data, determine the dependences between the activities and the interactions between the equipment, etc…
    • Synthesize: design the simulation diagram (perhaps using CYCLONE or similar), and input this to the computer using an appropriate simulation software package
      [Figure: CYCLONE Excavator-Truck Based Earthmoving Model. Excavator cycle: dig, slew, load, slew back, wait for dump truck. Dump trucks cycle: haul, dump, return, repair, wait for excavator.]
    • Verify: run the simulation model, observing the behavior of the simulated system, to look for bugs in the model, eg: entering a 10 cu yd bucket when a 1 cu yd bucket was intended; or designing the model so that a truck is filled by just 1 bucket load when it actually takes 10 bucket load cycles…
    • Validate: run the model and compare its performance to available performance data (say, data for a system comprising 2 dump-trucks), or compare it to your expectations based on experience, to establish its accuracy, eg: a qualitative graph comparison, or a quantitative statistical comparison (Theil’s test)
      [Figure: Cumulative Production for 2 truck system. Simulated versus actual cumulative production (cu yds) against time (hrs).]
    • Select most appropriate model: repeat the last 3 steps using variations of the model, looking for that which is most accurate, eg: try including breakdown or maintenance of trucks to see if it affects model accuracy; try running as a stochastic versus a deterministic model
    • Apply model: perform a sensitivity analysis, running the simulation with 1 truck, then 2 trucks, then 3 trucks, etc, measuring the cost of completing the work for each number of trucks
      [Figure: Cost versus number-trucks sensitivity analysis. Cost to complete work ($) against number of dump-trucks (1 to 14).]

4.3 Experimental Design

What is an experiment?:
• A test or study designed to investigate relationships between independent and dependent variables:
  – tossing a coin to see if it is biased:
    • independent variables = the force of the flick of the coin; the point of contact on the coin; the time allowed for the coin to spin, etc..
      – these are actually all randomized
    • dependent variable = heads or tails
  – immersing a brick in a pool of water to see how much water it absorbs over time:
    • independent variables = the type of brick; air pressure; temperature; time into experiment..
    • dependent variable = quantity of water absorbed
  – determining occupant satisfaction with automatic light sensors for turning lights on and off:
    • independent variables = the amount of delay after perceiving movement before the lights are turned off; the sensitivity of the sensors to movement; the level of external sources of lighting;..
    • dependent variable = the level of satisfaction of the users (may be measured with post-event interviews).

Basic approach:
• Ideally, fix all independent variables except one:
  – eg: fix air pressure and temperature..
• then vary the remaining independent variable:
  – eg: time (to see how much water is absorbed over time)
  [Figure: dependent variable (eg: water absorbed) against the main independent variable (eg: time), with the other independent variables (eg: temperature, air pressure..) held fixed.]
• Often, we cannot fix all independent variables except one (there will be some variance in the other variables):
  – eg: cannot regulate temperature or air pressure perfectly
  – then the results will include some errors (experimental errors) since these other variables will affect the results
  – try to keep them as fixed as possible, then:
    • use repeat experimentation to average out results (randomize errors);
    • or maybe use some form of curve fitting with smoothing
  – example of repeat experimentation to average out error in results:
    [Figure: repeated measurements of the dependent variable (eg: water absorbed) clustered at each value of the main independent variable (eg: time).]
  – example of curve fitting:
    [Figure: a smooth curve fitted through scattered measurements of the dependent variable (eg: water absorbed) against the main independent variable (eg: time).]

Designing the set of experiments:
• Need to identify the problem domain:
  – the range of values to be considered for each independent variable
  [Figure: scope of problem (problem domain) shown as a region on axes Independent Variable 1 and Independent Variable 2.]
• Then determine the set of values to be considered within that domain:
  – this may be done on a grid (example with 2 independent variables)
    [Figure: regular grid of sample points on axes Independent Variable 1 and Independent Variable ‘n’.]
  – the question is, how fine should the resolution of the grid be?
    • the finer the resolution, the more information is provided by the results, but the more costly is the set of experiments
    • one way to determine the resolution is, if building a model of the data, to test the accuracy of the model. If it is not accurate enough, try increasing the resolution of the experimental data.
      – Can plot the sensitivity of accuracy to experiment resolution
      [Figure: model error against resolution (fineness) of data samples.]
  – the set of values measured in the experiments may be chosen randomly
    • the number of experiments is governed, as before, by the trade-off between cost and information gleaned
    • in some situations (eg: when we just want to know the average value for a dependent variable) we can use statistics to determine the number of experiments to perform
      – the greater the variance, the more experiments we need (covered in a later lecture)
    [Figure: randomly placed sample points on axes Independent Variable 1 and Independent Variable ‘n’.]
  – the set of values measured in the experiments may be on a grid that is not constant
    • this may be because the rate of change in the dependent variable varies (eg: becomes less for larger values of time)
    [Figure: non-uniform grid of sample points on axes Independent Variable 1 (eg: time) and Independent Variable ‘n’ (eg: air pressure).]
• Sometimes the problem domain is not a regular shape, and may imply some interdependence (correlation) between the independent variables:
  – for example, when one independent variable has a large value, maybe a second independent variable tends to have a larger (or smaller) value;
  – as a practical example, larger houses tend to have higher quality fittings:
  [Figure: irregular problem domain on axes House Size and House quality.]

Determining the size of a random sample:
• Some problems have a finite population to be assessed, for example:
  – What percentage of the population will vote for an Independent candidate?
    • the population of people eligible to vote is in the millions, or
  – What percentage of architectural practices use nD CAD tools when designing an airport?
    • the number of architectural practices capable of designing airports is just a handful.
• Other problems have an infinite population to be assessed:
  – What is the average temperature of a room over a one-year period?
    • there are an infinite number of points in time when the temperature could be measured
• For infinite populations, and large finite populations, it is not feasible to measure every possible situation.
• Thus, we need to take a statistical sample, and make inferences from those results.
  – Eg: measure the mean number of construction fatalities per year for each of a random selection of construction companies
• When we sample from a population:
  – the larger the number of samples (the greater the portion of the population we sample) then:
    • the more accurate will be our inferences, but
    • the more expensive will be the study (time and $)
• So, how large should the sample be?
  – (a) we can decide on an acceptable margin of error (E) and level of confidence (often 95%), then calculate the required sample size (n), or
  – (b) we can decide on an acceptable level of confidence and, for a given sample size, calculate the margin of error.
  – Note, the margin of error is the largest difference we expect, at the chosen level of confidence, between the mean of the sample and the mean of the total population.
  – Also note, problems with larger variance (or standard deviation) require more samples for a given E and confidence level:
    • Consider the problem of estimating the man-hours required to design a 4,000 sq-ft custom home:
      – If it always took 1,000 hours, then we would only require 1 measurement to establish the man-hours required
      – The more variance between design jobs, the more samples we would need to achieve an accurate estimate of the mean duration

• Relationship between E and confidence limit:
  – greater confidence leads to higher error (E), eg: 95% confidence limit (95% of the area under the curve) gives E = 100 man-hours difference between sample mean and actual mean
  – lower confidence leads to lower error (E), eg: 80% confidence limit (80% of the area under the curve) gives E = 73 man-hours difference
  [Figure: two probability density curves marking the sample mean, the actual mean, and the confidence limits.]

• Relationship between E, confidence limit, and sample size:
  – smaller sample size = less certainty, eg: 95% confidence limit gives E = 100 man-hours difference
  – larger sample size = more certainty, eg: 95% confidence limit gives E = 53 man-hours difference
  [Figure: two probability density curves, a wide one for the smaller sample and a narrow one for the larger sample, marking the sample mean and the actual mean.]

• Example: a sample of 31 design firms provided the following data for the man-hours they required to design a 4,000 sq-ft custom home:
  – Sample mean: $SM = (\sum d) / n = 375$ man-hours, where:
    • d is the man-hours to design one home
    • n is the number of design firms sampled
  – Sample standard deviation: $SSD = \sqrt{\sum (d - SM)^2 / n} = 19$ man-hours
• What is the margin of error (E) for the expected (mean) man-hours if we want to be 90% (p = 0.9) confident that we are within the error?
  – the formula to calculate this is: $E = z_{0.9/2} \cdot (SSD / \sqrt{n})$
  – note: use (n-1) instead of n for small sample sizes (fewer than 30)
  – first find $z_{0.9/2}$, that is, find z for an area of 0.9/2 = 0.45 between the mean and z (use a look-up table such as http://www.intmath.com/Counting-probability/z-table.php)
  – thus, z = 1.65
  – thus $E = 1.65 \cdot (19 / \sqrt{31}) = 5.6$ man-hours
  – we are 90% confident that the expected value is between 375-5.6 man-hours and 375+5.6 man-hours
• What is the margin of error (E) for the expected (mean) man-hours if we want to be 95% (p = 0.95) confident that we are within the error?
  – $E = z_{0.95/2} \cdot (SSD / \sqrt{n})$
  – first find $z_{0.95/2}$, that is, find z for an area of 0.95/2 = 0.475 (use the look-up table at http://www.intmath.com/Counting-probability/z-table.php)
  – thus, z = 1.96
  – thus $E = 1.96 \cdot (19 / \sqrt{31}) = 6.69$ man-hours
  – we are 95% confident that the expected value is between 375-6.69 man-hours and 375+6.69 man-hours
• If we want to be 95% (p = 0.95) confident that we are within plus or minus 3 man-hours of the actual expected man-hours, how many samples would we need?
  – first, rearrange the formula: $n = [(z_{0.95/2} \cdot SSD) / E]^2$
  – then find $z_{0.95/2}$, that is, find z for an area of 0.475 (use the look-up table at http://www.intmath.com/Counting-probability/z-table.php)
  – thus, z = 1.96
  – thus $n = [(1.96 \cdot 19) / 3]^2 \approx 154$ samples
  – we have to collect an additional 154-31 = 123 samples
• After the 154 samples have been collected and analyzed, we could recalculate this (with the new SSD) to see if we have enough samples
  – but beware that this is a little like “keep collecting data until we get the answer we are looking for”!
• Class example:
  – Estimating the average absolute difference between bid price and actual price for a class of construction contracts.
• What if we want to measure more than one variable?
  – eg: we want to measure:
    • the average time spent designing the house, and
    • the fee charged for the design work, and
    • the square footage of the house
  – if the level of confidence and margin of error are the same for all variables, then apply the sample size formula to the variable with the largest sample standard deviation (SSD), since this is the variable that requires the largest sample size, and that sample size will thus satisfy all variables
  – if the level of confidence and/or the margin of error are different for the variables, then calculate the required sample size for each variable and take the largest answer.
• Up to now, we have been concerned with determining an appropriate sample size when determining the expected (mean, average) value of some continuously valued variable of the population:
  – eg: the mean duration to design houses of a given type?
  – …here, duration is a continuous variable
• But how do we determine the sample size for problems where we want to determine the portion of samples in a class?
  – eg: what percentage of urban planners have a doctoral degree?
  – …here, planners either have a PhD or they don’t; it is a categorical variable, not a continuous variable
• The approach is similar to before, but we have to use a slightly different formula:
  – $n = [z_{limit/2}^2 \cdot p \cdot (1-p)] / E^2$
  – where:
    • p is the portion of samples in one category, and (1-p) is the portion in the remaining categories
    • z is the standard normal value for the chosen confidence limit, and
    • E is the margin of error
  – this formula is essentially the same as before, the main difference being that it replaces the sample variance (SSD squared) with p ∙ (1-p)
• Eg: we want to know the portion of time a steel-fixer is either cutting reinforcing bars, bending reinforcing bars, or doing something else:
  – we don’t want to sit there and watch the worker as we have other things to do and, moreover, they may change their behavior;
  – so we make spot observations at randomly selected points in time
  – the question becomes, how many spot observations should we make?
  – say, for example, we have made 30 spot observations (at randomly selected times) over a couple of days and we got the following readings:
    • cutting reinforcing steel = 8 observations
    • bending reinforcing steel = 16 observations
    • doing something else = 6 observations
  – set “p” to the portion for the set of observations closest to a half (as this gives us the largest required sample size)
    • p = 16/30 = 0.533 (for bending reinforcing steel)
    • let’s say we want a confidence limit of 95%, so $z_{0.95/2} = 1.96$
    • let’s say we want a margin of error of E = ±5%
    • thus $n = 1.96^2 \cdot (0.533 \cdot (1 - 0.533)) / 0.05^2 \approx 382$ samples (an additional 352 as we have already completed 30)
    • when all spot observations have been made, we simply multiply the working hours in the day by the portion of spot observations for each task, to get the hours per day spent performing each task.
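The sample-size calculations in this section (margin of error for a mean, sample size for a mean, and sample size for a proportion) can be checked with a short sketch. The function names are illustrative only, and the z value is computed with Python's standard-library `statistics.NormalDist` in place of the look-up table, so results may differ slightly from the hand calculations, which use the rounded table values z = 1.65 and z = 1.96.

```python
import math
from statistics import NormalDist


def z_for_confidence(confidence):
    """Two-sided z value for a confidence level, eg: 0.95 gives about 1.96."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def margin_of_error(ssd, n, confidence):
    """E = z * SSD / sqrt(n), for the mean of a continuous variable."""
    return z_for_confidence(confidence) * ssd / math.sqrt(n)


def sample_size_for_mean(ssd, margin, confidence):
    """n = (z * SSD / E)^2, returned unrounded."""
    return (z_for_confidence(confidence) * ssd / margin) ** 2


def sample_size_for_proportion(p, margin, confidence):
    """n = z^2 * p * (1 - p) / E^2, returned unrounded."""
    z = z_for_confidence(confidence)
    return z ** 2 * p * (1 - p) / margin ** 2


# Design-firm example: SSD = 19 man-hours, n = 31 firms.
print("E at 90%:", round(margin_of_error(19, 31, 0.90), 2))
print("E at 95%:", round(margin_of_error(19, 31, 0.95), 2))
print("n for E = 3 man-hours at 95%:", round(sample_size_for_mean(19, 3, 0.95)))

# Steel-fixer example: p = 16/30, E = 5%, 95% confidence.
print("spot observations needed:", round(sample_size_for_proportion(16 / 30, 0.05, 0.95)))
```

The functions return unrounded values. The worked examples round to the nearest whole sample (154 and 382); applying `math.ceil` instead is the more conservative choice, since it never falls short of the requested margin of error.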
• What if we want to know whether or not the variance in the means between several groups is significant:
  – eg: we want to know if the type of construction project (commercial, industrial, infrastructure) affects the mean number of fatalities per year (measured, say, per million man-hours).
  – the expected number of fatalities may vary due to differences in the types of work, but differences in the sampled data may just be due to random sampling variation
  – we can solve this type of problem using ANOVA (ANalysis Of VAriance between groups), the F-test, originally developed by Fisher (hence the F in F-test).
• What if there is more than one dependent variable:
  – eg: we want to know if the type of construction project (commercial, industrial, infrastructure) affects the mean number of fatalities per year and the mean number of non-fatal accidents:
  – we can solve this type of problem using MANOVA (Multivariate ANalysis Of VAriance between groups)
• We have just touched the surface of the topic of sampling theory
  – for a detailed reference, see “Sampling”, 2nd Edition, by Steven K. Thompson

What do we use experimentation for?
• for theorizing:
  – to provide us with the insight necessary to develop a theory that describes the causal (cause-effect) relationships between the independent and dependent variables;
  – to test the theory we just developed, or indeed any theory of interest (we compare experimental results with those predicted by the theory);
  – Note, experimental theorizing is usually based on some abstraction of the real system under investigation, eg:
    • a physical construction/representation of part of the system under investigation (such as immersing a brick in water as a representation of a brick wall)
    • an empirically derived model that we can experiment with (such as a regression model of house prices)
• and for empirical modeling (eg: the regression model developed for assignment 1):
  – to build or develop an empirical model that describes the relationships between the independent and dependent variables;
  – to evaluate the performance of alternative models (to select the best); and
  – to validate the model (test its accuracy in all respects)

Do not confuse experimentation with ex-post-facto research:
• Experimentation is something that is planned and conducted with the purpose of understanding relationships between the system variables.
  – it tends to lead to a high degree of internal validity (accuracy within the boundaries of the experiment), but does not extrapolate to the real world as well as ex-post-facto research
• Ex-post-facto research uses data collected from a system that has not been established specifically for understanding the relationships between its variables, eg:
  – productivity observations for tasks in a given construction project
  – evaluation of accidents based on data collected by OSHA
  – it usually has a higher external validity than experimentation, but tends to have less internal validity

Both experimentation and ex-post-facto approaches can be used for ‘theorizing’ and ‘empirical modeling’.
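As a rough illustration of the empirical modeling steps listed above (build a model, then validate it), the following sketch fits the simple linear regression form y = m.x + c by ordinary least squares and checks it against data held back from the fit. The house sizes and prices are invented for illustration, and the split into a fitting set and a validation set is an assumption of this sketch, not a procedure prescribed in the lecture.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = m*x + c; returns (m, c)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


def mean_abs_error(m, c, xs, ys):
    """Average absolute gap between observed y and the model's prediction."""
    return sum(abs(y - (m * x + c)) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical observations: square footage and sale price in $1,000s.
fit_sqft = [1200, 1500, 1800, 2100, 2400, 3000]
fit_price = [152, 181, 214, 239, 275, 338]
check_sqft = [1650, 2700]   # held back from the fit, used only for validation
check_price = [196, 305]

m, c = fit_line(fit_sqft, fit_price)
print("slope m ($1,000s per sq ft):", round(m, 4))
print("intercept c ($1,000s):", round(c, 1))
print("error on fitting data:", round(mean_abs_error(m, c, fit_sqft, fit_price), 2))
print("error on held-back data:", round(mean_abs_error(m, c, check_sqft, check_price), 2))
```

Comparing the error on the fitting data with the error on the held-back data is one simple way to evaluate alternative models: a model that is much worse on the held-back data is unlikely to be accurate across the whole problem domain.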
Replication:
• A desirable feature of research, in particular experimentation, is replication:
  – this is necessary so that others can validate (or counter) your conclusions, and
  – more specifically, replication of an experiment (under identical treatments) facilitates:
    • identification of human error (mistakes made by the experimenter that can be avoided);
    • identification of systematic error (errors that are reproducible and inherent in the experiment, eg: bias when recording room temperature resulting from heating of the air by the measuring equipment); and
    • estimation of experimental error (random effects from unknown factors, eg: error when recording room temperature resulting from random voltage fluctuations in the measuring equipment);
  – Replication of an experiment allows the experimental error to be measured more accurately:
    • often this is measured as the Standard Error, which is an estimate of the standard deviation of the errors.
• Replication requires meticulous care in recording all details of the experiment.
• In any case, it is important to take great care in setting up and executing an experiment, as well as in recording the results, so that no unnecessary errors or misconceptions are introduced into the results and conclusions.

4.4 Qualitative Approaches

Aim of qualitative research:
• to obtain an in-depth understanding of human behavior and the factors that govern human behavior.
• it investigates the ‘why’ and ‘how’ of decision making, not just ‘what’, ‘where’, and ‘when’
• it typically relies on four methods for gathering information:
  – participation in the setting,
  – direct observation,
  – in-depth interviews, and
  – analysis of documents and materials.

4.5 Quantitative Approaches

Aim of quantitative research:
• the identification and quantification of phenomena and their relationships
• it attempts to develop mathematical models of certain phenomena, to understand past systems or predict the behavior of future systems.
• two basic questions of a quantitative approach are:
  – what is to be measured, and
  – how should those measurements be made?

4.6 Collecting Data from Respondents

A lot of the research in our discipline (Design, Construction, and Planning) involves asking people questions. This is typical of research in the social and/or management sciences. The data collection takes the form of:
• questionnaires
• interviews, and
• case studies

Each approach is suited to a different class of problems, as follows:
• first, the stage of development of the research area:
  – research fields that are at an early stage require better understanding at a qualitative level
    • in this case, use the interview format since it allows for: (i) flexibility during the questioning (the interviewer can adapt to the responses being given, or probe if the interviewee is hesitating); (ii) longer answers (as it uses more open-ended questions), which interviewers can transcribe (interviewees may resist giving long answers if they have to write them down);
  – research fields that are at a later stage require better understanding at a quantitative level
    • in this case, use the questionnaire format since (given the closed-ended question style, and check box responses to many questions) it allows for: (i) more questions to be asked; (ii) larger samples to be taken of the population, to provide statistical significance
• second, the scope and depth of the study:
  – the scope is how widely applicable the study is:
    • examples of broad scope studies: ones covering all geographic areas, or all age groups, or companies from small to large size
  – the depth of a study is how much detail it goes into:
    • in-depth studies identify more significant variables, and tend to quantify relationships
  [Figure: Depth of Study against Scope of Study. Case studies (one or a very few samples) have the greatest depth and narrowest scope; interviews (few samples) fall in between; questionnaires (large sample) have the broadest scope and least depth.]
• The previous diagram is a generalization. For example, it is possible for interviews to have a broader scope than questionnaires.

Questionnaires:
• A written or screen-based series of questions
• The questions may be:
  – Closed: asking specific things, usually with a set of checkbox answers, for example:
    • How many employees does your company employ?: less than 10; 10 to 99; 100 to 1,000, etc…
    • Project planning software helps keep projects on schedule - do you: strongly disagree; disagree; neutral; agree; strongly agree.
  – Open: asking general questions without a prescribed set of answers (useful for getting to understand the issues and key variables in a problem), for example:
    • What do you believe are the most important factors affecting project progress?
    • Explain your answers to the previous question.
• Some general advice on writing questionnaires:
  – Use as few questions as possible, to maximize response rate
  – Keep questions easy to understand, using simple unambiguous language. Avoid using scientific or technical terms unless you are sure the responder will understand.
  – Make sure the questions cover the scope of the problem under investigation
  – For multiple choice questions:
    • keep the scale the same from question to question as far as possible (to avoid causing confusion)
    • always keep the direction the same, for example, use lower values for more negative responses and higher values for more positive responses: (disagree, neutral, agree); or (-3, -2, -1, 0, +1, +2, +3); or (1-9, 10-99, 100-1,000)
  – Where appropriate, give the responder the choice of “other” with a prompt to explain. This should be employed when you are not sure of all the possible responses.
  – Use an appropriate mix of open and closed questions: start with closed questions so that the responder is not put off, and starts making a commitment.
  – You can repeat questions to make sure you get the same response, but phrase them differently to disguise this fact.
– Be very careful to make sure your questions do not lead or imply a preferred answer. The following is a leading, and thus poor question: • More women should be employed as project leaders to correct the disparity: disagree / agree • A better way of phrasing this question might be: the ratio of women and men employed as project leaders is: unbalanced / balanced …then follow-up with… if your response was “unbalanced” please explain. 60 • Consider performing a pilot study to: – check the clarity of questions, and see of they contain any ambiguity – to make an initial statistical analysis to compute the number of additional questionnaires that should be sent out (the sample size) – to see if the open ended questions suggest things that you had not thought of. • Test the questionnaire on people close to you to verify the questions and get ideas for other questions to be included. 61 • Questionnaire writing is time consuming and should be performed with the greatest of care: – otherwise you will get biased and incomplete responses to the questions, and a poor response rate: • postal responses may be as low as 25% to 33% • low response rates lead to biased results since there may be something different about those that choose not to respond (maybe they are the busy ones and have a different perspective on the problem you are studying) 62 • Make sure you select an appropriate sample size, balancing cost of the study against statistical value • Follow-up non-responders to try to maximize your response rate • Try to determine any pattern in the responders and non-responders to see if there may be some resultant bias in the answers • A good reference on designing questionnaires is: – “Improving Survey Questions: Design and Evaluation (Applied Social Research Methods)”, by Floyd J. 
Fowler, Sage Publications Inc., 1995

Interviews:
• Interviews can range from structured, through semi-structured, to unstructured:
  – the distinction concerns the extent to which the interviewer directs the subject of the interview.
• Structured interviews:
  – the interviewer asks a set of specific questions without much prompting;
  – the questions leave little room for improvisation;
  – they are best suited to problems that are fairly well defined, where we are trying to quantify the relationships between system variables;
  – in this sense, they have some similarity to the questionnaire method, noting:
    • structured interviews allow for a better response to individual questions, since the interviewer can at least prompt for an answer or clarify a question (however, avoid leading the answer);
    • but structured interviews (as with any type of interview method) can only be used for a relatively small sample size, due to the human effort involved in obtaining the responses;
  – eg: homeowners' attitudes towards community issues:
    • here, we may have a good idea of the issues, but we want to quantify them;
    • we can go door to door (or choose a random subset) without too much effort, and thus maximize the response rate;
    • we are confined to a small sample size, that of the neighborhood.
• Unstructured interviews:
  – in the extreme case, the interviewer introduces the topic briefly, and then allows the interviewee to indulge in a monologue;
  – prompting from the interviewer may be limited to encouraging the interviewee to continue, or to asking for more information on responses that sound promising;
  – they are particularly well suited to research topics that are not well defined (good for probing a topic area to get a feel for the problem);
  – in this sense, the unstructured interview is very much a qualitative tool;
  – eg: we note a large difference in the ability of planners (within a company) to accurately schedule construction work, based on their experience:
    • we could use an unstructured interview to determine what it is that the planners are doing differently;
    • we do not have any specific ideas as to what experience gives planners to make them better planners, so the interview format should be flexible enough to ascertain this knowledge;
    • only a small number of planners are employed, making the interview format feasible.
• Semi-structured interviews:
  – these fall somewhere between the two extremes above:
    • such as a list of questions, but with some probing for more in-depth answers to interesting responses;
    • or a list of sub-topics within a topic, for which the interviewee's responses are required.
• It is good to tape record the respondents' comments:
  – this reduces the risk of missing something in transcription;
  – but get permission from the interviewee first.

Case Studies:
• Typically, these employ a variety of data collection techniques:
  – interviews may be part of this, along with:
  – collection of documentary data;
  – observation of on-going events:
    • sitting in on meetings;
    • observing design or construction activity.

Triangulation:
• This is the use of two or more methods to investigate the same thing:
  – eg: observing on-going construction operations (using, say, time-lapse photography) and interviewing construction managers about issues they have with these same operations;
  – it has the advantage of allowing one set of data to be validated against another set; and
  – it allows slightly different perspectives of the same problem to be obtained, thereby maximizing insight into the problem.
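As a rough illustration of validating one set of data against another, the sketch below compares durations measured by observation with the durations that interviewees estimated for the same operations. The function name `cross_check`, the 20% tolerance, the operation names, and all of the numbers are assumptions made up for this example; they are not from any real study.

```python
from statistics import mean


def cross_check(observed, reported, tolerance=0.20):
    """Compare two independent data sets about the same operations.

    observed:  dict, operation name -> list of measured durations (minutes)
    reported:  dict, operation name -> duration estimated in interview (minutes)
    tolerance: relative difference above which the sources are flagged
               as disagreeing (an arbitrary, assumed threshold)
    Returns a dict: operation -> (observed mean, reported value,
                                  relative difference, agrees?)
    """
    results = {}
    # Only operations covered by both methods can be cross-checked.
    for op in sorted(set(observed) & set(reported)):
        obs_mean = mean(observed[op])
        rel_diff = (reported[op] - obs_mean) / obs_mean
        results[op] = (obs_mean, reported[op], rel_diff, abs(rel_diff) <= tolerance)
    return results


# Made-up illustrative numbers, not real measurements.
observed = {"formwork": [50.0, 60.0, 70.0], "rebar": [40.0, 40.0]}
reported = {"formwork": 45.0, "rebar": 42.0}

for op, (obs, rep, diff, ok) in cross_check(observed, reported).items():
    print(f"{op}: observed {obs:.1f}, reported {rep:.1f}, diff {diff:+.0%}, agree={ok}")
```

A flagged disagreement does not say which source is wrong; it only marks an operation where the two methods give different pictures and a closer look may be worthwhile.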
4.7 Observing System Behavior

Observing systems:
• almost always this involves existing systems that include people:
  – eg: studying an office information handling system (filing, data storage, data exchange) to identify inefficiencies and improved methods or resource allocations
    • this may involve modeling the observed system to see how it performs under other circumstances
  – eg: studying operators driving equipment to identify better operating procedures or control layout (ergonomic studies)
    • may set up the system specifically for the study (such as the ergonomic study, but using benchmark tasks)
      – this is essentially experimentation
• in special cases, there may be no people involved:
  – eg: observations of automated processes
• A major difference between surveys and observation of human-based systems is that the latter does not deal with people's opinions:
  – however, people may still bias the results by behaving differently under observation (the Hawthorne Effect)
• Recording devices:
  – cameras (still, video, time-lapse), etc… possibly operated remotely, such as over the web…
• Typical examples of measured parameters:
  – event timing;
  – task durations (from events or spot readings);
  – resource flow bottlenecks;
  – idle time;
  – process conflict (such as for safety);
  – user frustrations with aspects of the tasks;
  – etc…
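Two of the measured parameters above, task durations and idle time, can be derived from a log of timed events. The following is a minimal sketch, assuming each event is recorded as a `(time, task, kind)` tuple with `kind` equal to `"start"` or `"end"` and times in minutes from the start of the observation. The function names, the log format, and the sample events are all hypothetical and made up for illustration.

```python
def task_durations(events):
    """Pair each task's start and end events.

    events: list of (time, task, kind) tuples, kind is "start" or "end".
    Returns a list of (task, start, end, duration), ordered by start time.
    """
    open_starts = {}
    tasks = []
    for time, task, kind in sorted(events):
        if kind == "start":
            open_starts[task] = time
        elif kind == "end" and task in open_starts:
            start = open_starts.pop(task)
            tasks.append((task, start, time, time - start))
    return sorted(tasks, key=lambda t: t[1])


def idle_time(tasks, period_start, period_end):
    """Total time within the period during which no task was in progress."""
    busy_until = period_start
    idle = 0.0
    for _, start, end, _ in sorted(tasks, key=lambda t: t[1]):
        if start > busy_until:
            idle += start - busy_until  # gap before this task began
        busy_until = max(busy_until, end)
    if period_end > busy_until:
        idle += period_end - busy_until  # gap after the last task
    return idle


# Made-up observation log for a single resource over a 60-minute period.
events = [
    (0, "load", "start"), (12, "load", "end"),
    (15, "haul", "start"), (40, "haul", "end"),
    (40, "dump", "start"), (45, "dump", "end"),
]

tasks = task_durations(events)
for task, start, end, duration in tasks:
    print(f"{task}: {start}-{end} min, duration {duration} min")
print("idle time:", idle_time(tasks, 0, 60), "min")
```

Spot readings (noting what is happening at fixed or random instants) would need a different treatment, since they estimate the proportion of time spent in each state rather than recording exact start and end times.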
