IASE/ISI Satellite, 1993: Clifford Konold

From Brunelli, Lina & Cicchitelli, Giuseppe (editors). Proceedings of the First Scientific Meeting (of the IASE). Università di Perugia (Italy), 1994. Pages 199-211. Copyright holder: University of Perugia. Permission granted by Dipartimento di Scienze Statistiche to the IASE to make this book freely available on the Internet. This pdf file is from the IASE website at http://www.stat.auckland.ac.nz/~iase/publications/proc1993. Copies of the complete Proceedings are available for 10 Euros from the ISI (International Statistical Institute). See http://isi.cbs.nl/sale-iase.htm for details.

UNDERSTANDING PROBABILITY AND STATISTICAL INFERENCE THROUGH RESAMPLING

Clifford Konold
Scientific Reasoning Research Institute
Hasbrouck Laboratory, University of Massachusetts
Amherst, MA 01003, USA

1. Introduction

During the past four years, I have been developing curricula and computer software for teaching probability and data analysis at the introductory high-school and college level. The approach I've taken emphasizes the use of real data, where "telling a story" takes priority over testing hypotheses, and in which mathematical formalism is kept to a minimum (see Cobb, 1992; Scheaffer, 1990; Watkins et al., 1992). A major question I have considered is how probability ought to be integrated into this new data-rich curriculum.

There are two major reasons for keeping probability in the data-analysis (or statistics) curriculum. First, at some point in the process of constructing theories that account for patterns in data, it is important that students consider alternative explanations. Among these is the possibility that some outcome of interest resulted from chance. Second, probability is an important concept in its own right (Falk and Konold, 1992). It comprises a world view and should not be viewed as a necessary evil that must be faced if students are to understand statistical inference.
In deciding how to relate probability and data analysis, I have adopted an approach Julian Simon began advocating in the late 1960s. Simon describes his approach as having grown out of his frustration watching graduate students do silly things when trying to test a statistical hypothesis (Simon and Bruce, 1991). He began designing experiments from which he hoped they could build up sound probabilistic understandings. These eventually developed into a resampling approach that promised a more intuitive take on probability and data analysis, and which made the connections between the two fields more apparent. Of course, Simon didn't invent Monte Carlo methods, nor the randomization tests he would come to employ, but he was among the first to see their educational potential, and long before the computer was widely available.

Rather than elaborate Simon's argument here, I briefly describe two software tools we've developed, highlighting aspects that emphasize the relation between probability and data analysis. I also report some results from our primary test site, a high school in Holyoke, Massachusetts.

2. Modeling a problem with Prob Sim

Most educational probability-simulation software comprises several ready-made models (e.g., coin model, die model). Students load the appropriate model, draw samples, and then see results displayed. The software thus offers empirical demonstrations of various facts and principles, such as the law of large numbers and the binomial distribution. The software we have designed, "Prob Sim", includes no ready-made models. Rather, the student must build the model, specifying the appropriate sampling procedure and analyses in order to estimate the probability of some event. The process of building a simulation model is at least as important as, if not more important than, drawing the appropriate conclusions from the results.
To illustrate, I'll describe one of our activities entitled "LAPD" (see Konold, 1993, for another example). Students begin by reading excerpts from The New York Times account (March 18, 1991) of the beating of Rodney King by officers of the Los Angeles Police Department. After reporting that "at least 15 officers in patrol cars converged on King", the article broaches one of the issues that made this incident explosive: "In what other police officers called a chance deployment, all the pursuing officers were white. The force, which numbers about 8,300, is 14% black". Students are asked to build a model of this situation to estimate the probability of finding no blacks in a random sample of 16 officers. This problem has generated lively discussions in our test sites. When students care deeply about a problem, they are more willing to persist through difficulties. Moreover, they learn that application of probability theory is not limited to rolling dice and blindly drawing socks out of dresser drawers.

2.1 Building a model

I demonstrate below the various stages through which students progress in modeling this situation, illustrating the steps with screen shots from Prob Sim. In Fig. 1, a mixer has been filled according to the information provided in the article. There are a total of 8,300 elements, 1,162 of them labeled B (Black) and 7,138 labeled N (Not black). The non-replacement option has been selected to preclude the possibility of having the same element (officer) in a sample more than once.

Figure 1. A sampling model for the LAPD problem

The Run controls on the far left show the sample size set at 16, and the number of repetitions set (somewhat arbitrarily) at 100. After the Run button is pressed, the computer draws 16 elements from the mixer without replacement, repeating this a total of 100 times.
This is analogous to looking at 100 occasions when 16 randomly-selected officers appeared on a crime scene. The sampling process is animated in the lower part of the Mixer window. The results of each repetition are displayed in a Data Record window (not shown). After the block of 100 repetitions has been drawn, results can be analyzed as shown in Fig. 2. The Analysis window in this case shows the number (and proportion) of occurrences of each of the 17 possible unordered outcomes. Ten percent of the samples had 0 B's. Thus, a first estimate of the probability that a "chance deployment" would include no black police officers is .10.

Figure 2. The Analysis window showing the results of 100 repetitions

2.2 Repeating the experiment

In many textbook examples of simulation, the experimental component ends here. But it is important not only to estimate the probability of some event, but to get some sense of the range (or variability) in that estimate given the number of repetitions. Indeed, the notion of chance is not apparent in a value per se, but in the variability of results over replications of an experiment. To replicate an experiment in Prob Sim, the student has only to press the Run button again. Another random sample is drawn, and the Analysis window updates to show the new results. Students replicate the experiment about 10 times, plotting the results on a line plot which helps to emphasize the variability in the process. They then pool their results to come up with a final probability estimate. For the 10 results plotted in Fig. 3, the pooled estimate of P(0 blacks) is .109.

Figure 3. Line plot of results of 10 replications (horizontal axis: percentage of samples with no B's)
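The sampling model and its replications can also be sketched outside Prob Sim. The Python below is only an illustration under stated assumptions, not Prob Sim itself: the names `run_block` and `exact_probability` and the seed are invented for the example, while the mixer contents (1,162 B's among 8,300 elements), the sample size of 16, and the 100 repetitions per block come from the description above. The sketch also computes the exact hypergeometric probability for comparison; this is close to .09, a little below the pooled simulation estimate.

```python
import math
import random

N_OFFICERS = 8300    # size of the force quoted in the article
N_BLACK = 1162       # 14% of 8,300, the B elements in the mixer
SAMPLE_SIZE = 16
REPETITIONS = 100    # one press of the Run button


def run_block(rng, repetitions=REPETITIONS):
    """Return the proportion of samples of 16 that contain no B elements."""
    mixer = ["B"] * N_BLACK + ["N"] * (N_OFFICERS - N_BLACK)
    no_black = 0
    for _ in range(repetitions):
        sample = rng.sample(mixer, SAMPLE_SIZE)  # draws without replacement
        if "B" not in sample:
            no_black += 1
    return no_black / repetitions


def exact_probability():
    """Hypergeometric probability of 0 B's in 16 draws without replacement."""
    not_black = N_OFFICERS - N_BLACK
    return math.comb(not_black, SAMPLE_SIZE) / math.comb(N_OFFICERS, SAMPLE_SIZE)


if __name__ == "__main__":
    rng = random.Random(1993)                     # arbitrary seed (assumption)
    blocks = [run_block(rng) for _ in range(10)]  # ten replications
    print("block estimates:", blocks)
    print("pooled estimate:", sum(blocks) / len(blocks))
    print("exact value:", round(exact_probability(), 4))
```

Printing the ten block estimates side by side shows the variability that the line plot is meant to convey; the pooled value is simply their mean because every block has the same number of repetitions.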
In the process of model building, various assumptions are made. Students may need to be prompted, e.g., to consider how the fact that police officers often arrive in pairs might affect the probability in question if partners tend to be of the same race. If they can, they build and run models consistent with these new assumptions and discuss the implications of their results.

It is through addressing the assumptions incorporated into any model that students begin to understand what the process of modeling involves. In order to determine the predictive value of the simulation results, students must decide if and how various considerations affect the real situation. In some cases they can alter the model to take into account an additional factor (e.g., have police arrive on the scene in same-race pairs). If they can't do this, they can often predict the direction of biases introduced by their simplified model. Through this process of comparing a model to the larger situation, students come to realize that although they can't avoid simplifying assumptions, the more aware they are of the limitations of a particular model, the more informative that model becomes.

At the end of the lab, students discuss the problem of deciding, based on their information, whether it was just chance that none of the responding officers were black. And, of course, the probability of the event is only one of the relevant pieces of information. More important are the subsequent testimony of the officers and the sensibility of any conspiracy theories that explain why black officers weren't on the scene. In other words, the probability of some event is never the only or last word in making some decision related to that event. Such discussions seldom occur until instruction moves beyond the realm of coins and dice. Prob Sim is an important tool in supporting discussion of this type, because it can be used to model engaging problems too complex for students to tackle formally. Prob Sim's simplicity, and the speed with which sufficient data can be generated, gives students the time to design a sampling model, collect adequate data, draw conclusions, and discuss implications of the findings. In the next section, I show how we build on ideas introduced in modeling probabilistic situations when we move on to data analysis.

3. Data analysis using DataScope

We designed a data-analysis program called "DataScope" for use in introductory courses stressing exploratory data analysis in which students work with real data. Our primary objective is for students to learn basic data-analysis techniques for exploring relationships among various variables. By using multiple-variable data sets, we give students the opportunity to explore questions of particular interest to them.

DataScope encourages students to make initial judgments of relationship by visually comparing plots. This is demonstrated below using data obtained from a questionnaire administered to 84 students in two high schools in western Massachusetts: Amherst Regional and Holyoke High. Amherst is a small college town, while Holyoke is a larger, industrial city. The information collected on each student included gender, age, birth order, family size, marital status of parents, religious activity, rating of school performance, educational level of parents, curfew times, working hours and wages, and time spent on homework. Students spend about two weeks exploring various questions in this data set, among them the question of whether holding an after-school job adversely affects school performance.
DataScope encourages exploration by allowing students to form subgroups of some variable based on the values of some other, related, variable. For example, Fig. 4 shows the box plots for hours of homework (HWHRS) and hours spent working (JOBHRS) for Amherst and Holyoke students (in the case of JOBHRS at Amherst, the median is the same as the 1st quartile, as indicated by the double-thick line). The number of cases in each box plot is displayed to the right of the plot. This is important to include, as students will frequently draw conclusions without considering the number of cases represented by each box plot. The plots below suggest that Amherst students spend more time on homework than working a part-time job, and that the opposite is true of Holyoke students. This trend is consistent with common stereotypes of the towns. It is tempting to conclude from Fig. 4 that those with jobs spend less time on homework than those without jobs. However, when HWHRS are "grouped" by the categorical variable JOB, students discover the reverse appears to be true - students with a job studied an average of three hours per week more than those without a job.

Figure 4. Box plots of homework hours and job hours for Amherst and Holyoke students (panels include HWHRS, Amherst, n=31 and HWHRS, Holyoke, n=51; axis: hours per week)

Figure 5. Box plots of homework hours for students with and without jobs (job = no, n=26; job = yes, n=56; axis: homework hours per week)

At this point, students reconsider their expectations and develop theories that might explain these data. One possibility is that those who work also study more because they have learned to manage their time effectively. Or maybe the students with jobs are older and have more homework assigned.
Some of these explanations can be investigated by looking at other variables in the data set. However, among the explanations to consider is the possibility that the difference was due to the "luck of the draw" - that with a better or larger sample, no difference would be found. In my experience, students do not spontaneously raise this possibility, even when they have been using resampling techniques, as described above, to estimate probabilities. They will judge differences between two medians as important in one case, and unimportant in another, apparently making the judgment based on the distance between the two medians, as it appears on the computer screen. They do not spontaneously evaluate the difference with respect to the variability (i.e., to the IQRs as shown in the box plots). Therefore, before showing them how we might determine the probability of a difference occurring by chance, I typically must remind them that this is a possibility.

3.1 Randomization tests in DataScope

Below is a demonstration of how DataScope is used to estimate the one-tailed probability of observing a difference at least as large as the one observed in the sample above. The method is based on the randomization procedure as originally developed by Fisher (see Barbella et al., 1990). This involves randomly reordering one of the variables (without replacement) to estimate the probability of the observed difference under the null hypothesis. With DataScope, the student can first do this "manually", to develop a sense of what the computer is doing. A "reorder" command randomly reorders the values of one of the selected variables (in this case, the values for HWHRS). That is, the values of HWHRS are randomly reassigned to cases.
Once reassigned, the variable name appears in the data table with the symbol O on both sides to remind the student that the column has been randomized (another command will restore the original order). A box plot of this randomly ordered variable, again grouped by the job variable, can be viewed. Given that the values of HWHRS have been randomly assigned, any difference between the medians of the two groups (with or without jobs) is due to chance. In the example shown in Fig. 6, this difference is -1 (subtracting the median of the upper plot from the median of the lower plot).

Figure 6. Homework hours randomly reordered in job and no-job groups (job = no, n=26; job = yes, n=56)

Once students understand the procedure, the computer can be instructed to repeatedly reorder the variable and compute the difference between medians of the job and no-job groups, recording these in a new data table as illustrated in Fig. 7. In this case the computer has been instructed to draw 100 random samples. The first few differences that were obtained are shown in Fig. 7 in the "Resampling" data table. The first value is the observed difference, -3. In the background you can see the variables selected (V1 and G1) in the primary data table. The values in HWHRS are being randomly reordered.

Figure 7. Table of differences between median HWHRS for job and no-job groups in successive resampling runs

After the 100 random differences have been obtained, the results can be displayed in a histogram as shown in Fig. 8. In this instance, 25 of the samples had a difference at least as large as -3. Thus, an estimate of the one-tailed p value is .25. Additional repetitions could be conducted for a more precise estimate.
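The reorder-and-compare procedure can be sketched in a few lines of Python. This is not DataScope's implementation: the function names `median_difference` and `randomization_test` are hypothetical, and the homework hours in the demonstration are invented, since the survey data are not reproduced here. The sketch takes the difference as the job median minus the no-job median and counts the reorderings whose difference is at least that large, which is one reading of the one-tailed comparison described above.

```python
import random
import statistics


def median_difference(values, groups):
    """Median of the "yes" group minus median of the "no" group."""
    yes = [v for v, g in zip(values, groups) if g == "yes"]
    no = [v for v, g in zip(values, groups) if g == "no"]
    return statistics.median(yes) - statistics.median(no)


def randomization_test(values, groups, repetitions=100, rng=None):
    """Estimate a one-tailed p value by repeatedly reordering `values`."""
    if rng is None:
        rng = random.Random()
    observed = median_difference(values, groups)
    reordered = list(values)       # copy, so the original column is untouched
    at_least_as_large = 0
    for _ in range(repetitions):
        rng.shuffle(reordered)     # random reassignment without replacement
        if median_difference(reordered, groups) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / repetitions


if __name__ == "__main__":
    # Invented homework hours and job status, NOT the survey data.
    hwhrs = [2, 4, 5, 7, 3, 6, 8, 10, 9, 12, 5, 11]
    job = ["no"] * 5 + ["yes"] * 7
    observed, p = randomization_test(hwhrs, job, repetitions=100,
                                     rng=random.Random(8))
    print("observed difference:", observed)
    print("estimated one-tailed p value:", p)
```

Raising `repetitions` narrows the run-to-run variation in the estimated p value, which is the point made above about conducting additional repetitions.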
This same procedure is used in DataScope to test the statistical significance of a value of r and of frequency counts in a 2 x 2 table.

Figure 8. Sampling histogram displaying results of 100 resamplings (axis: difference between medians)

4. Educational outcomes

I have found, as have Simon and Bruce (1991), that students are enthusiastic about probability and statistical inference when approached through resampling. But do students using this approach learn more than they do in a traditional course? Simon et al. (1976) compared student performance in courses taught using resampling vs. conventional methods. Given problems that could be solved using methods taught in either course, students in courses using the resampling approach consistently outscored students using the traditional approach.

Many or most students who take an introductory course will never need to conduct a statistical test or determine a probability precisely. They do, however, as members of a complex and increasingly technological society, require a basic understanding of uncertainty and the savvy to evaluate "research" claims in the mass media. Accordingly, though students should be able to solve problems using methods they have been taught, it is even more important that they understand basic concepts which underlie these methods. Konold and Garfield (1993) have developed items to assess understanding of these basic ideas. Below is one of our problems, adapted from an item in Falk (1993, p. 111).

The Springfield Meteorological Center wanted to determine the accuracy of their weather forecasts. They searched their records for those days when the forecaster had reported a 70% chance of rain. They compared these forecasts to records of whether or not it actually rained on those particular days.
The forecast of 70% chance of rain can be considered very accurate if it rained [...]

[...] problem as [...] outcome of a single trial, I have referred to this as the "outcome approach" (Konold, 1989).

Figure 9. Frequency of responses before instruction of 193 students to the weather problem

One of my instructional objectives, therefore, is for students to realize that a probability value typically tells us little or nothing about results in the short run, but a great deal about results in the long run. Fig. 10 compares the results on this same problem before (black) and after (gray) instruction. Correct responses increased only 6% with instruction. The results are similar across the majority of our assessment items. At a deeper level, many students after instruction using resampling appear unaware of the fundamental nature of probability and data analysis.

Figure 10. Frequency of responses before (black) and after (gray) instruction of 199 students to the weather problem (axis: selected range)

[...] formalisms. Perhaps looking at development over a single course is too small a unit of analysis in the case of [...] and we should be thinking about series of courses over which we can expect to effect and observe conceptual change.

Bibliography

Barbella P., Denby L. and Landwehr J.M. (1990), Beyond exploratory data analysis: The randomization test, Mathematics Teacher, 83, pp. 144-149.

Cobb G.W. (1992), Teaching statistics: More data, less lecturing, in L. Steen (ed.), Heeding the Call for Change (MAA Notes #22, pp. 3-43), Mathematical Association of America.

Falk R. (1993), Understanding probability and statistics: Problems involving fundamental concepts, Wellesley, MA: AK Peters (to appear).

Falk R. and Konold C. (1992), The psychology of learning probability, in F.S. Gordon and S.P. Gordon (eds.), Statistics for the twenty-first century (MAA Notes #26, pp. 151-164), Mathematical Association of America.

Konold C. (1993), Teaching probability through modeling real problems, Mathematics Teacher (to appear).

Konold C. (1989), Informal conceptions of probability, Cognition and Instruction, 6, pp. 59-98.

Konold C. and Garfield J. (1993), Statistical Reasoning Assessment. Part 1: Intuitive Thinking, Scientific Reasoning Research Institute, University of Massachusetts, Amherst.

Scheaffer R.L. (1990), Toward a more quantitatively literate citizenry, The American Statistician, 44(1), pp. 2-3.

Simon J.L. and Bruce P. (1991), Resampling: A tool for everyday statistical work, Chance: New Directions for Statistics and Computing, 4(1), pp. 22-32.

Simon J.L., Atkinson D.T. and Shevokas C. (1976), Probability and statistics: Experimental results of a radically different teaching method, American Mathematical Monthly, 83, pp. 733-739.

Watkins A., Burrill G., Landwehr J. and Scheaffer R. (1992), Remedial statistics?: The implications for colleges of the changing secondary school curriculum, in F.S. Gordon and S.P. Gordon (eds.), Statistics for the twenty-first century (MAA Notes #26, pp. 45-55), Mathematical Association of America.

Note

The curriculum materials and research described in this article were supported by a grant from the National Science Foundation (Grant #MDR8954626). The opinions expressed here are the author's and not necessarily those of the Foundation.
