Astronomical Laboratory 29:137
Fall 2007

Project 1: Introduction to Probability, Statistical Inference, Instrumentation, Data Analysis, and Model Fitting

The goals of this lab are to introduce the student to simple statistical ideas, with illustrations in experimental astronomy; the use of laboratory data acquisition and instruments; and elementary data analysis techniques, especially fitting models to data and determining associated uncertainties and goodness-of-fit criteria.

Experimental Procedure

Part 1: Probability and Statistics Problems

1. Someone [e.g., the State of Iowa] offers the following wager: Place a $1 bet and choose 5 numbers between 1 and 30. If all 5 numbers are guessed correctly (in any order), you win $1 million. The numbers do not repeat (they are unique).
   a. Is this a good bet? [calculate the odds]
   b. Suppose you only had to choose 4 numbers correctly. Is this a good bet? [odds?]

2. Find a coin in your pocket.
   a. Toss it 100 times. Record the number of heads.
   b. Collect the results from other students in your team; enter these in your lab notebook.
   c. Calculate the mean and standard deviation. Compare with the expected results (for a large number of trials).
   d. Calculate the probability that the number of heads exceeds 60.
   e. If you had flipped a coin $10^6$ times, what is the probability that the number of heads would exceed 600,000?

3. Suppose you have a drawer infinitely full of socks: red, green, and blue in equal numbers.
   a. What is the probability of randomly choosing a pair of the same color?
   b. Now suppose there is exactly one pair of each color. What is the probability of a matching pair now?
   c. Suppose there are 5 pairs of red socks, 10 pairs of green, and 20 pairs of blue. What is the probability of picking 3 socks, all the same color?

4. Galaxies have three morphological types: elliptical (40%), spiral (40%), and irregular (20%). Suppose they are randomly distributed in the Universe.
   a. Pick a random elliptical galaxy.
      What is the probability that its nearest 3 neighbors are all spirals?
   b. Pick three random galaxies. What is the probability that none of the three is irregular?
   c. Consider sampling volumes containing ten galaxies each. How many such volumes would you need to sample before having a 50% probability of finding one in which none of the galaxies is a spiral?

5. [Poisson statistics] Suppose you observe a faint star, counting photons every second. The mean number of photons per second is five.
   a. What is the probability of receiving 10 photons in a given second?
   b. 20 photons?
   c. Zero photons?
   d. What is the probability that 100 seconds will pass without any one-second interval having a count of exactly zero photons?

6. Card tricks.
   a. What is the probability of dealing an ace as the first card of a full deck?
   b. Probability of dealing two straight aces?
   c. Probability of dealing half the deck [26 cards] without a single ace?
   d. Probability of a blackjack (an ace plus a ten-valued card: K, Q, J, or 10) in the first two cards of a full deck?
   e. Probability that a poker hand (5 cards) dealt from a full deck will contain exactly one pair?

7. An astronomer observes an optical and an x-ray flare simultaneously in an x-ray binary system. It is known that both optical and x-ray flares occur about once a day for an hour. The astronomer detected the simultaneous flare after observing the system continuously for 10 days. What is the probability that this is a coincidence (i.e., that the simultaneous flares are physically unrelated)? Note: 'simultaneous' means that there is some overlap in the flare times.

8. What is the probability that in a room of 30 people, at least 2 people share the same birthday?

Part 2: Experimental Data Acquisition and Model Fitting

The goal of this part of the lab is to measure the response of two ‘square-law’ detectors by varying the input power from -5 dBm to +20 dBm and measuring the output voltage. For each detector, fit the output voltage as a function of input power (mW) using a linear model.
By assigning a nominal uncertainty to each measured voltage, calculate the reduced chi-square of each fit, and hence the probability that a linear model fits the data. Use an RF frequency of 2 MHz.

The report must include: (1) a picture and diagram of the experimental setup, (2) a table of measured values, (3) plots of the linear fits, and (4) an analysis of the goodness of fit for each detector.

Useful facts:

1. A square-law detector is a simple device consisting of a diode and an RC circuit whose output voltage is supposed to be proportional to the square of the RMS input voltage, or equivalently to the input power level of an applied RF signal (since power is proportional to $V^2$).
2. RF power is normally measured in dBm units. A Bel is a (dimensionless) ratio unit, the ratio being $10^x$, where x is in Bel. A decibel (dB) is 1/10 of a Bel, expressing the ratio $10^{\mathrm{dB}/10}$; for example, a signal that is 25 dB stronger than another signal has a power ratio of $10^{2.5} \approx 316$.
3. A dBm is a [logarithmic] power unit with 0 dBm = 1 milliwatt (mW). Hence, e.g., +10 dBm = 10 mW, +30 dBm = 1 W, -30 dBm = $10^{-6}$ W, etc.
4. An oscilloscope can be used to determine RF power by measuring the RMS voltage and knowing the input impedance. The oscilloscope you will use has a characteristic input impedance Z = 50 ohms. Hence the power is $P = V^2/Z$; e.g., an RMS voltage of 500 mV corresponds to a power P = 5 mW (7 dBm).

Data Analysis Hints

1. Using a suitable graphing and statistics program (MathCAD, Excel, Graphical Analysis, etc.), plot the output voltage (mV) as a function of input power level (mW).
2. Fit a linear function to the data points. Calculate the uncertainty in the fitted parameters (slope, intercept) and the chi-square of the fitted model. Is a linear model an adequate representation of the data? (I.e., determine the degree of confidence of acceptance or rejection of the model.)

Fitting models to experimental data is an essential part of all experimental science.
A critical question is always: Does the model adequately represent the data? In other words, with what confidence can one accept (or reject) the model? This can be quantified by determining the chi-square value of the fit, along with the associated confidence level for acceptance (or rejection) of the model. The value of a fitted parameter is meaningless without an estimate of its uncertainty. Unfortunately, most data plotting and fitting programs (e.g., Graphical Analysis, Excel) do not automatically calculate these uncertainties. The linear fit is most easily done using Logger Pro 3.

3. The program Logger Pro assumes that the uncertainties of all data points are equal, and reports an ‘RMSE’ fitting parameter, which is the root-mean-square (RMS) difference between the model and the data, taken over all data points. To convert this number to a goodness-of-fit estimate, we need to:
   a. Determine the number of degrees of freedom (df). This is the number of data points minus the number of free parameters of the fitting function. For example, suppose you have 10 data points and a linear fit (2 free parameters, intercept and slope). For this case, df = 8.
   b. Estimate the uncertainty of the y values. Call this $\sigma$.
   c. The chi-square of the fit is $\chi^2_r = \mathrm{RMSE}/\sigma$. The probability that the model and the data are consistent is given by the chi-square integral, which can be found in any statistics textbook (e.g., Bevington), or online.
4. As an example, suppose the RMSE = 0.08 V with 11 data points, a linear model, and an estimated uncertainty per point of 5 mV. Derived parameters: the number of degrees of freedom df = 9, chi-square = 16, and probability of acceptance of the model about 7% (only about 7% of a large number of trials would result in a chi-square as large as that observed, assuming that the model [linear fit] was correct).
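
The unit conversions from the Useful facts and the goodness-of-fit recipe in hints 3 and 4 can be sketched in Python as follows. This is a minimal illustration, not part of the original handout: all function names are hypothetical, the chi-square value follows the handout's stated conversion (RMSE divided by the per-point uncertainty), and the chi-square integral is evaluated with a standard series / continued-fraction expansion of the incomplete gamma function instead of a textbook table. For the example numbers in hint 4 it gives df = 9, chi-square = 16, and a tail probability of about 0.067.

```python
import math


def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts (0 dBm = 1 mW)."""
    return 10.0 ** (p_dbm / 10.0)


def mw_to_dbm(p_mw):
    """Convert a power in milliwatts to dBm."""
    return 10.0 * math.log10(p_mw)


def rms_volts_to_mw(v_rms, impedance_ohm=50.0):
    """Power in mW for an RMS voltage (volts) across the input impedance."""
    return 1000.0 * v_rms ** 2 / impedance_ohm


def chi2_tail_probability(chi_square, df):
    """P(X >= chi_square) for a chi-square distribution with df degrees
    of freedom, i.e. the regularized upper incomplete gamma Q(df/2, x/2)."""
    if chi_square <= 0.0:
        return 1.0
    a, x = df / 2.0, chi_square / 2.0
    prefactor = math.exp(-x + a * math.log(x) - math.lgamma(a))
    if x < a + 1.0:
        # Power series for the lower function P(a, x); return 1 - P.
        n, term = a, 1.0 / a
        total = term
        for _ in range(1000):
            n += 1.0
            term *= x / n
            total += term
            if abs(term) < abs(total) * 1e-15:
                break
        return 1.0 - total * prefactor
    # Continued fraction (modified Lentz method) for Q(a, x).
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 1000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * prefactor


def goodness_of_fit(rmse, sigma, n_points, n_params=2):
    """Handout recipe: df = N - n_params, chi-square = RMSE / sigma.
    rmse and sigma must be in the same units (e.g. volts)."""
    df = n_points - n_params
    chi_square = rmse / sigma
    return df, chi_square, chi2_tail_probability(chi_square, df)


# Useful facts: +10 dBm and -30 dBm in mW; 500 mV RMS into 50 ohms in dBm.
print(dbm_to_mw(10.0), dbm_to_mw(-30.0))
print(mw_to_dbm(rms_volts_to_mw(0.5)))

# Hint 4: RMSE = 0.08 V, 11 points, linear fit, 5 mV uncertainty per point.
df, chi_square, prob = goodness_of_fit(rmse=0.08, sigma=0.005, n_points=11)
print(df, chi_square, round(prob, 3))
```

Converting the signal-generator settings with dbm_to_mw before plotting keeps the x axis in the linear power units that the linear model assumes.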