Problem 3: ML Estimation of velocity dispersion

Assignment for ASTM11 Statistical and Numerical Tools in Astrophysics I
LL 2010–11–16

1 Introduction

The purpose of this exercise is to provide training in formulating and solving a data analysis problem by means of the Maximum Likelihood (ML) principle. Relevant background information is found in [1] (Chapter 6.1) and in the Lecture Notes (especially L3, pp. 7–17).

2 The problem

The velocity dispersion of a stellar cluster or galaxy is an important dynamical quantity since it can be used to calculate the density of matter (including dark matter) in the system. A relatively simple way to estimate the velocity dispersion is to use the measured radial velocities for a sufficiently large sample of stars. These determine both the mean (systemic) velocity of the cluster or galaxy, and the (internal) velocity dispersion. Dispersion should here be interpreted as the standard deviation of the velocity components along the line of sight, even though the true velocity distribution in general is not Gaussian.

When the measurement uncertainties are small compared to the velocity dispersion of the system, the sample mean and sample standard deviation of the measured velocities may be good enough estimates of the systemic velocity and internal velocity dispersion. However, if the uncertainties are not negligible compared to the internal velocity dispersion, it is necessary to make a statistical correction for the uncertainties. Although one can think of several possible ways to do this correction, it is not immediately obvious which is the best method.

A good way to approach such a problem and find a solution in a systematic manner is to use the Maximum Likelihood (ML) principle. This requires that the data are modelled as the outcome of a probabilistic experiment depending on one or more model parameters. Given this model, it is possible to compute the likelihood for any combination of parameter values.
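To make the idea of "computing the likelihood for any combination of parameter values" concrete, here is a minimal sketch in Python. It rests on one possible modelling assumption that the assignment does not prescribe: each measured velocity is treated as an independent draw from a Gaussian with mean mu (systemic velocity) and variance sigma^2 + e_i^2, where sigma is the internal dispersion and e_i the quoted measurement uncertainty. The function names, the grid ranges and the demo numbers are all hypothetical; the demo values are made up for illustration and are not the assignment data.

```python
import math


def log_likelihood(mu, sigma, velocities, errors):
    """Log-likelihood under the ASSUMED model v_i ~ N(mu, sigma**2 + e_i**2).

    Intrinsic dispersion and measurement uncertainty are added in
    quadrature; this is one plausible model, not the prescribed one.
    """
    total = 0.0
    for v, e in zip(velocities, errors):
        var = sigma**2 + e**2
        total += -0.5 * (math.log(2.0 * math.pi * var) + (v - mu) ** 2 / var)
    return total


def grid_maximum(velocities, errors, mu_grid, sigma_grid):
    """Brute-force search: return (log L, mu, sigma) at the best grid point."""
    best = (-math.inf, None, None)
    for mu in mu_grid:
        for sigma in sigma_grid:
            ll = log_likelihood(mu, sigma, velocities, errors)
            if ll > best[0]:
                best = (ll, mu, sigma)
    return best


# Made-up illustrative values in km/s; NOT the assignment data set.
demo_v = [12.0, 8.5, 15.2, 10.1, 6.3, 13.7]
demo_e = [2.0, 3.0, 2.5, 1.5, 3.5, 2.0]

# Hypothetical search ranges; sigma is restricted to non-negative values.
mu_grid = [5.0 + 0.05 * i for i in range(201)]   # 5 to 15 km/s
sigma_grid = [0.05 * i for i in range(201)]      # 0 to 10 km/s

best_ll, mu_hat, sigma_hat = grid_maximum(demo_v, demo_e, mu_grid, sigma_grid)
print(f"mu_hat = {mu_hat:.2f} km/s, sigma_hat = {sigma_hat:.2f} km/s")
```

A grid search is used here only because it is transparent; a numerical optimizer would serve equally well. Under the same assumptions, parameter uncertainties could be approximated from the curvature of the log-likelihood at its maximum, or from the region where the log-likelihood lies within 0.5 of its peak value.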
The ML principle says that the ‘best’ estimate of the parameters is obtained by maximizing the likelihood function within the allowed range of parameter values.

The data to be used in this problem are found at http://www.astro.lu.se/Education/utb/ASTM11/projects.html and consist of the measured radial velocities, and their uncertainties, for 18 different stars in a dwarf galaxy. The text file contains two columns, with the measured velocity in the first column and its uncertainty (standard deviation) in the second. All values are in km/s.

3 How to solve the problem

1. Formulate a reasonable probabilistic model for the data. What are the underlying assumptions? What are the model parameters?
2. Derive an expression for the likelihood function (or the log-likelihood function).
3. Suggest a method to calculate the ML estimate of the parameters and their uncertainties.
4. Try out the method on the given data. What are the results?

The first three points can be solved with pen and paper; the last point probably requires some programming.

4 Reporting

Students are encouraged to discuss and work together on the problem. However, each student must produce a written report with his/her own solution, including a description and evaluation of the methods used. Computer code used for solving the problem should be listed in full as part of the report. It should be well structured and include enough comments to be easily understood without further explanation.

References

[1] Wall J.V., Jenkins C.R., 2003: Practical Statistics for Astronomers, Cambridge University Press