ENGR 104: Lecture 2
Statistical Analysis Using Matlab
Lecturers: Dr. Binh Tran
© 2003-09 The Catholic University of America Dept of Biomedical Engineering

Definitions
Statistics: Science that deals with the collection, tabulation, analysis, and interpretation of data (qualitative or quantitative) in order to make objective decisions and solve problems.

Statistical Measures of Data
• Average/(Arithmetic) Mean: The average value of all observations
• Median: Middle observation
• Mode: Value with the highest number of observations
• Range: Difference between max and min values (rough measure of data dispersion)
• Standard Deviation: Special form of average deviation from the mean

Average/(Arithmetic) Mean
Mean: $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
Advantage: Easy to compute
Disadvantage: Distorted by extreme values (outliers)

Median: Middle Observation
Definition: The median is the middle item when the items are arranged according to size
Advantage: Not distorted by outliers
Disadvantage: The data must first be arranged according to size

Mode & Range
Mode: Most common value occurring in a set of data
Advantage: Most typical value and independent of the extreme items
Disadvantage: If values are not repeated and the amount of data is small, the significance of the mode is limited
Range: Difference between the min and max values in a series
Advantage: Easy to compute and the simplest measure of dispersion
Disadvantage: Gives no information about the distribution of the data

Standard Deviation
Definition: $\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2}$
For normally distributed data, ±1σ covers 68.3% of observations and ±2σ covers 95.5%.
Advantage: Shows the degree of dispersion and variability
Disadvantage: Not trivial to compute

Presentation of Data
• Frequency Plot: Histogram of the number of occurrences
• Curve Fitting: Polynomial fitting of experimental data
• Time Series Analysis or Trend Plots: Analysis of trends in data

Data Presentation: Frequency Plot or Histogram
Definition: Graphic representation of a frequency distribution
Advantage: Quick visualization of data
Disadvantage: Difficult to analyze unless the data is grouped systematically

Data Presentation: Polynomial Curve Fitting
Best-fit curve for data
Polynomial Equation: $y = a_0 x^m + a_1 x^{m-1} + \dots + a_{m-1} x + a_m$
Advantage: A large set of data can be represented by a known equation
Disadvantage: For m > 2, the process becomes very laborious

Data Presentation: Ex: Polynomial Curve Fitting
Example: $y = a_0 x^2 + a_1 x + a_2$
where $a_0 = 0.0155$, $a_1 = 2.1411$, $a_2 = 58.4165$

Data Presentation: Time Series (Trend) Analysis
Definition: Graphic representation consisting of the description and measurement of various changes or movements of data during a period of time.
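Before turning to trend measurement, the measures and presentation methods defined so far can be checked numerically. What follows is a minimal sketch, written in Python for illustration rather than in the Matlab used in this course. The function names (`describe`, `frequency_table`, `poly_value`) and the sample data are hypothetical. The standard deviation here divides by n (population form), which is an assumption; Matlab's `std` divides by n − 1 by default. The bin convention in `frequency_table` is likewise an assumption.

```python
import math
from collections import Counter


def describe(values):
    """Return the five statistical measures defined in the lecture."""
    n = len(values)
    ordered = sorted(values)            # the median needs the data sorted by size
    mean = sum(values) / n
    mid = n // 2
    # Middle item for odd n; average of the two middle items for even n.
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    counts = Counter(values)
    top = max(counts.values())
    modes = sorted(v for v, c in counts.items() if c == top)  # ties are all reported
    data_range = ordered[-1] - ordered[0]
    # Population form (divide by n); this normalization is an assumption.
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return {"mean": mean, "median": median, "modes": modes,
            "range": data_range, "std": std}


def frequency_table(values, bin_width):
    """Count observations per bin [k*w, (k+1)*w): the numbers behind a histogram."""
    bins = Counter(math.floor(v / bin_width) for v in values)
    return {k * bin_width: bins[k] for k in sorted(bins)}


def poly_value(coeffs, x):
    """Evaluate y = a_0 x^m + ... + a_m by Horner's rule (highest power first)."""
    y = 0.0
    for a in coeffs:
        y = y * x + a
    return y


# Hypothetical sample with one outlier (100); not data from the lab.
sample = [4, 8, 6, 5, 3, 8, 9, 5, 8, 100]
stats = describe(sample)
print(stats["mean"], stats["median"])      # 15.6 7.0
print(frequency_table(sample, 5))          # {0: 2, 5: 7, 100: 1}

# Coefficients of the quadratic example from the lecture, evaluated at x = 10.
print(poly_value([0.0155, 2.1411, 58.4165], 10))
```

In the sample, the single outlier (100) pulls the mean to 15.6 while the median stays at 7, which illustrates the trade-off listed for the two measures.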
Types of trend measurement
• Semi-average
• Moving average

Data Presentation: Semi-Average
Definition: Split the data set into two equal parts; take the average of each part; draw a straight line through the two average points
Advantage: Very simple to calculate
Disadvantage: Only a gross representation of data trends

Data Presentation: Moving Average
Definition: A series of successive group averages
Advantage: Simple to calculate; more accurate representation of local changes
Disadvantage: Cannot be brought up to date (the most recent observation has no centered average until later data arrive)

Data Presentation: Ex: Three-Item Moving Average
Values   Total   Moving Average
3
5        15      5.00
7        22      7.33
10       29      9.67
12       36      12.00
14       41      13.67
15       46      15.33
17

Questions?

Lab #2: Telemedicine Analysis
Lab Report Due: 9/29
• Download Telemedicine data for 6 study subjects (txt files)
  – http://faculty.cua.edu/tran/engr104/Datafiles.htm
• Using Matlab, statistically analyze the data and report your observations
• See handout

LAB QUESTIONS:
• Is there a noticeable trend/pattern in the data? Across the datasets?
• Is there a correlation between the blood glucose and high blood pressure measures over time? Examine this using a time-series analysis (30-day epochs). Explain your findings.
• Use curve fitting techniques to estimate the regression line best fitting the data for each subject.
• Is there a difference between the effects of tele-monitoring on diabetics vs. hypertensives (i.e., those with high blood pressure)? Explain.
  – Is there any useful information in the histogram?
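The two trend measurements from the lecture, semi-average and moving average, can be sketched as follows. This is a minimal illustration in Python rather than the Matlab the lab calls for. The function names (`moving_average`, `semi_average`) are hypothetical, and dropping the middle item of an odd-length series in the semi-average is an assumption. The input is the eight-value series from the three-item moving average example.

```python
def moving_average(values, window=3):
    """Averages of successive groups of `window` items (one per full group)."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def semi_average(values):
    """Split the series into two equal halves and average each half.

    Returns (position, average) for each half, where position is the mean
    index of that half; the trend line passes through both points.
    With an odd number of items the middle item is left out (an assumption).
    """
    half = len(values) // 2
    first, second = values[:half], values[-half:]
    x1 = (half - 1) / 2
    x2 = len(values) - half + (half - 1) / 2
    return (x1, sum(first) / half), (x2, sum(second) / half)


values = [3, 5, 7, 10, 12, 14, 15, 17]   # series from the lecture example
averages = moving_average(values)
print([round(a, 2) for a in averages])   # [5.0, 7.33, 9.67, 12.0, 13.67, 15.33]

# Slope of the straight line through the two semi-average points.
(x1, y1), (x2, y2) = semi_average(values)
slope = (y2 - y1) / (x2 - x1)
print(slope)                             # 2.0625
```

The moving average returns six values for eight observations: the first and last items have no complete three-item group centered on them, which is why the method cannot be brought fully up to date.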