Sheet 20, Dr. Mahmoud
Community lec no. 22

In today's lecture we will start discussing "Descriptive Statistics". We have already taken 3 lectures as an introduction to Biostatistics. We will divide Descriptive Biostatistics into Stat 1, Stat 2, Stat 3 and Stat 4.

Stat 1: We will explain all the descriptive statistics in general.
Stat 2: We will explain the shapes and the normal distribution.
Stat 3: We will explain Relative Risk, Arboration (not sure about this word exactly), and there will be an answer for the sample statistic.
Stat 4: We will give one type from each different analysis. By this we will have given a Nominal test for 2 groups, an Ordinal test for 2 groups and a Relational Interval test for 2 groups.

Now what is Biostatistics and what does it mean?

Biostatistics: a branch of applied math that deals with collecting, organizing and interpreting data using well-defined procedures and techniques.

We have 2 types of Biostatistics:
1. Descriptive Statistics: it involves organizing, summarizing and displaying data to make them more understandable.
2. Inferential Statistics: it reports the degree of confidence with which the sample statistic predicts the value of the population parameter. Here we take a sample, and through this sample we can generalize to the population.

There are certain things that we have to take into consideration while collecting the data for a statistical study:
1. Accuracy: the data must be accurate. To ensure this:
- Look for missing data
- Investigate the results
- Consider the results of other studies
- Perform data analysis

Hypotheses can be either the Null hypothesis or the Alternative hypothesis.

Null Hypothesis
It states that there is no relationship between the variables (the dependent variable and the independent variable). In Arabic it is called الفرضية الصفرية (literally "the zero hypothesis"). In other words, any apparent relationship is put down to chance only.
A simple example of this hypothesis: "Your attendance at a lecture does not reflect your mark in the exam" (no relation).
It is presumed to be true until statistical evidence nullifies it in favor of an alternative hypothesis.
Both types of error (Type 1 and Type 2) are defined with respect to this hypothesis. {We will discuss the errors shortly}
The null hypothesis can also be called the "Statistical Hypothesis".

Alternative Hypothesis
It states that there is a relationship between the variables.
Example of this hypothesis: "How much you study the biostatistics subject is reflected in your performance in the exam." There is a relationship.

There are two types of errors that the researcher may encounter during his research: Type 1 Error and Type 2 Error.

Type 1 Error
o The researcher either rejects or does not reject the null hypothesis according to alpha.
o Alpha (α) is the term used to express the level of significance we will accept.
o Alpha is the limit of chance by which we accept or reject the null hypothesis (e.g. the researcher sets the probability at 5% or 1%).
o The researcher's results may appear by chance. (The Dr demonstrated this with flipping a coin: by chance we can get either side.)
o Each time we flip the coin there is a 50% chance of heads and a 50% chance of tails.
o If we flip the coin twice, the probability of getting the same chosen side both times is 25%, and so on (for each additional flip we multiply the probability by 0.5).
o The null hypothesis is related to probabilities.
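The coin-flip reasoning above can be sketched in Python. This is only an illustration, assuming a fair coin and independent flips; the function names `prob_all_heads` and `flips_until_below` are hypothetical and do not come from the lecture.

```python
def prob_all_heads(n_flips: int, p_head: float = 0.5) -> float:
    """Probability that n_flips independent flips all land heads."""
    if n_flips < 0:
        raise ValueError("n_flips must be non-negative")
    # Each extra flip multiplies the probability by p_head (0.5 for a fair coin).
    return p_head ** n_flips


def flips_until_below(alpha: float, p_head: float = 0.5) -> int:
    """Smallest number of flips whose all-heads probability drops below alpha."""
    if not 0 < alpha < 1 or not 0 < p_head < 1:
        raise ValueError("alpha and p_head must be strictly between 0 and 1")
    n = 0
    while prob_all_heads(n, p_head) >= alpha:
        n += 1
    return n


if __name__ == "__main__":
    print(prob_all_heads(1))  # one flip
    print(prob_all_heads(2))  # two flips: 0.5 * 0.5
    print(flips_until_below(0.05), flips_until_below(0.01))
```

Under these assumptions, a run of heads has to be fairly long before its probability falls under a chosen alpha, which is the sense in which alpha acts as a limit on what we are willing to put down to chance.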
o Usually the researcher sets α at a certain value in order to limit the role that chance plays in the result.
o Suppose alpha is 1% {the researcher determines his own value}.
o If the probability is 1% or more, the researcher cannot reject the null hypothesis, because the result may be due to chance. If the probability is less than 1%, the researcher rejects the null hypothesis. {Sorry, I was not sure of the Dr's exact wording for this second case}
o The null hypothesis: there is no relationship between two measured phenomena.
o You have a medicine that you need to try on a group of people; there is a slight chance that some will get better by chance (not because of the medicine).
o Type 1 error (alpha error): the incorrect rejection of a true null hypothesis.
o Type 2 error (beta error): the failure to reject a false null hypothesis.
o The alternative hypothesis (research hypothesis): the relationship that your research is looking for.
o The null and the alternative are two rival hypotheses. In your research you should report the null hypothesis by writing the percentage under the heading "data do not support the research hypothesis", while if you are rejecting the null hypothesis you say that the "data support the research hypothesis".
o Example (this may be confusing, but I tried to write what the doctor said; the most important part is the last 2 lines):
o We are testing 1000 bulbs to see whether they work properly, assuming that each should last 1000 hours. We took 100 for testing and set the chance of error (sampling error) at 5%. The results gave us a mean of 850 hours, not 1000 hours. In this case we are not sure that the mean for the 100 is the same as for the 1000, so we use something called the standard error of the mean to check whether the sample mean equals the population mean. The null hypothesis says that there is no difference between the population mean and the sample mean, while the alternative hypothesis says they are different. Now suppose you did your calculation and the probability under the null hypothesis is 6%, which is bigger than 5%: you cannot reject the null hypothesis, and rejecting it anyway would risk a Type 1 error. If it were less than 5%, you would reject the null hypothesis; failing to reject it when it is actually false would be a Type 2 error.
o 1 - beta = power of the study. Alpha is the level of significance, which sets the probability of a Type 1 error; beta is the probability of a Type 2 error.

Power of the study: the probability of detecting an effect of the independent variable on the dependent variable when that effect really exists. It depends on sample size: how confidently can we say that the intervention produced the significant difference?
Meaning of the result: translating the result from numbers into words to make it understandable.
Importance: translating the significant finding into a practical finding, e.g. how we say that the probability is less than 0.05 and so we reject the null hypothesis.
Generalization: how can we make the findings apply to the whole population?
Implication: what have we learned in relation to what was used during the study?
All this will be explained point by point in detail later on.

Definitions:
Data: any type of information.
First step: Raw data is data as collected, in the form they were received (e.g. from filling out the questionnaire).
Second step: Organized data is data that have been arranged in ascending or descending order, or as grouped data.
Example (slide 4): Weight in pounds of 57 school children at a day-care center. These numbers are not arranged; information like this is called raw data. If we arrange them from largest to smallest, we call them arranged data.

Descriptive Statistics:
- In this lecture we will discuss using descriptive statistics, as opposed to inferential statistics.
- Here we are interested only in summarizing the data in front of us, without assuming that it represents anything more. For example, when we talk about the exam outcomes of dental students, we do not generalize to all students of the University of Jordan.
- We will look at both quantitative and graphical techniques (high charge, medium charge, low charge).
- The basic overall idea is to turn data into information.
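As a small illustration of turning raw data into arranged and grouped data, as described in the definitions above, here is a minimal Python sketch. The ten weights are hypothetical (the 57 values from the slide are not reproduced in these notes), the helper names `arrange` and `group` are made up for this example, and the class width of 10 pounds is an arbitrary choice that assumes whole-number values.

```python
from collections import Counter

# Hypothetical raw data: weights in pounds, in the order they were collected.
raw_weights = [68, 63, 42, 27, 30, 36, 28, 32, 79, 27]


def arrange(data, descending=False):
    """Return the data in ascending (default) or descending order."""
    return sorted(data, reverse=descending)


def group(data, width=10):
    """Count whole-number values per class interval, e.g. (20, 29), (30, 39)."""
    counts = Counter((x // width) * width for x in data)
    return {(low, low + width - 1): counts[low] for low in sorted(counts)}


if __name__ == "__main__":
    print(arrange(raw_weights))                   # arranged, smallest first
    print(arrange(raw_weights, descending=True))  # arranged, largest first
    print(group(raw_weights))                     # grouped into 10-pound classes
```

The arranged list already shows the smallest and largest values at a glance, and the grouped counts show where most of the values fall, which the raw list does not.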
Numerical data, that is, numbers collected from questionnaires or interviews, may be summarized according to several characteristics: measures of location, measures of dispersion, and measures of shape or skewness.

Measures of location:
- Measures of central tendency: Mean, Median, Mode.
- Measures of non-central tendency: Quantiles, which are Quartiles, Quintiles and Percentiles.
Measures of dispersion: Range, Interquartile range, Variance, Standard Deviation, Coefficient of Variation.
Measures of shape: positive skew, negative skew, Beta, normal distribution, standard normal distribution and standardizing data as scores.

First: Measures of location (Mean, Mode, Median)
- Measures of location place the data set on the scale of real numbers.
- A measure of location refers to the location of a typical data value: the data value around which other scores tend to cluster.
- Measures of central tendency (i.e. central location) help find the approximate center of the dataset. We want to see how the data concentrate around the center.
- Researchers usually do not use the term "average", because its meaning is not specific: there are three alternative types of average, namely the mean, the median and the mode.
- In a perfect world, if the data are perfectly normally distributed, the mean, median and mode are the same (equal).
- However, the world is not perfect, and very often the mean, median and mode are not the same.

Let us talk about the summation sign, sigma (Σ), used for example if we want to sum from 1 to 10. Now we will talk about the mean.

Mean:
It is the sum of the values of the data divided by the number of observations: mean = (x1 + x2 + ... + xn) / n.
If you have the data (1, 2, 2, 4, 5, 10): Mean = (1 + 2 + 2 + 4 + 5 + 10) / 6 = 24 / 6 = 4.

The mean is sensitive to extreme values.
E.g.: if the Dr gave us an exam and the marks were as follows:
- two of us got 100
- one of us got 90
- the rest got between 70 and 80
The mean will be pulled up toward the two extreme values of 100 (meaning the mean ends up closer to the extreme). But if all the students got 70 to 80 and someone got 10, the mean will be pulled down (for example toward 60 in a small class). So the mean is not a good summary of skewed data.
E.g.: if the income of most people is between 500 and 800, but 2% have an income of 5000 to 10000, then including the people whose income is more than 5000 together with those whose income is 500 to 800 skews the data positively, to the right, so the mean will be bigger than the median. The extreme values pull the mean to a larger value, so the data are called skewed.
E.g.: if the mean age at death is 75 and there are people who die below one year of age, the mean will decrease; the data are skewed negatively, to the left, so the mean will be smaller than the median.

If the data are normally distributed, the mean, the median and the mode are the same, and:
- about 34% of the data lie within one standard deviation above the mean.
- about 34% of the data lie within one standard deviation below the mean.
- about 95% of the data lie within two standard deviations of the mean.
- if the data are not like this, they are positively (+) or negatively (-) skewed.

Sample Mean
Also called the sample average (although this is an imprecise name) or arithmetic mean. It differs from the geometric mean (a more advanced treatment used when the data are not normally distributed and we transform the data; it is calculated differently from the arithmetic mean).
Mean for the sample = x̄ (X-bar) or M; mean for the population = mu (μ).
It is sensitive to extreme values, which means that one data point can make a great change in the sample mean.
E.g.: we have the sample 1, 1, 1, 1, 51. The mean = 55 / 5 = 11, but this does not represent the typical value because there is an extreme value. In analytic research these values are called outliers, and we should use an adjusted mean if the data are not normally distributed.
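The effect of an extreme value on the mean can be checked with Python's standard `statistics` module, using the two small data sets from this lecture. The helper `skew_direction` is hypothetical: it only applies the mean-versus-median comparison described above, which is a rough rule of thumb and not a formal measure of skewness.

```python
from statistics import mean, median, mode

# Data sets quoted in the lecture.
lecture_data = [1, 2, 2, 4, 5, 10]
outlier_sample = [1, 1, 1, 1, 51]


def skew_direction(data):
    """Rough skew indication from comparing the mean with the median."""
    m, md = mean(data), median(data)
    if m > md:
        return "positive (right) skew"
    if m < md:
        return "negative (left) skew"
    return "no skew indicated"


if __name__ == "__main__":
    for name, data in [("lecture_data", lecture_data),
                       ("outlier_sample", outlier_sample)]:
        print(name, "mean:", mean(data), "median:", median(data),
              "mode:", mode(data), "->", skew_direction(data))
```

For the sample 1, 1, 1, 1, 51 the single value 51 moves the mean to 55 / 5 = 11, while the median stays at 1, which is why the median is often reported alongside the mean when outliers are present.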
Good luck, and sorry for any mistakes or misunderstood points.
Done by: Sumaya Abuodeh Khaldoon AlQaddumi Tareq Al-Amad Alaa Ali Alaa Mohammed Yousif