Download p - values - Squarespace

Hypothesis Testing

• A decision-making process for evaluating claims about a population
• Based on information obtained from samples
• Hypotheses are made a priori, before the experiment is conducted
• Usually tests a prediction that some kind of effect exists in the population
• A method for scientists to test the scientific questions they generate
• Usually a conjecture about a population parameter
   • A population parameter is a numerical value that characterizes some aspect of a population
• We can never be completely sure that a hypothesis is true; instead, we work with probabilities
• Generally, we calculate the probability of obtaining results like ours if chance alone were at work
• We are studying whether the results we see are a random occurrence or due to some mechanism

Testing a Statistical Hypothesis

Steps:
1. Define the population to study
2. State the hypothesis that will be investigated
3. Specify the significance level
4. Select a sample from the population
5. Collect the data
6. Calculate the test statistic from the sample data
7. Draw a conclusion

Two Types of Hypothesis

• Null Hypothesis
   • Usually written as H0 (H-naught)
   • A statement that there is no difference between a population parameter and a specific value (or no difference between two parameters)
• Alternative Hypothesis (Research Hypothesis)
   • Usually written as HA
   • A statement that there is a difference between a population parameter and a specific value (or there is a difference between two parameters)

One and Two Tailed Tests

• A hypothesis test that is directional is called a one-tailed test
• Examples:
   • The population mean age is older than 21
      • H0: μ = 21 versus HA: μ > 21
   • Treatment A will do better than treatment B
      • H0: μA = μB versus HA: μA > μB
• A hypothesis test that is non-directional is called a two-tailed test
• Examples:
   • The population mean age is 21
      • H0: μ = 21 versus HA: μ ≠ 21
   • Treatment A has the same effect as treatment B
      • H0: μA = μB versus HA: μA ≠ μB
• Notice that the statement of equality (a statement of no effect) is written as the null hypothesis
• Preview: If we reject the null hypothesis, this is the same thing as saying we have detected an effect

Significance Level of a Statistical Test

• The significance level is the probability we use as our criterion for an unlikely outcome, supposing that H0 is true
• Denoted by α (alpha)
• Must be decided a priori; that is, α must be determined before conducting a statistical test
• Choosing α after the data have been analyzed lacks objectivity
• α is our criterion for rejecting the null hypothesis
• Reflects how careful the researcher wishes to be
• The smaller the α that is specified, the stronger the evidence needed to reject H0
• Scientific and medical literature commonly tests hypotheses with α levels of 0.05 or 0.01

Calculating a Statistical Test

• Remember, statistical inference uses sample statistics to make inferences about population parameters
• We use sample data to calculate a test statistic
• A test statistic is a univariate statistic (a single number)
• The test statistic is used in hypothesis testing to make decisions
• To use the test statistic, we have to determine its sampling distribution under the null hypothesis (assuming the null hypothesis is true)
• Once we know the distribution of the test statistic, we use it to calculate the p-value
• Each time we calculate a test statistic, we couple it with a corresponding p-value
• In the literature, statistical test results are often presented in the form (test statistic, p-value)
• Examples:
   • (t=-1.98, p=0.04)
   • (t(28)=2.5, p<0.013)
   • (χ(2)=12.32, p<0.001)
   • (F=6.17, p=0.0001)
• In these examples, the letter denotes the sampling distribution of the test statistic
• For example:
   • t denotes a t-distribution
   • t(28) denotes a t-distribution with 28 degrees of freedom
   • χ(2) denotes a chi-squared distribution with 2 degrees of freedom
   • F denotes an F-distribution
• The p-value is calculated using the test statistic and its sampling distribution
• To calculate the p-value, we calculate the area under the curve of the sampling distribution beyond the observed value of the test statistic
• We will focus on how to use a p-value to make decisions about statistical significance

p-values

• On the slides that follow, we:
   • Define a p-value
   • Describe its properties
   • Clarify what it does and does not tell us
• Following this, we go on to Step 7 of Hypothesis Testing
• In Step 7, we talk about Significance Testing
• Significance Testing describes how the p-value is used to make a decision
• Let’s not confuse
   • The definition of a p-value
   • The process of using a p-value in making a decision

Definition of p-value

• A p-value is the probability of obtaining the observed sample statistic (for example, a mean) or a more extreme value if the null hypothesis is true
• The p-value is the most commonly reported result of a significance test
• It enables us to judge the extent of the evidence against H0

Properties of a p-value

• Ranges from 0 to 1
• Summarizes the evidence in the data about H0
• A large p-value (e.g., 0.58) indicates the observed data would not be unusual if H0 were true
• A small p-value (e.g., 0.0003) indicates the observed data would be very unlikely if H0 were true

p-value

P(A|B) = "the probability of A given B"
Is P(A|B) = P(B|A)? No.
• Example: A = dramatic overdose of medication, B = death
• P(A|B): Given that a person dies, what is the probability that the death was due to an overdose? (this probability will likely be very low)
• P(B|A): Given that a person overdoses, what is the probability of this person dying? (this probability will likely be very high)
• Is P(A|B) = P(B|A) here?

Properties of p-values

• A p-value does not give information about trend, direction, strength of an association, size of an effect, or magnitude of a difference
• p=0.04 is not less significant than p=0.0000000001
• Statistical significance is influenced by the sample size
• With a large enough sample size, almost any difference can be statistically significant
• When reporting p-values, footnotes such as *p<0.05, **p<0.01 suggest a trend and are misleading
• A smaller p-value does not indicate a more important result
• The magnitude of the p-value is not a guide to clinical significance
• p-values do not take into account the size of the effect
• A small effect in a large study can have the same p-value as a large effect in a small one
• Conclusions should not be based only on p-values
• p=0.06 is not evidence of 'marginal significance' or proof of a 'trend towards significance'; such conclusions are unfounded
• The p-value does not give the probability that the observed result is due to chance
• The p-value does give the chance of obtaining the effect we have observed (or a more extreme one), assuming the null hypothesis is true (in other words, assuming there is no real effect)
• Whenever a p-value is reported, it is recommended to also report a measure of effect size
   • Confidence interval
   • Correlation coefficient
   • Regression parameter
   • Etc.

p-values

• We have not yet discussed how the p-value is used in making a decision
• Significance Testing describes the process for using the p-value to make a decision
• Next, we talk about Step 7: drawing conclusions and making inferences
• We need to discuss the decision we are making, and the possible errors in making that decision

Errors of Inference

• In hypothesis testing, we make a decision
• Only two possible outcomes
• We decide to either
   1) Reject H0
   2) Fail to reject H0
• Errors of inference describe the incorrect decision that each of these two outcomes can lead to

Hypothesis Testing Language

"Accepting the null hypothesis" is NOT the same conclusion as "Failing to reject the null hypothesis"
• If we do not reject the null hypothesis
   • We do not conclude it is true
   • We can only recognize that the null hypothesis is a possibility
• Showing the null hypothesis is true is not the same thing as failing to reject it
• Failing to find an effect is not the same thing as showing there is no effect

Errors of Inference

• These two errors are given specific names:
• Type I Error
   • We draw a conclusion that H0 is false when it is in fact true
   • Occurs when we conclude there is an effect in our population, when in fact there is not
• Type II Error
   • We fail to reject H0 when it is in fact false
   • Occurs when we conclude there is no effect in our population, when in fact there is
   • The probability of a Type II error is denoted by the Greek letter β ('beta')
   • In other words, P(Type II Error) = β

Type I Error

• Example 1: Incorrectly conclude an HIV prevention intervention is effective (Reject H0, conclude HA is true) in preventing HIV infections through behavioral changes when, in fact, it is not (H0 is actually true)
• Example 2: Incorrectly conclude a new chemo agent is effective (Reject H0, conclude HA is true) in reducing tumor size when, in fact, it is not (H0 is actually true)

Type II Error

• Example 1: Incorrectly conclude an HIV prevention intervention is not effective (Fail to reject H0, conclude HA is false) in preventing HIV infections through behavioral changes when, in fact, it is (HA is actually true)
• Example 2: Incorrectly conclude a new chemo agent is not effective (Fail to reject H0, conclude HA is false) in reducing tumor size when, in fact, it is (HA is actually true)

Type I and Type II Errors

Decision             Null Hypothesis True             Null Hypothesis False
Fail to Reject H0    Correct                          Type II Error (False Negative)
Reject H0            Type I Error (False Positive)    Correct

• False Positive: We reject H0 and conclude there is an effect (Positive; Alternative Hypothesis) when, in fact, there is not an effect (False Conclusion)
• False Negative: We fail to reject H0 and conclude there is not an effect (Negative; Null Hypothesis) when, in fact, there is an effect (False Conclusion)

Significance Testing

• Describes the process of using the p-value to make a decision
• We compare the calculated p-value from Step 6 to the predetermined significance level (α) from Step 3
• If p < α, we reject H0 and conclude the result is statistically significant
• If p ≥ α, we fail to reject H0 and conclude we have failed to find statistical significance

Conclusion: Significance

• Essential to understand the difference between statistical and clinical significance
• Statistical Significance: a judgment about how likely a difference at least as large as the one observed would be by chance alone
• Clinical Significance: the smallest clinically beneficial and harmful values of the effect; in other words, the smallest values that matter to the patient

Question

• A research article gives a p-value of .001 in the analysis section. Which definition of a p-value is the most accurate?
   a. the probability that the observed outcome will occur again.
   b. the probability of observing an outcome as extreme or more extreme than the one observed if the null hypothesis is true.
   c. the value that an observed outcome must reach in order to be considered significant under the null hypothesis.
   d. the probability that the null hypothesis is true.

Reference:
Dr. Matt Hayat, Statistics Lecture, Rutgers University, 2013
Plichta SB, Kelvin E. (2012) Munro’s Statistical Methods for Healthcare Research, 6th Edition
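
As a rough illustration of Steps 6 and 7, the sketch below computes a p-value as the area under a sampling distribution beyond the observed test statistic, then compares it with α. It assumes a z statistic whose sampling distribution under H0 is standard normal; the slides also mention t, chi-squared, and F distributions, which would need other tools. The function names and the numbers are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def two_tailed_p_value(z: float) -> float:
    """Area in both tails of the standard normal beyond |z|.

    Assumes the test statistic is standard normal when H0 is true.
    """
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def one_tailed_p_value(z: float) -> float:
    """Upper-tail area, for a directional alternative such as HA: mu > mu0."""
    return 1.0 - NormalDist().cdf(z)


def decide(p_value: float, alpha: float) -> str:
    """Significance testing rule: reject H0 only when p < alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


alpha = 0.05  # significance level, chosen a priori (Step 3)
z = 2.5       # hypothetical test statistic (Step 6)

p = two_tailed_p_value(z)
print(f"z = {z}, two-tailed p = {p:.4f}, decision: {decide(p, alpha)}")
```

With the same test statistic, a stricter α of 0.01 would lead to "fail to reject H0", which is why α has to be fixed before the data are analyzed.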

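The point that a small effect in a large study can have the same p-value as a large effect in a small one can be checked with a short calculation. This is a minimal sketch assuming a one-sample z-test with a known population standard deviation; `z_test_p_value` and every number below are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(mean_diff: float, sigma: float, n: int) -> float:
    """Two-tailed p-value for a one-sample z-test (sigma assumed known)."""
    z = mean_diff / (sigma / sqrt(n))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


sigma = 10.0  # hypothetical population standard deviation

# Large effect in a small study versus small effect in a large study.
p_large_effect = z_test_p_value(mean_diff=5.0, sigma=sigma, n=16)
p_small_effect = z_test_p_value(mean_diff=0.5, sigma=sigma, n=1600)
print(f"difference 5.0, n=16:   p = {p_large_effect:.4f}")
print(f"difference 0.5, n=1600: p = {p_small_effect:.4f}")

# The same small effect with a modest sample size.
p_small_study = z_test_p_value(mean_diff=0.5, sigma=sigma, n=100)
print(f"difference 0.5, n=100:  p = {p_small_study:.4f}")
```

The first two p-values coincide even though the effects differ tenfold, so the p-value alone says nothing about the size of the effect; reporting an effect size alongside it fills that gap.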