SUMMARY
• Z-distribution
• Central limit theorem

Statistical inference
• If we can't conduct a census, we collect data from a sample of the population.
• Goal: draw conclusions about that population.

Confidence interval
• for $n \ge 30$: $\bar{x} \pm z \times \frac{s}{\sqrt{n}}$
• for $n < 30$: $\bar{x} \pm t_{n-1} \times \frac{s}{\sqrt{n}}$
• $z$ (or $t_{n-1}$) is the critical value; the whole term after $\pm$ is the margin of error.

What does a confidence interval tell us?
• When we say "we are 95% confident that the true value of the parameter is in our confidence interval", we mean that 95% of the confidence intervals constructed this way from repeated samples will contain the true value of the parameter.
• After a sample is taken, the population parameter is either in the interval or not. It is not a matter of chance!
• A confidence interval does not say that the true value of the parameter has a particular probability of being in the interval, given the data actually obtained.

margin of error = (critical value) $\times \frac{\sigma}{\sqrt{n}}$
• To summarize, the margin of error depends on
1. the confidence level (95% is common)
2. the sample size $n$
3. the variability of the data (i.e. on $\sigma$)
• As the sample size increases, the margin of error decreases. With a bigger sample we get a narrower interval in which we're pretty sure the true population parameter lies.
• More variability increases the margin of error.
• The margin of error measures nothing but chance variation. It doesn't measure any bias or errors that happen during the process.
• It does not tell you anything about the correctness of your data!

HYPOTHESIS TESTING

Aim of hypothesis testing
• decision making

Engagement ratio
• Hopefully, you like this course so far.
• How to measure this?
$\text{Engagement ratio} = \frac{\text{number of minutes awake}}{\text{total minutes available}}$

Engagement distribution
[Figure: distribution of the engagement ratio on an axis from 0.0 to 1.0; population $N = 100$, $\mu = 0.077$, $\sigma = 0.107$; sample $n = 30$, $\bar{x} = 0.13$]
$N = 100$, $\mu = 0.077$, $\sigma = 0.107$
• If all students attended my lesson with singing, what is our point estimate for the engagement ratio?
• 0.13, i.e. number of minutes awake = 100 × 0.13 = 13 min

Interval estimate
• What is the standard error of the mean that we would use to compare this sample mean (0.13) with the means of other samples of the same size?
• $SE = \frac{0.107}{\sqrt{30}} \approx 0.0195$
[Figure: sampling distributions with the central 95% region around the mean M; the values 0.077 and 0.13 are marked]
$n = 30$, $\bar{x} = 0.13$, $\sigma = 0.107$

Confidence interval
$\bar{x} \pm z \times \frac{s}{\sqrt{n}}$
• But in our case we know the population standard deviation $\sigma = 0.107$, so instead of the sample $s$ we can use the population $\sigma$.
• What is a 95% confidence interval in our case?
$0.13 \pm 1.96 \times \frac{0.107}{\sqrt{30}} \implies 0.092 \ldots 0.168$, i.e. roughly 0.09 to 0.17
• Our 95% confidence interval has a lower bound of about 0.09 and an upper bound of about 0.17.
• Remember what these numbers mean. This is the ratio of the "minutes awake" during a lesson to the "total minutes available in a lesson" (which is 100 minutes).
$ER = \frac{\text{number of minutes awake}}{100}$
• So we're predicting that if I incorporate this musical lesson, the entire population of 100 students will be awake for between 9 and 17 minutes on average.
• Without my song, the population of 100 students was awake for 7.7 minutes on average.
• Thus we're pretty sure singing works: it will keep you awake for at least 9 min, and possibly up to 17 min.

Self-assessment
• The engagement ratio may not be perfect; it may have a few flaws.
• For example, the fact that you're not sleeping does not necessarily mean you're engaged.
• Another option: you can self-report how engaged you think you are (on a scale from 1 to 10).
• And you can also self-assess how much you think you learnt (on a scale from 1 to 10).

Results
Measure of "Engagement": $\mu_E = 7.5$, $\sigma_E = 0.64$, $\bar{x}_E = 8.94$
Measure of "Learning": $\mu_L = 8.2$, $\sigma_L = 0.73$, $\bar{x}_L = 8.35$

Quiz
• Ultimately we want to know if incorporating a song about the concepts in the lesson will lead to higher engagement and learning. What statistics should we calculate to determine this?
1. Note whether the sample means are less than or greater than the population mean.
2. Calculate the actual difference between each sample mean and the population mean.
3. Find where each sample mean falls on the distribution of sample means for its respective population.
4. Find how many $\sigma$s each sample mean is from the population mean.

Measure of "Engagement"
• $\mu_E = 7.5$, $\sigma_E = 0.64$
• $n = 30$, $\bar{x}_E = 8.94$
• mean of the sample means: $\mu_E = 7.5$
• $SE_E = \frac{0.64}{\sqrt{30}} = 0.12$
• $Z_E = \frac{8.94 - 7.5}{0.64 / \sqrt{30}} = 12.32$

Measure of "Learning"
• $\mu_L = 8.2$, $\sigma_L = 0.73$
• $n = 30$, $\bar{x}_L = 8.35$
• mean of the sample means: $\mu_L = 8.2$
• $SE_L = \frac{0.73}{\sqrt{30}} = 0.13$
• $Z_L = \frac{8.35 - 8.2}{0.73 / \sqrt{30}} \approx 1.12$

Probability of getting a given mean
• $Z_E = 12.32$, $Z_L = 1.12$
• What is the probability of randomly selecting a sample of size 30 and getting a mean of at least 8.94 for engagement and 8.35 for learning?
• For $Z_E$ the probability is really low.
• For $Z_L$ the probability is $1.0 - 0.8686 \approx 0.13$.
• So what does this mean, what can we conclude? Check all that apply.

Conclusions?
1. The song seems to have had an effect on learning, but not engagement.
2. The song seems to have had an effect on engagement, but not learning.
3. The song caused an increase in both engagement and learning.
4. The song caused an increase in engagement, but not in learning.

Summary of our findings
Dependent variable (scale from 1 to 10)   Sample mean $\bar{x}$ (n = 30)   Probability   Likely or unlikely?
engagement                                8.94                             $p \ll 0.01$    ?
learning                                  8.35                             $p \approx 0.13$   ?

Summary of our findings
Dependent variable (scale from 1 to 10)   Sample mean $\bar{x}$ (n = 30)   Probability   Likely or unlikely?
engagement                                ???                              $p = 0.05$      ?
learning                                  ???                              $p = 0.10$      ?

Levels of likelihood: $\alpha$ levels
0.05 (5%)   0.01 (1%)   0.001 (0.1%)
• Three conventional levels of (un)likelihood
• If the probability of getting a sample mean is less than 0.05, 0.01, or 0.001, then it is usually considered unlikely.
• These are called the $\alpha$ levels, or significance levels.
• The $\alpha$ level is our criterion for deciding whether something is likely or unlikely.

Quiz
• Focus on $\alpha = 0.05$
[Figure: sampling distribution with the critical value Z* marked and the tail beyond it shaded orange]
• Which of the following are true?
1. If the probability of getting a particular sample mean is less than $\alpha$, it is unlikely to occur.
2. If a sample mean has a Z-score greater than Z*, it is "unlikely" to occur.
3. If the probability of getting a particular sample mean is "unlikely", the sample mean is in the orange region.
4. The alpha level corresponds to the orange region.

Z-critical value
• If the probability of obtaining a particular sample mean is less than the alpha level, then it will fall in this tail, which is called the critical region.
• If the Z-score of the sample mean is greater than the Z-critical value Z*, we have evidence that this mean is different from the regular population (the population that had not watched the musical lesson).

Critical regions
• What is the Z-critical value for $\alpha = 0.05$?
• Using the Z-table, you find the Z-value for a probability of 0.95, which is 1.65.
• What is the Z-critical value for $\alpha = 0.01$?
• 2.33
• What is the Z-critical value for $\alpha = 0.001$?
• 3.08
• We take a sample mean from a sample of size $n$.
• Then we calculate its Z-score
$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
• And we get a Z-score of 1.82.
• We say that this is significant at $p < 0.05$.
• 1.82 is somewhere in the red region in the previous picture: it lies beyond the critical value for 0.05, but not beyond the one for 0.01.
• It means that the probability of obtaining this sample mean is less than 5%, but not less than 1%.
• And remember, 0.05 is the alpha level.

Significance quiz
$\alpha$ level   Z-critical value
0.05      1.65
0.01      2.33
0.001     3.08

Z-score   Significant at
3.14      p < 0.001
2.07      p < 0.05
2.57      p < 0.01
14.31     p < 0.001
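
The calculations in this lecture (standard error, 95% confidence interval, Z-score and significance level) can be checked with a few lines of Python. This is only a minimal sketch: the function names and the `Z_CRITICAL` list are made up for illustration, the critical values are the one-tailed values read from the Z-table in the slides, and the tail probability is taken from the standard library's normal distribution rather than from a printed table.

```python
import math
from statistics import NormalDist

# One-tailed Z-critical values as quoted in the slides (assumed table values),
# ordered from the strictest alpha level to the loosest.
Z_CRITICAL = [(0.001, 3.08), (0.01, 2.33), (0.05, 1.65)]


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def confidence_interval(sample_mean, sigma, n, z=1.96):
    """Interval sample_mean +/- z * sigma / sqrt(n); z = 1.96 gives 95%."""
    margin = z * standard_error(sigma, n)
    return sample_mean - margin, sample_mean + margin


def z_score(sample_mean, mu, sigma, n):
    """Distance of the sample mean from mu, measured in standard errors."""
    return (sample_mean - mu) / standard_error(sigma, n)


def upper_tail_probability(z):
    """Probability of a Z-score at least this large under the standard normal."""
    return 1.0 - NormalDist().cdf(z)


def significance(z):
    """Strictest alpha level whose critical value z exceeds, or None."""
    for alpha, z_star in Z_CRITICAL:
        if z > z_star:
            return alpha
    return None


# Engagement ratio example: n = 30, sample mean 0.13, sigma = 0.107
low, high = confidence_interval(0.13, 0.107, 30)
print(f"95% CI: {low:.3f} ... {high:.3f}")

# Self-assessment example: engagement and learning scores
z_e = z_score(8.94, 7.5, 0.64, 30)
z_l = z_score(8.35, 8.2, 0.73, 30)
print(f"Z_E = {z_e:.2f}, significant at alpha = {significance(z_e)}")
print(f"Z_L = {z_l:.2f}, p = {upper_tail_probability(z_l):.2f}, "
      f"significant at alpha = {significance(z_l)}")

# Significance quiz
for z in (3.14, 2.07, 2.57, 14.31):
    print(z, "-> p <", significance(z))
```

A value of `None` from `significance` means the Z-score does not reach even the 0.05 critical value, which is the case for the learning score.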