Probability

Probability: a numerical measure of the likelihood that an event will occur.
An experiment: any process that generates well-defined outcomes.
Sample space (S): the set of all possible outcomes of an experiment.
An event (A): an outcome or set of outcomes of interest in the experiment. An event A is a subset of the sample space S.
The probability of an event A, P(A): a measure of the likelihood that event A will occur.

Example: Tossing a coin
Experiment: toss a coin and observe the up face.
S = {H, T}, where H = head and T = tail.

Example: Tossing a coin twice
Experiment: flip a coin twice and observe the sequence of up faces (keeping track of order).
S = {HH, HT, TH, TT}
A = {tossing at least one head} = {HH, HT, TH}

Example: Rolling a die
Experiment: roll a six-sided die and observe the number on the up face.
S = {1, 2, 3, 4, 5, 6}
A = {roll an even number} = {2, 4, 6}

Methods of assigning probability

Classical probability:
* Each outcome is equally likely.
* It is applicable to games of chance.
* In these cases, if there are N outcomes in S, the probability of any one outcome is 1/N.
* If A is any event and n_A is the number of outcomes in A, then $P(A) = n_A / N$.

Example: Rolling a die
S = {1, 2, 3, 4, 5, 6}
P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6
A = {roll an even number} = {2, 4, 6}
P(A) = 3/6 = 0.5

Empirical probability: the relative frequency with which an event is observed to happen (or fail), that is, the number of times the event occurred divided by the number of trials:
$P(A) = n_A / N$
where N = total number of trials and n_A = number of trials in which A occurred.

Relative frequency example

No. of children   Frequency   Relative frequency
0                 40          40/215 = 0.19
1                 80          80/215 = 0.37
2                 50          50/215 = 0.23
3                 30          30/215 = 0.14
4                 10          10/215 = 0.05
5                 5           5/215 = 0.02
Sum               215         215/215 = 1.00

Basic concepts of probability:
* Probability values are always assigned on a scale from 0 to 1.
* A probability near 0 indicates that an event is unlikely to occur.
* A probability near 1 indicates that an event is almost certain to occur.
* A probability near 0.5 indicates that an event is just as likely as it is unlikely.
* The sum of the probabilities of all outcomes must be 1.

Definitions

Mutually exclusive events: the occurrence of one event precludes the occurrence of the other.
Independent events: the occurrence of one event does not affect the occurrence or nonoccurrence of the other.
Complementary events: all elementary events that are not in the event A are in its complementary event A'. Since P(S) = 1, P(A') = 1 - P(A).

Laws of Probability

The addition rule: the probability of one event or another:
P(A or B) = P(A) + P(B) - P(A and B)
If A and B are mutually exclusive events (A and B cannot occur at the same time), then P(A and B) = 0 and
P(A or B) = P(A) + P(B)

Example: Type of position by gender (the Clerical count in the second row, 22, follows from the row and column totals)

Gender     Managerial   Professional   Technical   Clerical   Total
Group 1    8            31             52          9          100
Group 2    3            13             17          22         55
Total      11           44             69          31         155

An employee holds only one type of position, so Technical (T) and Clerical (C) are mutually exclusive:
P(T or C) = P(T) + P(C) = 69/155 + 31/155 = 100/155 = 0.645

Law of multiplication: the probability that both A and B occur together:
P(A and B) = P(A) × P(B|A)
If A and B are independent (the occurrence of one does not affect the occurrence of the other), then P(B|A) = P(B), and
P(A and B) = P(A) × P(B)
Probability of at least one = 1 - probability of none

Probability Distribution

Defined: a probability distribution describes how probability is distributed over all possible outcomes of an experiment.
Examples of probability distributions are:
* the binomial distribution (only two outcomes are possible on each trial, and the trials are statistically independent; example: a coin flip)
* the normal distribution
* other underlying distributions exist (such as the Poisson, t, F, chi-square, etc.)
that are used to make statistical inferences.

The normal probability distribution

* The normal curve is bell-shaped, with a single peak at the exact centre of the distribution.
* The arithmetic mean, median, and mode of the distribution are equal and located at the peak.
* The normal probability distribution is symmetrical about its mean (1/2 of the observations are above the mean and 1/2 are below).
* It is determined by two quantities: the mean and the SD.
* The random variable has an infinite theoretical range (the tails do not touch the X-axis).
* The total area under the curve is 1.

68% of the area under the curve lies within ±1 SD of the mean
95% of the area under the curve lies within ±1.96 SD of the mean
99% of the area under the curve lies within ±2.58 SD of the mean

Why is the normal distribution important?
A/ Because many types of data that are of interest have a normal distribution.
Central Limit Theorem: the sampling distribution of means approaches the normal distribution as N increases, regardless of the shape of the original distribution.
The binomial distribution approaches the normal distribution as N increases.
N.B.: The normal distribution is continuous; the binomial distribution is discrete.

Standard normal distribution (curve)

A normal distribution with a mean of zero and an SD of 1 is called the standard normal distribution.
Any normal distribution can be converted to the standard normal distribution using the Z-statistic (Z-value).
Z-value (SND): the distance between the selected value, designated X, and the population mean (M), divided by the population SD (σ):
$Z = \frac{X - M}{\sigma}$
The standard normal distribution curve is a bell-shaped curve centred on zero with SD = 1.

Z-score

The Z-score is often called the standardized value or Standard Normal Deviate (SND). It denotes the number of SDs a data value X lies from the mean, and in which direction.
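Because a z-score counts SDs from the mean, the coverage figures quoted above (68%, 95%, 99%) are the areas between -z and +z for z = 1, 1.96 and 2.58. A minimal Python sketch to check them, assuming the standard identity between the normal CDF and the error function. The helper names `normal_cdf` and `coverage` are illustrative and not part of the original notes.

```python
import math


def normal_cdf(z: float) -> float:
    """P(Z <= z) for a standard normal variable, computed from math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def coverage(z: float) -> float:
    """Area under the standard normal curve between -z and +z."""
    return normal_cdf(z) - normal_cdf(-z)


# Check the three coverage figures quoted in the notes.
for z in (1.0, 1.96, 2.58):
    print(f"within ±{z} SD: {coverage(z):.4f}")
# within ±1.0 SD: 0.6827
# within ±1.96 SD: 0.9500
# within ±2.58 SD: 0.9901
```

The quoted percentages are these areas rounded to whole numbers.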
A data value less than the mean will have a z-score less than zero; a data value greater than the mean will have a z-score greater than zero; and a data value equal to the mean will have a z-score of zero.

Normal curve table

The normal curve table gives the precise percentage of scores (values) between the mean (z-score of zero) and any other z-score. It can be used to determine:
1. the proportion of scores above or below a particular z-score
2. the proportion of scores between the mean and a particular z-score
3. the proportion of scores between two z-scores
By converting raw scores to z-scores, the table can be used in the same way for raw scores.
It can also be used in the opposite way: to determine the z-score for a particular proportion of scores under the normal curve.
* Table lists positive z-scores
* Can work for negatives too
* Why? Because the curve is symmetrical

Steps for figuring the percentage above or below a z-score:
1. Convert the raw score to a z-score, if necessary
2. Draw a normal curve:
   - indicate where the z-score falls
   - shade the area you are trying to find
3. Find the exact percentage with the normal curve table

Steps for figuring a z-score or raw score from a percentage:
1. Draw a normal curve, shading an approximate area for the percentage concerned
2. Find the exact z-score using the normal curve table
3. Convert the z-score to a raw score, if desired

Example:
For X = 2200, M = 2000, σ = 200: Z = (2200 - 2000)/200 = 1
For X = 1700, M = 2000, σ = 200: Z = (1700 - 2000)/200 = -1.5
A z-value of 1 indicates that the value of 2200 is 1 SD above the mean of 2000, while a z-value of -1.5 indicates that the value of 1700 is 1.5 SD below the mean of 2000.
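The conversions in the steps and example above can be sketched in Python. This is an illustration only: the function names `z_score`, `raw_score` and `area_mean_to_z` are hypothetical, and `area_mean_to_z` stands in for a table lookup by computing the area between the mean and z from the error function, on the assumption that the printed table tabulates that same area.

```python
import math


def z_score(x: float, mean: float, sd: float) -> float:
    """Number of SDs that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (x - mean) / sd


def raw_score(z: float, mean: float, sd: float) -> float:
    """Inverse conversion: the raw value that corresponds to a z-score."""
    return mean + z * sd


def area_mean_to_z(z: float) -> float:
    """Area under the standard normal curve between the mean (z = 0) and z.

    Uses abs(z) because the curve is symmetrical, as a table of
    positive z-scores does.
    """
    cdf = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return cdf - 0.5


print(z_score(2200, 2000, 200))          # 1.0
print(z_score(1700, 2000, 200))          # -1.5
print(raw_score(-1.5, 2000, 200))        # 1700.0
print(f"{area_mean_to_z(1.0):.4f}")      # 0.3413
print(f"{area_mean_to_z(-1.5):.4f}")     # 0.4332
```

Taking the absolute value mirrors the note that a table of positive z-scores also works for negative ones.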
Example: For M = 500, σ = 365, determine the position of 722 in SD units.
$Z = \frac{X - M}{\sigma} = \frac{722 - 500}{365} = \frac{222}{365} = 0.61$
We can also determine how much of the area under the normal curve is found between any point on the curve and the mean. Once you have a z-score, you can use the table to find the corresponding area: for z = 0.61, the area (from table A) is 0.2291, or about 0.23. Therefore 22.9%, or about 23%, of the population lies between 500 and 722.
Q/ How much of the population lies to the right of 722?
A/ 0.5 - 0.23 = 0.27
Q/ How much of the population lies to the left of 722?
A/ 0.5 + 0.23 = 0.73

Example: The daily water usage per person in an area is normally distributed with a mean of 20 gallons and an SD of 5 gallons.
Q1/ About 68% of the daily water usage per person in this area lies between what two values?
A/ About 68% of the daily water usage will lie within 1 SD of the mean, that is, between 15 and 25 gallons.
Q2/ What is the probability that a person from this area, selected at random, will use less than 20 gallons per day?
A/ P(X < 20) = 0.5, because 20 is the mean.
Q3/ What percent uses between 20 and 24 gallons?
A/ The z-value associated with X = 24: z = (24 - 20)/5 = 0.8. From the table, the area beyond z = 0.8 (the upper tail) is 0.2119. Thus P(20 < X < 24) = 0.5 - 0.2119 = 0.2881 = 28.81%
Q4/ What percent of the population uses between 18 and 26 gallons?
A/ The z-value associated with X = 18: z = (18 - 20)/5 = -0.4, and for X = 26: z = (26 - 20)/5 = 1.2. By symmetry, P(Z > -0.4) = P(Z < 0.4) = 0.6554, and the area beyond z = 1.2 is 0.1151. Thus P(18 < X < 26) = P(-0.4 < Z < 1.2) = 0.6554 - 0.1151 = 0.5403 = 54.03%

Example: Height of young women
The distribution of heights of women aged 20-29 years is approximately normal with a mean of 64 inches and an SD of 2.7 inches.
Q/ Approximately 68% of women have a height between ……………. and ………….
Q/ About 2.5% of women are shorter than ……..
Q/ Approximately what proportion of women are taller than 72.1 inches?
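The water-usage answers can be checked numerically instead of with a printed table. The sketch below is an illustration, not the method used in the notes: `normal_cdf` and `prob_between` are hypothetical helper names, and the CDF is computed from the error function, so results can differ from table-based answers in the fourth decimal place.

```python
import math


def normal_cdf(x: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """P(X <= x) for a normal variable with the given mean and SD."""
    z = (x - mean) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_between(lo: float, hi: float, mean: float, sd: float) -> float:
    """P(lo < X < hi), as a difference of two cumulative areas."""
    return normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)


MEAN, SD = 20.0, 5.0  # daily water usage in gallons, from the example

# Q1: about 68% of values lie within 1 SD of the mean.
print(MEAN - SD, MEAN + SD)                         # 15.0 25.0

# Q2: probability of using less than the mean.
print(normal_cdf(20, MEAN, SD))                     # 0.5

# Q3: between 20 and 24 gallons (about 0.288).
print(f"{prob_between(20, 24, MEAN, SD):.3f}")

# Q4: between 18 and 26 gallons (about 0.540).
print(f"{prob_between(18, 26, MEAN, SD):.3f}")
```

The same two helpers could be applied to the height exercise by substituting a mean of 64 and an SD of 2.7.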