www.studyguide.pk

Mathematical Modelling
A model is a simplification of the real thing. It is both quicker and cheaper to produce than the real thing and helps us to understand the real-world object or situation. Many mathematical models require the use of probability. A statistical experiment is a test, investigation or other process adopted for collecting data to provide evidence for or against a hypothesis. An event is a subset of the possible outcomes of an experiment. We can vary the parameters of a model if we wish. A disadvantage is that a model does not replicate a real-world situation in every detail.

Collecting Data
How data is collected matters: a method must be used that avoids bias. One source of bias is using data from responses to questions, as people may lie about personal details such as age and weight. Another source of bias is using data that does not properly apply to the problem. For example, published unemployment figures might be used to investigate the number of people looking for work, but they do not include students, people past retirement age, etc., and they may include people who are not looking for work.

To check that data is unbiased, ask:
● Where has the data come from?
● Who is supplying the data and why?
● How was the data collected?
● Is it all the relevant data or a sample?
● If a sample is used, how was the sample chosen?
● Is the data relevant to the investigation?
● Does the conclusion follow from the investigation?

Types of Data
Qualitative Data
These are non-numerical values such as attitudes, gender, colour, football shirt number (a label rather than a measurement).
Quantitative Data
These data have valid numerical values such as shoe size, number of broken eggs, height, time.
● Discrete data come from variables which can only take particular values, such as shoe size.
● Continuous data come from variables which can take any value within a given range.

Summarising Data
The reason that a sample is taken is to make deductions about the population.
Graphical and numerical summaries are essential to help us analyse the data collected. The purpose of these summaries is to condense the data, to reveal patterns and to enable comparisons to be made. Summarising can lead to a loss of accuracy.

StudyGuide.PK A-Level Maths S1 Notes

Ungrouped Frequency Distribution
Data must be sorted before any sense can be made of it. This is often done using a frequency distribution with a cumulative frequency column.

Stem and Leaf Diagrams
One way of ordering and presenting data is a stem and leaf diagram. Its benefit is that it retains all the original data while still grouping it into classes. We must arrange the leaves in numerical order and give a key. A stem and leaf diagram gives a quick visual impression of the shape of the distribution. Both integers and decimals can be represented, though the data is usually given to 2 significant figures; it may be necessary to round the data to meet this constraint. If a large number of leaves are associated with one stem, it is usual to split that stem over two lines. We can also improve the diagram by showing the number of leaves on each stem in brackets. If a direct comparison of two data sets is required, a back-to-back stem and leaf diagram can be drawn.

Grouped Frequency Distributions
We can summarise data in grouped frequency tables. The information becomes more concise, but the original values are lost. Grouping allows summaries and estimates to be made. Both continuous and discrete data can be grouped. The boundaries of neighbouring groups must be matched so that there are no gaps between them, even if this results in a negative starting point. Groups are usually referred to as classes. Age is a special case: the boundaries are matched to completed years, i.e. the classes 21-24, 25-28 are actually 21-25, 25-29.
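The matching of class boundaries described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `class_boundaries` and the example classes are invented here, and the sketch assumes that every pair of neighbouring classes is separated by the same gap.

```python
def class_boundaries(limits, age=False):
    """Turn stated class limits such as (10, 19), (20, 29) into matched boundaries.

    Assumes at least two classes, listed in order, with the same gap between
    one class's upper limit and the next class's lower limit.
    """
    gap = limits[1][0] - limits[0][1]
    if age:
        # Ages are recorded in completed years, so 21-24 really covers 21 up to 25.
        return [(lo, hi + gap) for lo, hi in limits]
    # Otherwise share the gap equally between the two neighbouring classes.
    half_gap = gap / 2
    return [(lo - half_gap, hi + half_gap) for lo, hi in limits]


print(class_boundaries([(0, 9), (10, 19), (20, 29)]))
# [(-0.5, 9.5), (9.5, 19.5), (19.5, 29.5)]
print(class_boundaries([(21, 24), (25, 28)], age=True))
# [(21, 25), (25, 29)]
```

The first call shows the negative starting point mentioned above; the second reproduces the age example.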
Cumulative Frequency Curves and Polygons for Grouped Data
When data is grouped (discrete or continuous), we take the cumulative frequency to be the total frequency up to the upper class boundary (ucb) of each interval. To draw a cumulative frequency curve, we plot the cumulative frequency (cf) of each interval against its ucb and join the points with a smooth curve. For a cumulative frequency polygon, we join the points with straight lines instead.

Histograms
If the data is for a continuous variable and is summarised by a grouped frequency distribution, it can be represented by a histogram. There are no gaps between the bars of a histogram, so the class boundaries must be matched. There is an important relationship between the area of a histogram bar and the frequency it represents:
● Area is directly proportional to frequency.
● Total area is directly proportional to total frequency.
Frequency density = frequency / class width
Frequency = frequency density × class width
There are times when it is useful to draw a histogram based on relative frequencies rather than frequencies. The relative frequencies are obtained by expressing the frequencies as a proportion of the total frequency.

Methods of Summarising Sample Data
Measures of Location (averages)
These are sometimes called measures of central tendency; they attempt to locate a typical value about which a distribution clusters.
Measures of Dispersion
These represent the spread or variation within the data, since it is unlikely that all the values in a data set will be the same.
All these measures are numerical quantities.

Measures of Location
The Mode
The mode is the value that occurs most often. It is not always unique (a distribution can be bimodal) and there may be no mode at all. For grouped frequencies, the mode is not always useful, but there are ways to estimate the mode using a histogram.
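As a rough illustration of the histogram relationships above, the sketch below computes frequency densities and picks out the modal class. It assumes the usual convention that, when class widths are unequal, the modal class is the one with the greatest frequency density (the tallest bar); the function names and the data are invented for the example.

```python
def frequency_densities(classes):
    """classes: list of (lower_boundary, upper_boundary, frequency) tuples."""
    return [freq / (upper - lower) for lower, upper, freq in classes]


def modal_class(classes):
    """Return the (lower, upper) boundaries of the class with the tallest bar."""
    densities = frequency_densities(classes)
    tallest = densities.index(max(densities))
    lower, upper, _ = classes[tallest]
    return lower, upper


# Hypothetical data: boundaries and frequencies invented for illustration.
data = [(0, 10, 12), (10, 15, 10), (15, 20, 8), (20, 40, 16)]
print(frequency_densities(data))  # [1.2, 2.0, 1.6, 0.8]
print(modal_class(data))          # (10, 15)
```

Note that the class 20-40 has the largest frequency but the smallest frequency density, which is why bar heights rather than raw frequencies are compared.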
Usually, the modal class is sufficient. The mode is easy to find and is not affected by extreme values. It is useful, for example, to shops deciding which sizes to stock.

The Median
The median is the middle value of an ordered set of data. If there are n observations arranged in order of size, the median is the $\frac{n+1}{2}$th observation. To find the median, we use the cumulative frequency. We can estimate the median of grouped data using linear interpolation:

Median $Q_2 = L + \dfrac{\frac{n+1}{2} - f_L}{f} \times c$

where
$L$ = lower class boundary of the median class
$n$ = total frequency
$f_L$ = cumulative frequency up to the median class
$f$ = frequency in the median class
$c$ = class width of the median class

The median has similar advantages and disadvantages to the mode.

Other Quantiles
These can be found using the formula above, with $\frac{n+1}{2}$ replaced by $\frac{k(n+1)}{4}$ for the kth quartile, $\frac{k(n+1)}{10}$ for the kth decile and $\frac{k(n+1)}{100}$ for the kth percentile; e.g. the 43rd percentile uses $\frac{43(n+1)}{100}$ in place of $\frac{n+1}{2}$.

The Arithmetic Mean
The mean is the most widely used measure of location and is often used in conjunction with the standard deviation (a measure of spread). If x1, x2, x3, ... xn are a set of numbers then

$\bar{x} = \dfrac{\sum x}{n}$

For a frequency distribution this formula is rewritten as

$\bar{x} = \dfrac{\sum fx}{\sum f}$ where $\sum f = n$

Always state the appropriate values in your answer, i.e. Σfx, Σf, n. When given two means and their frequencies, find the two totals, add them together and divide by the combined frequency to get the new mean (a weighted mean). For grouped data we use the midpoint of each class. Remember that age is special: if you have the groups 0-9, 10-19, the first group is treated as 0-10, so its midpoint is 5.

Advantages and disadvantages
The mean is influenced by extreme values; it is sensitive to the presence of outliers. It is not as easily calculated as the median. All the values are used directly when calculating the mean.
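The grouped-data estimates described in this section (quantiles by linear interpolation and the mean from class midpoints) can be sketched as follows. This is a minimal sketch rather than a definitive method: the function names and the data are assumptions made for illustration, and the position of each quantile is taken as a fraction of n + 1, as in the notes.

```python
def quantile_grouped(classes, fraction):
    """Estimate a quantile of grouped data by linear interpolation.

    classes: list of (lower_boundary, upper_boundary, frequency), in order.
    fraction: 0.5 for the median, 0.25 for Q1, 0.43 for the 43rd percentile.
    """
    n = sum(freq for _, _, freq in classes)
    position = fraction * (n + 1)
    cumulative = 0  # f_L: cumulative frequency below the current class
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= position:
            # L + (position - f_L) / f * c
            return lower + (position - cumulative) / freq * (upper - lower)
        cumulative += freq
    return classes[-1][1]  # position lies beyond the data: use the top boundary


def mean_grouped(classes):
    """Estimate the mean as sum(f * midpoint) / sum(f)."""
    total_fx = sum(freq * (lower + upper) / 2 for lower, upper, freq in classes)
    total_f = sum(freq for _, _, freq in classes)
    return total_fx / total_f


# Hypothetical grouped data with n = 19, so the median is the 10th value.
data = [(0, 10, 4), (10, 20, 9), (20, 30, 6)]
print(round(quantile_grouped(data, 0.5), 2))   # 16.67
print(round(quantile_grouped(data, 0.25), 2))  # 11.11
print(round(mean_grouped(data), 2))            # 16.05
```

For the median here, L = 10, f_L = 4, f = 9 and c = 10, giving 10 + (10 - 4)/9 × 10.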
The mean has important mathematical properties. Even if a grouped frequency distribution has unequal class intervals, this makes no difference to the calculation of the mean. Remember that for grouped data, the mean is only an estimate.

Calculating the Mean Using the Method of Coding
Use this method if asked to do so. The coding

$y = \dfrac{x - a}{b}$

alters the original x values, where
a = the midpoint of the modal class
b = the class width (if the class widths are not equal, use the smallest class width)
From this we can calculate the mean of y and decode to find the mean of x:

$\bar{x} = b\bar{y} + a$

Weighted Mean
When we wish to place greater emphasis on some of the values we use a weighted mean.

Measures of Dispersion
Range
● The simplest measure of spread
● Based entirely on extreme values
● The smallest value is subtracted from the largest value
● For grouped frequency distributions, an estimate of the range is the difference between the lower class boundary of the first group and the upper class boundary of the last group
● Does not lend itself to mathematical use
● Used only with small data sets, in conjunction with either the mode or the median

Interquartile Range
● The range of the middle 50% of the data: IQR = Q3 - Q1
● Not affected by extreme values
● If the median is the measure of location used, then the IQR is the appropriate measure of dispersion
● Often used when data has extreme values, has open-ended classes or is not symmetrical
● Used extensively in conjunction with box plots
● Can help us identify outliers and examine the skewness of a distribution

Semi-Interquartile Range
SIQR = IQR/2

Standard Deviation and Variance
Standard deviation is used in conjunction with the mean. It uses all the data values. The population variance is denoted by σ². The sample variance is denoted by s². The standard deviation is the positive square root of the variance.
The population sd is denoted by σ. The sample sd is denoted by s.

$\sigma^2 = \dfrac{\sum x^2}{n} - \bar{x}^2$

$\sigma = \sqrt{\dfrac{\sum x^2}{n} - \bar{x}^2}$ where $\bar{x} = \dfrac{\sum x}{n}$

● For most distributions, the bulk (roughly 95%) of the distribution lies within 2 sd of the mean
● The units of the sd are the same as those of the original data
● The variance can never be negative (its square root is the sd)
● For similar sets of data it is useful to compare the sd's

When there is a frequency distribution we use the formula:

$\sigma = \sqrt{\dfrac{\sum fx^2}{\sum f} - \bar{x}^2}$

We can code and decode as before, but when decoding the sd you do not need to add a, as this shift does not alter the spread; only the scaling by b is undone (σx = b × σy). See the purple notes for combining sets of numbers.

Skewness
Symmetrical Bell-Shaped Distribution (Normal Distribution)
mean = median = mode
Positively Skewed Distribution
mean > median > mode. The mean is pulled in a positive direction.
Negatively Skewed Distribution
mean < median < mode. The mean is pulled in a negative direction.

Measures of Skewness
Pearson's Measure of Skewness

Pearson's measure of skewness $= \dfrac{\text{mean} - \text{mode}}{\text{standard deviation}}$

If this value is positive then we have positive skewness. If this value is negative then we have negative skewness.
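The standard deviation formula for a frequency distribution and Pearson's measure of skewness can be checked numerically. The sketch below is only an illustration: the function names and the small data set are assumptions, and the population form of the variance (dividing by Σf) is used because that is the formula given in these notes.

```python
from math import sqrt


def mean_and_sd(values, freqs=None):
    """Population mean and sd using sigma^2 = sum(f x^2) / sum(f) - mean^2."""
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    variance = sum(f * x * x for x, f in zip(values, freqs)) / n - mean ** 2
    # Guard against a tiny negative value caused by floating-point rounding.
    return mean, sqrt(max(variance, 0.0))


def pearson_skewness(mean, mode, sd):
    """(mean - mode) / standard deviation."""
    return (mean - mode) / sd


# Hypothetical frequency distribution: the value 1 occurs 4 times, 2 occurs 3 times, ...
values = [1, 2, 3, 4]
freqs = [4, 3, 2, 1]
mean, sd = mean_and_sd(values, freqs)
print(mean, sd)                        # 2.0 1.0
print(pearson_skewness(mean, 1, sd))   # 1.0
```

Here Σf = 10, Σfx = 20 and Σfx² = 50, so the variance is 50/10 - 2² = 1; the mode is 1, and the positive result matches the long right-hand tail of the data.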
Pearson's measure can be rewritten, using the approximation mean - mode ≈ 3(mean - median), as:

$\dfrac{3(\text{mean} - \text{median})}{\text{standard deviation}}$

Generally this measure of skewness takes values between -3 and 3.

Quartile Coefficient of Skewness
Normal Distribution
Q3 - Q2 = Q2 - Q1, quartile skewness = 0
Positively Skewed Distribution
Q3 - Q2 > Q2 - Q1, quartile skewness > 0
Negatively Skewed Distribution
Q3 - Q2 < Q2 - Q1, quartile skewness < 0

Box Plots
● A box plot illustrates the dispersion or spread of a distribution, as well as the average (median)
● It uses the highest and lowest values of the data and the three quartiles
● The box encloses the middle 50% (the IQR)
● The whiskers extend to the highest and lowest values (the range)

When commenting on box plots you must:
● give all the summary statistics (median, IQR, range)
● comment on the skewness of the given distributions, with justifying calculations
● make comparisons between the two or more distributions
Always draw box plots on graph paper and label your axis clearly. Use a suitable scale.

Symmetrical Bell-Shaped Distribution
The whiskers are of equal length and the median is in the middle of the box.
Positively Skewed Distribution
The right-hand whisker is longer and the median is nearer to the lower quartile.
Negatively Skewed Distribution
The left-hand whisker is longer and the median is nearer to the upper quartile.

Use of Box Plots to Identify Outliers
● Extreme values are known as outliers
● There may be good reason for these results, but they are often due to errors
● They may need to be highlighted
● They are often taken to be points lying more than 1.5 times the IQR above Q3 or below Q1

Procedure
Find the values of the quartiles. Evaluate Q1 - 1.5(Q3 - Q1) and Q3 + 1.5(Q3 - Q1) and note any values that fall outside this range. Draw a box based on the quartile values. If there are any outliers, mark them with crosses.
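The outlier check in the procedure can be sketched in a few lines of Python. The quartiles are passed in rather than computed, since the notes find them separately; the function names and the data set are invented for illustration.

```python
def outlier_fences(q1, q3):
    """Return (Q1 - 1.5 * IQR, Q3 + 1.5 * IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Values lying outside the two fences."""
    lower, upper = outlier_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]


def whisker_ends(values, q1, q3):
    """Most extreme values that are not outliers."""
    lower, upper = outlier_fences(q1, q3)
    inside = [v for v in values if lower <= v <= upper]
    return min(inside), max(inside)


# Hypothetical ordered data; Q1 = 16 and Q3 = 23.5 are taken as already found.
data = [2, 15, 17, 18, 20, 21, 23, 24, 40]
print(outlier_fences(16, 23.5))        # (4.75, 34.75)
print(find_outliers(data, 16, 23.5))   # [2, 40]
print(whisker_ends(data, 16, 23.5))    # (15, 24)
```

With an IQR of 7.5, any value below 4.75 or above 34.75 is flagged, so 2 and 40 would be marked with crosses.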
The whisker is usually drawn to the next value towards the median, i.e. the most extreme value that is not an outlier. Only calculate these outliers if the question specifically asks you to do so.

Correlation
● Correlation concerns the relationship between two variables x and y
● Bivariate data produce a bivariate distribution
● There may be a relationship, but you cannot necessarily expect to find a law or formula relating the variables
● We initially look for basic associations

Scatter Diagrams
Bivariate data is conveniently displayed using scatter diagrams. They help to assess correlation and regression, and we can use them to help show linear correlation. Even if we find a mathematical relationship, this does not imply that there is a relationship in reality, or indeed that an increase in one variable causes an increase in the other. Correlation measures the relationship between the two variables and the strength of this relationship. If both variables increase together we say that they are positively correlated. If one variable increases as the other decreases we say that they are negatively correlated. If no relationship can be seen we say there is no correlation. When drawing scatter diagrams for correlation it doesn't matter which axis is used for which variable; however, it does matter for regression.

If a horizontal line and a vertical line are drawn through the mean point (x̄, ȳ), you can see the association between the two variables in a different way:
● For a positive correlation most points lie in the first and third quadrants (top right and bottom left respectively)
● For a negative correlation most points lie in the second and fourth quadrants (top left and bottom right respectively)
● If there is no correlation the points are randomly distributed in all four quadrants

Product Moment Correlation Coefficient, r (PMCC)
The PMCC r is a numerical value that indicates the degree of scatter. It measures the relationship between the two variables and its strength.
We must calculate this value and interpret its meaning. The value of r lies between -1 and 1 inclusive. It is a useful measure because it is independent of the units and scale of the variables. In practice, r should only be calculated after a scatter diagram has been drawn, and only if the scatter diagram reveals some degree of linear correlation. If the correlation is non-linear then the PMCC is not appropriate. Outliers, or rogue results, should be identified as they may upset the general trend.
● If r = 1 there is perfect positive linear correlation between the two variables.
● If r = -1 there is perfect negative linear correlation between the two variables.
● If r = 0 (or close to 0) there is no linear correlation; this does not, however, exclude the existence of another type of relationship.

Calculation

$r = \dfrac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$

where $S_{xy} = \sum xy - \dfrac{\sum x \sum y}{n}$, $S_{xx} = \sum x^2 - \dfrac{(\sum x)^2}{n}$ and $S_{yy} = \sum y^2 - \dfrac{(\sum y)^2}{n}$

We must find n, Σx, Σx², Σy, Σy² and Σxy, and then use the formulae above. The calculator must be in linear regression mode.

Using a Method of Coding for Correlation
The beauty of coding for the PMCC is that we do not need to decode at the end. Coding makes the values of x and y smaller. You can subtract any number from the x values, since this only moves the axis. You can divide the result by any positive number, since this only changes the scale. The correlation coefficient is unaffected by either of these operations. You can rewrite the variables x and y as:

$X = \dfrac{x - a}{b}$, $Y = \dfrac{y - c}{d}$

where a, b, c and d are suitable numbers to be chosen (with b and d positive).

Note: Just because two variables have a linear correlation does not necessarily mean that they are related. Thus, you should have some reason to believe that there might be a relationship before calculating the PMCC, unless your aim is to show that they are unrelated. Data can be distorted by an outlier, so the data should be plotted on a scatter graph first.
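The calculation of r from the summary totals, and the claim that coding leaves r unchanged, can be illustrated with a short Python sketch. The function name `pmcc`, the data and the coding constants are all assumptions chosen for the example.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    sxx = sum(x * x for x in xs) - sum_x ** 2 / n
    syy = sum(y * y for y in ys) - sum_y ** 2 / n
    return sxy / sqrt(sxx * syy)


# Hypothetical bivariate data.
xs = [1010, 1020, 1030, 1040, 1050]
ys = [2.1, 2.9, 4.2, 4.8, 6.1]

# Coding: X = (x - 1000) / 10 and Y = (y - 2) / 0.1 give smaller, simpler numbers.
coded_xs = [(x - 1000) / 10 for x in xs]
coded_ys = [(y - 2) / 0.1 for y in ys]

print(pmcc(xs, ys))
print(pmcc(coded_xs, coded_ys))  # agrees with the line above up to rounding error
```

Both calls give the same value of r (apart from floating-point rounding), so there is nothing to decode.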
Note: Data lying on a quadratic curve can give a PMCC of 0 (for example when the x values are spread symmetrically about the turning point): there is a relationship, but it is non-linear. Often variables are linked only through a third variable, particularly when both change over time.

Regression
Purpose: to find a law connecting two variables, so that we can make predictions about the value of y for any given value of x.

Explanatory and Response Variables
The value of x is controlled. It is known as the explanatory or independent variable, whilst y is called the response or dependent variable. The response variable will be subject to some level of error or natural variation. To see if there is a relationship, we plot a scatter diagram. The explanatory variable is always plotted horizontally and the response variable is always plotted vertically. By examining the scatter diagram for the data, we can see whether a straight line would be an appropriate model for the relationship between x and y.

The Straight Line Law
In statistics, instead of writing y = mx + c, we use y = a + bx. This can be rearranged to

$y - \bar{y} = b(x - \bar{x})$

Having assumed the linear regression model, the results are used to find a regression line. This line is known as the regression line of y on x, since y is the response variable for a given value of x. If you assume a linear regression line, each point with coordinates (xi, yi) will have a vertical distance ri from the regression line. These distances are known as residuals. If the residuals are very small, a line may be drawn by eye; however, a much better solution is to find the line of best fit using the method of least squares. Legendre formulated this method. The resulting line is known as the least squares regression line.

The Least Squares Regression Line
This is the line that makes the sum of the squares of the residuals as small as possible, i.e. $\sum r_i^2$ is minimised. We substitute the mean point (x̄, ȳ) into the equation $y - \bar{y} = b(x - \bar{x})$ and rearrange to get y = a + bx, with $a = \bar{y} - b\bar{x}$. The gradient (m in y = mx + c) is given the letter b and is called the regression coefficient of y on x.
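A minimal sketch of fitting the regression line of y on x is given below. It assumes the standard least squares results b = Sxy/Sxx and a = ȳ - b x̄, with Sxy and Sxx as defined for the PMCC; the function names and data are invented for illustration.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least squares regression line y = a + b x of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    b = sxy / sxx              # regression coefficient of y on x
    a = mean_y - b * mean_x    # the line passes through the mean point
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    """Sum of squared vertical distances from the points to y = a + b x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data that is roughly, but not exactly, linear.
xs = [1, 2, 3, 4, 5]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]
a, b = regression_line(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)

# Any other line, for example a slightly steeper one, has a larger sum of squares.
print(best < sum_squared_residuals(xs, ys, a, b + 0.1))  # True
```

The final comparison illustrates what "least squares" means: nudging the gradient away from b increases the sum of the squared residuals.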
We will need to calculate b using the formula

$b = \dfrac{S_{xy}}{S_{xx}}$, with $\bar{x} = \dfrac{\sum x}{n}$ and $\bar{y} = \dfrac{\sum y}{n}$

To draw this line, we choose three points: the mean point, one point whose x value is at the low end of the observed values and another point whose x value is at the high end of the observed values. We can use our regression line to obtain estimates of y for given values of x under appropriate conditions.

Application and Interpretation
Making estimates of the response variable within the range of the observed values of the data is known as interpolation. We do not know what happens outside the range of our experimental data. We are assuming a linear relationship within our observed values, and for all we know the relationship between the variables outside this range may be non-linear. Therefore it is dangerous to make predictions or estimates for the response variable based on values outside the range of observed values. This process is known as extrapolation. You will also be asked to interpret the values of a and b from your regression line within the context of the question.

Regression is concerned with finding a linear law between the two variables in question, with the value of the response variable depending upon that of the explanatory variable. Correlation is concerned with how strongly two variables are linearly associated (not a law).

Probability
Venn Diagrams and Probability
Definitions
∩ = intersection (AND)
∪ = union (OR)
A′ = NOT A
In maths, OR is inclusive: A OR B means A, B or both.
P(A) = 1 - P(A′)
P(A′) = 1 - P(A)
P(A∪B) = P(A) + P(B) - P(A∩B)
P(A′∪B) = P(A′) + P(B) - P(A′∩B)
P(A∪B′) = P(A) + P(B′) - P(A∩B′)
P(A′∪B′) = P(A′) + P(B′) - P(A′∩B′)
P(A′∩B′) = 1 - P(A∪B)
P(A′∩B) = P(A∪B) - P(A)
P(A∩B′) = P(A∪B) - P(B)

Mutual Exclusivity
Two events A and B are said to be mutually exclusive (m.e.) if they cannot occur at the same time.
In this case, in the Venn diagram, A and B do not overlap. Thus P(A∩B) = 0 and P(A∪B) = P(A) + P(B) for these events.

Exhaustion
If two events A and B are such that A∪B makes up all the possible outcomes, then P(A∪B) = 1. We say that A and B are exhaustive, and P(A) + P(B) - P(A∩B) = 1.

Conditional Probability (Dependent Events)
If A and B are any two events where P(A) ≠ 0 and P(B) ≠ 0, then the probability of A given that B has already occurred is written as P(A|B).
P(A|B) = P(A∩B) / P(B)
P(B|A) = P(A∩B) / P(A)
Conditional probability reduces the sample space.
Note: If events A and B are m.e. then we know P(A∩B) = 0, so P(A|B) = P(B|A) = 0.
Note: We can extend this basic definition of conditional probability to things like P(A′|B) = P(A′∩B) / P(B).
Note: P(A′|B) = 1 - P(A|B) and P(A′|B′) = 1 - P(A|B′).
● Sampling without replacement involves conditional probability
● Sampling with replacement gives independent events
● Use common sense where possible
● Resort to definitions when common sense fails

Independent Events
Two events are independent if the probability that one of them occurs is in no way influenced by whether or not the other has occurred. In this case
P(A|B) = P(A)
P(B|A) = P(B)
P(A∩B) = P(A) × P(B)

Discrete Random Variables
The following are examples of discrete random variables:
● the score when a die is thrown
● the value of a prize awarded
● the profit in a game of chance, etc.
The set of all possible values of a random variable (r.v.), together with their probabilities, is called a probability distribution. The function that describes how the probabilities are assigned is called the probability function. For an r.v. X the probability function is denoted by P(X = x). Remember that Σ P(X = x) = 1. Random variables are denoted by capital letters and the particular values they take are denoted by lower case letters. Whatever the question is, always define what the random variable is.
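The definitions in this part of the notes (conditional probability, independence and the probability distribution of a discrete random variable) can be explored by listing a sample space explicitly. The sketch below uses two fair dice; the event names and helper functions are assumptions made for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# Sample space: the 36 equally likely ordered outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """P(event), where event is a set of outcomes from the sample space."""
    return Fraction(len(event), len(outcomes))


def conditional(a, b):
    """P(A|B) = P(A ∩ B) / P(B); assumes P(B) is not zero."""
    return prob(a & b) / prob(b)


def independent(a, b):
    """A and B are independent when P(A ∩ B) = P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)


first_is_six = {o for o in outcomes if o[0] == 6}
second_is_six = {o for o in outcomes if o[1] == 6}
total_at_least_ten = {o for o in outcomes if sum(o) >= 10}

print(conditional(total_at_least_ten, first_is_six))  # 1/2
print(independent(first_is_six, second_is_six))       # True
print(independent(first_is_six, total_at_least_ten))  # False

# Probability distribution of the random variable X = total score on the two dice.
distribution = {}
for o in outcomes:
    distribution[sum(o)] = distribution.get(sum(o), 0) + Fraction(1, len(outcomes))
print(sum(distribution.values()))  # 1
```

Knowing that the first die shows a six reduces the sample space to six outcomes, three of which give a total of at least ten, which is where the 1/2 comes from.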
The function that is responsible for allocating the probabilities P(X = x) is also known as the probability density function (pdf). It can be expressed in tabular form or as a formula.

The cumulative distribution function (cdf) is F(x) = P(X ≤ x), and F(largest value of X) = 1.

Expectation E(X)
E(X) = Σ x P(X = x)
E(X) is the expected value, the mean of the probability distribution. We obtain the expected mean by multiplying each score by its corresponding probability and summing. This is a theoretical approach (the mean of a frequency distribution is an experimental approach).
Note: Some probability distributions are symmetrical about a central value. In this case E(X) is the middle value.
A discrete random variable with pdf P(X = x) = k for all given values of x, where k is a constant, is said to follow a uniform distribution.

The Expectation of Any Function of X
The definition of expectation can be extended to any function of the r.v. X, such as X², 9X, X - 4, 3X² - 5X. In general, if g(X) is a function of X, a discrete random variable, then
E[g(X)] = Σ g(x) P(X = x)
The following results hold when X is a discrete random variable and a and b are constants:
1. E(a) = a
2. E(aX) = aE(X)
3. E(aX + b) = aE(X) + b

The Variance of X
Var(X) = E(X²) - [E(X)]², where E(X) is the mean μ
Var(a) = 0
Var(aX) = a² Var(X)
Var(aX + b) = a² Var(X)
Var(aX ± bY) = a² Var(X) + b² Var(Y), for independent X and Y

The Discrete Uniform Distribution
If the discrete random variable X is defined over the set of distinct values {x1, x2, x3, ... xn} and each value is equally likely, then X has a discrete uniform distribution and
P(X = xr) = 1/n, r = 1, 2, 3, ... n
where X is the value of the next outcome.
If X is a discrete uniform variable and xr = r (i.e. the x values start at 1 and go up consecutively to n), then

$\mu = E(X) = \dfrac{n+1}{2}$ and $\sigma^2 = \mathrm{Var}(X) = \dfrac{(n+1)(n-1)}{12}$

The Normal Distribution
The most important continuous distribution in statistics.
It is seen in heights, masses, ages, etc. The probability density function of the normal random variable is very complicated. The shape of the curve depends on two parameters, the mean and the variance:
X ~ N(μ, σ²)
● The distribution is bell shaped and symmetrical about the mean
● Mean = median = mode
● About 95% of the distribution lies within 2 sd of the mean; about 99.7% lies within 3 sd of the mean
● It is a two-parameter distribution: probabilities for X depend only on μ and σ²
● Area under the curve = 1
We must standardise X to get the standard normal random variable Z, where Z ~ N(0, 1).

Areas under the Curve
Use the tables to find values of Φ(a) for a in the interval 0 to 4. For values between -4 and 0 we use the symmetry of the normal distribution to find the appropriate probabilities.
P(Z < a) = Φ(a)
P(Z > a) = 1 - P(Z < a) = 1 - Φ(a)
P(Z < -a) = Φ(-a) = 1 - Φ(a) (by symmetry)
P(Z > -a) = Φ(a)
P(a < Z < b) = Φ(b) - Φ(a)
P(-a < Z < a) = P(|Z| < a) = 2Φ(a) - 1
P(|Z| > a) = 1 - P(|Z| < a) = 2 - 2Φ(a)
Use all four decimal places from the table.

Special Probability Table
This contains values z of the normal variable Z ~ N(0, 1) such that the r.v. exceeds z with probability p, i.e. P(Z > z) = p. You can use both tables in reverse to find the value of z, given a probability.

Transformation of any Normal Random Variable to a Standard Normal r.v.
If X ~ N(μ, σ²) then

$Z = \dfrac{X - \mu}{\sigma}$ where Z ~ N(0, 1)

This is called standardising X to the standard normal r.v. Z.
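The table look-ups above can be cross-checked in Python. In this sketch Φ is computed from the error function in the standard `math` module rather than read from tables, so answers may differ from table values in the fourth decimal place; the function names and the example distribution X ~ N(50, 4²) are assumptions made for illustration.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cdf P(Z < z), via the identity 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def standardise(x, mu, sigma):
    """Z = (X - mu) / sigma, where sigma is the standard deviation (not the variance)."""
    return (x - mu) / sigma


def prob_between(lower, upper, mu, sigma):
    """P(lower < X < upper) for X ~ N(mu, sigma^2)."""
    return phi(standardise(upper, mu, sigma)) - phi(standardise(lower, mu, sigma))


# Hypothetical example: X ~ N(50, 16), so sigma = 4.
print(round(phi(standardise(56, 50, 4)), 4))   # P(X < 56) = phi(1.5) = 0.9332
print(round(prob_between(42, 58, 50, 4), 4))   # within 2 sd of the mean: 0.9545
print(round(2 - 2 * phi(1.96), 4))             # P(|Z| > 1.96) = 0.05
```

Remember that N(μ, σ²) quotes the variance, so the square root must be taken before standardising.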