CML 2 - Nicole De Langen
Nicole De Langen
CML 2: Statistics and Probability

CONTENTS
Data; Statistic; Continuous data; Misrepresent; Column graph; Bar graph; Pie graph; Scatter plot; Histogram; Five summaries; Quartile; The box and whisker plot; Central tendency; Probability distributions; Lattice diagram; Venn diagram; Tree diagram; Frequency table; Probability; Characteristics of shape; Test and reflections

YEAR 10 LEARNING OUTCOMES
CML 2 - Probability and statistics: I think that this CML may be easier to do, because there aren't many formulas (I think) used in statistics and probability. It would be good to understand most of this CML because it will help in other subjects when I am asked to explain graphed information, or even to work out certain data points when writing a practical report. This time I would like to improve on the reflective side of the journal and include a better description of the modelling task.

Data
A collection of facts, such as values or measurements.
Example question: Which is smarter, a cat or a dog?

Choice    Results
Cats      10
Dogs      20

The choices (cat and dog) are the information from the question. 10 is the number of people that chose cats and 20 is the number of people that chose dogs.

Statistic
A numerical fact or datum, especially one computed from a sample. Statistics covers how to collect, summarise and present data.

Day       1     2     3     4
Height    6.5   6.6   6.8   6.7

Average height = 6.65
Minimum height = 6.5
Maximum height = 6.8

The statistic lays out the results or data in a readable format, which makes the data easier to understand.

Quantitative Data
Counted (called discrete data) or measured (called continuous data). Data is either qualitative or quantitative, and quantitative data is either discrete or continuous.

Discrete Data
Data that can only take certain values. For example: the number of students in a class (you can't have half a student). It also means a count has to be rounded up or down to a whole number: instead of 4.7 of a person you would round it up to 5.
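The three statistics quoted for the Day and Height table earlier in this section can be checked with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the average is rounded to two decimal places to match the journal.

```python
from statistics import mean

# Heights for days 1 to 4, copied from the Day / Height table
heights = [6.5, 6.6, 6.8, 6.7]

# Average = sum of the values divided by how many values there are
average_height = round(mean(heights), 2)
minimum_height = min(heights)
maximum_height = max(heights)

print("Average height:", average_height)
print("Minimum height:", minimum_height)
print("Maximum height:", maximum_height)
```

The same three calls work for any list of measurements, so the table can grow to more days without changing the code.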
Continuous Data
Data that can take any value (within a range). For example: people's heights could be any value (within the range of human heights), not just certain fixed heights.

Misrepresent
To represent incorrectly, improperly, falsely or unsatisfactorily.

Age, gender    Height
Male, 15       180 cm
Female, 15     170 cm
Male, 16       678 cm

This is a misrepresentation because it is virtually impossible to have a 16-year-old male with a height of 6 m 78 cm.

Column Graph
A graph that shows data with vertical bars. This is an example of a column graph!

[Figure: column graph "School Pets"; vertical axis: Number of students (0 to 80); horizontal axis: Types of pets]

A column graph has 4 important components:
1. The heading / title of your experiment
2. The number of students
3. The data put into columns
4. The names of the pets, or the options given during a survey

Bar Graph
This is virtually the same as a column graph but the other way around. Instead of vertical columns it has horizontal bars.

[Figure: bar graph "School Pets"; vertical axis: Animals of choice (Snakes, Fish, Mice, Birds, Dogs, Cats); horizontal axis: number of students (0 to 70)]

1. Heading / title
2. Number of students
3. Data in bars
4. The names of the pets

DID YOU KNOW? You can make any graph with ease using Microsoft Excel. Enter your results like this:

Cats      20
Dogs      50
Birds     13
Mice      8
Fish      63
Snakes    6

Then highlight the table, go into INSERT, CHARTS, choose an appropriate graph and it should automatically show.

Pie Graph
A pie chart is a circular chart divided into sectors; each sector shows the relative size of each value. This pie graph has the same information as the column graph, but it shows the relative size of each value. The pie graph is better if the user is looking for the percentage of the results.

[Figure: pie graph "School Pets": Fish 39%, Dogs 31%, Cats 13%, Birds 8%, Mice 5%, Snakes 4%]

Each section of the pie = one of the different choices given.
For example: Dogs = 50 = 31%. So 50 people have dogs, or 31% of the total people surveyed have dogs.
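The pie graph percentages come from dividing each count by the total number of students surveyed. The sketch below redoes that arithmetic in Python using the counts from the Excel table. The variable names are illustrative only, and rounding halves upwards is an assumption chosen because it matches the 13% shown for cats (20 out of 160 is exactly 12.5%).

```python
import math

# Survey counts from the School Pets table
pets = {"Cats": 20, "Dogs": 50, "Birds": 13, "Mice": 8, "Fish": 63, "Snakes": 6}

# Total number of students surveyed
total = sum(pets.values())

# Percentage for each pet, rounded to the nearest whole number (halves round up)
percentages = {
    pet: math.floor(100 * count / total + 0.5)
    for pet, count in pets.items()
}

for pet, percent in percentages.items():
    print(f"{pet}: {pets[pet]} students = {percent}%")
```

For dogs this gives 50 out of 160, which is 31.25% and rounds to the 31% shown on the pie graph.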
Scatter Plot
A graph of plotted points that shows the relationship between two sets of data.

[Figure: scatter plot "Number of students with Pets" for School 1 and School 2, each with a linear trend line. Linear 1: y = -1.0286x + 30.267, R² = 0.0064. Linear 2: y = -7.2571x + 49.067, R² = 0.1635]

The scatter plot is good if you have two sets of information to compare. This graph has the original school pets information as well as data from another school, so the scatter plot allows these schools to be compared. Scatter plots are also good for showing the linear line or trend line.

Linear: An equation that makes a straight line when it is graphed. Often written in the form y = mx + b. So in this instance Linear 1 is y = -1.0286x + 30.267 and Linear 2 is y = -7.2571x + 49.067.

R²: r is the correlation coefficient and R² is its square. The R² is mainly used to determine how much scatter is on the graph. It is a number between 0 and 1 that can be read as a percentage: the closer it is to 1, the closer the points sit to the trend line.
r = Σ(x_i - x̄)(y_i - ȳ) / √( Σ(x_i - x̄)² × Σ(y_i - ȳ)² ), and R² = r × r

To get this on graphs made in Excel you need to:
1. Click on the chart
2. Go into Layout
3. Select Trendline, then choose the appropriate line
4. Double click on the newly formed line
5. Click display equation and display R²

Histogram
This graph is like a column graph, but each column covers a range of numbers. A histogram is a great way to show results of continuous data like weight, height and time. The histogram is also different because of the way the columns are laid out: the columns are joined together.

[Figure: histogram "Average height of students"; vertical axis: Height in (m), 0 to 800]

The History of the Histogram
The word 'histogram' is of Greek origin, meaning a form of writing in long vertical shapes, but it wasn't used until 1891, when K. Pearson introduced it into statistics.

Population and sample
Population: A group from which the sample or data is taken.
[Figure axis labels: Test students: Male 1, Female 1, Male 2]

Sample: A gathering of data from part of a large group or "population", so it can be examined to find out something that affects the whole group. E.g. if there was an outbreak, how much of the sample group was infected?

Five summaries
Quartile
In this journal the five points are numbered Q1 to Q5. Textbooks usually call them the minimum, lower quartile (Q1), median (Q2), upper quartile (Q3) and maximum.

Example data set: 2 2 3 3 3 4 4 5 6 8 9

Q1: The lowest number in the set.
Q2: Lower quartile. The lower quartile is the median of the lower half of the data. (E.g. in the data set above, what is the lower quartile?) Work out the median, then find the middle number of the lower numbers to the left. 4 is the median overall; the median of the lower numbers is 3.
Q3: The median of the set of numbers.
Q4: Upper quartile. The upper quartile is the median of the higher half of the data. (E.g. in the data set above, what is the upper quartile?) Work out the median, then find the middle number of the higher numbers to the right. 4 is the median overall; the median of the higher numbers is 6.
Q5: The highest number in the set of numbers.

The Box and whisker plot
To draw a box and whisker plot you need your numbers to be in ascending or descending order. Then follow these steps (see the five summaries above):
1. Find the median of the numbers.
2. Then find the median of the lower numbers.
3. And again find the median of the higher numbers.
4. Now you have three points. From the data, work out the highest point and the lowest point (this then gives us the five quartile summary).
5. Draw a number line that matches your data (e.g. 2, 2.5, 3, 3.5, ...).
6. Mark off the highest point, the lowest point and the three points from the previous steps.
7. Box the three points (quartile/point 1, quartile/point 2 and quartile/point 3).
8. And whisker the endpoints: draw a line from each end of the box out to the lowest and highest points.

To generate the box and whisker plot online, go to http://www.mathwarehouse.com/charts/box-and-whisker-plot-maker.php and follow the prompts.

[Figure: the finished box and whisker plot]

Central tendency
Mean: Add the numbers together and divide the outcome by the total amount of numbers in the list.
2 + 2 + 3 + 3 + 3 + 4 + 4 + 5 + 6 + 8 + 9 = 49
49 / 11 = 4.45

Median: The median is the middle number. Sort the numbers in ascending order and count how many numbers are in the list.
2 2 3 3 3 4 4 5 6 8 9
(11 + 1) / 2 = 12 / 2 = 6th number, therefore the median is 4.

Mode: The mode is the number that is repeated more than any other number. In this example three is the most used number.
2 2 3 3 3 4 4 5 6 8 9

Range: The range is the highest number minus the lowest number.
2 2 3 3 3 4 4 5 6 8 9
9 - 2 = 7

Probability Distributions
Lattice diagram: A lattice diagram is used when determining all the possible outcomes of a situation that involves two sets of choices. One set of outcomes is written along the top of the grid and the other down the side of the grid. The combined outcome is then filled into each grid box according to its row and column labels.

Dessert \ Main meal    Chicken                 Beef                 Lamb
Fruit salad            Chicken, Fruit salad    Beef, Fruit salad    Lamb, Fruit salad
Ice cream              Chicken, Ice cream      Beef, Ice cream      Lamb, Ice cream

Venn diagram: The Venn diagram was invented by John Venn. The diagram is a way to picture the connection between different groups of things. The diagrams consist of two overlapping circles.

[Figure: Venn diagram with two overlapping circles, "small and furry" and "duckbilled"; animals shown include swan, platypuses, worms and horses]

This diagram is really easy to read. The first circle, in pink, labelled small and furry, has a number of small animals that are furry. In the middle, between the two circles, in green, is the animal that has both of the stated features (small, furry and duckbilled). In yellow there are the animals that have a duckbill as a feature.
Finally, the two outside animals, in blue, are some of the animals that would not fit in either circle of the diagram.

Tree diagram: When calculating probabilities you have numbers that are multiplied, subtracted, divided and added, so it can be very messy. In a tree diagram there are "branches". The probability of each branch is written on the branch and the outcome is written at the end of the branch. To work out the results you multiply probabilities along the branches, and you add probabilities down columns.

[Figure: example tree diagram]

E.g. the numbers along the arrows are multiplied, and then all the resulting numbers are added together, which gives the overall probability.

Tables: To enter or form data into a list/table.

Frequency table: Is constructed by arranging collected data values in ascending order with their matching frequencies. The frequency of a data value is the number of times the data value occurs. For example, if four students have a score of 80 in mathematics, then the score of 80 is said to have a frequency of 4. The frequency of a data value is often represented by f.

Marks: 6 4 7 5 10 6 7 8 7 8 8 9 7 5 6 6 9 4 7 8

To construct a frequency table:
1. Construct a table with three columns. The first column shows what is being arranged (here the marks, in ascending order).
2. Go through the list of marks and put a tally mark against the matching number in the second column. Continue this process until all marks in the list are tallied.
3. Count up the tally marks and put the total under frequency in the table as follows.

Mark    Frequency (f)
4       2
5       2
6       4
7       5
8       4
9       2
10      1

Probability: Probability measures or quantifies how likely it is that an event, related to a given type of experiment, will happen. The value of a probability is a number between 0 and 1 inclusive. An event that cannot occur has a probability (of happening) equal to 0, and an event that is certain to occur has a probability equal to 1.
The probability can be written originally as a fraction: ½. This can be converted into a decimal: 0.5. Then into a percentage: 50%.

Characteristics of shape
The shape of a set of data decides which statistical techniques can be used to make decisions about the probability of the data.

Symmetrical Data: Symmetrical data is the easiest type of data to use. This is because many statistical techniques have been developed for it. In fact symmetrical, bell-shaped data is so common that it is called the normal curve. It also has other names, like the bell curve. There are standard measurements available which can tell the probability of a data point occurring, based on the number of standard deviations it is away from the mean. From a Six Sigma point of view it helps to understand how the results of a process are likely to be distributed.

Skewed Data: Many times the data is not symmetrical, i.e. it is skewed towards one side. Data can be either positively or negatively skewed. There are statistical techniques available which help us find out the probability distributions of skewed data too. However, such techniques are less well developed, because much of the sample data being collected follows the normal distribution. Statistical analysis of skewed data is therefore performed less often.

Question 1:
Sort the following into: Data, Discrete, Continuous or Misrepresentation. The first is already done.

Data        Discrete    Continuous    Misrepresentation
A) 14 cm

a. 14 cm
b. 2-4
c. 178977 cm
d. 4.7 or 5
e. 7-9
f. 3456789 mm
g. 1.3 or 1
h. 23 m

Answers and explanation
Data is often given as values or measurements. Discrete is data that has been rounded to a whole number. Continuous data is data that falls within a range (e.g. people's heights could be any value within the range of human heights, not just certain fixed heights). And finally, a misrepresentation is when the data is presented incorrectly.
Data        Discrete        Continuous    Misrepresentation
A) 14 cm    D) 4.7 or 5     B) 2-4        C) 178977 cm
H) 23 m     G) 1.3 or 1     E) 7-9        F) 3456789 mm

Question 2:
Answer the following with use of the graphs.
a. How many students have rabbits?
b. How many have snakes?
c. How many have dogs or fish?
d. How many students were surveyed all together?
e. Which graph was easier to read?

[Figure: column graph "School Pets"; vertical axis: Number of students (0 to 70); horizontal axis: Types of pets (Cats, Dogs, Birds, Mice, Fish, Snakes)]
[Figure: bar graph "School Pets"; vertical axis: Animals of choice (Snakes, Fish, Mice, Birds, Dogs, Cats); horizontal axis: 0 to 70]

Answers and explanation: The column graph shows data with vertical bars and the bar graph shows data with horizontal bars.
a. 0, there are no records of rabbits.
b. Between 5 and 10 (6).
c. 113 students have dogs or fish all together (50 + 63).
d. 160 students were surveyed altogether.
e. It is up to personal preference.

Question 3:
Answer the following questions about the pie graph.
a. What is the percentage of the birds, mice and snakes?
b. What is the percentage of fish and dogs?
c. What is the full pie percentage?
d. Is the pie graph the best to represent this data?

[Figure: pie graph "School Pets": Fish 39%, Dogs 31%, Cats 13%, Birds 8%, Mice 5%, Snakes 4%]

Answers and explanation: A pie chart is a circular chart divided into sectors; each sector shows the relative size of each value.
a. 17% (8% + 5% + 4%)
b. 70% (39% + 31%)
c. 100%
d. Personal opinion

Question 4:
Answer the following from the data given.

Q1 = 6    Q2 =    Q3 = 16.5    Q4 =    Q5 = 63

a. With the five quartile summary given, can you identify Quartile 2 and Quartile 4?
b. Find the average of the data given.
c. Determine the quartiles for the following set of data: 13, 90, 3, 12, 24, 0.
d. Looking at the box and whisker plot below, what are Quartile 1 and Quartile 5?

[Figure: box and whisker plot]

Answers and explanation: To draw a box and whisker plot you need your numbers to be in ascending or descending order. Then follow the five quartile summary steps.
To work out the quartiles follow these steps:
Q1: The lowest number in the set.
Q2: Lower quartile. The lower quartile is the median of the lower half of the data.
Q3: The median of the set of numbers.
Q4: Upper quartile. The upper quartile is the median of the higher half of the data.
Q5: The highest number in the set of numbers.

a. Q2 = 8 and Q4 = 50
b. 26.6 is the average. To work it out you add all the numbers together, then divide by the amount of numbers.
c. Sorted, the data is 0, 3, 12, 13, 24, 90, so Q1 = 0, Q2 = 3, Q3 = 12.5, Q4 = 24, Q5 = 90

Question 5:
Answer the following questions with the sets of numbers below.

Set 1: 2, 2, 6, 8, 10, 12, 14, 16, 18, 20
Find the:
a. Range
b. Median
c. Mode
d. Mean

Set 2: 3, 3, 3, 5, 19, 34, 49, 62
Find the:
a. Median
b. Range
c. Mode
d. Mean

Answers and explanation:
Mean: Add the numbers together and divide the outcome by the total amount of numbers in the list.
Median: The median is the middle number. Sort the numbers in ascending order and count how many numbers are in the list. When there is an even amount of numbers, the median is halfway between the two middle numbers.
Mode: The mode is the number that is repeated more than any other number.
Range: The range is the highest number minus the lowest number.

For 2, 2, 6, 8, 10, 12, 14, 16, 18, 20:
a. Range: 20 - 2 = 18
b. Median: (10 + 12) / 2 = 11
c. Mode: 2
d. Mean: 108 / 10 = 10.8

For 3, 3, 3, 5, 19, 34, 49, 62:
a. Median: (5 + 19) / 2 = 12
b. Range: 62 - 3 = 59
c. Mode: 3
d. Mean: 178 / 8 = 22.25, or about 22

Question 6:
Answer the following questions using the information provided.

[Figure: Venn diagram "Animals" with two overlapping circles, "With claws" and "With stripes"; animals shown include Tiger, Quagga, Cats and Skunks]

1. How many animals have stripes?
2. How many animals have claws?
3. How many have both?

Answers and explanation: The diagram is a way to picture the connection between different groups of things. The diagrams consist of two overlapping circles.
1. How many animals have stripes? 5
Look at the circle labelled "With stripes" and count how many results are entered into the circle. The same goes for the other circle.
2. How many animals have claws?
Answer: 5. The middle (between the two circles) shows which animals have both of the traits given in the diagram.
3. How many have both? 3

Question 6 part b:

Dessert \ Main meal    Fish             Chicken             Beef
Fruit                  Fish, Fruit      Chicken, Fruit      Beef, Fruit
Lollies                Fish, Lollies    Chicken, Lollies    Beef, Lollies

a. How many chicken meals are there?
b. How many different combinations have lollies?
c. How many different combinations can you have all together?

Answers and explanation: A lattice diagram is used when determining all the possible outcomes of a situation that involves two sets of choices. One set of outcomes is written along the top of the grid and the other down the side of the grid. The combined outcome is then filled into each grid box according to its row and column labels.
a. 2
b. 3
c. 6

Question 6 part c:

Tree diagram:
Fruit (1/2): Fish (1/3), Chicken (1/3), Beef (1/3)
Lollies (1/2): Fish (1/3), Chicken (1/3), Beef (1/3)

a. How many chicken meals are there?
b. How many different combinations have lollies?
c. How many different combinations can you have all together?
d. What is the probability of having a chicken meal with fruit?

Answers and explanation: When calculating probabilities you have numbers that are multiplied, subtracted, divided and added, so it can be very messy. In a tree diagram there are "branches". The probability of each branch is written on the branch and the outcome is written at the end of the branch. To work out the results you multiply probabilities along the branches, and you add probabilities down columns.
a. 2
b. 3
c. 6
d. 1/2 × 1/3 = 1/6, or about 0.17, or about 17%

Question 7:
Read the results in the frequency table and answer the questions.

Pet       Tally
Dogs      llll llll lllll
Cats      lll
Fish      llllll llll
Rabbit    llll lllll llll

a. How many dogs are there?
b. How many people were surveyed all together?
c. What mistake have the survey and table made?
Answers and explanation: The frequency table is constructed by arranging collected data values in ascending order with their matching frequencies. The frequency of a data value is the number of times the data value occurs. For example, if four students have a score of 80 in mathematics, then the score of 80 is said to have a frequency of 4. The frequency of a data value is often represented by f.
a. 13
b. 39
c. Every 5th tally mark needs to strike through the previous four.

Question 8:
Answer the following questions.
Probability definition: P(E) = n(E) / n(S)
a. What is the percentage of the probability 0.34?
b. What is the decimal if the probability is 2/5?
c. Convert a 75% probability into a fraction.
d. There are four different coloured balls (a yellow, a red, a blue and a green) in a bag. What is the probability of pulling out a green ball?

Answers and explanation: Probability measures or quantifies how likely it is that an event, related to a given type of experiment, will happen. The value of a probability is a number between 0 and 1 inclusive. An event that cannot occur has a probability (of happening) equal to 0, and an event that is certain to occur has a probability equal to 1. The probability can be written originally as a fraction: ½. This can be converted into a decimal: 0.5. Then into a percentage: 50%.
a. 34%
b. 0.4
c. ¾
d. ¼ or 0.25 or 25%

Question 9:
Name which type of data is being shown in the following diagrams.

[Figure: two distribution curves, a and b]

a. Skewed
b. Symmetrical

Answers and explanation:
Symmetrical Data: In fact symmetrical, bell-shaped data is so common that it is called the normal curve. It also has other names, like the bell curve. There are standard measurements available which can tell the probability of a data point occurring, based on the number of standard deviations it is away from the mean.
Skewed Data: Many times the data is not symmetrical, i.e. it is skewed towards one side. Data can be either positively or negatively skewed.
Statistical analysis of skewed data is therefore performed less often.

Test and reflections
Before the marked test reflection:
The test went really well, but I'm glad that there was a practice test, because before that it was hard to create situations that I could work out solutions to. The questions that I had chosen to create were too easy and they didn't challenge me. I think it was because I spent a lot of time making a 'text book' and discovering new things. It has taken a longer time because I feel that I have to learn the basics of the information before I can start learning something new. During the test I had problems with the histogram. I'm not too sure that the answer is correct, because the journal entry for it was very brief and it isn't a graph that I would use often. Another part of the test I was unsure about was the reasonableness and the possible limitations. It wasn't included in my CML, so I answered with what I thought it meant, based on the supposed meanings. The rest of the questions I had a lot of practice with (the workshop on drawing a box and whisker plot on the calculator helped a lot), so I felt pretty confident.

After the marked test:
The test went well, like I had predicted in the previous reflection. I was very happy with my efforts when I received a B. This showed that I am slowly getting into a working study routine. The question that I had wrong was because I didn't read the question correctly; this led me to use the wrong formula for the data output and get the question wrong. I really benefited from the detailed journal entries and the practice testing that was given beforehand. It allowed me to see what parts I was lacking and where I needed to improve for next time. Looking at the rubric, I think that I could have improved on explaining the mathematical ideas and logical reasoning by writing out all of the formulas that I had used, not just the ones that were needed.
I had written all my answers in shortened form (e.g. 2/10 written as 1/5); this may have pushed my grade down. Overall I'm really happy with this grade. Next time I want to improve on reading the rubric and having results that comply with it.

Finished CML reflection:
During this CML I focused on the reflective side of the journal a lot more than last time. I even tried to do it electronically; this led to more flexibility when it came to layout and information. I also took more time to work on the modelling task and less time on the things that I had known previously. I am fairly confident with the topic of statistics and probability now, but the process of the CML is still trial and error.

Bibliography
http://easycalculation.com/statistics/box-plot-grapher.php
http://www.eduplace.com/graphicorganizer/pdf/venn.pdf
http://www.mathsisfun.com/
http://www.mathswarehouse.com
http://www.mathsteacher.com
http://www.purplemaths.com
http://www.maths.andyou.com

Conceptual checklists
Bloom's cognitive stuff, for Pre-SACE (Year 10) and SACE Stage 1 (Year 11)

Remember: When is a population normal?

Understand: What can you tell about a population by considering: shape of the overall distribution, symmetry of the sample distribution, skewness of the distribution?

Analyse: Explain why theoretical and experimental probabilities can give different values for the same situation.
Because theoretical means to estimate or guess, whereas experimental means to test the theory or guess. So usually the experimental probability is the better solution.

Evaluate: Compare the accuracy of the data that can be found using the entire population with using the Central Limit Theorem on a sample.

Discuss the benefits of using the empirical rule (68%, 95.4%, 99.7%) for a Normal Distribution. Consider accuracy, ease of use and anything else you can think of.
Suggest several situations that can be described using each of the following:

Discrete Quantitative Data
When data can only take certain values that can't be split. Like when counting the number of students in a classroom: you can't have 23.5 students because it's impossible to have half of a student.

Continuous Quantitative Data
When data can take any value within a range of values. For example, people's heights can vary, but when there is a lot of data you can group it into ranges between the basic points.

Categorical Data
Categorical data is when a set of data is sorted into different categories according to the data's features. So instead of having just girls, the data may be sorted into girls with brown eyes or girls with blonde hair.

In what situations would you use each of the three measures of "average" (mean, median and mode) to show the different characteristics of a population or sample?
Central tendency can be used to discover the population mean. The other situation for using the three measurements would be when working out the frequency distribution.

When is it reasonable to use each of the following to describe a population? The five number summary used to describe a box-and-whisker plot.
The box and whisker plot is often used to plot a large amount of numerical data, for example height.

YEAR 11 LEARNING OUTCOMES
Before the CML:
I am quite disappointed with my last test result, so this time I want my journal to be more effective, in the way that I will be able to use the journal and be able to explain my understanding of the work done. This year the Year 10 things covered are a bit different to the work done last year, so I have looked into those outcomes to make sure that I have a stable understanding of this work. Looking over the dot points, I think the hardest points would be the combination notation and formula, and calculating compound probabilities.
Populations and sample size:
Population: The group from which a sample has been taken. It is the thing too big to be tested fully, so samples have to be taken. For example: take the 'population' of children at the Royal Adelaide Show. If an organiser wanted to see the statistics of children's enjoyment, he/she would want to take a sample out of the population.

Sample size: A sample is a selection taken from the population or larger group in order to find out something about that population or group. For example: a scientist wants to know the salinity of the sea. The scientist doesn't test the whole sea; instead he/she takes a sample from the population (the sea in this case) and tests that. The sample size can vary depending on what's being tested and the accuracy of the test itself.

How do we determine the sample size?
There are four things to consider to ensure that we have a sufficient sample size from our population.
1. Population size: How much is there in total? If you don't know, it's not strange; most population sizes aren't exact.
2. Margin of error (also called confidence interval): This means leaving room for error. Your sample will never be perfect, so you have to leave room for error that might occur. The confidence interval determines how much higher or lower than the population mean you're going to let your sample mean fall.
3. Confidence level: How confident do you want to be that the actual mean falls within your confidence interval? The most common confidence levels are 90% confident, 95% confident and 99% confident.
4. Standard deviation: How much variation do you expect in your responses? If we don't know the approximate variation, the recommended value to use is 0.5.

To calculate the needed sample size, your confidence level corresponds to a Z-score. This is a constant value needed for the equation.
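One way to see where a Z-score comes from is to look it up on the standard normal curve. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `z_score` is made up for illustration and is not part of the journal's method.

```python
from statistics import NormalDist


def z_score(confidence_level: float) -> float:
    """Two-sided Z-score for a confidence level written as a decimal, e.g. 0.95."""
    # The leftover probability (1 - confidence level) is split evenly between
    # the two tails of the standard normal curve, so we look up the point that
    # has (1 + confidence level) / 2 of the curve to its left.
    return NormalDist().inv_cdf((1 + confidence_level) / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence level: Z-score = {z_score(level):.3f}")
```

A higher confidence level pushes the Z-score further from zero, which is why being more confident needs a bigger sample.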
Here are the Z-scores for the most common confidence levels:
90%: Z-score = 1.645
95%: Z-score = 1.96
99%: Z-score = 2.576

Now use this equation to calculate the necessary sample size:
Necessary sample size = (Z-score)² × StdDev × (1 - StdDev) / (margin of error)²

This example assumes a 95% confidence level, a standard deviation of 0.5 and a margin of error of +/-5%:
((1.96)² × 0.5 × 0.5) / (0.05)²
= (3.8416 × 0.25) / 0.0025
= 0.9604 / 0.0025
= 384.16
Rounding up, 385 respondents are needed.

Independent vs dependent events
Dependent: Dependent events are those that depend on what happened before. For example: 2 blue marbles and 3 red marbles are in a bag. The chance of getting a blue marble out is 2 in 5, but if you take one of the marbles out and don't put it back, then the chance changes. This gives us a dependent event, because the chance now depends on the marble that was taken out, or on the event that happened before.

Independent: Independent events are not affected by previous events. For example: you toss a coin and it comes up heads three times. What is the chance that the next toss will also be a head? The chance is still 50%, just like any other toss of a coin. The previous times the coin was tossed have no effect on the current toss.

Conditional statements
A conditional statement is a statement with 'if' and 'then', like a hypothesis: if I add fertilizer to the plant, then it will grow better. The 'if' part is the hypothesis and the 'then' part is the conclusion. This can also be written as p -> q, if p then q. A conditional statement is false if the hypothesis is true and the conclusion is false; for example, the statement above is false if I add fertilizer to the plant and it doesn't grow better. If the conditional statement is rearranged then it's called a related conditional.
For example: "If a population consists of 50% men then 50% of the population must be women" is p -> q. Changing the statement to "If a population consists of 50% women then 50% of the population must be men" gives q -> p; this is a converse statement. If the hypothesis and conclusion are both true, or both false, then the converse is true, but a conditional and its converse don't mean the same thing. If both the hypothesis and the conclusion are negated then we get an inverse statement. For example: if the population doesn't consist of 50% male then the population doesn't consist of 50% female: ~p -> ~q

A pattern of reasoning is valid if it always leads to a true conclusion when it starts from true statements. The most common patterns of reasoning are detachment and syllogism.

Given: Known or independently determined, e.g. a given ratio.

Instead of "and" there is often the ∩ symbol (which is also the intersection symbol in Venn diagrams). Instead of "or" you will often see the union symbol ∪.
For example: the probability of scoring no goals is 20% (Event A) and the probability of scoring 1 goal is 15% (Event B). The probability of scoring no goals and 1 goal is 0 (impossible), but the probability of scoring no goals or 1 goal is 20% + 15% = 35%. This is written:
P(A ∩ B) = 0
P(A ∪ B) = 20% + 15% = 35%

Dog owners D and cat owners C

[Figure: Venn diagram with 5 people in C only, 3 in both C and D, 2 in D only and 10 outside both circles]

n(C) = 5 + 3 = 8
n(D) = 3 + 2 = 5
C ∩ D = cat intersection dog = people who own both cats and dogs
n(C ∩ D) = 3
C ∪ D = cat union dog = people who own cats or dogs or both
n(C ∪ D) = n(C) + n(D) - n(C ∩ D) = 8 + 5 - 3 = 10 (the 3 who own both are subtracted, otherwise they would be counted twice)
Dc or D' = the complement of D = those who don't own a dog = 5 + 10 = 15
n(C ∩ D') = 5
n(C ∪ D') = 18
n((C ∪ D)') = 10
n(C' ∪ D') = 17

Mutually Exclusive events
Mutually exclusive: Means that the events can't happen at the same time. For example: you can't turn left and right at the same time, because it's impossible to do both at once. The probability of an event is the number of ways it can happen / the total number of outcomes.
For example, take a deck of 52 cards with 4 kings. What is the probability of picking a king?
Number of ways it can happen = 4
Total number of outcomes = 52
So the probability is equal to 4/52 or 1/13.

When two events are mutually exclusive it is impossible for them to happen together: "The probability of A and B together equals 0 (impossible)". But the probability of A or B is the sum of the individual probabilities:
P(A or B) = P(A) + P(B)
The probability of A or B equals the probability of A plus the probability of B.

For example: In a deck of 52 cards, the probability of a King is 1/13, so P(King) = 1/13. The probability of an Ace is also 1/13, so P(Ace) = 1/13. A single card can't be both a King and an Ace, so the probability of getting a King and an Ace is 0 (impossible). But the probability of a card being a King or an Ace is 2/13.
P(King and Ace) = 0
P(King or Ace) = (1/13) + (1/13) = 2/13

Conditional Probability and Independence

In probability theory, a conditional probability measures the probability of an event given that (by assumption, presumption, assertion or evidence) another event has occurred. If the events are A and B, this is said to be "the probability of A given B". It is commonly denoted by P(A|B) or sometimes PB(A). When both A and B are categorical variables, a conditional probability table is typically used to represent the conditional probability. The concept of conditional probability is one of the most fundamental and important concepts in probability theory, but conditional probabilities can be hard and require careful interpretation. In statistical inference, the conditional probability is an update of the probability of an event based on new information. Incorporating the new information can be done as follows:
1. Start with a probability measure on a sample space.
2.
To measure the probability of event A knowing that event B has occurred or will occur, we need to examine event A as it is restricted to event B, that is, A ∩ B.
3. Whenever P(B) > 0 with the original probability measure on the original sample space, B must be the sure event in the restricted space and thus PB(B) must be 1.
4. To derive P(A|B) = PB(A) so that P(B|B) = 1, we re-scale P(A ∩ B) by dividing by P(B).
5. This gives P(A|B) = P(A ∩ B) / P(B) whenever P(B) > 0 (when P(B) = 0 it is usually left undefined).

Everyday occurrences to illustrate probability

The bus being on time:
Probability of the event happening = 0.5
Probability of the event not happening = 0.5
With A = the bus being on time and B = the bus not being on time, the two events can't both happen, so P(A and B) = 0, and P(A or B) = 1/2 + 1/2 = 2/2 = 1.

Waking up before my alarm:
Probability of the event happening = 0.5
Probability of the event not happening = 0.5
P(A) = 1
P(B) = 0.5
P(A') = 0
P(B') = 0.5
P(A ∩ B) = 0.5 (A is certain, so the intersection is just as likely as B and can't be more likely than B)
P(A ∪ B) = 1

Having a relief teacher for class on Monday:
Number of teachers on Monday: 4
Likelihood of having a relief teacher: 0.5
0.5 / 4 = 0.125
Probability of having a relief teacher for class on Monday: 0.125 = 12.5%

Probability the event occurs = 0.008
Probability the event doesn't occur = 0.992
P(A) = 0.004
P(B) = 0.004
P(A') = 0.996
P(B') = 0.996
P(A ∪ B) = 0.008

Combination notation and formula

The number of combinations "n choose r" is calculated with:
C(n, r) = n! / (r! (n - r)!) for 0 <= r <= n
The formula shows the number of ways a sample of r elements can be obtained from a larger set of n distinguishable objects, where order does not count and repetitions are not allowed. In other words, it is the number of ways of picking r unordered outcomes from n possibilities.

Combination Problem 1: A teacher is going to choose 3 students from her class to compete in the spelling bee. She wants to figure out how many unique teams of 3 can be created from her class of 25. In this example, we are taking a sub-set of 3 students (r) from a larger set of 25 students (n).
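The combination formula can also be worked out with a few lines of Python. This is just a sketch: the function name `choose` is made up for the illustration, and it follows the factorial formula directly, then compares the answer with Python's built-in `math.comb` as a check.

```python
import math


def choose(n, r):
    """Number of ways to pick r items from n when order does not matter."""
    if not 0 <= r <= n:
        raise ValueError("need 0 <= r <= n")
    # n! / (r! (n - r)!) using whole-number division, which is exact here
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))


# The spelling bee problem: teams of 3 from a class of 25.
teams = choose(25, 3)
print(teams)

# Check against the standard library version of "n choose r".
assert teams == math.comb(25, 3)
```

Working it by hand gives the same thing, because most of the factorials cancel: 25! / (3! x 22!) = (25 x 24 x 23) / (3 x 2 x 1).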
Looking at the formula, we must calculate "25 choose 3":
C(25, 3) = 25! / (3! x (25 - 3)!) = 2,300 possible teams

Relative frequencies from data as point estimates of probabilities

Some probabilities cannot be calculated by just looking at the situation. For example, you cannot work out the probability of winning a football match by assuming that win, lose and draw are equally likely, but we can look at previous results in similar matches and use these to estimate the probability of winning.

For example: The Bumbleton and Sickton village football teams have played each other 50 times. Bumbleton have won 10 times, Sickton have won 35 times and the teams have drawn 5 times. Sickton have won 35/50 or 7/10 of the matches. This fraction is the estimate of the probability that Sickton win, so the relative frequency is equal to 7/10.

We calculate the relative frequency of an outcome using this formula:
Relative frequency = number of times the outcome happened / total number of trials

Calculate compound probabilities using: P(A and B) = P(A) x P(B)

We use P to mean "probability of", so when talking about independent events we use:
P(A and B) = P(A) x P(B)
The probability of A and B equals the probability of A times the probability of B.

For example: You are going to a concert, and your friend says it is some time on the weekend between 4 and 12 but won't say more. What are the chances it is on Sunday between 10 and 12?
Day: there are two days on the weekend, so P(Sunday) = 0.5
Time: between 4 and 12 is 8 hours, but you want between 10 and 12, which is only 2 hours: P(Your time) = 2/8 = 0.25
P(Sunday and Your time) = P(Sunday) x P(Your time) = 0.5 x 0.25 = 0.125, or a 12.5% chance

Year 11 reflections

After CML: After this CML I feel a little more confident with the outcomes. The questions on the CML were well laid out, making it easier for me to understand and make necessary adaptations to the CML where I needed it most. The confusion about the due dates made me procrastinate a bit, and I was a little slack with the test reviewing.
I enjoyed not having a modelling task to go with this CML, but at the same time I am quite curious as to what it would have been.

After the Test: I think that I did pretty well in this test compared to other tests. I had a wide range of concepts covered in my CML and this test was better worded than last time. I felt I understood most of the questions, but I wished that I had more time, or that the time was marked on the board so that I could balance my time out. Instead I ran out of time.

ILM

Last year my ILM Blue score was 8, and this hasn't changed. This is good because logical behaviour is used a lot within the school, maths included. My score for being Holistic, Intuitive, Synthesizing and Integrating is lower than last year. I think this is because I spend a lot of time on my own, or working on my own. My emotional interpersonal self has increased from 4 to 10. This is good because I can see that I have improved myself over the year that I have been at ASMS. I worked hard on trusting people and building relationships with them. This has improved my wariness about people and I am able to interact on a better level. My organizational quadrant has also improved by one point, but I would like it to improve further to minimize the stress of assignments and other things.