Ministry of Education and Science of the Russian Federation
Federal State Budgetary Educational Institution of Higher Professional Education
"National Research Tomsk Polytechnic University"
APPROVED
Vice-Rector, Director of IC
___________ Zamyatin A.V.
«___»________________ 2012
SYLLABUS OF THE COURSE
Probability Theory and Mathematical Statistics
BEP DEGREE PROGRAMME: 230100 “Computer science and technology”
DEGREE: Bachelor
CORE CURRICULUM FOR ADMISSION 2011
YEAR 2; TERM 4;
NUMBER OF CREDITS: 4
PREREQUISITES: Calculus, Linear algebra
EDUCATIONAL ACTIVITIES AND TIME RESOURCES:
LECTURES: 36 hours (classroom)
PRACTICE: 36 hours (classroom)
TOTAL CLASSROOM ACTIVITIES: 72 hours
SELF-STUDY: 72 hours
TOTAL: 144 hours
MODE OF STUDY: Full Time
INTERIM ASSESSMENT: Test
DEPARTMENT: Control System Optimization (CSO)
HEAD OF DEPARTMENT: O.B. Fofanov
HEAD OF BEP: V.I. Reizlin
TUTOR: A.V. Kitaeva
2012
1. Objectives
The key aim is to develop the ability to construct probabilistic models and to use common statistical methods in a manner that combines intuitive understanding and mathematical precision.
Objective identification code: Objective statement
C1: Preparation of graduates for interdisciplinary scientific research and innovation aimed at meeting professional challenges in the sphere of computer science and technology.
C2: Preparation of graduates for design work aimed at accomplishing professional projects in the sphere of computer science and technology that are competitive on the world market.
C3: Preparation of graduates for design and technological activities in the professional sphere of computer science and technology.
C4: Preparation of graduates for organizational and management activities while accomplishing interdisciplinary projects in the professional sphere, including work in international teams of transnational companies.
C5: Preparation of graduates for science and education activities, and development of their abilities for self-study and professional self-improvement.
2. Course within the structure of the Basic Educational Program (BEP)
The course “Probability Theory and Mathematical Statistics” belongs to the “Mathematical, natural and scientific cycle” of the degree program “Computer science and technology” (the B2 block).
Prerequisites: Calculus, Linear algebra.
3. Outcomes of mastering the course
As a result, a student should:
Know:
• the basic concepts and models of modern probability and statistics: probability space, random event, random variable, distribution, independence, sample space;
• the basic methods of probability calculation and statistical inference: combinatorics, conditioning, sum and product rules, point and interval estimation, hypotheses testing;
• basic discrete and continuous distributions, their properties and fields of application.
Be able to:
• apply probabilistic and statistical methods to solve various theoretical and practical tasks;
• calculate the probabilities of compound events;
• find numerical characteristics of random variables and sample characteristics, and describe their properties;
• formulate and test basic statistical hypotheses;
• construct basic point and interval estimators of a distribution’s parameters and investigate their properties.
Have skills of:
• elementary probability calculation;
• testing classic statistical hypotheses;
• parameter estimation by the maximum likelihood (ML) and method of moments (MM) methods.
While mastering the discipline, students develop the following competences (according to the BEP):
1. General
OK-1: To analyze and generalize information
OK-2: To express thoughts clearly
OK-10: To use probabilistic and statistical methods in their professional activity
2. Professional
PK-5: To choose methods for solving management and design tasks in the sphere of computer science and technology
PK-6: To justify decisions and prove their correctness
4. Course structure and content
4.1 Structure of the course by modules, forms of training and academic progress monitoring
Table 1
Module | Lectures (hrs) | Practice (hrs) | Lab (hrs) | Self-study (hrs) | Forms of current monitoring and assessment | Total (hrs)
1. Basic probability concepts | 3 | 2 | - | 2 | Seminar | 7
2. Combinatorics | 2 | 3 | - | 6 | Seminar, Task 1 | 11
3. Properties of probability | 2 | 2 | - | 6 | Seminar, Task 2 | 10
4. Conditional probability. Independence | 2 | 3 | - | 6 | Seminar, Task 2, Case study 2 | 11
5. The Monty Hall problem and other puzzles | 2 | 3 | - | 6 | Seminar, Case study 1 | 11
6. Discrete random variables | 5 | 4 | - | 6 | Seminar, Task 3 | 15
7. Continuous random variables (RVs with densities) | 5 | 4 | - | 6 | Seminar, Task 3, Calculation task | 15
8. Limit results for sequences of random variables | 4 | 4 | - | 6 | Seminar, Test | 14
9. Point estimation | 4 | 4 | - | 6 | Seminar, Test, Calculation task | 14
10. Interval estimation | 3 | 3 | - | 6 | Seminar, Test | 12
11. Hypotheses testing | 4 | 4 | - | 6 | Seminar, Test, Calculation task | 14
12. Histogram, sample distribution function and box-and-whisker plot | 0 | 0 | - | 10 | Test, Calculation task | 10
Total | 36 | 36 | 0 | 72 | | 144
4.2. Content of the course modules:
Topic#1. Basic probability concepts
Random experiments. Events. Probability. Historical remarks
Topic#2. Combinatorics
Topic#3. Properties of probability
Topic#4. Conditional probability. Independence
Topic#5. The Monty Hall problem and other puzzles
Topic#6. Discrete random variables
Distributions. Moments, mean, variance. Multivariate discrete distribution. Independence. Covariance and correlation
Topic#7. Continuous random variables (RVs with densities)
Density and distribution functions. Bertrand’s paradox. Exponential distribution. Multivariate continuous distribution. Independence. Transformation of random variables. Sums, products, and quotients of random variables. Moment generating function. Distributions related to the normal distribution. Conditional density and expectation. The median, quartiles, percentiles. Skewness and kurtosis. Simulation of random variables.
Topic#8. Limit results for sequences of random variables
Convergence of random variables. Inequalities. The law of large numbers. Sampling from a distribution. Central limit theorem.
Topic#9. Point estimation
Sample mean and sample variance. Properties of the estimators. Sufficient statistics. Method of moments. Method of maximum likelihood.
Topic#10. Interval estimation
Confidence intervals for mean. Confidence interval for difference of two means.
Confidence interval for variance. Confidence interval for a proportion.
Topic#11. Hypotheses testing
Formulating the hypotheses (H0 and H1). Test criterion and rejection region. Two types of errors. Performing a test.
5. Educational technologies
Table 2. Methods and forms of training
Types of learning activities (columns): Lectures, Practice, Self-study
Methods (rows): Case study; IT methods; Teamwork; Learning based on experience; Leap ahead self-study; Projecting; Searching and investigating
6. Students' self-study
Self-study is the most productive form of a student's learning and cognitive activity during the course. To develop creative abilities and master the course more deeply, the following types of self-study are stipulated: 1) current and 2) creative problem-oriented.
6.1 Current self-study
• individual work with the course book, search and review of literature and electronic sources on a given problem,
• homework, home tests,
• leap ahead self-study,
• self-study of a particular subject,
• preparation for the test.
6.2 Creative problem-oriented self-study
• research, analysis, structuring and presentation of information,
• review of publications on a pre-determined subject.
6.3 Self-study topics
• histogram,
• sample distribution function,
• box-and-whisker plot.
6.4 Self-study check
• performance of the individual calculation task,
• discussion of the home tasks,
• presentation of the pre-determined topics.
6.5 References for self-study (Internet resources)
http://www.random.org/
(offers true random numbers to anyone on the Internet.)
http://lib.mexmat.ru/catalogue.php
(books (for reading only)).
http://www.dartmouth.edu/~chance/index.html
(The goal of Chance is to make students more informed, critical readers of current
news stories that use probability and statistics.)
http://www.math.uah.edu/stat/
(The Virtual Laboratories in Probability and Statistics, a set of web-based resources for students and teachers of probability and statistics, where you can run
simulations.)
www-history.mcs.st-andrews.ac.uk/history/index.html
(historical information about math and mathematicians.)
http://oli.web.cmu.edu/openlearning/forstudents/freecourses/statistics
(The Open and Free Full Courses. The Probability and Statistics course is
comparable to the full semester course on Statistics taught at Carnegie Mellon
University.)
http://www.icoachmath.com
(…designed specifically to help students strengthen their math skills through self-paced, guided, and online interactive practice.)
http://plus.maths.org/content/teacher-package-statistics-and-probabilitytheory
http://home.ubalt.edu/ntsbarsh/
See also Additional Resources 1-9 on the last page.
7. PRE-COURSE, INTERMEDIATE AND FINAL TESTS
7.1 Final test questions
1. In the context of a university admission test discuss the trade-off between Type I
and Type II errors.
2. A cereal package should weigh W0. If it weighs less, the consumers will be unhappy and the company may face legal charges. If it weighs more, the company’s profits will fall. Devise a hypothesis testing procedure. Explain the statistical logic behind the decision rule (the choice of hypotheses and the use of the confidence interval).
3. Give examples of the null and alternative hypotheses which lead to (i) a one-tail
test and (ii) a two-tail test. How are the critical values defined and what are the decision rules in these cases?
4. Normal population versus Bernoulli.
(a) Select two pairs of hypotheses such that in one case you would apply a two-tail
test and in the other – a one-tail test. Assuming that the population is normal and
the variance is unknown, illustrate graphically the procedure of testing the mean.
(b) What changes would you make if the population were Bernoulli?
5. Using the procedure of testing the sample mean (a one-tail test, σ is known) describe in full the procedure of assessing the power of a test. Answer the following
questions:
(a) What happens to the power when the real mean moves away from µ0?
(b) What happens to 1 – β when α decreases?
(c) How does the sample size affect the power?
(d) How does 1 – β behave if σ increases?
6. Fill out the following form:
Comparison of population and sample formulas
Columns: Population formula; Sample formula
Rows: Mean (discrete and continuous); Variance; Standard deviation; Correlation
7. Suppose we sample from a Bernoulli population. How do we figure out the approximate value of p? How can we be sure that the approximation is good?
8. Define unbiasedness and prove that the sample variance is an unbiased estimator
of the population variance.
9. Find the mean and variance of the chi-square distribution.
10. Define the uniformly distributed random variable. Find its mean, variance and
distribution function.
11. In one block give the properties of the standard normal distribution, with
proofs, where possible, and derive from them the properties of normal variables.
12. What is the relationship between distribution functions Fz and FX if X = σz + μ?
How do you interpret Fz(1) – Fz(–1)?
13. How is the central limit theorem applied to the binomial variable?
14. List and prove all properties of the mean.
15. List and prove all properties of the covariance.
16. List and prove all properties of the variance.
17. List and prove all properties of the standard deviation.
18. List and prove all properties of the correlation coefficient.
19. How and why do you standardize a variable?
20. Define and derive the properties of the Bernoulli variable.
21. Explain the term “independent identically distributed”.
22. Define a binomial distribution and derive its mean and variance with explanations.
23. Define a distribution function, describe its geometric behavior and prove the
interval formula.
24. Define the Poisson distribution and describe (without proof) how it is applied
to the binomial distribution.
25. What do we mean by set operations?
26. Prove De Morgan's laws.
28. Prove P(A∪B) = P(A) + P(B) − P(A∩B).
29. Prove the formulas for combinations.
30. Prove Bayes' theorem.
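Question 5 above asks how the power 1 – β responds to the true mean, α, the sample size and σ. The following is a minimal numerical sketch, not a model answer. It assumes an upper-tail test of H0: µ = µ0 against H1: µ > µ0 based on the standardized mean of n independent normal observations with known σ. The function name and the numeric values are illustrative assumptions and are not part of the syllabus.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tail(mu, mu0, sigma, n, alpha):
    """Power 1 - beta of the test H0: mean = mu0 vs H1: mean > mu0 (sigma known)."""
    z = NormalDist()                        # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha)          # critical value for the standardized mean
    shift = (mu - mu0) * sqrt(n) / sigma    # mean of the test statistic when the true mean is mu
    return 1 - z.cdf(z_alpha - shift)       # P(reject H0 | true mean = mu)


if __name__ == "__main__":
    # Illustrative (assumed) settings: mu0 = 0, sigma = 1, n = 25, alpha = 0.05.
    for mu in (0.0, 0.2, 0.5):
        print("mu =", mu, "power =", round(power_upper_tail(mu, 0.0, 1.0, 25, 0.05), 4))
    for n in (25, 100):
        print("n =", n, "power =", round(power_upper_tail(0.2, 0.0, 1.0, n, 0.05), 4))
```

Varying one argument at a time (mu, alpha, n, sigma) gives a quick check of the answers to parts (a)-(d). When mu equals mu0 the function returns alpha, up to rounding.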
7.2 Intermediate tasks (examples)
1. Suppose that we want to generate a random variable X that is equally likely to be either 0 or 1, and that all we have at our disposal is a biased coin that, when flipped, lands on heads with some (unknown) probability p. Consider the following procedure:
i. Flip the coin, and let O1, either heads or tails, be the result.
ii. Flip the coin again, and let O2 be the result.
iii. If O1 and O2 are the same, return to step i.
iv. If O2 is heads, set X = 0, otherwise set X = 1.
(a) Show that the random variable X generated by this procedure is equally likely
to be either 0 or 1.
(b) Could we use a simpler procedure that continues to flip the coin until the last
two flips are different, and then sets X = 0 if the final flip is a head, and sets X = 1
if it is a tail?
2. A fair coin is independently flipped n times, k times by A and n − k times
by B. Show that the probability that A and B flip the same number of heads is equal
to the probability that there are a total of k heads.
3. Consider n independent flips of a coin having probability p of landing heads. Say a changeover occurs whenever an outcome differs from the one preceding it. For instance, if the results of the flips are H H T H T H H T, then there are a total of five changeovers. If p = 1/2, what is the probability there are k changeovers?
4. An individual claims to have extrasensory perception (ESP). As a test, a fair coin is flipped ten times, and he is asked to predict in advance the outcome. Our individual gets seven out of ten correct. What is the probability he would have done at least this well if he had no ESP? (Explain why the relevant probability is P{X ≥ 7} and not P{X = 7}.)
5. Let X be binomially distributed with parameters n and p. Show that as k
goes from 0 to n, P(X = k) increases monotonically, then decreases monotonically
reaching its largest value (a) in the case that (n + 1)p is an integer, when k equals
either (n + 1)p – 1 or (n + 1)p, (b) in the case that (n + 1)p is not an integer, when k
satisfies (n + 1)p −1 < k <(n + 1)p.
Hint: Consider P{X = k}/P{X = k − 1} and see for what values of k it is
greater or less than 1.
6. An airline knows that 5 percent of the people making reservations on a
certain flight will not show up. Consequently, their policy is to sell 52 tickets for a
flight that can hold only 50 passengers. What is the probability that there will be a
seat available for every passenger who shows up?
7. Suppose that two teams are playing a series of games, each of which is independently won by team A with probability p and by team B with probability
1 − p. The winner of the series is the first team to win four games. Find the expected number of games that are played, and evaluate this quantity when p = 1/2.
8. Suppose that each coupon obtained is, independent of what has been previously obtained, equally likely to be any of m different types. Find the expected number of coupons one needs to obtain in order to have at least one of each type.
Hint: Let X be the number needed. It is useful to represent X by $X = \sum_{i=1}^{m} X_i$, where each $X_i$ is a geometric random variable.
9. A coin, having probability p of landing heads, is flipped until a head appears for the r-th time. Let N denote the number of flips required. Calculate E(N).
Hint: There is an easy way of doing this. It involves writing N as the sum of r geometric random variables.
10. An urn contains 2n balls, of which r are red. The balls are randomly removed in n successive pairs. Let X denote the number of pairs in which both balls
are red. (a) Find E(X). (b) Find var(X).
11. Suppose that X takes on each of the values 1, 2, 3 with probability 1/3. What is the moment generating function? Derive E(X), E(X^2), and E(X^3) by differentiating the moment generating function and then compare the obtained result with a direct derivation of these moments.
12. Let X and Y be independent normal random variables, each having parameters μ and σ^2. Show that X + Y is independent of X − Y.
Hint: Find their joint moment generating function.
13. Case study
Drug testing procedure
Your company is going to test all staff members for drug use.
To estimate the costs (testing equipment and possible psychological problems) and the expected gains (increased productivity), you need to analyze the situation.
The testing procedure is not ideal. The laboratory staff informed you that if a person uses drugs, the test is "positive" with a probability of 90 %. However, if a person does not use drugs, the test shows a "negative" (i.e. “not positive”) result in 95 % of cases. Based on an informal survey of some workers, you can expect about 8 % of all personnel to use drugs.
Probability analysis is important in this case because it allows you to transform the existing information into probabilities that are much more useful for decision-making.
Does it make sense to conduct such a test? Give a reasoned response based on a suitable probability analysis.
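One possible starting point for the case study is Bayes' formula applied to the three figures given above (90 %, 95 % and 8 %). The sketch below only computes the probability that a person with a "positive" result actually uses drugs. The function and parameter names are illustrative assumptions, and the decision about whether the test is worthwhile is left to the reader's analysis.

```python
def prob_user_given_positive(prior=0.08, sensitivity=0.90, specificity=0.95):
    """P(uses drugs | test positive) by Bayes' formula.

    prior       -- assumed share of personnel using drugs (8 % in the case study)
    sensitivity -- P(positive | user), 90 % in the case study
    specificity -- P(negative | non-user), 95 % in the case study
    """
    true_positive = sensitivity * prior                  # user and positive
    false_positive = (1 - specificity) * (1 - prior)     # non-user and positive
    return true_positive / (true_positive + false_positive)


if __name__ == "__main__":
    print(round(prob_user_given_positive(), 4))
```

With the figures of the case study the numerator is 0.9 × 0.08 = 0.072 and the denominator is 0.072 + 0.05 × 0.92 = 0.118, so roughly 61 % of positive results would come from actual users under these assumptions.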
7.3. Pre-course survey (examples)
1. A small object was weighed on the same scale separately by nine students
in a science class. The weights (in grams) recorded by each student are shown below.
6.2 6.0 6.0 15.3 6.1 6.3 6.2 6.15 6.2
The students want to determine as accurately as they can the actual weight of
this object. Of the following methods, which would you recommend they use?
_____a. Use the most common number, which is 6.2.
_____b. Use the 6.15 since it is the most accurate weighing.
_____c. Add up the 9 numbers and divide by 9.
_____d. Throw out the 15.3, add up the other 8 numbers and divide by 8.
3. Which of the following sequences is most likely to result from flipping a
fair coin 5 times?
_____a. H H H T T
_____b. T H H T H
_____c. T H T T T
_____d. H T H T H
_____e. All four sequences are equally likely.
4. Select the alternative below that is the best explanation for the answer you
gave for the item above.
_____a. Since the coin is fair, you ought to get roughly equal numbers of
heads and tails.
_____b. Since coin flipping is random, the coin ought to alternate frequently
between landing heads and tails.
_____c. Any of the sequences could occur.
_____d. If you repeatedly flipped a coin five times, each of these sequences
would occur about as often as any other sequence.
_____e. If you get a couple of heads in a row, the probability of a tails on the
next flip increases.
_____f. Every sequence of five flips has exactly the same probability of occurring.
7. Half of all newborns are girls and half are boys. Hospital A records an average of 50 births a day. Hospital B records an average of 10 births a day. On a
particular day, which hospital is more likely to record 80 % or more female births?
_____a. Hospital A (with 50 births a day).
_____b. Hospital B (with 10 births a day).
_____c. The two hospitals are equally likely to record such an event.
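The hospital question can be checked numerically with a binomial model. The sketch below is a simplification: it assumes that each hospital records exactly 50 or exactly 10 births on the day in question (the survey only gives averages) and that each birth is independently a girl with probability 1/2. The function name is illustrative.

```python
from math import comb


def prob_at_least(n, k_min, p=0.5):
    """P(X >= k_min) for X ~ Binomial(n, p): at least k_min girls among n births."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))


# 80 % of 50 births is 40 girls; 80 % of 10 births is 8 girls.
p_hospital_a = prob_at_least(50, 40)
p_hospital_b = prob_at_least(10, 8)

if __name__ == "__main__":
    print("Hospital A:", p_hospital_a)
    print("Hospital B:", p_hospital_b)
```

Under these assumptions the probability for the smaller hospital is 56/1024, which is far larger than the probability for the larger one. This illustrates that proportions in small samples vary more.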
8. STUDY SCHEDULE
Course: Probability Theory and Mathematical Statistics
Term: 4
Year: 2
Lecturer: A.V. Kitaeva, professor
Weeks: 18
Credits: 4
Lectures, hrs: 36
Practice, hrs: 36
Labs, hrs: 0
Class work in total, hrs: 72
Self-study, hrs: 72
TOTAL, hrs: 144
Term Schedule
Columns: Weeks; Theoretical material (Topic, Subject); Practice (Subject, Points); Lab (Subject, Points); Testing (Points); Boundary check (Points); Problem-oriented tasks (Points); Total.
Weeks 1-2. Topic: Basic probability concepts
Theoretical material: Random experiments. Events. Probability. Historical remarks.
Practice: Classical probability. Geometric probability. (2 points)
Total: 2
Weeks 2-3. Topic: Combinatorics
Practice: Arrangement. Permutation. Combination. (3 points)
Total: 3
Weeks 3-4. Topic: Properties of probability
Practice: Sum and product rules. (3 points)
Total: 3
Weeks 4-5. Topic: Conditional probability. Independence
Practice: Total probability and Bayes formulas. (4 points)
Testing: Test 1 (8 points)
Total: 12
Check point # 1 in total: 20
Weeks 5-6. Topic: The Monty Hall problem and other puzzles
Subject: Conditioning. Probability puzzles.
Weeks 6-9. Topic: Discrete random variables
Subject: Distribution. Mean. Variance. Basic discrete distributions.
Weeks 9-12. Topic: Continuous random variables
Theoretical material: Density and distribution functions. Bertrand’s paradox. Exponential distribution. Multivariate continuous distribution. Independence. Transformation of random variables. Sums, products, and quotients of random variables. Moment generating function. Distributions related to the normal distribution. Conditional density and expectation. The median, quartiles, percentiles. Skewness and kurtosis. Simulation of random variables.
Practice: Density. Basic continuous distributions. MGF. Moments. Conditional distribution. Simulation.
Testing: Test 2 (8 points)
Points as listed for weeks 5-12: 4, 4, 6; totals: 4, 12, 6
Weeks 12-14. Topic: Limit results for sequences of random variables
Theoretical material: Convergence of random variables. Inequalities. The law of large numbers. Sampling from a distribution. Central limit theorem.
Practice: Types of convergence of RVs. Chebyshev’s inequality. De Moivre’s approximation. (6 points)
Total: 6
Weeks 14-15. Topic: Point estimation
Theoretical material: Sample mean and sample variance. Properties of the estimators. Sufficient statistics. Method of moments. Method of maximum likelihood.
Practice: Method of moments and method of maximum likelihood for estimating the parameters of classic distributions. (6 points)
Total: 6
Weeks 15-16. Topic: Interval estimation
Theoretical material: Confidence intervals for mean. Confidence interval for difference of two means. Confidence interval for variance. Confidence interval for a proportion.
Practice: Construction of confidence intervals. (6 points)
Total: 6
Weeks 17-18. Topic: Hypotheses testing
Theoretical material: Formulating the hypotheses (H0 and H1). Test criterion and rejection region. Two types of errors. Performing a test.
Practice: Hypotheses testing. Pearson’s lemma. (6 points)
Total: 6
Weeks 17-18. Topic: Descriptive statistics
Subject: Histogram, sample distribution function and box-and-whisker plot.
Testing: Test 3 (10 points)
Total: 10
Check point # 2 in total: 56
Check points # 1, 2 in total: 76
Test: 24
Points in total (all course): 100
9. Resources
Textbook: We will use a (free) text that is available online in pdf form.
Additional Resources
1. Chung K.L. and AitSahlia F. Elementary Probability Theory. With Stochastic Processes and an Introduction to Mathematical Finance. – 2003. – 402 p.
2. Elias J. Syllabus //www.montana.edu
3. Johnson R.A. and Bhattacharyya G.K. Statistics: Principles and Methods. – 2011. – 686 p.
4. Grinstead C.M. and Snell J.L. Introduction to Probability. – 1997. – 510 p. (available free http://www.freebookcentre.net/Mathematics/Probability-TheoryBooks.html)
5. Soong T.T. Fundamentals of Probability and Statistics for Engineers. – 2004. – 408 p.
6. Stirzaker D. Elementary Probability. – 2003. – 520 p.
7. Suhov Y. and Kelbert M. Probability and Statistics by Example. – 2005. – 360 p.
8. Vrbik J. Lecture Notes. Probability //www.brocku.ca/
9. Vrbik J. Lecture Notes. Mathematical Statistics //www.brocku.ca/
This program was prepared in accordance with TPU Standards and the requirements of the Federal State Educational Standards (FSES) for the study major 230100 “Computer science and technology”.
This program was approved at a meeting of the Control System Optimization (CSO) department
(minutes No. ____ dated «___» _______ 2012).
Author ____________________Kitaeva A.V.
Reviewer _________________________