Mohtrma Benazir Bhutto Sindh campus
Dadu
Course Title: Probability and Statistics
Department of Information and Technology
Course Code: ITEC-426
Class: BS P-II
Subject Teacher: Kalsoom Khan Babar
 Syllabus outline

Probability and random variables, interpretation of probability as a relative frequency,
subjective probability, probability density function, functions of random variables, expectation
values.

Probability functions: binomial and multinomial distributions, Poisson distribution, uniform
distribution, exponential distribution, log-normal distribution.

Statistical tests: hypotheses, test statistics, significance level, choice of critical region using the
Neyman-Pearson lemma, constructing test statistics, linear test statistics, nonlinear test
statistics, goodness-of-fit test.

Parameter estimation: samples, estimators, bias, estimators for mean, variance and covariance,
MLE (maximum likelihood estimator), maximum likelihood with binned data, relation between ML
and Bayesian estimators.

Statistical errors, confidence intervals and limits, the standard deviation as statistical error,
classical confidence interval (exact method).
 Book reference
Introduction to Statistical Theory (Part 1)
By Prof. Sher Muhammad Chaudhry & Prof. Dr. Shahid Jamal.
 Basic Statistics
 Statistics

Statistics may be defined as the science of the collection, presentation, analysis and
interpretation of numerical facts or data.

Scope of statistics: In its early stage, statistics was used by the state to collect information on
public affairs for administration.
Gradually its use was extended to all scientific experiments in which data are collected, presented and analysed
to reach valid conclusions. With this tool one can test the significance of a hypothesis and predict a future course on
the basis of past experience.

Collection of data:

From a practical point of view, the first step in statistics is the collection of numerical
data. These data are needed in different fields of human activity.
According to their source, statistical data may be classified into two types, namely:
1. Primary data
2. Secondary data
Data which are collected for the first time for a specific purpose are called primary data.
Data used in an investigation which were originally collected by someone else are
called secondary data.
 METHODS TO COLLECT PRIMARY DATA
1. Direct personal investigation
2. Personal interviews
3. Collection through Questionnaires
4. Collection through Enumerators
5. Collection through local Sources
6. Computer interviews

Variables

A variable is a characteristic whose value changes from time to time, place to place, or individual to
individual. Examples of variables are the heights of students of a college and the earnings of factory workers. A
variable is usually denoted by the capital letter “X”.

There are two types of variables:
1. Discrete variable
2. Continuous variable

Discrete variable
A discrete variable is one which takes only discrete values, such as whole numbers.

Example:
X = 0, 1, 2, 3, 4, ...

Continuous variable
A continuous variable may be defined as one which can take on any value within a given interval, such as
the height of a person, the weight of a baby, the temperature at a place, etc.
 Presentation of data
Arranging the data so that one can get more information in less time is known as presentation of
data.
There are two methods which may be used for the presentation of collected data:
1. Frequency distribution
2. Graphical presentation
A frequency distribution is a statistical table which arranges the data according to their
magnitude or size, either individually or in groups, with the corresponding number of values (frequencies)
side by side.

Class limits

Class boundaries

Class marks or Mid point.

Class width or interval
CONSTRUCTING A GROUPED FREQUENCY DISTRIBUTION:
1. Decide on the number of classes into which data are to be grouped.
2. Divide the range of variation by the number of classes.
3. Decide where to locate the class limits.
4. Determine the remaining class limits.
5. Distribute the data into the appropriate classes.
6. Total the frequency column.

GROUPED DATA: Data presented in the form of groups (classes) are called grouped data.

UNGROUPED DATA: Data which have not been arranged in a systematic order are called raw
data or ungrouped data.

The arrangement of raw data in ascending or descending order of magnitude is known as an
“array”.

26 28 52 55 43 46 46 51 43 40 43 42 46
The data above are ungrouped data.

The data array
The following table shows the marks of 25 students of the IT class arranged into an array in ascending order (read down each column):
26  35  42  43  51
28  36  42  43  51
29  39  42  46  52
31  40  43  46  52
32  40  43  46  52
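The steps for constructing a grouped frequency distribution can be sketched in Python with the 25 marks listed above. The choice of six classes is an assumption made for this illustration, not something stated in the notes: the range is 52 - 26 = 26, and 26/6 rounded up gives a class width of 5, with the first class taken to start at 25.

```python
# Marks of the 25 students from the table above.
marks = [26, 35, 42, 43, 51, 28, 36, 42, 43, 51, 29, 39, 42, 46, 52,
         31, 40, 43, 46, 52, 32, 40, 43, 46, 52]

array = sorted(marks)  # the "array": raw data in ascending order

# Assumptions for this sketch (not from the notes):
# class width 5, first lower class limit 25.
class_width = 5
lower_start = 25

frequency = {}
for lower in range(lower_start, max(array) + 1, class_width):
    upper = lower + class_width - 1              # inclusive upper class limit
    frequency[(lower, upper)] = sum(lower <= m <= upper for m in array)

for (lower, upper), count in frequency.items():
    class_mark = (lower + upper) / 2             # mid point of the class
    print(f"{lower}-{upper}   class mark {class_mark}   frequency {count}")

total = sum(frequency.values())                  # step 6: total the frequency column
print("Total", total)
```

A different number of classes or starting limit would give an equally valid table; only the total of the frequency column must stay equal to the number of observations.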

Probability and probability distribution

Probability

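Probability is a measure of the degree of belief in a particular statement or problem.
Examples of random experiments
1. Tossing a coin, drawing a card, throwing a die, etc.

Example:1
Toss a coin once. What is the probability that it will show a head?
Sample space: S = {H, T}
Symbolically: P(A) = n(A)/n(S), where n(A) is the number of outcomes in event A and n(S) is the number of outcomes in the sample space S.
P(H) = 1/2, P(T) = 1/2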

When all outcomes are equally likely, a probability is simply a relative frequency (a percentage). In other words, probabilities
are computed using the following simple formula, which we refer to as
the f/N rule.
Probability of an event = f/N.
f = number of ways the event can occur.
N = total number of possible outcomes.
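As a small illustration of the f/N rule, the sketch below computes a few probabilities for the coin, die and card experiments mentioned earlier. The helper name `probability` is made up for this example, and a fair coin, a fair die and a standard 52-card deck are assumed.

```python
from fractions import Fraction

def probability(f, N):
    """f/N rule: f favourable outcomes out of N equally likely outcomes."""
    return Fraction(f, N)

p_head = probability(1, 2)    # one head among the two faces of a coin
p_even = probability(3, 6)    # faces 2, 4 and 6 of a fair die
p_ace = probability(4, 52)    # four aces in a standard 52-card deck

print(p_head, p_even, p_ace)
```

Using `Fraction` keeps the results exact, so they can be compared directly with hand calculations.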

Properties of probabilities
1. The probability of an event is always between 0 and 1, inclusive.
2. The probability of an event that cannot occur is 0 (such an event is called an
impossible event).
3. The probability of an event that must occur is 1 (such an event is called a certain event).
4. In symbols, 0 ≤ P(E) ≤ 1 for every event E.

GENERALIZED PRINCIPLE OF COUNTING

Consider r experiments exp1, exp2, ..., expr with n1, n2, ..., nr possible outcomes respectively. If the r
experiments are performed together, there are n1*n2*...*nr possible outcomes.
Example:2
An experiment consists of throwing a die and then drawing a letter at random from the English
alphabet. How many points are there in the sample space?
Throwing a die = n1 = 6 ways
Drawing a letter of the English alphabet = n2 = 26 ways
Total # of outcomes = n1*n2 = 6*26 = 156 ways
Example:3
A multiple choice test consists of 5 questions, each with 4 possible answers of which only one is
correct.
a. In how many different ways can a student check off one answer to each question?
b. In how many different ways can a student check off one answer to each question and get all the
answers wrong?
Solution:
a. Each question has 4 possible answers: n1=n2=n3=n4=n5 = 4 ways
Total # of outcomes = 4*4*4*4*4 = 4^5 = 1024 ways
Therefore,
a student can check off one answer to each question in 1024 ways.
b. Each question has 4 possible answers, of which one is correct and 3 are wrong.
n1=n2=n3=n4=n5 = 3 ways
Total no. of outcomes = 3^5 = 243 ways
Therefore,
a student can check off the answers in 243 ways and get every answer wrong.
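The counts in Examples 2 and 3 can be checked by listing the sample spaces explicitly. This is only a verification sketch; labelling the correct option of every question as 0 is an arbitrary choice made here.

```python
from itertools import product
from string import ascii_lowercase

# Example 2: a die throw followed by a letter drawn from the English alphabet.
sample_space = list(product(range(1, 7), ascii_lowercase))
print(len(sample_space))                      # 6 * 26

# Example 3: 5 questions with 4 options each.
# Assumption for this sketch: option 0 is the correct one in every question.
answer_sheets = list(product(range(4), repeat=5))
all_wrong = [sheet for sheet in answer_sheets if 0 not in sheet]
print(len(answer_sheets), len(all_wrong))     # 4**5 and 3**5
```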
Permutation

A permutation is an ordered subset from a set of distinct objects. The number of permutations
of r objects, selected in a definite order from n distinct objects, is denoted by ⁿPᵣ.

An arrangement in which repetition is not allowed and order is relevant is known as a
permutation.

ⁿPᵣ = n!/(n-r)! = n(n-1)(n-2)…(n-r+1)
When n = r, ⁿPₙ = n!, where
n! = n(n-1)(n-2)…3*2*1.

1! = 1 & 0! = 1

2! = 2*1 = 2

3! = 3*2*1 = 6

5! = 5*4*3*2*1 = 120
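The permutation formula can be checked numerically. The sketch below compares the factorial formula with Python's built-in `math.perm` and with an explicit listing; the function name `n_p_r` and the choice n = 5, r = 3 are made up for this illustration.

```python
import math
from itertools import permutations

def n_p_r(n, r):
    """Number of permutations of r objects chosen from n: n! / (n - r)!"""
    return math.factorial(n) // math.factorial(n - r)

by_formula = n_p_r(5, 3)                              # 5!/2! = 5*4*3
by_library = math.perm(5, 3)                          # available in Python 3.8+
by_listing = len(list(permutations("abcde", 3)))      # ordered selections, no repetition

print(by_formula, by_library, by_listing)
print(n_p_r(5, 5) == math.factorial(5))               # when n = r the count is n!
```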

Combination

A combination of r objects from a collection of n objects is any unordered arrangement of r of
the n objects, in other words, any subset of r objects from the collection of n objects.
Note
Order matters in a permutation but not in a combination.

The # of possible combinations of r objects that can be formed from a collection of n objects is

ⁿCᵣ = n!/(r!*(n-r)!)

Example:
Consider the collection consisting of the five letters a,b,c,d,e
a. List all possible combinations of three letters from this collection of five letters.
b. Use part (a) to determine the # of possible combinations of three letters that can be formed from
the collection of five letters, that is, find ⁵C₃.
Solution:
a. For this part we need to list all unordered arrangements of three letters from the first five
letters in the English alphabet.
{a,b,c} {a,b,d} {a,b,e} {a,c,d} {a,c,e} {a,d,e} {b,c,d} {b,c,e} {b,d,e} {c,d,e}
b. Counting the list in part (a) gives ⁵C₃ = 10, which agrees with the formula: 5!/(3!*2!) = 120/(6*2) = 10.
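The same example can be reproduced in a few lines of Python, which lists the three-letter combinations and compares their number with the formula. This is a checking aid only, using the standard library's `itertools.combinations` and `math.comb`.

```python
import math
from itertools import combinations

letters = "abcde"
subsets = list(combinations(letters, 3))      # unordered selections of 3 letters

for s in subsets:
    print("{" + ",".join(s) + "}")

by_listing = len(subsets)
by_formula = math.factorial(5) // (math.factorial(3) * math.factorial(2))
print(by_listing, by_formula, math.comb(5, 3))
```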

Some rules of probability

Two mutually exclusive events,
where each event has equal probability.

Events that are not mutually exclusive.
THE SPECIAL ADDITION RULE

If events A and B are mutually exclusive, the probability that one or the other occurs is found by adding their probabilities:
P(E) = P(A) + P(B), where E denotes the event E = A + B (A or B).
More generally, for mutually exclusive events A, B, C, …
P(A+B+C+…) = P(A)+P(B)+P(C)+…
P(A or B or C or …) = P(A)+P(B)+P(C)+…
Here (+) and (or) are interchangeable; both notations have the same meaning.

Example
According to the Congressional Directory, the age distribution for senators in the 104th U.S. Congress
is as follows:

Age (yrs)      No. of senators   Event
Under 40       1                 A
40-49          14                -
50-59          41                B
60-69          27                C
70 and over    17                D
Total          100

For a senator selected at random:
I. Determine the probability of each age-group event.
II. Find P(A or B or D).
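A short sketch of how the special addition rule applies to this example, using the counts for events A, B, C and D and the total of 100 senators given in the table. The age groups do not overlap, so the events are mutually exclusive and their probabilities can simply be added. The helper name `p` is made up for this illustration.

```python
from fractions import Fraction

# Number of senators in each labelled event, taken from the table above.
counts = {"A": 1, "B": 41, "C": 27, "D": 17}
total_senators = 100                      # total given in the table

def p(event):
    """f/N rule: senators in the event divided by all senators."""
    return Fraction(counts[event], total_senators)

for event in counts:
    print(event, p(event), float(p(event)))

# Special addition rule for mutually exclusive events.
p_a_or_b_or_d = p("A") + p("B") + p("D")
print(p_a_or_b_or_d, float(p_a_or_b_or_d))
```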

THE COMPLEMENTATION RULE

The second rule of probability is the complementation rule, which states that the probability that an
event occurs equals 1 minus the probability that it does not occur.

P(E) = 1 - P(not E)

This rule is mostly used with probability distributions such as the
binomial, Bernoulli and geometric distributions, among many others.
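As an illustration of the complementation rule, the sketch below finds the probability of at least one six in two throws of a fair die in two ways: directly, and as 1 minus the probability of no six. The two-dice experiment is an example chosen here, not one taken from the notes.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))          # 36 equally likely outcomes

# P(not E): neither throw shows a six.
p_no_six = Fraction(sum(6 not in r for r in rolls), len(rolls))

# Complementation rule: P(E) = 1 - P(not E).
p_at_least_one_six = 1 - p_no_six

# Direct count of E, for comparison.
p_direct = Fraction(sum(6 in r for r in rolls), len(rolls))

print(p_no_six, p_at_least_one_six, p_direct)
```

Counting the complement is often easier than counting the event itself, which is why the rule appears so often in "at least one" problems.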
RANDOM VARIABLE

A random variable is a quantitative variable whose value depends on chance.

A random variable is a function defined on the sample space. It is basically a device for transferring probability
from a complicated sample space to a simple one; thus a random variable assigns a real
number to each outcome.

There are two types of random variables:
I. Discrete random variables
II. Continuous random variables

Discrete random variables
A discrete random variable X is a random variable whose possible values form a finite (or
countably infinite) set of numbers, e.g. 1, 2, 3, ...
Continuous random variables
A random variable X is said to be continuous if it can assume every possible value in an interval [a,b], a<b,
where a and b may be -∞ and +∞ respectively.
Examples: the height of a person, the temperature at a place, the amount of rainfall, the time to failure of an
electronic system, etc.

Probability function

Probability density function

The function f(x) is called the probability density function, abbreviated to p.d.f., or simply the density
function of the random variable X.

Properties of p.d.f:
I. f(x) ≥ 0, for all x
II. ∫_{-∞}^{+∞} f(x) dx = 1
III. The probability that X takes on a value in the interval [c,d], c<d, is given by
P(c < X ≤ d) = F(d) - F(c) = ∫_{c}^{d} f(x) dx,
where F(x) = P(X ≤ x) is the distribution function of X.

Example:1
A random variable X is of continuous type with pdf
f(x) = 2x, 0<x<1
     = 0, elsewhere
Find:
1) P(X≤1/2)
2) P(X>1/4)
3) P(1/4 ≤X<1/2)
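One way to work this example is through the distribution function. Integrating f(x) = 2x from 0 to x gives F(x) = x^2 for 0 < x < 1, and each probability is then a difference of F values. The sketch below carries this out with exact fractions; the function name `cdf` is made up for this illustration, and the notes themselves do not show a solution.

```python
from fractions import Fraction

def cdf(x):
    """F(x) = P(X <= x) for the pdf f(x) = 2x on 0 < x < 1 (zero elsewhere)."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    return x * x                                       # integral of 2t from 0 to x

half, quarter = Fraction(1, 2), Fraction(1, 4)

p1 = cdf(half)                                         # 1) P(X <= 1/2)
p2 = 1 - cdf(quarter)                                  # 2) P(X > 1/4), by complementation
p3 = cdf(half) - cdf(quarter)                          # 3) P(1/4 <= X < 1/2)

print(p1, p2, p3)
print(cdf(1) - cdf(0))                                 # total probability
```

For a continuous random variable a single point has probability zero, so using ≤ or < at the end points does not change any of these answers.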

Example:2
A random variable X is of continuous type with pdf f(x) = 2(x-1), 1<x≤2