Introduction to Statistics

Parameter: measurable characteristic of a population.
Population: all members of a definable group. For statistical purposes a population must have definable characteristics, even if it is not possible to measure the variable or even to count the number of members in the population.
Sample: subset or subgroup of a population, usually obtained by random sampling of a single population.
Statistic: measurable characteristic of a sample. E.g., height, weight, political affiliation, ethnicity, aerobic capacity, strength, power, ....
Data or Data Set: collection of numerical and/or nonnumerical values (plural of datum).
Datum: single measured value (singular of data).

Statistics

Statistics: 1. plural of statistic; 2. the science of conducting studies to collect, organize, summarize, analyze and draw conclusions from data.
Descriptive statistics: collection, description, organization, presentation and analysis of data.
Inferential statistics: generalizing from samples to populations, testing hypotheses, determining relationships among variables and making decisions; uses probability theory to make decisions.
Hypothesis: “less than a thesis”, a testable conjecture based on a theory.
Thesis: a dissertation or learned argument that defends a particular proposition or theory.
Qualitative measurements: typically non-numerical, subjectively measured, judgmentally determined, categorical. E.g., religious affiliation, teacher/professor evaluations, emotional states, flavour, gender.
Quantitative measurements: typically numerical and objectively measured; reliability (repeatability or precision) and validity (accuracy) can be evaluated against a criterion. E.g., salary, course grade, foot size, IQ, age, girth.

Types of quantitative measures:
Constants: quantities with fixed characteristics.
Physical constants: G, c, h (Planck’s constant)
Mathematical constants: π, e, i
Variables: quantities whose characteristics vary.
Discrete variables: numerical variables with finitely many or countably many possible values (usually integers). Examples: value of dollar bills or coins, card count.
Continuous variables: numerical variables with infinitely many possible values, either within a bounded range (e.g., numbers between –1 and +1) or unbounded (e.g., the real numbers, numbers greater than 0). Examples: height, duration, angle (only a fixed number of significant figures are reported).

Significant Figures

When reporting numerical information, especially values obtained from a calculator, usually only 3 or 4 digits are required. The general rule, which is accurate to within 0.5%, is that 4 significant figures are needed if the first nonzero digit is a 1 and 3 if it is not. Examples: 234 000, 1.234, 2.45, 0.003 45, 0.1234, 8910, and 56 100. Exceptions are frequencies and counts, for which all digits are reported, and financial numbers, which are reported to the nearest dollar or nearest cent depending on the amount.

Measurement Scales

Nominal: classifies data into mutually exclusive (nonoverlapping), exhaustive categories in which no ordering or ranking of the categories is implied. E.g., colour, flavour, religion, gender, sex, nationality, county of residence, postal code.
Ordinal: classifies data into categories that can be ordered or ranked (highest to lowest or vice versa), but precise differences between categories do not exist. E.g., teaching evaluations, letter grade (A+, A, A–, ... F), judges' scores (0–10), preferences (polls), skill rankings.
Interval: numerical data with precise differences between values but with no true zero (a true zero would imply absence of the quantity). E.g., IQ (0 means could not be measured), temperature (degrees Celsius), z-scores (0 is the average value), acidity (pH, 7 is neutral).
Ratio: interval data with a true zero, so true ratios exist. E.g., height, weight, temperature (in kelvins), strength, price, age, duration.
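The significant-figures rule given earlier (4 figures when the first nonzero digit is a 1, otherwise 3) can be sketched in Python. This is only an illustration: the function names `leading_digit` and `round_sig` are hypothetical, and the exceptions for counts, frequencies and financial values are not handled.

```python
def leading_digit(x: float) -> int:
    """Return the first nonzero digit of x (x must be nonzero)."""
    # Scientific notation puts the first nonzero digit at the front,
    # e.g. 0.003456 -> '3.456000000000000e-03'.
    return int(f"{abs(x):.15e}"[0])


def round_sig(x: float) -> float:
    """Round x to 4 significant figures if its first nonzero digit
    is 1, and to 3 significant figures otherwise."""
    if x == 0:
        return 0.0
    figures = 4 if leading_digit(x) == 1 else 3
    # '.{figures - 1}e' keeps one digit before the decimal point
    # and figures - 1 digits after it.
    return float(f"{x:.{figures - 1}e}")


for value in (1.23456, 2.4549, 0.0034549, 8912, 234567):
    print(value, "->", round_sig(value))
```

String formatting is used to find the leading digit because it avoids the floating-point surprises that a logarithm-and-division approach can produce for values such as 0.3.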
Methods of Sampling

Random: subjects are randomly selected from a population; all subjects have an equal probability of being selected, and no subject may be selected twice.
Systematic: subjects are numbered sequentially and every nth subject is selected, giving a sample of N/n subjects (N is the number of subjects in the population).
Stratified: the population is divided into identifiable groups (strata) by some relevant variable (income, gender, age, education) and each stratum is sampled randomly in proportion to the stratum’s relative size in the population.
Cluster: subjects are randomly sampled from representative clusters or regions of the population. An economical method if subjects are widely dispersed geographically.
Convenience: typically used in student projects and by journalists; uses subjects that can be conveniently polled or tested. Not suitable for pollsters or medical research.
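The random, systematic and stratified methods above can be sketched with Python's standard `random` module. This is a minimal sketch under stated assumptions, not a prescribed procedure: the function names, the population of 100 numbered subjects and the two strata are all made up for illustration, and the random starting point in the systematic sample is a common convention that the notes do not specify. The last two lines contrast a parameter (the population mean) with a statistic (the sample mean).

```python
import random
from statistics import mean


def simple_random_sample(population, k, rng):
    # Sampling without replacement: no subject is selected twice.
    return rng.sample(population, k)


def systematic_sample(population, n, rng):
    # Every nth subject, from a random start among the first n
    # subjects (the random start is an assumption of this sketch).
    start = rng.randrange(n)
    return population[start::n]


def stratified_sample(population, stratum_of, k, rng):
    # Group the subjects by stratum.
    strata = {}
    for subject in population:
        strata.setdefault(stratum_of(subject), []).append(subject)
    # Sample each stratum in proportion to its share of the population.
    # Rounding means the total can differ slightly from k in general.
    sample = []
    for members in strata.values():
        share = round(k * len(members) / len(population))
        sample.extend(rng.sample(members, share))
    return sample


rng = random.Random(0)            # seeded so the run is repeatable
population = list(range(1, 101))  # hypothetical subjects numbered 1..100


def stratum_of(subject):
    # Hypothetical strata: 60 subjects in "low", 40 in "high".
    return "low" if subject <= 60 else "high"


simple = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)   # N/n = 100/10 = 10
stratified = stratified_sample(population, stratum_of, 10, rng)

population_mean = mean(population)  # parameter: describes the population
sample_mean = mean(simple)          # statistic: describes the sample
```

With these made-up strata, the stratified sample of 10 draws 6 subjects from "low" and 4 from "high", which mirrors the 60/40 split in the population.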