Department of Business Administrative IVE-HW QT1-Exam revision note
Quantitative Techniques 1 2004-2005 Revision Note (1st Term)

Chapter 1: Introduction to Statistics

Statistics is the science of: Collecting (收集), Organizing (組織), Presenting (陳述), Analyzing (分析) and Interpreting (解釋) data.
Purpose of statistics: Decision making, decision understanding and prediction.

Descriptive Statistics (敘述統計學): Methods of organizing, summarizing and presenting data in an informative way.
  Purpose: To present data by tables, charts and statistical measures.

Inferential Statistics (推論統計學): Draws inferences about a population from a sample, i.e. makes statements about the population's characteristics from the information contained in the sample.
  Purpose: To make decisions about population characteristics, which involves estimation and hypothesis testing.

[Figure: Data Source]
[Figure: Data Type]

Key words to remember:
Population - The set or collection of items of interest.
Population Parameters (Parameters) - Numerical characteristics of a population.
Sample - A portion of the population that is selected from it.
Sample Statistics (Statistics) - Numerical characteristics of a sample.
Data Set (Data File) - A collection of data organized to facilitate data analysis.
Item - Any entity of interest (e.g. person, company ...). In practice, items are often referred to as elements, objects or units.
Data value - A measurement of a variable. A data set may include several variables. See the example in the lecture notes.
Observation - Made up of the data values of a single item across all of the variables. Referred to as a case, a row or a record in the practice of statistics. Refer to the example in the lecture notes.
Qualitative or Categorical variable - Non-numeric; each value is classified into a single category. E.g. Good, Bad, Average
Quantitative variable - Measured on a numerical scale. E.g. Age, Weight, Height
Experimental Studies - The investigator directly controls or determines which subjects, experimental items or materials receive the treatments that are thought to affect the variables of interest.
Observational Studies - The variables of interest are tested using observed or historical data. The investigator does not directly control or determine which subjects or items receive the treatment that is thought to affect the variables of interest in the study.

Level of measurement:
1. Nominal level (Grouping) - The lowest level of measurement. Data can only be grouped or categorized. E.g. [Man; Woman]
2. Ordinal level (Grouping & Ranking) - The measure carries information about order, but no exact value is presented. E.g. [Superior; Good]
3. Interval level - Includes the exact distance between measures, but has no true zero as a starting point, so ratios of values are not meaningful. E.g. [A > B by 1]
4. Ratio level - The data have a meaningful zero point, so the ratio of two data values is meaningful. An exact value is presented. E.g. [John's height is 5'8"]

Chapter 2: Sampling Methods

Reasons for not using the whole population: it is time consuming, costly and inefficient, measurement may destroy the items, and the full membership may be unknown.
Sample design - The plan for obtaining a representative sample from the population.
Survey method - The procedure for collecting useful information from the selected sample.
Representative sample - A sample that contains the relevant characteristics of the population in the same proportions.

Probability Sampling:
Characteristics: Each possible item has a known and equal probability of being included in the sample. The selection of one item does not affect the chance of any other item being selected.

Random Sampling: Use of a random number table: refer to the lecture notes.
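
As a rough illustration of the characteristics above, the sketch below draws a simple random sample in Python. It is not taken from the lecture notes: the standard `random` module stands in for the random number table, and the frame of 50 numbered items, the sample size and the seed are all hypothetical.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct items so that every item has the same chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(list(population), n)


# Hypothetical sampling frame: items numbered 1 to 50
frame = list(range(1, 51))
sample = simple_random_sample(frame, 5, seed=2004)
print(sample)
```

Sampling is done without replacement, so no item can appear twice, which mirrors reading distinct item numbers off a random number table.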
Systematic Random Sampling: Elements are arranged in some order, a starting point is randomly selected (by the random number table), and then other elements are selected from the population at a uniform interval.
  Adv: The sample is spread more evenly over the entire population.
  Disadv: Possible presence of hidden periodicity.

Stratified Random Sampling: The population is divided into a number of non-overlapping homogeneous groups, called strata. Elements with similar characteristics are grouped together to form each homogeneous group.
  Proportional: The number of items selected from each stratum is in the same proportion as in the population.
    Mean = $\dfrac{X_1 + X_2 + \dots + X_n}{n}$
  Non-Proportional: Equal numbers of elements are selected from each stratum, and the results are weighted according to each stratum's proportion of the whole population.
    Mean = $\dfrac{N_1 \bar{X}_1 + \dots + N_n \bar{X}_n}{N_1 + \dots + N_n}$, where $N_i$ is the size of stratum i in the population and $\bar{X}_i$ is the sample mean of stratum i.
  Adv: Stratified random sampling is used to reduce bias in sampling.

Cluster Random Sampling: The total area of interest (population) is divided into a number of small, non-overlapping blocks (clusters). A number of these clusters are then randomly selected for inclusion in the overall sample, on the assumption that the individual clusters are representative of the population as a whole.
  Adv: Although estimates from cluster sampling are usually less reliable than estimates based on a simple random sample of the same size, they are usually more reliable per unit cost.

Comparison between stratified and cluster sampling:
Stratified random sampling: small variation within each group, but wide variation between the groups.
Cluster random sampling: considerable variation within each group, but the groups are similar to each other.

Non-Probability Sampling:
Characteristics: Used primarily as a matter of convenience. It may produce quite accurate estimates of population parameters, but because the sample is not chosen using probability methods, there is no valid way of determining the reliability of the resulting estimates.
Judgement Sampling: Personal judgement plays a significant role in the selection. E.g. choosing test markets for new products.
Quota Sampling: Interviewers are simply given quotas to be filled. Once the quota is set, interviewers are granted flexibility in the choice of sample members.
Biased samples: Occur when an unsuitable sampling method is chosen, or when too few items are sampled.

Chapter 3: Survey Methods

Primary Data - Data that are used for the specific purpose for which they were collected.
Secondary Data - Data that are being used for some purpose other than that for which they were originally collected.
Internal Data - Data generated from the activities within a firm.
External Data - Data obtained from sources outside the firm.

Different kinds of survey methods:

Direct Observation: Observe a phenomenon with your own eyes. It is concerned with what people do rather than why they do it. It provides accurate data, free of biases introduced by interviewers, and is useful for continuously collecting data about routine consumer behaviour.
Adv:
1. Actual actions or habits of a person are observed.
2. Applicable when it is undesirable for people to know an experiment is taking place.
3. Provides one of the most reliable methods of data collection.
Disadv:
1. The result of the observation depends on the skill of the observer.
2. Opinions and attitudes cannot be obtained by observation.
3. Some forms of behaviour cannot be captured by "one-time" observation.
4. Expensive, because it ties up personnel.

Interview:
Adv:
1. Generates very rich data, both quantitatively and qualitatively.
2. Normally achieves a high response rate.
3. The interviewer may assess the person being interviewed in terms of age and social class, and sometimes even assess the accuracy of the information given.
Disadv:
1. Probably the most expensive method.
2. Interviewers must be well trained.
3. People may not like to give embarrassing information.
4. Some types of people are more difficult to locate and interview.

Phone Interview:
Adv:
1. Speed and relative economy when only a limited amount of information is required.
2. Computer-assisted telephone interviewing increases data input accuracy and saves on labour costs.
Disadv:
1. Refusing to answer questions is easier.
2. Time may be wasted in phoning people who are not at home.

Postal Questionnaires:
Adv:
1. Speed and cost.
2. No interviewer bias.
3. The respondent has enough time to consult before answering.
Disadv:
1. The design of the questionnaire requires great care.
2. Poor response rate, and incomplete or wrongly completed forms.
3. Spontaneous answers cannot be collected.
4. The "wrong" person may complete the questionnaire.

Questionnaires:
Brevity - The questionnaire should be as brief as possible.
Simplicity - A complicated form may well conceal the real point of the questionnaire. Do not use four or five words when one would suffice.
Ambiguity - The respondent must be in no doubt as to what a question means. E.g. (ambiguous) "Have you ever been involved in an accident in the past?"
Leading Questions - It is unwise to lead the respondent towards a certain response. E.g. "Responsible jewellers always use the machine guards, do you use guards?"
Personal Questions - Avoid personal questions unless they are absolutely necessary.

Important points for survey studies:
Decide your objectives, your target interviewees and your questions.
Try to use closed-ended answers or multiple choice.
Try to collect personal data at the end of your questionnaire, with a statement such as "All information is for statistical purposes only".
People are reluctant to think and write, so work out all possible answers (or the most common ones) and set them out as options.
Identify yourself before you talk to your target interviewees.
If the necessary information is either already available or impossible to obtain, there is no point in carrying out the survey.
Is the relevant population available?
There is no unique way of providing the "best" sampling scheme.
The investigator will want to obtain answers from as high a proportion of the sample members as possible.
Collect answers that are as accurate and as honest as possible. There is an art in designing questions.
The response rate can be improved by including a covering letter, a post-paid envelope and a gift.

Sampling Error: Results from the fact that information is available on only a subset of all the population members.
Non-Sampling Error: Unconnected with the kind of sampling procedure used.
Reasons for Non-Sampling Error:
1. The population sampled is not the relevant one.
2. Survey subjects may give inaccurate or dishonest answers or, in the worst case, not respond at all.
Action for non-response:
Use a good approach to conduct the survey.
Compare the characteristics of respondents and non-respondents, in such matters as age, sex and race, to see if there are obvious differences between the two groups.
Try to contact non-respondents, some of whom may well be prepared to provide answers to a few key questions.

Chapter 4: Graphical Presentation

Understanding of the following diagrams:
Scatter diagram - provides insights into the nature of the relationship between two variables.
Line chart - shows the magnitudes or trends for two quantitative variables, or for one variable over time.
Bar chart - shows the magnitude of data for different qualitative categories.
Grouped bar chart - shows the magnitudes of two or more grouped data items for different qualitative categories or over time.
Multiple bars - a number of single bars superimposed on top of each other.
Component bar chart - different shading is used to distinguish one set of bars from another.
Combination chart - uses both lines and bars to show the magnitudes of two or more data values.
Pie chart - shows the proportions or percentages of a total quantity.
Exploded pie - a pie chart that has one or more segments slightly removed.
Three-dimensional pie - using 3D in an exploded pie makes the picture much more eye-catching.
Comparative pies - compare relative proportions at two different times.

Characteristics of different charts:
1. To show the relative sizes of data: Bar Chart
2. To show the proportional sizes: Pie Chart
3. To show the change in data over time: Line Chart
4. For the casual reader: Pictorial Charts

Chapter 5: Frequency Distributions

Frequency distribution (or frequency table): A tabular summary of a set of data that shows the frequency, or number of data items, that fall in each of several distinct classes.
Cumulative frequency distribution: Enables us to see how many observations lie below or above certain values.
Relative frequency distribution: Expresses the frequency as a fraction or a percentage of the total number of observations. The sum of all the relative frequencies equals 1.00 or 100%.
Cumulative relative frequency distribution: Enables us to see what cumulative fraction or percentage of the observations lies below or above certain values, rather than recording the percentages of items within intervals.
Quantitative class: A class that can be measured on a numerical scale (e.g. height).
Qualitative class: A class that classifies information according to a qualitative characteristic (e.g. feelings).
Open-ended class: Has only one of the upper or the lower limits of a quantitative class.
Close-ended class: Has BOTH the upper and the lower limits of a quantitative class.
Discrete class: Separate entities that progress from one class to the next with a break.
Continuous class: Progresses from one class to the next without a break.

Stem-and-leaf display: Uses "leading digits" (stems) and "trailing digits" (leaves) to separate the data. Both can be single digits or multiple digits. (You MUST be able to draw the stem-and-leaf diagram!) The leaves are NOT sorted in order.
Revised stem-and-leaf display: The leaves are sorted in order.
A stem-and-leaf display shows the following information: the shape of the data distribution, the maximum and minimum values, the central tendency and dispersion, and the actual data values.

Class limits: The lower and upper values of the classes (e.g. 5-10).
Class boundaries: The lower and upper values marking the common points between classes (e.g. 4.5-10.5).
Class width/interval: Upper class boundary - lower class boundary.
Class mid-point: Midway between the upper and lower class boundaries.
Approximate number of classes: 1 + 3.322 log(number of data)
Approximate class width/interval: (Largest value - smallest value) / number of classes

**Please remember that:
1. If raw data are grouped into classes, a certain amount of information is lost, since no distinction is made between observations falling in the same class.
2. The larger the class interval, the greater the amount of information lost.
3. The smaller the class interval, the smaller the amount of information lost.
4. If the class interval is too small, the small irregularities in the histogram merely reflect the accidents of sampling.

Histogram: For unequal class intervals, the area of the bar over a class interval must be proportional to the frequency of the class.
Frequency polygon: Plots the class frequencies against the class mid-points. The polygon should touch the horizontal axis at both ends of the distribution.
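
The class-count and class-width rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the logarithm is taken to base 10, the class count is rounded to the nearest whole number, the width is rounded up to a whole number, the classes start at the smallest value, and the 20 marks are made-up data. None of these choices comes from the lecture notes.

```python
import math


def sturges_classes(n):
    """Approximate number of classes: 1 + 3.322 * log10(n), rounded (assumed rule)."""
    return round(1 + 3.322 * math.log10(n))


def frequency_table(data):
    """Return rows of (lower, upper, frequency, relative freq., cumulative freq.)."""
    n = len(data)
    k = sturges_classes(n)
    lowest, highest = min(data), max(data)
    # Approximate class width, rounded up so that k classes cover the whole range
    width = math.ceil((highest - lowest) / k) or 1
    counts = [0] * k
    for x in data:
        # Each class includes its lower value; the last class also includes its upper value
        i = min(int((x - lowest) // width), k - 1)
        counts[i] += 1
    rows, cumulative = [], 0
    for i, f in enumerate(counts):
        cumulative += f
        lower = lowest + i * width
        rows.append((lower, lower + width, f, f / n, cumulative))
    return rows


# Hypothetical marks of 20 students
marks = [12, 15, 21, 23, 24, 27, 30, 31, 33, 35,
         36, 38, 40, 41, 44, 47, 49, 52, 55, 58]
for lower, upper, f, rel, cum in frequency_table(marks):
    print(f"{lower}-{upper}: f={f}, relative={rel:.2f}, cumulative={cum}")
```

The last column is the "less than" cumulative frequency, so its final entry equals the number of observations, and the relative frequencies add up to 1.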
Ogive: Drawn for a cumulative frequency distribution; it should be of the "Less Than" or "More Than" type.

Chapter 6: Descriptive Statistics

Know how to find the following measures for both grouped and ungrouped data:

Mean / Weighted Mean: Calculated by summing all the observations in a batch of data and then dividing the total by the number of items involved.
  Adv: One number represents a whole data set. Each data set has one and only one mean. Every observation is taken into account. Useful for comparing the means of several data sets.
  Disadv: Affected by extreme values. Takes time to compute. Cannot be computed when there is an open-ended class.

Median: The middle value in an ordered sequence of data.
  Adv: Extreme values do not affect the median. Easy to understand and can be calculated from any kind of data. Can be found even when the data are qualitative descriptions.
  Disadv: More complex and time-consuming for any data set with a large number of elements.

Mode: A measure of central tendency. The value that is repeated most often in the data set. A "bimodal distribution" is a data set containing two modes.
  Adv: Can be used as a central location for qualitative as well as quantitative data. Not affected by extreme values. Can be used even when one or more of the classes are open-ended.
  Disadv: Not used as often. When there is no modal value, or every value is the mode, it is a useless measure. Difficult to interpret and compare. Grouped data cannot reflect the mode.

Range: The difference between the largest and smallest values.
  Adv: Easy to understand and to calculate.
  Disadv: Ignores the nature of the variation among all other observations, and is heavily influenced by extreme values. Open-ended distributions have no range. The range is the least stable of the measures: as the number of observations increases, the range generally tends to become larger.

Midrange: The average of the smallest and largest values.
  Adv: Easy to understand and to calculate.
  Disadv: Heavily influenced by extreme values. Open-ended distributions have no midrange because no "highest" or "lowest" value exists in the open-ended class. The midrange is the least stable of the measures: for example, in repeated samples taken from the same source, the midrange will exhibit more variation from sample to sample than the other measures.

Midhinge: The average of the first and third quartiles.
  Adv: Ignores extreme values by using only the middle half of the data, which is a distinct advantage over the range, which is affected by the extreme values.
  Disadv: Like the midrange, the midhinge is based on only two values from the data set.

Mean Absolute Deviation (MAD):
  Adv: Takes every observation into account. Weights each item equally.
  Disadv: Difficult to use in mathematical operations.

Standard Deviation (SD):
  Adv: Takes into account every observation in the data set.
  Disadv: Not as easy to calculate as the range. Cannot be computed from open-ended distributions. Extreme values in the data set distort the value of the standard deviation, although to a lesser extent than they do the range.

Interquartile Range: Measures approximately how far from the median we must go on either side before we can include one half of the values of the data set.
  Adv: Ignores extreme values by using only the middle half of the data.
  Disadv: More complicated to calculate than the range. Based on only two values from the data set.

Coefficient of Variation: A relative measure of dispersion, expressed as a percentage rather than in terms of the units of the particular data.
  Adv: Useful when comparing the variability of two or more batches of data that are expressed in different units of measurement.

Coefficient of Skewness < 0: negatively or left skewed.
Coefficient of Skewness > 0: positively or right skewed.

Quartiles:
First quartile, Q1: 25% of the observations are smaller and 75% of the observations are larger.
  Q1 = value corresponding to the (N+1)/4 th observation
Second quartile, Q2: 50% of the observations are smaller and 50% of the observations are larger.
  Q2 = median = value corresponding to the (N+1)/2 th observation
Third quartile, Q3: 75% of the observations are smaller and 25% of the observations are larger.
  Q3 = value corresponding to the 3(N+1)/4 th observation

Outlier: Defined as a value that is more than 1.5 times the interquartile range larger than Q3 or smaller than Q1:
  Outlier < (Q1 - 1.5 x interquartile range) or Outlier > (Q3 + 1.5 x interquartile range)

Box-and-Whisker Plot (a display of the five-number summary): Outliers are not put in this plot.

Two important theories about standard deviation:
Chebyshev's Theorem: No matter what the shape of the distribution:
  The interval (mean ± 2 S.D.) will contain at least 75% of the measurements.
  The interval (mean ± 3 S.D.) will contain at least 89% of the measurements.
  The interval (mean ± 4 S.D.) will contain at least 94% of the measurements.
Empirical Rule: Given a symmetrical and bell-shaped distribution:
  The interval (mean ± 1 S.D.) will contain approximately 68% (68.26%) of the measurements.
  The interval (mean ± 2 S.D.) will contain approximately 95% (95.44%) of the measurements.
  The interval (mean ± 3 S.D.) will contain all or almost all (99.73%) of the measurements.

Chapter 7: Basic Probability

Basic concepts: Complement (A'), Union (A ∪ B), Intersection (A ∩ B), Mutually Exclusive, Mutually Exclusive & Collectively Exhaustive.

Three approaches for probability study:
Classical Approach: Probability of an event = Number of outcomes favourable to the occurrence of the event / Total number of possible outcomes (E.g. tossing a coin or a die)
Relative Frequency Approach (Empirical Concept): Probability of an event = The proportion of times the event occurs in the long run under uniform conditions (E.g. statistics of a ball game)
Subjective Approach: Probability of an event = The degree of belief or confidence placed in the occurrence of the event by a particular individual, based on the evidence available. (E.g. a judge deciding whether to allow the construction of a nuclear power plant)

Counting Rules:
If there are k1 mutually exclusive and collectively exhaustive events on the first trial, k2 events on the second trial, ..., and kn events on the nth trial, then the number of possible outcomes is: (k1)(k2) ... (kn)
Factorial: n! = n (n-1) (n-2) ... (2) (1); 0! is defined as 1
Permutations: nPr = n! / (n-r)!
Combinations: nCr = n! / [r! (n-r)!]
Example: Given 4 students, Peter, John, Sue and Mary, three students are randomly selected from them.
  What is the number of arrangements (order matters)? Ans: 4P3 = 24
  What is the number of combinations (order does NOT matter)? Ans: 4C3 = 4

Probability rules:
$0 \le P(A) \le 1$ for any event A
P(S) = 1, where S is the sample space
P(A) + P(A') = 1, or P(A') = 1 - P(A), for any event A
Addition Rule:
  $P(A \text{ or } B) = P(A \cup B) = P(A) + P(B) - P(A \cap B)$ when A and B are not mutually exclusive
  $P(A \text{ or } B) = P(A \cup B) = P(A) + P(B)$ when A and B are mutually exclusive

Conditional Probability:
P(A|B) means the probability that event A will occur, given that event B has occurred, or simply the probability of A given B.
Formulas for conditional probability:
1. $P(A \mid B) = P(A \cap B) / P(B)$
2. $P(A \cap B) = P(A \mid B) \times P(B)$
3. $P(B) = P(A \cap B) / P(A \mid B)$

Independent events: E and F are independent events if P(E|F) = P(E) or P(F|E) = P(F).

Multiplication Rule (by formula 2):
  For dependent events: $P(A \text{ and } B) = P(A \cap B) = P(A \mid B) P(B) = P(B \mid A) P(A)$
  For independent events: $P(A \text{ and } B) = P(A \cap B) = P(A) P(B)$

Law of Total Probability: Suppose that the sample space S consists of n mutually exclusive and collectively exhaustive events B1, B2, ..., Bn. Then the probability of any event A consists of the joint probability of A occurring with B1, plus the joint probability of A occurring with B2, and so on up to the joint probability of A occurring with Bn.
  $P(A) = P(A \cap B_1) + P(A \cap B_2) + \dots + P(A \cap B_n) = P(A \mid B_1) P(B_1) + P(A \mid B_2) P(B_2) + \dots + P(A \mid B_n) P(B_n)$

Bayes' Theorem:
  $P(B_i \mid A) = \dfrac{P(A \mid B_i) P(B_i)}{P(A \mid B_1) P(B_1) + P(A \mid B_2) P(B_2) + \dots + P(A \mid B_n) P(B_n)}$

Chapter 8: Probability Distribution

A probability distribution: A specification (in the form of a graph, a table or a function) of the probability associated with each value of the random variable.
Probability Mass Function (p.m.f.): A probability distribution involving only discrete values of x.
Cumulative Mass Function (c.m.f.): The sum of the values of the probability mass function for all values of the random variable that are less than or equal to x.
Expected value of X: $E[X] = \mu = \sum x \, p(x)$ over all x
Variance of a discrete random variable: $V[X] = \sigma^2 = E[(x - \mu)^2] = \sum (x - \mu)^2 p(x)$, S.D. = $\sqrt{V[X]}$

Binomial Distribution:
Conditions:
Each observation can be classified as one of two mutually exclusive events (i.e. success or failure).
The probability of the two possible outcomes must be constant from observation to observation.
The result of any observation is independent of the result of any other observation.
P(x successes in n trials) = $P(X = x \mid n, p) = {}^{n}C_{x} \, p^{x} q^{n-x}$, where q = 1 - p
Notation: $X \sim B(n, p)$ or b(x; n, p)
Mean = $\mu$ = Expected value = E[X] = np
Variance = V[X] = $\sigma^2 = npq$, S.D. = $\sqrt{npq}$

Poisson Distribution:
Determines the probability of x occurrences per unit time. The only parameter is the mean rate lambda ($\lambda$).
Four basic assumptions:
It is possible to divide the time interval of interest into many sub-intervals.
The probability of an occurrence remains constant throughout the time interval.
The probability of two or more occurrences in a sub-interval is small enough to be ignored.
Occurrences are independent of one another.
Mean = $\mu = \lambda$, Variance = $\sigma^2 = \lambda$, S.D. = $\sqrt{\lambda}$
General formula (for a time interval of length t): Mean = $\mu = \lambda t$, Variance = $\sigma^2 = \lambda t$, S.D. = $\sqrt{\lambda t}$

Poisson Approximation to the Binomial Distribution:
Necessary condition: n is large, normally greater than 100, and p is small, preferably close to zero.
If the condition holds, we can approximate the binomial distribution by the Poisson distribution using $\mu = \lambda = np$.

Normal Distribution:
The curve is completely symmetrical about the mean.
Two parameters describe the Normal Distribution: $\mu$, representing the mean, and $\sigma$, representing the standard deviation.
Notation: $X \sim N(\mu, \sigma^2)$ or $n(x; \mu, \sigma^2)$
Standard Normal Distribution: $\mu = 0$, $\sigma = 1$
We are able to transform all the observations of any normal random variable X into a new set of observations of a normal random variable Z with mean 0 and variance 1.
By the transformation: $Z = (X - \mu) / \sigma$

Normal Approximation to the Binomial Distribution:
Condition: $np \ge 5$ and $nq \ge 5$. Then $\mu = np$, $\sigma^2 = npq$, $\sigma = \sqrt{npq}$.
Correction factor (continuity correction): 0.5 is added to or subtracted from the discrete value before standardising, e.g. $P(X \le x) \approx P\left(Z \le \dfrac{x + 0.5 - \mu}{\sigma}\right)$.
Normal Approximation to the Poisson Distribution:
Condition: $\lambda \ge 5$. Then $\mu = \lambda$, $\sigma = \sqrt{\lambda}$.

Chapter 9: Linear Regression & Correlation Analysis

Linear regression: Concentrates on describing the nature of the relationship between two variables.
Understand the dependent variable (Y) and the independent variable (X).
The regression equation: $\hat{Y} = a + bX$, where $b = \dfrac{n \sum XY - \sum X \sum Y}{n \sum X^2 - (\sum X)^2}$ and $a = \bar{Y} - b \bar{X}$
**Remarks: The estimated regression equation is valid only over the same range as the one from which the sample was initially taken. Even if there is a relationship between X and Y, it does not imply that X causes Y.
The Standard Error of the Estimate (SEE): $\sqrt{\dfrac{\sum (Y - \hat{Y})^2}{n - 2}}$

Linear Correlation: Determines the strength of the linear relationship between the variables.
Correlation Coefficient: $r = \dfrac{n \sum XY - \sum X \sum Y}{\sqrt{[n \sum X^2 - (\sum X)^2][n \sum Y^2 - (\sum Y)^2]}}$
Coefficient of Determination: $r^2$
r must range from -1 to +1.
Negative values correspond to lines with negative slopes.
Positive values correspond to lines with positive slopes.
If $|r| \ge 0.7$, a strong linear relationship can be concluded; otherwise a weak relationship is concluded.
$r^2$ is the percentage of data variation explained by the regression equation. If $r^2 \ge 0.5$, at least 50% of the data variation is explained by the estimated regression line, and the regression equation is concluded to be a good fit for the sample data.

Spearman's Coefficient of Rank Correlation:
The rank correlation coefficient, $r_s$, measures the degree of correlation that exists between two sets of ranks rather than their actual numerical values.
To calculate $r_s$:
1. Rank the X's among themselves, giving rank 1 to the largest (or smallest), rank 2 to the second largest (or smallest), and so on.
2. Then rank the Y's similarly.
3. Find the sum of the squares of the differences, d, between the ranks of the X's and Y's: $\sum d^2 = \sum (x - y)^2$
4. Compute $r_s = 1 - \dfrac{6 \sum d^2}{n(n^2 - 1)}$
$r_s$ ranges from -1 to 1 and is interpreted in the same way as r.

Chapter 10: Index Numbers

An index number measures the change in a time series variable in comparison with a base year.
Price Index: Compares levels of price from one period to another.
Consumer Price Index (CPI): Measures the overall price change of a variety of consumer goods and services, and is used to define the cost of living.
Quantity Index: Measures how much the quantity of a variable changes over time.
Value Index: Measures changes in total monetary worth.
Composite Index: A single index that reflects a composite, or group, of changing variables (e.g. CPI).

Objectives of using index numbers:
1. Show changes in a series of data values over time
2. Compare data values for different periods
3. Compare the growth of manufacturing output

Index: ( (Value of that year) / (Value of base year) ) x 100
Market basket: The set of items (e.g. food) together with the quantities in which they were purchased.

Criteria to determine the base period:
The base period should be fairly recent, since an index number should help people compare present values with past values. If the comparison is to be meaningful, the past (base period) should be recent enough for people to remember its conditions. It is meaningless to say that prices are 200% above what they were in the Middle Ages.
The base period should be a period of normal conditions for the series whose index is sought. If a year of war is chosen as the base year, the consumption pattern may be abnormal in that year.
Select a base period that allows comparability. For comparisons to be valid, the indexes should have the same base period. E.g. Company A says its index of material cost is 105 and Company B says its index is 120; the comparison is meaningless unless the base period is the same.
Select a base period for which data are available. The base period should be a period for which accurate and complete data are available. Sometimes the census year is chosen as the base year.

Comparison of the Laspeyres and Paasche indexes:
(i) The Paasche Index requires the quantities to be measured each year, which can be a costly exercise. The Laspeyres Index only requires them for the base year.
(ii) The denominator $\sum p_0 q_n$ in the Paasche Index changes each year, so we can only compare one year's Paasche Index with the base year. For the Laspeyres Index, the denominator $\sum p_0 q_0$ is fixed, so each year's index can be compared with any other year's index.
(iii) Because of (ii) above, Laspeyres Index numbers for several different years can be directly compared, whereas with the Paasche Index comparisons can only be drawn directly between the current year and the base year.
(iv) The Paasche Index keeps up with current purchasing patterns, as it continually updates the items in the shopping basket. The weights for the Laspeyres Index become out of date.

Limitations of index numbers:
Index numbers are usually only approximations of changes in price or quantity over time, and must be interpreted with care.
Weightings become out of date as time passes. Unless a Paasche Index is used, the weightings will gradually cease to reflect current reality.
New products or items may appear, and old ones cease to be significant. For example, spending has changed in recent years to include new items such as domestic computers and video recorders, whereas demand for black & white televisions has declined. These changes would make the weightings of a retail prices index for consumer goods out of date, and the base of the index would need revision.
Sometimes the data used to calculate index numbers might be incomplete, out of date or inaccurate. For example, the quantity indices of imports and exports are based on records supplied by traders, which may be prone to error or even falsification.
The base year of an index should be a normal year, but there is probably no such thing as a perfectly normal year. Some error in the index will be caused by untypical values in the base period.
The "basket of items" in an index is often selective. For example, the Retail Prices Index (RPI) is constructed from a sample of households and, more importantly, from a basket of only about 600 items.
A national index cannot necessarily be applied to an individual town or region. For example, if the national index of wages rises from 100 to 115, we cannot assume that the wages of people in Glasgow have gone up by 15%.
It does not reflect the quality of products.

Different kinds of index (subscript 0 = base period, n = current period; p = price, q = quantity):
Laspeyres Index: $\dfrac{\sum p_n q_0}{\sum p_0 q_0} \times 100$
Paasche Index: $\dfrac{\sum p_n q_n}{\sum p_0 q_n} \times 100$
Laspeyres Quantity Index: $\dfrac{\sum q_n p_0}{\sum q_0 p_0} \times 100$
Paasche Quantity Index: $\dfrac{\sum q_n p_n}{\sum q_0 p_n} \times 100$
Chain base index: The base year progresses a year at a time, so that each index is measured relative to the previous year. It shows how the rate of change is changing, as well as the extent of the change over the previous period. It is calculated with respect to the immediately preceding time point. This approach must be used when the basic nature of the commodity (or the components of the index) changes over the whole time period.

Date: 16 Dec, 2004    Edited by: Jacky Wong
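
To make the Laspeyres, Paasche and chain base indexes concrete, here is a small Python sketch. It assumes the standard textbook definitions (base-period quantities as weights for Laspeyres, current-period quantities for Paasche, and each value divided by the immediately preceding one for the chain base index). The three-item basket, its prices and quantities, and the yearly values are invented for illustration only.

```python
def laspeyres_price_index(p0, pn, q0):
    """Sum(pn*q0) / Sum(p0*q0) x 100: weights are the base-period quantities."""
    return 100 * sum(p * q for p, q in zip(pn, q0)) / sum(p * q for p, q in zip(p0, q0))


def paasche_price_index(p0, pn, qn):
    """Sum(pn*qn) / Sum(p0*qn) x 100: weights are the current-period quantities."""
    return 100 * sum(p * q for p, q in zip(pn, qn)) / sum(p * q for p, q in zip(p0, qn))


def chain_base_index(values):
    """Index of each period relative to the immediately preceding period."""
    return [100 * cur / prev for prev, cur in zip(values, values[1:])]


# Hypothetical basket of three items
p0 = [10, 4, 20]   # base-year prices
pn = [12, 5, 18]   # current-year prices
q0 = [5, 10, 2]    # base-year quantities
qn = [4, 12, 3]    # current-year quantities

print(round(laspeyres_price_index(p0, pn, q0), 1))  # 112.3
print(round(paasche_price_index(p0, pn, qn), 1))    # 109.5
print(chain_base_index([200, 210, 231]))            # [105.0, 110.0]
```

With these made-up figures the two price indexes differ because the basket has shifted between the two years, which is the point of the Laspeyres and Paasche comparison: only the Paasche weights follow current purchasing patterns.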