ENGINEERING STATISTICS 2009

Engineering System Analysis
• Engineering systems analysis is the process of using observations to qualitatively and quantitatively understand a system.
• The use of mathematics to determine how a set of interconnected components whose individual characteristics are known will behave in response to a given input or set of inputs.

1. What is meant by “understanding a system”?
• The ability to predict future outcomes from the system based on hypothetical inputs.
2. How do we go about formalizing an understanding of a system?
• Our understanding of a system is formalized by a model that maps input signals to output signals.
3. Why is this important?
• A system model is a key component in the systems engineering design cycle.

Systems Engineering Cycle
[Diagram: a Problem feeds two cycles that share a central MODEL. System Analysis Cycle boxes: Identify model factors, Acquire data, Estimate model parameters, Simulate model response. System Design Cycle boxes: Conceptual design, Optimize design parameters, Build prototype, Evaluate prototype, Final design.]

• A technician is involved in the implementation of engineering designs. An experienced technician can extrapolate from previous designs to obtain effective solutions to similar problems.
• An engineer uses the tools of modeling and optimization to generate system designs. An experienced engineer should be able to tackle problems that are completely novel and should provide solutions that are optimal.

How not to solve a design problem
[Diagram: the Systems Engineering Cycle diagram again, with the same boxes: Problem, System Analysis Cycle, Identify model factors, Acquire data, Estimate model parameters, Simulate model response, MODEL, System Design Cycle, Conceptual design, Optimize design parameters, Build prototype, Evaluate prototype, Final design.]

The Importance of models
Dictionary definition of “model”
• A representation of something (usually on a smaller scale)
• A simplified description of a complex entity or process.
• A representative form or pattern.
A comprehensible simplified description of a real world system that captures its most significant patterns or form. The key property of a model for systems engineering design is the ability to predict outcomes from the system.

Models
• Engineering systems analysis can be thought of as the process of using observations to identify a model of a system.
• The process of modeling a system is one of finding correlations or patterns in the observed signals.

Statistical framework
• Measuring real signals is a statistical process.
  – Observed signals will be noisy and this noise must be included in the modeling process. Thus, all modeling is inherently a statistical process.
  – Identified models of systems are uncertain approximations of the real world. The modeling error itself is interpreted as a statistical process.
A systems engineer should have a good understanding of statistical modeling and statistical decision methodology.

What is statistics?
• Statistics is the scientific application of mathematical principles to the collection, analysis, and presentation of data
  – at the foundation of all of statistics is data.
• Statistics deals with:
  – Collection
  – Presentation
  – Analysis
• Use data to make decisions and solve problems.

Engineering statistics
• Engineering statistics is the study of how best to…
  – Collect engineering data
  – Summarize or describe engineering data
  – Draw formal inferences and practical conclusions on the basis of engineering data
  all the while recognizing the reality of variation.

Engineering Statistics is the branch of statistics that has three subtopics which are particular to engineering.
1. Design of experiments (DOE) – use statistical techniques to test and construct models of engineering components and systems.
2. Quality control and process control – use statistics as a tool to manage conformance to specifications of manufacturing processes and their products.
3. Time and method engineering – use statistics to study repetitive operations in manufacturing in order to set standards and find optimum (in some sense) manufacturing procedures.

Data collection methods
• Observational Study
• Experimental Study
• Opposite ends of a continuum where the “scale” is in terms of the degree to which an investigator manages important variables in the study

Types of data
• Qualitative Data (categorical)
  – Non-numerical characteristics associated with items in a sample
  – Examples:
    • Eye color (blue, brown, green, etc.)
    • Engine status (working, not working & fixable, not working & not fixable)
• Quantitative Data (numerical)
  – Numerical characteristics associated with items in a sample
  – Typically counts of occurrences of a phenomenon of interest or measurements of some physical property
  – Can be further broken down into discrete (countable) and continuous (uncountable)

Collection of quantitative data (Measurement)
• If you can’t measure, you can’t do statistics… or engineering for that matter!
• Issues:
  – Validity
  – Precision
  – Accuracy (unbiasedness)

Measurement issues
• Validity – faithfully representing the aspect of interest; i.e., the measurement usefully or appropriately represents the feature of an object or system
• Precision – small variation in repeated measurements
• Accuracy (unbiasedness) – producing the “true value” “on average”

Precision and accuracy
[Figure: four panels labeled “Not Accurate, Not Precise”, “Precise, Not Accurate”, “Accurate, Not Precise”, and “Accurate and Precise”.]

Statistical thinking
• Statistical methods are used to help us describe and understand variability.
• By variability, we mean that successive observations of a system or phenomenon do not produce exactly the same result.
Are these gears produced exactly the same size? NO!

Sources of variability
• Method
• Environment
• Material
• Man
• Machine

Example
• An engineer is developing a rubber compound for use in O-rings.
• The engineer uses the standard rubber compound to produce eight O-rings in a development laboratory and measures the tensile strength of each specimen.
• The tensile strengths (in psi) of the eight O-rings are 1030, 1035, 1020, 1049, 1028, 1026, 1019, and 1010.

Variability
• There is variability in the tensile strength measurements.
  – The variability may even arise from measurement errors.
• Tensile strength can be modeled as a random variable.
• Tests on the initial specimens show that the average tensile strength is 1027.1 psi.
• The engineer thinks that this may be too low for the intended applications.
• He decides to consider a modified formulation of rubber in which a Teflon additive is included.

Random sampling
• Assume that X is a measurable quantity related to a product (e.g., tensile strength of rubber). We model X as a random variable.
  – Random variables occur frequently in engineering applications.
• Random sampling
  – Obtain samples from a population
  – All outcomes must be equally likely to be sampled
  – Replacement necessary for small populations
  – Meaningful statistics can be obtained from samples

$R: x_1, x_2, x_3, \ldots, x_i, \ldots, x_N$

Point Estimation
• The probability density function f(x) of the random variable X is assumed to be known.
  – Generally it is taken to be a Gaussian distribution, based on the central limit theorem:

$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$

• Our purpose is to estimate certain parameters of f(x) (mean, variance) from observation of the samples.

Sample Mean & Variance
Sample mean: $M = \frac{1}{N} \sum_{i=1}^{N} x_i$
Sample variance: $S^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - M)^2$
M is a point estimator of $\mu$.
S is a point estimator of $\sigma$.

Point estimates as random variables
• Since the sample mean and variance depend on the random sample chosen, the values of M and S both depend on the sample set.
• As such, they can also be considered as random variables.
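As a rough illustration of the sample mean and sample variance formulas above, the following Python sketch computes M and S for the eight O-ring tensile strengths from the example. The function and variable names are illustrative assumptions, not part of the original slides, and only the standard library is used.

```python
import math


def sample_mean(xs):
    """Sample mean: M = (1/N) * sum of x_i."""
    return sum(xs) / len(xs)


def sample_std(xs):
    """Sample standard deviation: S = sqrt((1/(N-1)) * sum of (x_i - M)^2)."""
    m = sample_mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


# Tensile strengths (psi) of the eight O-rings from the example.
tensile_psi = [1030, 1035, 1020, 1049, 1028, 1026, 1019, 1010]

M = sample_mean(tensile_psi)
S = sample_std(tensile_psi)
print(f"M = {M:.1f} psi, S = {S:.2f} psi")
```

The computed mean reproduces the 1027.1 psi average quoted in the example. Repeating the calculation on a different set of eight specimens would give different values of M and S, which is why both are treated as random variables.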
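Examples

Sample (N = 10)                     M      S
{55,41,50,44,55,56,48,29,51,66}    49.5   10.01
{60,34,49,43,40,38,53,46,51,46}    46      7.69
{45,54,57,71,36,40,60,46,36,53}    49.8   11.29
{66,57,70,55,69,47,39,48,62,39}    55.2   11.64
{56,44,56,39,51,30,45,55,47,62}    48.5    9.49
{44,27,38,61,49,54,59,29,44,43}    44.8   11.47

Quality of Estimators
• If $y = u(x_1, x_2, \ldots, x_N)$ is a point estimator of a parameter $\theta$ of the population, we want
  – $E\{y\} = \theta$ (unbiased)
  – $V\{y\}$ should be as small as possible (minimum variance)
• Such an estimator is called an unbiased minimum variance estimator.

PDF of sample mean
Sample mean: $M = \frac{1}{N} \sum_{i=1}^{N} x_i$, with $E\{M\} = \mu$

$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$

$f_M(m) = \frac{1}{\sqrt{2\pi}\,\sigma_M} \exp\left(-\frac{(m-\mu)^2}{2\sigma_M^2}\right); \quad \sigma_M^2 = \sigma^2 / N$

For larger sample sizes (N), the probability that the mean estimate is close to the true mean is higher.
[Figure: pdf of the sample mean for N = 1, N = 5, and N = 100; the curves narrow as N grows because $\sigma_M^2 = \sigma^2 / N$.]

Confidence interval
We want to determine an interval I for the actual mean $\mu$ so that

$P\{\mu \in I\} = \gamma$

$P\{\mu - a < M < \mu + a\} = \int_{\mu-a}^{\mu+a} f_M(m)\,dm = \gamma$

$P\{M - a < \mu < M + a\} = \gamma$

• Given that X is a Gaussian random variable with mean $\mu$ and variance $\sigma^2$:

$R: x_1, x_2, x_3, \ldots, x_i, \ldots, x_N$

$M = \frac{1}{N} \sum_{i=1}^{N} x_i; \quad S^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - M)^2$

$Z = \frac{M - \mu}{\sigma / \sqrt{N}}$ has distribution N(0,1).

$V = \frac{1}{\sigma^2} \sum_{i=1}^{N} (x_i - M)^2$ has a chi-square distribution with N−1 degrees of freedom.

Define

$t = \frac{Z}{\sqrt{V / (N-1)}} = \frac{M - \mu}{S / \sqrt{N}}$

Then the pdf of t is given by

$h_k(t) = \frac{\Gamma\left((k+1)/2\right)}{\sqrt{\pi k}\,\Gamma(k/2)} \left(1 + \frac{t^2}{k}\right)^{-(k+1)/2}, \quad k = N - 1$

[Figure: plot of $h_k(t)$ for t from −4 to 4.]

This distribution is known as Student’s t-distribution with k degrees of freedom. The distribution is named after the English statistician W.S. Gosset, who published his research under the pseudonym “Student.”

[Figure: plot of $h_k(t)$ with the central area $\gamma$ between $-t_{k,\gamma}$ and $t_{k,\gamma}$ and an area of $(1-\gamma)/2$ in each tail.]

$P\{-t_{k,\gamma} < t < t_{k,\gamma}\} = \gamma$

$P\left\{M - t_{k,\gamma} \frac{S}{\sqrt{N}} < \mu < M + t_{k,\gamma} \frac{S}{\sqrt{N}}\right\} = \gamma$

• Thus if we obtain the estimates M and S from the sample set, the actual value of the population mean will lie in the interval

$\left(M - t_{N,\gamma} \frac{S}{\sqrt{N}},\; M + t_{N,\gamma} \frac{S}{\sqrt{N}}\right)$

with probability $\gamma$. This is called a $\gamma \times 100$ percent confidence interval. Here $t_{N,\gamma}$ denotes the tabulated value for sample size N, that is, for k = N − 1 degrees of freedom.
• The values for Student’s t-distribution are tabulated.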
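The interval construction above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the slides' own procedure: the function name `t_confidence_interval` is hypothetical, the data are the first sample from the Examples table, and the t value 1.8331 is taken from a standard Student's t table for N = 10 samples (k = 9 degrees of freedom) at confidence coefficient 0.90. The standard library has no t-distribution, so the caller supplies the tabulated value.

```python
import math
import statistics


def t_confidence_interval(xs, t_value):
    """Return (lower, upper) = M -/+ t_value * S / sqrt(N).

    t_value is the tabulated Student's t value for k = N - 1 degrees of
    freedom at the chosen confidence coefficient (supplied by the caller).
    """
    n = len(xs)
    m = statistics.mean(xs)    # sample mean M
    s = statistics.stdev(xs)   # sample standard deviation S (N - 1 divisor)
    half_width = t_value * s / math.sqrt(n)
    return m - half_width, m + half_width


# First sample from the Examples table (N = 10).
sample = [55, 41, 50, 44, 55, 56, 48, 29, 51, 66]

# Assumed table value: N = 10 (k = 9), confidence coefficient 0.90.
T_VALUE_N10_090 = 1.8331

low, high = t_confidence_interval(sample, T_VALUE_N10_090)
print(f"90% confidence interval for the mean: ({low:.2f}, {high:.2f})")
```

Passing the t value as a parameter keeps the sketch independent of any particular statistics package. A larger confidence coefficient means a larger tabulated value and therefore a wider interval.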
Tabulated values $t_{N,\gamma}$ for sample size N (k = N − 1 degrees of freedom)

         Confidence coefficient
N        0.90      0.99      0.995
10       1.8331    3.2498    3.6897
50       1.6766    2.6800    2.9397
100      1.6604    2.6264    2.8713
500      1.6479    2.5857    2.8196

Example
• Ten measurements were made on the resistance of a certain type of wire. Suppose that M = 10.48 Ω and S = 1.36 Ω. We want to obtain a confidence interval for $\mu$ with confidence coefficient 0.90.
From the table, $t_{10,0.9} = 1.83$, so the interval is

$\left(10.48 - 1.83 \cdot \frac{1.36}{\sqrt{10}},\; 10.48 + 1.83 \cdot \frac{1.36}{\sqrt{10}}\right) = (9.69,\ 11.27)$

Example
The voltage measured at the output of a system
[Figure: measured voltage V (Volt) versus time t (msec), for t from 0 to 10 msec.]

Statistics with 500 measurements
[Figure: mean and 99.9% confidence interval of V (Volt) versus t (msec), shown with the curve $V = \exp(-t/\tau)$, $\tau = 3$ msec.]

Statistical Hypothesis Testing (experiment design)
Manufacturing process at T = 200 °C: $\mu$ = 1 kΩ, $\sigma$ = 50 Ω
Manufacturing process at T = 350 °C: $\mu$ = 1 kΩ? $\sigma$ = 50 Ω?
H0: Mean and variance are not changed (null hypothesis)
H1: Mean and variance are changed (alternative hypothesis)

Statistical Hypothesis Testing (process optimization)
Old manufacturing process (tested in time): MTBF = 3 months
New manufacturing process (more costly): MTBF = 6 months?
H0: MTBF ≤ 3 months (null hypothesis)
H1: MTBF > 3 months (alternative hypothesis)

• To guess is cheap. To guess wrongly is expensive - Chinese Proverb
• There are three kinds of lies: lies, damned lies, and statistics - Benjamin Disraeli (?), British PM
• First get your facts, then you can distort them at your leisure - Mark Twain
• Statistical Thinking will one day be as necessary for efficient citizenship as the ability to read and write - H. G. Wells