Monitoring High-yield processes

MONITORING HIGH-YIELD PROCESSES
Cesar Acosta-Mejia
June 2011

EDUCATION
– B.S. Catholic University of Peru
– M.A. Monterrey Tech, Mexico
– Ph.D. Texas A&M University
RESEARCH
– Quality Engineering: SPC, process monitoring
– Applied Probability and Statistics
– Sequential analysis
– Probability modeling
– Change point detection, process surveillance

MONITORING HIGH-YIELD PROCESSES
MOTIVATION
– High-yield processes
– Monitor the fraction of nonconforming units p
– Very small p (ppm)
– To detect increases or decreases in p
– A very sensitive procedure

MONITORING HIGH-YIELD PROCESSES
ASSUMPTIONS
• Process is observed continuously
• Process can be characterized by Bernoulli trials
• Fraction of nonconforming units p is constant, but may change at an unknown point in time

Hypothesis Testing
For (level α) two-sided tests the rejection region R is made up of two subregions R1 and R2 with limits L and U such that
  P[X ≤ L] = α/2
  P[X ≥ U] = α/2
[Figure: rejection limits L and U]

Hypothesis Testing
Consider testing the proportion p

Hypothesis Testing
The test may be based on different random variables
• Binomial (n, p)
• Geometric (p)
• Negative Binomial (r, p)
• Binomial – order k (n, p)
• Geometric – order k (p)
• Negative Binomial – order k (r, p)

Binomial tests when p is very small

Test 1
• proportion p0 = 0.025
• test H0: p = 0.025 (25000 ppm) against H1: p ≠ 0.025
• X = number of nonconforming units in 500 items
• α = 0.0027

Test 1
Let X ~ Binomial(500, p)
To test the hypothesis
  H0: p = 0.025 against H1: p ≠ 0.025
the rejection region is R = {X ≤ 2} ∪ {X ≥ 25}, since
  P[X ≤ 2] = 0.000300 < 0.00135 = α/2
  P[X ≥ 25] = 0.001018 < 0.00135 = α/2

Test 1
Plot of P[rejecting H0] vs. p
[Figure: P[rejecting H0] (0 to 0.012, level 0.0027 marked) versus p in parts per million (5000 to 45000)]

Hypothesis Testing
Now consider testing p0 = 0.0001 (100 ppm)

Test 1
Let X ~ Binomial(n = 500, p)
To test the hypothesis
  H0: p = 0.0001 against H1: p ≠ 0.0001
the rejection region is R = {X ≥ 2}, since
  P[X ≥ 2] = 0.0012
For n = 500 there is no two-sided test for p = 0.0001: P[X = 0] ≈ 0.951 is far above α/2, so no lower rejection limit exists.

Test 1
[Figure: two panels, Binomial (n = 500, p = 0.025) and Binomial (n = 500, p = 0.0001)]

Test 1
For this test a plot of P[rejecting H0] vs. p is
[Figure: P[rejecting H0] (0 to 0.009, level 0.0027 marked) versus p in parts per million (20 to 260)]

Consider a geometric test for p when p is very small

Test 2
Let X ~ Geo(p)
To test the hypothesis (α = 0.0027)
  H0: p = 0.0001 against H1: p ≠ 0.0001
the rejection region is R = {X ≤ 13} ∪ {X ≥ 66075}, since
  P[X ≤ 13] = 0.0013
  P[X ≥ 66075] = 0.00135
An observation in {X ≤ 13} leads to the conclusion that p > 0.0001

Test 2
For this test a plot of P[rejecting H0] vs. p is
[Figure: P[rejecting H0] (0 to 0.012, level 0.0027 marked) versus p (50 to 300)]

Another performance measure of a sequential testing procedure

Hypothesis Testing
Let X_1, X_2, … ~ Geo(p) iid
Let T = number of observations until H0 is rejected
Consider the random variables, for j = 1, 2, …
  A_j = 1 if X_j ∈ R, with P[A_j = 1] = PR
  A_j = 0 otherwise
then the probability function of T is
  P[T = t] = P[A_1 = 0] P[A_2 = 0] … P[A_{t−1} = 0] P[A_t = 1] = PR [1 − PR]^(t−1)

Hypothesis Testing
therefore T ~ Geo(PR)
Let us consider E[T] = 1/PR as a performance measure; then
  E[T] = 1/PR
  when p = p0, E[T] = 1/α
the mean number of tests until H0 is rejected

Test 2
Let X ~ Geo(p), q = 1 − p
  P[X ≤ x] = 1 − q^x
Let the rejection region be R = {X < L} ∪ {X > U}; then
  PA = P[not rejecting H0] = P[L ≤ X ≤ U]
     = 1 − q^U − (1 − q^(L−1))
     = q^(L−1) − q^U
  PR = 1 − (1 − p)^(L−1) + (1 − p)^U

Test 2
Let X ~ Geo(p)
To test the hypothesis (α = 0.0027)
  H0: p = 0.0001 against H1: p ≠ 0.0001
the rejection region is R = {X < 14} ∪ {X > 66074}; then P[rejecting H0] is
  PR = 1 − (1 − p)^13 + (1 − p)^66074
  E[T] = 1/PR
  when p = p0, E[T] = 1/α = 370.4

Test 2
We want E[T] < 370.4 when p > 0.0001

Test 2
How can we improve upon this test?
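Before answering, the Test 2 quantities above can be checked numerically. The following is a minimal Python sketch, not the author's code: the function names are hypothetical, and the limits are recomputed from p0 and α under the assumption that each tail probability must not exceed α/2. It evaluates PR = 1 − (1 − p)^(L−1) + (1 − p)^U and E[T] = 1/PR for several values of p.

```python
import math


def geometric_test_limits(p0, alpha):
    """Return (lower, upper) so that R = {X <= lower} U {X >= upper}.

    lower is the largest x with P[X <= x] = 1 - q^x <= alpha/2, and
    upper is the smallest u with P[X >= u] = q^(u-1) <= alpha/2, where q = 1 - p0.
    (Assumed construction; the slides only state the resulting limits.)
    """
    log_q = math.log1p(-p0)
    lower = math.floor(math.log1p(-alpha / 2) / log_q)
    upper = math.ceil(math.log(alpha / 2) / log_q) + 1
    return lower, upper


def rejection_probability(p, lower, upper):
    """PR = P[X <= lower] + P[X >= upper] for X ~ Geo(p)."""
    q = 1.0 - p
    return 1.0 - q ** lower + q ** (upper - 1)


def mean_tests_until_rejection(p, lower, upper):
    """E[T] = 1 / PR, because T ~ Geo(PR)."""
    return 1.0 / rejection_probability(p, lower, upper)


if __name__ == "__main__":
    lower, upper = geometric_test_limits(p0=0.0001, alpha=0.0027)
    print("rejection region: X <=", lower, "or X >=", upper)
    for ppm in (60, 80, 100, 120, 140, 200):
        p = ppm * 1e-6
        arl = mean_tests_until_rejection(p, lower, upper)
        print(ppm, "ppm: E[T] =", round(arl, 1))
```

Because X is discrete, the attained size of the test is slightly below α, so the computed in-control E[T] comes out somewhat above the nominal 1/α. The sketch also suggests that E[T] increases when p moves a little above p0 (for example at 120 ppm), which is the weakness the question on the slide points to.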
We want E[T] < 370.4 when p > 0.0001

Run sum procedure

Geometric chart
A sequence of tests of hypotheses

THE RUN SUM – for the mean

THE GEOMETRIC RUN SUM

THE GEOMETRIC RUN SUM - DEFINITION
• Let us denote the following cumulative sums
  SU_t = SU_{t−1} + q_t   if X_t falls above the center line
       = 0                otherwise
  SL_t = SL_{t−1} − q_t   if X_t falls below the center line
       = 0                otherwise
  where q_t is the score assigned to the region in which X_t falls

THE GEOMETRIC RUN SUM - DEFINITION
• The run sum statistic is defined, for t = 1, 2, …, by
  S_t = max {SU_t, −SL_t}
  with SU_0 = 0, SL_0 = 0 and limit sum L

THE GEOMETRIC RUN SUM - DESIGN
• Need to define
  – region limits (l1, l2, l3 and l5, l6, l7)
  – region scores (q1, q2, q3 and q4)
  – limit sum L

THE GEOMETRIC RUN SUM - DESIGN
• Region limits above and below the center line are not symmetric around the center line.
• To define the region limits we use the cumulative probabilities of the distribution of X ~ Geo(p0)
• Such probabilities were chosen to be the same as those of a run sum for the mean with the same scores

THE GEOMETRIC RUN SUM - EXAMPLE
• If X ~ Geo(p0 = 0.0001) the region limits are given by
  0.00123 = P[X ≤ l1]
  0.02175 = P[X ≤ l2]
  0.15638 = P[X ≤ l3]
  0.50000 = P[X ≤ l4]
  0.84362 = P[X ≤ l5]
  0.97825 = P[X ≤ l6]
  0.99877 = P[X ≤ l7]

THE GEOMETRIC RUN SUM - EXAMPLE
• If X ~ Geo(p0 = 0.0001) the region limits are given by
  0.00123 = P[X ≤ 13]
  0.02175 = P[X ≤ 220]
  0.15638 = P[X ≤ 1701]
  0.50000 = P[X ≤ 6932]
  0.84362 = P[X ≤ 18554]
  0.97825 = P[X ≤ 38280]
  0.99877 = P[X ≤ 67007]

THE GEOMETRIC RUN SUM - EXAMPLE
• Conclude H1: p ≠ p0 when S_t ≥ L
• Let T = number of samples until H0 is rejected
• What is the distribution of T?
• What are its mean and standard deviation?
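The region limits of the example and the run sum update can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's implementation: each limit is taken as the smallest x whose cumulative probability reaches the listed value, the scores (0, 1, 2, 3) are assigned to regions moving outward from the center line, an observation equal to l4 counts as below the center line, and all function names are hypothetical.

```python
import math
from bisect import bisect_left

# Cumulative probabilities listed in the example for l1..l7.
CUM_PROBS = [0.00123, 0.02175, 0.15638, 0.50000, 0.84362, 0.97825, 0.99877]
# Assumed score of each of the eight regions, from X <= l1 up to X > l7.
REGION_SCORES = [3, 2, 1, 0, 0, 1, 2, 3]


def geo_quantile(prob, p):
    """Smallest integer x with P[X <= x] = 1 - (1 - p)^x >= prob."""
    return math.ceil(math.log1p(-prob) / math.log1p(-p))


def region_limits(p0):
    """Limits l1..l7 for X ~ Geo(p0), recomputed from CUM_PROBS."""
    return [geo_quantile(c, p0) for c in CUM_PROBS]


def run_sum_path(observations, limits, limit_sum=5):
    """Return S_1, S_2, ... up to and including the first S_t >= limit_sum."""
    center = limits[3]
    su = sl = 0
    path = []
    for x in observations:
        # Region index 0..7 = number of limits strictly below x.
        score = REGION_SCORES[bisect_left(limits, x)]
        if x > center:          # above the center line: SL resets
            su, sl = su + score, 0
        else:                   # below (or on) the center line: SU resets
            su, sl = 0, sl - score
        path.append(max(su, -sl))
        if path[-1] >= limit_sum:
            break               # conclude H1
    return path


if __name__ == "__main__":
    limits = region_limits(0.0001)
    print("limits:", limits)
    print("run sum path:", run_sum_path([5, 7000, 100, 9, 12], limits))
```

Since the listed probabilities are rounded to five decimals, limits recomputed this way may differ by a few units from the values on the slide, most visibly in the far upper tail.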
RUN SUM (0,1,2,3) L = 5 - MODELING
• Markov chain
• States defined by the values that S_t can assume
• State space = {−4, −3, −2, −1, 0, 1, 2, 3, 4, C}
  where C = {n ∈ Z | n = …, −6, −5, 5, 6, …} is an absorbing state
• Transition probabilities

RUN SUM (0,1,2,3) L = 5 - MODELING
• Let
  p1 = P[X ≤ l1]
  p2 = P[l1 < X ≤ l2]
  p3 = P[l2 < X ≤ l3]
  p4 = P[l3 < X ≤ l4]
  p5 = P[l4 < X ≤ l5]
  p6 = P[l5 < X ≤ l6]
  p7 = P[l6 < X ≤ l7]
  p8 = P[X > l7]
  where X ~ Geo(p0)

RUN SUM (0,1,2,3) L = 5 - MODELING
[Figure: transitions from S_t = 0]

RUN SUM (0,1,2,3) L = 5 - MODELING
[Figure: transitions from S_t = 1]

RUN SUM (0,1,2,3) L = 5 - MODELING
[Figure: transitions from S_t = 2]

RUN SUM (0,1,2,3) L = 5 - MODELING
• Let T be the first passage time to state C, that is, the number of observations until the run sum rejects H0
• Let Q be the submatrix of transition probabilities among the transient states; then
  P[T ≤ t] = e (I − Q^t) J
  G(s) = s e (I − sQ)^(−1) (I − Q) J
  E[T] = e (I − Q)^(−1) J
  where e is a row vector defining the initial state S_0 and J is a column vector of ones

Geometric Run sum
For this chart a plot of E[T] vs. p is
[Figure: average run length (0 to 600) versus p in ppm (20 to 180)]

Geometric Run sum
A comparison with Test 2
[Figure: average run length (0 to 600, level 370.47 marked) versus p in ppm (20 to 180), geometric run sum compared with Test 2]

RUN SUM – FURTHER IMPROVEMENT
• Consider a geometric run sum
  – No regions
  – Center line equal to l4
  – Scores are equal to X
  – Design: limit sum L

NEW GEOMETRIC RUN SUM - DEFINITION
• Let us denote the following cumulative sums
  SU_t = SU_{t−1} + X_t   if X_t falls above the center line
       = 0                otherwise
  SL_t = SL_{t−1} − X_t   if X_t falls below the center line
       = 0                otherwise

NEW GEOMETRIC RUN SUM - DEFINITION
• The run sum statistic is defined, for t = 1, 2, …, by
  S_t = max {SU_t, −SL_t}
  with SU_0 = 0, SL_0 = 0 and limit sum L

NEW GEOMETRIC RUN SUM - MODELING
• Markov chain: not possible, huge number of states
• Need to derive the distribution of T
• Can show that
[Equation slide: content not recovered]

CONCLUSIONS
• The run sum is an effective procedure for two-sided monitoring
• For monitoring very small p, it is more effective than a sequence of geometric tests
• With a limited number of regions, it can be modeled by a Markov chain

TOPICS OF INTEREST
• Estimate the time p changes (the change point)
• Bayesian tests
• Lack of independence (chain-dependent Bernoulli trials)
• Run sum can be applied to other instances
  – monitoring
  – arrival process
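As an illustration of the Markov chain model for the run sum with scores (0,1,2,3) and L = 5, the sketch below builds the transient matrix Q and evaluates E[T] = e (I − Q)^(−1) J and P[T ≤ t] = e (I − Q^t) J in plain Python. Several things here are assumptions rather than the author's design: the state is the signed run sum (negative below the center line), the sum on one side resets to zero whenever an observation falls on the other side, the scores are assigned to regions moving outward from the center line, the limits are recomputed from the cumulative probabilities of the example, and all names are hypothetical.

```python
import math

CUM_PROBS = [0.00123, 0.02175, 0.15638, 0.50000, 0.84362, 0.97825, 0.99877]
SIDES = [-1, -1, -1, -1, 1, 1, 1, 1]    # regions 1-4 below, 5-8 above the center line
SCORES = [3, 2, 1, 0, 0, 1, 2, 3]       # assumed score of each region


def geo_quantile(prob, p):
    """Smallest integer x with P[X <= x] >= prob for X ~ Geo(p)."""
    return math.ceil(math.log1p(-prob) / math.log1p(-p))


def region_probs(p, limits):
    """p1..p8 for X ~ Geo(p) and limits l1 < ... < l7."""
    cdf = [0.0] + [1.0 - (1.0 - p) ** l for l in limits] + [1.0]
    return [cdf[i + 1] - cdf[i] for i in range(8)]


def transient_matrix(p, limits, limit_sum=5):
    """Q over the states -(L-1)..(L-1); returns (Q, index of state 0)."""
    states = list(range(-(limit_sum - 1), limit_sum))
    index = {s: i for i, s in enumerate(states)}
    Q = [[0.0] * len(states) for _ in states]
    for s in states:
        for prob, side, score in zip(region_probs(p, limits), SIDES, SCORES):
            # Same side: accumulate; other side (or state 0): restart from the score.
            new = s + side * score if s * side > 0 else side * score
            if abs(new) < limit_sum:    # otherwise absorbed in C
                Q[index[s]][index[new]] += prob
    return Q, index[0]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def mean_time_to_signal(p, limits, limit_sum=5):
    """E[T] = e (I - Q)^(-1) J with e selecting the initial state S_0 = 0."""
    Q, start = transient_matrix(p, limits, limit_sum)
    n = len(Q)
    i_minus_q = [[(i == j) - Q[i][j] for j in range(n)] for i in range(n)]
    return solve(i_minus_q, [1.0] * n)[start]


def cdf_time_to_signal(p, limits, t, limit_sum=5):
    """P[T <= t] = e (I - Q^t) J = 1 - e Q^t J."""
    Q, start = transient_matrix(p, limits, limit_sum)
    n = len(Q)
    row = [float(i == start) for i in range(n)]
    for _ in range(t):
        row = [sum(row[i] * Q[i][j] for i in range(n)) for j in range(n)]
    return 1.0 - sum(row)


if __name__ == "__main__":
    limits = [geo_quantile(c, 0.0001) for c in CUM_PROBS]
    for ppm in (50, 100, 150):
        print(ppm, "ppm: E[T] =", round(mean_time_to_signal(ppm * 1e-6, limits), 1))
```

With nine transient states the linear system is tiny, so a hand-written elimination keeps the sketch free of external dependencies; a numerical library would serve equally well.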