A Journey of Learning from Statistics to Manufacturing, Logistics, Engineering Design and to Information Technology

Professor J.-C. Lu
Industrial and Systems Engineering
Georgia Institute of Technology

Contents
1 Introduction
2 Statistics in Reliability
3 Quality Improvement in Manufacturing
4 Data Mining in Manufacturing
5 Product Design, Manufacturing and Service Chain Management System
6 Information Technology in Education

1. Introduction

• Traditional Research Approach:
[Diagram labels: Thesis Background; New Methods; “Modifications”; “Extensions”; New “Areas”; Application #1, Application #2, ..., Application #k]

• Non-Traditional Research Methods:
[Diagram: Non-Traditional Research Approach. Labels: Real-life Problems; Business; Team-work; Application-oriented; Literature; Best Practice; Practical Problem Solving; Cross-disciplines; Academic Problem Formulation; Literature Review; Academia; Discipline-focused; New Methods or New Areas in Research; Impact Analysis; Time]

2. Statistics in Reliability

Traditional Research Approach:

Lu, J. C. (1989), “Weibull Extensions of the Freund and Marshall-Olkin Bivariate Exponential,” IEEE Transactions on Reliability, 38(5), 615-619.
Lu, J. C. and Bhattacharyya, G. K. (1990), “Some New Constructions of Bivariate Weibull Models,” Annals of the Institute of Statistical Mathematics, 42(3), 543-559.
Lu, J. C. (1990), “Least Squares Estimation for the Multivariate Weibull Model of Hougaard Based on Accelerated Life Test of System and Component,” Communications in Statistics, 19(10), 3725-3739.
Lu, J. C. and Bhattacharyya, G. K. (1991), “Inference Procedures for a Bivariate Exponential Model of Gumbel Based on Life Test of System and Components,” Journal of Statistical Planning and Inference, 27, 383-396.
Lu, J. C. and Bhattacharyya, G. K. (1991), “Inference Procedures for a Bivariate Exponential Model of Gumbel,” Statistics and Probability Letters, 12, 37-50.
Lu, J. C. (1997), “A New Plan for Life-Testing Two-Component Parallel Systems,” Statistics and Probability Letters, 34(1), 19-32.

$x_{(1)}, x_{(2)}, \ldots, x_{(r)}, x^{*}_{(r+1)}, \ldots, x^{*}_{(n)}$
$y_{[1]}, y_{[2]}, \ldots, y_{[r]}, y^{*}_{[r+1]}, \ldots, y^{*}_{[n]}$

The life-testing experiment was terminated at $x_{(r)}$, and data with superscript “*” are censored at $x_{(r)}$. Here $x_{(1)} < x_{(2)} < \cdots < x_{(r)}$ are order statistics, and $y_{[1]}, y_{[2]}, \ldots, y_{[r]}$ are concomitant order statistics.

Sample Publications from the Traditional Research Approach:

Chen, D., and Lu, J. C. (1998), “The Asymptotics of Maximum Likelihood Estimates of Parameters Based on a Data Type Where Failure and Censoring Times are Dependent,” Statistics and Probability Letters, 36, 379-391.
Chen, D., Li, C. S., Lu, J. C., and Park, J. (2000), “Simple Parameter Estimation for Bivariate Shock Models with Singular Distribution for Censored Data with Concomitant Order Statistics,” Australian and New Zealand Journal of Statistics, 42(3), 323-336.

Non-traditional Research Approaches:

A. Started working with Nortel in the printed circuit board (PCB) manufacturing area in 1989. Received the 1st Nortel grant in 1990. Published the 1st paper (in JASA – case study) in 1994.
B. Started working with NCSU’s Semiconductor Center in 1990. Early publications appeared in 1991 (Proceedings), 1993 (engineering journal) and 1997 (statistics journal).

Reliability Degradation Studies (First example of the Non-traditional Research Approach):

Lu, J. C., Park, J. and Yang, Q. (1997), “Statistical Inference of a Time-to-Failure Distribution from Linear Degradation Data,” Technometrics, 39(4), 391-400.
Su, C., Lu, J. C., Chen, D., and Hughes-Oliver, J. M. (1999), “A Linear Random Coefficient Degradation Model with Random Sample Size,” Lifetime Data Analysis, 5, 173-183.
Chen, D., Lu, J. C., Huo, X., and Ming, Y. (2001), “Optimum Percentile Estimating Equations for Nonlinear Random Coefficient Models,” Journal of Statistical Planning and Inference, 275-292.
NSF DMII-ORPS Program, “Modeling Accelerated Degradation Data for Product Reliability Improvement and Warranty Analysis,” 2001-2003 (with Paul Kvam).
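
The degradation studies listed above define failure as the first time a degradation path reaches a critical level. As a rough numerical illustration of that idea, the Python sketch below takes a linear path with a bivariate normal random intercept and slope, evaluates the normal approximation Pr(T ≤ t) ≈ Φ(A/B) with A = μ0 + t μ1 − y_f and B² = σ0² + σ1² t² + 2 ρ t σ0 σ1, and checks it against a small Monte Carlo simulation. The function names, the parameter values and the simulation check are assumptions made for illustration and are not taken from the cited papers; t is measured on whatever time scale enters the regression (log time in the semiconductor model).

```python
import math
import random


def failure_cdf(t, mu0, mu1, sigma0, sigma1, rho, y_f):
    """Normal approximation Pr(T <= t) ~ Phi(A / B) for a linear random-coefficient path."""
    a = mu0 + t * mu1 - y_f
    c = sigma0 ** 2 + (sigma1 * t) ** 2 + 2.0 * rho * t * sigma0 * sigma1
    b = math.sqrt(c)
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(a / (b * math.sqrt(2.0))))


def simulate_failure_cdf(t, mu0, mu1, sigma0, sigma1, rho, y_f, n=50_000, seed=0):
    """Monte Carlo estimate of Pr(T <= t) with T = (y_f - beta0) / beta1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        z0 = rng.gauss(0.0, 1.0)
        z1 = rng.gauss(0.0, 1.0)
        # Correlated normal intercept and slope built from two independent draws.
        beta0 = mu0 + sigma0 * z0
        beta1 = mu1 + sigma1 * (rho * z0 + math.sqrt(1.0 - rho ** 2) * z1)
        if beta1 == 0.0:
            continue  # a flat path never reaches the level; probability-zero event
        if (y_f - beta0) / beta1 <= t:
            hits += 1
    return hits / n


if __name__ == "__main__":
    # Hypothetical parameter values; the slope is positive with near certainty.
    params = dict(mu0=0.0, mu1=1.0, sigma0=0.2, sigma1=0.1, rho=0.3, y_f=5.0)
    for t in (4.0, 5.0, 6.0):
        approx = failure_cdf(t, **params)
        mc = simulate_failure_cdf(t, **params)
        print(f"t={t}: approximation={approx:.4f}, simulation={mc:.4f}")
```

The approximation is exact for the event that the path lies above y_f at time t; it matches Pr(T ≤ t) closely only when a negative slope is very unlikely, which is why the hypothetical slope mean is set to ten standard deviations above zero.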

Linear Degradation Model (semiconductor manufacturing):

$y_{ij} = \beta_{0i} + \beta_{1i} \log(t_{ij}) + \epsilon_{ij}$, i = 1, 2, …, k (#replicates), j = 1, 2, …, $n_i$ (#successive repeated measurements),

where $y_{ij}$ = current, threshold voltage shift or transconductance degradation, and $t_{ij}$ = time.

Linear Random Coefficient Model:

Assume $\beta_0$ and $\beta_1$ have a bivariate normal distribution with mean $(\mu_0, \mu_1)$, variance $(\sigma_0^2, \sigma_1^2)$ and correlation $\rho$. Define the failure time T as the time that the degradation reaches a specified level $y_f$, and set $y_f = \beta_0 + \beta_1 T$. The distribution of the failure time $T = (y_f - \beta_0)/\beta_1$ is

$\Pr(T \le t) = \Pr\big((y_f - \beta_0)/\beta_1 < t\big) \approx \Phi\{A/B\}$,

where $A = \mu_0 + t\mu_1 - y_f$, $B = \sqrt{C}$, and $C = \sigma_0^2 + \sigma_1^2 t^2 + 2\rho t \sigma_0 \sigma_1$.

Non-linear Degradation Model (motivated from both semiconductor and PCB manufacturing studies):

$Y_i = f(X_i, \beta_i) + \epsilon_i$, $\beta_i = \beta + b_i$ (random effects).

Note that $E(Y_i) \neq f(X_i, E(\beta_i)) = f(X_i, \beta)$. Thus, $f(X_i, \beta)$ is not the mean response of the population, and may not be the median of the distribution of $Y_i$ even when the errors $\epsilon_i$ have mean zero. By correcting the bias of the median regression, estimates of $\beta$ were obtained by solving a system of (optimum) unbiased percentile estimating equations (PEE). The asymptotic distribution of the estimates was derived. Several examples of asymptotic efficiency evaluations were given.

3. Quality Improvement in Manufacturing

Non-Traditional Research (examples):

Mesenbrink, P., Lu, J. C., McKenzie, R., and Taheri, J. (1994), “Characterization and Optimization of a Wave Soldering Process,” Journal of the American Statistical Association (JASA), 89, 1209-1217.
Gardner, M. M., Lu, J. C., et al. (NCSU ECE and TI researchers) (1997), “Equipment Fault Detection using Spatial Signatures,” IEEE Trans. on Components, Hybrids and Manufacturing, 20(4), 295-304.
Hughes-Oliver, J. M., Lu, J. C., Davis, J. C., and Gyurcsik, R. S. (1998), “Achieving Uniformity in a Semiconductor Fabrication Process using Spatial Modeling,” JASA, 93, 36-45.
Lu, J. C., et al. (SRC (Semiconductor Research Corporation) and NCSU ECE people) (1998), “A New Device Design Methodology,” IEEE Trans. on Electron Devices - Special Issue on Process Integration and Manufacturability, 45(3), 634-642.
Li, C. S., Lu, J. C., Park, J., Kim, K. M., Brinkley, P. A., and Peterson, J. (1999), “A Multivariate Zero-inflated Poisson Distribution and its Inferences,” Technometrics, 41(1), 29-38.

4. Data Mining in Manufacturing

Rying, E. A., Bilbro, G. L., Ozturk, M. C., and Lu, J. C. (2000), “In Situ Selectivity and Thickness Monitoring based on Quadrupole Mass Spectroscopy during Selective Silicon Epitaxy,” Proceedings of the 197th Meeting of the Electrochemical Society, 383-392.
Lu, J. C. (2001), “Methodology of Mining Massive Data Set for Improving Manufacturing Quality/Efficiency,” Chapter 11 (pp. 255-288) in Data Mining for Design and Manufacturing, edited by D. Braha, Kluwer Academic Publishers: New York.
Lada, E. K., Lu, J. C., and Wilson, J. R. (2002), “A Wavelet Based Procedure for Process Fault Detection,” IEEE Trans. on Semiconductor Manufacturing, 15(1), 79-90.
Rying, E. A., Bilbro, G. L., and Lu, J. C. (in press), “Focused Local Learning with Wavelet Neural Networks,” IEEE Trans. on Neural Networks.
Porter, A. L., Kongthon, A., and Lu, J. C. (in press), “Research Profiling – Improving the Literature Review: Illustrated for the Case of Data Mining of Large Datasets,” Scientometrics.

Data from Nortel’s Antenna Manufacturing Process
[Figure]

Auto-Correlation Map: Keywords (Cleaned)
[Figure: keyword map with groups labeled A, B, C and D and link similarity bands > 0.75, 0.50 - 0.75, 0.25 - 0.50 and < 0.25. Keywords shown: temporal databases; data reduction; wavelet transforms; data analysis; remote sensing; image recognition; classification; spatial data structures; fuzzy set theory; image classification; feature extraction; decision trees; Bayes methods; statistical analysis; learning (artificial intelligence); pattern recognition; pattern classification; neural nets; data mining; very large databases; fuzzy neural nets; pattern clustering; tree data structures; backpropagation; image processing; fuzzy logic; trees (mathematics); parallel algorithms; unsupervised learning; parallel programming]
Node size reflects relative frequency in the dataset of 991 abstract records. Placement is based on a VantagePoint proprietary Multi-dimensional Scaling (MDS) routine. Topics depicted close together are

Discrete Wavelet Transform: Data Reduction Procedures
1 Linear and Nonlinear Approximation in Signal Processing
2 Information Metric Based Procedures
3 Data Denoising Procedures
4 Our Methods RRE_h and RRE_s
5 Comparisons
• Testing Curves
• “Data without Noises”
• “Data with Inherent Random Noises”

Linear and Nonlinear Approximation in Signal Processing
[Figure]

Information Metric Based Procedure – AMDL (Approximation Minimum Description Length)

Saito’s (1994) method selects C to minimize

$\mathrm{AMDL}(C) = 1.5\, C \log_2 N + 0.5\, N \log_2 \left[ \sum_{i=1}^{N} (y_i - \hat{y}_{i,C})^2 \right]$.

Data De-noising Procedures:

Donoho and Johnstone (1995) considered the nonparametric regression model, $y_i = f_i + \epsilon_i$, i = 1, 2, …, N, where the $\epsilon_i$ are i.i.d. normal variables with zero mean and constant variance. The goal of the data de-noising procedures is to find a smooth estimate of f that minimizes the mean square error (MSE).
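
The AMDL criterion quoted above from Saito (1994) can be illustrated with a small Python sketch. It assumes an orthonormal Haar transform (the slides do not name the wavelet basis) and assumes that the C retained coefficients are the C largest in magnitude; with an orthonormal basis the residual sum of squares then equals the energy of the discarded coefficients, so no inverse transform is needed. The function names and the test signal (a step with a small deterministic ripple standing in for noise) are hypothetical.

```python
import math


def haar_dwt(signal):
    """Orthonormal Haar wavelet transform of a sequence whose length is a power of two."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a positive power of two")
    approx = [float(v) for v in signal]
    details = []
    root2 = math.sqrt(2.0)
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        # Coarser detail levels are placed in front of finer ones.
        details = [(a - b) / root2 for a, b in pairs] + details
        approx = [(a + b) / root2 for a, b in pairs]
    return approx + details


def amdl_curve(signal):
    """Return {C: AMDL(C)} when the C largest-magnitude Haar coefficients are kept."""
    n = len(signal)
    energies = sorted((w * w for w in haar_dwt(signal)), reverse=True)
    total = sum(energies)
    # tail[c] = energy of the coefficients discarded when c are kept (Parseval).
    tail = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        tail[i] = tail[i + 1] + energies[i]
    scores = {}
    for c in range(1, n):
        rss = tail[c]
        if rss <= 1e-12 * total:
            break  # fit is numerically exact; the log term is undefined
        scores[c] = 1.5 * c * math.log2(n) + 0.5 * n * math.log2(rss)
    return scores


def select_c(signal):
    """Pick the number of retained coefficients that minimizes AMDL."""
    scores = amdl_curve(signal)
    if not scores:
        raise ValueError("signal is reproduced exactly by a single coefficient")
    return min(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical test curve: a step from 0 to 10 plus an alternating +/-0.1 ripple.
    step = [(0.0 if i < 32 else 10.0) + (0.1 if i % 2 == 0 else -0.1) for i in range(64)]
    print("selected C:", select_c(step))
```

For this test curve the two large coefficients carry the step and the remaining nonzero coefficients carry only the ripple, so the penalty term 1.5 C log2 N outweighs the gain from keeping more of them.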
Three methods, VisuShrink, RiskShrink and SURE (Stein’s Unbiased Risk Estimate), were compared in our studies.

Seven Testing Curves, Two Real-life Data Examples
[Figure]

Comparison Results (“Data without Noise”)
[Figure]

Comparison Results (“Data with Inherent Random Noises”)
[Figure]

Decision Rules (based on the “reduced-size data”)
1 Chi-square tests
2 Multi-scale Statistical Process Control (SPC)
3 (Functional) Principal Component Analysis (PCA)
4 Bayesian Odds-ratio Probability-based Classification (and Canonical Variation Analysis)
5 Decision Tree (CART)
6 Scalogram (from Signal Processing Literature)
7 Integrated Energy Metrics

Scalogram

Challenges: derive the distribution of the “energy,”

$E_j = \sum_k I(|w_{jk}| \ge \lambda)\, w_{jk}^2$,

where $\lambda$ is decided from the data reduction method, and $w_{jk}$ is the wavelet coefficient.

Key Challenges in Data Mining Procedures in Manufacturing Applications:

The replication size in “fault classes” is small.
Proposal: generating “learning data.”
Example: Rying (2001) conducted 25 runs of RTCVD experiments with four induced fault cases.

Nominal Runs:
[Figure]

Four Induced Fault Cases
[Figure]

Challenges in Learning-data Generations:

1. It is difficult to generate the “data shifting patterns” (e.g., Rying’s nominal data) in the wavelet domain, which holds far fewer data points than the original data domain, where the data can be very large.

Idea: “Zoom in” on the regions where “fault data patterns” occurred, and generate the shifted data in the original data domain within these focused regions.

Illustration Example: “Zoom-in Procedure”:
[Figure]

Generate Replicates in the Wavelet Domain with the following “Patching Technique”:
[Figure]

5. Product Design, Manufacturing and Service (PDMS) Chain Management System

Initiatives in iTimes (Information Technology Integrated Manufacturing Enterprise System)
[Diagram labels: Engineering Domains; Customer-Driven Design/Engineering; Application Areas; Additive Fabrication; E-Design, Engineering Supply Chains; Simulation-Based Design Environments for Field Service Engineering; Aero/Auto/Elec Systems; Education; Materials Design; Enabling Technologies; Decision Making and Design Synthesis; Interoperability: Fine and Coarse Grained; Engr. Modeling, Validation, Testbeds; IT Architectures for Affordable Change; Tools for Modeling]

Current Involvement in iTimes:
(1) developing a collaborative game theory based decision support system for structuring interactions among partners in the ePDMS chain (e.g., random coefficient based evolution modeling of utility functions changing over the “co-developing periods”);
(2) extracting design-relevant relationships from “data” collected from various sources, e.g., past designs, conditions of machines on the factory floor at distributed sites, etc.;
(3) monitoring and controlling resource (e.g., energy) utilization and environmental impact.

Challenges in Data Mining on Product Design
(1) “Retrieving past design information”: How to define “similarity” in 3-D geometric objects with spatial relationships? Is it possible to develop a “multi-resolution” presentation of design models or data?
(2) Source of “variation” in design.
(3) Relationship between design, manufacturing and service activities.

Analysis Models of Varying Fidelity
[Figure labels: Design Model (CAD); Analysis Models (CAE); Airframe Subassembly; 1D Beam/Stick Model; 3D Continuum/Brick Model; Associativity Gaps; Diverse Fidelities]

Informal Associativity Diagram
[Figure labels: Design Model; Analysis Model; PWA Component Occurrence; 1 SMM; 2 ABB; 3 APM; 4 CBAM; Solder Joint Plane Strain Model; Plane Strain Bodies System; linear-elastic model; primary structural material; total height; component, PWB and solder joint bodies 1 to 4 (core: FR4, base: Alumina, Epoxy), each with geometry and material properties]

Fine-Grained Associativity
Constrained Object-based Analysis Template: Constraint Schematic View
[Figure labels: 1 SMM; 2 ABB; 3 APM; 4 CBAM; component occurrence; component; pwb; solder joint; primary structural material; linear-elastic model; average bilinear-elastoplastic model; deformation model; Plane Strain Bodies System; total height; total thickness; length; detailed shape (rectangle); approximate maximum inter-solder joint distance; geometry and stress-strain models 1 to 3; solder joint shear strain range]

6. Information Technology in Education

[Diagram labels: CaMILE; IC web-page links; Laboratory project; Web-based User Interface; Modeling and analysis tools in “existing systems”; ePDMS decision support tools; Middleware (e.g., CORBA, SOAP, Jini, etc.); Case study database; Simulated enterprise operation system; Industrial practicum reports and case studies]

Architecture of the Integrated Curriculum (IC)-ePDMS System