Working Draft

Application of Data Mining in Energy Industry: A Case Study of NOx Prediction

Jongsawas Chongwatpol
NIDA Business School, National Institute of Development Administration
118 Seri Thai Road, Bangkapi, Bangkok, 10240 Thailand
Email: [email protected], Phone: +6686-776-9686

Abstract

The incorporation of environmental control systems in coal-fired power plants has become increasingly standard. However, power producers are still looking for ways to proactively monitor plant operation so that NOx emissions comply with environmental regulations. Particularly under alarm conditions, data mining techniques can be applied to provide plant-wide signals of any unusual operational and coal-quality factors that cause the NOx level to deviate from its usual standard. This study provides step-by-step guidance on how to conduct a data mining project to explain and predict the leading causes of variation in NOx emissions in the combustion process. Corrective action and preventive maintenance on those unusual factors are regularly evaluated and monitored.

Introduction

The use of electricity is an essential part of the economy. Coal power, an established electricity source that provides a vast quantity of inexpensive and reliable power, has become more important as supplies of oil and natural gas diminish. Coal-fired power plants currently fuel 41% of global electricity. In fact, a higher percentage of electricity produced by coal-fired power plants can be expected in some countries. Although the incorporation of environmental control systems in coal-fired power plants has become increasingly standard, power producers are still looking for ways to proactively monitor plant operation so that NOx emissions comply with environmental regulations.
Particularly under alarm conditions, data mining techniques can be applied to provide plant-wide signals of any unusual operational and coal-quality factors that cause the NOx level to deviate from its usual standard. This study provides step-by-step guidance on how to conduct a data mining project to explain and predict the leading causes of variation in NOx emissions in the combustion process.

Coal-Fired Power Plants

A case study of a coal-fired power plant in Thailand has been conducted to explore the application of data mining techniques in the energy industry. The electricity generation process in a coal-fired power plant starts when coal is crushed into a fine powder in the coal bunker to increase the surface area for the burning process. The powdered coal is then burned at a high temperature in the combustion chamber of a boiler. The hot gases and heat energy produced convert water in the tubes lining the boiler into steam, which is used to spin turbines to generate electricity. The byproduct of this combustion process is the flue gas, which is discharged into the air. This flue gas contains water vapor, fly ash, and many toxic substances, such as carbon dioxide and especially NOx and SOx. To comply with environmental regulations, this flue gas must be treated appropriately by the installed protection equipment.

Figure 1 presents the scatter plots of NOx and SOx emissions from July 2009 to June 2014. For SOx, any emission level greater than 262 ppm requires immediate attention. Similarly, corrective action is needed for any NOx emission greater than 241 ppm. Additionally, any emission level that deviates greatly from the mean is worth noting. Promoting preventive maintenance on the factors that affect the stack emission of NOx and SOx helps improve the performance of the plant accordingly.
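The alarm limits quoted above can be sketched as a simple screening step. This is a minimal illustration, not the plant's monitoring logic: only the 241 ppm and 262 ppm limits come from the text, while the function names, the sample readings, and the three-standard-deviation cutoff for "deviates greatly from the mean" are assumptions made for the example.

```python
from statistics import mean, pstdev

# Limits quoted in the text (ppm). Everything else below is illustrative.
NOX_LIMIT_PPM = 241.0
SOX_LIMIT_PPM = 262.0


def flag_exceedances(readings, limit_ppm):
    """Return (index, value) pairs for readings strictly above the limit."""
    return [(i, v) for i, v in enumerate(readings) if v > limit_ppm]


def flag_deviations(readings, k=3.0):
    """Return (index, value) pairs lying more than k population standard
    deviations from the mean. The cutoff k is an assumption; the text
    gives no numeric rule for a large deviation."""
    mu = mean(readings)
    sigma = pstdev(readings)
    if sigma == 0:
        return []  # all readings identical, nothing deviates
    return [(i, v) for i, v in enumerate(readings) if abs(v - mu) > k * sigma]


if __name__ == "__main__":
    nox_hourly = [150.0, 148.5, 243.2, 151.0]  # made-up hourly averages
    print(flag_exceedances(nox_hourly, NOX_LIMIT_PPM))
```

A fixed limit and a deviation rule answer different questions: the first catches regulatory exceedances, the second catches unusual readings that are still below the limit.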
Figure 1: NOx and SOx Emission from July 2009 to June 2014 (panel annotations: average of 150 ppm; average of 152 ppm)

Currently, the plant employs traditional Excel-based regression analysis to monitor power plant performance. Figure 2 presents an example of the overall reaction equations currently employed at the plant:

CaHbOcNdSe + wH2O + g(O2 + 3.76N2) -> x1CO + x2CO2 + x3H2O + x4NO + x5NO2 + x6SO2 + x7SO3 + x8O2 + x9N2 + x10C

Primary reaction forming sulphur oxide: S(s) + O2(g) = SO2(g)
Secondary reaction: SO2(g) + 1/2 O2(g) = SO3
Primary reaction forming nitrogen oxide: 1/2 N(s) + 1/2 O2(g) = NO(g) (temperature above 1600 C or 2900 F)
Secondary reaction: NO(g) + 1/2 O2(g) = NO2(g)

Figure 2: The Overall Reaction Equations

This study seeks to develop predictive modeling to support management decision making. Since NOx is one of the key contributors to Thailand's pollution, the focus of this study is on the following research question: "What are the most important factors that influence the stack emission of NOx and SOx in the combustion process?" To answer this research question, the researcher examines whether more complex analytical models, built with several data mining methodologies and algorithms, can predict and explain these emissions. In this study, we follow the five steps of the SEMMA methodology: Sample, Explore, Modify, Model, and Assess.

Research Framework

In this study, we provide an in-depth analysis of how a data mining approach can help improve overall plant performance and reduce potential plant pollution.
For the preliminary analysis, the data set contains a total of 30,000 records with 117 variables, derived from approximately 11 months of power plant operations from 2013 to 2014. Figure 3 and Table 1 present the research framework and examples of the variables used in this study.

Figure 3: The Research Framework for Stack NOx and SOx Prediction Model
(Flowchart stages: Data; Data Preparation [select data, collect initial process, describe data, explore data, verify data quality, clean data, construct data, integrate data]; Data Partition [training data set, validating data set]; Model Building [decision trees, neural network, regression]; Model Testing; Model Evaluation; Model Deployment. Research question: "What factors have a great impact on stack emission when the coal qualities are changed in the combustion process?")

Table 1: An Example of Variables Used in this Study

Variable                    Model Role   Measurement Level   Description
Stack NOx Emission          Target       Nominal             The flue gas exhaust NOx emissions (ppm)
Stack SOx Emission          Target       Nominal             The flue gas exhaust SOx emissions (ppm)
Date                        ID           Nominal             Date and time: hourly average
Operation control process   Input        Nominal             Operation control conditions for stack NOx and SOx prediction (variables 1-91)
Fuel qualities              Input        Nominal             Coal qualities (variables 1-26)

Operation control conditions for stack NOx and SOx prediction:

1. 3RY_SH_OUTLET MAIN STEAM PRESS (A)
2. 3RY_SH_OUTLET MAIN STEAM PRESS (B)
3. 3RY_RH_OUTLET STEAM PRESS
4. 3RY RH OUTLET STEAM PRESS
5. 3RY SH OUTLET MAIN STEAM TEMP-A
6. 3RY SH OUTLET MAIN STEAM TEMP-B
7. HOT REHEAT STEAM TEMP (ST INLET) (RH-2)
8. HOT REHEAT STEAM TEMP (ST INLET) (LH)
9. HOT REHEAT STEAM TEMP (ST INLET) (RH-1)
10. ECO OUTLET GAS O2-A
11. ECO OUTLET GAS O2-B
12. ECO OUTLET GAS O2 (LOW SELECT)
13. COAL FLOW-A
14. COAL FLOW-B
15. COAL FLOW-C
16. COAL FLOW-D
17. COAL FLOW-E
18. COAL FLOW-F
19. BURNER TILT
20. ADJUSTABLE CONTROL DRIVE _AA_A DEM-1 RB
21. ADJUSTABLE CONTROL DRIVE _AA_A DEM-2 RB
22. ADJUSTABLE CONTROL DRIVE _AA_B DEM-2 RB
23. ADJUSTABLE CONTROL DRIVE _AA_C DEM-1 RB
24. ADJUSTABLE CONTROL DRIVE _AA_D DEM-1 RB
25. ADJUSTABLE CONTROL DRIVE _AA_C DEM-2 RB
26. ADJUSTABLE CONTROL DRIVE _AA_D DEM-2 RB
27. AUX DAMPER _U_AA_A DEM-1 RB
28. AUX DAMPER _U_AA_B DEM-1 RB
29. AUX DAMPER _L_AA_A DEM-1 RB
30. AUX DAMPER _L_AA_B DEM-1 RB
31. AUX DAMPER _U_AA_A DEM-2 RB
32. AUX DAMPER _U_AA_B DEM-2 RB
33. AUX DAMPER _L_AA_A DEM-2 RB
34. AUX DAMPER _L_AA_B DEM-2 RB
35. AUX DAMPER _U_AA_C DEM-1 RB
36. AUX DAMPER _U_AA_D DEM-1 RB
37. AUX DAMPER _L_AA_C DEM-1 RB
38. AUX DAMPER _L_AA_D DEM-1 RB
39. AUX DAMPER _U_AA_C DEM-2 RB
40. AUX DAMPER _U_AA_D DEM-2 RB
41. AUX DAMPER _L_AA_C DEM-2 RB
42. AUX DAMPER _L_AA_D DEM-2 RB
43. AUX DAMPER _U_AA_C DEM-2 RB
44. AUX DAMPER _U_AA_D DEM-2 RB
45. AUX DAMPER _L_AA_C DEM-2 RB
46. AUX DAMPER _L_AA_D DEM-2 RB
47. UNIT_GROSS_MW
48. MAIN_STEAM_FLOW
49. MAIN STEAM PRESS
50. MAIN STEAM TEMP
51. RH STEAM TEMP
52. FDF-A INLET VANE CD FB
53. FDF-B INLET VANE CD FB
54. IDF-A INLET VANE CD FB
55. IDF-B INLET VANE CD FB
56. BOOSTER FAN-A BLADE PICH TR SIGNAL
57. BOOSTER FAN-B BLADE PICH TR SIGNAL
58. BUF-A MOTOR CURRENT
59. BUF-B MOTOR CURRENT
60. FDF-A MOTOR CURRENT
61. FDF-B MOTOR CURRENT
62. IDF-A MOTOR CURRENT
63. IDF-B MOTOR CURRENT
64. AH_A INLET GAS TEMP (2)
65. AH_B INLET GAS TEMP (2)
66. AH_A INLET GAS TEMP (1)
67. AH_B INLET GAS TEMP (1)
68. AH_A OUTLET FLUE GAS TEMP (1)
69. AH_A OUTLET FLUE GAS TEMP (2)
70. AH_A OUTLET FLUE GAS TEMP (3)
71. AH_B OUTLET FLUE GAS TEMP (1)
72. AH_B OUTLET FLUE GAS TEMP (2)
73. AH_B OUTLET FLUE GAS TEMP (3)
74. IDF-A INLET FLUE GAS TEMP
75. IDF-B INLET FLUE GAS TEMP
76. FGD GAS DUCT INLET GAS TEMP
77. STACK INLET FLUE GAS TEMPERATURE
78. PULVERIZER-A PRIMARY AIR FLOW
79. PULVERIZER-B PRIMARY AIR FLOW
80. PULVERIZER-C PRIMARY AIR FLOW
81. PULVERIZER-D PRIMARY AIR FLOW
82. PULVERIZER-E PRIMARY AIR FLOW
83. PULVERIZER-F PRIMARY AIR FLOW
84. TOTAL AIR FLOW TON PER HOUR
85. AH_A OUTLET SECONDARY AIR FLOW
86. AH_B OUTLET SECONDARY AIR FLOW
87. FGD BOOSTER FAN-A FLUE GAS FLOW
88. FGD BOOSTER FAN-B FLUE GAS FLOW
89. FGD BOOSTER FAN FLUE GAS FLOW
90. STACK INLET FLUE GAS FLOW
91. FGD BYPASS DUCT FLUE GAS DIFF PRESS
Fuel qualities: coal qualities and chemical composition in coal:

1. Weight average Al2O3 Input to Boiler
2. Weight average Ash Input to Boiler
3. Weight average CaO Input to Boiler
4. Weight average Carbon Input to Boiler
5. Weight average Chloride Input to Boiler
6. Weight average ESP K-Factor Input to Boiler
7. Weight average F.T. Input to Boiler
8. Weight average Fe2O3 Input to Boiler
9. Weight average Fixed Carbon Input to Boiler
10. Weight average fuel ratio Input to Boiler
11. Weight average Total Heat Input to Boiler
12. Weight average HGI Input to Boiler
13. Weight average Hydrogen Input to Boiler
14. Weight average K2O Input to Boiler
15. Weight average MgO Input to Boiler
16. Weight average Mn3O4 Input to Boiler
17. Weight average Na2O Input to Boiler
18. Weight average Nitrogen Input to Boiler
19. Weight average P2O5 Input to Boiler
20. Weight average SiO2 Input to Boiler
21. Weight average SO3 Input to Boiler
22. Weight average Ti2O Input to Boiler
23. Weight average Total Moisture Input to Boiler
24. Weight average Total Sulphur Input to Boiler
25. Weight average Volatile Matter Input to Boiler
26. Ash analysis percent Unburnt Carbon

Preliminary Results and Discussion: NOx - The Flue Gas Exhaust Emission Prediction

Data Exploration

The first task in data exploration is to get a sense of the potential causes of increasing NOx levels in the power generation process. Based on the Pearson correlation results in Figure 4, increases in Motor Current, Unit_Gross_MW, and AH_A_Outlet_Secondary_Air_Flow are associated with higher NOx levels. Conversely, the NOx level decreases as AUX_DAMPER_U_AA_CDEM and Weight_average_Nitrogen_Input increase.
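The Pearson correlation screening described above can be sketched as follows. This is an illustrative sketch only: the study itself reports results from its own tooling in Figure 4, and the function names and the small sample series here are assumptions made up for the example, not plant data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def rank_by_correlation(inputs, target):
    """Rank input variables by the absolute value of their correlation
    with the target, strongest first. Returns (name, r) pairs."""
    scores = {name: pearson_r(values, target) for name, values in inputs.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


if __name__ == "__main__":
    # Made-up values, used only to show the call pattern.
    nox = [140.0, 150.0, 155.0, 170.0]
    inputs = {
        "Unit_Gross_MW": [250.0, 270.0, 280.0, 300.0],
        "Weight_average_Nitrogen_Input": [1.6, 1.5, 1.5, 1.3],
    }
    for name, r in rank_by_correlation(inputs, nox):
        print(f"{name}: r = {r:+.3f}")
```

A positive coefficient corresponds to the "higher NOx" relationships in the text and a negative one to the variables whose increase accompanies lower NOx; correlation alone does not establish cause.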
According to the variable worth in Figure 4, the variables with the greatest impact on NOx emission include AH_Outlet_Secondary_Air_Flow, FDF_Motor_Current, ECO_Outlet_O2, Unit_Gross_MW, AH_inlet_Gas_temp, and Main_steam_flow; meanwhile, the variable with the lowest worth is Ambient_Pressure.

Figure 4: NOx - Variable Worth and Pearson Correlation

Model Development

The dataset is partitioned into 70% for training and 30% for validation. Six predictive models are developed and compared: (1) a decision tree with a maximum of 2 branches, (2) a decision tree with a maximum of 3 branches, (3) a neural network with 1 hidden layer and 3 hidden units, (4) an autoneural network, (5) a stepwise regression, and (6) a polynomial regression.

Model Comparison

The training dataset is used to build the predictive models. The next step is to assess the accuracy of each model's fit on the validation dataset. We measure the performance of our models using the ASE (average squared error), which is compared across the models on the validation dataset. The lower the ASE, the better the model predicts. The results of the models developed to predict the NOx level are presented in Figure 5. Stepwise regression generates the best overall prediction accuracy with the lowest ASE, 48.889, followed by polynomial regression (ASE = 50.780) and a 3-way decision tree (ASE = 67.312).

A 3-way Decision Tree Model

Figure 6 summarizes the variable importance selected for constructing the 3-way decision tree model. The top three important factors are (1) FDF_A_Motor_Current, (2) Weight_average_F_T_Input_to_Boiler, and (3) AH_B Outlet Secondary Air Flow. A total of 29 "If-Then" rule-based predictions are generated from the decision tree model, with a maximal tree of 41 leaves.

Stepwise Regression Model

The stepwise regression equation is presented below.
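Before turning to the equation itself, the partition-and-compare procedure described under Model Development and Model Comparison can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names, the random seed, and the shuffling strategy are hypothetical, and only the 70/30 split and the three ASE values are taken from the text.

```python
import random


def partition(records, train_fraction=0.7, seed=0):
    """Shuffle records and split them into training and validation sets.
    The seed and the simple random shuffle are assumptions for the example."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def average_squared_error(actual, predicted):
    """ASE: the mean of squared differences between actual and predicted."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def best_model(validation_ase):
    """Return the model name with the lowest validation ASE."""
    return min(validation_ase, key=validation_ase.get)


if __name__ == "__main__":
    # Validation ASE values as reported in the text.
    reported = {
        "stepwise regression": 48.889,
        "polynomial regression": 50.780,
        "3-way decision tree": 67.312,
    }
    print(best_model(reported))  # prints: stepwise regression
```

Comparing models on the held-out 30% rather than on the training data guards against choosing a model that merely memorizes the training records.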
The power plant may start by monitoring Weight_average_Mn3O4_Input_to_Boi and Weight_average_fuel_ratio_Input_to_Boi in the combustion process. To reduce the NOx emission, the power plant may try to improve operations so as to reduce ECO_Outlet_Gas_O2_Low_Select and FDF_A_Motor_Current, or to increase Aux_Damper_U_AA_A_RB and Coal_Flow_C.

Stack NOx Emission = - 251.8
    - 0.7626 (Adjust_Contorl_Drive_AA_C_D)
    + 0.8761 (AH_A_inlet_Gas_Temp)
    + 0.6903 (AH_A_outet_Gas_Temp)
    - 1.3539 (AH_B_inlet_Gas_Temp)
    - 0.5702 (Aux_Damper_U_AA_A_RB)
    - 0.8045 (Coal_Flow_A)
    - 0.2535 (Coal_Flow_B)
    - 0.3303 (Coal_Flow_C)
    + 15.4966 (ECO_Outlet_Gas_O2_Low_Select)
    + 2.2013 (FDF_A_Motor_Current)
    + 1.0684 (Main_Steam_Press)
    + 0.4897 (Pulverizer_A_Primary_Air_flow)
    + 0.2240 (Pulverizer_C_Primary_Air_flow)
    - 175.6 (Weight_average_Mn3O4_Input_to_Boi)
    - 13.0318 (Weight_average_Na2O_Input_to_Boi)
    + 30.5430 (Weight_average_fuel_ratio_Input_to_Boi)

Figure 5: Fit Statistics for NOx Prediction

Figure 6: A 3-Way Decision Tree Model

Conclusion

This finding helps the power plant prioritize the important factors associated with NOx emission, so that closer attention to those factors can be promptly initiated. The results of this study show that data mining approaches are capable of predicting the NOx level, given sufficient data with the proper input variables. Power plant managers can use their existing databases, along with advanced analytics through data mining approaches, to accurately predict other toxic substances, to monitor plant performance, and especially to comply with environmental regulations.

Contact Information

Your comments and questions are valued and encouraged. Contact the authors at:

Name: Jongsawas Chongwatpol, Ph.D.
Enterprise: NIDA Business School, National Institute of Development Administration
Address: 118 Seri Thai Road, Bangkapi, Bangkok, 10240 Thailand
Email: [email protected], [email protected]

Jongsawas Chongwatpol is a lecturer in NIDA Business School at the National Institute of Development Administration. He received his BE in industrial engineering from Thammasat University, Bangkok, Thailand, two MS degrees (in risk control management and management technology) from the University of Wisconsin - Stout, and a PhD in management science and information systems from Oklahoma State University. His research has recently been published in major journals such as Decision Support Systems, Decision Sciences, European Journal of Operational Research, and Journal of Business Ethics. His major research interests include decision support systems, RFID, manufacturing management, data mining, and supply chain management.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.