6TH INTERNATIONAL CONFERENCE ON MODERN POWER SYSTEMS MPS2015, 18-21 MAY 2015, CLUJ-NAPOCA, ROMANIA

Data Mining Tools in Electricity Distribution Systems

Bogdan Constantin Neagu
Power Systems Department, „Gheorghe Asachi” Technical University, Iaşi, Romania
email@example.com

Gheorghe Grigoraş
Power Systems Department, „Gheorghe Asachi” Technical University, Iaşi, Romania
firstname.lastname@example.org

Abstract— Currently, the ongoing rapid growth of modern computational technology has created an astounding flow of data. In Romania, the smart grid trend takes the form of Smart Metering system implementation. With the advent of smart meters, a huge amount of power system data is stored day to day in the DSO databases. The electricity consumption information contains a lot of valuable knowledge, very useful for both the distribution system operator (DSO) and final users. In this context, smart meters are a key element for load control and monitoring in future smart grids. This paper describes an approach to identify the consumption indicators using Data Mining techniques. Knowledge discovery and machine learning techniques can make use of this data to extract valuable information and interesting patterns from these databases. The originality of the paper consists in the data mining software developed and controlled through a friendly graphic interface for results visualization, which can be used in many electricity distribution system applications. The analysis of the study case results can lead power consumers to use electricity rationally, and provides decision support for each DSO.

Keywords—data mining; smart metering; distribution system

I. INTRODUCTION

Nowadays power companies need to make extensive use of modern methods and technologies to offer better service to their customers and respond to the needs of the power industry.

Presently, the rapid growth of modern computational technology has created an astounding data flow. In this context, the power system domain is facing an explosive growth of data, and power system development has resulted in more and more real-time data being stored in databases.

An important research topic in the optimal operation and planning of distribution systems by the DSO concerns customer consumption. The electricity consumption data contains a lot of valuable knowledge, very useful for both the distribution system operator (DSO) and customers. In this context, smart meters are a key element for load control and monitoring in future smart grids.

To extract useful information from large databases, such as power consumption records, data mining techniques may be used. Processing and interpreting this huge volume of data is extremely complex, costly and time consuming. Since rational data mining techniques are an essential instrument in the infrastructure development strategy of every DSO, this paper presents complex data mining tools based on smart meter databases.

Regarding the data mining techniques used for power system applications, the recent literature indicates different approaches such as fuzzy methods [5], neural networks [6], and genetic algorithms [7].

The paper presents how a load profiling and consumption characteristic indicator extraction methodology can be applied to customers using the information provided by smart meters. This methodology uses data mining techniques to process data, identify real data, and generate typical load profiles (TLPs).

The major innovation of this paper consists in the data mining software developed and controlled through a friendly graphic interface for results visualization, which can be used in many electricity distribution system applications. Our tool can be used to process the large volume of information provided by the data mining software.

The remainder of this paper is organized as follows. Section 2 gives a short presentation of smart meters. Section 3 presents the proposed data mining tools. Section 4 shows the results of the data mining tools for a large number of substations, based on large databases recorded by smart meters. Section 5 contains the paper conclusions.

II. SMART METERING SYSTEMS

In Romania, the smart grid trend takes the form of Smart Metering system implementation. A smart meter is an advanced energy meter that measures electrical energy consumption and provides additional information compared to a conventional energy meter. It aims to improve the reliability, quality and security of supply.

Many DSOs are deploying smart metering systems as a first step. Smart metering is an integral part of the smart grid infrastructure for data collection and communications.

These systems have emerged as a breakthrough in the relationship between the final consumer and the DSO. Coupled with an Advanced Metering Infrastructure (AMI), presented in Fig. 1, Smart Metering enables a wide range of electricity distribution system applications. A typical AMI network includes three main components: smart meters on the consumer side; the communication network between the smart meter and the DSO; and the meter data management application.

Fig. 1. The AMI infrastructure for power distribution system

III. DATA MINING TOOLS IN POWER DISTRIBUTION SYSTEMS

Data mining is defined as “the nontrivial extraction of implicit, previously unknown, and potentially useful information from data” [11].
Data mining is also described as the process of extracting information and knowledge that is implicit, unknown in advance, but potentially useful, from the massive, incomplete, fuzzy, noisy and random data generated in practical applications. Data mining techniques differ in the following aspects: problem representation, accuracy, parameters that must be optimized, complexity of the results, run time, interpretability, transparency, etc.

The proposed data mining software is divided into three main modules [13]:

The configuration module selects one or more databases, each of them defined by the name of the consumption place.

The monitoring module supervises the smart meter data either in real time (online monitoring) or based on historical data (offline monitoring). The monitoring functions are performed over time intervals set by the user at initialization. This module also allows the meter clock to be synchronized with the computer clock.

The data management and visualization module displays different reports on the information collected by smart meters, such as:
power consumption at different points and over selectable time intervals (day, month, year), in tabular or graphical form, indicating the off-peak (minimum) and peak load, the dispersion, and the average and hourly power factor;
daily, monthly or yearly balances for the monitored consumer;
calculation and display of the characteristic parameters of the active, reactive or apparent load curves (peak value, off-peak value, root-mean-square load, peak load duration, load factor, loss duration, loss factor, dispersion and standard deviation, fill factor of the load curve, coefficient of irregularity and shape factor of the load curve, correlation coefficient between the active and reactive loads, etc.) for an interval selected by the user.
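The load curve parameters reported by the data management and visualization module can be illustrated with a short sketch. The Python example below is not the authors' software. It assumes the usual textbook definitions of these indicators (the paper does not state its formulas), a load curve sampled at equal time steps and expressed in kW, and hypothetical names (load_curve_indicators, LoadCurveIndicators, step_h) chosen only for illustration.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class LoadCurveIndicators:
    p_min_kw: float             # minimum (off-peak) load
    p_max_kw: float             # peak load
    p_mean_kw: float            # average load
    p_rms_kw: float             # root-mean-square load
    peak_duration_h: float      # Tmax = energy / peak load
    load_factor: float          # fill factor = average load / peak load
    loss_duration_h: float      # tau = sum(P^2 * dt) / peak load^2
    loss_factor: float          # loss duration / analysis period
    shape_factor: float         # RMS load / average load
    variation_coeff: float      # standard deviation / average load
    nonuniformity_coeff: float  # minimum load / peak load


def load_curve_indicators(loads_kw, step_h=1.0):
    """Compute indicators for equally spaced samples (assumed definitions)."""
    if not loads_kw:
        raise ValueError("load curve is empty")
    n = len(loads_kw)
    period_h = n * step_h
    p_min, p_max = min(loads_kw), max(loads_kw)
    if p_min < 0 or p_max <= 0:
        raise ValueError("loads must be non-negative with a positive peak")
    p_mean = sum(loads_kw) / n
    mean_sq = sum(p * p for p in loads_kw) / n
    p_rms = sqrt(mean_sq)
    # Population standard deviation; max() guards against rounding below zero.
    std = sqrt(max(mean_sq - p_mean ** 2, 0.0))
    load_factor = p_mean / p_max
    loss_factor = mean_sq / p_max ** 2
    return LoadCurveIndicators(
        p_min_kw=p_min,
        p_max_kw=p_max,
        p_mean_kw=p_mean,
        p_rms_kw=p_rms,
        peak_duration_h=load_factor * period_h,
        load_factor=load_factor,
        loss_duration_h=loss_factor * period_h,
        loss_factor=loss_factor,
        shape_factor=p_rms / p_mean,
        variation_coeff=std / p_mean,
        nonuniformity_coeff=p_min / p_max,
    )


if __name__ == "__main__":
    # Made-up four-hour curve, used only to exercise the function.
    print(load_curve_indicators([10.0, 20.0, 30.0, 40.0]))
```

For the made-up four-sample curve, the average load is 25 kW and the peak is 40 kW, so the load factor is 25/40 = 0.625 and the loss factor is 750/1600, about 0.469. With 24 hourly samples the same function returns the peak load duration and loss duration in hours per day.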
Fig. 2. The steps of the data mining process (data selection; data transformation; data mining; results interpretation and validation; incorporation of the discovered information)

The steps of the data mining process are shown in Fig. 2. Visualization plays an important role, because it can provide a preliminary understanding of the data, as well as specific views of the results obtained with the data mining techniques.

Data mining has multiple applications in electricity distribution systems: dynamic security assessment; adaptive (control and protection) system design; load modelling and profiling; power forecasting; consumption monitoring; power network planning, etc.

It must also be mentioned that a data mining technique is used to process the large volume of information provided by the smart meters. This approach is used for power consumption management and load profiling, taking into account the effect of uncertainty on the decision making process.

IV. STUDY CASE

To highlight the utility of the proposed data mining tools, a large database was used. The information in this database is recorded by smart meters in over 800 substations from the Moldavian area, which supply electricity to many residential and tertiary customers.

Using the proposed data mining tool, all substations were analyzed taking into account some characteristics. For example, the substations supply different customers such as: residential, hospital, domestic farm, supermarket, hotel, bakery, etc.

Using the methodology presented in [14] and the proposed data mining tool, the active typical load profiles obtained for different customer categories are presented in Fig. 3. The shape of the typical load profiles has changed in recent years, so the DSO database of consumption profiles needs to be updated.

Fig. 3. Typical load profile determination using the proposed data mining tools

Table 1 shows the characteristic parameters (minimum load, peak load, average load, peak load duration Tmax, load factor T*max, coefficient of variation kV, nonuniformity coefficient α) of the active daily load curve for a hospital consumer (working and weekend day), extracted directly from the data mining tools.

Table 1. Daily load curve characteristic values extracted from the data mining software for a hospital

              Pmin [kW]   Pmax [kW]   Pmed [kW]   Tmax [h]   T*max   kV      α
Working day   18.476      53.759      31.871      14.240     0.593   0.366   0.344
Weekend day   17.952      34.962      23.202      15.987     0.666   0.187   0.515

In the following, taking into account the consumption growth, a comparative analysis is made at substation level between the real indicators (estimated using the proposed data mining tools) and those proposed by other authors [15].

Fig. 4 presents the filling factor of the active daily load curve for only four substations, which supply hospital, hotel, bakery and residential customers. This factor is known in the literature as the smoothing coefficient of the load curve, and the reported duration represents the relative value of the peak load with respect to the whole analysis period.

In [15], the average filling factor for urban MV/LV substations is 0.644. In the analyzed case, only the residential customers validate this value; in the tertiary consumption case the values are different (Fig. 4).

Fig. 4. Four substation filling factor of the active daily load curve

For energy loss determination in radial distribution networks, the loss duration method is used. The loss factor was analyzed for three substations (bakery, supermarket and hospital) from our database. From Fig. 5 it can be observed that the loss factor values estimated with the proposed data mining tool are higher than those from the Romanian literature [15]. It follows that the energy loss values in low voltage networks can be overestimated or underestimated.

Fig. 5. Loss factor comparative analysis of the active daily load curve for three substations

Taking into account the aforementioned, estimating the power losses requires access to on-line data about the substation loads in the period under study. However, since it is not practical to measure so many substation loads, we found it better to estimate power losses in real time using data mining and clustering techniques.

CONCLUSIONS

Since data mining techniques are widely used today for the analysis of large datasets stored in databases and data archives, the proposed tools can be used both by distribution system operators and by consumers.

The proposed data mining tools are an obvious candidate for assisting in such analysis of large scale power system monitoring data. The paper presents significant results obtained from cluster analysis, classification and association rules, which illustrate the applicability of data mining tools in power distribution systems.

REFERENCES

[1] Olaru C., Geurts P., Wehenkel L., “Data Mining Tools and Application in Power System Engineering”, Proceedings of the 13th Power System Computation Conference, PSCC99, pp. 324-330, 1999.
[2] Piacentini R., “Modernizing Power Grids with Distributed Intelligence and Smart Grid-Ready Instrumentation”, Proceedings of Innovative Smart Grid Technologies, Washington, USA, pp. 1-6, 2012.
[3] Tchokonte N., Narcisse Y., “Real-time Identification and Monitoring of the Voltage Stability Margin in Electric Power Transmission Systems Using Synchronized Phasor Measurements”,
Kassel University Press, 2009.
[4] Vale Z.A., Ramos C., Pinto T., Ramos S., “Data Mining Applications in Power Systems - Case-studies and Future Trends”, Seoul, pp. 1-4, 2009.
[5] Mori H., Sakatani Y., Fujino T., Numa K., “An Integrated Method of Fuzzy Data Mining and Fuzzy Inference for Short-term Load Forecasting”, Engineering Intelligent Systems Journal, Vol. 13, No. 2, pp. 73-80, 2005.
[6] Mori H., Komatsu Y., “A Hybrid Method of Optimal Data Mining and Artificial Neural Network for Voltage Stability Assessment”, Proceedings of IEEE PowerTech, CD-ROM, Petersburg, Russia, 2005.
[7] Krishna B., Kaliaperumal B., “Efficient Genetic-wrapper algorithm based data mining for feature subset selection in a power quality pattern recognition application”, The International Arab Journal of Information Technology, Vol. 8, No. 4, pp. 397-405, 2011.
[8] Gandhi K., Bansal H.O., “Smart Metering in electric power distribution system”, International Conference on Control, Automation, Robotics and Embedded Systems (CARE), pp. 1-6, 2013.
[9] Shahinzadeh H., Hasanalizadeh-Khosroshahi A., “Implementation of Smart Metering Systems: Challenges and Solutions”, TELKOMNIKA Indonesian Journal of Electrical Engineering, Vol. 12, No. 7, pp. 5104-5109, July 2014.
[10] Leite D. R. V., Lamin H., de Albuquerque J. M. C., Camargo I. M. T., “Regulatory Impact Analysis of Smart Meters Implementation in Brazil”, Proc. of Innovative Smart Grid Technologies (ISGT), Washington, DC, USA, 2012.
[11] Frawley W. J., Piatetsky-Shapiro G., Matheus C. J., “Knowledge Discovery in Databases: An Overview”, AI Magazine, Vol. 13, No. 3, 1992.
[12] Jia N., Wang J.S., Li N., “Application of data mining in intelligent power consumption”, International Conference on Automatic Control and Artificial Intelligence (ACAI 2012), pp. 538-541, 2012.
[13] Neagu B., Georgescu Gh., Elges A., “Monitoring System of Electric Energy Consumption to Users”, Int. Conf. of Electr. and Power Engng., Iasi, pp. 265-270, 2012.
[14] Neagu B., Georgescu Gh., “The Load Curves Profiling and their Parameters of Different Consumer Categories Supplied from Electric Energy Repartition and Distribution Systems”, Buletinul Institutului Politehnic din Iaşi, Tomul LVII (LXI), F. 4, pp. 167-178, 2011.
[15] Albert H., Mihailescu A., “Pierderi de putere si energie in retelele electrice. Determinare. Masuri de reducere”, Ed. Tehnica, Bucuresti, 1997.