A Data Mining Solution for Small & Medium Business

Li Jun1, Li Xingsen1
1. Management School of Graduate University of Chinese Academy of Sciences, P.R. China, 100080
2. Graduate University of Chinese Academy of Sciences, P.R. China, 100080

Abstract
Data mining has been used in many large-scale enterprises. However, it is widely believed that Small & Medium Businesses (SMBs) have neither the need for data mining nor enough data to mine. Based on our working experience in enterprises and an analysis of the relationship between data mining and information systems, we argue that data mining is even more valuable for SMBs. Accordingly, a data mining model for SMBs is presented and a detailed working process is illustrated. The model collects experts' experience from daily work, combines human expertise with knowledge obtained from data mining, and avoids detours on the road to enterprise informatization. Its applications in companies show that this solution can improve data quality, makes it possible to carry out data mining projects in enterprises with low-quality data, and accelerates the development of decision support systems in SMBs.

Key words: Data mining, Knowledge management, Small & Medium Business, Extenics, Application model

1 Introduction

Recently, data mining has been widely accepted as a knowledge resource, especially in large enterprises, government, and financial departments [1,2]. Because the information systems of SMBs are still in the planning phase, are incomplete, or have quite small databases, it is commonly thought that there is no need for data mining techniques; in other words, that simple statistics and analysis are enough. Actually, compared with large enterprises, SMBs are in an inferior position: they face fiercer competition and have less information.
In this situation, they urgently need techniques that support decision-making in the differentiated management of market positioning, selling, production, storage, etc. Furthermore, SMBs may face an “Able Person” crisis, in which the organization depends completely on one able person or the employees are divided among themselves. The crisis breaks out when the able person decides to leave the company. A 1998 survey of European firms by KPMG Peat Marwick found that almost half of the companies had suffered a significant setback from losing key staff, 43% had experienced impaired client or supplier relations, and 13% had faced income loss because of the departure of a single employee [3]. To change the Able Person System into a standard management system, knowledge management has become a focus of research [3], because converting outstanding employees' tacit knowledge into explicit knowledge is difficult and takes a long time.

Data mining is recognized as one of the most important information resources, especially for SMBs. It can help them establish a knowledge base during the development phase and avoid mistakes, benefit from the informatization process as early as possible, improve the quality of the data in their information systems, and make the right decisions. It can also bring added value from data services and new sources of revenue. However, present data mining software is expensive, and the data of many SMBs are of poor quality or do not exist at all; therefore most SMBs can neither afford data mining nor carry it out.

The purpose of this paper is to propose a data mining application model for SMBs. The rest of the paper is organized as follows. In Section 2, we analyze the current situation of SMBs and the problems they face. Section 3 presents a series of specific solutions by which SMBs can carry out a data mining project step by step.
In Section 4, we apply our model to a real application to verify its effectiveness, with satisfactory results. The paper is summarized in Section 5.

2 Analysis of Small & Medium Business

The quality of information systems in SMBs varies widely because of a lack of capital, techniques, and human resources. At present, SMBs fall into four situations with respect to their information systems:

2.1 No information system

Quite a lot of SMBs lack information systems. Fortunately, they are already aware of the importance of management information systems, or are even planning to establish them. Their main concerns are which kinds of information systems to adopt, how to implement them, and what they will cost and deliver.

2.2 Incomplete information system

About 57% of SMBs (statistics from the Manufacture Industry Informationalization Conference, 2006, Beijing) have already implemented some management information systems, such as financial management, storage management, etc. Nonetheless, the data are neither complete nor consistent; for example, customers' business information or records of product sales are missing. In this case, the data cannot be turned into valid reports, and sometimes people must enter data by hand to create a report. Their main concerns are how to use the present information system effectively, how to turn the data into valid reports, and which system to implement in the next phase.

2.3 Complete information system, but poor data quality

About 25% of SMBs have relatively complete information systems (such as ERP, DRP, POS, SCM, CRM) that can transmit data online and create reports automatically. However, because the systems come from different suppliers, it is quite difficult to integrate the different databases, and the systems are therefore isolated from each other.
It is reported that there are many reports, but they differ from one another and none is reliable enough for making the right decisions (according to both speeches on information systems at the Manufacture Industry Informationalization Conference, 2006, Beijing, and popular sayings on the web). At present, data quality has become the bottleneck in supporting decision-making and prevents the information systems from working effectively. The most important questions for these enterprises are how to use the management information systems effectively, turn the information debt into revenue, bring value to decision-making, and promote business growth.

2.4 Complete information system and relatively good data quality

Fewer than 10% of SMBs have complete information systems and relatively good data quality. They already use databases, OLAP, etc. for preliminary data analysis. However, these analyses emphasize the display of historical data and are of little use in guiding future business and decisions. What these enterprises care about most is how to make decision-making more scientific and gain more value from the data.

Generally speaking, a lack of talent and capital, fierce competition, and a short-term, benefit-oriented business model are common problems for SMBs. Some enterprises have never even heard of data mining. Those that know about it have little confidence in carrying out a data mining project, which they expect to require large investment, advanced techniques, and a long time.

3 Application solutions of data mining in SMBs

Extension Theory, established in 1976 by Prof. Wen Cai in China, is a discipline that studies the extensibility of things and the laws and methods of exploitation and innovation, in order to solve all kinds of contradictory problems in the real world with formalized models [4]. Extension theory uses matter-elements, affair-elements and relation-elements to describe matters, affairs and relations.
From the viewpoint of matter-element analysis in Extension theory, a matter can be divided into two parts: an imaginary part and a real part in terms of its material nature, or a soft part and a hard part in terms of its system nature. This is called the conjugate nature of matter-elements. Every matter is the unity of its real part and its imaginary part. It is said that the real part is the base and the imaginary part is what we use [5]. According to Extension theory, data mining is also conjugate: the real part is the data mining technique and software tools, and the imaginary part is the thinking and methodology of data mining. As usual, the imaginary part plays a very important role. Based on the integration of the real and imaginary parts, we design the following solutions for SMBs:

3.1 Establishing information system planning based on data mining in SMBs which have no information system

Figure 1 Information system planning based on data mining

The process includes 5 steps:
1. Defining the business objects to be supported by decision-making, according to corporate strategy and the competitive environment.
2. Making the data map, based on the data required by the business objects.
3. Choosing the software to implement, the order of application, etc., according to the information system plan derived from the business objects and the data map.
4. Carrying out the planned information system and accumulating data.
5. Starting the data mining project, once the accumulated data reach the required amount, to obtain mined conclusions for decision-making.

In this way, the enterprise can avoid mistakes and make rational decisions within a short time. Meanwhile, the information system is updated continually and gradually perfected.
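The planning steps of 3.1 can be illustrated with a minimal Python sketch. It is not taken from the paper: the business objects, data items, system names, and the ranking rule (implement first the system that serves the most business objects) are all hypothetical assumptions chosen only to show how a data map might drive the implementation order and the decision to start mining.

```python
from collections import Counter

# Hypothetical data map (steps 1-2): business object -> data items it requires.
DATA_MAP = {
    "customer segmentation": {"customer profile", "sales records"},
    "inventory forecasting": {"sales records", "stock levels"},
    "promotion targeting": {"customer profile", "sales records"},
}

# Hypothetical assignment of each data item to the system that would record it.
DATA_SOURCE = {
    "customer profile": "CRM",
    "sales records": "POS",
    "stock levels": "ERP",
}


def plan_systems(data_map, data_source):
    """Step 3 (assumed rule): rank systems by how many business objects need them."""
    support = Counter()
    for needed_items in data_map.values():
        # Count each system once per business object, however many items it supplies.
        support.update({data_source[item] for item in needed_items})
    # Most widely needed first; ties are broken alphabetically.
    return sorted(support, key=lambda name: (-support[name], name))


def ready_to_mine(record_counts, required_rows):
    """Step 5 gate: every required data item has reached its required volume."""
    return all(record_counts.get(item, 0) >= rows
               for item, rows in required_rows.items())


order = plan_systems(DATA_MAP, DATA_SOURCE)
print(order)  # ['POS', 'CRM', 'ERP']
```

In practice the ranking would also weigh cost and implementation risk; the sketch only shows that the order of systems can be derived from the business objects rather than chosen first.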
3.2 Enterprises which have incomplete information systems and need to improve them, especially from the data mining aspect

Figure 2 Perfecting the information system based on data mining

The process includes 5 steps. Step 1 and Step 2 are the same as in 3.1; the others are as follows:
Step 3. Identifying whether the present information system can satisfy the requirements of data mining, based on the business objects and the data map. If not, choosing the complementary software systems and the order in which to apply them.
Step 4. Carrying out the complementary information system and accumulating data.
Step 5. Starting the data mining project, once the accumulated data reach the required amount, to obtain mined conclusions for decision-making.

In this way, organizations can perfect their information systems efficiently according to the requirements of enterprise decision-making; in other words, the effect shows within a very short time.

3.3 SMBs which have complete information systems but poor data quality and need to improve the quality of data via data mining

Figure 3 Data mining consultancy improves data quality [6]

Paper [6] gives a detailed solution for improving the accuracy and integrity of the data. It requires identifying the distance between the present data and the target data from the data mining aspect, supplying data mining consulting (including data mining, data quality analysis, and suggestions from the data mining aspect, etc.), and adjusting the data structure, the way of storage and integration, the length of time data are retained, etc. By repeating data mining experiments and taking improvement measures, the differences are diminished and high-quality data take the place of poor data. As data quality improves gradually, it may eventually change sharply.
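The idea of a distance between the present data and the target data can be made concrete in many ways. The Python sketch below uses per-field completeness as a simple stand-in; it is an assumption for illustration, not the extension-set measure of [6], and the field names and target values are hypothetical.

```python
def completeness(records, field):
    """Share of records in which `field` is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for record in records if record.get(field) not in (None, ""))
    return filled / len(records)


def quality_gap(records, targets):
    """Per-field shortfall between target and actual completeness (0.0 = target met)."""
    return {field: max(0.0, target - completeness(records, field))
            for field, target in targets.items()}


# Hypothetical customer records with missing values.
customers = [
    {"email": "a@example.com", "age": 34},
    {"email": "b@example.com", "age": None},
    {"email": "c@example.com", "age": ""},
    {"email": "d@example.com", "age": 51},
]

# Hypothetical targets: the completeness each field needs before mining.
gaps = quality_gap(customers, {"email": 1.0, "age": 0.9})
for field, gap in gaps.items():
    print(f"{field}: gap {gap:.2f}")
```

Recomputing such gaps after each round of consulting and improvement gives a simple way to track whether the differences are in fact diminishing.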
Once the conclusions of data mining benefit business decision-making, senior management will pay more attention to data accuracy and take effective measures that boost information system development, such as increasing investment, rectifying management, emphasizing data analysis, etc. With these measures, we can increase the demand for data, integrate more data, deal with the relevant quality issues, and move on to the next phase of data mining consulting and implementation. This spiral, cyclic implementation not only drives the transformation from unminable data to minable data, but also enhances the quality of corporate informatization as a whole.

3.4 Carrying out a data mining project directly in SMBs which have complete information systems and relatively good data quality

Data mining projects in SMBs can adopt single-function data mining software, such as the MCLP software [7]. At the same time, the enterprise should establish a knowledge platform, accumulate knowledge, and change the Able Person System.

Figure 4 Implementation solution of a data mining project

In addition, SMBs that have a wide range of business but insufficient data can mine internet information to gain knowledge of suppliers, customers, the competitive environment, etc.

4 Case studies

The numbers of registered customers and ordinary visitors of an internet company have increased rapidly since it was established. The company provides ever richer information and more products, and the data accumulated by each business unit have become more abundant as well. These data need to be analyzed and mined so that they become useful for the organization's future development. However, OLAP statistical analyses are largely descriptive and do not reveal the rules and the business value behind the data. It is therefore very difficult to know the intrinsic relations among the data, understand the real demands of customers, or forecast future requirements.
The reason could be that the information supplied by customers is incomplete and has low validity and reliability. For this reason, many data mining companies were unwilling to cooperate with this company. To learn the characteristics and real requirements of its clients and to develop corresponding products as soon as possible amid increasingly fierce competition, the company cooperated with us to solve the problem. Our team analyzed the operation and customer data in depth by means of Extension theory and rich data mining experience, and proposed using data mining consulting to improve data quality and carrying out the project in phases. At present, cheaper software based on a multi-objective linear programming method is being used for experimental data mining. Some preliminary conclusions have been drawn, with good effect.

5 Conclusions

Based on Extension theory, this paper extends the scope of data mining to SMBs which have poor data quality or even no information system, and supplies specific solutions for different kinds of enterprises. In practice, data mining can contribute a great deal to SMBs. It can collect knowledge and improve the quality of decision-making, and finally convert the Able Person System into a standard management system to enhance the organization's competitive power. Nonetheless, there are still some limitations in implementing data mining projects. For instance, the cost of integrating and updating the software can be high, personnel have little ability to maintain it, the requirements for mining internet data are quite strict, and so on. We are confident that solutions can be found through joint study of these problems.
Acknowledgment

This research has been partially supported by Key Project #70531040, National Natural Science Foundation of China; #70472074, #70501030, National Natural Science Foundation of China; 973 Project #2004CB720103, Ministry of Science and Technology, China; and BHP Billiton Co., Australia.

Corresponding author: Xingsen Li, e-mail: [email protected].

References
[1] J. Han, M. Kamber, Data Mining: Concepts and Techniques (2nd ed.), Morgan Kaufmann, 2006.
[2] Y. Shi, Data mining, in: M. Zeleny (Ed.), IEBM Handbook of Information Technology in Business, International Thomson Publishing, England, 2002.
[3] Maryam Alavi, Dorothy E. Leidner, Review: Knowledge management and knowledge management systems: Conceptual foundations and research issues, MIS Quarterly, Vol. 25, No. 1, Mar 2001: 107-136.
[4] W. Cai, C.Y. Yang, et al., A New Cross Discipline - Extenics, Science Foundation in China, 2005, 13(1): 55-61.
[5] W. Cai, Extension theory and its application, Chinese Science Bulletin, 1999, 44(17): 1538-1548.
[6] Li Xing-sen, Shi Yong, Li Ai-hua, Application Study on Enterprise Data Mining Solution Based on Extension Set, Journal of Harbin Institute of Technology, 7(2006): 1124-1128 (in Chinese).
[7] Gang Kou, Xiantao Liu, Yi Peng, Yong Shi, Morgan Wise and Weixuan Xu, Multiple criteria linear programming approach to data mining: models, algorithm designs and software development, Optimization Methods and Software, Vol. 18, No. 4, August 2003: 453-473.