A Data Mining Solution for Small & Medium Business
Li Jun1 Li Xingsen1
1. Management School of Graduate University of Chinese Academy of Sciences, P.R.China,
100080
2. Graduate University of Chinese Academy of Sciences, P.R.China, 100080
Abstract Data mining has been used in many large enterprises. However, it is commonly believed that
Small & Medium Businesses (SMBs) have neither the need for data mining nor enough data to mine. Based
on our working experience in enterprises and an analysis of the relationship between data mining and
information systems, we argue that data mining is even more valuable for SMBs. Accordingly,
we present a data mining model for SMBs and illustrate a detailed working process. The model captures
experts' experience from daily work, combines human expertise with knowledge obtained from data mining,
and avoids detours on the way to enterprise informatization. Its applications in companies
show that this solution can improve data quality, make data mining projects feasible in enterprises
with low-quality data, and accelerate the development of decision support systems in SMBs.
Key words Data mining, Knowledge management, Small & Medium Business, Extenics, Application
model
1 Introduction
In recent years data mining has been widely accepted as a knowledge resource, especially in large
enterprises, government, and the financial sector [1,2]. Because the information systems of Small &
Medium Businesses (SMBs) are often still in the planning phase, incomplete, or backed by quite small
databases, it is commonly thought that SMBs have no need for data mining techniques and that simple
statistics and analysis are enough. In fact, compared with large enterprises, SMBs are in a weaker
position: they face fiercer competition and have less information. They therefore urgently need
techniques that support decision-making in the differentiated management of market positioning, sales,
production, storage, etc. Furthermore, SMBs may suffer an "Able Person" crisis, in which the
organization depends completely on one capable person or the employees are divided among themselves.
Such a crisis can break out when the able person decides to leave the company. A 1998 survey of
European firms by KPMG Peat Marwick found that almost half of the companies had suffered a significant
setback from losing key staff, 43% had experienced impaired client or supplier relations, and 13% had
faced income loss because of the departure of a single employee [3]. To change the Able Person System
into a standard management system, knowledge management has become a focus of research [3], although
converting outstanding employees' invisible (tacit) knowledge into visible (explicit) knowledge can be
difficult and slow. Data mining is recognized as one of the most important information resources,
especially for SMBs. It can help them establish a knowledge base during the development phase and
avoid mistakes, benefit from the informatization process as early as possible, improve the quality of
the data in the information system, and make the right decisions. It can also bring added value from
data services and new sources of revenue. However, current data mining software is quite expensive, and
the data of many SMBs are of poor quality or do not exist at all, so most SMBs can neither afford data
mining nor carry it out.
The purpose of this paper is to propose a data mining application model for SMBs. The rest of the paper
is organized as follows. Section 2 analyzes the current situation of SMBs and the problems they face.
Section 3 presents a series of specific solutions for carrying out data mining projects step by step in
SMBs. Section 4 applies our model to a real application to verify its effectiveness, with satisfactory
results. Section 5 summarizes the paper.
2 Analysis of Small & Medium Business
The quality of information systems in SMBs varies widely because of shortages of capital, technology,
and human resources. At present, the information systems of SMBs fall into four situations:
2.1 No information system
Quite a lot of SMBs lack information systems. Fortunately, they are already aware of the importance of
information management systems, and some are even planning to establish them. Which information
systems should be implemented, how to implement them, and what they will cost and deliver are their
main concerns.
2.2 Incomplete information system
About 57% of SMBs (statistics from the Manufacture Industry Informationalization Conference, 2006,
Beijing) have already implemented some management information systems, such as financial
management or storage management. Nonetheless, the data are neither complete nor consistent; for
example, customers' business information or product sales records may be missing. In this case,
the data cannot be transformed into valid reports, and sometimes people have to enter data manually
in order to create a report. How to use the present information system effectively, how to
transform the data into valid reports, and which system should be implemented in the next phase are
the main concerns of these SMBs.
2.3 Complete information system, but poor data quality
About 25% of SMBs have relatively complete information systems (such as ERP, DRP, POS, SCM, and
CRM) that can transmit data online and create reports automatically. However, because different
systems come from different suppliers, it is quite difficult to integrate the different databases,
and the systems therefore remain isolated from each other. It is reported that there are many reports,
but they disagree with each other and none is reliable enough to support correct decisions (from both
speeches on information systems at the Manufacture Industry Informationalization Conference, 2006,
Beijing, and popular sayings on the web). At present, data quality has become the bottleneck in
decision support and prevents the information systems from working effectively. How to use the
management information systems effectively, turn the information debt into revenue, bring value to
decision-making, and promote business growth are the most important questions for these enterprises.
2.4 Complete information system and relatively good data quality
Fewer than 10% of SMBs have complete information systems and relatively good data quality. They
already use databases, OLAP, etc. for preliminary data analysis. However, such analysis focuses on
presenting historical data and is of little use for guiding future business and decisions. How to make
decision-making more scientific and gain more value from the data is what these enterprises care about
most.
Generally speaking, shortages of talent and capital, fierce competition, and a short-term,
benefit-oriented business model are common problems for SMBs. Some enterprises have never even heard
of data mining. Those that know about it have little confidence in carrying out a data mining project,
which requires large investment, advanced technology, and a long implementation time.
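The four situations above can be read as a simple decision rule. The following is a minimal sketch, assuming three yes/no questions are enough to place an enterprise; the names Profile, Situation, and classify are hypothetical and are not part of the original model.

```python
from dataclasses import dataclass
from enum import Enum


class Situation(Enum):
    NO_SYSTEM = "2.1 No information system"
    INCOMPLETE = "2.2 Incomplete information system"
    POOR_DATA = "2.3 Complete information system, but poor data quality"
    GOOD_DATA = "2.4 Complete information system and relatively good data quality"


@dataclass
class Profile:
    """Hypothetical self-assessment of an SMB (assumed, not from the paper)."""
    has_system: bool        # any management information system in place?
    system_complete: bool   # does it cover the main business processes?
    data_quality_ok: bool   # are the data complete and consistent enough?


def classify(p: Profile) -> Situation:
    """Place an enterprise into one of the four situations of Section 2."""
    if not p.has_system:
        return Situation.NO_SYSTEM
    if not p.system_complete:
        return Situation.INCOMPLETE
    if not p.data_quality_ok:
        return Situation.POOR_DATA
    return Situation.GOOD_DATA


if __name__ == "__main__":
    # An enterprise with isolated ERP/CRM systems and unreliable reports.
    print(classify(Profile(True, True, False)).value)
```

The questions are checked in order, so an enterprise without any system is never asked about data quality.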
3 Application solutions of data mining in SMBs
Extension theory, established in 1976 by Prof. Wen Cai in China, is a discipline that studies the
extensibility of things and the laws and methods of exploitation and innovation, in order to solve all
kinds of contradictory problems in the real world with formalized models [4]. Extension theory
establishes the matter-element, the affair-element, and the relation-element to describe matters,
affairs, and relations. From the viewpoint of matter-element analysis, a matter can be divided into two
parts: an imaginary part and a real part from the viewpoint of its material nature, or a soft part and
a hard part from the viewpoint of its system nature. This is called the conjugate nature of
matter-elements. Every matter is the unity of a real part and an imaginary part; the real part is the
base, and the imaginary part is what we use [5].
According to Extension theory, data mining is conjugate as well. Its real part is the data mining
techniques and software tools, and its imaginary part is the thinking and methodology of data mining.
As usual, the imaginary part plays a very important role. Based on the integration of the real and
imaginary parts, we design the following solutions for SMBs:
3.1 Planning the information system based on data mining in SMBs that have no
information system
Figure 1 Information system planning based on data mining
The process includes 5 steps:
1. Define the business objects that decision-making must support, according to corporate
strategy and the competitive environment.
2. Draw up the data map based on the data required by the business objects.
3. Choose the software to implement, the order of deployment, etc., according to the information
system plan derived from the business objects to be mined and the data map.
4. Implement the planned information system and accumulate data.
5. Start the data mining project to obtain conclusions for decision-making once the accumulated
data reach the required amount.
In this way, organizations can avoid mistakes and make rational decisions in a short time. Meanwhile,
the information system is updated continually and gradually improves.
3.2 Completing the information system based on data mining in enterprises that have
incomplete information systems
Figure 2 Perfecting the information system based on data mining
The process includes 5 steps. Steps 1 and 2 are the same as in 3.1; the others are as follows:
Step 3. Determine whether the present information system can satisfy the requirements of data mining,
based on the business objects and the data map. If not, choose the complementary software systems
and the order in which to deploy them.
Step 4. Implement the complementary information system and accumulate data.
Step 5. Start the data mining project to obtain conclusions for decision-making once the accumulated
data reach the required amount.
In this way, organizations can complete their information systems efficiently according to the
requirements of enterprise decision-making, and the effort pays off in a very short time.
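The data map of Step 2 and the check of Step 3 can be sketched as a set comparison. This is only an illustration under assumptions: the business objects, field names, and system names below are invented examples, and missing_fields is a hypothetical helper, not a tool described in the paper.

```python
# Hypothetical data map (Step 2): business object -> data fields it requires.
data_map = {
    "customer segmentation": {"customer_id", "industry", "purchase_history"},
    "sales forecasting": {"product_id", "sale_date", "quantity"},
}

# Hypothetical inventory of the fields the existing systems already record.
existing_systems = {
    "financial management": {"customer_id", "invoice_amount"},
    "storage management": {"product_id", "quantity"},
}


def missing_fields(data_map, existing_systems):
    """Step 3: list, per business object, the required fields no system supplies."""
    available = set().union(*existing_systems.values())
    gaps = {}
    for business_object, required in data_map.items():
        missing = required - available
        if missing:
            gaps[business_object] = sorted(missing)
    return gaps


if __name__ == "__main__":
    for business_object, fields in missing_fields(data_map, existing_systems).items():
        print(f"{business_object}: missing {', '.join(fields)}")
```

An empty result would suggest that the present systems already cover the data map; otherwise the missing fields indicate which complementary systems to consider first.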
3.3 Improving data quality via data mining in SMBs that have complete information systems
but poor data quality
Figure 3 Data mining consultancy improves data quality [6]
To improve the accuracy and integrity of the data, paper [6] gives a detailed solution. It requires
identifying the gap between the present data and the target data from the data mining perspective,
supplying data mining consulting (including data mining, data quality analysis, and suggestions from
the data mining perspective), and adjusting the data structure, the way data are stored and
integrated, the data retention period, etc.
By repeating the cycle of data mining experiments and improvement measures, the gap is gradually
closed and high-quality data replace the poor data. Data quality improves gradually at first and may
then improve sharply: once the conclusions of data mining benefit business decision-making, senior
management will pay more attention to data accuracy and take effective measures that boost
information system development, such as increasing investment, rectifying management, and emphasizing
data analysis. With these measures, we can increase the demand for data, integrate more data, deal
with the relevant quality issues, and move on to the next phase of data mining consulting and
implementation. This spiral, repeated implementation not only drives the transformation from
unminable data to minable data, but also enhances the quality of corporate informatization as a whole.
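One way to make the gap between present and target data concrete is to measure field completeness against a target level in each cycle. The sketch below is an assumption for illustration only: the paper and [6] do not specify this measure, and the field names, target values, and function names are hypothetical.

```python
def completeness(records, fields):
    """Share of records holding a non-empty value for each field."""
    total = len(records)
    result = {}
    for field in fields:
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        result[field] = filled / total if total else 0.0
    return result


def quality_gap(records, targets):
    """Shortfall of each field's completeness against its target (0.0 = target met)."""
    current = completeness(records, targets)
    return {f: round(max(0.0, targets[f] - current[f]), 3) for f in targets}


if __name__ == "__main__":
    # Invented customer records with typical gaps (empty strings, missing values).
    records = [
        {"customer_id": 1, "industry": "retail", "phone": ""},
        {"customer_id": 2, "industry": None, "phone": "555"},
        {"customer_id": 3, "industry": "", "phone": None},
        {"customer_id": 4, "industry": "food", "phone": "556"},
    ]
    # Assumed target completeness per field, set from the data mining perspective.
    targets = {"customer_id": 1.0, "industry": 0.9, "phone": 0.5}
    print(quality_gap(records, targets))
```

Re-running the measurement after each round of improvement measures shows whether the gap is shrinking from one cycle to the next.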
3.4 Carrying out data mining projects directly in SMBs that have complete information
systems and relatively good data quality
Data mining projects in SMBs can adopt single-function data mining software, such as the MCLP
software [7]. At the same time, the enterprise should establish a knowledge platform, accumulate
knowledge, and change the Able Person System.
Figure 4 Implementation solution of data mining project
In addition, SMBs that have a wide range of business but insufficient data of their own may mine
internet information to gain knowledge about suppliers, customers, the competitive environment, etc.
4 Case studies
The number of registered customers and ordinary visitors of an internet company has increased rapidly
since it was established. The company provides ever richer information and more products, and the data
accumulated by each business unit have become more abundant as well. These data need to be analyzed
and mined so that they become useful for the organization's future development. However, OLAP
statistical analyses are largely descriptive and do not reveal the rules and the business value behind
the data. It is therefore very difficult to discover the intrinsic relations among the data, understand
the real demands of customers, or forecast future requirements. One reason may be that the information
supplied by customers is incomplete and has low validity and reliability. For this reason, many data
mining companies were unwilling to cooperate with this company.
To understand the characteristics and real requirements of its clients and develop corresponding
products as soon as possible in the face of increasingly fierce competition, the company cooperated
with us to solve the problem. Drawing on Extension theory and rich data mining experience, our team
analyzed the company's operational and customer data in depth and proposed to use data mining
consulting to improve data quality and to carry out the project in phases. At present, relatively
inexpensive software based on a multi-objective linear programming method has been used for
experimental data mining. Some preliminary conclusions have been drawn, with good effect.
5 Conclusions
Based on Extension theory, this paper extends the scope of data mining to SMBs that have poor data
quality or even no information systems, and supplies specific solutions for different kinds of
enterprises. In practice, data mining can do a great deal for SMBs. It can collect knowledge, improve
the quality of decision-making, and finally convert the Able Person System into a standard management
system that enhances the organization's competitiveness. Nonetheless, there are still some limitations
in implementing data mining projects. For instance, the cost of integrating and updating the software
can be high, the maintenance ability of personnel is quite poor, and the requirements for mining
internet data are quite strict. We are nevertheless confident that solutions will be found through
joint study.
Acknowledgment
This research has been partially supported by Key Project #70531040, National Natural Science
Foundation of China; #70472074, #70501030, National Natural Science Foundation of China; 973
Project #2004CB720103, Ministry of Science and Technology, China; and BHP Billiton Co., Australia.
Corresponding author: Xingsen Li, e-mail: [email protected].
References
[1] J. Han, M. Kamber, Data Mining: Concepts and Techniques (2nd ed.), Morgan Kaufmann, 2006.
[2] Y. Shi, Data mining, in: M. Zeleny (Ed.), IEBM Handbook of Information Technology in Business,
International Thomson Publishing, England, 2002.
[3] M. Alavi, D. E. Leidner, Review: Knowledge management and knowledge management
systems: Conceptual foundations and research issues, MIS Quarterly, Vol. 25, No. 1, March
2001: 107-136.
[4] W. Cai, C.Y. Yang, et al, A New Cross Discipline —Extenics, Science Foundation In China,
2005, 13(1): 55-61.
[5] W. Cai, Extension theory and its application, Chinese Science Bulletin, 1999, 44(17): 1538-1548.
[6] Li Xing-sen, Shi Yong, Li Ai-hua, Application Study on Enterprise Data Mining Solution Based
on Extension Set, Journal of Harbin Institute of Technology, 7(2006): 1124-1128 (in Chinese).
[7] Gang Kou, Xiantao Liu, Yi Peng, Yong Shi, Morgan Wise and Weixuan Xu, Multiple criteria linear
programming approach to data mining: models, algorithm designs and software development,
Optimization Methods and Software, Vol. 18, No. 4, August 2003: 453-473.