Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Towards a Practical Approach to Discover Internal Dependencies in Rule-Based Knowledge Bases Roman Simiński, Agnieszka Nowak-Brzezińska, Tomasz Jach, and Tomasz Xiȩski University of Silesia, Institute of Computer Science ul. Bȩdzińska 29, 41-200 Sosnowiec, Poland {roman.siminski,agnieszka.nowak, tomasz.jach,tomasz.xieski}@us.edu.pl Abstract. In this paper, we intend to introduce the conception of discovering the knowledge about rules saved in large rule-based knowledge bases, both generated automatically and acquired from human experts in the classical way. This paper presents a preliminary study of a new project in which we are going to join the two approaches: the hierarchical decomposition of large rule bases using cluster analysis and the decision units conception. Our goal is to discover useful, potentially implicit and directly unreadable information from large rule sets. Keywords: rule knowledge bases, inference, decision unit, cluster analysis, data mining. 1 Introduction The last decade resulted in significant development of data mining methods, tools and applications. Data mining and knowledge-discovery in databases is the process of automatically searching large volumes of data for useful patterns (typically rules) [1, 2]. We will assume that the result of data mining is a set of rules, herein after referred as a knowledge base. From our point of view, the resulting knowledge base is the most interesting feature, rather than the process of acquiring rules from data, but usually we will have in mind the automatic rules generation method based on the rough set theory [3–5]. When the rule sets have been induced from data they can be applied in several ways, for example to classify the unseen cases in classifiers or to build the knowledge base of decision support systems. However, in many applications the process of rule induction is in fact discovering previously unknown knowledge, discovered rules are previously unknown and potentially surprising. Several hundred and sometimes thousands of rules is a frequent result of a data mining process applied on the real-world data sets. The dependencies between rules from different large rule sets are typically not simple and clear enough. An interesting paradox arises: data mining, which was to bring out knowledge hidden in large databases, often discovers and provides large sets of rules, J.T. Yao et al. (Eds): RSKT 2011, LNCS 6954, pp. 232–237, 2011. c Springer-Verlag Berlin Heidelberg 2011 Towards a Practical Approach to Discover Internal Dependencies 233 which contain knowledge which is hard to understand and are directly useless for domain experts. Thus, in many cases the methods of ”the nontrivial extraction of implicit ... and potentially useful information from...” ”...large data sets...” produce large rule sets with nontrivial, probably useful, but unreadable knowledge for human experts who attempt to verify the discovered knowledge. The main goal of this paper is to introduce the conception of discovering the knowledge about rules saved in large rule-based knowledge bases, both generated automatically and acquired from human experts in the classical way. This paper presents a preliminary study of a new project in which we are going to join the two approaches: the hierarchical decomposition of large rule bases using cluster analysis and the decision units conception. The proposed approach assumes that we reorganize any attributive rule knowledge base from the set of not related rules to groups of similar rules (using cluster analysis) [8] or decision units [6]. These structurs will be a source for extracting relevant information about the rules, this information is useful for the understanding of the internal structure of the knowledge base, as it describes the knowledge about knowledge saved in the rule sets. We can use this information to extract the current decision model implicitly stored in the rule sets, as well as for formal verifying and validating the rules against the expert knowledge. We plan to build a modular, hierarchically organized rule base using the cluster analysis method and decision units. These methods have been successfully used in the optimization of the inference task [6] due to the analysis of the internal properties discovered in the rule sets [8, 12]. In this work we want to introduce an extension of the clustered rules and the decision units oriented for extraction of additional knowledge from rule-based knowledge bases. The practical goal of the project is the implementation of visual-oriented software tool for knowledge engineering: a new, second version of kbBuilder system [12] and HKB_Builder system [8]. 2 Preliminaries and Basic Notation The proposed approach is dedicated to the rule knowledge bases containing the Horn clause rules, where literals are coded using attribute-value pairs. Such form of rules representation is very popular and widely used in data mining. Let be the rule knowledge base containing m rules: = {r1 , r2 , . . . , rm }, where r : l1 ∧ l2 ∧ . . . ∧ ln → c and n is the number of conditional literals in the rule r, li denotes the i-th conditional literal of r rule and c denotes the conclusion literal of r rule. Let A = {a1 , a2 , . . . an } is a non-empty finite set of conditional and decision attributes and for every a ∈ A the set Va is called the value set of a: A = {a1 , a2 , . . . an }, Va = {v1a , v2a , . . . , vka }. Each attribute in a ∈ A may be a conditional and/or a decision attribute. Let literals of the rules from become (a, via ) where a ∈ A and via ∈ Va and let notations (a, via ) and a = via be equivalent. We make no assumption about rules in the set . This set can be a result of a data mining process as well as it may be created by human expert or knowledge engineer, still it is possible 234 R. Simiński et al. that contains errors and anomalies, we will use such a rule set for forward and backward inference, which probably will create a deep inference path. 3 Decision Units as a Decision Model for Rule Base The concept of decision units originally came into existence as a tool of rule base verification, the decision units can be also considered as a simple tool for rule knowledge base modeling. The concept of a decision unit is described in the [10– 12] and due to the limited space in this paper can not be presented in detail. The idea of decision units allows us to divide a set of rules into subsets according to a simple criterion: dividing by the elementary decision (the rules contained in the subset have exactly this same attribute-value pair (a, v) in the conclusion), dividing by the decision (the rules contained in the subset have exactly this same attribute a in the each attribute-value pair (a, v) in the conclusion) or dividing by the general decision (like above, but we do not take into account the information about values appearing in the conclusions of the rules). The decision unit is the triple (I, O, R) where R is the subset of rules (R ⊆ ) created according to one of the three criteria described above. O denotes the set of the output entries of the decision unit, output entries are conclusions of rules from R, I denotes the set of the input entries of decision unit, input entries are the conditions of rules from R [12]. According to the selected rules dividing criterion, we obtain three kinds of decision units [11]: elementary decision units — denoted as triple U e = (I e , Oe , Re ), ordinal decision units — denoted as triple U = (I, O, R), general decision units, related to ordinal decision units, but only information about attributes are considered, denoted as triple U g = (I g , Og , Rg ). In the case of the two kinds of decision units — elementary and ordinal — sets I e , I and Oe , O contain attribute-value pairs (a, v). For the general decision units we consider only the attributes, therefore I g and Og are the sets of attributes. An attribute usually represents in knowledge base particular concept or property from the real word. The value of an attribute express the determined meaning of the concept or state of the property. If we consider a specified (a, v) pair, we think about a particular kind of concept or about a particular property state. Because in the considered case the pair (a, v) is a conclusion literal, attribute a is a decision attribute. On the high level of abstraction, an elementary decision i tn the decision table [4]. units De is similar to i-th decision class XA e e O represent concrete, the R describes reasoning about such concrete: we can confirm (or not) hypothesis about concrete using goal-driven backward inference or we can check the availability of concrete based on currently know fact using data-driven forward inference. We can say that an elementary decision units is a model of elementary decision about concrete from real-word concepts. An ordinary decision can be considered as the model of decision making about particular concept or property from the real word. We consider decision about concepts to be on higher level of abstraction than decision about meaning of concepts. Thus the rule set R can be constructed as the composition of elementary decision units De . A decision unit is similar to decision table DT = (U, A ∪ {d}) with single decision attribute d. Towards a Practical Approach to Discover Internal Dependencies 235 Sometimes it is interesting which relations the particular attribute a depends on. It is especially useful when we attempt to discover hidden decision model in an extensive data set which is not known at the early stage. Connections between attributes can express the relations between concepts. For this reason, we introduce general decision units U g . The connections between conclusion and conditional literals are usually hidden in rule bases and only during an inference we discover how sometimes deep the inference chains are. Such knowledge bases generate a few decision units with connections between input and output entries — in this way we obtain a decision units net. An example of the usage of the properties of decision units in the inference optimization task is presented in [6, 7], same issues concerning rule base modeling contains [12]. 4 Rules Clusters as a Decision Model for Rule Base The clustering techniques have to be used whenever the size of input data exceeds capabilities of the traditional methods of data analysis. This is the case especially when we deal with a problem of large number of rules in knowledge bases in modern typical decision support system, we should cluster rules in order to increase the overall efficiency. Effective inference in such structures can be a tremendous task usually overwhelming classical systems. This is why the authors try to find some previously unknown connections in the rules which can aid to create more effective systems. At first, the rules are organized into groups of similar rules. Clustering is considered optimal if each cluster consists of very similar rules and if different clusters are easily distinguished with every another cluster. There is a vast number of methods to achieve this goal, among them there are the hierarchical and the non-hierarchical clustering algorithms. Both try to divide the datasets into the clusters, but the first one produces the hierarchical structure of the rules. By organizing data into the clusters, several additional benefits are achieved. Primo, additional information about analyzed data is acquired. The groups of rules can be further analyzed in order to observe the additional patterns which can be used to simplify the structure of the knowledge base either by simplifying the rules or by reducing the number of them. This data would not be found if the division into groups was not established. Secundo, clustering rules is the faster method of finding relevant rules in the knowledge base. It is an undeniable fact that the cluster analysis methods allow to look for the relevant rule much faster than browsing through the entire knowledge base. In this case, the hierarchical methods are better because of the possibility of finding the relevant rule by traversing through the tree of rules made by the hierarchical algorithm: AHC as well as mAHC [6–8]. Every level of a tree produces more accurate answer and, additionally, the number of comparisons needed is much less than in the classical systems. The concept of new hierarchical model of the knowledge base is described in [6–8] and due to the limits of this paper can not be presented in detail. Let us just remind that it is represented by T ree = {w1 , · · · , w2n−1 } or T ree = {w1 , · · · , wk } where 236 R. Simiński et al. k ≤ 2n − 1 (if we apply the mAHC algorithm which creates a smaller tree). It is essential to note that each node in that tree: wi = {di , ci , f, i, j} (where i, j are numbers of clustered groups) is the representative of the conditional parts ci as well as the decisions di of the rules belonging to it. Also the similarity values f : × → R|[0 · · · 1] of rules forming the cluster wi to each other is stored. Clustering the rules from the knowledge base allows us to agglomerate a set of the rules into the subsets according to a simple criterion of their inner similarity. Apparently, each representant ci consists of all (or the most frequent) literals ci : l1 ∧ l2 ∧ . . . ∧ lk → dj of the conditional parts of the rules clustered in a cluster i (k is the number of conditional literals). At the first step of the algorithm, having the set of rules from knowledge base (), each rule ri (rj ∈ ) represents a node wi in created T ree structure. In the second step, two most similar rules (on the basis of comparison the set of conditional attributes and their values) are found and linked in a cluster, which creates a new node at the higher level of the T ree. We may say that all clusters from nodes that are not leaves represent some kind of concepts. That is why the description of the rules forming a cluster is just a connection of all (or the most frequent) pairs (a, v) from conditional part of those rules. The process ends when there is only one cluster with every rule from the system in it (AHC) or at the desired, previously set level (mAHC)[6, 8]. Having rules organized into clusters shortens the time needed to find relevant ones. Because the created structure is a kind of binary tree, we can state that the time efficiency of searching such trees is O(log2 n). It means that the rule interpreter in a given decision support system does not have to search the whole knowledge base in the time efficiency of O(n) (like it is for the classical knowledge bases). However, in order to have hierarchical structure computed (needed for fast lookup of queries), firstly the clustering algorithm has to be executed. Time complexity of typical hierarchical algorithms varies, but is often a member of O(n2 ) class (noticeable fact: one must execute this algorithm only once). 5 Conclusions The main purpose of this study was to introduce a foundations of practical approach to the internal dependencies discovery in rule-based knowledge bases. We have presented the motivation and objectives of the project, next we have introduced the concept of the representation of knowledge using the clusters analysis and the decision unit. We plan to build a modular, hierarchically organized rule base using these methods. We paraphrase the classical definition of data mining and we shall to introduce the the term knowledge mining — the clusters of rules can be analyzed in order to discover the hidden patterns in the knowledge base, the decision units allow us to discover, verify and improve the current decision model. These methods have been successfully used in the optimization of the inference task. We believe that the techniques described in this paper will allow us to extract some kind of the new, metaknowledge from the large knowledge bases. This information will be useful both in knowledge engineering and in the Towards a Practical Approach to Discover Internal Dependencies 237 utilisation of rule bases within the decision support systems. In the first case we expect that the proposed approach will result in improved the quality of knowledge base, in the second case the groving of efficiency of inference is expected. The practical goal of the project is the implementation of visual-oriented software tool for knowledge engineering: a new, second version of kbBuilder system and HKB_Builder system. References 1. Frawley, W., Piatetsky-Shapiro, G., Matheus, C.: Knowledge Discovery in Databases: An Overview. AI Magazine, 213–228 (Fall 1992) 2. Hand, D., Mannila, H., Smyth, P.: Principles of Data Mining. MIT Press, Cambridge (2001) 3. Nguyen, H.S., Skowron, A.: Rough Set Approach to KDD (Extended Abstract). In: Wang, G., Li, T., Grzymala-Busse, J.W., Miao, D., Skowron, A., Yao, Y. (eds.) RSKT 2008. LNCS (LNAI), vol. 5009, pp. 19–20. Springer, Heidelberg (2008) 4. Skowron, A., Komorowski, H.J., Pawlak, Z., Polkowski, L.T.: A rough set perspective on data and knowledge. In: Handbook of Data Mining and Knowledge Discovery, pp. 134–149. Oxford University Press, Oxford (2002) 5. Moshkov, M., Skowron, A., Suraj, Z.: On Minimal Rule Sets for Almost All Binary Information Systems. Fundamenta Informaticae 80 (2007) 6. Nowak, A., Simiński, R., Wakulicz-Deja, A.: Towards modular representation of knowledge base. In: Advances in Soft Computing, pp. 421–428. Physica-Verlag, Springer Verlag Company, Heidelberg (2006) 7. Nowak, A., Simiński, R., Wakulicz-Deja, A.: Two-way optimizations of inference for rule knowledge bases. In: Proceedings of International Conference CSP 2008, Concurrency, Specification And Programming, September 29-October 1, vol. 3, pp. 398–409. Humboldt-Universität, Berlin (2008) 8. Nowak, A., Wakulicz-Deja, A.: The way of rules representation in composited knowledge bases. In: Advanced In Intelligent and Soft Computing, Man - Machine Interactions, pp. 175–182. Springer, Heidelberg (2009) 9. Nowak-Brzezińska, A., Jach, T., Xiȩski, T.: Wybór algorytmu grupowania a efektywność wyszukiwania dokumentów. Studia Informatica, Zeszyty Naukowe Politechniki Śląskiej 31(2A(89)), 147–162 (2010) 10. Simiński, R., Wakulicz-Deja, A.: Verification of Rule Knowledge Bases Using Decision Units. In: Advances in Soft Computing, Intelligent Information Systems, pp. 185–192. Physica Verlag, Springer Verlag Company, Heidelberg (2000) 11. Simiński, R., Wakulicz-Deja, A.: Decision units as a tool for rule base modeling and verification. In: Advances in Soft Computing, Information Processing and Web Mining, pp. 553–556. Springer Verlag Company, Heidelberg (2003) 12. Simiński, R.: Decision units approach in knowledge base modeling. In: Recent Advances in Intelligent Information Systems, pp. 597–606. Academic Publishing House EXIT, New Jersey (2009)