ATHABASCA UNIVERSITY
KNOWLEDGE UNCERTAINTY IN INTELLIGENT SYSTEM
BY
SHIKHA SHARMA
An essay submitted in partial fulfillment
Of the requirements for the degree of
MASTER OF SCIENCE in INFORMATION SYSTEMS
Athabasca, Alberta
October, 2010
©Shikha Sharma, 2010
DEDICATION
I would like to dedicate this essay to my parents, and my siblings Rinki and Anshul,
who have always been a source of encouragement and motivation to me. Without
their continued love and support, this would not have been possible.
ABSTRACT
One of the prominent questions in the field of artificial intelligence is "how to deal with knowledge uncertainty?" Uncertainty is a fundamental and inevitable feature of daily life; it is a central topic in many domains such as economics, artificial intelligence, and logic. Management of uncertainty is an essentially important issue in the design of an intelligent system. Various uncertainty models are available to deal with uncertainty: Fuzzy logic, Rough set theory, Multi-valued logic, and Bayesian networks. Uncertainty can be found in many different Information Technology applications such as semantic web services and data mining. These applications are used in day-to-day life, where modeling and reasoning with uncertainty is primordial; this makes it critical to have excellent measures in place to deal with uncertainty. For an intelligent system to deal with this uncertainty, there has to be a structured soft-computing framework in place which allows it to accomplish this goal. The essence of designing an intelligent system lies in its ability to effectively control an object in a dynamic environment under the influence of uncertainty. Hybridization of soft computing techniques gives hybrid intelligent systems a cutting edge. The design and architecture play a central role in the success of an intelligent system. At the design level, dealing with uncertainty at the object, environment, and goal levels helps to deal with uncertainty at the architecture level. Therefore, having the right design and architecture for an intelligent system defines its success. ANFIS is an excellent example of an intelligent system based upon the hybridization of neural networks and fuzzy logic, useful in suppressing the maternal ECG when extracting the fetal ECG. An intelligent system that is implemented to handle uncertainty can handle real-world situations more accurately and effectively than a system where uncertainty is fully ignored.
ACKNOWLEDGEMENTS
I am heartily thankful to my supervisor, Larbi Esmahi whose encouragement,
guidance and support from the initial to the final level enabled me to develop an
understanding of the subject.
Very special thanks to my Mom, Dad, Rinki and Anshul for providing me with the
support during this journey. I would also like to thank wonderful friends for their
continued support and encouragement.
TABLE OF CONTENTS
INTRODUCTION .......................................................... 1
1.1 Background .......................................................... 1
1.2 Statement of Purpose ................................................ 3
1.3 Research Problem .................................................... 3
1.4 Organization of Thesis .............................................. 4
REVIEW OF RELATED LITERATURE ............................................ 5
2.1 Classical Theory .................................................... 5
2.2 Fuzzy Logic ......................................................... 7
2.2.1 Characteristics of Fuzzy Logic .................................... 9
2.2.2 Features of Fuzzy Logic ........................................... 9
2.2.3 Deduction Process ................................................. 10
2.2.4 Membership Function ............................................... 11
2.2.5 Advantages ........................................................ 12
2.2.6 Disadvantages ..................................................... 12
2.2.7 Applications ...................................................... 12
2.2.8 Future Work ....................................................... 13
2.3 Rough Set ........................................................... 13
2.3.1 Basic Concept ..................................................... 15
2.3.2 Advantages ........................................................ 19
2.3.3 Disadvantages ..................................................... 19
2.3.4 Future Work ....................................................... 19
2.4 Multi-Valued Logic .................................................. 20
2.4.1 Approximate Reasoning with Linguistic Modifiers ................... 24
2.4.2 Synthesis of Multi Valued Logic ................................... 25
2.4.3 Future Work ....................................................... 27
2.5 Bayesian Network .................................................... 27
2.5.1 Independence Assumptions .......................................... 28
2.5.2 Consistent Probabilities .......................................... 29
2.5.3 Constraints ....................................................... 30
2.5.5 Applications ...................................................... 32
2.5.6 Advantages ........................................................ 33
2.5.7 Disadvantages ..................................................... 33
UNCERTAINTY MODELS IN APPLICATIONS ...................................... 34
3.1 Data Mining ......................................................... 34
3.1.1 Background ........................................................ 34
3.1.2 Characteristics of Data Mining .................................... 36
3.1.3 Data Mining and Uncertainty ....................................... 37
3.1.4 Fuzzy Logic Uncertainty Model ..................................... 39
3.1.5 Applications ...................................................... 43
3.2 Semantic Web Services and Uncertainty ............................... 44
3.2.1 Background ........................................................ 44
3.2.2 Semantic Web Services ............................................. 45
3.2.3 Uncertainty in Semantic Web Services .............................. 48
3.2.4 Fuzzy Logic Uncertainty Model ..................................... 51
SOFT COMPUTING FOR INTELLIGENT SYSTEM: DESIGN AND ARCHITECTURE .......... 57
4.1 Soft-computing for Intelligent Systems .............................. 57
4.1.1 Main Components of Soft Computing ................................. 58
4.1.2 Characteristics of Soft Computing ................................. 59
4.2 Design of Intelligent Systems with Uncertainty ...................... 61
4.2.1 Main Aspects of Design ............................................ 62
1. Uncertainty in Objects ............................................... 62
2. Uncertainty in Surrounding Environment ............................... 62
3. Uncertainty in Expected Functionality ................................ 63
4.2.2 Design Framework .................................................. 64
1. Fuzzy Logic .......................................................... 65
2. Evolutionary Artificial Neural Networks .............................. 66
1. Evolution introduced at weight training level ........................ 67
2. Evolution introduced at the architecture level ....................... 67
3. Evolution introduced at the learning level ........................... 68
4.2.3 Selection of Appropriate Design ................................... 69
4.3 Architecture of Intelligent System with Uncertainty ................. 70
4.3.1 Architecture for Intelligent System ............................... 70
4.3.2 Architecture for Hybrid Intelligent System ........................ 71
4.3.3 Evolutionary Algorithm Architecture ............................... 75
4.3.4 Application: Suppression of Maternal ECG from Fetal ECG ........... 76
CONCLUSION AND RECOMMENDATIONS .......................................... 84
5.1 Conclusion .......................................................... 84
5.2 Future Work ......................................................... 86
REFERENCES .............................................................. 88
LIST OF TABLES
Table 1: Candidate Data ......................................................................................... 15
Table 2: Building Phase [72] ................................................................................... 55
Table 3: Utilization Phase [72] ................................................................................ 56
LIST OF FIGURES
Figure 1: D-connecting Paths [23]........................................................................... 29
Figure 2: Connected Networks................................................................................ 31
Figure 3: Overview Steps in Knowledge Discovery of Databases [42].................... 35
Figure 4: Data Mining [80] ....................................................................................... 38
Figure 5: Fuzzy Logic in Data Mining [70] ............................................................... 42
Figure 6: Web Services & Semantic Web Services [67] ......................................... 45
Figure 7: Semantic Web (Detailed) [66] .................................................................. 46
Figure 8: Web Services Framework [72] ................................................................. 51
Figure 9: Relation between soft computing and other fields [73] ............................ 60
Figure 10: Basic Architecture for Intelligent Systems .............................................. 71
Figure 11: Sequential Type of Architecture ............................................................. 72
Figure 12: Parallel Type of Architecture .................................................................. 73
Figure 13: Feedback Type of Architecture .............................................................. 74
Figure 14: Evolutionary Intelligent System Architecture [73] ................................... 76
Figure 15: Basic Configuration of a Fuzzy Logic System [89] ................................. 79
Figure 16: Maternal ECG Cancellation in Abdominal Signal using ANFIS [87] ....... 81
CHAPTER 1
INTRODUCTION
“As a general principle, the uncertainty of information in the knowledge base
will induce some uncertainty in the validity of its conclusions. These systems
possess nontrivial inferential capability and in particular, have the capability
to infer from premises which are imprecise, incomplete or not totally reliable.”
- Prof. Lotfi A. Zadeh
1.1 Background
One of the prominent questions in the field of artificial intelligence is “how to deal
with knowledge uncertainty?” Uncertainty is a fundamental and inevitable feature of
daily life; it is a central topic in many domains such as economics, artificial
intelligence, and logic.
Its definition varies in a number of fields, including philosophy, physics, statistics, economics, finance, insurance, psychology, sociology, engineering, and information science [3]. A more specific definition of uncertainty, given by Doug Hubbard, is: "the lack of certainty, a state of having limited knowledge where it is impossible to exactly describe existing state or future outcome, more than one possible outcome [3]."
When dealing with real-world problems, we can rarely avoid uncertainty. Klir
and Wierman describe uncertainty in [57]. “At the empirical level, uncertainty is an
inseparable companion of almost any measurement, resulting from a combination of
inevitable measurement errors and resolution limits of measuring instruments. At the
cognitive level, it emerges from the vagueness and ambiguity inherent in natural
language. At the social level, uncertainty has even strategic uses and it is often created and maintained by people for different purposes (privacy, secrecy, propriety) [57]." There are three main types of uncertainty:
1. Fuzziness (vagueness): uncertainty due to imprecise boundaries (fuzzy sets instead of crisp sets).
2. Imprecision (non-specificity): uncertainty due to the size of relevant sets of alternatives.
3. Discord (strife): uncertainty due to conflicts among various sets of alternatives.
Management of uncertainty is an essentially important issue in the design of
an intelligent system. To define an intelligent system: it is an information system
which provides the user with a facility of posing and obtaining answers to questions
relating to information stored in its knowledge base. The knowledge base of an
intelligent system is a repository of human knowledge which is usually not very
precise in nature and is not a complete set of accurate facts and rules. Hence,
much of the information in the knowledge base is imprecise, incomplete, or not
totally reliable thereby making it imperative to deal with uncertainty.
There has been enormous effort undertaken to deal with uncertainty, and a lot of literature has been generated on "how to handle uncertainty." The most popular approach to dealing with uncertainty is the theory of probabilistic logic; Judea Pearl's classic book "Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference" [25] provides a framework for this reasoning. Other approaches include conditional planning and decision theory. There has been a revolution in the field of Artificial Intelligence on how to handle uncertainty; various uncertainty models have been introduced based upon predicate logic and probability-based methods. Some of these models are:
• Fuzzy Logic
• Multi-valued Logic
• Bayesian Networks
• Rough Sets
Uncertainty can be found in many different Information Technology
applications such as semantic web services and data mining. These applications
are used in day-to-day life, hence making it critical to have excellent measures in place to deal with uncertainty. We will identify the unique characteristics of each domain and map them to the uncertainty model that best complements them.
1.2 Statement of Purpose
The main goal of this essay is to identify the main components of a soft-computing framework and to discuss the design and architecture of intelligent systems. This will provide users with much-needed tools for developing intelligent systems that can handle knowledge uncertainty in a diligent manner.
1.3 Research Problem
Many theories have been developed to deal with knowledge uncertainty, but neither a structured framework nor standard guidelines have been established. This makes it imperative to find new measures to represent knowledge uncertainty in intelligent systems. For these reasons, additional research is needed to build frameworks and develop recommendations for managing uncertainty in information systems. There has been extensive research done in the field, identifying issues of uncertainty as well as uncertainty models of information systems, but only limited interaction exists between these two areas.
1.4 Organization of Thesis
This thesis contains 5 chapters:
Chapter 2 provides a literature review of four uncertainty models, highlighting their underlying principles, strengths, and weaknesses. These models are: Fuzzy logic, Multi-valued logic, Rough set, and Bayesian network.
Chapter 3 reviews different application domains and maps uncertainty models to each type of application. Semantic web services and data mining will be the two domains of interest for the purpose of this essay.
Chapter 4 presents a framework to represent knowledge uncertainty. The design and architecture of the main components to be included in the framework will be discussed. One real-world application of an intelligent system will be explored.
Chapter 5 concludes the thesis with conclusions, recommendations for working around knowledge uncertainty, and future work to be conducted in this field.
CHAPTER 2
REVIEW OF RELATED LITERATURE
“Uncertainty modeling is an area of artificial intelligence concerned with accurate
representation of uncertain information and with inference and decision-making
under conditions infused with uncertainty [4].” In an ideal world, agents would know
all the facts about the environment in which they operate. Unfortunately, reality is far from ideal: agents do not have access to the whole truth, thereby making it impossible to derive conclusions that are fully accurate. Hence these agents should be well equipped to deal with uncertainty.
2.1 Classical Theory
There are different methodologies to deal with uncertainty; a few of them are described below:
• Conditional Planning: one of the traditional approaches to dealing with uncertainty is conditional planning. Conditional planning can deal with uncertainty as long as it is a simple case where there are not too many variables involved, i.e., the agent is able to get its hands on the required information and deal with a few contingencies. Due to the very complex nature of our environment, it is practically impossible to have a complete set of facts about the environment. Three main reasons why first-order logic fails to deal with uncertainty are [12]:
1. Laziness: it takes a lot of work to compile the complete set of rules for the environment in which the agent operates.
2. Theoretical Ignorance: having incomplete knowledge of the complete theory for the domain in question.
3. Practical Ignorance: each case is unique, therefore all the generic rules cannot be applied; it is hard to deal with exceptions.
• Probability Theory: rational decision making is another method, where an agent has a goal and will execute the plan that guarantees the result (i.e., the goal is achieved). This method is based upon "degree of belief." In a world full of uncertainty, it becomes tough to provide a yes or no answer. Therefore, we assign a number (ranging from 0 to 1) to the likelihood of an event happening or to how true a statement is. This number represents a degree of belief, and this theory is referred to as Probability Theory.
• Decision Theory: it is a combination of Probability Theory and Utility Theory.
o Probability Theory: as discussed above, is dependent upon degree of belief.
o Utility Theory: is dependent upon making the decision based upon the highest utility (degree of usefulness).
These theories were competent in their own ways to deal with uncertainty, but as the complexity grew, so did the demand for sophisticated models. These conventional theories failed to provide an adequate model for modes of reasoning which are approximate rather than exact, and most of human reasoning falls into this category [15]. Many different approaches were introduced; we will take a look at four models: Fuzzy Logic, Multi-valued Logic, Bayesian Networks, and Rough Sets.
2.2 Fuzzy Logic
“Fuzzy logic provides a natural framework for the management of uncertainty
in intelligent system because – in contrast to traditional logic systems – its
main purpose is to provide a systematic basis for representing and inferring
from imprecise rather than precise knowledge. In effect, in fuzzy logic
everything is allowed to be – but need not be – a matter of degree.”
- Prof. Lotfi A. Zadeh
One of the main problems in dealing with uncertainty in information systems is the fuzziness associated with the knowledge base of an intelligent system; this led to the introduction of Fuzzy Logic, also referred to as fuzzy reasoning. Wikipedia defines Fuzzy Logic as "a form of multi-valued logic derived from fuzzy set theory to deal with reasoning that is approximate rather than precise [6]." Fuzzy logic, contrary to its name, is not fuzzy but precise. Fuzzy logic variables may have truth values that range between 0 and 1, corresponding to the degree of truth [6].
Prior to fuzzy logic being introduced into the world of uncertainty, probability theory enjoyed a monopoly; but this traditional approach to dealing with uncertainty failed to come to terms with the fact that uncertainty is possibilistic in nature rather than probabilistic. As Asli and Burhan [5] claimed, in the realm of uncertainty and imprecision, fuzzy logic has much to offer. Fuzzy Logic is based upon both predicate logic and probability theory, providing the answer to the posed question together with an assessment of "how reliable the answer is." This assessment of reliability is also called a certainty factor. Fuzzy Logic has two main components:
1. Test-Score Semantics: represents the knowledge (Predicate Logic).
2. Inferential Component: infers answers to posed questions and provides fuzzy quantifiers (Probability Theory).
The main difference between fuzzy logic and the traditional approach is that the objects of interest are allowed to be much more general and much more complex than the objects in traditional logical systems and probability theory. Fuzzy Logic further addressed issues that were hard to deal with using conventional techniques. Here are a few important issues [1] that can be handled through fuzzy logic (see the sketch after this list):
1. Fuzzy rules where the antecedent/consequent are of the form:
a. If A is M then B is N
b. If A is M then B is N with CF = α
In the above forms, "A is M" and "B is N" are fuzzy propositions, and α is a certainty factor.
2. Partial match between the user-supplied fact and the rule in the knowledge base: this is the case where the fact may not be identical to the antecedent of a rule in the knowledge base.
3. Fuzzy Quantifiers: the antecedent/consequent of a rule may contain explicit or implicit fuzzy quantifiers. For example, consider the following proposition with an implicit fuzzy quantifier (disposition):
d = university students are between 18 and 24
This may be interpreted as the proposition:
p = most university students are between 18 and 24
Expressing this as a rule:
r = If x is a university student then it is likely that x is between 18 and 24.
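The sketch below illustrates items 1 and 2: a fuzzy rule carrying a certainty factor CF = α, evaluated against a user-supplied fact that only partially matches the rule's antecedent. It is a minimal Python sketch, assuming triangular membership functions and a simple "scale the certainty factor by the degree of match" propagation scheme; the fuzzy sets, variable names, and numbers are invented for illustration.

# Minimal sketch: evaluating a fuzzy rule "If A is M then B is N with CF = alpha"
# under a partial match between the user-supplied fact and the rule antecedent.
def triangular(a, b, c):
    """Return a triangular membership function peaking at b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

# M: "around 30 years old", N: "moderate risk" (illustrative fuzzy sets)
mu_M = triangular(20, 30, 40)
mu_N = triangular(0.3, 0.5, 0.7)

CF_RULE = 0.8          # certainty factor alpha attached to the rule
observed_age = 34      # user-supplied fact; only partially matches "around 30"

# Degree to which the fact matches the antecedent (partial match).
match = mu_M(observed_age)

# Confidence propagated to the conclusion "B is N": here the rule's CF is
# simply scaled by the degree of match (one common, simple scheme).
cf_conclusion = CF_RULE * match

# The conclusion's fuzzy set can then be clipped by the propagated confidence.
def mu_conclusion(y):
    return min(cf_conclusion, mu_N(y))

print(f"match = {match:.2f}, CF of conclusion = {cf_conclusion:.2f}")
print(f"membership of risk 0.5 in the inferred set: {mu_conclusion(0.5):.2f}")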
2.2.1 Characteristics of Fuzzy Logic
Main characteristics of Fuzzy Logic as outlined by Zadeh in [15]:
1. Matter of Degree concept: representing everything as a matter of degree. "The unique property of fuzzy sets is that membership in a fuzzy set is not a matter of acceptance or denial, but rather a matter of degree."
2. Any logic system can be fuzzified: i.e., conversion of any system to a fuzzy
system. This is achieved by fuzzifying the inputs by applying membership
functions to the input.
3. Knowledge base consists of fuzzy constraint on collection of variables.
4. Reasoning is viewed as elastic constraint propagation.
2.2.2 Features of Fuzzy Logic
Main features of Fuzzy logic as summarized by Zadeh in [15] are:
1. Truth values can range over the fuzzy subsets of a finite or infinite truth-value set, usually assumed to lie in the range [0, 1]. This can be regarded as providing some kind of characterization of intermediate truth values.
2. Predicates can be crisp or fuzzy: in contrast to bivalent systems, where predicates are only crisp (e.g., larger than), fuzzy logic lets predicates be fuzzy (e.g., much larger than).
3. Allows typical quantifiers (all & some) and fuzzy quantifiers (e.g., most, few): fuzzy logic allows quantifiers that are used in day-to-day life, thereby making it easier to relate to the real world.
4. Ability to represent non-fuzzy and fuzzy predicate modifiers: in contrast to classical systems, where negation (not) is the main predicate modifier, fuzzy logic utilizes fuzzy modifiers such as very and extremely.
5. Three models of qualification:
a. Truth Qualification: expressing fuzzy truth value.
b. Probability Qualification: expressing fuzzy probability.
c. Possibility Qualification: expressing fuzzy possibility.
2.2.3 Deduction Process
Four main categories for Propositions:
1. An unconditional, unqualified proposition
X is F,
Where X = variable and F = Fuzzy predicate
2. An unconditional, qualified proposition
X is F is λ,
Where X = variable, F = Fuzzy predicate and λ = Fuzzy probability
3. Conditional, unqualified proposition
If X is F then Y is G,
Where X and Y = variable, and F and G = Fuzzy predicate
4. Conditional, qualified proposition
If X is F then Y is G is λ,
Where X and Y = variable, F and G = Fuzzy predicate and λ = Fuzzy
probability
All facts or propositions in the knowledge base are stored in canonical form; this is usually done through inspection or by applying test-score semantics to the propositions. By applying the canonical form, each proposition in the knowledge base is converted into a possibility distribution, which provides constraints on the variable. Applying conjunction leads to the construction of a Global Possibility Distribution, which is induced by the totality of propositions in the knowledge base.
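As a minimal sketch of this deduction step, the fragment below assumes two illustrative propositions about a single variable, turns each into a possibility distribution, and combines them by conjunction (pointwise min) into a global possibility distribution; the fuzzy sets and numbers are invented for the example.

# Sketch: each proposition in the knowledge base induces a possibility
# distribution over a variable; the global distribution is obtained by
# conjunction (here, a pointwise min over a discretized domain).
ages = range(15, 41)

def pi_young(x):                    # proposition 1: "X is young"
    if x <= 25:
        return 1.0
    if x >= 35:
        return 0.0
    return (35 - x) / 10.0

def pi_about_30(x):                 # proposition 2: "X is about 30"
    return max(0.0, 1.0 - abs(x - 30) / 5.0)

# Global possibility distribution induced by the totality of propositions.
global_pi = {x: min(pi_young(x), pi_about_30(x)) for x in ages}

best = max(global_pi, key=global_pi.get)
print(f"most possible age given both propositions: {best} "
      f"(possibility {global_pi[best]:.2f})")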
2.2.4 Membership Function
The key problem in application of fuzzy logic is the construction of the membership
function of a fuzzy set.
Three principal approaches are used to address this
concern:
1. Declarative approach: membership functions are specified by the designer of
a system.
2. Computational approach: membership function is expressed as a function of
the membership functions of one or more fuzzy sets with specified
membership functions.
3. Modelization Elicitation approach:
membership functions are computed
through the use of co-intension enhancement techniques.
The main challenge in the development of fuzzy system models is to generate fuzzy if-then rules. These rules are created by extracting knowledge from human experts, which might be incomplete or not organized. As opposed to the traditional approach, this challenge has led to building automated algorithms for modeling systems using fuzzy theories via machine learning and data mining techniques.
2.2.5 Advantages
• It is time-invariant and deterministic: this allows stability analysis methods to be integrated with fuzzy logic.
• Ability to handle real-world situations, since it goes beyond the restriction of the two-state model (yes/no): it is not constrained to the regular true/false or yes/no and can handle any situation through truth values ranging from 0 to 1.
• It provides a computational framework for dealing with uncertainty through test-score semantics, which provides a higher level of expressive power to represent the meanings of propositions in a natural language.
• It is easily blended with conventional control techniques and can be added on top of expert opinion/experience.
2.2.6 Disadvantages
• Hard to synthesize if-then rules: it is difficult to deduce membership functions.
• Defuzzification of the output should be validated to ensure that the output is translated in the way it was intended.
2.2.7 Applications
• Anti-lock Braking Systems
• Data Mining
• E-services
• Quality Support
• Decision control systems
2.2.8 Future Work
• More research is needed to create if-then rules more accurately.
• Compare if-then rules created by a domain expert versus through machine learning to see which one is more accurate and feasible.
2.3 Rough Set
“Rough set theory is a new approach to decision making in the presence of
uncertainty and vagueness.”
- Zdzislaw Pawlak
Rough set theory was introduced by Zdzislaw I. Pawlak in the early 1980s; this theory is based upon formal approximation of a crisp set – a pair of sets which provide the lower and the upper approximation of the original set [10]. The traditional use of rough sets was to deal with decision problems; since then, it has become an area of interest among researchers from different disciplines, most of which are related to Artificial Intelligence. Recently, rough set theory has been extended to deal with knowledge uncertainty. S. Wong demonstrated that rough sets provide a suitable framework for managing uncertainty in intelligent systems. It is one of many techniques available in the area of artificial intelligence to deal with knowledge uncertainty and for uncertainty management in relational databases [11]. Rough set theory is also used in different disciplines of computer technology, such as knowledge acquisition, data mining, and many more.
Rough set theory is based on the fundamental principle of associating some information with every object in the universe. The underlying principle of this mathematical tool is the indiscernibility relation. An indiscernibility relation exists between two objects when all their attribute values are identical with respect to the attributes or information under consideration [14]; such objects cannot be distinguished (discerned) with regard to the considered attributes. Generally, a knowledge base is composed of two different kinds of sets:
1. Crisp Set – precise; a union of elementary sets (collections of indiscernible objects).
2. Rough Set – imprecise or vague.
Usually, we hit a grey zone with boundary-line objects which are hard to place in either of these sets. As Pawlak said in [16], "knowledge base has a granular structure; due to this, vague concepts cannot be characterized in terms of information about their elements." Rough set theory brings forth the approach of replacing a vague concept with a pair of precise concepts; the indiscernibility relation is used to divide the universe into equivalence classes. The pair of precise sets consists of the lower approximation and the upper approximation of the vague concept. The notion of approximation (lower and upper) allows us to distinguish between certain and possible (or partial) inclusion in a rough set.
• Lower Approximation Region – results that are certain and "surely" belong to the concept, i.e., an exact match.
• Upper Approximation Region – results that are likely but still uncertain and "possibly" belong to the concept.
• Boundary Region – the difference between the upper approximation and the lower approximation constitutes the boundary region of the set.
2.3.1 Basic Concept
Here is the basic concept of Rough Set theory:
1. Indiscernibility Relation
As mentioned earlier, it considers groups of indiscernible objects as opposed to a single object. As in [16], the indiscernibility relation can be formulated in a table called an information system or an attribute-value table.
Table 1: Candidate Data
Name     | Education   | Job Prospects
Mike     | Elementary  | No
Philip   | High School | No
Shelly   | High School | Yes
Melissa  | University  | Yes
Jeff     | University  | Yes
Looking at the above table, we can see that for each candidate we have three attributes:
A – Name
B – Education
C – Job Prospects
Each person can be discerned (distinguished) from every other based on all three attributes. But if we were to take a look at the attribute Education alone, equivalence classes could be defined as:
R(B) = {{Mike}, {Philip, Shelly}, {Melissa, Jeff}}
These subsets also define a partition of the objects into classes. The information table is useful in determining classification patterns. Representing the information table above in a more formal way, as in [16]:
Let U = the universe, consisting of a finite set of objects;
Let A = a finite set of attributes (for each object in the universe).
With every attribute a ∈ A is associated a set of values Va.
Every attribute a determines a function:
fa: U → Va
Let B be a subset of A; the indiscernibility relation on the universe U is defined as:
I(B) = {(x, y) ∈ U × U : fa(x) = fa(y) for every a ∈ B}
2. Approximation
The method of approximation helps identify unique characteristics of the object in question by deducing information in the knowledge base; in other words, it allows us to identify attributes given the set. Using this process we define the lower and upper approximations (the code sketch after this list illustrates these computations).
From Table 1, we infer that the candidates with Job Prospects are {Shelly, Melissa, Jeff}. If we were to define attributes of candidates with Job Prospects, we could easily deduce that if a candidate has a good education, then they have job prospects as well. We define the lower and upper approximations:
Lower Approximation: {Melissa, Jeff}
Upper Approximation: {Philip, Shelly, Melissa, Jeff}
Boundary Region = Upper Approximation – Lower Approximation
Hence, the boundary region is: {Philip, Shelly}
Transforming this into mathematical form as in [16], we get:
Let U = the universe, consisting of a finite set of objects;
Let X = a subset of U;
Let B = a subset of the attributes A.
B_*(X) = {x ∈ U : B(x) ⊆ X}    (Lower approximation)
B^*(X) = {x ∈ U : B(x) ∩ X ≠ ∅}    (Upper approximation)
BN_B(X) = B^*(X) − B_*(X)    (Boundary region of X)
If BN_B(X) = ∅, then the set X is called a crisp set, where we have an exact match; and if BN_B(X) ≠ ∅, then we have a rough set. A rough set is characterized numerically.
3. Rough Membership
It identifies the boundary-region members which do not belong to the crisp set. As Pawlak said in [16], "rough membership identifies uncertainty related to the membership of an element to a set." He described the rough membership function as:
μ_B^X(x) = |B(x) ∩ X| / |B(x)|,    where μ_B^X(x) ∈ [0, 1]
This can be interpreted as the degree of certainty with which x belongs to X. Using this to define the approximations:
B_*(X) = {x ∈ U : μ_B^X(x) = 1}    (Lower approximation)
B^*(X) = {x ∈ U : μ_B^X(x) > 0}    (Upper approximation)
BN_B(X) = {x ∈ U : 0 < μ_B^X(x) < 1}    (Boundary region)
Pawlak said that the above function confirms that "vagueness is related to sets, while uncertainty is related to elements of sets."
4. Dependency of Attributes
This analyzes relationships between attributes to see if one can be inferred from another; that is, A → B if the value of B can be inferred uniquely from the value of A. Formally, this can be defined as:
B depends totally on A iff I(A) ⊆ I(B).
Now, to define partial dependency of attributes as in [16]:
Let A and B be subsets of C.
B depends on A to the k-th degree, where 0 ≤ k ≤ 1 (A →k B), if
k = |POS_A(B)| / |U|,
where POS_A(B) = ∪ {A_*(X) : X ∈ U/B}.
POS_A(B) represents the set of all elements of U that can be uniquely assigned to blocks of the partition U/B by means of A.
5. Reduction of Attributes
An attribute is superfluous if its presence or absence does not make any difference to the objects in the universe. Hence we can reduce the attributes and obtain a minimal set of attributes which delivers the same classification as the full set of attributes.
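The following sketch puts the notions above together for the candidate data in Table 1: it partitions the universe by the indiscernibility relation on Education, computes the lower and upper approximations and the boundary region of the concept "has job prospects," and evaluates the rough membership function. It is an illustrative sketch; the helper names are not from the source.

# Sketch of the rough-set notions above, computed for the candidate data in Table 1.
candidates = {
    "Mike":    {"Education": "Elementary",  "Job Prospects": "No"},
    "Philip":  {"Education": "High School", "Job Prospects": "No"},
    "Shelly":  {"Education": "High School", "Job Prospects": "Yes"},
    "Melissa": {"Education": "University",  "Job Prospects": "Yes"},
    "Jeff":    {"Education": "University",  "Job Prospects": "Yes"},
}

def indiscernibility_classes(objects, attributes):
    """Group objects that have identical values on the given attributes."""
    classes = {}
    for name, row in objects.items():
        key = tuple(row[a] for a in attributes)
        classes.setdefault(key, set()).add(name)
    return list(classes.values())

B = ["Education"]                       # condition attributes B
X = {n for n, r in candidates.items()   # target concept X: has job prospects
     if r["Job Prospects"] == "Yes"}

blocks = indiscernibility_classes(candidates, B)            # partition U / I(B)

lower = set().union(*[b for b in blocks if b <= X])          # B_*(X)
upper = set().union(*[b for b in blocks if b & X])           # B^*(X)
boundary = upper - lower                                     # BN_B(X)

def rough_membership(x):
    """Rough membership of x: |B(x) ∩ X| / |B(x)|."""
    block = next(b for b in blocks if x in b)
    return len(block & X) / len(block)

print("lower approximation:", sorted(lower))      # ['Jeff', 'Melissa']
print("upper approximation:", sorted(upper))      # adds Philip and Shelly
print("boundary region:    ", sorted(boundary))   # ['Philip', 'Shelly']
print("rough membership of Philip:", rough_membership("Philip"))  # 0.5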
2.3.2 Advantages
• It only requires data and no additional information.
• Its mathematical approach with a fully structured model makes it easy to understand and obtain a straightforward interpretation.
• It generates minimal decision rules.
2.3.3. Disadvantages
• Hard to generate decision rules from data.
• Hard to optimize decision rules.
2.3.4 Future Work
• More research is needed to generate optimal decision rules from data.
2.4 Multi-Valued Logic
“Uncertainty means that the atoms may be assigned logical values other than
the conventional ones - true and false, in the semantics of the program. The
use of multi-valued logics to express uncertainty in logic programs may be
suitable.”
-Daniel Stamate
"Multi-valued logic is a 'logical calculi' in which there are more than two truth values. Traditionally, there were only two possible values for any proposition. An obvious extension to classical two-valued logic is an n-valued logic for n > 2 [10]." This extension leads to a new set which may be finite or infinite and has the same structure in place. As Dubois and Prade said in [20], multi-valued logic is constructed on truth-functional calculi: the degree of truth of a formula can be calculated from the degrees of truth of its constituents. Due to this, it has become an attractive model to apply in the field of uncertainty, where degrees of truth are viewed as certainty factors.
Multi-valued logic has been used in a wide array of logic systems such as memory, multi-level data communication coding, and various digital processors [28]. Its roots originate from Lukasiewicz and Post in the twenties. In this logic, the fuzziness phenomenon can occur at the metalogical level (the level of construction of the calculus and its semantics), and a set is considered to be fuzzy if it is the actualization of a predicate symbol in a structure [21].
There are many instances in the real world where we get different views from different people on topics of interest, such as the requirement-gathering stage in a software lifecycle. Different stakeholders are interested in different aspects and have different expectations of the functionality to be accomplished by the software. This usually results in information which might not be consistent with each other's views and opinions and might even be incomplete in nature. Inconsistent viewpoints might be critical if they affect the main functionality of the software; otherwise the inconsistency can be easily ignored. These types of inconsistencies can be overcome by adopting non-classical paraconsistent logic.
Multi-valued logic is a type of paraconsistent logic which is not limited to the typical two truth values; rather, it can represent different types of contradictions and different levels of uncertainty. Belnap said in [27] that "paraconsistent logic (multi-valued logic) has been driven by the need for automated reasoning systems that are not given spurious answers if their database becomes inconsistent." The usual choice of values in multi-valued logic depends upon the nature of the problem or system at hand and upon the granularity at which we want to sustain the information so that we do not lose much data.
Lattices are used to represent the information (truth values) of the system. In multi-valued logic, we can calculate the product of lattices as the merging point for different views when dealing with inconsistent data. The product of two lattices results in a lattice where each element is a pair (a, b) composed of an element a from the first lattice and an element b from the second lattice. These products sustain all the information of the individual logics. Products can be taken over n lattices, where the number of values in the resulting product lattice grows exponentially as n increases. To deal with this, we can use the technique of abstraction. Abstraction results in discarding some information and only retaining information that is relatively important. With multi-valued logic, the more values we have, the more detailed the information we hold about the system it represents, and the more complex it becomes. Hence, depending upon the problem at hand, we make a tradeoff between complete and abstract data.
Multi-valued logic is usually used to represent abstract and qualitative things such as helpful or handsome. Fuzzy logic falls short of representing these descriptions through the use of fuzzy sets, and that is where multi-valued logic is used. As an example given in [33]:
If X is A then Y is B
If X is A` then Y is B`
Here X and Y are variables, and A, B, A`, B` are predicates. In multi-valued logic these predicates are expressed as multi-sets. Multi-valued logic can be viewed as an extension of fuzzy logic; some of fuzzy logic's features and principles can be extended to multi-valued logic. Multi-set theory is used to formalize the notions of a membership degree and of a truth degree. Defining these further, as in [34]:
"A membership degree is not an uncertainty degree on the membership of an object to a multi-set; it is instead the degree to which an object belongs to a multi-set regardless of any uncertainty."
A truth degree (τα) is used to express the confidence of "how accurate the predicate is"; it is associated with each multi-set and usually tells how true the predicate is. As an example, in "Student A is extremely smart," student A satisfies the predicate smart with the degree extremely.
The main difference between multi-valued logic and fuzzy logic is that in multi-valued logic the membership degrees are symbolic terms of natural language, while in fuzzy logic the membership degree belongs to the set [0, 1]. The condition of an "ordered list" is imposed upon the (symbolic) set of truth degrees, with λM = {τ0, …, τi, …, τM−1} and the total order relation
τi ≤ τj such that i ≤ j.
The truth degrees can be proposed by an expert using multi-valued logic as long as they satisfy the condition of being ordered. An example described in [33]:
M = 7; λ7 = {not-at-all, very-little, little, moderately, enough, very, completely}
In multi-valued logic, Lukasiewicz's aggregation functions are generally used. Here is the definition given in [32] for M truth degrees:
TL(τα, τβ) = τmax(0, α + β − M + 1)
SL(τα, τβ) = τmin(M−1, α + β)
IL(τα, τβ) = τmin(M−1, M−1−α+β)
Using General Modus Ponens, we can infer a conclusion defined by the same multi-set as the premise, but with a modified truth degree. Consider the two relations [32]:
A´ > A represents that A´ is a reinforcement of A
A´ < A represents that A´ is a weakening of A
The above relations are expressed through modifications of the truth degree of the same multi-set: reinforcement is represented through an increase in the truth degree, and weakening through a reduction.
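The sketch below makes these symbolic truth degrees concrete for M = 7, using the λ7 scale from the example above and Lukasiewicz-style aggregation over the degree indices. The implication formula follows the standard Lukasiewicz form and is an assumption of this reconstruction rather than a quotation of [32].

# Sketch: symbolic truth degrees for M = 7 and Lukasiewicz-style aggregation
# over the indices, as described above.
M = 7
DEGREES = ["not-at-all", "very-little", "little", "moderately",
           "enough", "very", "completely"]          # tau_0 ... tau_6

def t_norm(a, b):        # T_L(tau_a, tau_b) = tau_max(0, a + b - M + 1)
    return max(0, a + b - M + 1)

def s_norm(a, b):        # S_L(tau_a, tau_b) = tau_min(M-1, a + b)
    return min(M - 1, a + b)

def implication(a, b):   # I_L(tau_a, tau_b) = tau_min(M-1, M-1-a+b)  (assumed form)
    return min(M - 1, M - 1 - a + b)

a, b = DEGREES.index("very"), DEGREES.index("moderately")   # tau_5, tau_3
print("conjunction :", DEGREES[t_norm(a, b)])       # tau_2 -> "little"
print("disjunction :", DEGREES[s_norm(a, b)])       # tau_6 -> "completely"
print("implication :", DEGREES[implication(a, b)])  # tau_4 -> "enough"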
2.4.1 Approximate Reasoning with Linguistic Modifiers
Linguistic modifiers are another dimension of approximate reasoning, based upon validating the "axiomatic of approximate reasoning." Using the concept of linguistic modifiers, El-Sayed and Pachlotczyk introduced new General Modus Ponens rules [32]. The primary difference between the typical GMP rules and the new rules is that in GMP the observation and the premise correspond to the same multi-set, whereas in the new rules they are represented by different multi-sets (i.e., the observed multi-set is different from the conclusion multi-set).
A linguistic modifier is defined as an operator which builds terms from a primary term; there exist two types of modifiers [32]:
• Reinforcing modifier: reinforces the concept expressed, such as "extremely". This modifier results in higher precision.
• Weakening modifier: weakens the concept expressed, such as "rarely". This modifier results in lower precision.
In multi-set theory, these modifiers result in the same multi-set but with a modified truth degree, whereas in fuzzy logic these modifiers result in a whole new fuzzy set which is different from the original set. An example of approximate reasoning using a linguistic modifier is:
If "X is A" then "Y is B"
"X is m(A)"
then "Y is m(B)"
The inferred conclusion is B´ = m´(B). This conclusion is drawn using the hypothesis m´ = m, thereby giving the ability to infer it. A general principle is that a modification applied to the rule premise will be applied to the rule conclusion as well. For example:
A → B (Very A would imply Very B)
This implies that if A is reinforced, so is B, and if A is weakened, then so is B.
To infer using linguistic modifiers, the authors of [32] proposed the approach of using generalized linguistic modifiers in General Modus Ponens. Using this approximation, we get, as in [32]:
If "X is vαA" then "Y is vβB"
"X is m(vαA)"
then "Y is m(vβB)"
It is recommended to use modifiers which modify the truth degree and not the actual multi-set.
2.4.2 Synthesis of Multi Valued Logic
Sarif and Barr define an n-variable multi-valued logic function f(x) with radix r in mathematical terms as f(x): R^n → R, where R = {0, 1, …, r−1} is a set of r logic values with r ≥ 2, and X = {x1, x2, ..., xn} is a set of n variables.
There are two main algorithms for the synthesis of multi-valued logic:
Deterministic algorithm
This is based on the direct cover approach and requires high computational time [29]. The direct cover approach consists of the following important steps:
o Choose a minterm
o Identify a suitable implicant that covers the minterm
o Obtain a reduced function by removing the identified implicant
o Repeat steps 1-3 until all minterms are explored.
The steps of choosing the minterm and the implicant that covers it are critical in obtaining less expensive solutions (cost is directly proportional to the number of items required). There are many different implementations of how to choose minterms and implicants; these algorithms can be reviewed in [29, 30, 31].
Iterative heuristic-based algorithm
This is based on exploring a large solution space and arriving at near-optimal solutions. It is based on the concept of chromosomes and genes, where solutions are represented using a string of chromosomes and each chromosome further contains several genes. These genes consist of five attributes which represent the product term, as explained in [28] (a sketch follows this list):
• First attribute: value of the constant of the corresponding product term
• Second and third attributes: window boundaries of the product term for the first variable X1
• Fourth and fifth attributes: window boundaries of the product term for the second variable X2
The length of the chromosome plays a critical role in the solution; hence it is critical for the length to be just right. If it is too short, it will not be able to reach the best solution, and if it is too large, it will take too long a time. There are two proposed approaches for selecting the length of chromosomes:
1. Static: the length of the chromosome is equal to the length of the truth table
2. Reduced static: the length of the chromosome is equal to 75% of the length of the truth table
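To make the gene encoding concrete, the sketch below evaluates a two-variable multi-valued function from a list of genes carrying the five attributes described above. The max-of-product-terms evaluation, the radix, and the sample chromosome are illustrative assumptions and are not taken from [28].

# Sketch: evaluating a two-variable multi-valued function from a chromosome of
# genes, each gene carrying (constant, x1_low, x1_high, x2_low, x2_high).
RADIX = 4                                   # logic values R = {0, 1, 2, 3}

chromosome = [
    (2, 0, 1, 2, 3),    # product term: value 2 when x1 in [0,1] and x2 in [2,3]
    (3, 2, 3, 0, 1),    # product term: value 3 when x1 in [2,3] and x2 in [0,1]
]

def product_term(gene, x1, x2):
    c, lo1, hi1, lo2, hi2 = gene
    # The window literal passes the constant through when both inputs fall
    # inside their windows, and contributes 0 otherwise.
    return c if lo1 <= x1 <= hi1 and lo2 <= x2 <= hi2 else 0

def evaluate(chrom, x1, x2):
    # The function value is taken as the maximum over all product terms.
    return max(product_term(g, x1, x2) for g in chrom)

for x1 in range(RADIX):
    row = [evaluate(chromosome, x1, x2) for x2 in range(RADIX)]
    print(f"x1={x1}: {row}")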
2.4.3 Future Work
As the authors said in [32], it would be interesting to extend their proposal of new rules to more complex strong rules, such as a set with multiple premises.
2.5 Bayesian Network
“Bayesian networks are to a large segment of the AI-uncertainty community
what resolution theorem proving is to the AI-logic community.”
-Eugene Charniak
Wikipedia defines a Bayesian Network as a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph (DAG) [22]. A Bayesian network can be used to represent a probabilistic relationship between two different variables, such as problem and symptom. Given the symptoms of a car, we can use the probabilistic relation to calculate the probabilities of the different problems that can occur. A Bayesian network is also referred to as a belief network, a directed acyclic graphical model, a knowledge map, or a probabilistic causal network.
Nodes represent random variables (RVs), which can have either discrete values (such as true/false) or continuous values (such as 1.0, 1.9). Directed arcs between pairs of nodes represent dependencies between the random variables. When specifying probabilities in Bayesian networks, we should have the probabilities of all root nodes and the conditional probabilities of all non-root nodes. This allows us to calculate the conditional probability of a given node in the network if we have the values of some of the nodes that have been observed. When new information (evidence) is added to the network, the conditional probabilities are recalculated and might therefore change. When a Bayesian network is referred to as a belief network, belief refers to the conditional probability given the evidence.
In classic probability theory, specifying a probability distribution is complicated, as the complete distribution of n random variables requires 2^n − 1 joint probabilities. As the number of random variables n grows, it becomes hard to specify all the probabilities; for example, if we have n = 5, then it requires 31 joint probabilities, whereas if n = 10, it requires 1023 joint probabilities. Bayesian networks overcome this complexity through the use of built-in independence assumptions.
2.5.1 Independence Assumptions
As Charniak explained in [23], in a Bayesian network a variable a is dependent on a variable b given evidence E = {e1, e2, ...} if there is a d-connecting path from a to b given E. There are three types of d-connecting paths, as shown in Figure 1.
Figure 1: D-connecting Paths [23]
A d-connecting path is a path from a to b with respect to the evidence nodes E if every interior node n in the path has the property that either [23]:
1. it is linear or diverging and not a member of E, or
2. it is converging and either n or one of its descendants is in E.
To summarize, two nodes are d-connected if there exists a causal path between them or if there exists evidence that renders the two nodes correlated with each other.
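As a minimal sketch of this definition, the fragment below checks whether a given path is d-connecting with respect to an evidence set by testing the two conditions at each interior node. The tiny example network (a collider a → c ← b with descendant d) and the graph encoding are illustrative assumptions.

# Sketch: checking whether a given path is d-connecting with respect to an
# evidence set E, following the two conditions above.
parents = {            # a -> c <- b ;  c -> d   (each node maps to its parents)
    "a": set(), "b": set(),
    "c": {"a", "b"},
    "d": {"c"},
}

def descendants(node):
    """All nodes reachable from `node` by following child links."""
    children = {n: {m for m, ps in parents.items() if n in ps} for n in parents}
    out, stack = set(), [node]
    while stack:
        for child in children[stack.pop()]:
            if child not in out:
                out.add(child)
                stack.append(child)
    return out

def is_d_connecting(path, evidence):
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        converging = prev in parents[node] and nxt in parents[node]
        if converging:
            # converging node: it (or one of its descendants) must be in E
            if not ({node} | descendants(node)) & evidence:
                return False
        else:
            # linear or diverging node: it must NOT be a member of E
            if node in evidence:
                return False
    return True

# a and b are not connected a priori (the converging node c blocks the path) ...
print(is_d_connecting(["a", "c", "b"], evidence=set()))      # False
# ... but observing the common effect's descendant d renders them connected.
print(is_d_connecting(["a", "c", "b"], evidence={"d"}))      # True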
2.5.2 Consistent Probabilities
Another problem that comes with classical probability theory is the problem of inconsistent probabilities, which usually requires some mechanism in place to ensure we do not run into it. A Bayesian network handles this problem effectively, thereby ensuring consistent probabilities; this requires that the probabilities of each and every node in the network be specified (for all possible combinations of its parents). In fact, the network will then determine the joint distribution.
The joint distribution of a set of random variables r1, r2, …, rn is defined as P(r1, r2, …, rn) for all values of r1, r2, …, rn. This provides all the information associated with the distribution. Also, the sum of all the joint probabilities should equal 1. The joint probability distribution of a set of variables {r1, r2, …, rn} is calculated through the following equation [25]:
P(r1, r2, …, rn) = ∏i P(ri | parents(ri))
It is important to understand how to number the random variables 1, 2, …, n. There are various techniques, but for our interest we will look at a topological sort, where each variable comes before its descendants.
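The chain-rule factorization above is easy to evaluate once the variables are listed in topological order; the sketch below does so for a small, invented three-node network (rain, sprinkler, wet grass) with illustrative conditional probability tables.

# Sketch: P(r1,...,rn) = prod_i P(ri | parents(ri)), evaluated over variables
# listed in topological order (parents before children).
order = ["rain", "sprinkler", "wet_grass"]
parents = {"rain": [], "sprinkler": [], "wet_grass": ["rain", "sprinkler"]}

# Conditional probability tables: P(var = True | parent assignment).
cpt = {
    "rain":      {(): 0.2},
    "sprinkler": {(): 0.5},
    "wet_grass": {(True, True): 0.99, (True, False): 0.9,
                  (False, True): 0.8, (False, False): 0.05},
}

def joint_probability(assignment):
    """P(assignment) as the product of local conditional probabilities."""
    p = 1.0
    for var in order:
        key = tuple(assignment[pa] for pa in parents[var])
        p_true = cpt[var][key]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p

print(joint_probability({"rain": True, "sprinkler": False, "wet_grass": True}))
# 0.2 * 0.5 * 0.9 = 0.09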
Recent work in this field has led to the invention of many new algorithms which are both sophisticated and efficient for computing and inferring probabilities in Bayesian networks. As Boudali and Dugan said in [84], "during inference, these new algorithms take advantage of the independence assumption between the variables and proceed by local computations, which makes the execution times relatively short." We mentioned earlier that Bayesian networks have the feature of independence assumptions; hence the new algorithms make full use of this feature offered by Bayesian networks. Using this, the numbers specified by the Bayesian network formalism define a single joint distribution. Consistency at the local level is used to ensure that the global distribution is consistent as well.
2.5.3 Constraints
The underlying principle of a Bayesian network is the calculation of the conditional probability of every single node in the network; this computation is NP-hard (non-deterministic polynomial-time hard) and usually takes exponential time to solve. There are many factors that are taken into consideration during the evaluation of the network, such as the type of network, the type of algorithm used, and its implementation method. The option of having an exact solution or an approximate solution provides different alternatives. We will briefly discuss exact solutions vs. approximate solutions.
Exact Solution
Finding an exact solution is usually NP-hard, with the exception of singly connected networks (also referred to as polytrees). A polytree is a directed graph with at most one undirected path between any two nodes, as shown in Figure 2. These are usually less complicated compared to multiply connected networks (Figure 2).
Figure 2: Connected Networks
We will not look at the algorithm in this paper, but it can be found in [25]. The main difference between singly connected networks (Figure 2) and multiply connected networks is the way a change in the connections propagates. In a polytree, a change in one node will only affect its neighboring nodes; e.g., in Figure 2, a change in d cannot affect any other node except by going through b itself. However, in multiply connected networks there can be more than one path between any two nodes. Hence, when a change is introduced, e.g., in Figure 2, if a change is introduced in d, it will not only affect c, but will also affect a through b. Hence a will be affected twice (through b and c). This ripple effect is what makes multiply connected networks complicated.
To deal with multiply connected networks, we convert them into singly connected networks through various techniques such as clustering. This conversion works fine when dealing with networks consisting of fewer nodes, but gets complicated when the nodes created through clustering take on large numbers of values. The trade-off is to go from an exact solution to an approximate solution.
Approximate Solution
There are various techniques to calculate approximations of conditional probabilities in a Bayesian network, and how well each technique fits depends upon the nature of the network in question. Most of these techniques are based on the following principles (a small sampling sketch follows the list):
• Randomly picking (assuming) values of some nodes.
• Using the values of some nodes to determine the values of the remaining nodes.
• Based on these values, using approximation to answer the questions.
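As one minimal illustration of these principles, the sketch below estimates a conditional probability by rejection sampling: full assignments are drawn at random from the network, samples that contradict the evidence are discarded, and the query is answered from the counts that remain. It reuses the same invented rain/sprinkler/wet-grass network as the previous sketch; rejection sampling is only one simple choice among the approximation techniques mentioned above.

import random

# Same illustrative network and CPTs as the previous sketch.
order = ["rain", "sprinkler", "wet_grass"]
parents = {"rain": [], "sprinkler": [], "wet_grass": ["rain", "sprinkler"]}
cpt = {
    "rain":      {(): 0.2},
    "sprinkler": {(): 0.5},
    "wet_grass": {(True, True): 0.99, (True, False): 0.9,
                  (False, True): 0.8, (False, False): 0.05},
}

def sample_once():
    """Draw one full assignment, sampling each variable given its parents."""
    a = {}
    for var in order:
        key = tuple(a[p] for p in parents[var])
        a[var] = random.random() < cpt[var][key]
    return a

def estimate(query_var, evidence, n=50_000):
    """Estimate P(query_var=True | evidence) by discarding mismatched samples."""
    kept = hits = 0
    for _ in range(n):
        s = sample_once()
        if all(s[v] == val for v, val in evidence.items()):
            kept += 1
            hits += s[query_var]
    return hits / kept if kept else float("nan")

print(round(estimate("rain", {"wet_grass": True}), 3))   # P(rain | wet grass)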
2.5.5. Applications
Bayesian networks have been applied in different domains. The most frequent
domains of application are:
• Diagnosis problems
• Speech recognition
• Data mining
• Determination of errors
2.5.6 Advantages
• Conclusions are made through a probabilistic approach as opposed to a logical approach.
• They can be used for complex simulations, since they do not rely on the traditional approach of specifying a set of numbers that grows exponentially (independence assumption).
• Knowledge is stored as collections of probabilities.
2.5.7 Disadvantages
• Time of evaluation: Bayesian networks require exponential time for processing in most cases.
CHAPTER 3
UNCERTAINTY MODELS IN APPLICATIONS
This chapter discusses two of the main applications where modeling and reasoning with uncertainty is primordial; these applications are data mining and semantic web services. The chapter provides an overview of these applications, discusses how uncertainty comes into play, and recommends a model to deal with this uncertainty.
3.1 Data Mining
“The fruits of knowledge growing on the tree of data are not easy to pick.”
- W.J. Frawley, G. Piatetsky-Shapiro, C.J. Matheus
3.1.1 Background
Data mining is defined as "extracting or mining knowledge from large amounts of data [45]." Data mining is also referred to as knowledge extraction, information discovery, information harvesting, data archaeology, and data pattern processing. It is a process of extracting patterns, associations, anomalies, changes, and significant structures from large databases, data warehouses, or other information repositories [47]. As a step in the knowledge discovery in databases process, it consists of applying data analysis and discovery algorithms that, under acceptable computational efficiency limitations, produce a particular enumeration of patterns over the data [42]. Knowledge discovery is a process which provides methodologies for extracting knowledge from large data repositories. Computers have enabled humans to gather more data than we can digest; it is only natural to turn to computational techniques to help us unearth meaningful patterns and structures from the massive volumes of data. Hence, knowledge discovery in databases is an attempt to address a problem that the digital information era made a fact of life for all of us: data overload [42].
Knowledge discovery consists of the following steps [45], as shown in Figure 3 [42]:
1. Data Cleaning
2. Data Integration
3. Data Selection
4. Data Transformation
5. Data Mining
6. Pattern Evaluation
7. Knowledge Presentation
Figure 3: Overview Steps in Knowledge Discovery of Databases [42]
Across a wide variety of fields, data are being collected and accumulated at a
dramatic pace.
Whether it is science, finance, telecommunication, retail, or marketing, the classical
approach to data analysis relied fundamentally on one or more analysts becoming
intimately familiar with the data and serving as an interface between the data and the
users and products [42]. Databases are increasing in size both through a growing number
of records and through an increasing number of fields or attributes associated with each
record. To replace this manual and traditional approach, which is slow and expensive,
and to deal with huge databases, the demand for data mining has grown proportionally
in order to handle and utilize data efficiently. The unifying goal is extracting
high level knowledge from low-level data in the context of large data sets [42].
Organizations use this data for various purposes such as understanding customer
behavior, increasing efficiency, gaining competitive advantage, predicting future trends,
and making knowledge-driven decisions. Data is stored in a data warehouse; a data
warehouse is a repository of multiple heterogeneous data sources organized under a
unified schema at a single site in order to facilitate management decision making.
Data warehouse technology includes data cleaning, data
integration, and on-line analytical processing (OLAP), that is, analysis techniques
with functionalities such as summarization, consolidation, and aggregation as well as
the ability to view information from different angles [45]. It collects information about
subjects that span an entire organization. A data mart is a departmental subset of a
data warehouse which focuses on selected subjects, and thus its scope is
department-wide.
3.1.2 Characteristics of Data Mining
1. Scalability: designed to hold unlimited amounts of data
2. Complexity: very complex structure
3. Automated capability: ability to automatically discover hidden patterns or useful information from a data set
4. Embedded learning capability: ability to learn from the past and to apply its learning in the future
3.1.3 Data Mining and Uncertainty
Data mining has since evolved into an independent field of research in which
intelligent data analysis methods attempt to “unearth the buried treasures from the
mountains of raw data [48]." The data mining component of knowledge discovery relies
heavily on techniques ranging from machine learning to pattern recognition and
statistics to find patterns. Data mining has functionalities such as outlier analysis,
association analysis, cluster analysis, and evolution analysis. The main tasks involved in
data mining are: the definition/extraction of clusters that provide a classification
scheme, the classification of database values into the categories defined, and the
extraction of association rules or other knowledge artifacts [41]. Figure 4 [80]
highlights the steps involved in data mining.

Figure 4: Data Mining [80]
A cluster consists of a group of objects that are more similar to each other than
to objects in other clusters; it is simply a subset of the data set. In fact, cluster analysis
has the virtue of strengthening the exposure of patterns and behavior as more and
more data becomes available [50]. The aim of cluster analysis is the classification of
objects according to similarities among them, and the organization of objects into groups
[47]. Once the clustering task is executed, the resulting categories could be either
fuzzy or crisp (hard) in nature. The hard clustering method is based upon classical set
theory, where an object either belongs or does not belong to a cluster [47].
On the other hand, the fuzzy clustering method is based upon the concept that
an object can belong to several clusters simultaneously, with a degree of belief
associated with each object in each cluster. That is, during the clustering algorithm,
there could be some values that lie on the borderline, thereby not classifying fully
into one specific category, or that might belong to more than one category. In the real world,
fuzzy clustering occurs more often than hard clustering, since borderline objects are not
forcefully classified into one cluster. This is due to the fact that real world
data mostly suffers from the following limitations [51]:
1. Not clearly known: questionable; problematical
2. Vague: not definite or determined
3. Doubtful: not having certain information
4. Ambiguous: many interpretations
5. Not steady: varying
6. Liable to change: not dependable or reliable
Another issue that exists in data mining arises when data values are given equal
treatment during a classification process that is carried out in a crisp manner.
During this classification, some values belong more strongly to a category than
other values in the same category. As an example, suppose employee X has been working
with a company for 15 years, and another employee Y has been working with the same
company for 20 years. Theoretically, during classification they both belong to the
senior category. However, the important fact that employee Y has more seniority than X
is lost during the classification process, since it is not captured anywhere through the
regular classification technique.
3.1.4 Fuzzy Logic Uncertainty Model
The conventional clustering algorithms have difficulties in handling the challenges
posed by the collection of natural data, which is often vague and uncertain [51].
Traditionally, to deal with uncertainty in data mining, several approaches have been
proposed, such as fuzzy decision trees and fuzzy c-means. The underlying
principle of these approaches is to associate a degree of belief with each value during
the classification process, where a data value can be classified into more than one
category.
Fuzzy logic is a good model to deal with uncertainty in data mining. Fuzzy
set theory is based upon membership functions; users can use the given data to
define membership functions to characterize an element with a fuzzy subset [47].
Integration of fuzzy logic with data mining techniques has become one of the key
constituents of soft computing in handling the challenges posed by the massive
collection of natural data [52]. Rules can be designed to model the to-be-controlled
system given the input and output variables.
Here are the basic steps of the
approach proposed as in [41, 47]:
1. Standardization: a process of standardization is applied to the data, where
some calculation is performed on the data to remove the influence of dimension.
As an example, each data value can be standardized by subtracting a measure of
central location (mean or median) and dividing by some measure of spread
(standard deviation).
2. Clustering scheme extraction: defining or extracting clusters that correspond
to initial categories for the data set. Many clustering algorithms are available
for extraction. During this step, a correlation coefficient is calculated to classify
data into clusters.
3. Evaluation of the clustering scheme: the clustering method is used to find a
partition. Different parameters are applied to the chosen clustering algorithm to
find the optimized clustering scheme.
4. Definition of membership function: fuzzy logic is used to calculate the degree of
belief (grade of membership) of each data value in the clusters. Uncertainty
features are assigned through appropriate mapping functions to the clusters. The
membership value is in the range zero to one and indicates the strength of the
value's association with that cluster.
5. Fuzzy classification: data values (Ai) are classified into categories according
to the set of available categories L = {li} and the clustering method chosen. This
results in a set of degrees of belief (d.o.b.s) M = {µli(tk.Ai)}, where tk is the tuple
identifier. This represents the confidence level with which tk.Ai belongs to the
category li.
6. Classification Value Space (CVS) construction: transforming the data into
classification beliefs and storing them in a cube, where the cells store the
degree of belief for the classification of attribute values. This cube is also
referred to as the CVS.
7. Handling the information included in the CVS: the CVS contains knowledge about
our data set, based on which sound decisions can be made. The fuzzy logic
concept is used for quality measurement of our data set with regard to each
category.
8. Association rules extraction: extraction of rules between attributes depending
upon the classification method chosen.
9. Prediction: determination of which cluster a new sample will belong to. This is
usually done by calculating the average index of each cluster and using
proximal values to determine the sample's cluster.
Figure 5 [70] illustrates the main steps of the approach mentioned above.
Figure 5: Fuzzy Logic in Data Mining [70]
Pseudo code of the fuzzy c-means clustering algorithm is given below [51], where (3)
and (4) refer to the membership and cluster centre update equations of [51]:

initialize p = number of clusters
initialize m = fuzzification parameter
initialize Cj (cluster centers)
Repeat
    For i = 1 to n: update μj(xi) applying (3)
    For j = 1 to p: update Cj with (4) using the current μj(xi)
Until the Cj estimates stabilize
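Since equations (3) and (4) of [51] are not reproduced here, the following is a minimal, hedged Python sketch of the standard fuzzy c-means updates; the sample data, the number of clusters and the fuzzification parameter are illustrative assumptions rather than values taken from [51].

import numpy as np

def fuzzy_c_means(X, p=2, m=2.0, max_iter=100, tol=1e-5):
    """Minimal fuzzy c-means sketch: X is an (n, d) data matrix."""
    n = X.shape[0]
    rng = np.random.default_rng(0)
    # Random initial membership matrix U (n x p); each row sums to 1.
    U = rng.random((n, p))
    U /= U.sum(axis=1, keepdims=True)
    C_prev = None
    for _ in range(max_iter):
        # Centre update: weighted mean of the data, with weights U**m.
        W = U ** m
        C = (W.T @ X) / W.sum(axis=0)[:, None]
        # Membership update: inverse-distance rule.
        dist = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2) + 1e-10
        U = 1.0 / (dist ** (2.0 / (m - 1)))
        U /= U.sum(axis=1, keepdims=True)
        # Stop when the cluster centre estimates stabilize.
        if C_prev is not None and np.max(np.abs(C - C_prev)) < tol:
            break
        C_prev = C
    return C, U   # cluster centres and degrees of belief

# Illustrative data: two loose groups of points plus one borderline point.
X = np.array([[1.0, 1.1], [0.9, 1.0], [5.0, 5.2], [5.1, 4.9], [3.0, 3.0]])
centres, memberships = fuzzy_c_means(X, p=2)
print(memberships.round(2))   # the borderline point has partial membership in both clusters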
This fuzzy logic approach in data mining enables us to [41]:
1. Handle uncertainty based on degree of belief (membership function): the ability to
transform a crisp clustering method into a fuzzy method to handle uncertainty.
2. Definition of a classification function to handle uncertainty: emphasis on
handling uncertainty during the classification phase through a framework
which is based on fuzzy logic.
3. Information measures for the classification scheme: checks for the information
quantity included in the fuzzy sets. Using these measures, we can check which
set fits best by checking the degree associated with the sets; this allows us to
make sound business decisions using the information measures.
3.1.5 Applications
There are many applications of applying fuzzy logic to deal with uncertainty in
data mining; here are a few examples:
• Human Resource Management: Han Jing's Application of Fuzzy Data Mining
Algorithm in Performance Evaluation of Human Resources provides an
application of the fuzzy logic uncertainty model in data mining.
3.2
Semantic Web Services and Uncertainty
“…deeds, efforts or performances whose delivery is mediated by information
technology. Such e-service includes the service element of e-tailing, customer
support, and service delivery”
- J. Rowley
3.2.1 Background
The spreading of network and business-to-business technologies has changed the
way business is performed. Companies are able to provide services as semantically
defined functionalities to vast number of customers by composing and integrating
these services over the Web [53]. Such services are referred to as E-services which
stand for electronic services, also known as web services.
The Web has altered how businesses conduct their operations. The introduction of
e-business brought along a revolution and created a surge in technology-based
self-service [56]. Enterprises look to business-to-business (B2B) solutions to improve
communications and provide a fast and efficient method of transacting with one
another [54].
E-services provide companies with the opportunity to conduct
electronic business with all other companies in the marketplace, instead of the traditional
approach of conducting business through collaborative business agreements only.
Service offers are described in such a way that they allow automated discovery to
take place and offer request matching on functional and non-functional service
capabilities [54].
E-services are available for different purposes, such as banking, shopping,
health care, and learning, and have high potential benefits in the areas of Enterprise
Application Integration and Business-to-Business Integration. The concept of
e-services plays a vital role in knowledge management applications through the ability
of exchanging functionality and information over the Internet. Web services provide a
service-oriented approach to system specification, and enable the componentization,
wrapping and reuse of traditional applications, thereby allowing them to participate
as an integrated component of knowledge management activity [59]. It is important
to note that web services operate at a purely syntactic level [65] as shown in Figure
6 [67].
Figure 6: Web Services & Semantic Web Services [67]
3.2.2 Semantic Web Services
Semantic Web Services (SWS) is a combination of semantic web technology with
web services. Semantic Web Services are pieces of software advertised with a
formal description of what they do; composing services means to link them together
in a way satisfying a complex user requirement [63].
Discovery, composition, invocation, and interoperation are the core pillars of the
deployment of semantic web services [64]. SWS takes web services to the next level
by adding the dimension of semantically enhanced information processing in
conjunction with logical inference to provide the development of high quality techniques
for automated discovery, composition and execution of services on the web [65]. As
Polleres said in [65],
“SWS provides a seamless integration of applications and data on the web.” Figure
6 [67] illustrates both web services and semantic web services, and Figure 7 [66]
represents the detailed overview of semantic web services.
Figure 7: Semantic Web (Detailed) [66]
Different semantic web services frameworks, such as the OWL Service Ontology
(OWL-S), the Web Service Modeling Ontology (WSMO) and the Semantic Web Services
Framework (SWSF), are used to semantically describe the necessary aspects of
services in a formal way for creating machine-readable annotations [65]. Matching
of a goal (the client's purpose for using web services) to web services capabilities is
classified as follows, as in [60]:
1. Exact-match: a goal exactly matches the matched web services capabilities
2. Plug-in-match: a goal is subsumed by matched web services capabilities
3. Subsume-match: matched web services capabilities are subsumed by a goal
4. Intersection-match: a goal and matched web services capabilities have some
common elements
5. Disjoint-match: a goal and matched web services capabilities do not belong
to any above classifications
During the matching process, it would be useful to assign a degree of matching
to each matched web service capability. This will tell us which result is closer to
the goal in comparison to all the other results returned.
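As an illustration of how such a degree of matching could be expressed and used for ranking, here is a minimal, hedged Python sketch; the numeric scores and the service names are illustrative assumptions rather than values prescribed by [60].

# Hypothetical degrees of matching for the five classes above (higher = closer to the goal).
MATCH_DEGREE = {
    "exact": 1.0,         # goal exactly matches the capability
    "plug-in": 0.75,      # goal is subsumed by the capability
    "subsume": 0.5,       # capability is subsumed by the goal
    "intersection": 0.25, # goal and capability share some elements
    "disjoint": 0.0,      # no relation
}

def rank_candidates(candidates):
    """candidates: list of (service_name, match_class) pairs, ranked by degree of match."""
    return sorted(candidates,
                  key=lambda c: MATCH_DEGREE.get(c[1], 0.0),
                  reverse=True)

# Illustrative example: three hypothetical services returned by discovery.
results = [("BookFlightService", "intersection"),
           ("ReserveTripService", "plug-in"),
           ("WeatherService", "disjoint")]
print(rank_candidates(results))   # ReserveTripService first, WeatherService last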
There are three forms of Semantics as defined in [71]:
1. Implicit Semantics: unstructured text, loosely defined and less formal
structure of data repositories. This is useful in processing a data set to obtain
bootstrap semantics that can then be represented through formal knowledge.
Machine learning utilizes implicit semantics.
2. Formal Semantics: a well defined syntactic structure for knowledge
representation, a more formal structure of data representation. Definite rules of
syntax are in place which allow for automated reasoning, thereby making
applications more intelligent. Since human language is ambiguous both
semantically and syntactically, it is tough for computers to use this language
as a means of communication with other machines. Semantics that are
represented in a well formed syntactic form are referred to as formal semantics.
These are machine processable but do not allow for uncertainty, due to
limited expressiveness. Two features of a formal language are:
• The Principle of Compositionality
• The notions of Model and Model Theoretic Semantics
3. Powerful Semantics: the use of knowledge to its fullest; allows for vagueness,
imprecise or uncertain knowledge, and fuzzy forms. Although it is ideal to
have a consistent knowledge base, in practice it is almost impossible. It
is usually possible to achieve local consistency, but almost infeasible to
maintain global consistency. We should allow contradicting statements in the
knowledge base, and have the ability to computationally evaluate these
contradicting statements to come to the right conclusion.
3.2.3 Uncertainty in Semantic Web Services
The real power behind human reasoning, however, is the ability to reason in the face of
imprecision, uncertainty, inconsistency, partial truth and approximation. Powerful
semantics provide the benefit of utilizing a common language which allows for
abduction, induction and deduction. This provides an inference mechanism that is
complete with respect to the semantics.
Uncertainty exists in almost every life situation, and semantic web services
are no different. As the authors of [63] said, one important issue with semantic web
services is the fact that they are embedded in background ontologies which
constrain the behavior of the involved entities. The semantic web provides a vision
where knowledge is transferred by agents. This knowledge would be
imprecise or incomplete in nature, thereby introducing different aspects of
uncertainty. In semantic web services, when a user initiates a request through a
query, the request is not one hundred percent crisp. A semantic description contains
information that may be incomplete or imprecise in nature, thereby making it critical
to have the ability to deal with uncertainty. In these cases, we cannot assume exact
matches for the inputs provided by the users, as we might not be able to comprehend them
fully. Since both the web content and the user's query are vague or uncertain in nature,
we need to foster an environment that can deal with uncertainty in semantic web services.
Current semantic web services frameworks use first order logic and rely on
subsumption checking for the matching process between goals and web services
capabilities. The authors of [71] said, "Over time, many people have responded to the
need for increased rigor in knowledge representation by turning to first order logic as
a semantic criterion. This is distressing since it is already clear that first order logic
is insufficient to deal with many semantic problems inherent in understanding natural
language as well as the semantic requirements of a reasoning system for an
intelligent agent using knowledge to interact with the world." In the real world,
concepts are not always subsumed by each other, and cannot always be classified
in crisp subsumption hierarchies [69]. This summarizes the foundational problem
with semantic web ontology, which is based on the concept of crisp logic. Semantic
web frameworks such as OWL are not equipped to deal with this uncertainty. They
assume that the knowledge base is crisp in nature, thereby entirely eliminating the
concept of uncertainty.
For the most part, classical theories were used in semantic web services for
reasoning under uncertainty. An assumption was made of a closed world, where the
knowledge base was assumed to be complete and precise. Hence there was a
need to extend non-classical theories to deal with uncertainty (both qualitative and
quantitative).
In recent years, probabilistic and possibilistic logics have been extended into
semantic web services to deal with uncertainty. The underlying principle behind
these approaches is to annotate the ontologies with some kind of uncertainty
information about their axioms and to use this information to perform uncertainty
reasoning [68]. The main issue with this approach is that these uncertainties are
asserted by humans, who are not good at either predicting or perceiving concepts
like probability [68].
The foundational problem with Semantic web ontology is that it is built upon
crisp logic. There is a need to represent partial subsumption in a quantified manner.
There are various models recommended to deal with this situation and handle
uncertainty. M. Holi and E. Hyvonen recommended Bayesian Network in [69], P.
Oliveria and P. Gomes recommended Markov Network in [68], and D. Parry
recommended Fuzz Logic in [61].
50
3.2.4 Fuzzy Logic Uncertainty Model
To deal with an incomplete knowledge base, the combination of fuzzy logic with
probabilistic logic seems promising. Zadeh recommended this approach of
combining fuzzy logic with probabilistic logic so that the two complement each other and
provide the best of both worlds. Fuzzy set theory classifies objects into fuzzy sets (sets
with fuzzy boundaries), along with the degree of membership associated with each object
in the set. Figure 8 [72] illustrates a web services framework using fuzzy set logic.
Figure 8: Web Services Framework [72]
The main steps involved in Semantic web services with integration of fuzzy logic are
[72]:
1. Scope and rules specification: domain experts specify both the scope and
rules; these rules are matched in the rules matching phase with the web
service description.
2. Fuzzy set generation: a fuzzy set is then generated based on the scope
provided by the domain experts.
3. Weights calculation and assignment: weights are calculated using a
probabilistic model; a degree of truth is assigned to every fuzzy set based on
the history of how often it is used. This is then stored in a local database and
used for weights calculation.
4. Define fuzzy rules: two fuzzy sets are defined; one is a fuzzy set of weights,
and the second is a fuzzy set of distance, which will be used in the matching
distance algorithm during the matching process. These fuzzy sets are used
in conjunction.
5. Model for fuzzy matching: all services that have been matched are stored in a
database with their associated weights, distance and matched values. Results are
sorted in indexed order based upon weights. The fuzzy matching algorithm
as stated in [72] is as follows (a hedged sketch of the weight-to-distance
selection used here is given after this list):
Algorithm 1: FuzzyMatching
Input: S[1..n], W[1..n]
Output: services, composedServices
for i ← 1 to n do
    initiate new thread
    member ← S[i]
    weight ← W[i]
    if weight is High then
        distance ← Approximate
    else if weight is Medium then
        distance ← Close
    else if weight is Short then
        distance ← Exact
    end if
    service ← Fetch Web service
    result ← call ApproximateMatchingAlgorithm(service, member, distance)
    if result > 0 then
        Store service in database
    end if
    Sort stored services
    for each stored service
        initiate new thread
        O[1..n] ← service.outputParameters
        service ← Fetch Web service
        I[1..n] ← service.inputParameters
        temp ← false
        for i ← 1 to n do
            if O[i] = I[i] then
                temp ← true
            else
                temp ← false
                break loop
            end if
        end for
        if temp = true then
            link services and store in database
        end if
    end for
end for
6. Constraint satisfaction: the user's request is matched with the constraints
specified by the service provider, and the rules specified in the first step are
satisfied against the web services' input parameters, output parameters and
operations.
7. Evaluation: the composition of various web services is conducted from the
pool of all web services. The final web service is selected by the domain expert
depending upon their experience and knowledge.
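To make the fuzzy weight and distance sets of steps 3-5 concrete, here is a minimal, hedged Python sketch that computes linguistic memberships for a normalized weight and maps the strongest label to a matching distance, following the rule base of Algorithm 1; the triangular membership functions and label ranges are illustrative assumptions, not parameters given in [72].

# Minimal sketch of the fuzzy weight / distance sets described in steps 3-5.
# The triangular membership functions and the label ranges are illustrative assumptions.

def triangular(x, a, b, c):
    """Triangular membership function peaking at b over the interval [a, c]."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a) if b > a else 1.0
    return (c - x) / (c - b) if c > b else 1.0

def weight_memberships(w):
    """Degree of membership of a normalized weight w in each linguistic label."""
    return {
        "Short":  triangular(w, 0.0, 0.0, 0.5),
        "Medium": triangular(w, 0.2, 0.5, 0.8),
        "High":   triangular(w, 0.5, 1.0, 1.0),
    }

# Rule base taken from Algorithm 1: High -> Approximate, Medium -> Close, Short -> Exact.
DISTANCE_FOR = {"High": "Approximate", "Medium": "Close", "Short": "Exact"}

def matching_distance(w):
    """Pick the distance label of the weight label with the highest membership."""
    ms = weight_memberships(w)
    label = max(ms, key=ms.get)
    return DISTANCE_FOR[label]

print(matching_distance(0.9))   # -> "Approximate"
print(matching_distance(0.1))   # -> "Exact"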
Tables 2 and 3 below show how fuzzy logic is integrated with the semantic web to deal
with uncertainty.
Bootstrapping Phase (building phase)

Capability: Building ontologies either automatically or semi-automatically
• Implicit Semantics: analyzing word co-occurrence patterns in text to learn taxonomies/ontologies
• Possible use of Powerful (soft) Semantics: using fuzzy or probabilistic clustering to learn taxonomic structures or ontologies

Capability: Annotation of unstructured content with respect to these ontologies (resulting in semantic metadata)
• Implicit Semantics: analyzing word occurrence patterns or hyperlink structures to associate concept names from an ontology with both resources and links between them
• Possible use of Powerful (soft) Semantics: using fuzzy or probabilistic clustering to learn taxonomic structures or ontologies, OR using fuzzy ontologies

Capability: Entity Disambiguation
• Implicit Semantics: using clustering techniques or Support Vector Machines (SVM) for entity disambiguation
• Formal Semantics: using an ontology for entity disambiguation
• Possible use of Powerful (soft) Semantics: KR mechanisms to represent ontologies that may be used for disambiguation

Capability: Semantic Integration of different schemas and ontologies
• Implicit Semantics: analyzing the extension of the ontologies to integrate them
• Formal Semantics: schema based integration techniques

Capability: Semantic Metadata Enrichment (further enriching the existing metadata)
• Implicit Semantics: analyzing annotated resources in conjunction with an ontology to enhance semantic metadata
• Possible use of Powerful (soft) Semantics: this enrichment could possibly mean annotating with fuzzy ontologies

Table 2: Building Phase [72]
Utilization Phase

Capability: Question Answering
• Implicit Semantics: word frequency and other CL techniques to analyze both the question and answer sources
• Formal Semantics: using formal ontologies for QA
• Possible use of Powerful (soft) Semantics: providing confidence levels in answers based on fuzzy or probabilistic concepts

Capability: Concept-based search
• Implicit Semantics: analyzing occurrence of words that are associated with a concept, in resources
• Formal Semantics: using hypernymy, partonomy and hyponymy to improve search

Capability: Connection and pattern explorer
• Implicit Semantics: analyzing semi-structured data stores to extract patterns
• Formal Semantics: using ontologies to extract patterns that are meaningful

Capability: Context-aware retriever
• Implicit Semantics: word frequency and other CL techniques to analyze resources that match the search phrase
• Formal Semantics: using formal ontologies to enhance retrieval
• Possible use of Powerful (soft) Semantics: using fuzzy KR mechanisms to represent context

Capability: Dynamic user interfaces
• Formal Semantics: using ontologies to dynamically reconfigure user interfaces

Capability: Interest-based content delivery
• Implicit Semantics: analyzing content to identify the concept of the content so as to match it with interest profiles
• Formal Semantics: the user profile will have an ontology associated with it which contains concepts of interest

Capability: Navigational and Research queries (hypothesis validation queries, discovery style queries, complex query processing)
• Implicit Semantics: navigation searches will need to analyze unstructured content
• Possible use of Powerful (soft) Semantics: fuzzy matches for research search results

Table 3: Utilization Phase [72]
CHAPTER 4
SOFT COMPUTING FOR INTELLIGENT SYSTEM: DESIGN AND
ARCHITECTURE
“Role model for soft-computing is the human mind.”
-Prof. Lotfi A. Zadeh
Intelligent systems have to deal with knowledge uncertainty in practically every real
world situation as much of the knowledge base is based on human knowledge which
is usually imprecise and vague in nature. We have looked at different uncertainty
models such as fuzzy logic, rough set theory and so forth, and mapped the best-fitting
uncertainty model to the data mining and semantic web services applications. For
intelligent systems to deal with this uncertainty there has to be a proper design and
architecture in place. The focus of this chapter is to discuss the design and architecture
of intelligent systems using soft computing techniques.
4.1
Soft-computing for Intelligent Systems
“Soft computing is a term applied to a field within computer science which is
characterized by the use of inexact solutions to computationally-hard tasks such as
the solution of NP-complete problems, for which an exact solution cannot be derived
in polynomial time [74]." Soft computing techniques can work with a knowledge
base which is incomplete, imprecise and uncertain in nature. Traditional approaches
of finding exact solutions cannot always be applied in today's world, which is highly
unpredictable. Hence the need for soft computing came about, to deal with
uncertainty.
The guiding principle of soft computing, as Zadeh said in [75], is to "exploit the
tolerance for imprecision, uncertainty, partial truth, and approximation to achieve
tractability, robustness and low solution cost." The main constituents of soft
computing are Neural Networks (NN), Fuzzy Logic, Evolutionary Algorithms, and
Probability Theory. Soft computing is a fusion of the various methodologies mentioned
above to create intelligent systems which can solve the problem at hand. Zadeh
defined soft computing as a "partnership in which each of the partners contributes a
distinct methodology for addressing problems in its domain; these methodologies are
complementary rather than competitive." The combination and hybridization of these
methodologies provide soft computing with a cutting edge which is missing in
other techniques.
4.1.1 Main Components of Soft Computing
1. Neural Network: inspired by the field of biology, a neural network is an
interconnected group of artificial neurons which can exhibit complex global
behavior. A neuron is a significant information processing element [75]. The
network replicates the human central nervous system, where functions
are performed collectively by neurons and run in parallel.
2. Fuzzy Logic: it is discussed in great detail in Chapter 2 (Section 2.2).
3. Evolutionary Algorithms: evolutionary algorithms, also known as genetic
algorithms, have recently been used to program and engineer intelligent
systems. An evolutionary algorithm is an adaptive heuristic search algorithm based upon
natural evolution and Darwin's theory of "survival of the fittest." Natural
evolution consists of selection, reproduction and mutation to reach a solution to
a problem at hand. A standard process for generating new solutions is [73]:
potential candidate solutions are initialized; through reproduction techniques,
new solutions are created; and a suitable solution is selected depending upon
what fits best.
These steps undergo a series of iterations before the final solution is chosen.
In comparison to other popular techniques, evolutionary algorithms are easy
to implement and provide solutions to the issues at hand. They differ
from other methodologies in that they aim for an optimized solution rather than just
a good solution, and also make use of historic data to gain better
performance during the search.
Advantages of Evolutionary Algorithm:
a) More robust, hence a better option than typical AI.
b) Offers better performance while searching a large space, through a heuristic-based
approach and linear programming.
c) Ability to handle changes in input variables.
4. Probability Reasoning: used for approximate reasoning. This is based upon
probability theory, discussed in Chapter 2 (Section 2.1).
4.1.2 Characteristics of Soft Computing
1. Human expertise represented through knowledge base which is a repository
of human knowledge.
2. Earlier, soft computing would aim for good solutions rather than optimal solutions;
but now, with the introduction of new techniques such as evolutionary algorithms,
it can achieve optimization as well.
3. Neural networks, which are based upon a biological system, more precisely
the central nervous system.
4. Ability to handle real world applications by dealing with uncertainty rather than
ignoring it.
5. Support for various applications for which mathematical models are not available
or are inflexible.
6. Soft computing intersects with a lot of other disciplines, as shown in Figure 9 [73].
Figure 9: Relation between soft computing and other fields [73]
The next sections deal with the design and architecture of intelligent systems with
uncertainty.
4.2
Design of Intelligent Systems with Uncertainty
The essence of designing an intelligent system lies in its ability to effectively control
an object in the dynamic environment. In an ideal world, this object would be a
replica of a human expert making similar decision if they were placed in the situation
in which the intelligent system is operating. In the closed environment, where all
elements are accurately defined, and with minimal scope of change or introduction
of new or unknown elements, intelligent system can very precisely perform an action
for which they are designed. Design of such intelligent systems will be focused on
defining accurate and complete sets of rules for knowledge base.
Real world applications however cannot be described by complete set of facts
and rules. The variable which makes it harder to achieve this goal is “uncertainty”; it
plays a critical role in the design of an intelligent system. Therefore, instead of
ignoring this variable; it should be well considered during early stages of design
phase. While designing an intelligent system, it becomes vital to handle uncertainty
at three different levels:
uncertainty in objects, uncertainty in surrounding
environment in which they operate, and uncertainty in expected functionalities. Here
are more details on these three aspects.
4.2.1 Main Aspects of Design
1. Uncertainty in Objects
An intelligent system is executed at the level of an object. An object
usually operates with a lot of sensors recording different measurements about
itself; these sensors help the object maintain its integrity within the parameters
defined during the design stage. If at any time any measurement goes out of
range, these sensors send a signal to the object identifying that something is
wrong. When uncertainty is factored into the situation, these sensors play a
critical role in signaling the object of the uncertainty. If there is noise in the
measurements, then the object would filter the data to ensure it can ignore the noise
in the data. In situations where the knowledge base is not fully equipped with
rules and facts to help the objects, different soft computing techniques are used
in the design of these objects.
An example would be the use of rough set theory, which can help
identify the uncertain situation. Through the use of the approximation and rough
membership concepts of rough set theory, we can handle uncertainty to the
best of our ability; the success of handling uncertainty through rough set
theory is proportional to the data stored in the knowledge base.
2. Uncertainty in Surrounding Environment
An object has to adapt itself to the surrounding environment in which it
operates. The environment is an open ended concept which consists of many
different variables; it can never be described accurately and precisely through
facts in the knowledge base. Change is another factor which has to be dealt
with, since the environment can change at any time. There could be many other
objects existing in the environment, and it would be critical to keep tabs on
them as well. The interaction of the main object with other objects in the
system is dependent upon the nature of those other objects, which could be very
uncertain in nature. For these reasons, this level is a little harder to deal
with, and uncertainty plays a central role here.
There could be various unknown factors introduced into the
environment of which the object has no account; it is important to understand
whether such a factor is just noise, an anomaly or a new factor. For example, in a retail
store, sales can drop steeply; for an intelligent system, it
becomes critical to understand whether this was a one-time event due to a bad
storm, or whether there is a downward trend due to an economic recession.
Because there are many different factors at this level, the best way to
deal with uncertainty is through multi-valued logic. Multi-valued logic provides
the flexibility to hold as much information as needed depending upon the
problem at hand. Hence, if the environment is not very complicated and is fairly
well defined, then multi-valued logic can hold almost all of the
information precisely, even if pieces of the information conflict with each other, in
order to determine the behavior under uncertainty.
3. Uncertainty in Expected Functionality
An intelligent system is created to accomplish a given task at hand;
there is an expected functionality it has to perform. In earlier days, when
systems were less complicated, the scope of the problem would seldom
change or expand its horizon. But recently, as systems have become more
complex and evolved to the next level, change is the only thing that is constant.
Hence, the scope of the expected functionality can change at any time, depending upon
various variables. At this level, it is critical for the system to act intelligently
and be able to accept changes in the scope in a diligent manner.
Intelligent systems should be well equipped to deal with possible
modifications and contingency situations, and be well aware of their safe modes of
operation. To accomplish these tasks, a system should be able to perform analysis
of its current situation and predict future evolution when the modifications are
introduced. For contingency planning, it should be able to implement that
through decision making, learning and self learning. An example of this
would be the requirement engineering phase during the software
development cycle. The scope of the software can change at any time, which
requires the requirements to be modified as per the new scope.
Rough set theory can be used at this level to help with decision making
and self learning. For optimization, we can use various hybrid soft-computing
techniques instead of the traditional ones.
4.2.2 Design Framework
Traditional design frameworks are quite effective and efficient in handling many real
world applications; their main shortcoming is that they aim for a solution instead of an
optimized solution. Similar to the field of agriculture, where hybrid seeds are created
through various techniques such as crossover, different hybrid frameworks have
been developed for optimization. Fuzzy logic and hybrid frameworks are the two
design frameworks discussed in this section.
1. Fuzzy Logic
Fuzzy logic is a popular soft-computing technique utilized in the design
of intelligent systems; this concept is known as fuzzy information
granulation [78]. The underlying principle of this concept involves partitioning a
class of objects into smaller granules in such a manner that the objects within a
granule are similar in nature, and objects in different granules are
distinct in nature.
The state where granules can be easily distinguished from each other
could be referred to as the black and white zone, where it is crystal clear
which objects belong to which granule. In addition to the typical black and white
zone, there is a grey zone, where granules cannot be easily discerned from
each other; the boundary line that divides one granule from another is fuzzy
instead of crisp in nature. This fuzziness is represented through words rather
than numbers, which helps bridge the gap between machine language and
human knowledge. These words act like labels of fuzzy granules; the ability
of these labels to use natural language is an added benefit which makes the
approach easily adaptable to the real world.
The advantage of using words is that it helps us handle imprecision and
uncertainty, thereby making systems more robust and flexible in dealing with
reality. Since most knowledge bases are repositories of human knowledge,
using words as labels for granules can practically be applied in every field
where soft computing techniques have already been explored.
2. Evolutionary Artificial Neural Networks
Recently, hybrid frameworks have gained a lot of popularity; this is
analogous to creating hybrid seeds in the field of agriculture. Current soft-
computing techniques are inefficient when it comes to their computation
speed, due to the large search space. "The current state of Artificial Neural
Networks is dependent upon human experts who have enough knowledge
about various aspects of the network and the problem domain [73]." With
growing complexity, this traditional design becomes insufficient to handle the
problem domain, thereby shifting the gear towards evolutionary algorithms in
Artificial Neural Networks.
Evolutionary algorithms are used for the design and architecture of neural
networks, and offer two main features: evolution and learning. These
qualities make such networks highly adaptable to dynamic environments, making them
more effective and efficient than classical approaches. The underlying algorithm is
based on Darwin's theory of "survival of the fittest." The selection process is
such that the desirable behaviors and features are passed on to the next
generation, whereas less desirable behaviors fade away. Evolution in this
hybrid network is introduced at three different layers, as highlighted in [73]:
1. Evolution introduced at weight training level
The training process at the weights level is used for a global search of the
connection weights to get an optimal set, as defined by the evolutionary
algorithm. The evolutionary algorithm is a step ahead when compared to
other techniques, such as gradient based techniques, since it looks for a
globally optimal solution rather than locally optimal solutions. Here is the
algorithm for the evolutionary search of connection weights as in [73] (a hedged
runnable sketch of this search is given at the end of this subsection):
I. Initialize the population of N weight chromosomes.
II. The fitness of each network is evaluated depending upon the problem in hand.
III. Based on the results from step (II), the selection method is executed to
create a number of children for each individual (node) in the current generation.
IV. Genetic operators are then applied to each child individual created in
step (III) to further reproduce the next generation.
V. Check the number of generations created versus the required target to
evaluate the next step. If the target has not been achieved, then step (II) is
executed again, else go to (VI).
VI. End
2. Evolution introduced at the architecture level
An evolutionary architecture is achieved through constructive and
destructive algorithms. A constructive algorithm refers to constructively
adding complexity by starting with a simple architecture, whereas a
destructive algorithm refers to reducing a large architecture until the
network can no longer perform its task. Evolution is usually introduced at the
architecture level when prior knowledge of the architecture is known. Indirect
coding, such as a blueprint representation, can be used in these cases to improve
scalability. The algorithm for the evolutionary search of architectures is as in [73];
this algorithm is similar to the algorithm at the weight level, except that it
initializes a population of architecture chromosomes.
I. Initialize the population of N architecture chromosomes.
II. The fitness of each network is evaluated depending upon the problem in hand.
III. Based on the results from step (II), the selection method is executed to
create a number of children for each individual (node) in the current generation.
IV. Genetic operators are then applied to each child individual created in
step (III) to further reproduce the next generation.
V. Check the number of generations created versus the required target to
evaluate the next step. If the target has not been achieved, then step (II) is
executed again, else go to (VI).
VI. End
3. Evolution introduced at the learning level
Learning rules are critical to any intelligent system, since the learning rules
should be able to adapt to the dynamic environment in which the system is
operating. The same learning rules are applied to the entire network, and
the architecture is set up in such a manner that, for every learning rule
chromosome, several architecture chromosomes will evolve at a faster
rate. The algorithm for the evolutionary search for learning rules is as in [73];
this algorithm is very similar to the previous two algorithms, with the exception
that learning rules are initialized in step (I).
I. Initialize the population of N learning rule chromosomes.
II. The fitness of each network is evaluated depending upon the problem in hand.
III. Based on the results from step (II), the selection method is executed to
create a number of children for each individual (node) in the current generation.
IV. Genetic operators are then applied to each child individual created in
step (III) to further reproduce the next generation.
V. Check the number of generations created versus the required target to
evaluate the next step. If the target has not been achieved, then step (II) is
executed again, else go to (VI).
VI. End
The decision on which level to evolve is dependent upon the type of
knowledge available. If the knowledge is centered on the architecture,
as opposed to learning rules, then it is better to implement evolution of the
architecture at the highest level. Through this, we minimize the search space.
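The weight-level evolutionary search (steps I-VI above) can be illustrated with a minimal, hedged Python sketch; the network architecture, fitness function, population size and mutation scheme are illustrative assumptions rather than settings given in [73].

import numpy as np

rng = np.random.default_rng(0)

# Tiny fixed architecture: 2 inputs -> 3 hidden -> 1 output, weights flattened into one chromosome.
N_WEIGHTS = 2 * 3 + 3 + 3 * 1 + 1   # weights plus biases

def forward(w, x):
    """Evaluate the network for input x using the flat weight chromosome w."""
    W1 = w[:6].reshape(2, 3); b1 = w[6:9]
    W2 = w[9:12].reshape(3, 1); b2 = w[12:]
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

# Illustrative fitness: how well the network fits XOR (higher is better).
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

def fitness(w):
    return -np.mean((forward(w, X) - y) ** 2)

# Steps I-VI: initialize population, evaluate fitness, select, apply genetic operators, repeat.
population = rng.normal(size=(20, N_WEIGHTS))              # I. N weight chromosomes
for generation in range(200):                              # V. loop towards the target generation count
    scores = np.array([fitness(w) for w in population])    # II. evaluate fitness of each network
    parents = population[np.argsort(scores)[-10:]]         # III. select the fitter individuals
    children = parents[rng.integers(0, 10, size=20)]       # III. children created from the selected parents
    children = children + rng.normal(scale=0.1, size=children.shape)  # IV. mutation operator
    population = children

best = max(population, key=fitness)                        # VI. end: keep the best chromosome
print("approximate XOR outputs:", forward(best, X).ravel())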
4.2.3 Selection of Appropriate Design
The question of selecting the proper design configuration for intelligent
systems can involve several combinations and permutations of the various methodologies
available at our disposal. This can lead to an exhaustive list of possible solutions, and
the best one has to be chosen depending upon the nature of the problem in hand.
The choice between soft-computing techniques and hybrid frameworks should be well
evaluated depending upon different criteria, such as speed vs. accuracy. The best
solution chosen for the problem could be the one which uses the least amount of
computational resources, or the one that provides more accuracy irrespective of the
computational speed.
4.3
Architecture of Intelligent System with Uncertainty
Intelligence is defined as “the ability to act appropriately in an uncertain
environment, where appropriate action is that which increases the probability of
success, and success is the achievement of behavioral goals [33].” The success of
an intelligent system is directly dependent upon the efficiency of the system
architecture. An effective and efficient architecture provides a systematic framework
which can be used to implement intelligent systems that can deal with uncertainty.
These architectures at a higher level identify the main modules that are required
during the implementation.
There have been various architectures in place to implement intelligent
systems that can deal with uncertainty. In this section, a few of these architectures
are explored with a focus on how each of them deals with uncertainty.
4.3.1 Architecture for Intelligent System
Basic Architecture
The basic architecture is a simple architecture which receives an input X; this input
represents the problem at hand. This could be the application being worked upon,
such as data mining or e-services. Function 1 can be implemented through any
soft-computing technique, such as fuzzy logic or rough set theory, whichever fits
best depending upon the nature of the problem. Once it receives the input, it
processes the data to produce an output Y, as shown in figure 10.
Figure 10: Basic Architecture for Intelligent Systems
As an example, in the case of data mining, if this were the clustering phase,
Function 1 could be implemented through fuzzy logic. Fuzzy logic is used to
calculate the degree of belief (grade of membership) of each data value in the clusters.
The output is the data clustered together along with the degree of belief values.
The basic architecture as depicted in figure 10 could be implemented for various
sets of applications. This is usually applied in cases where one soft computing
technique can solve the problem at hand. Intelligent systems based on this type of
architecture can be easily implemented, but may not be very efficient in solving
complex situations.
4.3.2 Architecture for Hybrid Intelligent System
Hybrid intelligent systems are becoming very popular due to their ability to be
implemented through hybridization of soft computing techniques. Hybridization of
different techniques offers the best of both worlds; they utilize the best of AI
techniques to implement intelligent systems that are more efficient and effective.
There are three general approaches to the architecture of hybrid intelligent systems
[79]:
1. Sequential Type
This is a type of architecture where different functions are performed in a
defined sequence. Function 1 receives an input X representing the problem
at hand. It processes the data and produces an output Y, which is fed as an
input to Function 2; Function 2 further processes this data, thereby producing
the final output Z. This is represented in Fig 11.
Figure 11: Sequential Type of Architecture
In this type of architecture Function 1 and Function 2 could be
implemented using different algorithms. Function 1 can be implemented
through fuzzy logic and Function 2 could be implemented using a neural
network, or vice versa. For example, in the case of e-services, uncertainty
exists at the level of the user initiating the query. When the user input
is received, Function 1 is implemented through a fuzzy logic algorithm. It
processes the data and creates fuzzy sets. Function 2 can be implemented
using a neural network; once Function 2 receives the input Y, it calculates
weights, thereby producing the final output Z. In this case uncertainty is dealt
with at the front level only, while interpreting the user's query (a hedged code
sketch of this sequential composition is given after this list).
2. Parallel Type
This is a type where different functions are performed in parallel, as shown in
Figure 12; there could be a few variations of this. The two functions
(Function 1 & Function 2) running in parallel could be working on the same
problem, and then Function 3 will choose the better solution and give that as
the final output. If this is the setup of the problem, then uncertainty needs to
be handled at the front level only, when input X is received. Functions 1 & 2
could be implemented using fuzzy logic and rough set theory. Function 3
will choose the better of the two solutions, which can be performed using neural
networks.
Figure 12: Parallel Type of Architecture
The other variation is that the two functions could be
performing different roles (narrow & broad) and then Function 3 will
aggregate their outputs to produce the final output. In this case, uncertainty
has to be dealt with at two levels: the front level when input X is received, and
secondly when the inputs Y are received by Function 3. During aggregation of
the two solutions, if there is still some uncertainty, it can be dealt with by Function
3 to produce the output Z.
3. Feedback Type
This is a type where function 1 performs the main function required, and
function 2 is there to fine tune the parameters of function 1 (figure 13), so that
the desired output is an optimal solution.
Figure 13: Feedback Type of Architecture
This type of architecture can be implemented in two ways:
• Selection of behavior before the fact – this is achieved by analyzing the goals provided to the system.
• Selection of behavior after the fact – achieved through the process of subsumption.
Uncertainty can be dealt with at the level of receiving an input. In this
architecture, Functions 1 and 2 could be implemented through various
algorithms such as fuzzy logic, neural networks, Bayesian networks or
evolutionary algorithms.
The different types of architecture for hybrid systems are based upon a mix and
match of different soft-computing techniques. This mix and match offers the best of
AI.
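To make the sequential type of architecture concrete, here is a minimal, hedged Python sketch of the two-stage composition described in the e-services example above: Function 1 fuzzifies a crisp user input, and a simple weighting stage stands in for the neural network of Function 2. The membership functions, the weights and the example input are illustrative assumptions, not part of the cited architectures.

# Function 1: a fuzzy "front end" that turns a crisp user rating into fuzzy set memberships.
def fuzzify(rating):
    """Map a crisp rating in [0, 10] to degrees of membership in three fuzzy sets."""
    rating = min(10.0, max(0.0, rating))
    low = max(0.0, (5.0 - rating) / 5.0)
    high = max(0.0, (rating - 5.0) / 5.0)
    medium = max(0.0, 1.0 - low - high)
    return {"low": low, "medium": medium, "high": high}

# Function 2: a simple weighting stage standing in for the neural / scoring component.
WEIGHTS = {"low": 0.1, "medium": 0.5, "high": 0.9}   # illustrative weights

def score(memberships):
    """Aggregate fuzzy memberships into a single output Z (weighted average)."""
    num = sum(WEIGHTS[k] * v for k, v in memberships.items())
    den = sum(memberships.values())
    return num / den if den else 0.0

def sequential_pipeline(x):
    y = fuzzify(x)      # output Y of Function 1 feeds Function 2
    z = score(y)
    return z

print(sequential_pipeline(8.0))   # input X -> intermediate Y -> final output Z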
4.3.3 Evolutionary Algorithm Architecture
This is a specialized type of architecture for hybrid systems, where one function is
an evolutionary algorithm and the other can be chosen from the pool of available
soft-computing techniques.
Interactions between the evolutionary algorithm and intelligent paradigms can
take many different forms. An intelligent paradigm refers to a computational
intelligence technique such as fuzzy logic or multi-valued logic. Abraham and Grosan
in [73] have discussed several architectures for evolutionary intelligent systems. For
instance, the evolutionary algorithm can help optimize the intelligent paradigm, and the
intelligent paradigm in return can help optimize the evolutionary algorithm. Hence both
help each other to reach the desired level of optimization. Figure 14 from [73] shows the
architecture of this evolutionary intelligent system; the problem refers to real world
applications such as data mining and e-services.
Figure 14: Evolutionary Intelligent System Architecture [73]
The design and architecture of an intelligent system play a crucial role in its
success. With the recent hybridization of various soft-computing techniques,
hybrid systems have been developed which are fully capable of handling real
world applications. The next section provides an example of a real world application
making use of an intelligent system. This will help us understand the
design and architecture of intelligent systems using soft computing techniques.
4.3.4 Application: Suppression of Maternal ECG from Fetal ECG
Soft computing is clearly the emerging technique being used to build
intelligent systems. In this section, we will take a look at a real world application of
intelligent system called Adaptive Neuro Fuzzy Inference System (ANFIS) [86]. As
the name suggests, this Intelligent System is based upon combination of neural
networks and fuzzy logic.
Neural networks enable recognition of patterns and
becoming adaptive to the changing environment the agent operates in. Similarly,
fuzzy logic provides the capability of inference and decision making through
knowledge base. ANFIS plays a critical role in the field of signal processing; we will
see how this intelligent system can help with noise cancellation in signal processing.
76
The application of noise cancellation can be applied to many real world applications
such as telecommunication, speech recognition, and medical field.
For our
purposes we will take a look at one specific application in field of medicine where it
is being utilized to suppress maternal ECG from a fetal ECG.
Noise can be defined as an “unwanted energy, which interferes with the
desired signal [86]." The ultimate goal is to cancel or reduce the noise from the
signal so that it does not distort the signal and cause misinterpretation. The
underlying principle of noise cancellation is to “filter out an interference component
by identifying the non-linear model between a measureable noise source and the
corresponding immeasurable interference [86].” This is done by estimating the level
of noise in the system and then subtracting this from the signal. The effectiveness of
noise cancellation is directly dependent upon the accuracy of estimation of noise
level. It is a critical step in translating signals properly to what they truly represent;
this poses a challenge in the field of signal processing.
ANFIS is used to handle uncertainty by identifying the unknown non-linear
passage dynamics that transform the noise source into the noise estimate in a detected
signal [86]. The ANFIS architecture is composed of a neural network and fuzzy logic; we
will briefly go over a few details:
Neural Network
Neural networks have already been mentioned in Section 4.2.2. ANFIS is based
upon back propagation from neural networks.
Back Propagation
This learning algorithm is based upon the Widrow-Hoff learning rule, which is used
to train multi-layer feed-forward networks. Training a network involves using
input vectors and their corresponding output vectors until the network is trained to
approximate a function and is able to provide the expected association between input
and output vectors. Through this training, the network learns to associate inputs with
outputs.
Back propagation refers to the manner in which the gradient is computed for
non-linear multi-layer networks. When a back propagation network is properly trained,
it is able to associate, infer and make precise decisions when presented with an
unknown input. Usually, through similarity in the inputs, it will lean towards the correct
output. This is based on two phases of data flow. The first phase is where the input is
propagated from the input layer to the output layer, producing the output. The second
phase is where the error signal is propagated from the output layer back to the previous
layers to update the weights [86].
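The two phases of data flow described above (a forward pass, followed by backward propagation of the error signal to update the weights) can be illustrated with a minimal, hedged Python sketch of a single-hidden-layer network trained by gradient descent; the layer sizes, learning rate, squared-error loss and XOR training data are illustrative assumptions rather than the configuration used in [86].

import numpy as np

rng = np.random.default_rng(0)

# Tiny network: 2 inputs -> 4 hidden (tanh) -> 1 output (linear).
W1, b1 = rng.normal(scale=0.5, size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(scale=0.5, size=(4, 1)), np.zeros(1)
lr = 0.1   # learning rate (illustrative)

def train_step(x, target):
    global W1, b1, W2, b2
    # Phase 1: forward pass from the input layer to the output layer.
    h = np.tanh(x @ W1 + b1)
    y = h @ W2 + b2
    # Phase 2: propagate the error signal backwards and update the weights.
    err = y - target                      # output-layer error (gradient of squared-error loss)
    dW2 = np.outer(h, err)
    db2 = err
    dh = (err @ W2.T) * (1 - h ** 2)      # error propagated back through the tanh derivative
    dW1 = np.outer(x, dh)
    db1 = dh
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1
    return float(0.5 * np.sum(err ** 2))

# Illustrative training data: the XOR association between inputs and outputs.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])
for epoch in range(2000):
    for x, t in zip(X, T):
        train_step(x, t)

preds = [float(np.tanh(x @ W1 + b1) @ W2 + b2) for x in X]
print([round(p, 2) for p in preds])   # approximate network outputs after training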
Fuzzy Logic
Fuzzy logic has already been discussed in greater detail in section 2.2. ANFIS is
based upon a fuzzy inference system.
Fuzzy Inference System
"A fuzzy inference system is the process of formulating the mapping from a
given input to an output using fuzzy logic [86]." Figure 15 [89] illustrates the
functional blocks of a fuzzy inference system. This system takes a crisp input and
returns a crisp output through a weighted average.
Figure 15: Basic Configuration of a Fuzzy Logic System [89]
Suppressing Maternal ECG from Fetal ECG
The noise cancellation application is used and implemented in various real world
problems; one such problem is suppressing the maternal ECG from the fetal ECG.
Pregnancy is a very critical stage where utmost precaution should be taken
by the mother for the safety of both the mother and the baby. Many health problems of a
newborn baby can be reduced by monitoring the fetus's heart rate, since heart rate is
an important indicator of health [90]. ECG, which stands for electrocardiogram, can
be recorded and processed to derive this heart rate. The maternal ECG represents the
mother's ECG and the fetal ECG represents the fetus's ECG. While trying to get
measurements of the fetal ECG, there is interference from the maternal ECG. Hence it is
crucial to suppress the maternal ECG from the fetal ECG while measuring the abdominal
signal, in order to get an accurate reading by cancelling the noise.
ANFIS comes into play to deal with maternal ECG; we will look at details of
how maternal ECG is handled through ANFIS as discussed in [86]. Fetal ECG x(k)
is recorded through abdominal signal y(k) via a sensor in abdominal region. During
the process of recording y(k), this signal gets mixed (noisy) with mother’s heartbeat
n(k) which acts as a noise.
n(K) can be easily measured in this case through
thoracic signal obtained via a sensor placed at thoracic region. Noise does not
appear directly in y(k), but only appears in bits and pieces which distorts the signal
y(k). This is represented as:
y(k) = x(k) + d(k), where d(k) represents the distorted noise (equation 1)
     = x(k) + f(n(k), n(k-1), …)
Let B = f(n(k), n(k-1), …)
Function B represents the passage (path) that the noise signal n(k) takes; if this
path were known, we could recover the original signal as y(k) - d(k). Since the passage
dynamics are unknown and time variant due to the changing environment, this is not
straightforward. Ď(k) denotes the estimate of the distorted noise signal d(k) produced by
ANFIS.
The learning rule of ANFIS, which is implemented through the neural network,
aims at minimizing the squared error:
E(k)² = (y(k) - Ď(k))²
      = (x(k) + d(k) - Ď(k))²   (from equation 1)
The error depends on the difference between the distorted noise d(k) and its
estimate Ď(k); by driving Ď(k) towards d(k), the error is minimized and the remaining
signal approximates x(k). The ANFIS approach to noise cancellation works only when
[86]:
1. Noise signal n(k) is available and independent of information signal x(k)
2. Zero mean value for x(k)
3. Passage dynamic is known (path n(k) will take)
In our case of suppressing the maternal ECG from the fetal ECG, the information
signal x(k) is of sinusoidal form and the noise is a random signal. ANFIS performs the
calculation and the information signal is recovered. An overview of the algorithm as
discussed in [86] follows (a code sketch of this loop appears after the list):
1. The abdominal signal is generated
2. The thoracic signal is generated
3. The interference signal is generated
4. The interference and abdominal signals are mixed to generate the measured signal
5. Ď(k), the estimate of the distorted noise, is calculated
6. Ď(k) is subtracted from the measured signal to get the estimated signal
7. The error signal is calculated
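The following sketch mirrors these seven steps with synthetic signals. The sinusoid, the made-up nonlinear passage function, and the use of a simple LMS-style adaptive filter in place of the full ANFIS estimator are assumptions made for illustration only; they are not the implementation in [86].

import numpy as np

rng = np.random.default_rng(1)
k = np.arange(2000)

x = np.sin(2 * np.pi * 0.01 * k)            # information signal x(k), sinusoidal form
n = rng.normal(size=k.size)                 # thoracic (noise) signal n(k), random
d = 0.8 * n + 0.3 * np.roll(n, 1) ** 2      # hypothetical nonlinear passage f(n(k), n(k-1))
y = x + d                                   # measured abdominal signal y(k) = x(k) + d(k)

# Estimate the distorted noise and subtract it from the measured signal;
# a basic LMS adaptive filter stands in here for the ANFIS estimator.
taps, mu = 8, 0.01
w = np.zeros(taps)
x_hat = np.zeros(k.size)
for i in range(taps, k.size):
    ref = n[i - taps:i][::-1]     # recent noise samples as the reference input
    d_hat = w @ ref               # estimated distorted noise, i.e. Ď(k)
    e = y[i] - d_hat              # estimated signal = measured signal minus Ď(k)
    w += mu * e * ref             # the error signal drives the weight update
    x_hat[i] = e

print(np.mean((x[taps:] - x_hat[taps:]) ** 2))   # residual error after cancellation

An ANFIS estimator would replace the linear filter above so that the nonlinear part of the passage dynamics can also be learned.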
Figure 16 [87] is a high level overview of ANFIS cancelling maternal ECG
from the signal.
Figure 16: Maternal ECG Cancellation in Abdominal Signal using ANFIS [87]
In the real world, attempting to accurately measure the fetal ECG without
uncertainty models would give a wrong measurement: there would be considerable
interference from the maternal ECG that could not be cancelled or reduced, since the
passage dynamics are unknown. When the same problem is handled through an
intelligent system that can handle uncertainty, a good estimate of the fetal ECG is
obtained. Even though this estimate contains some error, it gives a better prediction of
the fetal ECG along with a measurement of that error.
An intelligent system implemented through a neural network alone would be
more complex to train, and the measurement of error in the estimated signal would be
higher. If fuzzy logic alone were used, it would be hard to create all of the if-then rules,
since the environment is complex. Because ANFIS draws on both the neural network
and fuzzy logic, it gets the best of both worlds. The error obtained through ANFIS is not
zero and does contain high frequency noise, but its mean is zero.
ANFIS is just one of many examples of intelligent systems being used in real
world applications to solve complex problems that involve uncertainty. The ANFIS
architecture has evolved through the combination of various uncertainty models. A lot
of work has been conducted in this field; for example, R. Swarnalath and D.V. Prasad
combined ANFIS with wavelets for maternal ECG cancellation, as detailed in [87].
Knowledge uncertainty plays a critical role in an intelligent system from
beginning to end: the system preprocesses the input so that it can be accepted,
transforms the input through various uncertainty models to effectively handle the
uncertainty that exists in the data, and finally produces the output. An intelligent system
that is implemented to handle uncertainty can handle real world situations more
accurately and effectively than a system where uncertainty is ignored entirely.
CHAPTER 5
CONCLUSION AND RECOMMENDATIONS
5.1 Conclusion
Artificial intelligence is an ever growing field with a lot of scope for research and
advancement. It has gained a lot of popularity through its ability to handle real world
situations, and many new theories and methodologies have been introduced.
Knowledge uncertainty plays a crucial role in the field of AI because uncertainty is a
part of our day to day lives. To build more robust intelligent systems that can think and
act like a human being, we have to apply uncertainty models that accept approximations
instead of exactness. Approximation is a reality of today’s world; hence it is important
for intelligent systems to use models which can handle it.
Several theories are in place to deal with uncertainty, ranging from probabilistic
and possibilistic theories to combinations of the two. This essay looked at four main
uncertainty models: Fuzzy logic, Rough set theory, Multi-valued logic, and Bayesian
networks. These models share one common goal: to handle uncertainty, imprecision
and incompleteness in the knowledge base. The approach to handling uncertainty
varies amongst these models, and some are more effective in certain domains
depending upon the nature of the domain in question. Hence we cannot claim that one
model is better than another, because the solution to be implemented depends very
much on the type of application.
Hybridization of soft computing techniques gives hybrid intelligent systems a
cutting edge. These systems can handle complex problems efficiently and effectively if
implemented carefully with an understanding of the task to be solved. More and more
research is being conducted on hybridization, and a lot of work has been done to bring
it into our day to day lives.
Data mining and semantic web services are two different applications that need
to handle uncertainty existing at different levels. Different models have been identified
and used for this purpose; this essay recommends fuzzy logic for handling the different
uncertainties that exist in these applications. For data mining, fuzzy logic proves to be a
good model: it is based on membership functions and degrees of belief, which match
what data mining needs. Its ability to transform crisp sets into fuzzy sets, in which the
degree of membership signals which objects belong more strongly to a set than other
objects in the same set, and its ability to search for hidden patterns in huge amounts of
data make it suitable for data mining. A small sketch of this transformation follows.
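As a minimal illustration of turning a crisp attribute into fuzzy membership degrees (the attribute, breakpoints and set labels below are invented for the example):

def age_membership(age):
    # Map a crisp age value to degrees of membership in three fuzzy sets.
    # The "young"/"middle"/"old" labels and breakpoints are illustrative only.
    young = max(0.0, min(1.0, (35 - age) / 15))    # 1 below age 20, 0 above 35
    old = max(0.0, min(1.0, (age - 45) / 15))      # 0 below age 45, 1 above 60
    middle = max(0.0, 1.0 - young - old)           # the remaining degree
    return {"young": young, "middle": middle, "old": old}

print(age_membership(28))   # each record belongs to every set to some degree

A mining algorithm can then weight a record by these degrees rather than forcing it into a single crisp category.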
Similar to data mining, fuzzy logic is recommended for the semantic web service
domain. Uncertainty exists in semantic web services at different levels, from the user
initiating a query to finding results that match the query. Fuzzy logic generates a fuzzy
set to understand the user’s query, and when the system retrieves results against that
query it creates two fuzzy sets containing weight and distance information, used to
display the results in order from most relevant to least relevant, as the small sketch
below illustrates.
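The ranking step can be pictured with a toy example in which each retrieved service carries a weight (importance against the query terms) and a distance (semantic distance from the query); all values below are invented for illustration.

# Toy ranking of retrieved services by a fuzzy relevance degree.
results = {
    "serviceA": {"weight": 0.9, "distance": 0.2},
    "serviceB": {"weight": 0.6, "distance": 0.5},
    "serviceC": {"weight": 0.8, "distance": 0.7},
}

def relevance(info):
    # Higher weight and lower distance give a higher degree of relevance.
    return info["weight"] * (1.0 - info["distance"])

ranked = sorted(results, key=lambda name: relevance(results[name]), reverse=True)
print(ranked)   # displayed from most relevant to least relevant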
The concept of fuzzy sets is what sets fuzzy logic apart from other soft-computing
techniques.
Design and architecture play a central role in the success of an intelligent
system. More and more algorithms have been developed recently to achieve good
results without compromising on speed or using too many computational resources.
The concept of natural selection is an interesting principle applied in the field of AI; it
helps to discard features that are not viable, thereby reducing the search space. At the
design level, dealing with uncertainty at the object, environment and goal levels helps
to deal with uncertainty at the architecture level. Therefore, having the right design and
architecture defines the success of an intelligent system.
As discussed in Section 4, ANFIS is an excellent example of an intelligent
system based upon the hybridization of a neural network and fuzzy logic, useful for
suppressing the maternal ECG from the fetal ECG.
As more work is conducted in the field of artificial intelligence and uncertainty, new
architectures are evolving that can handle complex problems with efficiency and
accuracy. The day when artificial intelligence can provide solutions to every real world
problem may not be far away.
5.2 Future Work
Knowledge uncertainty in intelligent systems has come a long way from the initial
state, where intelligent systems were used for basic computation, to today’s era, where
intelligent systems have evolved to handle complicated real life situations. The success
of these intelligent systems depends upon their ability to handle uncertainty. Future
research should be conducted to create more hybrid models generated by mixing and
matching the available models. To handle real world applications, we should also
increase the speed of computation by using algorithms that operate over a smaller
search space, compacting the environment into a representation that is smaller yet still
true to the world it represents.
REFERENCES
[1]
L.A. Zadeh, “The Role of Fuzzy Logic in the Management of Uncertainty in
Expert Systems,” Fuzzy Sets and Systems, Volume 11, Issues 1–3, pp. 199–
227, 1983.
[2]
J. Y. Halpern, Reasoning About Uncertainty, p. 434. Cambridge, MA: MIT
Press, 2003.
[3]
“Uncertainty,” http://en.wikipedia.org/wiki/Uncertainty, 3 July 2010.
[4]
A. Motro, P. Smets, Uncertainty Management in Information Systems: from
Need to Solutions, p. 459. Norwell, Massachusetts: Kluwer Academic
Publisher, 1997.
[5]
A. Celikyilmaz and I.B. Turksen, Modeling Uncertainty with Fuzzy Logic, p.
400. Heidelberg, Germany: Springer, 2009.
[6]
“Fuzzy Logic,” http://en.wikipedia.org/wiki/Fuzzy_logic, 24 July 2010.
[7]
Y.Y. Yao, “A Comparative Study of Fuzzy Sets and Rough Sets,” Information
Sciences, Volume 109, Issues 1-4, pp. 227–242, 1998.
[8]
“Multi-valued Logic,” http://en.wikipedia.org/wiki/Multi-valued_logic, 5 August
2010.
[9]
S. Greco, B. Matarazzo and R. Slowinski, “Rough Sets theory for Multicriteria
Decision Analysis,” European Journal of Operational Research, Volume 129,
Issue 1, pp. 1-47, 2001.
[10]
“Rough Set,” http://en.wikipedia.org/wiki/Rough_set, 15 August 2010.
[11]
M.Bit, T. Beaubouef, “Rough Set Uncertainty for Robotic Systems,” Journal of
Computer Sciences in Colleges, Volume 23, Issue 6, pp. 126-132, 2008.
[12]
Stuart Russell, and Peter Norvig, Artificial Intelligence: A Modern Approach,
Second Edition, p. 986. Upper Saddle River, N.J.: Prentice Hall, 2002.
[13]
Zdzislaw Pawlak, “Vagueness and Uncertainty: A Rough Set Perspective,”
Computation Intelligence, Volume 11, Issue 2, pp. 227-232, 1995.
[14]
H.G. Solheim, “Discerning Objects,” 15 August 2010,
http://www.pvv.ntnu.no/~hgs/project/report/node40.html
[15]
L.A. Zadeh, "Knowledge Representation in Fuzzy Logic," 1989 IEEE
Transactions on Knowledge and Data Engineering, Volume 1, Issue 1, pp.
89-100, March 1989.
[16]
Zdzislaw Pawlak, “Rough Set Approach to Knowledge Based Decision
Support,” European Journal of Operational Research, Volume 99, Issue 1, pp.
48-57, May 1997.
[17]
Pawlak, Z., and Skowron, A., “Rough membership functions,” Advances in the
Dempster Shafer Theory of Evidence, p. 251-271, New York, NY: John Wiley
and Sons Inc., 1994.
[18]
E. Orlowska, "Many-valuedness and uncertainty," Multiple-Valued Logic, 27th
International Symposium on Multiple-Valued Logic (ISMVL '97), pp. 153,
1997.
[19]
M. Richardson, and P. Domingos, “Markov Logic Networks,” SpringerLink,
Volume 62, pp. 107-136, 2006.
[20]
D. Dubois, and H. Prade, “Possibility Theory, Probability Theory, and Multiple
Valued Logics,” Journal of Mathematics and Artificial Intelligence, Volume 32,
Issues 1-4, pp. 35-66, August 2001.
[21]
B.G. Buchanan and R.O. Duda, “Principles of Rule-Based Expert Systems,”
Advances in Computers, Volume 22, pp. 164-218, 1984.
[22]
“Bayesian Network,” http://en.wikipedia.org/wiki/Bayesian_network, 29 July
2010.
[23]
Eugene Charniak, “Bayesian Network Without Tears,” AI Magazine, Volume
12, Number 14, pp. 50-63, 1991.
[24]
“NP-Hard,” http://en.wikipedia.org/wiki/NP-hard, 1 August 2010.
[25]
J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible
Inference, p. 552. San Francisco, CA: Morgan Kaufmann Publishers Inc.,
1988.
[26]
S. Easterbrook and M. Chechik, “A Framework for Multi-valued Logic over
Inconsistent Viewpoints,” Proceedings of the 23rd International Conference on
Software Engineering, pp. 411-420, 2001.
[27]
N.D. Belnap. “A Useful Four-Valued Logic,” Modern Uses of Multiple-Valued
Logic, pp. 30-56, 1977.
[28]
B. Sarif and M. Abd-El-Barr, “Synthesis of MVL Functions – Part I: The
Genetic Algorithm Approach,” Proceedings of the International Conference on
Microelectronics, pp. 154-157, 2006.
[29]
G. Pomper and J. A. Armstrong, "Representation of Multivalued Functions
Using Direct Cover Method," 1981 IEEE Transactions on Computing, Volume
C-30, Issue 9, pp. 674-779, Sept. 1981.
[30]
P.W. Besslich, "Heuristic Minimization of MVL functions: A Direct Cover
Approach," 1986 IEEE Transactions on Computing, Volume C-35, Issue 2,
pp. 134-144, Feb 1986.
[31]
Dueck, G. W. and Miller, D. M., "A Direct Cover MVL Minimization: Using the
Truncated Sum," Proceedings of the 17th International Symposium on Multiple-Valued Logic, pp. 221-227, May 1987.
[32]
A. Borgi, K. Ghedira, and S.B.H. Kacem, “Generalized Modus Ponens Based
on Linguistic Modifiers in a Symbolic Multi-valued Framework,” Multi Valued
Logic, 38th International Symposium, pp. 150-155, 2008.
[33]
J.S. Albus, “Outline for a Theory of Intelligence,” Proceedings of the 1991
IEEE International Conference on Systems, Man, and Cybernetics, Volume
21, Issue 3, pp. 473-509, 1991.
[34]
C.J. Butz and J.Liu, “A Query Processing Algorithm for Hierarchical Markov
Networks,” 2003 IEEE/WIC International Conference on Web Intelligence
(WI’03), pp. 588-592, 2003.
[35]
C.Beeri, R. Fagin, D. Maier, and M. Yannakakis, “On the Desirability of
Acyclic Database Schemes,” Journal of the Association for Computing
Machinery, Volume 30, Issue 3, pp. 479-513, 1983.
[36]
L.R. Rabiner, B.H. Juang, “An Introduction to Hidden Markov Models,” IEEE
ASSP Magazine, Volume 3, Issue 1, pp. 4-16, 1986.
[37]
Ralph L. Wojtowicz, “Non-Classical Markov Logic and Network Analysis,” 12th
International Conference on Information Fusion, pp. 938-947, 2009.
[38]
M. Richardson, and P. Domingos, “Markov Logic Networks,” Machine
Learning, Springer Science+Business Media, Volume 62, Issues 1-2, pp. 107-136, 2006.
[39]
“Clique (graph theory),”
http://en.wikipedia.org/wiki/Clique_%28graph_theory%29, 6 September 2010.
[40]
D. Hand, H. Mannila, and P. Smyth, Principles of Data Mining, p. 546.
Cambridge, England: MIT Press, 2001.
[41]
D. Gunopulos, M. Halkidi, and M. Vazirgiannis, Uncertainty Handling and
Quality Assessment in Data Mining, p. 421. London, England: Springer-Verlag London Limited, 2003.
[42]
U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, “From Data Mining to
Knowledge Discovery in Databases,” AI Magazine, Volume 17, Number 3, pp.
37-54, 1996.
[43]
T. Hastie, R. Tibshirani, J. Friedman, and J. Franklin, “The Elements of
Statistics Learning: Data Mining, Inference, and Prediction,” Springer,
Volume 27, Number 2, pp. 83-85, 2001.
[44]
Y. Xia, “Integrating Uncertainty in Data Mining,” Ph.D Dissertation. University
of California at Los Angeles, Los Angeles, CA. Advisor(s) Richard R. Muntz,
pp. 1-185, 2005.
[45]
J. Han, and M. Kamber, Data Mining: Concepts and Techniques, Second
Edition, p. 386. San Francisco, CA: Morgan Kaufmann Publishers, 2006.
[46]
“Data,” http://en.wikipedia.org/wiki/Data, 24 September 2010.
[47]
Han Jing, “Application of Fuzzy Data Mining Algorithm in Performance
Evaluation of Human Resources,” IEEE Transactions on Computing, Volume
1, pp. 343-346, 2009.
[48]
N. Bissantsz, and J. Hagedorn, “Data Mining,” Business & Information
Systems Engineering, Volume 1, Issue 1, pp. 118-121, 2009.
[49]
W.J. Frawley, G.P. Shapiro, C.J. Matheus, “Knowledge Discovery in
Databases: an Overview,” AI Magazine, Volume 13, Number 3, pp. 57-70,
1992.
[50]
Berkhin, P., “Survey of Clustering Data Mining Techniques,”
http://citeseer.ist.psu.edu/berkhin02survey.html, 20 September 2010.
[51]
G. Raju, B. Thomas, S. Kumar, and S. Thinley, “Integration of Fuzzy Logic in
Data Mining to Handle Vagueness and Uncertainty,” Advanced Intelligent
Computing Theories and Applications, Volume 5227, pp. 880-887, 2008.
[52]
S. Mitra, S.K. Pal, and P. Mitra, “Data Mining in Soft Computing Framework:
A Survey,” IEEE Transactions on Neural Networks 13, Volume 1, pp. 3–14,
2002.
[53]
D. Berardi, D. Calvanese, G. Giacomo, M. Lenzerini, and M. Mecella, “A
Foundational Vision of E-Services,” Web Services, E-Business, and the
Semantic Web, Volume 3095, pp. 28-40, 2004.
[54]
M. Aiello, M.P. Papazoglou, J. Yang, M. Carman, M. Pistore, L. Serafini, and
P. Traverso, “A Request Language for Web-Services Based on Planning and
Constraint Satisfaction,” Proceedings of the Third International Workshop on
Technologies for E-Services, Volume 2444/2002, pp. 9-38, 2002.
[55]
J. Rowley, “An analysis of the e-service literature: towards a research
agenda,” Internet Research, Emerald Group Publishing Limited, Volume 16,
Number 3, pp. 339-359, 2006.
[56]
Z. Yang, “Consumer Perceptions of Service Quality in Internet-Based
Electronic Commerce,” Proceedings of the EMAC Conference, pp. 339-359,
2001.
[57]
G.J. Klir and M.J. Wierman, Uncertainty-Based Information: Elements of
Generalized Information Theory, p. 165. Heidelberg, Germany: Springer,
1999.
[58]
G. Yee, and L. Korba, “Negotiated Security Policies for E-Services and Web
Services,” Proceedings of the 2005 IEEE International Conference on Web
Services, pp. 1-8, 2005.
[59]
Z. Cob, and R. Abdullah, “Ontology-based Semantic Web Services
Framework for Knowledge Management System,” IEEE Transactions on
Computing, Volume 2, pp. 1-8, 2008.
[60]
F. Martin-Recuerda, and D. Robertson, “Discovery and Uncertainty in
Semantic Web Services,” URSW (LNCS Vol.) 2008, pp. 108-123, 2008.
[61]
D. Parry, “Tell Me the Important Stuff” - Fuzzy Ontologies And Personal
Assessments For Interaction With The Semantic Web,” Proceedings of the
2008 IEEE World Conference on Computational Intelligence, pp. 1295-1300,
2008.
[62]
E. Sirin, and B. Parsia, “Planning for Semantic Web Services,” International
Workshop “Semantic Web Services” at ISWC, pp. 1-15, 2004.
[63]
J. Hoffmann, P. Bertoli, and M. Pistore, “Web Service Composition as
Planning, Revisited: In Between Background Theories and Initial State
Uncertainty,” Proceedings of the 2007 National Conference on Artificial
Intelligence, pp. 1013 – 1018, 2007.
[64]
H. Haas and A. Brown (2004). Web Services Glossary,
http://www.w3.org/TR/wsgloss/, 16 September 2010.
[65]
A. Polleres, “Services as Application Areas for Answer Set Programming,”
Dagstuhl Seminar Proceedings 05171, pp. 1-6, 2005.
[66]
B. Sandvik, “Thematic Mapping on the Semantic Web,”
http://blog.thematicmapping.org/2008_07_01_archive.html, 19 September
2010.
[67]
J. Carbonell, “Semantic Web Services o la Web Activa,”
http://www.lacofa.es/index.php/general/semantic-web-services-o-la-webactiva, 20 September 2010.
[68]
P. Oliveira, and P. Gomes, “Instance-based Probabilistic Reasoning in the
Semantic Web,” Proceedings of the 18th International Conference on World
Wide Web, pp. 1067-1068, 2009.
[69]
M. Holi and E. Hyvonen, “A Method for Modeling Uncertainty in Semantic
Web Taxonomies,” Proceedings of the 13th International World Wide
Conference, pp. 296-297, 2004.
[70]
H. Zimmermann, “Fuzzy Set Theory,” Computational Statistics, Wiley
Interdisciplinary Reviews, Volume 2, Issue 3, pp. 317-332, 2010.
[71]
A. Sheth, C. Ramakrishnan, and C. Thomas, “Semantics for the Semantic
Web: the Implicit, the Formal, and the Powerful,” International Journal on
Semantic Web and Information Systems, Volume 1, Issue 1, pp.1-18, 2005.
[72]
K. Shehzad, and M. Javed, “Multithreaded Fuzzy Logic based Web Services
Mining Framework,” European Journal of Scientific Research, Volume 41,
Issue 4, pp. 632-644, 2010.
[73]
A. Abraham, C. Grosan J. Kacprzyk and W. Pedrycz, Studies in
Computational Intelligence, Volume 82, p. 441. Berlin, Germany: Springer,
2008.
[74]
L.A. Zadeh, “Soft Computing and Fuzzy Logic,” IEEE Transactions on
Computing, Volume 11, Issue 6, pp. 48-56, 1994.
[75]
T. Ito, “Dealing with Uncertainty in Design and Decision Support
Applications,” International Journal of Soft Computing Applications, Issue 1,
pp. 5-16, 2007.
[76]
A. Korvin, H. Lin, and P. Simeonov, Knowledge Processing with Interval and
Soft Computing, p. 233. London, England: Springer, 2008.
[77]
J. Jang, C. Sun, and E. Mizutani, Neuro-Fuzzy and Soft Computing: A
Computation Approach to Learning and Machine Intelligence, p. 614. Upper
Saddle River, N.J.: Prentice Hall, 1996.
[78]
L.A. Zadeh, “The Roles of Soft Computing and Fuzzy Logic in the
Conception, Design and Deployment of Intelligent System,” Proceedings of
IEEE Asia Pacific Conference on Circuits and Systems, pp. 3-4, 1996.
[79]
V. Vasilyev, and B. Ilyasov, “Design of Intelligent Control Systems with Use of
Soft Computing: Conceptions and Methods,” Proceedings of the 15th IEEE
International Symposium on Intelligent Control, pp. 103-108, 2000.
[80]
E. Simoudis, “Reality Check for Data Mining,”
http://cs.salemstate.edu/hatfield/teaching/courses/DataMining/M.htm, 26
September 2010.
[81]
Y. Fujiwara, Y. Sakurai, and M. Kitsuregawa, “Fast Likelihood Search for
Hidden Markov Models,” ACM Transaction on Knowledge Discovery from
Data, Volume 3, Issue 4, pp. 1-37, 2009.
[82]
S. Kok, and P. Domingos, “Learning Markov Logic Network Structure via
Hypergraph Lifting,” ACM Proceedings of the 26th Annual International
Conference on Machine Learning, pp. 505-512, 2009.
[83]
J.S. Albus, “A Reference Model Architecture for Intelligent System Design,”
Proceedings of the 1996 IEEE International Conference on Systems, Volume
1, Issue 1, pp. 15-30, 1996.
[84]
H. Boudali, and J.B. Dugan, “A Discrete-Time Bayesian Network Reliability
Modeling and Analysis Framework,” Engineering and System Safety, Volume
87, Issue 3, pp. 337-349, March 2005.
[85]
J. Lampinen, and A. Vehtari, “Bayesian Approach for Neural Networks –
Review and Case Studies,” Neural Networks, Volume 14, Issue 3, pp. 257-274, April 2001.
[86]
C.K.S Vijila, S. Renganathan, and S. Johnson, “Suppression of Maternal ECG
from Fetal ECG using Neuro Fuzzy Logic Technique,” Proceedings of the
International Joint Conference on Neural Networks, Volume 2, pp. 1007-1012,
2003.
[87]
R. Swarnalath, and D.V. Prasad, “Maternal ECG Cancellation in Abdominal
Signal Using ANFIS and Wavelets,” Journal of Applied Sciences, Volume 10,
Issue 11, pp. 868 – 877, 2010.
[88]
B.B. Jovanovic, I.S. Reljin, and B.D. Reljin, “Modified ANFIS architecture –
improving efficiency of ANFIS technique,” Neural Network Applications in
Electrical Engineering, pp. 215-220, 2004.
[89]
G. Luiz, C.Abreu, and J. Ribeiro, “On-line Control of a Flexible Beam Using
Adaptive Fuzzy Controller and Piezoelectric Actuators,” SBA Control and
Automation, Volume 15, Issue 14, pp. 377-383, 2003.
[90]
G. Clifford, “Fetal and Maternal ECG,” Biomedical Signal and Image
Processing, Volume 2, Issue 1, pp. 1-10, 2007.