[SURVEY OF] VISUALIZATION [TECHNIQUES] IN PROCESS MINING
M. Amiyo1, W.M.P. van der Aalst2, and P. J. Ogao1
1. Makerere University, P. O. Box 7062, Kampala, Uganda {mamiyo, [email protected]}
2. Technische Universiteit Eindhoven, PO Box 513, 5600 MB Eindhoven, The Netherlands [email protected]
Abstract: Visualization of process mining results is very important for gaining a good understanding of a business process. It becomes more crucial as processes become more complex. Unfortunately, the methods currently used to visualize these results do not take automatic processing features into consideration. As a result, as the complexity of a process increases, the size and complexity of the resulting model increase, and so does the difficulty of interpreting it. This paper seeks to review the visualization techniques that are being used to present process mining results and to evaluate their effectiveness.
1. INTRODUCTION
Companies and organizations are always looking for ways to improve their business so that they can gain a competitive advantage. In this dot-com age, most organizations have information systems that support their daily activities and make them more efficient. In addition, companies have invested in finding hidden knowledge in the information they already have through data mining techniques. These techniques are used to identify patterns within a data set, which is usually large. This knowledge is useful for a number of purposes, including decision making. However, the current trend shows that there has been a shift from data-driven (data-oriented) information systems to process-driven (process-oriented) information systems (Van der Aalst, 2004). As a way to understand business processes, the research field of process mining was born. The development and increased use of process mining techniques has been, and still is, based on data mining techniques (Turner and Tiwari, 2008).
Whereas data mining focuses on extracting knowledge from large sets of data, process mining focuses on extracting knowledge from the logs that an information system records during the execution of a process (Turner and Tiwari, 2008). Process mining can be done for three major reasons: discovery, conformance checking and extension of process models. It may therefore be defined as a method of extracting a structured description of a process from a set of process logs recorded during process execution. This description is a model that gives insight into the behaviour of a process according to different perspectives (Van der Aalst, 2004). The mined models are expressed in a process modelling language. The languages used include Petri nets and Event-driven Process Chains, which are graphical business process modelling languages and can therefore be used to understand and give a visual interpretation of a process (Streit, Pham, and Brown, 2005). The behaviour of a process may also be understood by distilling the process logs in light of several perspectives, which can be considered in combination and together with interesting aspects such as time and location (Van der Aalst, 2004).
Empowering users in organizations with the ability to understand the behaviour of processes has led to the increased use of process mining. Regardless of this, visualization of process mining results is still a challenge (Van der Aalst, 2004). As the complexity of a business process increases, the resulting graph or mined model becomes bigger and more difficult to interpret (Streit, Pham, and Brown, 2005); for example, “Petri Net-based models tend to become too large for analysis even for a modest-size system” (Murata, 1989, p. 542). Techniques such as zooming (Streit, Pham, and Brown, 2005) and abstraction (Murata, 1989, p. 542), which have been used in an attempt to overcome this problem, have limitations.
In addition, features such as colour, which could be used to carry additional information, find limited use in most graphical business process languages (Streit, Pham, and Brown, 2005). For example, in most business process modelling languages (BPML), controlled visual processing features are more dominant than automatic processing features such as colour (Streit, Pham, and Brown, 2005), which is a powerful tool in visualization. EPC, an example of a BPML, contains some automatic processing features such as colour; however, the colour does not carry additional information that can be used in the interpretation of the mined process models (Streit, Pham, and Brown, 2005). In light of this, it is clear that the presentation of a mined model is critical to understanding the behaviour of a process, especially a large and complex one. For large mined models to be understood, the level of controlled visual processing features should be reduced (Streit, Pham, and Brown, 2005). This leaves a need to find appropriate visualization techniques that can help in the visualization of complex process models.
This paper seeks to review the visualization techniques used in the presentation of process mining results, with the aim of evaluating their effectiveness in presenting complex processes. The rest of the paper is organized as follows: Section 2 discusses key concepts and definitions, Section 3 reviews the visualization techniques used in process mining, Section 4 evaluates the techniques identified in Section 3, and Section 5 gives concluding remarks.
2. KEY CONCEPTS AND DEFINITIONS
PROCESS MINING
In the past, information gathered in the enactment stage of the life cycle of a business process was rarely used except for security and audit purposes. Today, however, these event logs can be used to analyze and understand the underlying business process through process mining.
Process mining may be defined as a method of extracting a structured description of a process from a set of process logs recorded during process execution (Van der Aalst, 2004). This makes process mining an important means by which processes can be analyzed and monitored. The types of process mining include process discovery, conformance checking and process extension. Process mining therefore gives insight into what is actually going on within an organisation as compared to what is expected (Van der Aalst, 2008). The information extracted from such event (process) logs is mainly in the form of process models. A business process model consists of an ordered set of the activities that take place in a given process and shows the relationships and interactions between them, thus describing the process. These models are presented in the notations of various business process modelling languages, for example Event-driven Process Chains (EPC) and Petri nets.
Process mining results
Process mining results are mainly in the form of a process model based on the notation of a given process modelling language. Different process modelling languages have different notations, but all share four basic concepts: time point, activity, event and state. A time point is an instant of time that is not decomposable; an activity is some kind of performance; an event is a noteworthy occurrence; and a state is a set of properties of something (Söderström, Andersson, Johannesson, Perjons and Wangler, 2002). Regardless of notation, they all serve the same purpose of describing what a given process does and the objects and information it works on (Xinming and Haikun, 2005).
Each resulting process model shows one of many perspectives. These perspectives include the functional, behavioural, organizational and informational perspectives. This means that process models are perspective oriented: a user's view of interest determines the perspective of a model and thus what the model can be used for.
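To make the definition of process mining given in this section more concrete, the following minimal sketch derives a "directly-follows" relation from a toy event log, which is one simple kind of structured description that can be read off recorded traces. It is an illustration only, not the method of any particular plug-in or of the works cited here; the log contents, the activity names and the function name `directly_follows` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical event log: each trace is the ordered list of activities
# recorded for one case. The activity names are invented for illustration.
event_log = [
    ["register", "check", "approve", "archive"],
    ["register", "check", "reject", "archive"],
    ["register", "check", "approve", "archive"],
]


def directly_follows(log):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in log:
        # Pair each activity with its immediate successor in the same trace.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


relation = directly_follows(event_log)
for (a, b), n in sorted(relation.items()):
    print(f"{a} -> {b}: {n}")
```

Each counted pair can be read as a candidate edge between two activities, which is the kind of relationship a mined process model then has to display.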
In process mining, it is possible to discover models from three perspectives: the case (data/informational) perspective, the process (functional) perspective and the organizational (behavioural) perspective (Weijters, Van der Aalst, Van Dongen, Günther, Mans, Alves de Medeiros, Rozinat, Song, and Verbeek, 2007). It is therefore in the nature of all models to be incomplete and to focus on simplified, specific views or perspectives for a particular purpose (Xinming and Haikun, 2005).
The purpose of process mining results may be summarized in two broad categories: to inform and to aid action. As a means to inform, process mining results provide managers and other decision makers with information about the process; what is done with this information is left entirely to the intended recipient. When the purpose is to aid action, the findings must be used to determine the next line of action (Van der Aalst, 2006).
Process Mining Framework (ProM)
ProM is an open source framework that provides an environment for performing different types of process mining, for example process discovery and conformance checking. It supports process mining by providing a number of plug-ins, based on different techniques, that can be used to extract information from event logs. ProM also supports the notations of a number of process modelling languages, which gives the user the ability to obtain various presentations of a given model. The ProM environment is versatile and extensible, i.e. plug-ins can be developed and easily incorporated into the framework (Van Dongen et al., 2005).
VISUALIZATION
Visualization is a means by which abstract information, which may be in the form of text and numbers, is represented graphically for easier communication or interpretation. It can therefore be defined as any technique for creating images, diagrams, or animations to communicate a message, giving insight into the meaning of data (Streit, Pham, and Brown, 2005).
The main purpose of visualization is to provide insight into complex scenarios. It mainly focuses on finding meaning in unexplored data; in process mining, for instance, the event logs are treated as the unexplored data. Visualization provides users with graphical representations and animation techniques, which have been used in several research fields to identify data problems, provide insight and show relationships within a given data set. There are a number of major categories of visualization methods, each of which comprises several techniques. These methods include data, information, concept, strategy, metaphor and compound visualization.
3. VISUALIZATION TECHNIQUES
Many visualization techniques have been developed over the years. This section reviews the visualization techniques that are currently used to present process mining results. The visualization concepts that have been used in the presentation of process mining results include matrices, clustering, graph visualization and combinations of these.
Graph Visualization
This technique is used to display data whose elements are related to one another. The presentation consists of nodes and edges: the nodes represent the data elements and the edges represent the relations between them. Graph visualization has been applied in various domains, including real-time systems represented as state-transition systems or Petri nets (Herman et al., 2000). In process mining, process models may be constructed as Petri nets or EPCs. In a Petri net, a data element is one of two kinds of node, a place or a transition, and the relations between nodes are represented as directed edges. Each graph has a layout associated with it, which determines the way the nodes and edges are displayed. Layouts are usually implemented as algorithms.
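The Petri net structure described above (two kinds of node joined by directed edges) can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the representation used by ProM or by any cited work: the class name `PetriNet`, its methods and the example node names are all hypothetical. It relies on the standard property that an arc in a Petri net always links a place and a transition, never two nodes of the same kind.

```python
from dataclasses import dataclass, field


@dataclass
class PetriNet:
    """Hypothetical container for the nodes and directed edges of a Petri net."""
    places: set = field(default_factory=set)
    transitions: set = field(default_factory=set)
    arcs: set = field(default_factory=set)  # directed (source, target) pairs

    def add_arc(self, source, target):
        # An arc must run from a place to a transition or the other way round.
        place_to_transition = source in self.places and target in self.transitions
        transition_to_place = source in self.transitions and target in self.places
        if not (place_to_transition or transition_to_place):
            raise ValueError(
                f"arc {source!r} -> {target!r} must link a place and a transition"
            )
        self.arcs.add((source, target))

    def successors(self, node):
        """Return the nodes reachable from `node` over one directed edge."""
        return sorted(t for s, t in self.arcs if s == node)


# Invented example: a two-step process (check, then archive).
net = PetriNet(
    places={"p_start", "p_checked", "p_end"},
    transitions={"check", "archive"},
)
net.add_arc("p_start", "check")
net.add_arc("check", "p_checked")
net.add_arc("p_checked", "archive")
net.add_arc("archive", "p_end")
print(net.successors("p_start"))
```

A layout algorithm would take the node sets and `arcs` of such a structure as input and assign each node a position on the drawing surface.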
The different kinds of graph layout are summarized in Figure 1.
Figure 1: Overview of graph layout algorithms (Herman et al., 2000)
The most commonly used graph layout in process mining is the tree layout.
Matrix Visualization
This involves presenting data in the form of a matrix (rows and columns) and converting this data matrix into a matrix map. Each data entry is represented by a dot or square of a given colour (Legát, 2005). This technique has been used in the dotted chart and the cloud chamber. Some performance analysis results are displayed using this technique.
Clustering
Clustering is the process of grouping data, or discovering classes of data, based on a number of conditions or rules. In visualization, clustering is used to group related data as a means of reducing the complexity of a large graph and the space used to display it (Herman et al., 2000). This technique has not been used independently to present process mining results. It has been used to cluster related tasks or events in EPC process model diagrams; for example, it has been used in conjunction with the fuzzy miner.
4. EVALUATION OF TECHNIQUES
This section will contain an evaluation of the challenges or loopholes associated with the visualization techniques identified in the section above in relation to process mining results.
5. CONCLUSION
This section will contain concluding remarks and possible recommendations to address the challenges or loopholes identified in the section above.
REFERENCES
Conradi R. and Jaccheri M.L. (1999). Process Modelling Languages. Lecture Notes in Computer Science: Software Process, 1500, 27-52.
Murata T. (1989). Petri Nets: Properties, Analysis and Applications. Proceedings of the IEEE, 77(4), 541-580.
Herman I., Melançon G. and Marshall M.S. (2000). Graph Visualization and Navigation in Information Visualization: A Survey. IEEE Transactions on Visualization and Computer Graphics, 6(1), 24-43.
Söderström E., Andersson B., Johannesson P., Perjons E. and Wangler B. (2002). Towards a Framework for Comparing Process Modelling Languages. Lecture Notes in Computer Science: Advanced Information Systems Engineering, 2348, 600-611.
Streit A., Pham B. and Brown R. (2005). Visualization Support for Managing Large Business Process Specifications. Lecture Notes in Computer Science: Business Process Management, 3649, 205-219.
Streit A., Pham B. and Brown R. (2005). Resource-Centric Worklist Visualization. Lecture Notes in Computer Science: Workflow and Business Processes, 3760, 94-111.
Turner C.J. and Tiwari A. (2008). A Review of Business Process Mining: State-of-the-Art and Future Trends. Business Process Management Journal, 14(1), 5-22.
Van der Aalst W.M.P. (2008). Process Mining. Retrieved January 18, 2008, from Process mining research tools application, Website: http://ga1717.tm.tue.nl/wiki/
Van der Aalst W.M.P. (2006). Process Mining and Monitoring Processes and Services: Workshop Report. Dagstuhl Seminar Proceedings 06291: The Role of Business Processes in Service Oriented Architectures. Retrieved June 5, 2008, from Website: http://drops.dagstuhl.de/opus/volltexte/2006/834/pdf/06291.SWM.Paper.834.pdf
Van der Aalst W.M.P. (2005). Process Mining. Retrieved January 18, 2008, from Business Process Management (BPM) Center, Website: http://is.tm.tue.nl/staff/wvdaalst/BPMcenter/process%20mining.htm
Van der Aalst W.M.P. (2004). Business Process Management Demystified: A Tutorial on Models, Systems and Standards for Workflow Management. Lecture Notes in Computer Science: Lectures on Concurrency and Petri Nets, 3098, 1-65.
Van der Aalst W.M.P. and Weijters A.J.M.M. (2004). Process Mining: A Research Agenda. Computers in Industry, 53(3), 231-244.
Van Dongen B., Alves de Medeiros A.K., Verbeek H.M.W., Weijters A.J.M.M. and Van der Aalst W.M.P. (2005). The ProM Framework: A New Era in Process Mining Tool Support. Lecture Notes in Computer Science: Application and Theory of Petri Nets, 3536, 444-454.
Weijters A.J.M.M., Van der Aalst W.M.P., Van Dongen B., Günther C., Mans R., Alves de Medeiros A.K., Rozinat A., Song M. and Verbeek E. (2007). In M. Dastani and E. de Jong, editors, Proceedings of the 19th Belgium-Netherlands Conference on Artificial Intelligence (BNAIC).
Weske M. (2007). Business Process Management: Concepts, Languages, Architectures. Berlin: Springer.
Xinming J. and Haikun W. (2005). Scenario-based Comparison and Evaluation: Issues of Current Business Process Modeling Languages. Engineering Manufacture: IMECHE, 220, pp. 1527-1538.