SELF-ORGANISING MAPS FOR USER COMMUNITIES

Sennaike O. A., Ojo A. K. and Sofoluwe A. B.

Abstract

User communities are usually constructed from user data, which are fairly static. However, these data are usually explicitly acquired from the user. Usage data, on the other hand, are very dynamic and can be acquired unobtrusively, but present a number of challenges, including the processing of sequential data. We propose the use of Self-Organising Maps (SOM) in constructing user communities based on usage data. We also introduce the use of the transition matrix for the representation of usage data. We further show the applicability of our approach by applying it to call data from a mobile telecommunications network operator.

1 Introduction

Recent advances in information technology have made collection and storage of data commonplace. However, obtaining information from the collected data is becoming increasingly difficult due to both the huge amount of data available and the diversity in the background and interests of users, making conventional information retrieval systems rather inefficient. Also, many systems are built with extensive functionality for users with various backgrounds and interests. Such systems tend to be either quite complex for casual users or inadequate for expert users. For these complex software systems to be usable, it is important for a system to be able to identify the user interacting with it and to provide only the functionality relevant to the present needs of that user. User models and user modeling are attempts to make good predictions about the users of a system and the use to which such a system will be put.

In this paper, we propose a methodology for constructing user communities from unobtrusively acquired user interaction data (usage data). Constructing user communities from usage data, as opposed to user data, introduces a number of issues.
User data are fairly static and unstructured, and can easily be used with available clustering/classification algorithms with little or no data transformation. Usage data, on the other hand, are very dynamic and structured (hierarchies, sequences, time series, etc.), with interaction histories of variable length. Available clustering/classification algorithms are not able to process these forms of data without extensive data transformation. Our methodology employs the transition matrix for representing usage data and subsequently uses SOM to discover the user communities.

This research was partly supported by CRC No 99/2000/02 of University of Lagos, Akoka, Nigeria.

The rest of this section gives an insight into some important concepts used in our work, including user modeling and Self-Organising Maps. Section 2 discusses our SOM-based approach to constructing user communities. In Section 3, we present a case study of mobile phone subscribers and use our approach to discover user communities. In Section 4, our conclusions are presented and current research effort is discussed.

1.1 User Modeling

User models attempt to provide a model of a user. The different kinds of data in a user model can be classified as user data, usage data and environment data [Kobsa et al 01].

User data comprise the various characteristics of the user. These include demographic data, user knowledge, user skills and capabilities, user interests and preferences, and user goals and plans [Kobsa et al 01].

Usage data are data about user interaction with the system. As noted in [Kobsa et al 01], there are potential overlaps between usage data and user data, since user data can be inferred from usage observation. In particular, Brusilovsky limits usage data to data about user interaction with the system that cannot be resolved to user characteristics but can still be used to make adaptation decisions.
Environment data are data about the user’s environment that are not related to the users themselves. This is particularly relevant to web-based systems, where the range of different hardware and software is very diverse. Furthermore, location information is also important to some applications [Brusilovsky 01].

User modeling can be viewed as the process of constructing and applying (often computer-based) models of individuals and/or groups of users. User Modeling Systems consist of components that provide user modeling facilities to other systems. They provide three (3) essential facilities [Kass et al 88]: acquisition, representation and maintenance, and access facilities. An extension of these requirements is presented in [Kobsa 01a], with a number of services listed as being required of user modeling systems.

It is desirable that initial user groups in the domain in question be identified. These groups share the same interests according to a set of criteria [Benaki et al 97] and are referred to as stereotypes. Stereotypes are organized in a single-rooted hierarchical structure, with a stereotype being able to inherit information from several immediate subsumers, as in a lattice [Kass et al 88]. An individual user model is thus represented as a leaf node in the hierarchy. Users can be assigned to one or more stereotypes, with the users inheriting the characteristics of these stereotypes [Kass et al 88].

Some problems associated with stereotypes are listed below [Paiva 95]:

- It is difficult to determine what stereotypes to define for a certain application.
- It is difficult to establish the boundaries between stereotypes.
- The information created needs to be constantly revised in order to maintain the consistency of the models.
- The inferences are too general to be used in fine-grained models.
Further, the user classifications of the stereotyping method are ad hoc and unprincipled, and they can be exploited by the adaptive system only after a large number of trials by various kinds of users.

1.2 User Communities

The concept of user communities refers to explicitly clustering users with similar behaviour through the users’ interaction with the system. The idea of user communities was introduced by Jon Orwant in the user modeling system called Doppelganger [Orwant 93]. User communities are similar to stereotypes in that they permit prediction of default values for the user model. They differ from stereotypes in that they are defined from a collection of user models and are dynamic, being recomputed periodically. Perhaps more significantly, a user has probabilistic membership, matching some communities better than others.

The Doppelganger system employed the ISODATA clustering algorithm to generate its communities. Other approaches, COBWEB and ITERATE, based on the conceptual clustering technique, have also been applied [Paliouras et al 98], [Paliouras et al 99]. Some other approaches that have been employed in the construction of user communities include Quinlan's C4.5 [Quinlan, 1987], feedforward neural networks, ART (Adaptive Resonance Theory), SOM, Watkins' Q-learning algorithm [Watkins 1989] and case-based reasoning.

1.3 Self-Organising Maps

The Self-Organising Map (SOM) was developed by Teuvo Kohonen in the early 1980s [Kohonen 00]. This artificial neural network tries to emulate the development of topological maps in the brain using locally interconnected networks and an algorithm based on local neighborhoods. The cerebral cortex of the brain is arranged as a two-dimensional plane of units, and spatial mappings are used to model complex data structures. This means that topological relationships in external stimuli are preserved and complex multi-dimensional data can be represented in a lower (usually two) dimensional space.
Kohonen also uses two-dimensional networks where the units are arranged on a flat grid using a regular topology (e.g. rectangular, hexagonal). The SOM projects high-dimensional input data onto a two-dimensional layer in a way that preserves order, compacts sparse data, and spreads out dense data. In other words, if two input vectors are close, they will be mapped to processing elements that are close together in the two-dimensional Kohonen layer that represents the features or clusters of the input data. Thus, the SOM is used to visualize topologies and hierarchical structures of high-dimensional input spaces and to discover patterns in a pool of data.

The learning model adopted by Kohonen’s Self-Organising Map is a variation of competitive learning. The SOM consists of a (usually) two-dimensional array of identical neurons (nodes). This array can have a regular (e.g., rectangular or hexagonal) or irregular topology. Every node $i$ has, associated with it, a reference vector (also called codebook or model vector) $m_i = [\mu_{i1}, \mu_{i2}, \ldots, \mu_{in}]^T \in \mathbb{R}^n$, where the $\mu_{ij}$ represent scalar weights. The codebook vectors are initialized to random numbers. $\mathbb{R}^n$ is the set of all possible n-tuples of real numbers, each of which is from the interval $(-\infty, +\infty)$. The input vector $x = [\xi_1, \xi_2, \ldots, \xi_n]^T \in \mathbb{R}^n$ is connected to all neurons in parallel via the variable scalar weights $\mu_{ij}$. Note that because the scalar weights are initialized to random values, they are in general different for different neurons. The input $x$ is then compared with all the codebook vectors $m_i$, and the location of the best match in some metric is defined as the location of the “response”.

In many practical applications, the smallest of the Euclidean distances $\|x - m_i\|$ can be made to define the best-matching node (the winning node), signified by the subscript $c$:

$$c = \arg\min_i \|x - m_i\|,$$

which means the same as

$$\|x - m_c\| = \min_i \|x - m_i\| \qquad (1)$$

where the Euclidean distance $\|x - m_i\|$ is defined as

$$\|x - m_i\| = \sqrt{\sum_{j=1}^{n} (\xi_j - \mu_{ij})^2}.$$

The weights of the nodes that are topographically close in the array, up to a certain geometric distance (neighborhood) from the winning node, are then adjusted (relaxed). A continuous local relaxation or smoothing effect on the weight vectors of neurons in this neighborhood leads to global ordering. The relaxation process, which is the learning process, can be represented as:

$$m_i(t + 1) = m_i(t) + h_{ci}(t)\,[x(t) - m_i(t)] \qquad (2)$$

where $t = 0, 1, 2, \ldots$ is an integer, the discrete-time coordinate. In the relaxation process, the function $h_{ci}(t)$ acts as the neighborhood function, a smoothing kernel defined over the lattice points. For convergence, it is necessary that $h_{ci}(t) \to 0$ when $t \to \infty$. Usually $h_{ci}(t) = h(\|r_c - r_i\|, t)$, where $r_c \in \mathbb{R}^2$ and $r_i \in \mathbb{R}^2$ are the location vectors of nodes $c$ and $i$, respectively, in the array. With increasing $\|r_c - r_i\|$, $h_{ci} \to 0$. The average width and form of $h_{ci}$ define the “stiffness” of the “elastic surface” to be fitted to the data points. Some neighborhood functions are discussed in [Kohonen 01].

After the input samples have been presented and the codebook vectors have converged to practically stationary values, the map is calibrated. Calibration of the map is done to locate images of different input data items on it. In practical applications for which such maps are used, it may be self-evident how a particular input data set ought to be interpreted and labeled. By inputting a number of typical, manually analysed data sets, looking where the best matches on the map lie, and labeling the map units correspondingly, the map becomes calibrated.
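The winner search of equation (1), the update of equation (2) and the calibration step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the Gaussian form of the neighbourhood function, the linear decay schedules, and all function and parameter names (`train_som`, `calibrate`, `alpha`, `sigma`) are assumptions chosen for the example.

```python
import math
import random

def best_matching_unit(x, codebook):
    """Equation (1): index c of the codebook vector closest to x (Euclidean)."""
    return min(range(len(codebook)), key=lambda i: math.dist(x, codebook[i]))

def neighbourhood(r_c, r_i, alpha, sigma):
    """Assumed form of h_ci(t): a learning rate times a Gaussian of the
    grid distance between nodes c and i."""
    d2 = (r_c[0] - r_i[0]) ** 2 + (r_c[1] - r_i[1]) ** 2
    return alpha * math.exp(-d2 / (2.0 * sigma ** 2))

def train_som(data, rows, cols, steps=2000, seed=0):
    """Train a rows x cols rectangular map; returns (positions, codebook)."""
    rng = random.Random(seed)
    dim = len(data[0])
    positions = [(r, c) for r in range(rows) for c in range(cols)]
    # Codebook vectors are initialised to random numbers.
    codebook = [[rng.random() for _ in range(dim)] for _ in positions]
    for t in range(steps):
        frac = 1.0 - t / steps                      # decays from 1 towards 0
        alpha = 0.5 * frac                          # assumed learning-rate schedule
        sigma = max(rows, cols) / 2.0 * frac + 0.5  # assumed width schedule
        x = rng.choice(data)
        c = best_matching_unit(x, codebook)
        for i, m in enumerate(codebook):
            h = neighbourhood(positions[c], positions[i], alpha, sigma)
            # Equation (2): m_i(t+1) = m_i(t) + h_ci(t) [x(t) - m_i(t)]
            codebook[i] = [mj + h * (xj - mj) for mj, xj in zip(m, x)]
    return positions, codebook

def calibrate(labelled_samples, codebook):
    """Label map units with the samples whose best match they are."""
    labels = {}
    for label, x in labelled_samples:
        labels.setdefault(best_matching_unit(x, codebook), []).append(label)
    return labels
```

Because the learning rate in this sketch tends to zero, so does $h_{ci}(t)$, which is the convergence condition stated above. Other neighbourhood functions and schedules could be substituted without changing the structure of the loop.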
Since the mapping is assumed to be continuous along a hypothetical “elastic” surface, the closest reference vectors approximate the unknown input data. It must be noted that the order resulting in the Self-Organising Map always reflects the properties of the probability density function p(x). A number of ways of improving the performance of the SOM algorithm and a number of variants of the SOM are presented in [Kohonen 01].

2 Construction of User Communities from Usage Data

A user classification serves as a basis for an adaptive system; it saves and analyzes the data pertaining to each particular user and makes available information relevant to the program’s adaptation to the user in each successive stage. User communities basically capture generalizations about large classes of users. Thus, given incomplete information on a user, the user community the user belongs to will help in eliciting and filling in the missing information.

2.1 Why SOM?

Our proposed approach is based on SOM, an unsupervised learning technique which automatically discovers hidden (implicit) relationships in data. It is also able to identify clusters in data, making it a natural candidate for the automatic construction of user communities. The fact that SOM preserves the topological order inherent in the data makes it a particularly attractive approach, since users with similar characteristics automatically become neighbours on the SOM grid.

2.2 Methodology

The processes involved in the construction of our user communities are outlined as follows:

2.2.1 Data collection

Usage data may be extracted from an existing database, through some empirical research, from sensors, or by some other means. Usage data are collected in an unobtrusive manner. Our data set thus consists of the usage data of different users. We refer to each user’s usage data as a data item. Each data item is a series of ordered vectors. Data items may have varying lengths.
2.2.2 Feature Extraction

With usage data available, we need to decide on a representation of the data items and subsequently the metric to be used. The representation chosen will determine the structure that will eventually be discovered. The metric used will depend on (or sometimes inform) our representation decision. A number of metrics for different representations are presented in [Kohonen 01].

It may also be important to scale the selected features before applying the SOM algorithm. If knowledge of the relative importance of the components of the data items is available, the corresponding dimensions of the input space can be scaled according to this information [Kaski 97].

2.2.3 The transition matrix representation

We employ the transition matrix for representing data. This matrix keeps a count of the number of transitions between states. For a problem with one variable and N states, we have a two-dimensional N × N matrix. An element a(i,j) in the matrix indicates the number of transitions from state i to state j. In general, for a problem with k variables and N1, N2, N3, ..., Nk states (where Ni is the number of states for variable i), we will need a 2k-dimensional matrix. The first k dimensions represent the source states while the last k dimensions represent the destination states.

The definition of the states for each variable will usually depend on the nature of the data. With the states defined, each data point is categorised into its constituent state(s) and the transition matrix is constructed.

For a one-variable problem, it is easy to visualize the two-dimensional transition matrix. However, for problems with more than one variable, it becomes quite complex to visualize. To give a visual example of a problem with more than one variable, we give a hypothetical example for a two-variable problem, say, colour and temperature.
We define the states as follows:

Color = {Black, Red, White}
Temperature = {Hot, Cold}

We further give the following ordered data as observed in a hypothetical experiment:

(Red, Hot), (Black, Cold), (Black, Cold), (White, Hot), (Red, Hot)

The transitions can be outlined as follows:

Transition A: (Red, Hot), (Black, Cold)
Transition B: (Black, Cold), (Black, Cold)
Transition C: (Black, Cold), (White, Hot)
Transition D: (White, Hot), (Red, Hot)

The transitions are visualized below (rows are source states, columns are destination states):

Source \ Destination | Black, Hot | Black, Cold | Red, Hot | Red, Cold | White, Hot | White, Cold
Black, Hot           |            |             |          |           |            |
Black, Cold          |            | B           |          |           | C          |
Red, Hot             |            | A           |          |           |            |
Red, Cold            |            |             |          |           |            |
White, Hot           |            |             | D        |           |            |
White, Cold          |            |             |          |           |            |

Table 1: Four dimensional Transition matrix

The labels (A, B, C, D) in the example above are for illustration only. In practice, the entries will be a count of the number of transitions from one state to the other.

Each transition matrix constructed is subsequently transformed into a vector which is used as input for the construction of the SOM. This transformation can simply be achieved by listing the elements of each dimension of the transition matrix in order.

2.2.4 Determining the states

For some problems, states may not be easily identifiable. Sometimes states may be defined using some form of heuristics. Defining states is usually dependent on the problem at hand and guided by experience. Too many states will result in an unnecessarily large feature vector and hence a longer processing time. On the other hand, too few states will result in loss of information.

2.2.4.1 Construction of Maps

The construction of the maps follows from the SOM algorithm presented earlier. Appropriate speed-up techniques and guidelines for the construction of good maps are presented in [Kohonen 01]. While the SOM can be of any dimension, the two-dimensional SOM is usually preferred because it makes it easy to build efficient visualizations and user interfaces.
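Before a map can be constructed, each data item has to be turned into an input vector as described in Section 2.2.3. For the two-variable colour/temperature example, this step can be sketched as follows. The sketch is illustrative only: the function names are hypothetical, the four-dimensional matrix is stored sparsely as a dictionary keyed by (source, destination), and the flattening order (source states in row-major order, then destination states) is an assumption about what "listing the elements of each dimension in order" means.

```python
from itertools import product

COLOURS = ["Black", "Red", "White"]
TEMPERATURES = ["Hot", "Cold"]

def transition_counts(observations):
    """Count transitions between consecutive (colour, temperature) states."""
    counts = {}
    for src, dst in zip(observations, observations[1:]):
        counts[(src, dst)] = counts.get((src, dst), 0) + 1
    return counts

def flatten(counts):
    """List the entries of the 4-D matrix in a fixed (assumed) order."""
    states = list(product(COLOURS, TEMPERATURES))  # 3 x 2 = 6 joint states
    return [counts.get((s, d), 0) for s in states for d in states]

observed = [("Red", "Hot"), ("Black", "Cold"), ("Black", "Cold"),
            ("White", "Hot"), ("Red", "Hot")]
vector = flatten(transition_counts(observed))
print(len(vector), sum(vector))  # 36 entries, 4 transitions (A, B, C, D)
```

The vector length is the square of the number of joint states, (3 × 2)² = 36 here, which illustrates why too many states quickly lead to an unnecessarily large feature vector.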
After learning, the data are projected onto a two-dimensional map surface.

A number of maps can be generated for the same data set. Since the SOM algorithm is a stochastic process, the maps generated will be different. A number of techniques that can be used to determine the best map are presented in [Kaski 97], [Honkela 97], [Vesanto 00] and [Ypma et al 97].

During the construction of the map, each data point is labeled so that the training data example can easily be referenced.

2.2.4.2 Interpretation and use of map

On visual inspection of the resulting map, clusters are usually noticed. Data points that are close on the map are more similar than ones that are far apart. In many applications, it may be necessary to identify and label these clusters. Identification of the clusters may be done manually through visual inspection. A number of techniques for cluster visualization have been presented [Merkl et al 97]. Automated approaches to cluster identification have also been investigated; see [Siponen et al 01] and [Galliat et al 00].

In this research, we are interested in attaching default properties to a new user, given that we have incomplete information on the user. Identifying and labeling the clusters of the resulting SOM is not required, since we are interested only in the winning node for a particular input.

When presented with user data, the SOM may present a number of scenarios. The winning node may have zero, one or more training examples attached to it, and different strategies can be adopted for each case. When there are no training examples attached to the winning node, the nearest neighbor (the one with the smallest distance) can be considered. For winning nodes with one training example, the values are derived directly from that example. For winning nodes with more than one training example, an ‘average’ measure can be defined which is used to select the default value.
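The three cases above can be sketched as a small lookup routine. This is a hedged illustration rather than a prescribed procedure: the names `winning_node` and `default_value` are hypothetical, "nearest neighbor" is interpreted here as the closest node in weight space that has at least one training example, and the ‘average’ measure is taken to be the arithmetic mean of a numeric property.

```python
import math

def winning_node(x, codebook):
    """Index of the codebook vector closest to x (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(x, codebook[i]))

def default_value(x, codebook, attached):
    """Predict a default property value for input x.

    attached maps a node index to the property values of the training
    examples whose winning node it was during map construction.
    """
    c = winning_node(x, codebook)
    if not attached.get(c):
        # No training example on the winner: fall back to the nearest
        # node (assumed: in weight space) that has at least one.
        populated = [i for i, values in attached.items() if values]
        c = min(populated, key=lambda i: math.dist(codebook[c], codebook[i]))
    values = attached[c]
    if len(values) == 1:
        return values[0]               # single example: use it directly
    return sum(values) / len(values)   # several examples: assumed 'average'

codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
attached = {0: [10.0], 2: [20.0, 40.0]}
print(default_value([0.1, 0.1], codebook, attached))  # 10.0
print(default_value([4.9, 5.2], codebook, attached))  # 30.0
print(default_value([1.2, 1.0], codebook, attached))  # 10.0 (node 1 is empty)
```

For categorical properties, a majority vote would be a natural replacement for the mean.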
Another possible strategy is to associate the default values with the reference vectors when the maps are initially constructed, using a strategy similar to the one proposed in the preceding paragraph. Missing values in the user data can then be inferred directly from the reference vector of the winning node.

It should be pointed out that communities, unlike stereotypes, are constructed from actual (rather than potential) user data, are dynamic, and will require periodic re-computation as user data become available.

2.2.4.3 Soft bounded user communities

One of the problems identified in [Paiva 95] is that it is difficult to establish boundaries between stereotypes. Rather than trying to define distinct boundaries between our user communities, we propose soft boundaries between them. This means that we do not strictly define boundaries for our user communities. Since the main aim of identifying a user’s community is to be able to predict default values for missing information in the user model, defining hard boundaries between our user communities is of little or no value. In fact, allocating users to hard bounded user communities tends to limit the generality of predicted values in the user model. Doppelganger tried to introduce some generality by assigning a user probabilistic membership, matching some communities better than others.

3 The Call Data Example

3.1 Background

The telecommunications industry is a rapidly expanding and highly competitive industry. The industry generates and stores a large amount of intrinsically multidimensional data which is difficult to process manually. These data include:

- call detail data, which describe the calls that traverse the telecommunication networks. These are also referred to as call detail records (CDRs) and include caller number, receiver's number, date, start and end time, call duration, location, etc.
- network data, which describe the state of the hardware and software components in the network.
- customer data, which describe the telecommunication customers. These are also referred to as contractual data and can include name of subscriber, address, age, sex, occupation, phone number, payment type, contract start and end date, etc.

It is imperative that telecommunication companies develop strategies for identifying market trends, detecting key characteristics and patterns for market segments, improving the quality of products and services offered, detecting fraud and insolvency early enough, and focusing on profitable customers and customers likely to stay with the company longer.

3.2 The Data

We collected call detail data from a mobile telecommunications operator, consisting of calls made by 500 subscribers over a period of six months. The subscribers were all on a “pay as you go” arrangement. Only selected fields in the call detail database were extracted. For our experiment, we decided to investigate only calls initiated from the network under consideration (originating calls). The transaction type field for these calls has a value of 1.

3.3 Preprocessing

In order to have the data in a form suitable for our purpose, a number of preprocessing tasks were carried out. There were 808610 call records for the 6-month period under consideration. Calls with a third party present (conference calls) were normalized. The normalization process removed the third party field by creating an extra record with the same values in all the fields as the original one, except for the other party field, which is replaced by the value in the third party field. Thus all the records now have one field fewer. The total number of normalized calls was 853484. Calls originating from the mobile phone operator in question totaled 227318. Of these calls, records that had no other party field were removed, reducing the number of call records to 225292. For simplicity, we select one variable, the duration of the call.
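The normalization and filtering steps just described can be sketched as follows. The record layout is an assumption made for illustration: the field names `other_party`, `third_party` and `transaction_type` are hypothetical stand-ins for the corresponding fields of the call detail database, which are not named in the text.

```python
def normalise_calls(records):
    """Drop the third party field, adding one extra record per conference call."""
    out = []
    for rec in records:
        base = {k: v for k, v in rec.items() if k != "third_party"}
        out.append(base)
        third = rec.get("third_party")
        if third:
            # Same values as the original, except the other party field,
            # which takes the value of the third party field.
            extra = dict(base)
            extra["other_party"] = third
            out.append(extra)
    return out

def originating_with_other_party(records):
    """Keep originating calls (transaction type 1) that name an other party."""
    return [r for r in records
            if r.get("transaction_type") == 1 and r.get("other_party")]

calls = [
    {"caller": "A", "other_party": "B", "third_party": "C",
     "transaction_type": 1, "duration": 30},
    {"caller": "A", "other_party": "D", "third_party": None,
     "transaction_type": 1, "duration": 12},
    {"caller": "E", "other_party": "A", "third_party": None,
     "transaction_type": 2, "duration": 5},
    {"caller": "A", "other_party": None, "third_party": None,
     "transaction_type": 1, "duration": 8},
]
normalised = normalise_calls(calls)
kept = originating_with_other_party(normalised)
print(len(normalised), len(kept))  # 5 3
```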
The date and time the call was made provided us with information about the order in which the calls were made.

3.3.1 Feature extraction

The data resulting from preprocessing were records with variable lengths, which would result in variable-length feature vectors. The next task is to generate a fixed-length feature vector for each subscriber while keeping information about the order in which the calls were made. To achieve this, we employ a state transition matrix. We define the states (using heuristics) for call duration as follows:

State 1: Call duration less than 10 seconds
State 2: Call duration from 10 seconds to 29 seconds
State 3: Call duration from 30 seconds to 59 seconds
State 4: Call duration from 60 seconds to 120 seconds
State 5: Call duration from 120 seconds and above

We then construct a 5 × 5 transition matrix to represent each user’s call record. The transition matrix is then transformed into a vector with 25 features for each user.

3.4 Construction of the Maps

Two Kohonen maps were constructed: a two-dimensional network with the units arranged on a flat grid using a regular rectangular topology, and another network arranged in a regular hexagonal topology. The maps were trained using the standard SOM algorithm [Kleiweg 01]. The maps generated are shown in the appendix below.

3.5 Visualizing the maps

The visualization method used here employs color to indicate the similarity between adjacent nodes. For the rectangular grid network, the squared difference between neighboring units on the trained map is calculated and the value is used to color the edges separating the units. Dark lines are used to indicate strong difference, and light lines to indicate strong resemblance. The results show that proximity of nodes on the map does not necessarily indicate strong similarity.

3.6 The user communities

We will not attempt to define hard boundaries for our user communities since it is of no value for our purpose.
We will define soft bounded communities based on a neighbourhood definition. Given an exemplar caller with incomplete information, we can easily identify the community as the winning node and the nodes within its neighbourhood. This community is used to predict the missing values. The closer a node is to the winning node, the stronger the influence it has in predicting the missing information of the exemplar caller.

3.7 Conclusions

We have been able to represent the usage data (call profile) of a user with a transition matrix. The use of the transition matrix helps to preserve important information on the temporal nature of the usage data. Based on the transition matrix, we generated a map (using the Kohonen SOM) that automatically defined soft bounded user communities on the given data set. It is also desirable to be able to characterize a group of users, rather than individual users, based on their usage profile.

4 Conclusion

4.1 Discussion

User modeling offers a cheap way of tailoring applications to meet the needs of diverse users. In order to fully achieve its potential, a lot of research effort has to go into developing standards, techniques and methodologies in the area of user model acquisition, representation and inferencing, and communication between user modeling systems and other external systems. We have presented a Kohonen SOM based approach that can be employed in defining user communities based on usage data.

4.2 Research Directions

There are a number of issues that still need attention in our proposed use of SOM in discovering user communities. These issues are presented in the following sections.

4.2.1 Determining States

In our approach, determining the states inherent in data is done using heuristics. The nature of the problem at hand and experience usually come into play. There is a need to investigate more rigorous approaches to determining the states inherent in data.
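For reference, the heuristic used in Section 3.3.1 amounts to a fixed threshold lookup, and a more rigorous approach would only need to replace that one function. The sketch below is a minimal illustration with hypothetical function names. It assumes that State 2 covers 10 to 29 seconds and that a call of exactly 120 seconds falls into State 5 (the text leaves that boundary ambiguous), and it row-normalises the counts, which is consistent with the sample normalized feature vector in the Appendix. The durations are those of the sample subscriber record in the Appendix, in chronological order.

```python
from bisect import bisect_right

# Lower bounds (seconds) of States 2..5; State 1 is anything below 10 s.
STATE_LOWER_BOUNDS = [10, 30, 60, 120]

def duration_state(seconds):
    """Zero-based state index (0..4) for States 1..5."""
    return bisect_right(STATE_LOWER_BOUNDS, seconds)

def transition_matrix(durations, n_states=5):
    """Count transitions between the states of consecutive calls."""
    counts = [[0] * n_states for _ in range(n_states)]
    states = [duration_state(d) for d in durations]
    for src, dst in zip(states, states[1:]):
        counts[src][dst] += 1
    return counts

def feature_vector(matrix):
    """Flatten the matrix row by row into a 25-element vector."""
    return [v for row in matrix for v in row]

def row_normalised(matrix):
    """Divide each row by its sum (rows with no transitions stay zero)."""
    out = []
    for row in matrix:
        total = sum(row)
        out.extend(v / total if total else 0.0 for v in row)
    return out

SAMPLE_DURATIONS = [
    7, 57, 109, 9, 5, 53, 112, 94, 26, 7, 6, 26, 64,
    11, 66, 19, 7, 52, 7, 55, 20, 242, 25, 6, 33,
    1, 10, 6, 79, 44, 48, 16, 27, 6, 60, 17, 62, 47,
    13, 64, 43, 228, 202, 21, 2, 89, 28, 30, 29, 17,
]
matrix = transition_matrix(SAMPLE_DURATIONS)
for row in matrix:
    print(row)
# [2, 2, 5, 3, 0]
# [6, 2, 1, 4, 1]
# [2, 4, 1, 2, 1]
# [1, 5, 3, 1, 0]
# [0, 2, 0, 0, 1]
```

A data-driven alternative (for example, choosing the thresholds from the distribution of observed durations) would change only `STATE_LOWER_BOUNDS`; the rest of the pipeline would remain as it is.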
4.2.2 Associating raw data with identified communities

In its current form, our approach does not associate raw data directly with its identified user community. Rather, it associates the feature vectors (which result from a transformation of the raw data) with the user community. However, a more robust approach will have to provide a way in which raw data can be directly associated with the identified user community without having to go through the transition matrix transformation and the learning/clustering process. A possible approach is to build a feedforward neural network to learn the association between raw data and the classes/categories that evolve in the data.

4.2.3 Feature selection

The features that are selected to represent our feature space will determine the quality of our results. A wrong set of features may lead to misleading results. Determining the salient features to use in representing our problem space is therefore very important. Including all the available features may introduce many redundant features, while too few features may lead to loss of relevant information. The features to be selected in a particular situation usually depend on the nature of the problem being considered. There is a need to investigate and establish general factors and techniques that can be used in eliciting salient features in a given data set.

4.2.4 Variants of SOM

In our proposal for the use of SOM in user modeling, the basic SOM algorithm was assumed. However, where there are specific additional requirements (e.g., the need for a hierarchical structure), an appropriate variant of the SOM (e.g. hierarchical SOM) may need to be implemented.

4.2.5 Improving the SOM algorithm

The dynamic nature of communities requires frequent recomputation of these communities. This is usually done periodically because the process is computationally intensive (e.g., Doppelganger does it every night [Orwant 93]).
However, we acknowledge the fact that some applications may require that communities be recomputed more often. There is a need for more research effort in this direction.

4.3 Concluding Remarks

This paper proposed a novel methodology for constructing user communities using usage data. In order to preserve important temporal information in the usage data, we employed a transition matrix to represent usage data. This matrix was then used as an input to the Kohonen SOM to generate soft bounded user communities. We demonstrated the applicability of our methodology by successfully applying it to call data from a mobile telecommunications network operator.

Appendix

Sample subscriber record (CallDuration in seconds; read down the left-hand columns first, then the right-hand columns)

CallDate         | CallDuration | CallDate         | CallDuration
09/04/2003 11:55 | 7            | 26/06/2003 17:38 | 1
21/04/2003 14:18 | 57           | 06/07/2003 16:55 | 10
23/04/2003 09:30 | 109          | 07/07/2003 10:06 | 6
24/04/2003 18:12 | 9            | 07/07/2003 10:07 | 79
29/04/2003 09:30 | 5            | 14/07/2003 13:42 | 44
30/04/2003 09:26 | 53           | 16/07/2003 17:37 | 48
05/05/2003 13:48 | 112          | 22/07/2003 12:44 | 16
08/05/2003 15:05 | 94           | 23/07/2003 08:35 | 27
09/05/2003 14:25 | 26           | 23/07/2003 08:44 | 6
21/05/2003 18:02 | 7            | 29/07/2003 12:32 | 60
27/05/2003 18:31 | 6            | 29/07/2003 17:40 | 17
28/05/2003 09:24 | 26           | 04/08/2003 09:13 | 62
29/05/2003 10:21 | 64           | 04/08/2003 10:34 | 47
29/05/2003 13:03 | 11           | 04/08/2003 19:09 | 13
29/05/2003 14:00 | 66           | 04/08/2003 19:09 | 64
29/05/2003 16:08 | 19           | 06/08/2003 14:44 | 43
29/05/2003 16:34 | 7            | 06/08/2003 20:50 | 228
04/06/2003 18:46 | 52           | 06/08/2003 20:59 | 202
06/06/2003 09:13 | 7            | 06/08/2003 21:07 | 21
10/06/2003 13:56 | 55           | 20/08/2003 16:04 | 2
11/06/2003 09:24 | 20           | 22/08/2003 10:53 | 89
19/06/2003 18:25 | 242          | 17/09/2003 09:56 | 28
19/06/2003 18:31 | 25           | 18/09/2003 18:05 | 30
22/06/2003 13:52 | 6            | 26/09/2003 16:18 | 29
25/06/2003 17:38 | 33           | 28/09/2003 15:46 | 17

Sample transition matrix (rows are source states 1 to 5, columns are destination states 1 to 5)

2 2 5 3 0
6 2 1 4 1
2 4 1 2 1
1 5 3 1 0
0 2 0 0 1

Sample feature vector

2, 2, 5, 3, 0, 6, 2, 1, 4, 1, 2, 4, 1, 2, 1, 1, 5, 3, 1, 0, 0, 2, 0, 0, 1

Sample normalized feature vector

0.16667, 0.16667, 0.41667, 0.25, 0.00, 0.42857, 0.14286, 0.07143, 0.28571, 0.07143, 0.2, 0.4, 0.1, 0.2, 0.1, 0.1, 0.5, 0.3, 0.1, 0.00, 0.00, 0.66667, 0.00, 0.00, 0.33333

Sample maps constructed from the same subscriber training data.

References

[Benaki et al 97] Benaki, E., Karkaletsis, V. A. and Spyropoulos, C. D., Integrating User Modeling Into Information Extraction: The UMIE Prototype, In Anthony Jameson, Cécile Paris, and Carlo Tasso (Eds.), User Modeling: Proceedings of the Sixth International Conference, UM97, 1997.

[Brusilovsky 01] Brusilovsky, P., Adaptive Hypermedia, User Modeling and User-Adapted Interaction, 11:87-110, 2001.

[Galliat et al 00] Galliat, T., Huisinga, W. and Deuflhard, P., Self-Organizing Maps Combined with Eigenmode Analysis for Automated Cluster Identification, Proceedings of the ICSC Symposia on Neural Computation (NC'2000), Berlin, Germany, 2000.

[Honkela 97] Honkela, T., Comparisons of self-organized word category maps, In Proceedings of WSOM'97, Workshop on Self-Organizing Maps, Espoo, Finland, pages 298-303, 1997.

[Kaski 97] Kaski, S., Data exploration using self-organizing maps, Acta Polytechnica Scandinavica, Mathematics, Computing and Management in Engineering Series No. 82, DTech Thesis, Helsinki University of Technology, Finland, 1997.

[Kass et al 88] Kass, R. and Finin, T., A General User Modeling Facility, CHI 88, ACM, 145-150, 1988.

[Kleiweg 01] http://odur.let.rug.nl/~kleiweg/kohonen/kohonen.html, 2001.

[Kobsa 90] Kobsa, A., User Modeling in Dialog Systems: Potentials and Hazards, AI and Society, 4(3): 214-240, 1990.

[Kobsa 01a] Kobsa, A., Generic User Modeling Servers, User Modeling and User-Adapted Interaction, 11:49-63, 2001.

[Kobsa et al 01] Kobsa, A., Koenemann, J. and Pohl, W., Personalised hypermedia presentation techniques for improving online customer relationships, The Knowledge Engineering Review, 16(2): 111-155, Cambridge University Press, 2001.
[Kohonen 00] Kohonen, T., Self-Organising Maps of Massive Document Collections, IEEE, 2000.

[Kohonen 01] Kohonen, T., Self-Organising Maps, Springer-Verlag, Berlin, 2001.

[Merkl et al 97] Merkl, D. and Rauber, A., Alternative ways for cluster visualization in self-organizing maps, In Proceedings of WSOM'97, Workshop on Self-Organizing Maps, Espoo, Finland, pages 106-111, 1997.

[Orwant 93] Orwant, J., Doppelgänger Goes to School: Machine Learning for User Modeling, M.Sc. Thesis, MIT, 1993.

[Paiva 95] Paiva, A. M., About User and Learner Modeling – an Overview, 1995.

[Paliouras et al 98] Paliouras, G., Papatheodorou, C., Karkaletsis, V., Spyropoulos, C., and Malaveta, V., Learning User Communities for Improving the Services of Information Providers, Lecture Notes in Computer Science, Springer-Verlag, 1513: 367-384, 1998.

[Paliouras et al 99] Paliouras, G., Karkaletsis, V., Papatheodorou, C. and Spyropoulos, C. D., Exploiting learning techniques for the acquisition of user stereotypes and communities, in J. Kay (ed.), UM99 User Modeling: Proceedings of the Seventh International Conference, Springer-Verlag, 45–54, 1999.

[Siponen et al 01] Siponen, M., Vesanto, J., Simula, O. and Vasara, P., An approach to automated interpretation of SOM, Advances in Self-Organising Maps, 89-94, Springer, 2001.

[Vesanto 00] Vesanto, J., Using SOM in Data Mining, Licentiate Thesis, Helsinki University of Technology, Finland, 2000.

[Ypma et al 97] Ypma, A. and Duin, R. P. W., Novelty detection using self-organizing maps, In Nikola Kasabov, Robert Kozma, Kitty Ko, Robert O'Shea, George Coghill, and Tom Gedeon, editors, Progress in Connectionist-Based Information Systems, volume 2, pages 1322-1325, Springer, London, 1997.