Download Knowledge Discovery on Web Information Repository

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Copyright
International Journal of Advance Computing Technique and Applications (IJACTA), ISSN : 2321-4546
Knowledge Discovery on Web Information Repository
Punyaban Patel, Bibekananda Jena, Bibhudatta Sahoo
Asst. Professor, Department of Information Technology, PIET, Rourkela, ORISSA
Asst. Professor, Department of Electronics and Telecommunication, PIET, Rourkela, ORISSA
Asst. Professor, Department Of Computer Science & Engineering, NIT Rourkela, ORISSA
Abstract
Web mining techniques seek to extract knowledge
from Web data. Web mining is a cross point of
database, information retrieval and artificial
intelligence. Web content mining (WCM), web
structure mining (WSM) and web usage-mining
(WUM) buildup the whole web mining. This article
critically reviews the research trends in three major
area of Web mining(— content, structure, and
usage—as well as emerging work in Semantic Web
mining
and research issues, techniques and
development efforts with a goal of providing a broad
overview rather than an in-depth analysis are
presented in this paper.
1. Introduction
Web mining is an integrated technology in which
several research fields are involved, such as data
mining,
computational
linguistics,
statistics,
informatics and so on. Different researchers from
different communities disagree with each other on
what Web mining is exactly [2]. Many unrelated
projects have explored different aspects of this
problem as well.
Since Web mining derives from data
mining, its definition is similar to the well-known
definition of data mining [3]. Nevertheless, Web
mining has many unique characteristics compared
with data mining. Firstly, the source of Web mining
is web documents. We consider the use of the Web as
a middleware in mining database and the mining of
logs, user profiles on the Web server still belong to
the category of traditional data mining. Secondly, the
Web is a directed-graph consists of document nodes
and hyperlinks. Therefore, the pattern identified can
be possibly about the content of documents or about
the structure of the Web. Moreover, the Web
documents are semi-structural or non-structural with
little machine-readable semantic while the source of
data mining is confined to the structural data in
www.ijacta.com
database. As a result, some traditional data mining
methods are not applicable to Web mining. Even if
applicable, they must be based on the preprocessing
of documents. Data mining is used to identify valid,
novel,
potentially
useful
and
ultimately
understandable pattern from data collection in
database community [3]. Data mining tools predict
future trends and behaviors, allowing businesses to
make proactive, knowledge-driven decisions. The
most commonly used techniques in data mining are:
Artificial neural networks, Association Rule,
Classification, Clustering, Data visualization,
Decision trees, Genetic algorithms, Nearest neighbor
method, and role induction. However, there is little
work that deals with unstructured and heterogeneous
information on the Web. Web mining aims to
discover useful information or knowledge from the
Web hyperlink structure, page content and usage log.
Based on the primary kind of data used in the mining
process, Web mining tasks are categorized into three
main types: Web structure mining, Web content
mining and Web usage mining. Web mining is a new
research issue under dispute, which draws great
interest from many communities. Currently, there is
no agreement about Web mining yet. It needs more
discussion among researchers in order to define what
it is exactly. Meanwhile, the development of Web
mining system will promote its research in turn.
In this paper a preliminary discussion about Web
mining is given. Firstly we present the definition of
Web mining and expound the relationships between
Web mining, traditional data mining and information
retrieval on the Web. Then, we present the taxonomy
of Web mining and briefly describe the functions of
Web text mining. Finally, we present the conclusions
and challenges for Web mining in future.
Page 49
Copyright
International Journal of Advance Computing Technique and Applications (IJACTA), ISSN : 2321-4546
2. Web Mining Process Model
The Web mining can be defined as: Web mining is
the activity of identifying patterns p implied in large
document collection C, which can be denoted by a
mapping  : C  p.
The general process of Web Mining may be four
different stages(Figure-1): (i) resource discovery, (ii)
information pre-processing, (iii) generalization, and
(iv) pattern analysis.
Resource
Discovery
Information
Pre-processing
Generalization
Pattern
Analysis
Figure 1: General Process of Web Mining
Resource Discovery, whose task is retrieving web
documents, is the process of retrieving the web
resources. Web information retrieval is the process to
find a subset S of appropriate number of documents
relevant to a certain query q from large document
collection C, which can also be denoted by a
mapping  : (c, q) S .
Information Pre-processing is the transform process
of the result of resource discovery. Generalization is
to uncover general patterns at individual web sites
and across multiple sites [4]. In this step machine
learning and traditional data mining techniques are
typically used. Pattern Analysis is the validation of
the mined patterns.
3. A Taxonomy Of Web Mining
Researchers have identified three broad categories of
Web mining: Web content mining, Web structure
mining and the Web usage mining [18]. The
taxonomy is shown in figure-2 below.
Web mining
Web Content Mining
Text Mining
Multimedia
Mining
Web Structure Mining
Search Result
Mining
Web External
Structure Mining
Web Usage Mining
General Access
Pattern Tracking
Customize Usage
Tracking
Web URL
Mining
Web Internal
Structure Mining
Figure 2: A Taxonomy of Web Mining
www.ijacta.com
Page 50
Copyright
International Journal of Advance Computing Technique and Applications (IJACTA), ISSN : 2321-4546
3.1 Web content mining is the application of data
mining techniques to content published on the
Internet, usually as HTML (semi-structured),
plaintext (unstructured), or XML (structured)
documents.
Nearest Neighbor algorithm [8], Naive Bayes
algorithm [9], etc.
b) Text clustering: The difference between text
clustering and categorization lies in that no taxonomy
is predefined for clustering. The goal of text
clustering is usually to divide documents collection C
into a set of clusters such that inter- cluster similarity
is minimized and intra-cluster similarity is
maximized. We can apply text clustering to the
organization of retrieval results returned by search
engine. Then users merely need to examine the
clusters relevant to their query, which dramatically
reduces the time and the efforts spent in sifting
through the long list of documents. Numerous textclustering algorithms have been proposed so far, all
of which fall into two types. One is hierarchical
clustering such as G-HAC algorithm [8], the other is
partitioned clustering represented by k-Means
algorithm [l0]. The Web Text Mining is given in
figure-3 below.
Web content mining can be divided into text mining
(including text file, HTML, DHTML, XML
document, etc.), multimedia mining and the Search
Result Mining. The main categories of Web text
mining are text categorization, text clustering,
association analysis, trend prediction and so on.
3.1.1 Text mining
a) Text categorization: Given a predefined taxonomy,
each document in collection C is categorized into one
appropriate class or more. In this way, it is not only
convenient for users to browse documents but also
easier to search documents by specifying class.
Currently there are many text categorization
algorithms, in which the most commonly used are k-
Categorizing
Agent
The
web
Gathering
Agent
Preprocess
Agent
Document
Repository
Feature
DB
Document
Cube
MDDA
Engine
Interface
Agent
Document
View
Clustering
Agent
Figure-3: Web Text Mining System
3.1.2 Multimedia Mining
Multimedia Data essentially refers to data such as
text, numeric, images, video, audio, graphical,
temporal, relational and categorical data. Multimedia
Mining is discovering knowledge from large amounts
of different types of multimedia data. It involves the
extraction of implicit knowledge, multimedia data
relationships, or other patterns not explicitly stored in
multimedia databases. The multimedia mining
involves two basic steps:
• Extraction of appropriate features from the data;
and
www.ijacta.com
• Selection of data mining methods to identify the
desired information.
Multimedia mining attempts to automatically
discover knowledge in multimedia datasets. It is a
comparatively young discipline and the research
community is still adapting techniques from more
established fields, such as artificial intelligence and
pattern recognition, and developing new techniques
particularly tailored for multimedia mining
applications. The exponential growth of multimedia
data in consumer as well as scientific applications,
Page 51
Copyright
International Journal of Advance Computing Technique and Applications (IJACTA), ISSN : 2321-4546
however, requires fast progress of research in the
field in order to find adequate techniques to analyzes
and manage such data, including feature extraction,
high-dimensional indexing, similarity based search,
and personalized search and retrieval. The major
challenge is to unify all these existing methods from
various fields into a multimedia-mining framework.
Although multimedia mining is drawing more and
more interests, text mining is the most fundamental
and important task as text is the primary information
vehicle. The issue of text mining is of importance to
publishers who hold large databases of information
requiring indexing for retrieval. This is particularly
true in scientific disciplines, in which highly specific
information is often contained within written text.
3.1.3 Search Result Mining
(i) Automatic classification in documents using
searching engine Search engine can index a mass of
disordered data on Web. For example, firstly, Web
crawlers down load Web pages from Web sites.
Secondly, searching engine extracts describable
index information from these Web pages to store
them with URL into searching engine base. Thirdly
using data mining methods we automatically class
them into usable Web page classification system
organized by hyperlink structure.
(ii) Implement friendly interface from searching
result: In the classification system there are many
unrelated information. If we can analyze and cluster
them in the searching results, searching effect will be
highly improved.
3.2 Web structure mining operates on the Web’s
hyperlink structure.
Conventionally, a website structure may be evaluated
with the Card Sorting technique; however, this
technique is difficult to implement for a large and
information-centric website. There are a number of
commercial and free evaluation tools available. Most
of the tools are based on the user logging results that
are stored in the proxy server or through the
embedment of client-side programming code to
capture the client’s events.
A recent publication by Miller and Remington [14]
pointed out that the structure of linked pages (the
site’s information architecture) has a decisive impact
on the usability. Previous studies including
www.ijacta.com
Shneiderman [16], and Larson and Czerwinski [13]
also provided suggestions on how to create the best
structure. Larson and Czerwinski [13] found that
users took significantly longer time to find items in a
structure with depth than breadth. They compared a
three-tiered, eight-links-per-page (8 x 8 x 8) structure
with two-tiered, 16 and 32 links per page structures
(16 x 32 and 32 x 16). Bernard [1] devised a metric
called Hypertext Accessibility Index (HAI) to model
the informational accessibility of a particular
hypertext structure compared. It explained what
Larson and Czerwinski [13] as well as Kiger [12]
found about the navigation time of shallow and deep
structure in quantitative terms. Kao et al. [11]
proposed a LAMIS method with the entropy analysis
to distill the information of a website. Empirically,
the probability is directly linked to the transition
probability of the web log.
A study on the use of entropy theory to merge the
website content was performed by Chen et. al. [6].
Based on the Markov model, Jenamani et al. [2, 15]
proposed several algorithms to examine (1.) the most
accessed pages, (2.) the company’s interest, (3.) the
visitor’s interest pages, (4.) the current visitor’s
interest pages, (5.) the customized index generation
algorithms. However, our system objectives are not
to provide a dynamic hyperlink structure, we will
concentrate on the study of building a static optimal
website structure.
Web structure mining can be further divided into
external structure (hyperlink between web page)
mining, internal structure (of a web page) mining and
URL mining. Craven et al. [7] used first-order
learning in categorizing hyperlinks to estimate the
relationship between web pages. They also made use
of the anchor text in hyperlink to categorize the target
web page. Brin et al. [9] took both the citation
counting of referee pages and the importance of
referrer pages into account to find pages that are
“authorities” on particular topics. Spertus et al. [6]
proposed some heuristic rules by investigating the
internal structure and the URL of web pages.
3.3 Web usage mining (WUM) analyzes results of
user interactions with a Web server, including Web
logs, clickstreams, and database transactions at a
Web site or a group of related sites(shown in figure4). Web usage mining introduces privacy concerns
and is currently the topic of extensive debate.
Page 52
Copyright
International Journal of Advance Computing Technique and Applications (IJACTA), ISSN : 2321-4546
Data
cleaning
Server
Log
Data
Transaction
identification
Clean
log
Registrat
ion data
Data
Integration
Formatted
data
Trans
action
data
Pattern analysis
Path
Analysis
OLAP/
Visualizatio
n Tools
Associati
on Rules
Integr
ated
data
Document
& usage
Attributes
Pattern
Discovery
Transfo
rmation
Database
Query
language
Sequentia
l Patterns
Clusters
&
Classifica
tion rules
Knowledge
Query
Mechanism
Intelligent
Agents
Fig 4: A General Architecture for Web Usage Mining
Web usage mining is the type of web mining activity
that involves the automatic discovery of user access
patterns from one or more web servers. As more
organizations rely on the Internet and the World
Wide Web to conduct business, the traditional
strategies and techniques for market analysis need to
be revisited in this context. Organizations often
generate and collect large volumes of data in their
daily operations. Most of this information is usually
generated automatically by web servers and collected
in server access logs. Other sources of user
information include referrer logs, which contain
information about the referring pages for each page
reference, and user registration or survey data
gathered via tools such as CGI scripts.
Analyzing such data can help these organizations to
determine the life time value of customers, cross
marketing
strategies
across
products,
and
effectiveness of promotional campaigns, among other
things. Analysis of server access logs and user
registration data can also provide valuable
information on how to structure a web site in order to
create a more effective presence for the organization.
Using intranet technologies in organizations, such
analysis can shed light on more effective
management of workgroup communication and
organizational
infrastructure.
Finally,
for
www.ijacta.com
organizations that sell advertising on the World Wide
Web, analyzing user access patterns helps in
targeting ads to specific groups of users. The
problems existed in WUM is that the physical
structures of Web sites are different from user
accessing patterns and it is very difficult to find out
users, sessions and transactions.
WUM can be decomposed into the following
subtasks [17] is shown in figure-5:
3.3.1 Data Pre-processing for Mining:
It is necessary to perform a data preparation to
convert the raw data for further process. It
has separate subsections as follows:
Content Preprocessing: Content preprocessing is the
process of converting text, image,
scripts and other files into the forms that can be used
by the usage mining.
Page 53
Copyright
International Journal of Advance Computing Technique and Applications (IJACTA), ISSN : 2321-4546
Raw Logs
User Session Files
Preprocessing
Rules, Patterns
and Statistics
Mining Algorithm
“Interesting”
Rules, Patterns
and Statistics
Pattern Analysis
Site Files
techniques can be used to discover unordered
Figure 5: The Subtasks of WUM
correlation between items found in a database of
Structure Preprocessing: The structure of a web site
is formed by the hyperlinks between page views. The
structure preprocessing can be treated similar as the
content preprocessing. However, each server session
may have to construct a different site structure than
others.
Usage Preprocessing: The inputs of the preprocessing
phase may include the web server logs, referral logs,
registration files, index server logs, and optionally
usage statistics from a previous analysis. The outputs
are the user session file, transaction file, site
topology, and page classifications.
3.3.2 Mining Algorithm and Pattern Discovery
This is the key component of the web mining. Pattern
discovery covers the algorithms and techniques from
several research areas, such as data mining, machine
learning, statistics, and pattern recognition. It has
separate subsections as follows.
Statistical Analysis: Statistical analysts may perform
different kinds of descriptive statistical analyses
based on different variables when analyzing the
session file. By analyzing the statistical information
contained in the periodic web system report, the
extracted report can be potentially useful for
improving the system performance, enhancing the
security of the system, facilitation the site
modification task, and providing support for
marketing decisions.
Association Rules: In the web domain, the pages,
which are most often referenced together, can be put
in one single server session by applying the
association rule generation. Association rule mining
www.ijacta.com
transactions.
Clustering: Clustering analysis is a technique to
group together users or data items (pages) with the
similar characteristics. Clustering of user information
or pages can facilitate the development and execution
of future marketing strategies.
Classification: Classification is the technique to map
a data item into one of several predefined classes.
The classification can be done by using supervised
inductive learning algorithms such as decision tree
classifiers, naïve Bayesian classifiers, k-nearest
neighbor classifier, Support Vector Machines etc.
Sequential Pattern: This technique intends to find
the inter-session pattern, such that a set of the items
follows the presence of another in a time-ordered set
of sessions or episodes. Sequential patterns also
include some other types of temporal analysis such as
trend analysis, change point detection, or similarity
analysis.
Dependency Modeling: The goal of this technique is
to establish a model that is able to represent
significant dependencies among the various variables
in the web domain. The modeling technique provides
a theoretical framework for analyzing the behavior of
users, and is potentially useful for predicting future
web resource consumption.
3.3.3 Pattern Analysis
Pattern Analysis is a final stage of the whole web
usage mining. The goal of this process is to eliminate
the irrelative rules or patterns and to extract the
interesting rules or patterns from the output of the
pattern discovery process. The output of web mining
algorithms is often not in the form suitable for direct
human consumption, and thus need to be transform to
a format can be assimilate easily. There are two most
common approaches for the patter analysis. One is
to use the knowledge query mechanism such as SQL,
while another is to construct multi-dimensional data
cube before perform OLAP operations. All these
Page 54
Copyright
International Journal of Advance Computing Technique and Applications (IJACTA), ISSN : 2321-4546
methods assume the output of the previous phase has
been structured.
Rules, Fuzzy Clustering, Web Site Evaluation,
Recommender Systems, and Privacy Issues.
4. Web Mining-Research and Practice
Since the Researchers have identified three broad
categories of Web mining: [19, 20] Web content
mining, Web Structure mining and the Web Usage
mining, we will discuss some important research
contributions in Web mining for knowledge
discovery, with a goal of providing a broad overview
rather than an in-depth analysis.
Web Content and Structure Mining
Some researchers combine content and structure
mining to leverage the techniques’ strengths.
Although not all researchers agree to such a
classification, we list research in these two areas
together. Fabrizio Sebastini[21] and Soumen
Chakrabarti[22] discuss Web content mining
techniques in detail, and Johannes Fürnkranz[23]
surveys work in Web structure mining , which
suggests the following areas (.i)Web as a Database,
(ii) Document Classification, (iii)Hubs and
Authorities, (iv) Clever: Ranking by Content, (v)
Identifying Web Communities, (v) Web Usage
Mining
5.Conclusion
Web mining is a new field and in its infancy, there
are rare researchers ventured in this field, especially
the Chinese and Japanese word and the hybrid text
mining techniques in our country. The key
component of web mining is the mining process
itself. A lot of work still remains to be done in
adapting known mining techniques as well as
developing new ones. The open research problems in
the area of web mining are related to: New Types of
Knowledge, Distributed Web Mining, Incremental
Web Mining, Improved and Mining Algorithms.
Recent research has mostly focused on Web usage
analysis, partly because of its applicability in ebusiness. We expect privacy issues, distributed Web
mining, and Semantic Web mining to attract equal, if
not more, interest from the research community.
Increased use of Web mining techniques will require
that privacy issues be addressed, however. Similarly,
aggregating data in a central site and then mining it is
rarely scalable; hence the need for distributed mining
techniques. Finally, researchers will need to leverage
the semantic information the Semantic Web provides.
Exposing content semantics and the link explicitly
can help in many tasks, including mining the hidden
Web—that is, data stored in databases and not
accessible through search engines. As new data is
published every day, the Web’s utility as an
information source will continue to grow.
Web usage mining has several applications in ebusiness, including personalization, traffic analysis,
and targeted advertising. The development of
graphical analysis tools such as Webviz [24]
popularized Web usage mining of Web transactions.
The main areas of research in this domain are Web
log data preprocessing and identification of useful
patterns from this preprocessed data using mining
techniques. Several surveys on Web usage mining
exist.[25, 26]
Most data used for mining is collected from Web
servers, clients, proxy servers, or server databases, all
of which generate noisy data. Because Web mining is
sensitive to noise, data cleaning methods are
necessary. Jaideep Srivastava and colleagues[25]
categorize data preprocessing into subtasks and note
that the final outcome of preprocessing should be
data that allows identification of a particular user’s
browsing pattern in the form of page views, sessions,
and clickstreams. Clickstreams are of particular
interest because they allow reconstruction of user
navigational patterns. Recent work by Yannis
Manolopoulos and colleagues[27] provides a
comprehensive discussion of Web logs for usage
mining and suggests novel ideas for Web log
indexing. Such preprocessed data enables various
mining techniques. Some of the notable research
areas are follows: Adaptive Web Sites, Association
www.ijacta.com
References:
[l] Bernard, M. L., “Examining a Metric for Predicting the
Accessibility of Information within Hypertext Structures”,
Dissertation Thesis, Wichita State University, 2002
[2] Jenamani M., Mohapatra P. K. J., and Ghose S.,
“Online Customized Index Synthesis in Commercial
Websites”, IEEE Intelligent Systems, Vol.17, No.6, 2002,
pp.20-26.
[3] Usama Fayyad et al, “The KDD Process for Extracting
Useful
Knowledge
from
Volumes
of
Data”,
Communications of the ACM, Vol. 39, No. 11, Nov. 1996,
pp. 27-34.
[4] O.etzioni. The world wield web: Quagmire or Gold
Mining. Communicate of the ACM, (39)11:65-68, 1996
[5] J. Han, M. Kamber, ”Data Mining”, Morgan Kaufmann
Publishers, 2001.
[6] Chen, Z., Liu, S., Liu, W., Pu, G. and Ma, W.,
“Building a Web Thesaurus From Web Link Structure”,
Page 55
Copyright
International Journal of Advance Computing Technique and Applications (IJACTA), ISSN : 2321-4546
Proceedings of the 26th annual international ACM SIGIR
conference on
Research and development in information retrieval, 2003,
pp 48-55.
[7] M. Craven, S. Slattery and K. Nigam, “First-Order
Learning for Web Mining”, In Proceedings of the 10th
European Conference on Machine Learning, Chemnitz,
1998.
[8] P. Willet, “Recent Trends in Hierarchical Document
Clustering: a Critical Review”, Information Processing and
Management, Vol. 24, 1988, pp. 577-597.
[9] Sergey Brin, Lawrence Page, “The Anatomy of Largescale Vol. 1, NO. I, 1997, pp. 68-88. Hypertextual Web
Search Engine”, In Proceedings of the Seventh
International World Wide Web Conference, 1998.
[10] J. J. Rocchio, “Document Retrieval Systems –
Optimization Evaluation”, Ph.D. Thesis, Harvard
University, 1966.
[11] Kao, H., Lin, S., Ho, J., Chen, M., “Mining Web
Informative Structures and Contents Based on Entropy
Analysis”, IEEE Transactions on Knowledge and Data
Engineering, Vol.16, Iss.1, 2004, pp.41 – 55.
[12] Kiger, J. I. , “The Depth / Breadth Tradeoff in the
Design of Menu-Driven Interfaces.”, International Journal
of Man-Machine Studies, Vol.20, 1984, pp.201-213.
[13] Larson, K., & Czerwinski, M., “Web page design:
Implications of memory, structure and scent for
information retrieval”, CHI’98: Human Factors in
Computing Systems, New York: ACM Press, 1998, pp.2532.
US Nat’l Science Foundation Workshop on NextGeneration Data Mining (NGDM), Nat’l Science
Foundation, 2002.
[20] R. Kosala and H. Blockeel, “Web Mining Research: A
Survey,” ACM SIGKDD Explorations, vol. 2, no. 1, 2000,
pp. 1–15.
[21] F. Sebastini, Machine Learning in Automated Text
Categorization, tech. report B4-31, Istituto di Elaborazione
dell’Informatione, Consiglio Nazionale delle Ricerche,
Pisa, 1999.
[22] S. Chakrabarti, “Data Mining for Hypertext: A
Tutorial Survey,” ACM SIGKDD Explorations, vol. 1, no.
2, pp. 1–11, 2000.
[23] J. Fürnkranz, “Web Structure Mining: Exploiting the
Graph Structure of the World Wide Web,” Österreichische
Gesellschaft für Artificial Intelligence (ÖGAI), vol. 21, no.
2, 2002, pp. 17–26.
[24] J.E. Pitkow and K. Bharat, “WebViz: A Tool for
WWW Access Log Analysis,” Proc. 1st Int’l Conf. World
Wide Web, Elsevier Science, 1994, pp. 271–277.
[25] J. Srivastava et al., “Web Usage Mining: Discovery
and Applications of Usage Patterns from Web Data,” ACM
SIGKDD Explorations, vol. 1, no. 2, 2000, pp. 12–23.
[26] M Eirinaki and M Vazirgiannis, “Web Mining for
Web Personalization,” ACM Trans. Internet Technology,
vol. 3, no. 1, 2003, pp. 1–27.
[27] Y. Manolopoulos et al., “Indexing Techniques for
Web Access Logs,” Web Information Systems, IDEA
Group, 2004.
[14] Miller, C. S. and Remington, R. W., “Modeling
Information Navigation : Implications for Information
Architecture”, Human-Computer Interaction, Vol.19, No.3,
2004.
[15] Pitkow, J. E. and Pirolli, P. “Mining Longest Repeated
Subsequences to Predict World Wide Web Surfing.”,
Second USENIX Symposium on Internet Technologies and
System,1999.
[16] Shneiderman, B., “Designing the User Interface”,
Strategies for Effective Human-Computer Interaction, 3rd
ed, Reading, MA : Addison-Wesley, 1998.
[17] Yan Wang, Web Mining and Knowledge Discovery of
Usage Patterns, February, 2000.
[18] Margaret H. Dunham, S. Sridhar,” Data Mining:
Introductory and Advanced Topics”, Pearson Education,
2006.
[19] J. Srivastava, P. Desikan, and V. Kumar, “Web
Mining: Accomplishments and Future Directions,” Proc.
www.ijacta.com
Page 56