Abstract
The World Wide Web (Web) is a popular medium for disseminating information today.
If we define accessibility of information as the ease with which the information can be
obtained, then the accessibility of information on the Web leaves much to be desired.
Thus, many opportunities exist to improve the current search tools on the Web, and data
mining and Web mining play an important role in such developments. Improving the
accessibility of information on the Web requires more intelligent search engines. How to
build systems that return relevant information from a large information repository is
currently a research question in several domains.
Text categorization is a technology for filtering and sharing information effectively,
giving quick and efficient access to the required information. Although it is a recently
emerged research field in computer science, it has made remarkable progress and found
many application fields. The increasing demand for it brought forth automatic text
categorization based on machine learning, and several algorithms for implementing text
categorizer systems with the machine learning approach have appeared.
The main goal of my project is to propose a portal that provides access to a categorized
set of information, so that users can find the information that interests them in minimum
time. In this project I present the application of machine learning and related
technologies to the problem of information extraction from structured documents found on
the Web, such as HTML documents. In a sense, my work fits within the category of Web
content mining. In order to evaluate this research I design and implement a system,
AUT_UniversitiesPortal. This is a portal that helps students find universities of
interest anywhere in the world. Support vector machines, one of the most efficient text
categorization algorithms, are used in my project to design an efficient text
categorization system.
Keywords: Data mining, Web mining, Text categorization approaches, Portal