Abstract

The World Wide Web (Web) is a popular medium for disseminating information. If we define the accessibility of information as the ease with which it can be obtained, then the accessibility of information on the Web leaves much to be desired, and many opportunities exist to improve current Web search tools. Data mining and Web mining play an important role in such developments. Improving the accessibility of information on the Web requires more intelligent search engines, and how to build systems that return relevant information from a large information repository is currently the subject of several research domains.

Text categorization is a technology for filtering and sharing information that makes quick and efficient access to the required information possible. Although it is a relatively recent research field in computer science, it has made remarkable progress and found many applications. Growing demand gave rise to automatic text categorization based on machine learning, and several algorithms for implementing text categorizer systems with this approach have appeared.

The main goal of my project is to propose a portal that gives access to a categorized set of information, so that users can find the information they are interested in within minimum time. In this project I present the application of machine learning and related technologies to the problem of information extraction from structured documents found on the Web, such as HTML documents. In this sense, my work fits within the category of Web content mining. To evaluate this approach, I design and implement a system, AUT_UniversitiesPortal, a portal that helps students find universities of interest anywhere in the world. Support vector machines, among the most effective text categorization algorithms, are used in my project to design an efficient text categorization system.

Keywords: Data mining, Web mining, Text categorization approaches, Portal
