Automatic Classification of Bookmarked Web Pages
Individual APT Presentation
January 2007

Introduction
• There are no prerequisites
• May be useful for students who intend to follow CSA3200 (Adaptive Hypertext Systems) in 4th year

Aims
• Helping to keep bookmark files organised
• When a user chooses to bookmark a web page, system recommends one of the user’s existing categories (instead of just last location saved to, or bookmark root)

How?
• 2 algorithms to perform bookmark classification
  – One builds a representative document of each category (will be provided)
  – Second approach is up to you
• An additional utility may be proposed to improve results
  – E.g., synonym recognition

Why?
• Having organised bookmark files will enable us to do…
  – Automatic query generation from bookmark files
  – Web page recommendation based on other people’s bookmark files
  – …

How?
• Start with Open Source framework provided by Ian Bugeja in his HyperBK project
• Build algorithms
• Build evaluation platform for your system
  – I will provide 8 bookmark files for you to use
    • You can remove some URLs at random to see if your algorithms classify them correctly
    • You will also attempt to reconstruct each bookmark file from scratch!

Evaluation
• I will provide another 20 bookmark files (with some URLs randomly removed) for you to use to evaluate your algorithms
• Students who have the best performing algorithms and best reports will have opportunity to continue working on system for FYP and to submit co-authored paper to leading IR/Adaptive systems conference

Tools
• I recommend…
  – Mozilla Firefox
  – XUL (XML User Interface Language) and JavaScript
    • A tutorial on XUL will be provided
  – Google API
  – You’ll be able to use Ian Bugeja’s framework and your plugin will be portable!
• But you’re free to use any other browser, platform, language
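
Sketch: representative-document classification and hold-out evaluation

The first approach named in the slides (one representative document per category) and the suggested evaluation (remove URLs at random, then check whether they are classified back correctly) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the algorithm that will be provided with the HyperBK framework: it assumes the text of each bookmarked page has already been fetched, it uses plain term counts with cosine similarity as the matching rule, and every name in it (tokenize, build_category_profiles, recommend_category, holdout_accuracy, and the shape of the bookmarks dictionary) is hypothetical. It is written in Python for brevity; a Firefox plugin would more likely use JavaScript.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_category_profiles(bookmarks):
    """Merge all page texts of a category into one term-count vector.

    `bookmarks` maps a category name to a list of page texts
    (an assumed layout, not the HyperBK data format).
    """
    profiles = {}
    for category, pages in bookmarks.items():
        counts = Counter()
        for page in pages:
            counts.update(tokenize(page))
        profiles[category] = counts
    return profiles


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_category(page_text, profiles):
    """Return the category whose representative document is most similar."""
    if not profiles:
        return None
    page = Counter(tokenize(page_text))
    return max(profiles, key=lambda c: cosine(page, profiles[c]))


def holdout_accuracy(bookmarks, seed=0):
    """Remove one random page per category and try to classify it back.

    Categories with fewer than two pages are left intact so that every
    category keeps at least one page in its profile.
    """
    rng = random.Random(seed)
    training, held_out = {}, []
    for category, pages in bookmarks.items():
        pages = list(pages)
        if len(pages) >= 2:
            removed = pages.pop(rng.randrange(len(pages)))
            held_out.append((category, removed))
        training[category] = pages
    if not held_out:
        return 0.0
    profiles = build_category_profiles(training)
    correct = sum(
        1 for category, page in held_out
        if recommend_category(page, profiles) == category
    )
    return correct / len(held_out)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    sample = {
        "Cooking": ["pasta recipe tomato sauce garlic",
                    "bread baking flour yeast oven"],
        "Programming": ["python code function compiler",
                        "javascript browser code debugging"],
    }
    profiles = build_category_profiles(sample)
    print(recommend_category("easy garlic bread recipe for the oven", profiles))
    print(holdout_accuracy(sample))
```

Raw term counts favour frequent words such as "the", so a stop-word list or TF-IDF weighting would be a natural refinement, and a synonym-recognition utility of the kind mentioned in the slides could be applied inside tokenize.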