ACCTG 6910, Spring 2003
DESB, University of Utah

Project Milestone 5 (April 3 – 17)

Question 1 (25%): Discover access patterns in web logs.

The supervisory council for the University of Utah’s web portal has contacted the e.bis Research Lab to discover user access patterns from its web logs. As a volunteer in the Lab, you have been asked to perform association rule and sequential pattern mining tasks on a small sample web log. It contains 4736 users, 10000 sessions and 11042 visits, with the following attributes:

Columns   Attribute
1-5       user id
7-11      session id
13-17     URL id

Step 1: From the Project section of the class website, download the data set (weblog.txt) and a text file (urlmapping.txt) that maps the URL codes in weblog.txt to URLs in UU’s web site.

Step 2: Use IBM Intelligent Miner to mine the data set for large item sets, association rules and large sequential patterns. Use a support level of 0.3% for both association rule and sequential pattern mining, and a confidence level of 50% for association rule mining. Then mine the data set again using two different support levels for both association rule and sequential pattern mining.

Step 3: Report and analyze the results. Identify 10 interesting association rules and 10 large sequential patterns. Use urlmapping.txt to find the URLs that match the URL ids in the rules/patterns. Write up a short analysis (one to two paragraphs) of these rules/patterns and any actions you recommend the supervisory council consider.

Question 2 (25%): If the data file included referrer and visit duration information for each visit, discuss how you might use clustering to help identify clusters in the data file.
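For Question 1, the support and confidence thresholds in Step 2 can be made concrete with a short sketch. This is not IBM Intelligent Miner and is not meant to reproduce its output; it is a minimal plain-Python illustration, limited to rules between pairs of URLs. It assumes that each line of weblog.txt is a fixed-width record laid out as described above (user id, session id, URL id) and that a session is the transaction unit. The function names and the sample records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def parse_line(line):
    """Split one fixed-width record into (user_id, session_id, url_id)."""
    # The handout gives 1-indexed, inclusive positions: 1-5, 7-11, 13-17.
    return line[0:5].strip(), line[6:11].strip(), line[12:17].strip()


def sessions_from_lines(lines):
    """Group URL ids by session id (assumption: one session = one transaction)."""
    sessions = defaultdict(set)
    for line in lines:
        if line.strip():
            _, session_id, url_id = parse_line(line)
            sessions[session_id].add(url_id)
    return sessions


def pair_rules(sessions, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) for URL pairs that pass both thresholds."""
    n = len(sessions)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for urls in sessions.values():
        for url in urls:
            item_count[url] += 1
        for pair in combinations(sorted(urls), 2):
            pair_count[pair] += 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # fraction of sessions containing both URLs
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Hypothetical sample records in the assumed layout; not taken from weblog.txt.
SAMPLE_LINES = [
    "00001 00001 00010",
    "00001 00001 00020",
    "00002 00002 00010",
    "00002 00002 00020",
    "00003 00003 00010",
    "00004 00004 00030",
]

if __name__ == "__main__":
    sample_sessions = sessions_from_lines(SAMPLE_LINES)
    for lhs, rhs, support, confidence in pair_rules(sample_sessions, 0.003, 0.5):
        print(f"{lhs} -> {rhs}  support={support:.3f}  confidence={confidence:.3f}")
```

To try it on the real data, read the lines of weblog.txt and pass them to `sessions_from_lines` in place of `SAMPLE_LINES`. Raising or lowering `min_support` shows why Step 2 asks for additional support levels: a lower threshold admits more, and rarer, rules.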