COMP1942 Project: Nursery Application
Prepared by Raymond Wong
Presented by Raymond Wong
raywong@cse

Dataset
Two real datasets
- First dataset “training.txt”
  - Contains the information about nursery applications in a city
  - 9,945 records
  - 8 attributes
  - 1 additional Boolean attribute called “success” indicating whether the nursery application is successful or not
- Second dataset “test.txt”
  - 3,015 records
  - 8 attributes
  - No additional Boolean attribute

Dataset
8 attributes
- parent-occupation
- child-nursery
- form-of-the-family
- no-of-children
- housing-condition
- finance-standing-of-the-family
- social-condition
- health-condition

Dataset
Boolean attribute
- success

Objective
- To predict whether each nursery application in the second dataset is successful or not

Project
- There are three phases in this project

Project Deadlines
- Due Date for Phase 1: 24 Feb, 2017 9am
- Due Date for Phase 2: 21 April, 2017 9am
- Due Date for Phase 3: 5 May, 2017 9am

Tasks to be done
- Phase 1
  - To generate an Excel file from two raw files together with attribute names
- Phase 2
  - To write a design report for this project
  - To list 5 possible data mining models you want to try
- Phase 3
  - To follow the design report in Phase 2
  - To predict whether each nursery application in the second dataset is successful or not by using a data mining tool (XLMiner)
  - To write a final report

Grading Policy
- Phase 1
  - Excel file (10%) (via CASS)
- Phase 2
  - Design Report (30%) (in class)
- Phase 3
  - Final Report (40%) (in class)
  - Predicted Attribute Files for the Second Real Dataset (20%) (via CASS)

Mark Deduction
- No late submission is allowed for Phase 1 and Phase 2
- Late submission is allowed for Phase 3

  Number of Days Late    Deduction (out of 100 marks)
  1                      10
  2                      30
  3                      70
  4 or above             100

Grading Scheme
- Phase 1
  - Excel file
    - Very easy to obtain full scores!
- Phase 2
  - Design Report
    - Specify the data mining models clearly (e.g., what model you are using and what exact set of parameters you are using)
    - Note: There are no STANDARD answers.
- Phase 3
  - Final Report
    - Follow the design report
    - Write observations clearly
    - Analyze the results in a “logical” way
    - Note: There are no STANDARD answers.
  - Predicted Attribute Files
    - We will compare your predicted files with our files.
    - Note: There is a STANDARD answer.
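
The slides do not say how the predicted files are compared with the reference files. As a rough sketch only, the Python snippet below assumes one predicted value per record and scores a submission by the fraction of records that match. The function name `agreement_rate`, the "yes"/"no" labels, and the sample values are all hypothetical and are not taken from the course material.

```python
def agreement_rate(predicted, reference):
    """Return the fraction of positions where two label lists agree.

    Assumption: both lists hold one label per record, in the same order.
    """
    if len(predicted) != len(reference):
        raise ValueError("both files must contain the same number of records")
    if not predicted:
        raise ValueError("no records to compare")
    # Compare labels case-insensitively and ignore surrounding whitespace.
    matches = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predicted, reference)
    )
    return matches / len(predicted)


# Made-up labels for illustration only: three of the four positions agree.
predicted = ["yes", "no", "yes", "no"]
reference = ["yes", "no", "no", "no"]
rate = agreement_rate(predicted, reference)
print(f"{rate:.2%}")
```

A check like this can be run on a held-out part of “training.txt”, where the “success” value is known, to estimate how a model might fare before the Phase 3 submission.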