Panel: The Art of Data Mining, and the Quest for Greater Insight

Moderator:
Kate Smith-Miles, Deakin University, Australia

Panelists:
Kristin Bennett, Rensselaer Polytechnic Institute, USA
Sven Crone, Lancaster University, UK
Wlodzislaw Duch, Nicolaus Copernicus University, Poland
Isabelle Guyon, ClopiNet, USA
Nik Kasabov, Auckland University of Technology, New Zealand
Zhi-Hua Zhou, Nanjing University, China

Overview

The data mining process requires decisions at every stage: selection of data and variables, choice of suitable sampling methods, data pre-processing steps, selection of the best knowledge discovery algorithms, and selection of parameters. With so many choices that can significantly affect the eventual success of the results, data mining can seem more art than science unless the user is highly knowledgeable. Is there a science to data mining? Or is it still more art than science? What insights do our experts have about which methods to use when?

Aims

This panel discussion aims to bring together experts in data mining to see if we can come up with some ideas about our collective knowledge of when certain techniques (algorithms, pre-processing methods, etc.) are expected to perform well.
How much insight do we have into the most effective data mining process?
How can recent research in model selection and meta-learning help us to gain greater insight into the most effective data mining steps for a given problem?
Can we take some of the mystery and the need for trial and error out of the process, come up with some expert guidelines, and lay the foundations for merging this information with large-scale empirical analysis in the future?

Questions for discussion

Is there a science to data mining?
Do you have your own rules (developed by experience) about when certain methods should be used, or not used?
Consider each stage: selection of data and variables, choice of suitable sampling methods, data pre-processing steps, selection of the best knowledge discovery algorithms, and selection of parameters.
What about empirical studies (meta-learning, model selection, etc.) aimed at learning these rules?
What would we need to do to take the trial and error and the art out of the process, making data mining more user-friendly and effective?
Next steps?