10.1057/9781137379283 - Predictive Analytics, Data Mining and Big Data, Steven Finlay
Copyright material from www.palgraveconnect.com - licensed to npg - PalgraveConnect - 2017-05-02
Predictive Analytics, Data Mining and Big Data
Myths, Misconceptions and Methods
Steven Finlay
© Steven Finlay 2014
All rights reserved. No reproduction, copy or transmission of this
publication may be made without written permission.
No portion of this publication may be reproduced, copied or transmitted
save with written permission or in accordance with the provisions of the
Copyright, Designs and Patents Act 1988, or under the terms of any licence
permitting limited copying issued by the Copyright Licensing Agency,
Saffron House, 6–10 Kirby Street, London EC1N 8TS.
The author has asserted his right to be identified as the author of this
work in accordance with the Copyright, Designs and Patents Act 1988.
First published 2014 by
PALGRAVE MACMILLAN
Palgrave Macmillan in the UK is an imprint of Macmillan Publishers Limited,
registered in England, company number 785998, of Houndmills, Basingstoke,
Hampshire RG21 6XS.
Palgrave Macmillan in the US is a division of St Martin’s Press LLC,
175 Fifth Avenue, New York, NY 10010.
Palgrave Macmillan is the global academic imprint of the above companies
and has companies and representatives throughout the world.
Palgrave® and Macmillan® are registered trademarks in the United States,
the United Kingdom, Europe and other countries.
ISBN 978–1–137–37927–6
This book is printed on paper suitable for recycling and made from fully
managed and sustained forest sources. Logging, pulping and manufacturing
processes are expected to conform to the environmental regulations of the
country of origin.
A catalogue record for this book is available from the British Library.
A catalog record for this book is available from the Library of Congress.
Typeset by MPS Limited, Chennai, India.
Any person who does any unauthorized act in relation to this publication
may be liable to criminal prosecution and civil claims for damages.
To Ruby and Samantha
Contents
Figures and Tables
Acknowledgments
1 Introduction
1.1 What are data mining and predictive analytics?
1.2 How good are models at predicting behavior?
1.3 What are the benefits of predictive models?
1.4 Applications of predictive analytics
1.5 Reaping the benefits, avoiding the pitfalls
1.6 What is Big Data?
1.7 How much value does Big Data add?
1.8 The rest of the book
2 Using Predictive Models
2.1 What are your objectives?
2.2 Decision making
2.3 The next challenge
2.4 Discussion
2.5 Override rules (business rules)
3 Analytics, Organization and Culture
3.1 Embedded analytics
3.2 Learning from failure
3.3 A lack of motivation
3.4 A slight misunderstanding
3.5 Predictive, but not precise
3.6 Great expectations
3.7 Understanding cultural resistance to predictive analytics
3.8 The impact of predictive analytics
3.9 Combining model-based predictions and human judgment
4 The Value of Data
4.1 What type of data is predictive of behavior?
4.2 Added value is what’s important
4.3 Where does the data to build predictive models come from?
4.4 The right data at the right time
4.5 How much data do I need to build a predictive model?
5 Ethics and Legislation
5.1 A brief introduction to ethics
5.2 Ethics in practice
5.3 The relevance of ethics in a Big Data world
5.4 Privacy and data ownership
5.5 Data security
5.6 Anonymity
5.7 Decision making
6 Types of Predictive Models
6.1 Linear models
6.2 Decision trees (classification and regression trees)
6.3 (Artificial) neural networks
6.4 Support vector machines (SVMs)
6.5 Clustering
6.6 Expert systems (knowledge-based systems)
6.7 What type of model is best?
6.8 Ensemble (fusion or combination) systems
6.9 How much benefit can I expect to get from using an ensemble?
6.10 The prospects for better types of predictive models in the future
7 The Predictive Analytics Process
7.1 Project initiation
7.2 Project requirements
7.3 Is predictive analytics the right tool for the job?
7.4 Model building and business evaluation
7.5 Implementation
7.6 Monitoring and redevelopment
7.7 How long should a predictive analytics project take?
8 How to Build a Predictive Model
8.1 Exploring the data landscape
8.2 Sampling and shaping the development sample
8.3 Data preparation (data cleaning)
8.4 Creating derived data
8.5 Understanding the data
8.6 Preliminary variable selection (data reduction)
8.7 Pre-processing (data transformation)
8.8 Model construction (modeling)
8.9 Validation
8.10 Selling models into the business
8.11 The rise of the regulator
9 Text Mining and Social Network Analysis
9.1 Text mining
9.2 Using text analytics to create predictor variables
9.3 Within document predictors
9.4 Sentiment analysis
9.5 Across document predictors
9.6 Social network analysis
9.7 Mapping a social network
10 Hardware, Software and All that Jazz
10.1 Relational databases
10.2 Hadoop
10.3 The limitations of Hadoop
10.4 Do I need a Big Data solution to do predictive analytics?
10.5 Software for predictive analytics
Appendix A. Glossary of Terms
Appendix B. Further Sources of Information
Appendix C. Lift Charts and Gain Charts
Notes
Index