Data Science - LIACS Data Mining Group

Data Mining, Data Science, Big Data
Data Science
• Data Science aims to extract insights from large data
• Less emphasis on algorithms
• More emphasis on ‘outreach’
• The term Data Science is about 10 years old and very popular nowadays
• Many people reinvent themselves as Data Scientists
  ◦ data miners, statisticians, BI people, analysts, database developers
Data Mining & Data Science
[Figure labels: Data Science, Data Mining, Statistics]
• Computational methods
• Dealing with large data
• Visualisation
• Involving domain knowledge
• Interpretable and interpreted results
Big Data
• Because you can…
• Cheap storage
• Administrative/financial reasons
• Internet and social computing
• Internet of Things, ubiquitous computing
[Figure: cost per gigabyte in dollars, 1980 to 2010; axis ticks at $1,000,000, $10,000, $100, $1 and $0.01]
Cheap Storage
1956, IBM 350, 5 MB
90 TB
Big Data
Many facets; people often focus on only one
• Very, very large data
• CERN, Google, Facebook, Twitter, …
• Analytics
• Internet-generated
• Social data
• Heterogeneous, unstructured data
• Large-scale technologies
• MapReduce, Hadoop
Size-complexity trade-off
• Technological restrictions produce a trade-off
• Many Big Data projects are algorithmically not so complex
• Embarrassingly parallel
[Figure: axes labelled size and complexity; CERN marked]
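MapReduce, listed earlier among the large-scale technologies, is a typical fit for embarrassingly parallel work: each chunk of input can be processed without looking at any other chunk. The following is a minimal single-process sketch of that idea in Python, using word counting as the task. It is not Hadoop's API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample chunks are made up for illustration.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit (word, 1) pairs for one chunk, independently of all other chunks."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group the emitted values by key (here: by word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the values of each key into a single result (here: a sum)."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # In a real cluster each map_phase call could run on a different machine;
    # here they simply run one after another in the same process.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(mapped))


# Hypothetical input, split into three chunks.
chunks = ["big data big", "data science", "big science"]
counts = word_count(chunks)
print(counts)  # {'big': 3, 'data': 2, 'science': 2}
```

Because map_phase never needs data from another chunk, adding more machines mainly means handing out more chunks. This is why such jobs can scale to very large data while staying algorithmically simple.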