Download LINK - Xtra Effort Solutions

How is “Big Data” different from traditional Business Intelligence?
Big Data | Traditional Business Intelligence
Unstructured data | Structured data
Don’t know what you are looking for | Know what you are looking for
Data is not stored in a data warehouse | Data is stored in a data warehouse
Real-time insight from data | Extensive storing and sorting is required; slower
More effective in predicting the future (to gain advantage, reduce risk) | Interpreting the past or present
What trends drive the increase in unstructured data?
Social media
Video
IP Telephony
Geological data from mining operations
Factory Plant data
Web behavior
Financial trading data
Medical and life science data
Scientific data
What are the most compelling “Big Data” applications?
Keep even or ahead of competition in developing services and products
Keep even or ahead of competition in identifying consumer and customer behavior and preferences
Military, defense, and criminal applications
Fraud prevention
IT security
Speeding up health, pharma, and medical research
What is MapReduce?
Invented by Google to index and interpret rich textual web data (data that does not fit easily into the
tables of a traditional database engine)
Improved upon by Yahoo for enterprise use
Software that spreads the processing load across several separate commodity computers, improving
speed and reducing hardware and software costs (or, at a minimum, solving the problem of too much
data for one machine)
Hadoop is the most popular open source implementation of MapReduce
Accesses and keeps track of the disparate data sources that are sent to the disparate computers
Storing and tracking the data across multiple computers enables the process to continue even if some
computers stop operating
Coordinates the processing power of these computers to gain speed
Commercial technology companies and other open source initiatives are developing tools to make it
easier to move data in and out of a Hadoop cluster and to work with it, e.g., “Pig”, which provides a
programming language, and “Hive”, which provides a data-warehouse-like structure on top of Hadoop
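
The idea of splitting work across machines and then combining the results can be sketched in a few lines of Python. This is only an illustration of the map, shuffle, and reduce steps on a single machine, not Hadoop code: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample text chunks are assumptions made up for the example, and a real cluster would run each map call on a different computer.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle: group the emitted values by key across all chunks."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values collected for each key into one count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # Each map_phase call is independent, so in a cluster every chunk
    # could be processed on a separate commodity computer.
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle(mapped))


# Hypothetical input: three chunks standing in for data held on three machines.
chunks = ["big data big insight", "data in a cluster", "big cluster"]
counts = word_count(chunks)
print(counts)
```

Because the map step treats every chunk independently, a chunk whose computer stops operating can simply be processed again elsewhere, which is the property the points above describe.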