How is “Big Data” different from traditional Business Intelligence?

Big Data | Traditional Business Intelligence
Unstructured data | Structured data
Don’t know what you are looking for | Know what you are looking for
Data is not stored in a data warehouse | Data is stored in a data warehouse
Real-time insight from data | Extensive storing and sorting is required; slower
More effective in predicting the future (to gain advantage, reduce risk) | Interpreting the past or present

What trends drive the increase in unstructured data?
* Social media video
* IP telephony
* Geological data from mining operations
* Factory plant data
* Web behavior
* Financial trading data
* Medical and life science data
* Scientific data

What are the most compelling “Big Data” applications?
* Keep even with or ahead of the competition in developing services and products
* Keep even with or ahead of the competition in identifying consumer and customer behavior and preferences
* Military, defense, and criminal applications
* Fraud prevention
* IT security
* Speed of health, pharma, and medical research

What is MapReduce?
* Invented by Google to index and interpret rich textual web data (data that does not fit easily into the tables of a traditional database engine)
* Improved upon by Yahoo for enterprise use
* Software that shares the processing demand across several disparate commodity computers, improving speed and reducing hardware and software costs (or, at a minimum, solving the problem of too much data for one machine)
* Hadoop is the most popular open source implementation of MapReduce
* Accesses and keeps track of the disparate data sources that are sent to the disparate computers
* Storing and tracking the data across multiple computers enables the process to continue even if some computers stop operating
* Coordinates the processing power of these computers to gain speed
* Commercial technology companies and other open source initiatives are developing tools that make it easier to move data in and out of a Hadoop cluster, e.g., “Pig”, a programming language, and “Hive”, which provides a data-warehouse-like structure on top of Hadoop
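
To make the MapReduce idea concrete, here is a minimal single-machine sketch in Python of its three stages (map, shuffle, reduce), using word counting as the task. It is only an illustration and does not use Hadoop's actual API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample text are assumptions made for this example. In a real cluster, each chunk would be sent to a different computer and the framework would do the grouping and coordination.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(mapped_outputs):
    """Shuffle: group the emitted values by key across all mapper outputs."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

def word_count(chunks):
    # Each chunk is independent, so in a cluster every map_phase call
    # could run on a different machine. Here they simply run in a loop.
    mapped = [map_phase(chunk) for chunk in chunks]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical input, already split into two chunks
chunks = ["big data is big", "data is everywhere"]
counts = word_count(chunks)
print(counts)  # {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

Because no map call depends on any other, adding machines lets more chunks be processed at the same time, which is where the speed gain described above comes from.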