Download Git Version Control and Projects

Big Data
Different Types of Analytics
Descriptive and Predictive Analytics:
◦ Descriptive analytics reports what happened and analyzes the contributing data to figure out
why it happened.
◦ Predictive analytics uses statistics and data mining techniques to make predictions about the future.
Prescriptive Analytics:
◦ Analytics that recommends actions
Social Media Analytics:
◦ Analysis of public opinion (behavioral patterns, tastes, targeted marketing)
Entity Analytics:
◦ Analytics that groups/clusters data about entities (and learns from the raw data)
Cognitive Computing:
◦ Human/Computer Interaction that is targeted for information exchange
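The difference between descriptive and predictive analytics can be sketched in a few lines of Python. This is a minimal illustration, not a production method: the sales figures are made-up sample data, and `fit_line` is a hypothetical helper that fits an ordinary least-squares line.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative data only).
sales = [100.0, 110.0, 120.0, 130.0]
months = list(range(len(sales)))  # 0, 1, 2, 3

# Descriptive analytics: summarize what already happened.
average_sales = mean(sales)
total_growth = sales[-1] - sales[0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Predictive analytics: extrapolate the fitted trend one month ahead.
slope, intercept = fit_line(months, sales)
next_month_forecast = slope * len(sales) + intercept

print(f"average={average_sales}, growth={total_growth}, forecast={next_month_forecast}")
```

The descriptive part only looks backward at recorded values, while the predictive part uses a statistical model to estimate a value that has not been observed yet.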
Big Data
Big data are datasets whose size exceeds the typical reach of a DBMS
to capture, store, manage, and analyze.
One way to categorize the different types of big data is according to
the 4 V's: volume, velocity, variety, and veracity.
Big Volume
Volume - the size of the data being managed.
Automatically collected information often leads to huge amounts of data.
◦ Sensor Data (e.g., environmental)
◦ Scanning Equipment (card readers)
◦ Industrial Internet of Things (heavily sensored
manufacturing processing / RFID)
◦ Multimedia Data (Video / Audio / Everything Else)
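To see how quickly automatically collected data adds up, here is a rough back-of-the-envelope calculation. The sensor count, sampling rate, and reading size are assumed values chosen only for illustration, and `daily_volume_bytes` is a hypothetical helper.

```python
def daily_volume_bytes(num_sensors, readings_per_second, bytes_per_reading):
    """Estimate the raw bytes produced per day by a fleet of sensors."""
    seconds_per_day = 24 * 60 * 60
    return num_sensors * readings_per_second * bytes_per_reading * seconds_per_day


# Assumed scenario: 10,000 sensors, 10 readings per second, 100 bytes each.
volume = daily_volume_bytes(num_sensors=10_000, readings_per_second=10, bytes_per_reading=100)
volume_gb = volume / 1e9  # decimal gigabytes

print(f"{volume_gb:.0f} GB per day")
```

Under these assumptions the fleet produces 864 GB of raw readings per day, before any indexing or replication overhead.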
Velocity - the speed at which data is created,
accumulated, ingested, and processed.
Even if a database can handle the amount of data
that needs to be stored, it also needs to be fast
enough to process that information as it arrives. Examples:
◦ High Frequency Stock Trading
◦ Detection of Malicious Activity in Call Network
◦ Real-Time Processing of Trends on Facebook / Twitter
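One common way to keep up with fast-arriving events is to keep only a recent time window in memory and update counts incrementally. The sketch below is a simplified illustration, not how any particular platform does it: the `SlidingWindowCounter` class, the 60-second window, and the sample events are all assumptions, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Count topics seen within the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, topic) pairs in arrival order
        self.counts = Counter()

    def add(self, timestamp, topic):
        """Record one event and drop events that fell out of the window."""
        self.events.append((timestamp, topic))
        self.counts[topic] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Events at or before (now - window) are no longer in the window.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, old_topic = self.events.popleft()
            self.counts[old_topic] -= 1
            if self.counts[old_topic] == 0:
                del self.counts[old_topic]

    def top(self, n=1):
        """Return the n most frequent topics currently in the window."""
        return self.counts.most_common(n)


counter = SlidingWindowCounter(window_seconds=60)
for ts, topic in [(0, "a"), (10, "b"), (20, "b"), (70, "a"), (75, "a")]:
    counter.add(ts, topic)

print(counter.top(1))
```

Each event is handled in amortized constant time, so the cost of processing does not grow with the total amount of data stored, only with the arrival rate.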
Variety - big data includes structured, semi-structured, and
unstructured data in different proportions based on context.
Structured data features a formal data model,
such as the relational model (rows and columns) or
a hierarchical model (nested structures).
Unstructured data has no identifiable formal structure.
◦ In MongoDB, we used semi-structured document-oriented data.
◦ In Neo4j, we stored data as a graph.
◦ Other unstructured data:
◦ Emails, web content (blogs), PDFs, audio, video, images, clickstreams (cookie data)
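The kinds of data described above can be contrasted with small Python stand-ins. This is only an illustrative sketch: the field names and values are invented, and plain Python structures stand in for a relational table, a MongoDB-style document, and a Neo4j-style graph.

```python
import json

# Structured: a fixed set of columns, every row has the same shape.
columns = ("id", "name", "city")
row = (1, "Ada", "London")
record = dict(zip(columns, row))

# Semi-structured: a JSON document with nested and optional fields.
document = json.loads(
    '{"id": 1, "name": "Ada", "tags": ["math"], "address": {"city": "London"}}'
)

# Graph: nodes connected by relationships, shown here as an adjacency list.
graph = {"Ada": ["Charles"], "Charles": []}

# Unstructured: free text with no schema; structure must be extracted by analysis.
email_body = "Hi Charles, see you in London next week."
word_count = len(email_body.split())

print(record["city"], document["address"]["city"], graph["Ada"], word_count)
```

The same fact (Ada is in London) is directly addressable in the structured and semi-structured forms, but in the email it can only be recovered by interpreting the text.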
Veracity is composed of two components:
◦ Credibility of the source
◦ Suitability of the data for its target audience
Much of the data in big data stores has different
levels of trustworthiness and must go through
quality testing and credibility analysis before
being used.
Many sources generate data that is uncertain,
incomplete, and inaccurate.
Databases holding such information need to be
able to manage such questionable data.
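A simple form of quality testing and credibility analysis is to reject incomplete records and records from sources below a trust threshold. The sketch below is one possible approach under stated assumptions: the source names, the credibility scores, the 0.5 threshold, and the helpers `is_complete` and `usable` are all hypothetical.

```python
# Hypothetical credibility scores per source, between 0.0 and 1.0.
SOURCE_CREDIBILITY = {"calibrated_sensor": 0.9, "web_scrape": 0.4}


def is_complete(record, required=("source", "value")):
    """A record is complete if every required field is present and not None."""
    return all(record.get(field) is not None for field in required)


def usable(records, min_credibility=0.5):
    """Keep complete records whose source meets the credibility threshold."""
    return [
        r
        for r in records
        if is_complete(r)
        and SOURCE_CREDIBILITY.get(r["source"], 0.0) >= min_credibility
    ]


records = [
    {"source": "calibrated_sensor", "value": 21.5},   # complete and credible
    {"source": "web_scrape", "value": 19.0},          # complete, low credibility
    {"source": "calibrated_sensor", "value": None},   # credible but incomplete
]
trusted = usable(records)

print(trusted)
```

Unknown sources default to a credibility of 0.0 here, which is a conservative design choice; a real system might instead quarantine such records for review rather than discard them.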