Technology
COMPUTING
The 7 pillars of Big Data
The oil and gas sector has to deal with large amounts of data. Big Data
is about how to draw value and competitive advantage from that data.
Dr Satyam Priyadarshy, Chief Data Scientist at Halliburton, Landmark,
considers there are seven key components, reports Brian Davis.
Typically, software experts talk
about Big Data in terms of
the ‘3Vs’ – volume, velocity
and variety (see Petroleum Review,
December 2014, p14–16). However,
Dr Satyam Priyadarshy, Chief Data
Scientist at Halliburton, Landmark,
considers there are actually ‘7Vs’
that describe Big Data – volume,
velocity, variety, veracity, virtual,
variability and value.
Priyadarshy suggests that Big
Data is an emerging landscape
which consists of a variety of
technologies, algorithms, products
and solutions for leveraging a wide
range of disparate data sets with a
high degree of complexity. He sees
Big Data as an enabler for value
creation to remain competitive in a
global environment by leveraging
data ingestion, mining and
visualisation of all the geophysical,
seismic and related E&P data sets.
The 7Vs explained
The first two components of Big
Data are volume and velocity. This
means addressing the volume of
data coming in – which is rising
from terabytes to petabytes – as
well as the velocity at which
data has to be analysed. Some
data is received in real-time
without the need for real-time
analysis. However, some has to be
analysed in real-time for alerts and
operational efficiency.
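
The difference between data that simply arrives in real-time and data that must be analysed in real-time can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sensor values, the `PRESSURE_LIMIT` threshold and both function names are hypothetical and do not come from Priyadarshy or Landmark.

```python
from typing import Iterable, Iterator, List

PRESSURE_LIMIT = 5000.0  # hypothetical alert threshold, arbitrary units


def realtime_alerts(readings: Iterable[float],
                    limit: float = PRESSURE_LIMIT) -> Iterator[str]:
    """Check each reading as it arrives and yield an alert straight away."""
    for i, value in enumerate(readings):
        if value > limit:
            yield f"ALERT: reading {i} = {value} exceeds {limit}"


def batch_average(stored: List[float]) -> float:
    """Analyse readings later, once they have been stored."""
    return sum(stored) / len(stored)


stream = [4800.0, 4950.0, 5100.0, 4700.0]  # made-up sensor values
alerts = list(realtime_alerts(stream))     # real-time path: one alert raised
average = batch_average(stream)            # batch path: summary computed afterwards
print(alerts)
print(average)
```

The same readings feed both paths; only the alert check has to keep pace with the incoming data.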
34 Petroleum Review | January 2015
The third component of Big
Data is variety, consisting of
structured or unstructured data.
Structured data (on temperature,
pressure and other parameters)
generally comes from sensors.
The fourth component is the
veracity of data, as confusion can
arise when there is no complete
answer to the question ‘How true is
the data?’
The next component is virtual
data. This component enables the
E&P industry to pipeline the data
for analytics, from internal and
external sources, with the right
governance without the extra cost
of de-duplication and loss in
transformation.
Variability can occur in each of
the other six components, depending
on which stage of an E&P
programme one is interested in.
For example, seismic data is
generated in high volume, low
velocity and the associated value
of seismic data interpretation is
high. On some occasions, by contrast,
the velocity of drilling data may be
low and variety low, but the value
can be significant. Sometimes, the
volume of data is small, but it arrives
at high speed and needs to be
analysed in real-time, in-field.
The seventh component of Big
Data is the most important – value.
Without the value in data, the
other Vs do not matter in E&P.
Big Data can offer a measure of
business performance from a
historical perspective, as an
ongoing measure, and as a forecast
of future performance. ‘Big Data
has a multi-functional aspect,’
explains Priyadarshy. ‘In order to
create value from these data assets,
the industry should leverage
emerging technologies for four key
areas of business, namely:
innovation, business strategy,
faster and better decision making.’
Big Data is about focusing on
business growth by converting
data assets into valuable products,
ie by finding new patterns in
historical, existing and future data
from the E&P ecosystem, which
can lead to innovation, strategy
change and better decisions.
Priyadarshy recommends an
agile attitude towards new
technology adoption for Big Data
purposes, ‘leveraging multiple
technologies, applications and
programmes (TAPS) to discover
new patterns, and building holistic
capabilities which augment first
principle models’. This requires a
change of mindset, breaking down
data and cultural silos while
maintaining proper governance, in
order to integrate data across the
business in a more effective
manner.
Here again, Big Data requires a
paradigm shift, asking new
questions about operations and
functions from new patterns that
emerge from pipelining all the
data to determine its impacts on
the E&P business. ‘Big Data
enables an inquiry-based
approach, examining patterns or
answers to questions that lead to
new ways to solve complex
problems,’ he says.
E&P is typified by large volumes
of data sets from disparate sources.
Consequently, there is a need to
connect data from various sources,
using iterative modelling, in
concert with domain expertise and
a well-qualified data scientist
team. Full executive buy-in is
necessary to achieve the most
profitable outcomes, safely and
effectively.
New technology
Emerging technology, commonly
called the Internet of Things, is
enabling disparate data sets to
be generated at high frequency.
Priyadarshy draws an analogy
between gold mining and Big
Analytics. In the former, it
typically takes 22 tonnes of raw
ore to produce 0.04 ounces of gold.
Hopefully, the odds are better in
the world of Big Data.
‘Advanced analytics helps
understanding of cause-effect
relationships, prediction of future
events and identifies the best
possible courses of action.
Moreover, Big Analytics can give
actionable insights for increasing
revenue and profit, for value
measurement and enabling
innovation,’ he remarks.
Data is only the foundation.
Above that is the ‘information’
layer, whether using algorithms or
creating data-driven models. Here,
visualisation tools are important.
Traditionally, people would
collect data and transform it to
create what are called cubes, which
were loaded into data warehouses.
The focus for Big Data is to ‘extract,
load and transform’. Typically, data
is collected and loaded into
Hadoop, an open source framework
which enables distributed
processing of large data sets across
clusters of servers, where it can be
transformed and analysed.
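
The ‘extract, load and transform’ order can be sketched as follows. This is a minimal illustration and not a description of any particular product: the records are made up, the function names are hypothetical, and an in-memory Python list stands in for a distributed store. The point is only that raw data is loaded untouched and parsed afterwards, rather than being shaped before it reaches the warehouse.

```python
# Raw records as they might arrive from the field (made-up values):
# well id, temperature, pressure.
raw_records = [
    "W1,85.2,3100",
    "W2,not_available,2950",
    "W1,86.0,3150",
]


def extract():
    """Extract: collect the raw records from their source."""
    return list(raw_records)


def load(records, store):
    """Load: keep every raw record; nothing is discarded at this stage."""
    store.extend(records)


def transform(store):
    """Transform: parse after loading, skipping rows that fail to parse."""
    parsed = []
    for line in store:
        well, temp, pressure = line.split(",")
        try:
            parsed.append({"well": well,
                           "temp_c": float(temp),
                           "pressure": float(pressure)})
        except ValueError:
            continue  # the raw row is still held in the store for later use
    return parsed


data_store = []  # stand-in for a distributed data store (assumption)
load(extract(), data_store)
clean = transform(data_store)
print(len(data_store), len(clean))
```

Because the unparsed row remains in the store, a later analysis can apply a different transformation without going back to the source.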
Priyadarshy emphasises: ‘Data
is stored in various places within
the oil and gas industry. But in
order to create value out of it, you
have to create an open data
culture.’ The role of the Chief Data
Scientist is also new to this sector.
So where should companies
focus in terms of Big Data projects?
Priyadarshy recommends that
organisations ask: What are
your business problems? What are
the risk areas? Where can you save
money? ‘You should focus on the
following four areas: innovation,
strategy, and better and faster
decisions,’ he says.
Sample applications
Priyadarshy cites the case where
an operator regularly suffered
stuck pipe during drilling
operations and didn’t know how to
predict when the problem would
arise. ‘This is a classic problem
which can take advantage of Big
Data. It fits into the “better and
faster” decision category because
you want to predict whether there
will be a stuck pipe (say) two hours
or five days from now, so you can
decide fast whether to change the
pressure on the drill bit or bring in
a different team.’
Big Data is also valuable for
analysis of seismic data. Typically,
sequence analysis of waveform
data involves numerous
calculations and takes days to get
results. However, using a scale-out
algorithm with Big Data, you can
run multiple algorithms at the
same time on a large cluster and
get the results far faster, in order to
select which well to drill.
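
The idea of running several algorithms on the same data at once can be sketched with Python's standard library. This is a toy illustration under stated assumptions: the waveform is made up, the three attribute functions are hypothetical examples, and a local thread pool stands in for the large cluster mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor
import math

waveform = [0.0, 1.0, 0.0, -1.0, 0.0, 2.0, 0.0, -2.0]  # made-up trace


def peak_amplitude(trace):
    """Largest absolute sample value."""
    return max(abs(x) for x in trace)


def rms(trace):
    """Root-mean-square amplitude."""
    return math.sqrt(sum(x * x for x in trace) / len(trace))


def zero_crossings(trace):
    """Count sign changes between consecutive non-zero samples."""
    signs = [x > 0 for x in trace if x != 0]
    return sum(a != b for a, b in zip(signs, signs[1:]))


attributes = {"peak": peak_amplitude, "rms": rms,
              "zero_crossings": zero_crossings}

# Submit every attribute calculation at once instead of one after another.
with ThreadPoolExecutor() as pool:
    futures = {name: pool.submit(fn, waveform)
               for name, fn in attributes.items()}
    results = {name: f.result() for name, f in futures.items()}

print(results)
```

On a real cluster the same pattern would be spread across many machines; the thread pool here only shows the shape of the approach.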
Furthermore, Big Data is useful
for determining non-productive
time or ‘invisible loss’ in
operations.
Priyadarshy considers that the
oil and gas industry is ‘way behind
other sectors’ in terms of
leveraging Big Data Analytics. ‘The
oil and gas sector lacks real-time
insight into holistic operations.’
The scale of oil industry data is also
a challenge, because sequential
computation of hundreds of
exploration attributes is involved.
There are granularity differences
throughout the lifecycle of an E&P
programme, and a veritable data
explosion.
The oil and gas sector is faced
with complexity along multiple
dimensions in terms of global and
national regulatory frameworks,
ever expanding geographical and
geophysical reach, volatile markets
with fluctuating demand and
complex compliance, a diverse
workforce, with variable cost and
operational efficiency, and suffers
the domino effect due to lack of
collaboration and communication
among third party vendors.
Areas of concern
Priyadarshy says there are four
areas of concern. ‘There is lack of
knowledge of Big Data across the
board. Confusingly, people think
Big Data is about dealing with
large amounts of data. It isn’t. Big
Data also encompasses small data,
so I prefer to define it as “All Data”.’
Secondly, he suggests Big Data is
not simply about business
intelligence. ‘Big Data is a data
science problem, to help a company
find new patterns using data-driven
models. For example,
though several hundred papers
have been published about stuck
pipe, the problem still exists. Your
domain expertise can only be as
good as your openness to new
ideas.’
Thirdly, you must have an open
mind, which means looking for
new patterns in the data, then
asking the right questions. ‘This
requires patience at the leadership
level. It’s like a science experiment,
because often you simply won’t
know which data sets to connect at
the beginning,’ he says.
Many consultants make the
mistake of confusing complex Big
Data problems with business
intelligence. ‘Data scientists try to
find new patterns by merging Big
Data with domain experts.’
He believes Big Data projects are
bogged down by too many
stakeholders. ‘Everybody
(numerous consultants) thinks
they can solve the problem, but
nobody looks at the actual
requirement and asks: Why am I
doing this? Why should I connect
the data?’
Fourthly, there is an explosion
of Big Data technology. ‘The
technology boom is welcome but
can also be confusing. Unless the
organisation is agile, you cannot
leverage the explosion of
technology available to handle Big
Data,’ he adds.
Finally, Priyadarshy emphasises
the need for an open data culture.
‘Despite concerns about data
security, the virtual concept of Big
Data allows full governance with
good accessibility for analytics, in a
protected environment. An open
data culture is necessary to create
value from your data asset. If your
data is in silos or takes numerous
forms, you will not be able to
connect the data sets to create new
patterns.’
The big question is: ‘What is my
business strategy and where is the
largest sunk cost? How can this be
avoided to boost profitability and
business efficiency?’ To sum up, Big
Data is about getting value from all
data by leveraging emerging
technologies and pattern-based
studies for innovation, strategy,
faster decisions and better
decisions to help the bottom line.