Technology: Computing
Petroleum Review | January 2015

The 7 pillars of Big Data

The oil and gas sector has to deal with large amounts of data. Big Data is about how to draw value and competitive advantage from that data. Dr Satyam Priyadarshy, Chief Data Scientist at Halliburton, Landmark, considers there are seven key components, reports Brian Davis.

Typically, software experts talk about Big Data in terms of the ‘3Vs’: volume, velocity and variety (see Petroleum Review, December 2014, p14–16). However, Dr Satyam Priyadarshy, Chief Data Scientist at Halliburton, Landmark, considers there are actually ‘7Vs’ that describe Big Data: volume, velocity, variety, veracity, virtual, variability and value.

Priyadarshy suggests that Big Data is an emerging landscape which consists of a variety of technologies, algorithms, products and solutions for leveraging a wide range of disparate data sets with a high degree of complexity. He sees Big Data as an enabler of value creation, helping companies remain competitive in a global environment by leveraging data ingestion, mining and visualisation of all the geophysical, seismic and related E&P data sets.

The 7Vs explained

The first two components of Big Data are volume and velocity. This means addressing the volume of data coming in (which is rising from terabytes to petabytes) as well as the velocity at which data has to be analysed. Some data is received in real-time without the need for real-time analysis. However, some has to be analysed in real-time for alerts and operational efficiency.

The third component of Big Data is variety, which covers both structured and unstructured data. Structured data (on temperature, pressure and other parameters) generally comes from sensors.

The fourth component is the veracity of data, as confusion can arise from incomplete definitions when asking ‘How true is the data?’

The next component is virtual data.
This component enables the E&P industry to pipeline the data for analytics, from internal and external sources, with the right governance and without the extra cost of de-duplication or loss in transformation.

Variability can occur in each of the other six components, depending on which stage of an E&P programme is of interest. For example, seismic data is generated in high volume and at low velocity, and the associated value of its interpretation is high. By contrast, on some occasions the velocity of drilling data may be low and its variety low, but the value can be significant. Sometimes the volume of data is small, but it arrives at speed and needs to be analysed in real-time, in-field.

The seventh and most important component of Big Data is value. Without the value in data, the other Vs do not matter in E&P. Big Data can offer a measure of business performance from a historical perspective, as an ongoing measure, and as a forecast of future performance.

‘Big Data has a multi-functional aspect,’ explains Priyadarshy. ‘In order to create value from these data assets, the industry should leverage emerging technologies for four key areas of business, namely innovation, business strategy, and faster and better decision making.’

Big Data is about focusing on business growth by converting data assets into valuable products, ie by finding new patterns in historical, existing and future data from the E&P ecosystem, which can lead to innovation, strategy change and better decisions.

Priyadarshy recommends an agile attitude towards new technology adoption for Big Data purposes, ‘leveraging multiple technologies, applications and programmes (TAPS) to discover new patterns, and building holistic capabilities which augment first principle models’.

This requires a change of mindset, breaking down data and cultural silos while maintaining proper governance, in order to integrate data across the business more effectively.
Here again, Big Data requires a paradigm shift, asking new questions about operations and functions from new patterns that emerge from pipelining all the data to determine its impacts on the E&P business. ‘Big Data enables an inquiry-based approach, examining patterns or answers to questions that lead to new ways to solve complex problems,’ he says.

E&P is typified by large volumes of data sets from disparate sources. Consequently, there is a need to connect data from various sources, using iterative modelling, in concert with domain expertise and a well-qualified data scientist team. Full executive buy-in is necessary to achieve the most profitable outcomes, safely and effectively.

New technology

Emerging technology, commonly called the Internet of Things, is enabling disparate data sets to be generated at high frequency.

Priyadarshy draws an analogy between gold mining and Big Analytics. In the former, it typically takes 22 tonnes of raw ore to produce 0.04 ounces of gold. Hopefully, the odds are better in the world of Big Data.

‘Advanced analytics helps in understanding cause-effect relationships, predicting future events and identifying the best possible courses of action. Moreover, Big Analytics can give actionable insights for increasing revenue and profit, for value measurement and enabling innovation,’ he remarks.

Data is only the foundation. Above that is the ‘information’ layer, whether using algorithms or creating data-driven models. Here, visualisation tools are important. Traditionally, people would collect data and transform it to create what are called cubes, which were loaded into data warehouses. The focus for Big Data is instead to ‘extract, load and transform’.
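As a rough illustration of the difference between the traditional transform-then-load route and ‘extract, load and transform’, the following Python sketch may help. The readings, field names and function names are hypothetical and are not taken from the article or from any vendor tooling; a plain Python list stands in for the data store.

```python
from statistics import mean

# Hypothetical raw sensor readings; the values are illustrative only.
RAW_READINGS = [
    {"well": "A", "pressure_psi": "3010"},
    {"well": "A", "pressure_psi": "3050"},
    {"well": "B", "pressure_psi": "2900"},
    {"well": "B", "pressure_psi": None},  # incomplete record
]


def etl(readings):
    """Traditional route: transform first, then load only the aggregate."""
    by_well = {}
    for r in readings:
        if r["pressure_psi"] is not None:
            by_well.setdefault(r["well"], []).append(float(r["pressure_psi"]))
    # Only the per-well averages reach the warehouse; raw detail is dropped.
    return {well: mean(vals) for well, vals in by_well.items()}


def elt_load(readings):
    """ELT route, step 1: load every raw record unchanged."""
    return list(readings)


def elt_mean_pressure(store, well):
    """ELT route, step 2: transform at analysis time for one question."""
    vals = [
        float(r["pressure_psi"])
        for r in store
        if r["well"] == well and r["pressure_psi"] is not None
    ]
    return mean(vals)


def elt_incomplete_count(store):
    """A second question that the raw store can still answer later."""
    return sum(1 for r in store if r["pressure_psi"] is None)


if __name__ == "__main__":
    warehouse = etl(RAW_READINGS)
    store = elt_load(RAW_READINGS)
    print(warehouse)
    print(elt_mean_pressure(store, "A"))
    print(elt_incomplete_count(store))
```

In this sketch the transform-first route can only answer the question it was built for, while the load-first route keeps the raw records, so a question nobody anticipated (here, how many records are incomplete) can still be asked afterwards.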
Typically, data is collected and loaded into Hadoop, an open source framework that enables distributed processing of large data sets across clusters of servers, where it can be transformed and analysed.

Priyadarshy emphasises: ‘Data is stored in various places within the oil and gas industry. But in order to create value out of it, you have to create an open data culture.’ The role of the Chief Data Scientist is also new to this sector.

So where should companies focus in terms of Big Data projects? Priyadarshy recommends organisations should ask: What are your business problems? What are the risk areas? Where can you save money? ‘You should focus on the following four areas: innovation, strategy, and better and faster decisions,’ he says.

Sample applications

Priyadarshy cites the case where an operator regularly suffered stuck pipe during drilling operations and didn’t know how to predict when the problem would arise. ‘This is a classic problem which can take advantage of Big Data. It fits into the “better and faster” decision category because you want to predict whether there will be a stuck pipe (say) two hours or five days from now, so you can decide fast whether to change the pressure on the drill bit or bring in a different team.’

Big Data is also valuable for analysis of seismic data. Typically, sequence analysis of waveform data involves numerous calculations and takes days to get results. However, using a scale-out approach with Big Data, you can run multiple algorithms at the same time on a large cluster and get the results far faster, in order to select which well to drill.

Furthermore, Big Data is useful for determining non-productive time or ‘invisible loss’ in operations.

Priyadarshy considers that the oil and gas industry is ‘way behind other sectors’ in terms of leveraging Big Data Analytics.
‘The oil and gas sector lacks real-time insight into holistic operations.’ The scale of oil industry data is also a challenge, because sequential computation of hundreds of exploration attributes is involved. There are granularity differences throughout the lifecycle of an E&P programme, and a veritable data explosion.

The oil and gas sector faces complexity along multiple dimensions: global and national regulatory frameworks, ever expanding geographical and geophysical reach, volatile markets with fluctuating demand and complex compliance, and a diverse workforce with variable cost and operational efficiency. It also suffers a domino effect due to lack of collaboration and communication among third party vendors.

Areas of concern

Priyadarshy says there are four areas of concern. ‘There is lack of knowledge of Big Data across the board. Confusingly, people think Big Data is about dealing with large amounts of data. It isn’t. Big Data also encompasses small data, so I prefer to define it as “All Data”.’

Secondly, he suggests Big Data is not simply about business intelligence. ‘Big Data is a data science problem, to help a company find new patterns using data-driven models. For example, though several hundred papers have been published about stuck pipe, the problem still exists. Your domain expertise can only be as good as your openness to new ideas.’

Thirdly, you must have an open mind, which means looking for new patterns in the data, then asking the right questions. ‘This requires patience at the leadership level. It’s like a science experiment, because often you simply won’t know which data sets to connect at the beginning,’ he says.

Many consultants make the mistake of confusing complex Big Data problems with business intelligence. ‘Data scientists try to find new patterns by merging Big Data with domain experts.’

He believes Big Data is mired in stakeholders.
‘Everybody (numerous consultants) thinks they can solve the problem, but nobody looks at the actual requirement and asks: Why am I doing this? Why should I connect the data?’

Fourthly, there is an explosion of Big Data technology. ‘The technology boom is welcome but can also be confusing. Unless the organisation is agile, you cannot leverage the explosion of technology available to handle Big Data,’ he adds.

Finally, Priyadarshy emphasises the need for an open data culture. ‘Despite concerns about data security, the virtual concept of Big Data allows full governance with good accessibility for analytics, in a protected environment.

‘An open data culture is necessary to create value from your data asset. If your data is in silos or takes numerous forms, you will not be able to connect the data sets to create new patterns.’

The big question is: ‘What is my business strategy and where is the largest sunk cost? How can this be avoided to boost profitability and business efficiency?’

To sum up, Big Data is about getting value from all data by leveraging emerging technologies and pattern-based studies for innovation, strategy, faster decisions and better decisions to help the bottom line.