Operational-Log Analysis for Big Data Systems: Challenges and Solutions

ABSTRACT

Big data systems (BDSs) are complex, consisting of multiple interacting hardware and software components, such as distributed computing nodes, databases, and middleware. Any of these components can fail, and finding a failure's root cause is extremely laborious. Analysis of BDS-generated logs can speed up this process. The logs can also help improve testing processes, detect security breaches, customize operational profiles, and aid any other task that requires runtime-data analysis. However, practical challenges hamper the adoption of log analysis tools. The logs emitted by a BDS can be thought of as big data themselves. When working with large logs, practitioners face seven main issues: scarce storage, unscalable log analysis, inaccurate capture and replay of logs, inadequate log-processing tools, incorrect log classification, a variety of log formats, and inadequate privacy of sensitive data. Some practical solutions exist, but serious challenges remain. This article is part of a special issue on Software Engineering for Big Data Systems.

EXISTING SYSTEM

BDSs designed to process big data usually emit big data themselves, captured in logs. Not all BDSs generate large volumes of logs, and small systems might also generate big data. However, most BDS-emitted logs exhibit at least one big data characteristic. To leverage log data, developers need ways to effectively deliver, store, and crunch large volumes of data. Each of these processes poses challenges, as encountered when analyzing large logs for industrial projects at IBM and Ericsson.

Disadvantages of Existing System:
1. Scarce storage
2. Unscalable log analysis
3. Inaccurate capture and replay of logs
4. Inadequate log-processing tools
5. Incorrect log classification
6. A variety of log formats
7. Inadequate privacy of sensitive data
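
Two of the issues listed above, the variety of log formats and the inadequate privacy of sensitive data, can be illustrated with a small Java sketch. It is not taken from the article: the two input formats, the class name LogNormalizer, the "timestamp|LEVEL|message" output layout, and the choice to treat e-mail addresses as the sensitive data are all assumptions made for illustration only.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: maps two hypothetical log formats onto one layout. */
final class LogNormalizer {

    // Hypothetical format A: "2015-03-01 12:00:05 ERROR disk full"
    private static final Pattern SPACE_FORMAT = Pattern.compile(
            "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}) (\\w+) (.*)$");

    // Hypothetical format B: "[warn] 2015-03-01T12:00:07 | some message"
    private static final Pattern BRACKET_FORMAT = Pattern.compile(
            "^\\[(\\w+)\\] (\\d{4}-\\d{2}-\\d{2})T(\\d{2}:\\d{2}:\\d{2}) \\| (.*)$");

    // Assumption: e-mail addresses stand in for "sensitive data" here.
    private static final Pattern EMAIL = Pattern.compile(
            "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    private LogNormalizer() { }

    /** Replaces every e-mail address in the message with a fixed placeholder. */
    static String maskEmails(String message) {
        return EMAIL.matcher(message).replaceAll("<redacted>");
    }

    /**
     * Returns "timestamp|LEVEL|message" with e-mail addresses masked,
     * or an empty Optional if the line matches neither known format.
     */
    static Optional<String> normalize(String line) {
        Matcher a = SPACE_FORMAT.matcher(line);
        if (a.matches()) {
            return Optional.of(a.group(1) + "|" + a.group(2).toUpperCase()
                    + "|" + maskEmails(a.group(3)));
        }
        Matcher b = BRACKET_FORMAT.matcher(line);
        if (b.matches()) {
            return Optional.of(b.group(2) + " " + b.group(3) + "|"
                    + b.group(1).toUpperCase() + "|" + maskEmails(b.group(4)));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String[] lines = {
            "2015-03-01 12:00:05 ERROR disk full",
            "[warn] 2015-03-01T12:00:07 | user=alice@example.com login failed",
            "a line in an unknown format"
        };
        for (String line : lines) {
            // Unrecognized lines are reported rather than silently dropped.
            System.out.println(normalize(line).orElse("UNPARSED: " + line));
        }
    }
}
```

Returning an empty Optional for unrecognized lines keeps unparsed input visible to the caller. A production tool would need many more formats and a broader definition of sensitive data than this sketch assumes.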
PROPOSED SYSTEM

In the proposed system, to pinpoint a problem's root cause, analysts examine operational data (logs and traces) generated by the BDS components. A log or trace is a sequence of temporal events captured during a particular execution of a system. For example, a log can contain software execution paths, events triggered during software execution, or user activities. No clear distinction exists between logs and traces. Often, the term "log" represents how a program is used (such as security logs), whereas "tracing" captures the elements of a program that are invoked in a given execution of the system. Tracing is used for debugging and program understanding. In this article, we primarily use the term "log."

Advantages of Proposed System:
1. We provide solutions for some of these challenges.

SYSTEM REQUIREMENTS

Hardware Requirements:
Processor - Pentium IV
Speed - 1.1 GHz
RAM - 256 MB
Hard Disk - 20 GB
Keyboard - Standard Windows Keyboard
Mouse - Two- or Three-Button Mouse
Monitor - SVGA

Software Requirements:
Operating System : Windows XP
Coding Language : Java
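
The definition given under PROPOSED SYSTEM, a log as a sequence of temporal events captured during one execution of a system, can be sketched in Java, the coding language listed above. This is a minimal illustration, not the article's design: the class name ExecutionLog, its fields, and the use of ISO-8601 timestamps are assumptions. Events may arrive out of order from distributed components, so the sketch sorts them by time before deriving an execution path.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: a log modeled as a time-ordered sequence of events. */
final class ExecutionLog {

    /** One temporal event: when it happened, where, and what happened. */
    static final class Event {
        final Instant time;
        final String component;
        final String description;

        Event(Instant time, String component, String description) {
            this.time = time;
            this.component = component;
            this.description = description;
        }
    }

    private final List<Event> events = new ArrayList<>();

    /** Records an event; the timestamp is an ISO-8601 instant such as "2015-03-01T12:00:05Z". */
    void record(String isoTimestamp, String component, String description) {
        events.add(new Event(Instant.parse(isoTimestamp), component, description));
    }

    /** Returns a copy of the events sorted by timestamp, earliest first. */
    List<Event> inTemporalOrder() {
        List<Event> sorted = new ArrayList<>(events);
        sorted.sort(Comparator.comparing((Event e) -> e.time));
        return sorted;
    }

    /** Returns the component names in the order their events occurred. */
    List<String> executionPath() {
        List<String> path = new ArrayList<>();
        for (Event e : inTemporalOrder()) {
            path.add(e.component);
        }
        return path;
    }

    public static void main(String[] args) {
        ExecutionLog log = new ExecutionLog();
        // Events are recorded in arrival order, which may differ from time order.
        log.record("2015-03-01T12:00:07Z", "database", "query failed");
        log.record("2015-03-01T12:00:05Z", "middleware", "request received");
        log.record("2015-03-01T12:00:06Z", "compute-node", "task started");
        System.out.println(log.executionPath());
    }
}
```

An analyst looking for a root cause would typically walk such an ordered path backward from the failing event. Real BDS logs would also need clock-skew handling across nodes, which this sketch ignores.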