The Value of Timely Analytics
Figure: $ value of analytics plotted against time of interest (years, months, days, hrs, min, sec, present).
• Historical Trend Analysis
• Forecasting in Enterprises
• Web Analytics (ad placement), Financial Services, Smart Grids, Monitoring (systems mgmt), Health Care, Manufacturing, etc.

Current Products for Analytics
Figure: facts/sec (100 to 100,000) plotted against time of interest (years, months, days, hrs, min, sec, present). Labels: Traditional DW Analytics; Active DW analytics; custom-built solutions that carry huge development and customization costs; ET time in ETL; load time in ETL.
• Load barrier is dictated by current choices of the solution, e.g., loading into databases, persisting into files.
• This is intrinsic because in current approaches no processing can be done till the data is loaded.

Operational Intelligence Platform
Diagram stages: Sources, Data Bus, Caching, Processing, Distribution, Visualization.
• Sources: Devices, Sensors; Web servers; Stock tickers & News feeds; Reference Data
• Data bus and caching: Message Bus; Service Broker; Cache; In-memory Database; ETL
• Processing: Microsoft StreamInsight; Operational Analytics; Intra-Day Cubes; Historic Cubes
• Distribution: Refresh (Push); Re-compute (Pull)
• Visualization and outputs: Automated Decisions; Operational Dashboard (Ticking - Snapshot); Reporting Dashboard (Refreshed); Static Reports; Mining, Validation, "What-If" Scenarios

The Need for an Event-Driven Platform
Analytical results need to reflect important changes in business reality immediately and enable responses to them with minimal latency.

                   Database Applications              Event-driven Applications
Query Paradigm     Ad-hoc queries or requests         Continuous standing queries
Latency            Seconds, hours, days               Milliseconds or less
Data Rate          Hundreds of events/sec             Tens of thousands of events/sec or more
Query Semantics    Declarative relational analytics   Declarative relational and temporal analytics

Diagram labels: request, response (database application); event, input stream, output stream (event-driven application).

Scenarios for Event-Driven Applications
Figure: application classes plotted by latency (months, days, hours, minutes, seconds, 100 ms, < 1 ms) against aggregate data rate (0 to ~1 million events/sec), with the CEP target scenarios marked.
• Relational Database Applications
• Operational Analytics Applications, e.g., Logistics, etc.
• Data Warehousing Applications
• Web Analytics Applications
• Monitoring Applications
• Manufacturing Applications
• Financial trading Applications

Overview: Microsoft StreamInsight
Diagram: application development with .NET, C#, LINQ; at runtime, events flow from event sources through input adapters into the StreamInsight engine, where standing queries run, and out through output adapters to event targets.
• Event sources: Devices, Sensors; Web servers; Stock tickers & News feeds; Event stores & Databases
• Event targets: Pagers & Monitoring devices; KPI Dashboards, SharePoint UI; Trading stations; Event stores & Databases
• Static reference data (C_ID, C_NAME, C_ZIP)

Virtuous Cycle: Monitor, Manage, Mine
Cycle: Monitor KPIs; Record raw data (history); Manage business via KPI-triggered actions; Mine historical data; Devise new KPIs.
Industry trends
• Data acquisition costs are negligible
• Raw storage costs are small and continue to decrease
• Processing costs are non-negligible
• Data loading costs continue to be significant
CEP advantage
• Process data incrementally, i.e., while it is in flight
• Avoid loading while still doing the processing you want
• Seamless querying for monitoring, managing and mining

Example Scenarios
Manufacturing:
• Sensor on plant floor
• React through device controllers
• Aggregated data
• 10,000 events/sec
Web Analytics:
• Click-stream data
• Online customer behavior
• Page layout
• 100,000 events/sec
Financial Services:
• Stock & news feeds
• Algorithmic trading
• Patterns over time
• Super-low latency
• 100,000 events/sec
Power, Utilities:
• Energy consumption
• Outages
• Smart grids
• 100,000 events/sec
Applications: Visual trend-line and KPI monitoring; Batch & product management; Automated anomaly detection; Real-time customer segmentation; Algorithmic trading; Proactive condition-based maintenance.
Diagram labels: Asset Specs & Parameters; Data Stream; Store & Archive; Asset Instrumentation for Data Acquisition, Subscriptions to Data Feeds; Event Processing Engine; Lookup.
• Threshold queries
• Event correlation from multiple sources
• Pattern queries

Power Utilities Scenario: Smart grid
• Instrument households with smart power meters
• Continuous, up-to-date insight into your grid, including generation, distribution, and demand
StreamInsight advantage
• Scales to smart grids requirements
    • Scale to millions of meters
    • Hundreds of thousands of meter readings per second
• Write validation, editing, estimation (VEE) rules declaratively in LINQ
• Scale to the high data volumes expected in smart grids
• React in almost real-time to changing grid conditions to avoid power outages

Financial Services Scenario: Real-time Risk
• Continuous insight into market conditions and exposure
• Continuous low-latency market monitoring
• Manage risks across traders and per desk with aggregate and individual thresholds
StreamInsight advantage
• Implement risk monitoring declaratively in LINQ
• Detect and notify in near real-time on risk
• No change to models or LINQ code necessary for back-testing over historical data

Web Analytics Scenario: Real-time Behavioral Targeting
• Continuously analyze online behavior per user
• Identify relevant content before the next click
• Define content behind next click based on detected online behavior
StreamInsight advantage
• Scale to millions of concurrent online users
• Immediate insight - real time analytics
• Web logs no longer processed offline in batches
• Correlate across your web farms and applications

Retail (Online and Traditional) Scenario: Real-Time Coupon
• Provide most relevant/appealing coupon
• Maximize expected individual customer revenue
• Correlate current sales transaction with customer purchase history
StreamInsight advantage
• Track current market basket as a real-time stream
• Use StreamInsight lookup pattern to correlate current market basket with purchase history
• Easily scale to internet retail with millions of concurrent sessions

Event Types
• StreamInsight events use the .NET type system
• Events are structured and can have multiple fields
    • Fields are typed using the .NET framework types
• StreamInsight engine provisioned timestamp fields capture all the different temporal event characteristics
    • Event sources populate timestamp fields
• All calculations done based on "business time"
Example event layout: Timestamps/Metadata | Long pumpID | String Type | String Location | Double flow | Double pressure

Event Streams & Adapters
• A stream is a sequence of events
    • Defined over a .NET type
    • Possibly infinite
• Stream characteristics:
    • Event/data arrival patterns (steady, bursty)
    • Out of order events: order of arrival of events does not match the order of their application timestamps
• Adapters
    • Receive/get events from the data source
    • Enqueue events for processing in the engine
        • Insertions of new events
        • Changes to event durations

StreamInsight Query Features
Operators over streams
• Calculations (PROJECT)
• Correlation of streams from different data sources (JOIN)
• Check for absence of activity with a data source (EXISTS)
• Selection of events from streams (FILTER)
• Stream partitioning (GROUP & APPLY)
• Aggregation (SUM, COUNT, …)
• Ranking and heavy hitters (TOP-K)
• Temporal operations: hopping window, sliding window
• Extensibility: to add new domain-specific operators

LINQ Query Examples
LINQ Example: JOIN, PROJECT, FILTER

    from e1 in MyStream1
    join e2 in MyStream2 on e1.ID equals e2.ID    // Join
    where e1.f2 == "foo"                          // Filter
    select new { e1.f1, e2.f4 };                  // Project

LINQ Example: GROUP&APPLY, WINDOW

    from e3 in MyStream3
    group e3 by e3.i into SubStream                                   // Grouping
    from win in SubStream.HoppingWindow(FiveMinutes, ThreeSeconds)    // Window
    select new { i = SubStream.Key, a = win.Avg(e => e.f) };          // Project & Aggregate

Extensibility SDK
• Built-in operators do not cover all functionality
    • Need for domain-specific extensions
    • Integrate with functionality from existing libraries
• Support for extensions in the CEP platform:
    • User-defined operators, functions, aggregates
    • Code written in .NET, deployed as .NET assembly
    • Query operators and LINQ can refer to functionality of the assembly
• Temporal snap-shot operator framework
    • Interface to implement user-defined operators
    • Manages operator state and snapshot changes
    • Framework does the heavy lifting to deal with intricate temporal behavior such as out-of-order events

StreamInsight Deployment Alternatives
Event processing engines are deployed at multiple places on different scales:
• At the edge: close to the data source. StreamInsight CEP for lightweight processing and filtering.
• In the mid-tier: consolidate related data sources. StreamInsight CEP for aggregation and correlation of in-flight events.
• In the data center: historical archive, mining, large scale correlation. StreamInsight CEP for complex analytics including historical data.
Diagram: data sources (web servers, sensors, devices, feeds) feed StreamInsight instances at the edge, which feed StreamInsight instances for Aggregation & Correlation and then Complex Analytics & Mining.

StreamInsight Deployment
Lightweight embedded engine
• StreamInsight is available as a set of DLLs
• StreamInsight can be included into your applications
• Low footprint, small overhead
• Facilitates deployments close to the data source
StreamInsight Windows service
• Runs the engine as a Windows service
• Applications can share incoming streams
• Well-suited for more centralized deployments
Installation
• Small, lightweight MSI
• Installs in 2 minutes

SQL Server 2008 R2 Capabilities by Edition
Editions: Standard, Enterprise, Datacenter, Parallel Data Warehouse.
• Custom/Packaged OLTP Apps
    • Standard: 4 procs, 64GB RAM, Backup Compression
    • Enterprise: 8 procs, 2TB RAM, Adv. Security, Backup Compression
    • Datacenter: >8 procs, OS Max, Adv. Security, Backup Compression
    • Parallel Data Warehouse: N/A
• Server Consolidation
    • Standard: 1 VM/license
    • Enterprise: 4 VMs/license, Resource Governor, App & Multi-Server Mgmt (up to 25 instances)
    • Datacenter: Unlimited Virtualization, Resource Governor, App & Multi-Server Mgmt (> 25 instances)
    • Parallel Data Warehouse: N/A
• Data Warehousing
    • Enterprise: Scale-Up DW, Data Compression; 10s of TBs, up to 30 TB with FastTrack
    • Datacenter: Scale-Up DW, Data Compression; 10s of TBs
    • Parallel Data Warehouse: Scale-Out DW; 10s - 100s of TBs
• Business Intelligence
    • Standard: Dept/Team BI
    • Enterprise: Enterprise-Scale BI, Master Data Services, PowerPivot Mgmt
    • Datacenter: Enterprise-Scale BI, Master Data Services, PowerPivot Mgmt
    • Parallel Data Warehouse: Integrated with SSIS, SSAS and SSRS
• Complex Event Processing (StreamInsight)
    • Standard: < 5000 events/sec & > 5 s latency
    • Enterprise: < 5000 events/sec & > 5 s latency
    • Datacenter: > 5000 events/sec & < 5 s latency
    • Parallel Data Warehouse: Future coverage

StreamInsight Solutions
Figure: solution scenarios and partners (ISV, SI) by industry.
• Industries: Manufacturing, Utilities, Oil & Gas, Financial Services, Web Analytics, Telco
• Scenarios: Alarming, AMI/SmartGrid, Well Monitoring, Risk Management, Behavioral Targeting, CDR Aggregation, Notifications, Outage Management, Operational Intelligence, Market Monitoring, Load Monitoring, Real-Time Analysis
• Partners and adopters named: OSIsoft, Matrikon, Telvent, ICONICS, Lab49, Hitachi Consulting, MSFT AdCenter, XBox, DPE

Recap: Microsoft StreamInsight
• CEP platform from Microsoft to build event-driven applications
• Development experience with .NET, C#, LINQ and Visual Studio 2008
• Event-driven applications are fundamentally different from traditional database applications: queries are continuous, consume and produce streams, and compute results incrementally
• Flexible adapter SDK with high performance to connect to different event sources and sinks
• The CEP platform does the heavy lifting for you to deal with temporal characteristics of event stream data
Diagram labels: CEP Application Development; Event sources; Input Adapters; CEP Engine; Standing Queries; Output Adapters; Event targets; Static reference data (C_ID, C_NAME, C_ZIP).

For More Information
• StreamInsight main page & download: http://www.microsoft.com/sqlserver/2008/en/us/R2-complexevent.aspx
• StreamInsight blog: http://blogs.msdn.com/streaminsight/
• StreamInsight MSDN documentation: http://msdn.microsoft.com/en-us/library/ee362541(SQL.105).aspx
• StreamInsight E-clinics on Microsoft e-learning: https://www.microsoftelearning.com/eLearning
• ASI07-INT | Real-Time Event Integration with Microsoft SQL Server 2008 R2 StreamInsight and Microsoft BizTalk Server
• BIE202 | Data Integration at Microsoft: Technologies and Solution Patterns
• BIP302 | Enabling Real-time Business Insight, Analytics and Reporting
• DAT23-HOL | Querying Events in Microsoft SQL Server 2008 R2 StreamInsight Using LINQ
• DAT20-HOL | Working with the Microsoft SQL Server 2008 R2 StreamInsight Event Flow Debugger
• www.microsoft.com/teched, www.microsoft.com/learning, http://microsoft.com/technet, http://microsoft.com/msdn
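
Supplementary sketch: hopping-window GROUP & APPLY
The GROUP&APPLY, WINDOW example under LINQ Query Examples averages a field per group over a hopping window. The sketch below imitates that idea with plain LINQ to Objects over a small in-memory list; it does not use the StreamInsight API or its temporal semantics. The event tuples, the `WindowStarts` helper, and the 10-second window with a 5-second hop (standing in for `FiveMinutes` and `ThreeSeconds`) are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical point events: (grouping key i, timestamp in seconds, payload f).
var events = new List<(int I, long Ts, double F)>
{
    (1, 0, 10.0), (1, 4, 20.0), (2, 6, 5.0), (1, 7, 30.0), (2, 12, 15.0),
};

const long WindowSize = 10; // seconds; assumed stand-in for FiveMinutes
const long HopSize = 5;     // seconds; assumed stand-in for ThreeSeconds

// Start times of every window [start, start + WindowSize) that contains ts.
// Windows start at multiples of HopSize; timestamps are assumed non-negative.
IEnumerable<long> WindowStarts(long ts)
{
    long last = ts / HopSize * HopSize; // latest window starting at or before ts
    for (long s = last; s > ts - WindowSize && s >= 0; s -= HopSize)
        yield return s;
}

var averages =
    (from e in events
     from start in WindowStarts(e.Ts)          // one copy of the event per covering window
     group e.F by (e.I, Start: start) into g   // GROUP: per key and per window
     orderby g.Key.I, g.Key.Start
     select (I: g.Key.I, WindowStart: g.Key.Start, Avg: g.Average())) // APPLY: aggregate
    .ToList();

foreach (var r in averages)
    Console.WriteLine($"i={r.I} window=[{r.WindowStart},{r.WindowStart + WindowSize}) avg={r.Avg}");
```

Because the hop is shorter than the window, consecutive windows overlap and an event can contribute to more than one average. A streaming engine would compute these results incrementally as events arrive; this batch version only shows which events land in which window.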