Introduction to SQL Server 2005 Business Intelligence Features
Min-seok Kang (강민석), Assistant Manager
Customer Support Department, Microsoft Korea

About the Speaker
• Microsoft Korea, Technical Support Department
  – RRE (Rapid Response Engineer)
• SQL Server
• Analysis Services
• MCDBA

Technical Scope
• Introduction to SQL Server 2005 Business Intelligence features
  – Tools for SQL Server BI
  – SQL Server 2005 Analysis Services
    • Comparison with SQL Server 2000 Analysis Services
  – SSIS (SQL Server Integration Services)
  – Reporting Services (Report Builder)

Prerequisite Knowledge
• SQL Server 2000 (Level 100)
• SQL Server 2000 Analysis Services (Level 300)
• SQL Server 2000 DTS (Level 200)
• SQL Server 2000 Reporting Services (Level 200)
Level 200

Agenda
• SQL Server 2005 BI Components
• Analysis Services
• SSIS (SQL Server Integration Services)
• Reporting Services & Report Builder
• Q&A

SQL Server 2005 BI Components
• SQL Server 2000 – BI Stack
• SQL Server 2005 – BI Stack

SQL Server 2000 – BI Stack
[Diagram: Reporting Services, Analysis Services (OLAP & DM), DTS (ETL), SQL Server (RDBMS); tools: Analysis Manager, Enterprise Manager]
• Reporting Services
  – Added to the SQL Server 2000 product in early 2004
  – Provides its user interface through Web Services
• Analysis Services (OLAP & DM)
  – The OLAP service of SQL Server 2000
  – Provides data mining algorithms
    • Microsoft Decision Trees
    • Microsoft Clustering
• DTS (ETL)
  – Performs data extraction, transformation, and integration
  – A graphical tool and a set of programmable objects

SQL Server 2005 – BI Stack
[Diagram: Reporting Services (Reporting), Analysis Services (OLAP & DM), Integration Services (ETL), SQL Server (RDBMS); tools: BI Development Studio, SQL Server Management Studio, Analysis Manager, Enterprise Manager]
• New: Single, integrated SQL Server Management Studio
• New: Single, integrated Business Intelligence Development Studio
• New: Reporting tool, Report Builder

SQL Server Management Studio
• Integrated management
  – RDBMS, OLAP, DM, IS, Reporting
• DBAs move "upstream"
  – Everything is scriptable
  – Enhanced scheduling
  – Auto-maintenance
• Enhanced Query Analyzer
  – IntelliSense
    • Not implemented for the T-SQL editor
  – Color coding
  – MDX and DMX support
• Enhanced Profiler
  – Analysis Services and DTS support
• New managed APIs

BI Development Studio
• The first integrated BI IDE
  – RDBMS, OLAP, DM, ETL, Reports, Code, HTML…
• Enterprise development
  – Source control, versioning
  – Developer & resource isolation
  – Deployment
• Major productivity & usability enhancements
  – DW Generator
  – Intellicube Wizard
  – Data Source Views
  – More…

SQL Server in the BI Stack
• Data warehouse enhancements
  – Partitioned tables
  – Online index reorganization
  – NUMA support
  – Bulk load improvements
  – User-defined aggregations
• XML Web Services
  – Native XML data type
  – XQuery support
  – Sprocs as Web Services
• Programmability
  – CLR-language sprocs
• Real-time log shipping
• Security and Abilities

Integration Services in the BI Stack
• The new name of DTS
• "Big ETL"
  – Highest-end scale and throughput
  – Rich connectivity
• Advanced process flow
  – Branches, loops…
• New: Data Pipeline
  – Streamed transformations
• Programmability
  – CLR-based transforms
  – Rich debugging
• Major productivity & usability enhancements

Analysis Services in the BI Stack
• Unified Dimensional Model
  – OLAP unified with relational reporting
  – Real-time BI
  – Advanced intelligence with KPIs, MDX Scripts, Translations, Currencies…
• Web Services
  – Native XML/A support
• Scalability and Other Abilities
  – Unlimited dimensions,…
• High-end data mining
  – x4 DM algorithms
• Clustering support
  – Multiple instances supported

Reporting Services in the BI Stack
• Rich interactive reporting
  – Chart, pivot, slice, drill
  – Native OLAP support
  – Flexible forms & formats
  – Reports as data source
• Enterprise management
  – Catalogs, Schedules, Snapshots, Cache, Security, Clusters…
• Reporting platform
  – Web Services API
  – Rich extensibility
  – Developer tools

Demo
• SQL Server 2005 BI Components
  – SQL Server Management Studio
  – SQL Server BI Development Studio
  – Reporting Services
  – SQL Server Integration Services
  – Q&A

Agenda
• SQL Server 2005 BI Components
• Analysis Services
• SSIS (SQL Server Integration Services)
• Reporting Services & Report Builder
• Q&A

Analysis Services
• SQL Server 2005 Analysis Services Overview
• Demo: Cube creation compared
  – Cube creation in SQL Server 2000
  – Cube creation in SQL Server 2005

Analysis Services Overview
• Analysis Services (OLAP)
• Programmability Preview
• Deployment
• Measure Groups and Time
• Alternate Aggregations
• AMO Development

Analysis Services (OLAP)
• Database Data Access
• Spreadsheet Data Access
• "Database" OLAP
• "Spreadsheet" OLAP
• Analysis Services OLAP

Database Data Access
• SELECT Sum(«Measure») FROM «Source» WHERE «Slicers» GROUP BY «Dicers»
• Can potentially become slow

Spreadsheet Data Access
• Cell reference =C5
• Retrieves data by position

     A            B        C        D        E        F
1                 Quarter
2    Product      Qtr 1    Qtr 2    Qtr 3    Qtr 4    Grand Total
3    Apples       500      3000     2400     3900     9800
4    Cherries     700      2200     3800     4600     11300
5    Grapes       1600     2800     3800     4800     13000
6    Melons       1000     2900     3800     5900     13600
7    Grand Total  3800     10900    13800    19200    47700

[Chart: axes Formulas (Simple to Complex) and Data (Small to Large); labels: Spreadsheet, Database, Customer needs]

"Database" OLAP
[Chart: same axes; labels: Spreadsheet, OLAP, Database]

"Spreadsheet" OLAP
[Chart: same axes; labels: Spreadsheet, OLAP]

Analysis Services OLAP
[Chart: same axes; labels: Spreadsheet, OLAP, Database]

Programmability Preview
• Analysis Services Scripting Language (ASSL)
• Analysis Management Objects (AMO)

Analysis Services Scripting Language (ASSL)
• Defines the structure of Analysis Services objects
  – Cubes, dimensions, data mining models
• Binds Analysis Services objects to data sources
• The language used to create, alter, and deploy AS objects
• Language is XML/A
  – XML for Analysis
  – A SOAP-based language for sending queries and commands in a multidimensional server environment
  – More information is available at www.xmla.org

Using ASSL
• Developer
  – Designs a set of cubes with the BI Development Studio design tools
  – Saves them as part of a project
  – Can also open the files that define a cube directly in an XML editor
• Administrator
  – Uses the XML editor directly in SQL Server Management Studio to create and alter Analysis Services objects

Analysis Management Objects (AMO)
• A .NET API for generating ASSL commands
• Can manage Analysis Services objects, including server, database, data source view, cube, dimension, mining model, and role
• Scripts are generated in ASSL, not in AMO
• Replacement for AS2000 Decision Support Objects (DSO)
  – DSO remains available for backward compatibility

Deployment
• Translations
• Unknown Member
• Build, Deploy, Process
• Deployment Utility

Translations
• Translate member captions
  – Need column in relational dimension table
  – Add as Caption Column in Translation section
• Localize heading (e.g., attribute name) NEW
  – Add as Constant in Translation section
• Localize constant members (e.g., All) NEW
  – Add as Constant in Translation section

Unknown Member
• For when an explicit attribute key is not found
  – Prime example is the fact table
  – Also works within a dimension
• Cautions
  – One Unknown Member for the whole dimension (not per attribute)
  – Do not capture explicit NULL values
    • If an attribute is built from a "star" table, the attribute hierarchy will have a NULL member
  – e.g., a NULL in the Gender column of the Customer dimension

Build, Deploy, Process
• Save (from BI Development Studio)
  – Source definition files stored as XML
• Build (from BI Development Studio)
  – Assembles project into ASDatabase XML file
• Deploy
  – Wraps XML with Create command and sends to server
  – Use Deployment Wizard to configure script
• Process
  – Load actual dimensions and partitions with data
  – BIDS automatically processes on deploy

Deployment Utility
• Start with ASDatabase definition file
• Supply configuration information
  – Server name, target database name
  – Replacement mode for partitions and roles
  – Locations for error logs and data files
  – Processing options (full, default, or none)
• Optionally creates XMLA deployment script

Measure Groups and Time
• Measure Groups
• Cube
• Dimension Usage
• Perspectives
• Multiple Calendars

Measure Groups
• Combine fact tables of different grain
• Similar to cube in AS2K
• One measure group per (logical) fact table
• Map grain of dimension to measure group

Cube
• Assembly of measure groups
• Encourages combining fact tables of different grain
  – Similar to AS2K virtual cube (integrates data from different fact tables)
• Logical relationships of all relevant data
  – Unified Dimension Model (UDM)
  – Physical storage may be OLAP or relational

Dimension Usage
• Same dimension appears in two fact tables
  – Sales: Date.Day
  – Forecast: Date.Month
• Dimension Usage allows linkage
  – Dimension not used at all
  – Dimension used at leaf grain
  – Dimension used at higher grain

Perspectives
• A complete cube may be overwhelming
• A perspective is a simplified version of a cube
  – Select measures, hierarchies to include
  – Similar to AS2K virtual cube (eliminates dimensions or measures from a cube)
• Completely logical layer, no physical storage

Multiple Calendars
• Alternate hierarchies
  – Within single dimension
  – Calendar, Fiscal, Weeks
• Role-playing dimensions
  – Separate fact table foreign keys
  – Order Date, Ship Date, Due Date
  – Each reuses all hierarchies of date dimension

Calculated Members
• Average Price
  – [Measures].[Sales Amount] / [Measures].[Order Quantity]
  – Calculates for current member of all other dimensions

KPIs
• Key Performance Indicators
• Calculate Value, Goal, Status, Trend, Weight, Gauge
• Accessible from client application

Demo
• Comparing SQL Server 2000 and SQL Server 2005
  – SQL Server 2000 Analysis Services
  – SQL Server 2005 Analysis Services

Agenda
• SQL Server 2005 BI Components
• Analysis Services
• SSIS (SQL Server Integration Services)
• Reporting Services & Report Builder
• Q&A

What is SSIS?
• SQL Server Integration Services
• A new Microsoft SQL Server Business Intelligence application
• The successor to Data Transformation Services
• The platform for a new generation of high-performance data integration technologies

Example: before Integration Services
[Diagram: call centre data (semi-structured), legacy data (binary files), application database, and mobile data pass through hand coding, text mining, data mining, cleansing, and several separate ETL and staging steps before reaching the warehouse, reports, and alerts & escalation]
• Integration and warehousing require separate, staged operations.
• Preparation of data requires different, often incompatible, tools.
• Reporting and escalation is a slow process, delaying smart responses.
• Heavy data volumes make this scenario increasingly unworkable.

Example: with Integration Services
[Diagram: call centre semi-structured data, legacy binary files, the application database, and mobile data enter SQL Server Integration Services through standard and custom sources; text mining, data mining, and data cleansing components and merges run inside it; outputs are the warehouse, reports, and alerts & escalation]
• Integration and warehousing are a seamless, manageable operation.
• Source, prepare, and load data in a single, auditable process.
• Reporting and escalation can be parallelized with the warehouse load.
• Scales to handle heavy and complex data requirements.

How does it work?
[Diagram: Control Flow tasks (FTP, Send Mail, Loop, Execute SQL, Data Flow); Data Flow components (Flat File Source, Oracle, ADO.NET Source, Merge, De-duplicate, Split, SQL Server, Flat File)]
• Control flow enables the user to define a complex workflow of tasks.
• This control flow can include many different kinds of tasks, arranged in loops and sequences and related by constraints.
• Data flow is a special task with its own object model. It is used for moving data.
• Data from multiple, diverse, heterogeneous sources can be merged into a single flow, even from varied sources.
• As it flows, data can be cleansed, split, and partitioned.
• Just as there can be multiple sources, data can be loaded into multiple destinations.
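
The data flow on the "How does it work?" slide (several sources merged into one stream, de-duplicated, then split toward more than one destination) can be sketched in plain Python. This is only an illustration of the streaming idea, not SSIS code: the function names (`merge_sources`, `deduplicate`, `conditional_split`), the sample rows, and the `region` routing rule are all assumptions made up for this example.

```python
from itertools import chain
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, str]


def merge_sources(*sources: Iterable[Row]) -> Iterator[Row]:
    """Combine several row sources into a single stream."""
    return chain(*sources)


def deduplicate(rows: Iterable[Row], key: str) -> Iterator[Row]:
    """Cleansing step: keep only the first row seen for each key value."""
    seen = set()
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            yield row


def conditional_split(
    rows: Iterable[Row], predicate: Callable[[Row], bool]
) -> Tuple[List[Row], List[Row]]:
    """Route each row to one of two destinations based on a predicate."""
    matched: List[Row] = []
    unmatched: List[Row] = []
    for row in rows:
        (matched if predicate(row) else unmatched).append(row)
    return matched, unmatched


# Hypothetical rows standing in for a flat-file source and a database source.
flat_file = [{"id": "1", "region": "KR"}, {"id": "2", "region": "US"}]
database = [{"id": "2", "region": "US"}, {"id": "3", "region": "KR"}]

# Rows stream through merge and de-duplication before being split.
stream = deduplicate(merge_sources(flat_file, database), key="id")
korea_rows, other_rows = conditional_split(stream, lambda r: r["region"] == "KR")

print([r["id"] for r in korea_rows])  # ['1', '3']
print([r["id"] for r in other_rows])  # ['2']
```

The generators pass rows along one at a time, which loosely mirrors the "streamed transformations" idea: no intermediate staging table is needed between the merge and the cleansing step.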

Architecture
[Diagram labels: Standard transforms, Custom transforms, Data Source Adapters, Data Destination Adapters, Package, Package XML, Loops & Sequences, Tasks, Event Handlers, Wizards, DTS Designer, Command Line]

Life Cycle Tools
• Design
  – Business Intelligence Designer
  – Migration wizard for pre-SQL 2005 packages
  – Visual SourceSafe integration
• Deployment
  – Configuration Wizard: flexible package configuration
  – Deployment Utility: install packages
  – SQL Agent: schedule package execution
  – Command Line Utility: execute packages
• Supportability
  – SSIS service to monitor running packages and stored packages
  – Rich logging
  – Checkpoint: restart ability
  – WMI integration

Integration Services Overview
• Cleanse Data Input
• Split an Output Channel
• Script to Branch Control
• Lookup Fuzzy Values
• Loop through Folder
• Configure and Deploy

Cleanse Data Input
• Connection Manager
• Flow Types
• Data Adapter
• Metadata Validation
• Data Flow Components

Connection Manager
• Package object
• Layer between environment and outside world
  – Allows for indirection, configuration
  – Manages opening, closing, sharing
• Important for:
  – Isolation, portability
  – Metadata
• Used for both tasks and transformations

Flow Types
• Two different types of flow
  – Control Flow = Runtime = Tasks
  – Data Flow = Pipeline = Transforms
• Managed in Designer
  – Used to be a single view in SQL 2000
  – Separate views in SQL 2005
• Control Flow handles tasks and precedence
• Data Flow handles transformations (the zoomed-in contents of a Data Flow task)

Data Adapter
• Data Flow object
• Logical use of data
  – Data Flow Source
  – Data Flow Destination
• Links to Connection Manager

Metadata Validation
• Confirm and link columns in source and destination
  – Define source first, then flow data to destination. Destination responds.
  – Changing the column definition invalidates the "contract" with the destination
• Convenient if source exists and is stable
  – Disable to build/manipulate files in progress

Data Flow Components
• Components
  – Source
  – Transform
  – Destination
• Paths
  – Data route between one component and the next
  – Includes metadata about columns moving around
  – Lineage Identifier tracks item transformations
• Pipeline
  – Components connected by a path

Split an Output Channel
• Distributors and Collectors
• Precedence Constraints

Distributors and Collectors
• Distributor transformations
  – Multicast
  – Conditional Split
• Collector transformations
  – Union All
  – Merge
  – Merge Join
[Diagram labels: Src, D, Tfm (three), C, Dst]

Precedence Constraints
• Connect one task to another
• Give sequential relationship to tasks
  – Success/Failure/Completion workflow
  – Establish concurrency
• A connected sequence of tasks is a task list
  – Independent task lists execute concurrently
  – Tasks within a list execute sequentially

Script to Branch Control
• Package Variables
• Script Task
• Complex Precedence
• Containers

Package Variables
• Scope
  – Each container can have variables
  – Define namespace for user variables
  – Containers can access variables from higher levels
• Accessible from
  – Expressions (such as loops and constraints): use @
  – Parameters in Execute SQL task
  – Parent Package (as part of configuration)
  – Script

Script Task
• Currently VB .NET
• Can read or modify properties throughout the package
• Can't access inner workings of tasks or transforms
• Can't modify pipeline metadata (e.g., number of columns piped)

Complex Precedence
• Expressions add control
  – Look at variables
  – Simulate If or Case logic
• And/Or linkage of multiple constraints
• Disabled task = Success

Containers
• Container provides
  – Grouping of task lists (list of one is allowed)
  – Transaction scope
  – Variable scope
• A package is a container
  – Add your own tasks
  – Insert your own containers
  – Loops are containers, too

Lookup Fuzzy Values
• Fuzzy Lookup Transformation
• Data Visualizers
• Breakpoints
• Exception Handling

Fuzzy Lookup Transformation
• Proximity algorithm to find matches
• Builds index
  – Index can persist
• Creates metrics
  – Similarity
  – Confidence
• Uses a separate connection for reference table

Data Visualizers
• Add to pipeline
• Pause pipeline flow to inspect contents
• Function as breakpoints (can disable)
• Different degrees of summary
  – Grid: for low-volume detail
  – Graphs: for high-volume overview

Breakpoints
• Stop execution during package
  – Stop control flow before and after
  – Stop transformation during (visualizer)
• Fire selectively

Exception Handling
• Event Handlers
  – Special container runs in response to event
• Log Providers
  – Tasks (and the runtime engine) raise log events
  – Log events can be saved: Profiler, console, table, or file
• Transactions
  – Containers, Packages, Multiple-Packages
• Checkpoints

Loop through Folder
• Control Flow
• Multiple Sources
• Loops

Control Flow
• Loops are container tasks
• Different from conventional programming
  – DTS container task handles loop
  – Package variable handles enumerator
  – Can't write a loop in C#
• "No programming required"

Multiple Sources
• Typical examples
  – Folder full of files
  – Multiple servers with identical tables
  – Multiple partitions of a table
  – Multiple asynchronous data feeds
• Solution
  – For or For Each Loop
  – MULTIFILE data connection

Loops
FOR LOOP
• Loops while expression is TRUE
• Manually add loop counter
  – Init: @N = 1
  – Eval: @N <= 25
  – Increment: @N = @N + 1
• Execute tasks in container on each iteration
• More control, and more complexity, than For Each
FOR EACH LOOP
• Loops over set of objects
  – Files
  – XML nodes
  – Database objects
• Set variable (e.g., file name) for each iteration
• Execute tasks in container on each iteration

Configure and Deploy
• Configurations
• XML Customizability
• Deployment
• Execution

Configurations
• Take something from the system
  – Environment variables, registry, XML option file
  – Apply it to some part of your package
  – Run the package with the new setting
• Useful for multiple "similar" jobs
  – Similar to Dynamic Properties from SQL 2000
  – Facilitate reusability, different environments
• Configurable at runtime or during execution

XML Customizability
• Don't modify DTS XML definition directly
  – Schema not published; can change
• Create configuration file (can be XML)
  – Map virtually any package object to a new value
• Not able to
  – Configure collection items
  – Change package structure

Deployment
• Same server deployment
  – Deploy from BI Development Studio
• Multiple server deployment
  – Deployment Utility
  – Package configuration files with package
  – Deploy to SQL Server (msdb) or file (dtsx)

Execution
• Command-line execution
  – DTEXEC
• User interface execution
  – DTEXECUI
  – Can generate command line for DTEXEC
• Scheduling
  – SQL Server Agent

Customer Benefits of SSIS
• Performance
  – Data flows process large volumes of data efficiently, even through complex operations
• Facility
  – Many pre-built adapters and transformations reduce hand coding
  – Extensible object model enables specialized custom or scripted components
  – Highly productive visual environment speeds development and debugging
• "Smarts"
  – Data cleansing features enable difficult data to be handled during loading
  – Data mining brings intelligent handling of data for imputation of incomplete data, conditional processing of potential problems, or smart escalation of issues such as fraud detection

Demo
• Implementing Integration Services

Reporting Services
• Reporting Services Release Roadmap
• Analysis Services Support

Reporting Services Release Roadmap
• SQL Server 2000 Reporting Services SP1
  – Bug fixes, scalability, Excel 2000 support
• SQL Server 2005 Beta 2
  – Parity with SQL2K SP1 version
  – Cross SQL Server (AS, Management) integration
• SQL Server 2000 Reporting Services SP2
  – Web Parts, client printing
• SQL Server 2005 Beta 3
  – New features, 64-bit support
• SQL Server 2005 RTM

Analysis Services Support
• MDX and data mining query builders
• MDX parameter support
• Member extended properties
• Support for server aggregates

Management Studio Integration
• Single point of management for all SQL Server components
• Superset of Report Manager functions
• Script generation from property dialogs

SharePoint Web Parts
• Report Explorer provides browsing of server namespace and subscription
• Report Viewer used to view reports
• Parts can be connected or used standalone
• Works in both SPS and WSS

Visual Studio Integration
• Report design completely integrated with Visual Studio language projects
• Natural extension of VS data functionality
• Included in VS Pro and above

Report Builder
• A new ad-hoc report design tool for Reporting Services
• Targeted at business users who want to find and share answers to interesting questions
• Driven from a business model of the data, so users do not need to understand the underlying data structures
• Fully integrated with Reporting Services and delivered in SQL Server 2005
• Complements the Visual Studio Report Designer
• Not designed to be a full analytical client or a replacement for PivotTables

Report Builder Client
• Built on top of familiar Microsoft Office paradigms (examples: Microsoft® Excel and PowerPoint®)
• Reports built via report templates (table, matrix, chart)
• "Click once" WinForms application deployed from the Report Server
• Users can create new reports or modify existing reports
• Finished reports can be published to server

References
• www.microsoft.com/sql
• http://www.microsoft.com/sql/2005/techinfo/default.asp
• SQL Server 2005 Business Intelligence – Hitachi Consulting