SQL Server 2005
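Introduction to Business Intelligence Features
Kang Min-seok (강민석), Assistant Manager
Customer Support Department
Microsoft Korea
About the Speaker
• Microsoft Korea, Technical Support Department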
– RRE (Rapid Response Engineer)
• SQL Server
• Analysis Services
• MCDBA
Technical scope covered:
• Introduction to SQL Server 2005 Business Intelligence features
– Tools for SQL Server BI
– SQL Server 2005 Analysis Services
• Comparison with SQL Server 2000 Analysis Services
– SSIS (SQL Server Integration Services)
– Reporting Services (Report Builder)
Prerequisite knowledge for this topic
• SQL Server 2000 (Level 100)
• SQL Server 2000 Analysis Services (Level 300)
• SQL Server 2000 DTS (Level 200)
• SQL Server 2K Reporting Services (Level 200)
Level 200
Agenda
• SQL Server 2005 BI components
• Analysis Services
• SSIS (SQL Server Integration Services)
• Reporting Services & Report Builder
• Q&A
SQL Server 2005 BI Components
• SQL Server 2000 – BI Stack
• SQL Server 2005 – BI Stack
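SQL Server 2000 – BI Stack
[Diagram: Reporting Services; Analysis Manager over Analysis Services (OLAP & DM); Enterprise Manager over DTS (ETL) and SQL Server (RDBMS)]
• Reporting Services
– Added to the SQL Server 2K product in early 2004
– Provides its user interface through Web Services
• Analysis Services (OLAP & DM)
– The OLAP service of SQL Server 2K
– Provides data mining algorithms: Microsoft Decision Trees, Microsoft Clustering
• DTS (ETL)
– Extracts, transforms, and consolidates data
– Graphical tools plus a set of programmable objects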
SQL Server 2005 – BI Stack
[Diagram: BI Development Studio and SQL Server Management Studio (replacing Analysis Manager and Enterprise Manager) over Reporting Services (Reporting), Analysis Services (OLAP & DM), Integration Services (ETL), and SQL Server (RDBMS)]
• New: single, integrated SQL Server Management Studio
• New: single, integrated Business Intelligence Development Studio
• New: reporting tool, Report Builder
SQL Server Management Studio
• Integrated management
– RDBMS, OLAP, DM, IS, Reporting
• DBAs move “upstream”
– Everything is scriptable
– Enhanced scheduling
– Auto-maintenance
• Enhanced Query Analyzer
– IntelliSense
• Not implemented for the T-SQL editor
– Color coding
– MDX and DMX support
• Enhanced Profiler
– Analysis Services, DTS support
• New managed APIs
BI Development Studio
• The first integrated BI IDE
– RDBMS, OLAP, DM, ETL, Reports, Code, HTML…
• Enterprise Development
– Source control, versioning
– Developer & resource isolation
– Deployment
• Major Productivity & Usability Enhancements
– DW Generator
– Intellicube Wizard
– Data Source Views
– More…
SQL Server in the BI
• Data warehouse enhancements
– Partitioned Tables
– Online Index Re-organization
– NUMA support
– Bulk Load improvements
– User Defined Aggregations
• XML Web Services
– Native XML Data Type
– XQuery Support
– Sprocs as Web Services
• Programmability
– CLR language sprocs
• Real Time Log Shipping
• Security and Abilities
Integration Services in the BI
• The new name of DTS
• “Big ETL”
– Highest end scale and throughput
– Rich connectivity
• Advanced process flow
– Branches, loops…
• New: Data Pipeline
– Streamed transformations
• Programmability
– CLR based transforms
– Rich debugging
• Major productivity & usability enhancements
Analysis Services in the BI
• Unified Dimensional Model
– OLAP unified with Relational Reporting
– Real-time BI
– Advanced Intelligence with KPIs, MDX Scripts, Translations, Currencies…
• Web Services
– Native XML/A support
• Scalability and Other Abilities
– Unlimited dimensions,…
• High-end Data Mining
– x4 DM algorithms
• Clustering Support
– Multi-instance support
Reporting Services in the BI
• Rich interactive reporting
– Chart, pivot, slice, drill
– Native OLAP support
– Flexible forms & formats
– Reports as Data Source
• Enterprise Management
– Catalogs, Schedules, Snapshots, Cache, Security, Clusters…
• Reporting platform
– Web Services API
– Rich Extensibility
– Developer tools
Demo
• SQL Server 2005 BI components
– SQL Server Management Studio
– SQL Server BI Development Studio
– Reporting Services
– SQL Server Integration Services
– Q&A
Agenda
• SQL Server 2005 BI components
• Analysis Services
• SSIS (SQL Server Integration Services)
• Reporting Services & Report Builder
• Q&A
Analysis Services
• SQL Server 2005 Analysis Services Overview
• Demo: cube creation compared
– Creating a cube in SQL Server 2K
– Creating a cube in SQL Server 2005
Analysis Services Overview
• Analysis Services (OLAP)
• Programmability Preview
• Deployment
• Measure Groups and Time
• Alternate Aggregations
• AMO Development
Analysis Services (OLAP)
• Database Data Access
• Spreadsheet Data Access
• “Database” OLAP
• “Spreadsheet” OLAP
• Analysis Services OLAP
Database Data Access
• SELECT Sum(«Measure»)
FROM «Source»
WHERE «Slicers»
GROUP BY «Dicers»
• Has the potential to become slow
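As a rough illustration of the query template above, the sketch below fills in «Measure», «Source», «Slicers», and «Dicers» using Python's built-in sqlite3 module. The table name `sales` and its columns are assumptions made for this example; the figures are the quarterly fruit sales from the spreadsheet slide.

```python
import sqlite3

# Quarterly sales per product, taken from the spreadsheet slide.
SALES = {
    "Apples":   [500, 3000, 2400, 3900],
    "Cherries": [700, 2200, 3800, 4600],
    "Grapes":   [1600, 2800, 3800, 4800],
    "Melons":   [1000, 2900, 3800, 5900],
}

def load(conn):
    """Create the (hypothetical) sales table and load one row per product and quarter."""
    conn.execute("CREATE TABLE sales (product TEXT, quarter TEXT, amount INTEGER)")
    rows = [
        (product, f"Qtr {i + 1}", amount)
        for product, amounts in SALES.items()
        for i, amount in enumerate(amounts)
    ]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

def totals_by_product(conn, quarters):
    """SELECT Sum(measure) FROM source WHERE slicers GROUP BY dicers."""
    placeholders = ", ".join("?" for _ in quarters)
    sql = (
        "SELECT product, SUM(amount) FROM sales "   # measure and source
        f"WHERE quarter IN ({placeholders}) "        # slicers
        "GROUP BY product ORDER BY product"          # dicers
    )
    return conn.execute(sql, list(quarters)).fetchall()

conn = sqlite3.connect(":memory:")
load(conn)
first_half = totals_by_product(conn, ["Qtr 1", "Qtr 2"])
print(first_half)
```

Every request re-scans and re-aggregates the detail rows, which is why this access style can slow down as the source grows.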
Spreadsheet Data Access
• Cell Reference =C5
• Retrieves data by position

     A             B         C        D        E        F
1                  Quarter
2    Product       Qtr 1     Qtr 2    Qtr 3    Qtr 4    Grand Total
3    Apples          500      3000     2400     3900     9800
4    Cherries        700      2200     3800     4600    11300
5    Grapes         1600      2800     3800     4800    13000
6    Melons         1000      2900     3800     5900    13600
7    Grand Total    3800     10900    13800    19200    47700
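A minimal sketch of retrieval by position, assuming the grid is laid out with the "Quarter" caption in row 1, the column headings in row 2, and the products in rows 3 to 6. The function name `cell` is made up for this example.

```python
# Grid from the slide; the row/column placement is an assumed reconstruction.
GRID = [
    ["",            "Quarter", "",      "",      "",      ""],
    ["Product",     "Qtr 1",   "Qtr 2", "Qtr 3", "Qtr 4", "Grand Total"],
    ["Apples",      500,       3000,    2400,    3900,    9800],
    ["Cherries",    700,       2200,    3800,    4600,    11300],
    ["Grapes",      1600,      2800,    3800,    4800,    13000],
    ["Melons",      1000,      2900,    3800,    5900,    13600],
    ["Grand Total", 3800,      10900,   13800,   19200,   47700],
]

def cell(ref):
    """Resolve a reference such as 'C5' to the value stored at that position."""
    column = ord(ref[0].upper()) - ord("A")   # 'A' -> 0, 'B' -> 1, ...
    row = int(ref[1:]) - 1                    # spreadsheet rows are 1-based
    return GRID[row][column]

print(cell("C5"))  # the Grapes / Qtr 2 cell under the assumed layout
```

The reference carries no meaning of its own: if a row is inserted above Grapes, =C5 silently points at different data.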
Customer Needs
[Chart: Formulas (Simple to Complex) against Data (Small to Large); plotted: Spreadsheet, Database]
“Database” OLAP
[Chart: same axes; plotted: OLAP, Database]
“Spreadsheet” OLAP
[Chart: same axes; plotted: Spreadsheet, OLAP]
Analysis Services OLAP
[Chart: same axes; plotted: Spreadsheet, OLAP, Database]
Programmability Preview
• Analysis Services Scripting Language (ASSL)
• Analysis Management Objects (AMO)
Analysis Services Scripting Language (ASSL)
• Defines the structure of Analysis Services objects
– Cubes, Dimensions, Data Mining models
• Binds Analysis Services objects to data sources
• The language used to create, modify, and deploy AS objects
• Language is XML/A
– XML for Analysis
– A SOAP-based language for sending queries and commands in a multi-dimensional server environment
– More information is available at www.xmla.org
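To make the "SOAP-based language" point concrete, here is a sketch that builds an XML for Analysis Execute message with Python's standard library. The MDX statement and the catalog name `Sales` are hypothetical; the message is only constructed, not sent to a server.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"

def build_execute(statement, catalog):
    """Wrap a statement in a SOAP envelope carrying an XMLA Execute call."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    execute = ET.SubElement(body, f"{{{XMLA_NS}}}Execute")
    command = ET.SubElement(execute, f"{{{XMLA_NS}}}Command")
    ET.SubElement(command, f"{{{XMLA_NS}}}Statement").text = statement
    properties = ET.SubElement(execute, f"{{{XMLA_NS}}}Properties")
    property_list = ET.SubElement(properties, f"{{{XMLA_NS}}}PropertyList")
    ET.SubElement(property_list, f"{{{XMLA_NS}}}Catalog").text = catalog
    return ET.tostring(envelope, encoding="unicode")

# Hypothetical cube and catalog names, for illustration only.
message = build_execute(
    "SELECT [Measures].[Sales Amount] ON COLUMNS FROM [Sales]",
    catalog="Sales",
)
print(message)
```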
Using ASSL
• Developer
– Designs a set of cubes with the BI Development Studio design tools
– Saved as part of the project
– The file that defines a cube can also be opened directly in an XML editor
• Administrator
– Uses the XML editor in SQL Server Management Studio directly to create or modify Analysis Services objects
Analysis Management Objects (AMO)
• A .NET API for generating ASSL commands
• Can manage Analysis Services objects, including servers, databases, data source views, cubes, dimensions, mining models, and roles
• Scripts are generated in ASSL, not in AMO
• Replacement for AS2000 Decision Support Objects (DSO)
– DSO remains available for backward compatibility
Deployment
• Translations
• Unknown Member
• Build, Deploy, Process
• Deployment Utility
Translations
• Translate member captions
– Need Column in relational dimension table
– Add as Caption Column in Translation section
• Localize Heading (e.g. Attribute name) NEW
– Add as Constant in Translation section
• Localize Constant Members (e.g., All) NEW
– Add as Constant in Translation section
Unknown Member
• For when explicit attribute key is not found
– Prime example is fact table
– Also works within a dimension
• Cautions
– One Unknown Member for the dimension as a whole (not per attribute)
– Do not capture explicit NULL values
• If an attribute is built from a “Star” table, its attribute hierarchy will have a NULL member.
– e.g., a NULL in the Gender column of the Customer dimension
Build, Deploy, Process
• Save (from BI Development Studio)
– Source definition files stored as XML
• Build (from BI Development Studio)
– Assembles project into ASDatabase XML file
• Deploy
– Wraps XML with Create command and sends to server
– Use Deployment Wizard to configure script
• Process
– Load actual dimensions and partitions with data
– BIDS automatically processes on deploy
Deployment Utility
• Start with ASDatabase definition file
• Supply configuration information
– Server name, target database name
– Replacement mode for partitions and roles
– Locations for error logs and data files
– Processing options (full, default, or none)
• Optionally creates XMLA deployment script
Measure Groups and Time
• Measure Groups
• Cube
• Dimension Usage
• Perspectives
• Multiple Calendars
Measure Groups
• Combine fact tables of different grain
• Similar to cube in AS2K
• One measure group per (logical) fact table
• Map grain of dimension to measure group
Cube
• Assembly of measure groups
• Encourages combining fact tables of different grain
– Similar to AS2K virtual cube (integrates data from different fact tables)
• Logical relationships of all relevant data
– Unified Dimension Model (UDM)
– Physical storage may be OLAP or relational
Dimension Usage
• Same dimension appears in two fact tables
– Sales – Date.Day
– Forecast – Date.Month
• Dimension Usage allows linkage
– Dimension not used at all
– Dimension used at leaf grain
– Dimension used at higher grain
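The Sales/Forecast case above can be sketched in a few lines: sales facts are keyed at Date.Day, forecast facts at Date.Month, and the day-grain facts are rolled up so both can be read at the month level. All figures and names below are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical facts: sales recorded per day, forecast recorded per month.
sales_by_day = {
    date(2005, 1, 3): 120,
    date(2005, 1, 17): 80,
    date(2005, 2, 1): 150,
}
forecast_by_month = {(2005, 1): 250, (2005, 2): 100}

def roll_up_to_month(daily):
    """Aggregate leaf-grain (day) facts to the higher (month) grain."""
    monthly = defaultdict(int)
    for day, amount in daily.items():
        monthly[(day.year, day.month)] += amount
    return dict(monthly)

def sales_vs_forecast(daily, monthly_forecast):
    """Pair actual sales with the forecast at the grain both facts share."""
    actual = roll_up_to_month(daily)
    return {
        month: (actual.get(month, 0), goal)
        for month, goal in monthly_forecast.items()
    }

comparison = sales_vs_forecast(sales_by_day, forecast_by_month)
print(comparison)
```

Below the month level the forecast has nothing to say, which is the information Dimension Usage records for each measure group.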
Perspectives
• A complete cube may be overwhelming
• Perspective is simplified version of a cube
– Select measures, hierarchies to include
– Similar to AS2K virtual cube (eliminates dimensions or
measures from a cube)
• Completely logical layer, no physical storage
Multiple Calendars
• Alternate hierarchies
– Within single dimension
– Calendar – Fiscal – Weeks
• Role-playing dimensions
– Separate fact table foreign keys
– Order Date – Ship Date – Due Date
– Each reuses all hierarchies of date dimension
Calculated Members
• Average Price
– [Measures].[Sales Amount] / [Measures].[Order Quantity]
– Calculates for current member of all other dimensions
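A small sketch of what "calculates for current member" implies: the ratio is evaluated after both measures have been aggregated for the current selection, not averaged row by row. The fact rows and function names are hypothetical.

```python
# Hypothetical fact rows: (product, sales amount, order quantity).
FACTS = [
    ("Apples", 100.0, 10),
    ("Apples", 300.0, 20),
    ("Grapes", 90.0, 30),
]

def average_price(rows):
    """Sales Amount / Order Quantity over already-aggregated measures."""
    sales = sum(row[1] for row in rows)
    quantity = sum(row[2] for row in rows)
    return sales / quantity if quantity else None

def average_price_for(product=None):
    """Evaluate the calculation for one product, or for all products when None."""
    rows = [row for row in FACTS if product is None or row[0] == product]
    return average_price(rows)

print(average_price_for("Apples"))
print(average_price_for("Grapes"))
print(average_price_for())
```

The same expression yields a correct figure at every level because it is recomputed from the aggregated measures at that level.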
KPIs
• Key Performance Indicators
• Calculate Value, Goal, Status, Trend, Weight,
Gauge
• Accessible from client application
Demo
• Comparing SQL Server 2K and SQL Server 2005
– SQL Server 2K Analysis Services
– SQL Server 2005 Analysis Services
Agenda
• SQL Server 2005 BI components
• Analysis Services
• SSIS (SQL Server Integration Services)
• Reporting Services & Report Builder
• Q&A
What is SSIS?
• SQL Server Integration Services
• A new Microsoft SQL Server Business Intelligence application
• The successor to Data Transformation Services
• The platform for a new generation of high performance data integration technologies
Example: before Integration Services
[Diagram: call centre data (semi-structured), legacy data (binary files), the application database, and mobile data pass through separate hand coding, text mining, data mining, staging, cleansing, and ETL steps before reaching the warehouse, reports, and alerts & escalation]
• Integration and warehousing require separate, staged operations.
• Preparation of data requires different, often incompatible, tools.
• Reporting and escalation is a slow process, delaying smart responses.
• Heavy data volumes make this scenario increasingly unworkable.
Example: with Integration Services
[Diagram: SQL Server Integration Services reads call centre semi-structured data, legacy binary files, the application database, and mobile data through custom and standard sources, applies merges and text mining, data mining, and data cleansing components, and feeds the warehouse, reports, and alerts & escalation]
• Integration and warehousing are a seamless, manageable operation.
• Source, prepare, and load data in a single, auditable process.
• Reporting and escalation can be parallelized with the warehouse load.
• Scales to handle heavy and complex data requirements.
How does it work?
[Diagram: a Control Flow containing FTP, Send Mail, Loop, Execute SQL, and Data Flow tasks; the Data Flow reads Flat File and Oracle ADO.NET sources, applies Merge, De-duplicate, and Split, and loads SQL Server and Flat File destinations]
• Control flow can include many different kinds of tasks, arranged in loops and sequences and related by constraints, which enables the user to define a complex workflow of tasks.
• Data flow is a task with its own special object model; it is used for moving data.
• Data can be merged from multiple, diverse sources, cleansed, and even split, then loaded into multiple, heterogeneous destinations.
Architecture
[Diagram: packages (stored as XML) contain tasks, loops & sequences, and event handlers; the data flow connects data source adapters, standard and custom transforms, and data destination adapters; packages are reached through wizards, the DTS Designer, and the command line]
Life Cycle tools
• Design
– Business Intelligence Designer
– Migration wizard for pre SQL 2005 packages
– Visual SourceSafe integration
• Deployment
– Configuration Wizard: flexible package configuration
– Deployment Utility: install packages
– SQL Agent: schedule package execution
– Command Line Utility: execute packages
• Supportability
– SSIS service to monitor running packages and stored packages
– Rich logging
– Checkpoint-restart ability
– WMI integration
Integration Service Overview
• Cleanse Data Input
• Split an Output Channel
• Script to Branch Control
• Lookup Fuzzy Values
• Loop through Folder
• Configure and Deploy
Cleanse Data Input
• Connection Manager
• Flow Types
• Data Adapter
• Metadata Validation
• Data Flow Components
Connection Manager
• Package object
• Layer between environment and outside world
– Allows for indirection, configuration
– Manages opening, closing, sharing
• Important for:
– Isolation, portability
– Metadata
• Used for both tasks and transformations
Flow Types
• Two different types of flow
– Control Flow = Runtime = Tasks
– Data Flow = Pipeline = Transforms
• Managed in Designer
– Used to be single view in SQL 2000
– Separate views in SQL 2005
• Control Flow handles tasks and precedence
• Data Flow handles transformations: the zoomed-in contents of a Data Flow task
Data Adapter
• Data Flow object
• Logical use of data
– Data Flow Source
– Data Flow Destination
• Links to Connection Manager
Metadata Validation
• Confirm and link columns in source and destination
– Define source first, then flow data to destination.
Destination responds.
– Changing the column definition invalidates the “contract”
with the destination
• Convenient if source exists and is stable
– Disable to build/manipulate files in progress
Data Flow Components
• Components
– Source
– Transform
– Destination
• Paths
– Data route between one component and the next
– Includes metadata about columns moving around
– Lineage Identifier tracks item transformations
• Pipeline
– Components connected by a path
Split an Output Channel
• Distributors and Collectors
• Precedence Constraints
Distributors and Collectors
• Distributor transformations
– Multicast
– Conditional Split
• Collector transformations
– Union All
– Merge
– Merge Join
[Diagram: Src feeds a distributor (D), which fans out to parallel Tfm branches; a collector (C) gathers them into Dst]
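The behaviour of these transformations can be approximated with plain Python lists. This is only a sketch of the idea, not the SSIS implementation; the function names mirror the transformation names, and Merge Join is left out for brevity.

```python
import heapq
from itertools import chain

def multicast(rows, copies):
    """Distributor: every output receives every row."""
    rows = list(rows)
    return [list(rows) for _ in range(copies)]

def conditional_split(rows, conditions):
    """Distributor: each row goes to the first output whose condition matches."""
    outputs = {name: [] for name in conditions}
    outputs["default"] = []
    for row in rows:
        for name, predicate in conditions.items():
            if predicate(row):
                outputs[name].append(row)
                break
        else:
            outputs["default"].append(row)
    return outputs

def union_all(*inputs):
    """Collector: append inputs one after another, with no ordering guarantee."""
    return list(chain(*inputs))

def merge(left, right, key):
    """Collector: interleave two inputs that are already sorted on the key."""
    return list(heapq.merge(left, right, key=key))

rows = [{"id": 1, "amount": 50}, {"id": 2, "amount": 500}, {"id": 3, "amount": 5}]
split = conditional_split(rows, {"large": lambda r: r["amount"] >= 100})
recombined = merge(split["default"], split["large"], key=lambda r: r["id"])
print([r["id"] for r in recombined])
```

Merge needs sorted inputs and keeps the combined output sorted; Union All does not care about order, which is the practical difference between the two collectors.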
Precedence Constraints
• Connect one task to another
• Give sequential relationship to tasks
– Success/Failure/Completion workflow
– Establish concurrency
• Connected sequence of tasks is a task list
– Independent Task Lists execute concurrently
– Tasks within a list execute sequentially
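A minimal sketch of the execution rule above, assuming every constraint is an "on success" constraint: tasks inside one list run in order and stop at the first failure, while independent lists run side by side. The task names are hypothetical, and a thread pool stands in for the SSIS runtime.

```python
from concurrent.futures import ThreadPoolExecutor

def run_task_list(tasks):
    """Run tasks sequentially; a failure stops the rest of this list."""
    outcome = []
    for name, action in tasks:
        try:
            action()
        except Exception:
            outcome.append((name, "Failure"))
            break
        outcome.append((name, "Success"))
    return outcome

def run_package(task_lists):
    """Run independent task lists concurrently, one worker per list."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(run_task_list, task_lists))

def failing_download():
    raise RuntimeError("remote host unavailable")

task_lists = [
    [("extract", lambda: None), ("load", lambda: None)],
    [("ftp download", failing_download), ("send mail", lambda: None)],
]
results = run_package(task_lists)
print(results)
```

The failure in the second list does not affect the first, because nothing connects the two lists.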
Script to Branch Control
• Package Variables
• Script Task
• Complex Precedence
• Containers
Package Variables
• Scope
– Each container can have variables
– Define namespace for user variables
– Containers can access variables from higher levels
• Accessible from
– Expressions (such as loops and constraints) – Use @
– Parameters in Execute SQL task
– Parent Package (as part of configuration)
– Script
Script Task
• Currently VB.NET
• Can read or modify properties throughout the package
• Can’t access inner workings of tasks or transforms
• Can’t modify pipeline metadata (e.g., number of columns piped)
Complex Precedence
• Expressions add control – look at variables
– Simulate If or Case logic
• And/Or linkage of multiple constraints
• Disabled task = Success
Containers
• Container provides
– Grouping of task lists (list of one is allowed)
– Transaction scope
– Variable scope
• A package is a container
– Add your own tasks
– Insert your own containers
– Loops are containers, too
Lookup Fuzzy Values
• Fuzzy Lookup Transformation
• Data Visualizers
• Breakpoints
• Exception Handling
Fuzzy Lookup Transformation
• Proximity algorithm to find matches
• Builds index – Index can persist
• Creates metrics
– Similarity
– Confidence
• Uses a separate connection for reference table
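To give a feel for the similarity and confidence metrics, the sketch below scores an input value against a small reference list using Python's difflib. This is a stand-in technique, not the proximity algorithm SSIS uses; the reference values, the threshold, and the way confidence is derived (gap to the runner-up) are all assumptions for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical reference table of clean values.
REFERENCE = ["Microsoft", "Hitachi Consulting", "Tandem Computers"]

def fuzzy_lookup(value, reference, threshold=0.6):
    """Return the closest reference value with rough similarity and confidence scores."""
    scored = sorted(
        (
            (SequenceMatcher(None, value.lower(), candidate.lower()).ratio(), candidate)
            for candidate in reference
        ),
        reverse=True,
    )
    similarity, match = scored[0]
    if similarity < threshold:
        return None  # nothing in the reference table is close enough
    runner_up = scored[1][0] if len(scored) > 1 else 0.0
    # Assumed definition: confidence is how far the best match is ahead of the next one.
    return {"match": match, "similarity": similarity, "confidence": similarity - runner_up}

result = fuzzy_lookup("Micrsoft", REFERENCE)
print(result)
```

Here every lookup rescans the reference list; the persisted index mentioned on the slide exists to avoid exactly that cost on large reference tables.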
Data Visualizers
• Add to pipeline
• Pause pipeline flow to inspect contents
• Function as breakpoints (can disable)
• Different degrees of summary
– Grid – for low-volume detail
– Graphs – for high-volume overview
Breakpoints
• Stop execution during package
– Stop control flow before and after
– Stop transformation during (visualizer)
• Fire selectively
Exception Handling
• Event Handlers
– Special container runs in response to event
• Log Providers
– Tasks (and the runtime engine) raise log events
– Log events can be saved: Profiler, console, table or file
• Transactions
– Containers, Packages, Multiple-Packages
• Checkpoints
Loop through Folder
• Control Flow
• Multiple Sources
• Loops
Control Flow
• Loops are Container Tasks
• Different from conventional programming
– DTS container task handles loop
– Package variable handles enumerator
– Can’t write a loop in C#
• “No programming required”
Multiple Sources
• Typical Examples
– Folder full of files
– Multiple servers with identical tables
– Multiple partitions of a table
– Multiple asynchronous data feeds
• Solution
– For or For Each Loop
– MULTIFILE Data connection
Loops
FOR LOOP
• Loops while expression is TRUE
• Manually add loop counter
– Init: @N = 1
– Eval: @N <= 25
– Increment: @N = @N + 1
• Execute tasks in container on each iteration
• More control, and more complex, than For Each
FOR EACH LOOP
• Loops over set of objects
– Files
– XML nodes
– Database objects
• Set variable (e.g. file name) for each iteration
• Execute tasks in container on each iteration
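The two containers can be mimicked in Python to show what each one asks the package author to supply. The helper names and the temporary folder contents are made up for this sketch; in SSIS the same pieces are set as container properties rather than written as code.

```python
import tempfile
from pathlib import Path

def for_loop(init, evaluate, increment, task):
    """For Loop: the author supplies the counter's init, eval, and increment parts."""
    n = init
    while evaluate(n):
        task(n)
        n = increment(n)

def for_each_file(folder, pattern, task):
    """For Each Loop: the enumerator supplies the next value (here, a file name)."""
    for path in sorted(Path(folder).glob(pattern)):
        task(path.name)

# For Loop: Init @N = 1, Eval @N <= 25, Increment @N = @N + 1.
counted = []
for_loop(1, lambda n: n <= 25, lambda n: n + 1, counted.append)

# For Each Loop over a folder of files (created here only for the demonstration).
seen = []
with tempfile.TemporaryDirectory() as folder:
    for name in ("b.csv", "a.csv", "notes.txt"):
        (Path(folder) / name).write_text("")
    for_each_file(folder, "*.csv", seen.append)

print(len(counted), seen)
```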
Configure and Deploy
• Configurations
• XML Customizability
• Deployment
• Execution
Configurations
• Take something from the system
– Environment variables, registry, XML option file
– Apply it to some part of your package
– Run the package with the new setting
• Useful for multiple “similar” jobs
– Similar to Dynamic Properties from SQL 2000
– Facilitate reusability, different environments
• Configurable at runtime or during execution
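The "take something from the system, apply it to the package, run with the new setting" sequence can be sketched as follows. The package is modelled as a plain dictionary and the environment variable name `WAREHOUSE_SERVER` is hypothetical; real SSIS configurations address package properties by path rather than by dictionary keys.

```python
import copy
import os

# Hypothetical package settings as designed on the developer's machine.
PACKAGE = {
    "connections": {"warehouse": {"server": "localhost"}},
    "batch_size": 1000,
}

def apply_configuration(package, mapping, environ=os.environ):
    """Copy the package and overwrite mapped properties with environment values."""
    configured = copy.deepcopy(package)
    for variable, path in mapping.items():
        if variable not in environ:
            continue  # keep the design-time value when nothing is supplied
        *parents, leaf = path
        target = configured
        for key in parents:
            target = target[key]
        target[leaf] = environ[variable]
    return configured

mapping = {"WAREHOUSE_SERVER": ("connections", "warehouse", "server")}
production = apply_configuration(
    PACKAGE, mapping, environ={"WAREHOUSE_SERVER": "prod-sql01"}
)
print(production["connections"]["warehouse"]["server"])
```

The package definition itself is never edited, so the same package can serve development, test, and production with different configurations.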
XML Customizability
• Don’t modify DTS XML definition directly
– Schema not published—can change
• Create configuration file (can be XML)
– Map virtually any package object to a new value
• Not able to
– Configure collection items
– Change package structure
Deployment
• Same server deployment
– Deploy from BI Development Studio
• Multiple server deployment
– Deployment Utility
– Package configuration files with package
– Deploy to SQL Server (msdb) or file (dtsx)
Execution
• Command-line execution
– DTEXEC
• User Interface execution
– DTEXECUI
– Can generate command line for DTEXEC
• Scheduling
– SQL Server Agent
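As a sketch of command-line execution, the helper below assembles a DTEXEC argument list from Python. It assumes the /F (package file) and /Conf (configuration file) options of dtexec; the file names are hypothetical, and nothing is executed here.

```python
def dtexec_command(package_file, config_file=None):
    """Build an argument list for dtexec; the caller decides whether to run it."""
    argv = ["dtexec", "/F", package_file]
    if config_file:
        argv += ["/Conf", config_file]
    return argv

# Hypothetical package and configuration file names.
command = dtexec_command("LoadWarehouse.dtsx", "Production.dtsConfig")
print(" ".join(command))
```

On a machine with Integration Services installed, the list could be handed to `subprocess.run(command, check=True)`, or the printed line pasted into a SQL Server Agent job step.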
Customer benefits of SSIS
• Performance
– Data flows process large volumes of data efficiently, even through complex operations
• Facility
– Many pre-built adapters and transformations reduce hand coding
– Extensible object model enables specialized custom or scripted components
– Highly productive visual environment speeds development and debugging
• “Smarts”
– Data cleansing features enable difficult data to be handled during loading
– Data mining brings intelligent handling of data for imputation of incomplete data, conditional processing of potential problems, or smart escalation of issues such as fraud detection
Demo
• Implementing Integration Services
Reporting Services
• Reporting Services Release Roadmap
• Analysis Services Support
Reporting Services Release Roadmap
• SQL Server 2000 Reporting Services SP1
– Bug Fixes, Scalability, Excel 2000 support
• SQL Server 2005 Beta 2
– Parity with SQL2K SP1 version
– Cross SQL Server (AS, Management) integration
• SQL Server 2000 Reporting Services SP2
– Web Parts, Client Printing
• SQL Server 2005 Beta 3
– New Features, 64 Bit support
• SQL Server 2005 RTM
Analysis Services Support
• MDX and data mining query builders
• MDX parameter support
• Member extended properties
• Support for server aggregates
Management Studio Integration
• Single point of management for all SQL Server components
• Superset of Report Manager functions
• Script generation from property dialogs
SharePoint Web Parts
• Report Explorer provides browsing of server namespace and subscription
• Report Viewer used to view reports
• Parts can be connected or used standalone
• Works in both SPS and WSS
Visual Studio Integration
• Report design completely integrated with Visual Studio language projects
• Natural extension of VS data functionality
• Included in VS Pro and above
Report Builder
• A new ad-hoc report design tool for Reporting Services
• Targeted at business users who want to find and share answers to interesting questions
• Driven from a business model of the data so users do not need to understand the underlying data structures
• Fully integrated with Reporting Services and delivered in SQL Server 2005
• Complements the Visual Studio Report Designer
• Not designed to be a full analytical client, or a replacement for PivotTables
Report Builder Client
• Built on top of familiar Microsoft Office paradigms (Examples: Microsoft® Excel and PowerPoint®)
• Reports built via report templates (table, matrix, chart)
• “Click once” WinForms application deployed from the Report Server
• Users can create new reports or modify existing reports
• Finished reports can be published to server
References
• www.microsoft.com/sql
• http://www.microsoft.com/sql/2005/techinfo/default.asp
• SQL Server 2005 Business Intelligence
– Hitachi Consulting