Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
What's NEW in SQL 2005 Integration Services Matthew Stephen SQL Server Specialist http://blogs.technet.com/mat_stephen Agenda • • • • • Overview Business Intelligence Development Studio Workflow Features Dataflow Features Other Features Integration Services : A Short History • DTS - SQL Server 7.0 – “Visual BCP” – a useful utility • DTS - SQL Server 2000 – Easy (but slow) workflow & transform engine – Customizable • SSIS – SQL Server 2005 – – – – A completely new codebase Enterprise class ETL Exceptional BI integration – and more Rich APIs and extensibility New Features and Components in SQL IS SQL Destination Derived Column Transform File Extractor Transform Row Count Transform XML Source Adapter Data Reader Source Adapter Shrink Database Task Time Gen Source Adapter Property Expressions Dataflow Task Merge Join Transform Merge Transform Sort Transform Audit (Lineage) Transform Union All Transform SQL CE Destination Adapter DTExec Execution Stack Window HTTP Connection Manager For Loop Container For Each Loop Container Evaluate As Expression DTUtil Aggregate Transform Lookup Transform Enhanced Error Reporting Execute DTS 2000 Package Task Web Services Task New Migration Wizard Conditional Split Transform SQL Mobile Connection Manager Flat File Connection Manager Enhanced Import/Export Wizard Upgrade Analyzer for SSIS ActiveX Script Task Pivot Transform Un-pivot Transform Recordset Destination Adapter OLE DB Source Adapter OLE DB Destination Adapter Flat File Source Variable Namespaces Data Conversion Transform Excel Source Adapter Excel Destination Adapter OLEDB Reader Adapter Visualizers Integration Services Server Component Enumeration Caching Debugging Support in the Designer Data Reader Destination Adapter Debugging Breakpoints IS Projects in VS Watch Windows Log Viewer Windows Deployment Wizard Transfer Logins Task MSMQ Task F1 Integrated Help Copy/Paste in the Designer ADO Connection Manager ADO.NET Connection Manager File Connection Manager Analysis Services Processing Task Row sampling Transform WMI Event Watcher Task Percent sampling Transform Execute Process Task Text Log Provider Multicast Transform WMI Query Task FTP Connection Manager Raw Source Adapter MSOLAP90 Connection Manager Multi-file Connection Manager Multi-Flat File Connection Manager ODBC Connection Manager OLEDB Connection Manager Easy Dataflow Adapter Extensibility SMO Server Connection Manager Transfer Errors Task ForEach Item Enumerator Runtime Execution Engine Data Source Views DTExecUI Intellisense Help Designer Grouping Workflow Designer Dataflow Designer Event Handler Designer Output Window Error Window Right Click Online Error Access Tools Options pages For Each ADO.NET Enumerator Import/Export Wizard Auto-layout algorithms Raw Destination Adapter Runtime Execution Engine Configuration Wizard Integration Services Roles Package Encryption Options Notify Operator Task Package Signing SAP .NET Provider Managed Task Base Class ADO.NET Script Task File Inserter Transform MSMQ Connection Manager ADO.NET Script Component For Each From Variable Enumerator Easy Transform Extensibility Transfer SQL Server Objects Task Excel Connection Manager SQL Server Log Provider SQL Profiler Log Provider Easy ForEach Enumerator Extensibility Windows Event Log Provider ForEach Nodelist Enumerator SQL Server Configurations XML Configurations Registry Configurations Environment Variable Configurations Sequence Container Event Handlers ForEach File Enumerator Slowly Changing Dimension Transform Control Flow Designer Send Mail Task SQL Task File System Task XML Task Native Object Model SMTP Connection Manager Bulk Insert Task Analysis Services Execute DDL Task Back Up Database Task Check Database Integrity Task Execute T-SQL Statement Task Maintenance Cleanup Task Rebuild Index Task Reorganize Index Task Workflow Checkpoint Restartability Real time Designer Execution Monitoring User defined transaction scope FTP Task Easy Task Extensibility Transfer Stored Procedures Task Easy Connection Manager Extensibility Update Statistics Task Easy Log Provider Extensibility Integrated Development Environment WMI Connection Manager History Cleanup Task .NET Compatible Flat File Destination Transfer Database Task Data Convert Transform Managed Object Model XML Persisted System Variables New Events OnPreExecute etc. For Each ADO Enumerator Read Only Variables OLEDB Command Transform XML Log Provider Dataflow Execution Engine Package Explorer before Integration Services Alerts & escalation Call centre data: semi structured Text Mining Staging Staging Legacy data: binary files Hand coding Cleansing & ETL Application database ETL Data mining ETL Staging ETL Warehouse Reports Mobile data •Integration and warehousing require separate, staged, operations. •Preparation of data requires different, often incompatible, tools. •Reporting and escalation is a slow process, delaying smart responses. •Heavy data volumes make this scenario increasingly unworkable. With Integration Services Alerts & escalation Call centre: semi-structured data Text mining components Data mining components Custom source Merges Standard sources Data cleansing components Mobile data Warehouse Legacy data: binary files Application database SQL Server Integration Services Reports •Integration and warehousing are a seamless, manageable, operation. •Sourced, prepare and load data in a single, auditable process. •Reporting and escalation can be parallelized with the warehouse load. •Scales to handle heavy and complex data requirements. How SSIS Works •Data sources can be diverse, including custom or scripted adapters •Transformation components shape and modify data in many ways. •Data is routed by rules or error conditions for cleansing and conforming. •Flows can be as complex as your business rules, but highly concurrent. •And finally data can be loaded in parallel to many varied destinations. Architecture Standard transforms Custom transforms Data Destination Adapters Data Source Adapters Package XML Package Loops & Sequences Tasks XML Package Event Handlers Wizards DTS Designer Command Line Agenda • • • • • Overview Business Intelligence Development Studio Workflow Features Dataflow Features Other Features Business Intelligence Development Studio – Centralized Project Management • • • • Analysis Services Reporting Services Integration Services .Net and Visual Studio – Visual Studio Interface • Integrated package debugging • Edit, debug, fix loop in one UI Business Intelligence Development Studio • Project – Packages, Cubes, Reports – Data Source Views – Deployment • Solution – – – – Multiple Projects Additional files in Misc Files Your designer settings Source Control with Visual Source Safe Work Flow vs. Data Flow • Work Flow – Process-oriented – Do something to an object – Success, fail, or completion result – Environmental Inputs – Distinct beginning and end • Data Flow – Data-oriented – Do something to data – Produce and/or consume rows of data – Rows as inputs and outputs – Parallel in nature – Starts with first row, ends with last row, for each row Business Intelligence Development Studio • Package Editor • Design Surfaces – Work Flow, Data Flow, Event Handlers – Properties Pane • Variables, Connections UI – Toolbox – Pop-up editor windows • Output, Watch, Log, Error Windows Agenda • • • • • Overview BI Workbench Workflow Features Dataflow Features Other Features Work Flow Features • • • • • • • • • Tasks Precedence Constraints Containers Looping Events Configurations Logging Property Expressions Deployment Work Flow: The Nature of Tasks • • • • • • • • • • • • File System Task Smallest Unit of Work File Transfer Protocol Task Execute Process Task Individually successful or Send Mail Task unsuccessful Execute SQL Task WMI Task Disabled or enabled MSMQ Task Database Management Isolated : Unaware of other Tasks tasks • Execute Package Task • VB.Net Script Task, and more! Work Flow: Precedence Constraints • Order the execution of • Constraint is met based on – Success/Failure/Completion of tasks previous task • Establish task sequence – Evaluation result of an expression – Establish concurrency – Establish order – Establish dependency – A combination of the two • May be Logically OR’ed or AND’ed Work Flow: Containers • Arbitrary grouping of Executables • Containers also use Precedence Constraints • Containers execute concurrently • The package is a container – Loops, Sequences, Taskhosts and Event Handlers are containers, too • Containers may contain other containers • Containers provide scope – – – – Variables Transactions Execution Logging • Manage complexity – Break packages into units – Can be collapsed – Move parts of a package as one – Disable container and all children Work Flow: Looping • For Loop - Expressions • • – Utilize container advantages – ForEach Loop – For Loop • For Each Loop – – ForEach Enumerators – Variable Mappings Dim pkg Dim stp set pkg = DTSGlobalVariables.Parent set stp = pkg.Steps("DTSStep_DTSMyTask") ‘Do stuff here ‘Now loop stp.ExecutionStatus = DTSStepExecStat_Waiting No more loop hacks Functionality baked-in • • Better interaction with environment No coding necessary Work Flow: Events • Eventing Expanded – Event Handlers • In package event response • Has information about event in System Variables • Contains workflow – Custom Events • Tasks may register custom events • May create event handlers for custom events • Automatically logged – Event Driven Architecture • Tasks may be built to wait for events • Graceful shutdown with “System::CancelEvent” • Enables event driven paradigm eg. WMI Event Task Work Flow: Configurations • Enable “no-touch” package editing • Support multiple types – – – – SQL Server XML Environment Variables Registry • Configuration Wizard • Indirect Configurations – Configuring Configurations • Use Package Paths for describing location of configured objects Work Flow: Logging • Rich array of log providers – SQL Server – XML – Text – Windows event viewer – SQL Profiler • • • Integrated logging throughout package Log to separate providers Write your own Work Flow: Property Expressions • Modifying Properties with Expressions – All tasks, logproviders, connection managers and ForEach Enumerators and some dataflow components • • • • Subtle and Surgical Innumerable applications May reference variables Actuated at key edge events – – – – Load/Save Execute/Validate AcquireConnection GetEnumerator Agenda • • • • • Overview Business Intelligence Development Studio Workflow Features Dataflow Features Other Features Data Flow • • • • High performance data movement Components and Paths Adapters Transforms Data Flow Features • Components, each with zero or more inputs and/or zero or more outputs • Paths connect outputs to inputs • Data Flow Task – No rules broken, but bent a few – Dedicated UI – Own object model Data Flow: Components and Paths • Components – Data Flow Adapters – Transforms • Paths – The route between components – Includes metadata about columns Data Flow: Adapters • Reference connection managers to access data • Move rows into buffers • Convert native types to dataflow task types • Describe their output • Have no inputs • OLEDB Source and Destination • Flat File Source and Destination • XML Source • SQL Server Destination • SQL CE Destination • Recordset Destination • Raw Source and Destination Data Flow: Transforms • May have multiple inputs and outputs. • Must have at least one input • Must have at least one output • Process rows by buffer • May output more or less rows than take on input • • • • • • • • • Aggregate Conditional Split Fuzzy Lookup Lookup Merge / Merge Join / Multicast Slowly Changing Dimension Script Component Sort Custom Transforms Agenda • • • • • Overview Business Intelligence Development Studio Workflow Features Dataflow Features Other Features Other Features : Platform • Easily Extensible – – – – – Custom tasks Custom transforms Custom Adapters Custom log providers Custom foreach enumerators – Custom data providers – Develop in native or managed code – COM and Managed Object models • Built in Features – Ships with SQL Server – “In the box” – Ships with multiple transforms • Data Cleaning, Data Mining • Aggregation, Sort, SCD – Ships with tools – Integrated with SQL Agent – Integrated with Analysis Services • Flexible – May be used in a wide array of applications, not just ETL Work Flow: Deployment • • Makes Moving Packages Simpler Includes Miscellaneous files in projects • Builds self installer deployment bundle • • Integrated with configurations Built when project is built in designer Other Features : Debugging • Expanded Error, Warning and Information Reporting • Breakpoints – Script task and package breakpoints integrated • Watch windows • Data Viewers in the data flow task • Error, warning and Information view window • Log event viewing in log window • All events logged Other Features : Security • Encryption - Using encryption to secure packages or parts of packages • Sensitive Data Protection - Identifying and protecting passwords and other sensitive data • Roles - SQL Server roles to control access to packages stored in SQL Server. Operator, Limited User, and Admin. • Digital Signing - Code signing certificate to ensure a package hasn't changed • Agent Integration - Agent subsystems to securely store and execute packages Other Features : Checkpoint Restartability • • • • • Enables recovery from failed state Saves work at the task level Variable values saved Saves to XML temporary file Checkpoint file removed when package restarts • Runtime feature, not dataflow restartability Other Features : DTExec, DTExecUI, DTUtil • DTExec replaces DTSRun • DTExecUI replaces DTSRunUI and has new UI • DTUtil new utility for moving packages etc. Other Features : SSIS Server • Integrates with SQL Server Management Studio • Provides package file store • Enumerates running packages • Hierarchical package storage on file system, Server Store and in SQL Server • Configure with MsDtsSrvr.ini.xml in 90\DTS\binn Other Features : Not so obvious • Variables made safe with the Variable Dispenser • Scriptable sources and destinations as well as transformations • Transaction flexibility and scoping • Script now compiled for much faster execution • Agent interaction more customized and integrated • Packages saved to XML • Always uses encrypted connections with SQL2K5 Help : Upgrade Advisor & Migration Wizard • Discovers problems before installing • Migrate packages from DTS 2K only Demos Summary • Integration Services has changed dramatically. • A platform designed to support embedding and extension • Stock feature set enormous and comprehensive • Exhibit A in Microsoft’s BI Strategy • True enterprise ETL and Integration capabilities • Long term support from Microsoft • We’ve only just begun! Resources – Integration Services MSDN Home http://msdn.microsoft.com/SQL/sqlwarehouse/SSIS/defaul t.aspx – Articles – – http://www.windowsitpro.com/Authors/AuthorID/1457/1457 .html – Junkies Forum http://www.sqljunkies.com/Forums/ShowForum.aspx?Foru mID=48 – MSDN Forum http://forums.microsoft.com/msdn/ShowForum.aspx?Foru mID=80 – Community http://SQLIS.com