APPENDIX R 6100038971

PA State Police (PSP)
Database Design Data Modeling Standards
Database Team
Version 1.4
Date: 03/19/2014

SECURITY WARNING
The information contained herein is proprietary to the Commonwealth of Pennsylvania and must not be disclosed to unauthorized personnel. The recipient of this document, by its retention and use, agrees to protect the information contained herein. Readers are advised that this document may be subject to the terms of a non-disclosure agreement.
DO NOT DISCLOSE ANY OF THIS INFORMATION WITHOUT OBTAINING PERMISSION FROM THE MANAGEMENT RESPONSIBLE FOR THIS DOCUMENT.

COMMONWEALTH OF PENNSYLVANIA
PENNSYLVANIA STATE POLICE

Version History

Date        Version  Modified By / Approved By  Section(s)  Comment
09/03/2009  1.0      S. Greer                   All         Initial Version
01/20/2010  1.1      S. Greer                   5.2         Removed model name as upper case
05/05/2010  1.2      S. Greer / C. Reber        2.0 / 5.3   Update links to point to web page instead of an individual document (2.0 intro / 5.3).
11/15/2011  1.3      C. Reber                   All         Change ESF to EDC
03/19/2014  1.4      S. Greer                   All         Change EDC to PSP

Table of Contents

1 DATA MODELING INTRODUCTION
  1.1 PURPOSE
  1.2 DATA MODELING OVERVIEW
2 ROLES AND RESPONSIBILITIES
  2.1 DBA ROLE AND RESPONSIBILITY
    2.1.1 Logical Data Modeling
    2.1.2 Physical Data Modeling
  2.2 DEVELOPER ROLE AND RESPONSIBILITY
3 DATA MODELING COMPONENTS
4 LOGICAL DATA MODELING STANDARDS
  4.1 LOGICAL DATA MODELING CONVENTIONS
  4.2 LOGICAL DATA MODEL NAMING STANDARDS
5 PHYSICAL DATA MODELING STANDARDS
  5.1 PHYSICAL DATA MODELING CONVENTIONS
  5.2 PHYSICAL DATA MODEL NAMING STANDARDS
  5.3 PHYSICAL DATA MODELING TABLE AND COLUMN NAMING STANDARDS

1 Data Modeling Introduction

1.1 PURPOSE
The Pennsylvania State Police (PSP) Bureau of Information Technology Database team provides Database Design Services (DDS) for applications that require database development. This document focuses on data modeling standards related to the logical and physical design of a database. It lists and defines data model components and the data modeling naming conventions provided as part of Database Design Services.

1.2 DATA MODELING OVERVIEW
Data modeling is a vital part of the development process. Data models consist of three basic types: conceptual data models, logical data models and physical data models.
Major events in data modeling include:
- Identify entities, data requirements and processes
- Define attributes of the data such as data types, sizes, defaults
- Apply validation and business rules to ensure data integrity
- Define data management and security processes
- Specify data archival and storage

ITB-INF003 Data Modeling Standard defines the policy that is followed for Database Design Services.

2 Roles and Responsibilities
The Database Team requires the developer to complete and submit the Database Design Requirements Questionnaire (available on the Bureau of Information Technology Database Sharepoint site). The questionnaire requests a list of the data element metadata to be used to develop the data models and data dictionary.

2.1 DBA ROLE AND RESPONSIBILITY

2.1.1 Logical Data Modeling
The following are the roles and responsibilities Database Design Services provides for logical data modeling:
- Participate in project meetings related to data processing
- Communicate with Application Developers to understand all aspects of the logical model
- Construct, support and maintain development of the logical model
- Ensure the logical model conforms to Database Design Database Naming Standards, Data Modeling Standards, and OA/OIT Data Modeling Standards

2.1.2 Physical Data Modeling
The following are the roles and responsibilities Database Design Services provides for physical data modeling:
- Generate the physical model from the logical model and estimate the size of the database
- Perform standards reviews of the model to ensure the physical model conforms to the Database Design Database Naming Standards, Database Naming Standards Abbreviations list, Data Modeling Standards, and OA/OIT Data Modeling Standards
- Generate a data dictionary from the physical model
- Provide a data dictionary (Deliverable)
- Provide the physical model Entity Relationship Diagram (ERD) (Deliverable)
- Generate DDL from the physical model and create the database (Deliverable)
- Maintain and keep the physical model in synchronization with the database at all times

2.2 DEVELOPER ROLE AND RESPONSIBILITY
- Complete and submit the Database Design Requirements Questionnaire Document (Deliverable)
- Support standards reviews of the physical model

3 Data Modeling Components
A data model is composed of the following components and may be illustrated using a data-modeling tool:

Conceptual Data Model
A conceptual data model is a visual diagram from the business user’s perspective. Conceptual data models are often created as the precursor to Logical Data Models or as alternatives to Logical Data Models.

Logical Data Model
A logical data model is an abstract Database Management System (DBMS) independent representation of a set of data entities and their relationships within the scope of a system’s business requirements. The data model is a visual diagram of the database schema that contains all the information required to generate a physical database on a specific server. The model should identify all the tables, columns, and relationships related to the database. The logical data model should be constructed prior to the creation of the database. The coding and database development should begin after the logical model has been approved.

Physical Data Model
A physical data model is a representation of a data design which takes into account the facilities identified in the physical data modeling table and column naming standards section and constraints of a given database management system. In the lifecycle of a project it is typically derived from a logical data model, though it may be reverse-engineered from a given database implementation. A physical data model is translated into a data schema.
A series of steps to normalize the data should be applied at this phase to ensure efficiency and integrity of the data schema.

Data Dictionary
A data dictionary is a comprehensive document that outlines the business definition of all tables and columns within the model. Each table is defined by its name, columns (including data type, length, fixed or variable length, valid values, default values and nullable), primary key, foreign key, and business definition for every table and column.

4 Logical Data Modeling Standards

4.1 LOGICAL DATA MODELING CONVENTIONS
A logical data model must comply with the following:
- Use English words with no abbreviations, separated by a space.
- Do not use underscores in the logical model.
- Specify all primary and foreign keys in the data model.
- Specify relationship types as identifying or non-identifying and the cardinality for each relationship.
- Add comments for every table and column that describe the business definition, including valid values for the columns.
- Enter table and column descriptions and other metadata as required by the developer.

4.2 LOGICAL DATA MODEL NAMING STANDARDS
Since most data modeling software tools allow for both the logical and physical versions of the data model to exist in one file, the data model name will follow the data model naming standards outlined in the Physical Data Modeling Standards section.

5 Physical Data Modeling Standards

5.1 PHYSICAL DATA MODELING CONVENTIONS
A physical data model must comply with the following:
- Specify all primary and foreign keys in the data model.
- Define the primary key as the first data item in the table.
- Specify relationship types as identifying or non-identifying and the cardinality for each relationship.
- Enter table and column descriptions and other metadata as required by the developer.
- Synchronize the physical data model and the physical database with each other in the production environment.

5.2 PHYSICAL DATA MODEL NAMING STANDARDS
Standards are established for naming data models to easily identify the data model, data model subsystem and version. Name data models using the following guidelines:
- The data model name will be of the format “SPApplicationName”. This will be the same name as the database.
- Data-model subsystem names will be in the format “SPApplicationName_SubsystemName”.
- Versioning of the data model will be done as part of the modeling tool.

5.3 PHYSICAL DATA MODELING TABLE AND COLUMN NAMING STANDARDS
Consistent naming practices are vital. The use of a common methodology when naming database tables and columns promotes the uniform description and definition of the elements while, at the same time, facilitating:
- The reduction of redundant data elements by identifying like name elements
- The reduction of element documentation by utilizing existing documents
- The promotion of data integrity by ensuring data singularity and consistency
- The simplification and standardization of system documentation by utilizing existing documents
- Consistency in the use and meaning of abbreviations

Follow the standards defined in Database Design SQL Server Database Naming Standards (available on the Bureau of Information Technology Database Sharepoint site).

PA State Police (PSP)
Database Design SQL Server Database Naming Standards
Database Team
Version 1.7
Date: 03/14/2014

SECURITY WARNING
The information contained herein is proprietary to the Commonwealth of Pennsylvania and must not be disclosed to unauthorized personnel. The recipient of this document, by its retention and use, agrees to protect the information contained herein.
Readers are advised that this document may be subject to the terms of a non-disclosure agreement.
DO NOT DISCLOSE ANY OF THIS INFORMATION WITHOUT OBTAINING PERMISSION FROM THE MANAGEMENT RESPONSIBLE FOR THIS DOCUMENT.

Version History

Date        Version  Modified By / Approved By  Section(s)          Comment
8/11/2009   1.0      M. Forbes                  All                 Initial Version
9/30/2009   1.1      S. Greer                   2.5                 Changed indicator datatype to CHAR
1/20/2010   1.2      S. Greer                   2.4, 2.5            Added singular name, volatile tables and logical delete. Added column not past tense.
05/05/2010  1.3      S. Greer / C. Reber        1.2 / 3.1 / App. B  Update links to point to web page instead of individual documents (1.2 – App. B). Rework verbiage in section 3.1.
08/12/2010  1.4      S. Greer                   2.5                 Removed identity column to Primary Key names for convention to apply to all primary keys, not just identity columns. Added Primary key identity column data type.
12/09/2010  1.5      S. Greer                   2.3, 2.7            Multiple databases naming convention. Added Login Section.
11/15/2011  1.6      C. Reber                   All                 Change ESF to EDC
03/14/2014  1.7      S. Greer                   All                 Changed EDC to PSP

Table of Contents

1 INTRODUCTION
  1.1 PURPOSE
  1.2 SQL SERVER DATABASE NAMING STANDARDS OVERVIEW
2 SQL SERVER NAMING CONVENTIONS
  2.1 DATABASE OBJECTS
  2.2 SCHEMAS
  2.3 DATABASE NAMES
  2.4 TABLES
  2.5 COLUMNS
  2.6 VIEWS
  2.7 APPLICATION LOGINS
3 DATABASE STANDARD DATA ELEMENTS & ABBREVIATIONS
  3.1 STANDARD DATA ELEMENTS
  3.2 ABBREVIATIONS
4 APPENDIX A – RESERVED WORDS
5 APPENDIX B – RESOURCES & REFERENCES

1 Introduction

1.1 PURPOSE
The Pennsylvania State Police (PSP) Bureau of Information Technology Database team provides Database Design Services (DDS) for applications that require database development. This document describes the SQL Server database naming standards that will be applied for Database Design.
This document focuses on defining the rules for database object and naming construction and the rules for abbreviating the names to meet length constraints imposed by programming languages and database management systems.

1.2 SQL SERVER DATABASE NAMING STANDARDS OVERVIEW
Standardized naming conventions and abbreviations are defined in order to promote structure, consistency, readability, reusability, and maintainability. Applying standards promotes data element definition reuse between databases in an effort to improve data sharing and data mapping, and to eliminate data redundancy where possible.

Standards are important to developers for a number of reasons:
- 80% of the lifetime cost of a piece of software goes to maintenance.
- Hardly any software is maintained for its life by the original author.
- Code standards improve the readability of the software, allowing developers to understand new code more quickly and thoroughly.

For the conventions to work, each person who writes software for SQL Server databases must conform to these naming standards. It is also understood that there are situations where it is necessary to stray from these guidelines. Commercial-Off-The-Shelf (COTS) or third party products will adhere to these standards, if possible. For additional SQL Server naming conventions and standards, reference the documents listed under the Database Design Standards library on the Bureau of Information Technology Database Sharepoint site.

The SQL Server standards will only be enforced when PSP Database Design Services are engaged. If any of these standards cannot be met, a justification must be submitted to PSP Data Administration within the Database Section.

2 SQL Server Naming Conventions
Naming conventions provide a standard approach to naming different objects and help to troubleshoot and locate objects.
All objects also need a detailed description of use. The following naming conventions must be adhered to for SQL Server databases. When reading the naming conventions, remember that consistent naming is the most important rule.

2.1 DATABASE OBJECTS
A database object is a database component like a table, trigger, view, key, constraint, default, rule, user-defined datatype, or stored procedure in a database. The following naming conventions apply for all database objects:
- Names for user-defined database objects shall be case sensitive (use upper and lower case) for readability and programmability.
- Names must be significant and meaningful.
- Names must begin with a letter followed by letters, digits, or _ with a maximum length of 30 characters.
- Names for user-defined database objects should not use reserved words, keywords (Appendix A), quoted (or bracketed) names, spaces or special characters (such as $, #, etc.) except the underscore “_”.
Example: Use the name 'Account' for the Accounts table. Use the name 'Emp_Num' for the employee number column.

2.2 SCHEMAS
A schema is a collection of components and database objects under the control of a given database user.
- Schema names will be suffixed with 'dbo'.
- The schema name will be of the format “XXApplicationNamedbo”, where XX is the agency two-digit code.
- All user-defined database objects are owned by a schema.
- The Fully Qualified Domain Names (FQDN) of these database objects must be unique.
- The FQDN for all database objects must be in this form: [Server_Name].[Database_Name].schema.[Object_Name].
Example: Use the name 'Server01.Sample.XXApplicationNamedbo.Account' for the Accounts table, where XXApplicationNamedbo is the schema name in the Sample database residing on Server01.

2.3 DATABASE NAMES
A database in the relational context is organized by tables, rows and columns used to store information organized in such a way that a computer program can quickly select desired pieces of data.
- The database name will be of the format “XXApplicationName”, where XX is the agency two-digit code.
- For applications that require multiple databases, all of the database names must start with the same “XXApplicationName”. Example: SPLTC, SPLTCWorkFlow, SPLTCAudit
- The database name must not include the underscore “_”.

2.4 TABLES
A table is a two-dimensional object, consisting of rows and columns, used to store data in a relational database.
- The table name must be singular.
- The table name must describe the contents of the table.
- Avoid prepositions in table names.
- If possible, table names should not be longer than 26 characters.
- Do not use the prefixes 'sys', 'spt' and 'sysremote'. These are used for system tables.
- Tables used to resolve a many-to-many relationship must include both the source table names and a suffix of '_Xref'. Example: Permit_Type and Status_Type are combined as Permit_Status_Type_Xref
- A lookup table or list of values must be suffixed with '_Lkp'.
- The fact table name in the data mart must be suffixed with '_Fact'.
- The dimension table name in the data mart must be suffixed with '_Dim'.
- Volatile tables (tables with business data that is changed on a frequent basis) should include columns which identify when the row was entered, who entered it, when it was last updated, and who last updated it. These columns are named:

  Entered_By    VARCHAR(30)    NOT NULL  default suser_sname()
  Entered_Date  SMALLDATETIME  NOT NULL  default getdate()
  Update_By     VARCHAR(30)    NULL
  Update_Date   SMALLDATETIME  NULL

- Tables that logically delete rows (e.g. lookup tables) should include columns that identify whether the row was deleted (Obsolete), when it was last updated, and who last updated it.
  These columns are named:

  Obsolete     BIT            NOT NULL  default '0'
  Update_By    VARCHAR(30)    NULL
  Update_Date  SMALLDATETIME  NULL

- Xref tables must not include the Entered_By, Entered_Date, Update_By and Update_Date columns.

2.5 COLUMNS
A column is the area in each row that stores the data value for some attribute of the object modeled by the table.
- The column name must be singular.
- The column name must not be past tense.
- The column name must describe the contents of the column.
- Column names that appear in multiple tables should be consistently named throughout the Enterprise (across all agency tables).
- There may be cases where a column name would be referenced twice. If so, add a meaningful prefix to the original column name. Example: Color_Code would become Interior_Color_Code and Exterior_Color_Code.
- Primary key columns must include the table name and be suffixed with '_ID'. Example: Use the name Address_ID as the primary key column of the ADDRESS table.
- Primary key columns must exclude the Xref, Lkp, Fact and Dim suffix of the table name and replace it with '_ID'. Example: Use the name State_ID as the primary key identity column of the STATE_Lkp table.
- The data type of primary key identity columns must be integer.
- If the primary key is also used as a foreign key, the table name may be excluded to follow the consistent naming of columns throughout the database. Example: The PFA table primary key would be Person_ID instead of PFA_ID, since Person_ID is also a foreign key to the Person table.
- Foreign-key columns should also have the same name in both parent (master, reference, or lookup) and child (detail or data) tables, including the suffix '_ID'.
- There may be cases where a foreign-key column requires a meaningful prefix to the parent name.
  Example: The primary key State_ID would become the foreign key Address_State_ID.
- A column defined as a date, or used to store only the date, must be suffixed with '_Date'. Example: Activity_Date
- A column defined as a time must be suffixed with '_Time'. Example: Activity_Time
- A column defined as a datetime must be suffixed with '_DateTime'.
- A column defined as a timestamp must be suffixed with '_TimeStamp'.
- An indicator column must be suffixed with '_IND'. The preferred method is using values 'Y' for Yes and 'N' for No, and the column datatype must be CHAR(1). Other values such as '0' for False and '1' for True with a BIT or CHAR datatype may be used in justifiable cases. Example: Active_IND CHAR(1)

2.6 VIEWS
Views are customized presentations of data in one or more tables or other views. A view can also be considered a stored query. Views do not actually contain data. Rather, they derive their data from the tables on which they are based.
- The view name must be prefixed with 'v_'.
- If the view comes from a single table, then the name must describe the contents of the table. Example: v_Address
- If the view comes from more than one table, then the name must describe the combined name of the underlying tables. Example: v_Customer_Address

2.7 APPLICATION LOGINS
An application login is an account that provides the application permissions to access the SQL Server databases. These conventions only apply to SQL authentication accounts.
- The application owner login name must be prefixed with the database name “XXApplicationName”, where XX is the agency two-digit code.
- The application owner login name should be suffixed with its database purpose. This allows for multiple logins created for an application. Example: SPLTCuser, SPLTCAdmin
- The login name must not include spaces or special characters (such as -, $, #, +, etc.), including the underscore “_”.
- The report login should be suffixed with _Report (the only exception to the underscore “_” rule).
  Example: SPLTC_Report

3 Database Standard Data Elements & Abbreviations

3.1 STANDARD DATA ELEMENTS
Standard data elements are created to achieve consistency of names, datatypes, lengths, etc. throughout the agency's databases. This will benefit the sharing of data between agencies and applications. To set a standard, all database elements and PSP's data dictionary are researched to determine the most used or most accurate element name and attributes.
- Standard elements take precedence over using any abbreviation.
- Prefixes can be added to the standard element. Ex: FIRST_NAME could become CLIENT_FIRST_NAME, CONTACT_FIRST_NAME, etc.

3.2 ABBREVIATIONS
The PSP Data Administration owns and maintains the PSP Database Naming Standards Abbreviations list. Please contact the Data Administrator within the Database team for access to the list. General guidelines are as follows:
- Abbreviations should be avoided whenever possible but, if necessary, they must conform to the standard abbreviations as set forth in the PSP Database Naming Standards Abbreviations list. An attempt shall be made to keep this list to a minimum.
- Abbreviations should not be used for words with six or fewer characters.
- If words over six characters are used and an abbreviation is required, consult the PSP Database Naming Standards Abbreviations list.
- If an abbreviation does not appear in the list, spell the word out and the PSP Data Administrator will assign the official abbreviation and include it in the list.
- Abbreviations and acronyms that are unique to the agency must be provided to the PSP Database team. The PSP Database Naming Standards Abbreviations list only contains common abbreviations to be used across the Commonwealth Enterprise.
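
As an illustration of how the conventions in section 2 fit together, the following T-SQL is a minimal sketch and not part of the standard itself. It assumes a hypothetical application named Sample for agency code SP. The schema SPSampledbo, both tables, the view, and the columns State_Code, State_Name and Street_Name are invented for the example; only the suffixes, prefixes and audit columns come from the text above. GO is the batch separator understood by tools such as sqlcmd and SSMS.

```sql
-- Schema named in the "XXApplicationNamedbo" format (section 2.2); hypothetical application.
CREATE SCHEMA SPSampledbo;
GO

-- Lookup table: singular name with the '_Lkp' suffix and logical-delete columns (section 2.4).
-- The primary key drops the '_Lkp' suffix, adds '_ID', and is the first column.
CREATE TABLE SPSampledbo.State_Lkp (
    State_ID     INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    State_Code   CHAR(2)           NOT NULL,
    State_Name   VARCHAR(50)       NOT NULL,
    Obsolete     BIT               NOT NULL DEFAULT 0,
    Update_By    VARCHAR(30)       NULL,
    Update_Date  SMALLDATETIME     NULL
);
GO

-- Volatile table: carries the four audit columns listed in section 2.4.
-- The foreign key keeps the parent column name State_ID (section 2.5).
CREATE TABLE SPSampledbo.Address (
    Address_ID    INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Street_Name   VARCHAR(100)      NOT NULL,
    State_ID      INT               NOT NULL
        REFERENCES SPSampledbo.State_Lkp (State_ID),
    Active_IND    CHAR(1)           NOT NULL DEFAULT 'Y'
        CHECK (Active_IND IN ('Y', 'N')),
    Entered_By    VARCHAR(30)       NOT NULL DEFAULT SUSER_SNAME(),
    Entered_Date  SMALLDATETIME     NOT NULL DEFAULT GETDATE(),
    Update_By     VARCHAR(30)       NULL,
    Update_Date   SMALLDATETIME     NULL
);
GO

-- Single-table view: 'v_' prefix plus a name describing the table contents (section 2.6).
CREATE VIEW SPSampledbo.v_Address
AS
SELECT Address_ID, Street_Name, State_ID
FROM SPSampledbo.Address
WHERE Active_IND = 'Y';
GO
```

The defaults and the CHECK constraint are left unnamed here because the sections above do not state a naming rule for constraints.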

4 Appendix A – Reserved Words
Microsoft SQL Server uses reserved keywords for defining, manipulating, and accessing databases. Reserved keywords are part of the grammar of the Transact-SQL language used by SQL Server to parse and understand Transact-SQL statements and batches. Although it is syntactically possible to use SQL Server reserved keywords as identifiers and object names in Transact-SQL scripts, this can be done only by using delimited identifiers.

The following Microsoft links list the SQL Server reserved keywords.
- SQL Server 2008 R2 - http://msdn.microsoft.com/en-us/library/ms189822(v=sql.105).aspx
- SQL Server 2012 - http://msdn.microsoft.com/en-us/library/ms189822.aspx

5 Appendix B – Resources & References
For additional standards information, please reference the documents listed below, which are available on the Bureau of Information Technology Database Sharepoint site:
- Database Design Data Modeling Standards defines the logical and physical design of a database, listing of data model components, and data modeling naming conventions.
- Database Design SQL Server Database Naming Standards defines naming standards, construction, abbreviations, and rules for database objects.
- Database Design SQL Server Stored Procedure Standards defines naming standards and examples of how stored procedures should be coded and documented. A code review checklist is also provided.
- Database Design SQL Server SSIS Standards defines roles and responsibilities, naming conventions, standards, and best practices for SSIS development.
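
To illustrate the Appendix A point about delimited identifiers, the short T-SQL sketch below contrasts a table whose names only parse when bracketed with one that avoids reserved words. All table and column names are hypothetical, and schema qualification is omitted for brevity.

```sql
-- Not compliant with section 2.1: ORDER and GROUP are T-SQL reserved keywords,
-- so this statement only parses because both names are bracketed.
CREATE TABLE [Order] (
    [Group] VARCHAR(30) NULL
);

-- Compliant alternative (hypothetical names): no reserved words, so no delimiters are needed.
CREATE TABLE Purchase_Order (
    Purchase_Order_ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Group_Name        VARCHAR(30)       NULL
);
```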

PA State Police (PSP)
Database Design SQL Server Stored Procedures Standards
Database Team
Version 1.4
Date: 03/14/2014

SECURITY WARNING
The information contained herein is proprietary to the Commonwealth of Pennsylvania and must not be disclosed to unauthorized personnel. The recipient of this document, by its retention and use, agrees to protect the information contained herein. Readers are advised that this document may be subject to the terms of a non-disclosure agreement.
DO NOT DISCLOSE ANY OF THIS INFORMATION WITHOUT OBTAINING PERMISSION FROM THE MANAGEMENT RESPONSIBLE FOR THIS DOCUMENT.

Version History

Date        Version  Modified By / Approved By  Section(s)  Comment
8/11/2009   1.0      M. Forbes                  All         Initial Version
05/05/2010  1.1      S. Greer / C. Reber        2 / 3       Update links to point to web page instead of individual documents
06/17/2011  1.2      S. Greer                   4.1, 5      Added programmability object paragraph. Added TFS as a code repository.
11/15/2011  1.3      C. Reber                   All         Change ESF to EDC
03/14/2014  1.4      S. Greer                   All         Change EDC to PSP

Table of Contents

1 INTRODUCTION
  1.1 PURPOSE
  1.2 SQL SERVER STORED PROCEDURE OVERVIEW
2 ROLES AND RESPONSIBILITIES
  2.1 EDC DBA ROLE AND RESPONSIBILITY
  2.2 AGENCY ROLE AND RESPONSIBILITY
3 STORED PROCEDURE NAMING CONVENTIONS
4 DESIGNING & CODING STORED PROCEDURES
  4.1 STORED PROCEDURE HEADER STRUCTURE
  4.2 CODING STORED PROCEDURES
5 APPENDIX A – SQL SERVER CODE REVIEW CHECKLIST

1 Introduction

1.1 PURPOSE
The Pennsylvania State Police (PSP) Bureau of Information Technology Database team provides Database Design Services for applications that require database development. This document describes the stored procedure standards that will be applied for Database Design and focuses on defining the naming conventions for stored procedures. Examples of how stored procedures are to be coded and documented are provided. A code review checklist is defined in Appendix A to be used as a guideline for coding stored procedures.

1.2 SQL SERVER STORED PROCEDURE OVERVIEW
Stored procedures are server level procedures that are called within application programs. They are a group of Transact-SQL statements compiled into a single execution plan. If stored procedures are utilized, they must follow the standards set within this document.

Many activities in SQL Server are performed through system stored procedures. System stored procedures are created and stored in the database and have the “sp_” prefix.

Developers will be granted the ability to create stored procedures within the development environment.
In all client applications, stored procedures should be used rather than sending T-SQL commands directly to the SQL Server. The DBA will be responsible for the migration of the stored procedures to all other environments. When coding applications, the use of stored procedures to separate application logic from database code has been found beneficial. The advantages include:
- Eased Maintenance – When supporting database structure or application logic changes, the DML is easily accessible directly from the database and can be easily scanned using SQL. As such, it is recommended that all web-based applications use stored procedures for the majority of, if not all, database access.
- Enhanced Performance – Stored procedures run directly on the database server, which can be optimally tuned to perform such operations. SQL Server caches the execution plan of a stored procedure, so the procedure does not need to be recompiled each time it is executed.
- Reusability – Stored procedures can be called from multiple applications, thereby reducing the time required to design, code and test commonly used application functions.
- Enhanced Security – The application users are restricted to executing only the specified stored procedures within the database. There is no need to grant direct table access. Values passed as stored procedure parameters are treated as data rather than as executable SQL code, which helps to reduce SQL injection risks as long as the procedure does not concatenate them into dynamic SQL.

2 Roles and Responsibilities
EDC Database Design requires the agency to complete and submit the EDC Database Design Process Requirements document (available on the EDC Database Design Services website page) for complex stored procedures. This document includes an overview of the stored procedure tasks, and the requirements associated with those tasks, for stored procedures requiring EDC DBA design.
Database Design Services (DDS) will collaborate with the developer and provide the design for complex stored procedures by updating the Database Design Process Requirements Document with a recommended solution on how to develop the stored procedure.

2.1 DBA ROLE AND RESPONSIBILITY
- Support the development of stored procedures with the Developer
- Review the Database Design Process Requirements Document of a complex stored procedure
- Collaborate on design for a complex stored procedure and provide recommended solution to the Developer (Deliverable)
- Ensure best practices (such as proper error-checking or error-raising from database to application) are followed
- Test all stored procedures to ensure best optimization choice
- Deploy the stored procedures

2.2 DEVELOPER ROLE AND RESPONSIBILITY
- Complete and submit the Database Design Process Requirements Document Template for complex stored procedures to the DBA (Deliverable)
- Collaborate on design for complex stored procedures
- Write the stored procedures
- Test and tune stored procedures to ensure best optimization choice
- Submit stored procedure to DBA which contains the header section and process logic as defined in Section 4.

3 Stored Procedure Naming Conventions
Naming conventions provide a standard approach to naming different objects and help to troubleshoot and locate objects. Stored procedures should be named according to the business function they perform or business rule they enforce. The following naming conventions must be adhered to for SQL Server stored procedures.
- A user-defined stored procedure name must include a prefix of ‘USP_’ and a business description of the action performed.
- A user-defined extended stored procedure name must include a prefix of ‘UXP_’ and a business description of the action performed.
- SP_ and XP_ are not to be used as the prefix of user-defined stored procedures and extended stored procedures.
- The stored procedure name should be appropriately abbreviated, as defined in the abbreviations list, which is available from PSP Data Administration. An example of a procedure name would be USP_Insert_New_Bldg. As the name implies, this procedure would insert a new Building row, where Bldg is the abbreviation for Building. For more details, reference the Database Design SQL Server Database Naming Standards document (available on the Bureau of Information Technology Database SharePoint site).
- The stored procedure name must contain a verb (such as Select, Insert, Update, Delete or Combine) and the description that best describes the work the stored procedure will perform.
- The stored procedure name should have a maximum length of 30 characters.

4 Designing & Coding Stored Procedures
This section provides examples of the standards that should be followed when designing a stored procedure. A code review checklist is also provided as a guideline to help developers ensure that standards are being met as they create or alter stored procedures. See Appendix A for the checklist.

4.1 STORED PROCEDURE HEADER STRUCTURE
All programmability objects should contain the header section described below. Use Properties > Extended Properties for the object description in place of the header section when a header is not available for the object (e.g. User-Defined Table Types).

Header section
This is to be placed at the beginning of the stored procedure.

/***********************************************************************
* Database: (name of database being used for the procedure goes here.)
* Procedure Name:
* Date:
* Author:
* Procedure Description:
*
*
* Parameters:
* Returns:
*
*
*
*
*
* Calling Mechanism: (How is it being called: DTS, Stored Procedure etc.)
* Tables / Alias Definitions: (what tables are being accessed)
*
*
* Procedures Used: (if any other stored procedures are called within this procedure.)
*
* Notes:
*
* Special Comments/Warnings:
*
************************************************************************
* Version: (current version number)
* Date:                Author: (current version author)
*************************************************************************
* Description of Modifications:
* 1. See { } (what is modified)
* Special Comments:
*
* Other modules changed with this request:
*
***********************************************************************/

4.2 CODING STORED PROCEDURES
The following is an example of how a new procedure is to be coded and documented. To help prevent deadlocks, every stored procedure that references a given set of tables should access those tables in the same order.

    CREATE PROCEDURE [schema name].usp_test
    AS
    BEGIN
        SET NOCOUNT ON /* this should be the first line in code */

        /***********************************************************************
        ** This area describes what the SQL statement is doing.
        ** DROP TEMP TABLE USED TO HOLD VALUES
        ***********************************************************************/
        DROP TABLE ##logfiles
    END
    GO

5 Appendix A – SQL Server Code Review Checklist

ITEM YES NO COMMENTS
1. CODE IS IN SOURCE SAFE or TFS X
2. PROPER HEADER COMMENT X See Section 4
3. PROPER MAINTENANCE COMMENT X See Section 4
4. CURSORS USED
5. PROPER INDENTING OF FIELDS
6. USE OF =* OR *= X
7. USE OF SELECT INTO X
8. PROPER COLUMN WIDTH(80) X
9. PROPER OVERALL LENGTH OF OBJECT(4 PAGE MAX) X X X
10. USE OF * X
11. PROPER USE OF COMPOUND STATEMENTS X
12. PROPER FORMATTING OF FIELDS X
13. VERBS ARE DIFFERENT CASE FROM VARIABLES X
14. OBJECT COMPILES X
15. PROPER USE OF ERROR HANDLING X
16. NUMBER OF NESTED IF’S DOES NOT EXCEED 4 X
17. SPEED AND PERFORMANCE OF OBJECT IS ACCEPTABLE X
18. OBJECT HAS ONE PURPOSE X
19. OBJECT CAN BE BROKEN DOWN INTO SMALLER PARTS EXECUTION PLAN ANALYZED X
20. T-SQL VERBS ARE SPELLED OUT X
21. DEAD OR UNUSED CODE REMOVED X
22. LEFT JOINS CAN BE REWRITTEN AS INNER JOINS EX. INNER JOIN VS INNER X
23. ORDER OF JOINS RETURNS MINIMAL ROWS NEEDED X
24. FIELDS LISTED ON INSERT X
25. DYNAMIC SQL X
26. TABLE VARIABLES OR VIEWS INSTEAD OF TEMP TABLES X
27. TABLE & INDEX SEEKS NOT SCANS X
28. TEXT DATA TYPE IF LESS THAN 8 KB USE VARCHAR(8000) X
Total number of lines (excluding comment lines): 45
EXECUTION PLAN ANALYZED


PA State Police (PSP)
Database Design
SQL Server Integration Services (SSIS) Standards
Database Team
Version 1.7
Date: 03/14/2014

SECURITY WARNING
The information contained herein is proprietary to the Commonwealth of Pennsylvania and must not be disclosed to un-authorized personnel. The recipient of this document, by its retention and use, agrees to protect the information contained herein. Readers are advised that this document may be subject to the terms of a non-disclosure agreement. DO NOT DISCLOSE ANY OF THIS INFORMATION WITHOUT OBTAINING PERMISSION FROM THE MANAGEMENT RESPONSIBLE FOR THIS DOCUMENT.

COMMONWEALTH OF PENNSYLVANIA
PENNSYLVANIA STATE POLICE

Version History

Date | Version | Modified By / Approved By | Section(s) | Comment
8/11/2009 | 1.0 | C.Hoffman | All | Initial Version
9/24/2009 | 1.1 | C.Hoffman | 3.3 | New item added
4/23/2010 | 1.2 | C.Hoffman | 4.1, 4.2, 4.3, 5.1, 5.2, 3.1 | SSIS File Structure, Error Log file location. Added link to naming standards.
05/05/2010 | 1.3 | S. Greer / C. Reber | 2.0 / 3.1 | Update links to point to web page instead of an individual document. (2.0 intro / 3.1 – bullet 3 – rework verbiage)
11/15/2011 | 1.4 | C. Reber | All | Change ESF to EDC
08/10/2012 | 1.5 | S. Greer | 3.3, 4.1, 4.2 | FTP guidelines added
03/04/2013 | 1.6 | S. Greer | 3.3 | FTP guidelines updated
03/ | 1.7 | S. Greer | All | Change EDC to PSP

SQL SERVER INTEGRATION SERVICES (SSIS) STANDARDS PAGE 2 OF 15

Table of Contents
1 INTRODUCTION
1.1 PURPOSE
1.2 SQL SERVER INTEGRATION SERVICES (SSIS) OVERVIEW
2 ROLES AND RESPONSIBILITIES
2.1 EDC DBA ROLE AND RESPONSIBILITY
2.2 AGENCY ROLE AND RESPONSIBILITY
3 SSIS NAMING AND CODING CONVENTIONS
3.1 SSIS PACKAGE NAMING CONVENTIONS
3.2 SSIS PACKAGE OBJECT NAMES AND DESCRIPTIONS
3.3 DESIGNING AND CODING SSIS PACKAGES
4 SSIS PACKAGE DEVELOPMENT STANDARDS
4.1 SSIS FILE STRUCTURE ON DATABASE SERVER
4.2 SSIS FILE STRUCTURE ON FILE SERVER
4.3 SSIS PACKAGE ERRORS AND LOGGING
4.3.1 Package Error Log
4.3.2 Transform Data Task Exception Log
4.4 SSIS PACKAGE DEPLOYMENT
5 SSIS PACKAGE TROUBLESHOOTING
5.1 REVIEWING THE PACKAGE LOGS
5.2 REVIEWING TRANSFORM DATA TASK EXCEPTION FILE

1 Introduction

1.1 PURPOSE
The Pennsylvania State Police (PSP) Bureau of Information Technology Database team provides Database Design Services (DDS) for developing applications that require database development. This document defines the Roles and Responsibilities, Naming Conventions, Standards, and Best Practices for SSIS development provided by the Database team.

1.2 SQL SERVER INTEGRATION SERVICES (SSIS) OVERVIEW
SQL Server Integration Services (SSIS) is a component of SQL Server. In SSIS, the Data Flow Engine is segregated from the Control Flow Engine (the SSIS Runtime Engine), which improves performance. SSIS is ‘basically’ an Extract, Transform, and Load (ETL) tool. ETL describes the processes that take place in data warehousing environments for extracting data from source transaction systems; transforming, cleaning and conforming the data; and finally loading it into cubes or other analysis destinations.
SSIS is a platform for building high performance data integration and workflow solutions. SSIS packages are made up of tasks that can move data from source to destination and, if necessary, alter it on the way. Integration Services provides a full-featured Control Flow for performing work that is tangentially related to the actual processing that happens in the data flow, including downloading and renaming files, dropping and creating tables, rebuilding indexes, performing backups, and any number of other tasks. The Data Flow Task is a high-performance tool because you can use it to perform complex data transformations on very large datasets without hindering the processing performance. The pipeline concept means that you can process data from multiple heterogeneous data sources, through multiple parallel or sequential transformations, into multiple heterogeneous data destinations, making it possible to process data found in differing formats and on differing media in one common "sandbox" location.

2 Roles and Responsibilities
PSP Database Design requires the developer to complete and submit the Database Design Process Requirements document (available on the Bureau of Information Technology Database SharePoint site) for all SSIS packages. This document includes an overview of tasks and the requirements associated with those tasks. Database Design Services (DDS) will collaborate with the developer and provide the design for complex packages by updating the Database Design Process Requirements Document with recommended solutions on how to develop the SSIS package. DDS will determine if a package is simple or complex based on the requirements. Simple packages will be designed by the developer and noted in the Database Design Process Requirements Document solution as “Designed by Developer”.

2.1 DBA ROLE AND RESPONSIBILITY
- Support the development of SSIS packages with the Developer
- Review the Database Design Process Requirements Document of the SSIS package
- Collaborate on design for complex SSIS packages
- Provide the Database Design Process Requirements Document with recommended solutions to the Developer (Deliverable)
- Ensure the SSIS Package conforms to the SSIS Standards Document
- Set up and test the SSIS Package job schedule using SQL Server Agent
- Deploy SSIS packages with the Package Installation Wizard on the Development and User Acceptance Testing environments
- Test packages prior to implementing them on the production server so that there is no negative impact to the server

2.2 DEVELOPER ROLE AND RESPONSIBILITY
- Prior to coding, complete and submit the Database Design Process Requirements Document Template of the SSIS package to the DBA (Deliverable)
- Collaborate on design for complex SSIS packages
- Design and code SSIS packages with Visual Studio
- Test SSIS Packages by using Visual Studio
- Create package configuration files (manifest, Config, DTSX) and provide to DBA
- Notify the System Administrator of the appropriate rights to all the data sources for the SSIS package

3 SSIS Naming and Coding Conventions

3.1 SSIS PACKAGE NAMING CONVENTIONS
SSIS Packages should be named according to the business function they perform. Naming conventions are as follows:

Example:
    XXApplicationName_Personnel_Load
    XXApplicationName_Transfer_Web_Appl
    XXApplicationName_Data_Load_v2

- The package name will include a prefix of ‘XXApplicationName’ and a business description of the action performed. The “XXApplicationName” prefix will be the same name as the database, where XX is the agency two-digit code. The main purpose for this prefix is to keep all Agency database related packages grouped together.
- The package name should be appropriately abbreviated, as defined in the abbreviation list, which is available from PSP Data Administration. For more details, reference the PSP Database Design SQL Server Database Naming Standards document (available on the Bureau of Information Technology Database SharePoint site).
- If developers are working on multiple versions of the package, any additional versions will be suffixed with a lower case “v” followed by the version number. The first time a version number is assigned, it will start with the number 2. The first version of the package will not contain a version number.
- Use an underscore (_) for blank spaces.
- The Package name should be under 100 characters.

3.2 SSIS PACKAGE OBJECT NAMES AND DESCRIPTIONS
1. Rename all Name and Description properties from the default for all SSIS Objects. The default description should be changed to a short but suitable name for the object. This will greatly aid personnel in troubleshooting package errors. If the default name is not known, then do the following:
   - Open the SSIS package.
   - Right click on the object and select Properties.

Figure 1: Locating Default Names for Package Objects

2. Use acronyms at the start of the name to better identify the object. The following acronyms should be used at the beginning of the names of tasks to identify the type of task.

Task | Prefix
For Loop Container | FLC
Foreach Loop Container | FELC
Sequence Container | SEQC
ActiveX Script | AXS
Analysis Services Execute DDL | ASE
Analysis Services Processing | ASP
Bulk Insert | BLK
Data Flow | DFT
Data Mining Query | DMQ
Execute DTS 2000 Package | EDPT
Execute Package | EPT
Execute SQL | SQL
File System | FSYS
FTP | FTP
Message Queue | MSMQ
Script | SCR
Send Mail | SMT
Transfer Database | TDB
Transfer Error Messages | TEM
Transfer Jobs | TJT
Transfer Logins | TLT
Transfer Master Stored Procedures | TSP
Transfer SQL Server Objects | TSO
Web Service | WST
WMI Data Reader | WMID
WMI Event Watcher | WMIE
XML | XML

Figure 2: Task Acronyms

The following acronyms should be used at the beginning of the names of components to identify the type of component.

Component | Prefix
DataReader Source | DR_SRC
Excel Source | EX_SRC
Flat File Source | FF_SRC
OLE DB Source | OLE_SRC
Raw File Source | RF_SRC
XML Source | XML_SRC
Aggregate | AGG
Audit | AUD
Character Map | CHM
Conditional Split | CSPL
Copy Column | CPYC
Data Conversion | DCNV
Data Mining Query | DMQ
Derived Column | DER
Export Column | EXPC
Fuzzy Grouping | FZG
Fuzzy Lookup | FZL
Import Column | IMPC
Lookup | LKP
Merge | MRG
Merge Join | MRGJ
Multicast | MLT
OLE DB Command | CMD
Percentage Sampling | PSMP
Pivot | PVT
Row Count | CNT
Row Sampling | RSMP
Script Component | SCR
Slowly Changing Dimension | SCD
Sort | SRT
Term Extraction | TEX
Term Lookup | TEL
Union All | ALL
Unpivot | UPVT
Data Mining Model Training | DMMT_DST
DataReader Destination | DR_DST
Dimension Processing | DP_DST
Excel Destination | EX_DST
Flat File Destination | FF_DST
OLE DB Destination | OLE_DST
Partition Processing | PP_DST
Raw File Destination | RF_DST
Recordset Destination | RS_DST
SQL Server Destination | SS_DST
SQL Server Mobile Destination | SSM_DST

Figure 3: Component Acronyms

3.3 DESIGNING AND CODING SSIS PACKAGES
The following should be followed when designing and coding SSIS packages.
1. All SSIS packages must contain a data-related task that either reads data from the database, extracts data from the database, transforms data, or loads data into the database. SSIS Packages should not be used entirely for file manipulation, FTP, email, etc., since these processes can be done within a .NET Application.
2. All File Paths and Connections should be set up as Variables and made a part of the Config file so that they can be changed within the Config file when moving the package from one environment to another.
3. All SSIS packages will be called from SQL Agent Jobs.
4. Keep all SSIS packages as small as possible for efficiency, readability, and for rerunning after fixing errors.
5. When using SSIS, use the native OLE DB provider instead of the ODBC provider when importing and exporting data, as it provides better performance.
6. If row-by-row data transformation is not needed during an SSIS import into SQL Server, the Bulk Insert task provides the fastest data loads into SQL Server.
7. SSIS lookups slow down performance. Use a Transact-SQL statement (a stored procedure is best) to perform the same function within the SSIS package. Avoid using global variables or COM objects for performing lookup type functions, as they are even slower than using an SSIS lookup.
8. The Data Pump Task is faster than a Data Driven Query within an SSIS package if there is a one-to-one mapping of the columns and no transformations are involved when moving data between tables. If there are transformations involved, then a Data Driven Query will offer better performance.
9. For transformations, setting an error threshold greater than 0 is NOT recommended unless the package is designed to only load new records. Otherwise, there will be duplicate records loaded when rerunning the package after a failure.
10. When calling a SQL Server stored procedure that will execute a BCP (Bulk Copy Process) command, the SSIS package needs to commit all transactions before calling the procedure, because a BCP command controls its own transactions and does not allow a calling procedure to handle them. Therefore, the step immediately before the BCP step must commit its transactions on successful completion.
11. Do not use ActiveX scripts because they will not be supported in future versions of SSIS. ActiveX scripts also do not migrate during migration from a DTS Package to an SSIS Package. They should be replaced with .NET scripts.
12. Only select columns that you need in the pipeline, to reduce buffer size and reduce OnWarning events at execution time. This avoids extra execution time overhead and in turn improves the overall performance of package execution.
13. Use Sequence Containers to organize package structure into logical units of work. This makes it easier to identify what the package does and helps to control transactions if they are being implemented.
14. Use caching in your LOOKUP components where possible. It makes them quicker. Watch that you are not grabbing too many resources when you do this though.
15. If you want to conditionally execute a task at runtime, use expressions on your precedence constraints. Do not use an expression on the “Disable” property of the task.
16. Don’t put all configurations into a single XML configuration file. Instead, put each configuration into a separate XML configuration file. This is a ‘modular’ approach and means that configuration files can be reused by different packages more easily.
17. Make sure to set the package protection level to ‘DontSaveSensitive’ to avoid package development errors when deploying to other environments. If you use the default package protection level, ‘EncryptSensitiveWithUserKey’, the same package may not execute as expected in other environments because the package was encrypted with your personal key.
18. While configuring any OLE DB connection manager as a source, avoid using ‘Table or view’ as the data access mode; this is similar to ‘Select * From’. Always try to use the ‘SQL Command’ data access mode and only include required column names in your SELECT T-SQL statement.
19. Sorting data is a time-consuming operation. In SSIS you can sort data coming from upstream using the ‘Sort’ transformation; however, this is a memory-intensive task and may degrade overall package execution performance. It’s better to perform the sorting operation at the database level, where sorting can be performed within the query. SQL Server database sorting is much more refined and happens at the SQL Server level, which results in an overall performance improvement in package execution.
20. Use the Flat File Connection Manager very carefully. Creating Flat File connections with default settings will use the string data type [DT_STR] as the default for all column values. This may not be the right option because you might have numeric, integer, or other typed values in your source; passing them as strings to downstream tasks would take unnecessary memory space and may cause errors in tasks that follow in the package execution.
21. Use the SSIS Checkpoint feature to restart failed packages from the point of failure.
22. Do not use the Execute Process Task.
23. The Execute SQL Task is our best friend in SSIS; it can be used to run a single SQL statement or multiple SQL statements at a time. The results can be returned as a single row, a full result set, or XML.
24. Files created on, or transferred by FTP to, the database server are to be deleted when the package is done processing the file. If multiple packages need to process the same file, the last package to process the file should delete the file.
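
Items 7, 18, and 19 above can be illustrated together with a single source query. The following is only a sketch under stated assumptions: the tables dbo.Personnel and dbo.Station and every column name are hypothetical and are not objects defined by this standard. It assumes the statement is used in an OLE DB Source whose data access mode is set to ‘SQL Command’.

```sql
-- Sketch only: dbo.Personnel, dbo.Station and all column names are hypothetical.
SELECT
    p.Personnel_Id,          -- item 18: list only the columns the data flow needs
    p.Last_Nm,
    p.First_Nm,
    s.Station_Nm             -- item 7: resolve the lookup with a join in T-SQL
FROM dbo.Personnel AS p
INNER JOIN dbo.Station AS s
    ON s.Station_Id = p.Station_Id
ORDER BY p.Personnel_Id;     -- item 19: sort in the database, not in a Sort transformation
```

Note that an INNER JOIN silently drops rows that have no matching station, so a LEFT JOIN may be the better fit when unmatched rows must still reach the destination. If a downstream component such as Merge Join depends on the ordering, the IsSorted property of the source output and the SortKeyPosition of the sort column must also be set in the Advanced Editor, because SSIS does not infer sort order from the ORDER BY clause.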

4 SSIS Package Development Standards

4.1 SSIS FILE STRUCTURE ON DATABASE SERVER
The SSIS File structure shall include the following folders:
1. A Main Folder named “SSIS”.
2. A subfolder named “FTPFiles”. This is used for files that are transferred by FTP to or from the database server. Once the file is processed it should be deleted.
3. A subfolder name that follows the format “XXApplicationName”, where XX is the agency two-digit code.
4. Under each application subfolder, the following Subfolder names:

Subfolder | Purpose
Archive_Files | Used for file archiving SSIS packages
Logs | Used for any SSIS package Transform Data Tasks exception files
Packages | Used to store the package deployment files
PKGConfig | Used to store the Package Configuration file, which allows us to make SSIS packages portable so they can easily be migrated from one environment to another

4.2 SSIS FILE STRUCTURE ON FILE SERVER
1. A Share folder name that follows the format “XXApplicationName”, where XX is the agency two-digit code.
2. The following Subfolder names:

Subfolder | Purpose
Logs | Used for the package error file.
Data_Files | Used for storing SSIS data files

4.3 SSIS PACKAGE ERRORS AND LOGGING
Because you can’t foresee all the conditions that will ultimately occur in a production environment, SQL Server Integration Services (SSIS) includes logging features that write log entries when run-time events occur and can write custom messages to display information about a package after it’s been executed. The challenge is to log enough information to help you quickly diagnose and minimize the impact of problems that might occur in production. Integration Services supports a diverse set of log providers, and gives you the ability to create custom log providers. The Integration Services log providers can write log entries to text files, SQL Server Profiler, SQL Server, Windows Event Log, or XML files.
Logs are associated with packages and are configured at the package level. Each task or container in a package can log information to any package log.

4.3.1 Package Error Log
1. To add logging to a package, click Logging on the SSIS menu and select one or more error log providers, which enable you to write to a target destination. Select the check box next to the events that you want to log, then click the Details tab and specify the events you want to log. Next, click Advanced to specify the columns to be logged; otherwise all the columns will be logged. Depending on the events you choose to log, the error log can grow rapidly. Be sure to log only the events that you need and occasionally prune old log entries. Because SSIS doesn’t include a process to do this out of the box, you must manually prune logs if your error log provider doesn’t provide the functionality to do so.
2. The package error path will be located in the Logs subfolder of the Application folder structure on the file server.
3. Name the file according to the SSIS package name.
4. The package log file will be suffixed with “_Error.txt”.

4.3.2 Transform Data Task Exception Log
1. To record errors in Data Transformation, create an Exception File to log all errors in the transformation. The exception file path will be located in the Logs subfolder of the Application folder structure on the file server. Name the file according to the SSIS package name.
2. The exception log files will be suffixed with “_Exception.txt”.
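
As an illustration of the file naming rules in 4.3.1 and 4.3.2, the T-SQL sketch below builds both log file names for one package. The share path \\FileServer\XXApplicationName is a hypothetical placeholder, and the package name is borrowed from the example names in section 3.1; the real values depend on the environment.

```sql
-- Sketch only: the share root is a hypothetical placeholder, not a real server.
DECLARE @ShareRoot   nvarchar(260) = N'\\FileServer\XXApplicationName';
DECLARE @PackageName nvarchar(100) = N'XXApplicationName_Personnel_Load';

SELECT
    -- 4.3.1: package log, named after the package with the _Error.txt suffix
    @ShareRoot + N'\Logs\' + @PackageName + N'_Error.txt'     AS PackageErrorLog,
    -- 4.3.2: transformation exception log with the _Exception.txt suffix
    @ShareRoot + N'\Logs\' + @PackageName + N'_Exception.txt' AS TransformExceptionLog;
```

In a package these paths would normally be held in variables that are exposed through the Config file, in line with item 2 of section 3.3, rather than being hard-coded.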

Figure 4: SSIS Process to save errors to an Exception File

Figure 5: Redirect Error Row Output

4.4 SSIS PACKAGE DEPLOYMENT
SSIS has a feature called "Package Configurations". Package configurations allow us to make SSIS packages portable so they can easily be migrated from one environment to another. Configuration Files contain the connection strings for various sources and destinations, package variables and expressions. We can modify these properties without even opening up the package. At run time, the SSIS engine looks for the Configuration File. If the file does not exist, then it takes the configuration/variable information contained in the SSIS package. If the file exists, then the information in the Configuration File is used. If any variable/configuration information is missing in the Configuration File, the value of that configuration parameter/variable from the SSIS package is used.
1. When your SSIS Package is complete, use the "Package Configuration Wizard" to create the package configurations, which will create a Config file, a Manifest file for deploying, and the actual SSIS Package file (DTSX).
2. Use the ‘XML configuration file’ option because it is more flexible.
3. Provide the Config, Manifest, and SSIS Package files to the EDC DBA who will deploy them to SQL Server Integration Services.

5 SSIS Package Troubleshooting
This section explains how to review a package log and exception files to assist with SSIS Package troubleshooting.

5.1 REVIEWING THE PACKAGE LOGS
The Error Log file is a running result of each step in an SSIS package. It will report a success or failure and the reason for the failure. This file will also contain a start and completed time for the package and for each step, including the total execution time, Package Name, description, ID, Version, server name where the package was executed, and who executed the package. The error file is located in the Logs subfolder of the Application folder structure on the file server.

5.2 REVIEWING TRANSFORM DATA TASK EXCEPTION FILE
The exception file is located in the Logs subfolder of the Application folder structure on the file server.