Siebel Global Deployments
Module 6: Global Deployments using Unicode with SQL Server
©Siebel Systems 2005 – Do not distribute or re-use without permission

Global Deployments using Unicode with SQL Server
• What is a “Global Deployment”
• Challenges
• Underlying technologies
• Siebel implementation
• How to get there

Agenda
• Overview
• Introduction to Code Pages
• Introduction to Unicode
• Unicode and Siebel
• Unicode Migrations

The Business Challenge
Global Organisations require Global Solutions
• View the User Interface in Multiple Languages
• Store data from many languages
• Display the data using regional preferences
• Different data for different regions
• Customers have a single view of the company, and the company has a single view of its customers

The Business Challenge
• Reduced cost of ownership
• Centralised IT infrastructure
• Centralised Data (single source)
• Development and testing only happen once
• Increased Customer Satisfaction

Agenda: Overview, Introduction to Code Pages, Introduction to Unicode, Unicode and Siebel, Unicode Migrations

Traditional Code Pages
• Tables that relate binary values to graphical characters and symbols
• Binary values are transmitted as single or multiple bytes
  - Western character sets are single-byte
  - Many Asian character sets are multi-byte
• Sometimes called “character sets” or “code sets”
  - The term "code" emphasizes the binary value aspect
  - The term "character" emphasizes the graphical representations (bitmaps or glyphs)

Code Page Examples
[Figure: code page 1252, Western European]

Code Pages and Supported Languages
All standard code pages support English.

Windows Code Page Number | Languages | Commonly Used Code Page Name
1252 | English, Albanian, Basque, Catalan, Afrikaans, Danish, Dutch, Finnish, French, German, Icelandic, Italian, Norwegian, Portuguese, Spanish, Swedish | 8859-1
1250 | English, Czech, German, Hungarian, Polish, Romanian, Slovak, Slovenian | 8859-2
1254 | Same as 1252, but with Turkish replacing Icelandic | 8859-9
1253 | English and Greek | 8859-7
1255 | English, Hebrew, Yiddish | 8859-8
1256 | English and Arabic | (none listed)
1251 | English and Russian | 8859-5
932 | English and Japanese | Shift-JIS
949 | English and Korean | KS C 5601
950 | English and Traditional Chinese | Big 5
936 | English and Simplified Chinese | GBK

Codepage Comparison: ASCII
[Figure: ASCII code chart]

Codepage Comparison: ANSI 1252 (Western European)
  - Windows 1252 Codepage, known as WE8MSWIN1252
  - Characters are allocated in the 80 - 9F area

Codepage Comparison: ISO 8859-1 (Western European)
  - Known as WE8ISO8859P1
  - Chars between 80 and 9F not allocated.
  - Characters A0 to FF are the same as in MS CP 1252
  - Except: There is no Euro codepoint!

Codepage Comparison: ISO 8859-15 (Western European)
  - Known as WE8ISO8859P15
  - Chars between 80 and 9F not allocated.
  - Characters A0 to FF are NOT the same as in MS CP 1252
  - 8 characters differ, all of which are in the ANSI 80 - 9F area.
[Figure labels: 80 8E 8C 8A 9A 8C 9C 9F]

Codepage Comparison: ANSI 1250 (Eastern European)
  - Chars between 80 and 9F are allocated.

Codepage Comparison: ISO 8859-2 (Eastern European)
  - Chars between 80 and 9F not allocated.
  - Characters between A0 and FF are NOT the same
  - There are 15 differences, 10 of which are in the ANSI 80 - 9F area.
[Figure labels: A5 BC 8C B9 BE 9E A1 8A 8D 8F 8E 9A 9D 9F 9E]

Character Conversion
[Diagram: the same byte stream looked up in a Western Europe code page table and in a Japan code page table]
• Byte Stream: ‘44 C4 46 C5’
• WE Output: ‘DÄFÅ’
• Japanese Output: ‘D聞?’

Agenda: Overview, Introduction to Code Pages, Introduction to Unicode, Unicode and Siebel, Unicode Migrations

Unicode Codepages
What is Unicode?
• Unicode is a codepage intended to support all languages.
• Unicode contains characters from all traditional codepages and satisfies the need for virtually all world languages.

Unicode Flavors
• UTF-16
  - Multi-double-byte encoding; uses one or two double-byte chunks to represent the available characters.
  - Supports Unicode Standard 2.0 and above
  - Supports Surrogate characters
  - Encoding standard used by Siebel internally for executables
  - Often synonymous with UCS2, but supports characters consisting of multiple 2-byte blocks
• UCS2
  - Double-byte encoding
  - Supports Unicode Standard 1.x
  - Official codepage name used in support statements by Microsoft for Microsoft SQL and Windows Server 2003 and 2000

Unicode - What it gives you
• Consolidation Possibilities
  - Customers can consolidate data that would previously have to be in separate Enterprises due to codepage restrictions.
  - Customers are not bound by codepage restrictions and can consolidate HW. Performance may still dictate distributed HW.
• Data Sharing
  - Customers can share data that could not be shared before.
• Easier Management
  - For Siebel deployments crossing languages and regions.

Unicode - What it doesn’t give you
Don’t expect Unicode to…
  - Be the savior of all Global Deployment problems. Unicode is only a codepage.
  - Do language translations. If my friend in Japan enters Japanese text, I don’t automatically see it in English here.
  - Remove the need to implement solid Business Practices for managing data in multiple languages. Do you want your Italian users to get contact information for Chinese contacts in Chinese? Should a customer in Japan be able to see products available only in Spain in Spanish?
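
Worked example (not part of the original deck): the Character Conversion and Unicode Flavors slides can be illustrated with a minimal Python sketch. It assumes Python's built-in `cp1252` and `cp932` codecs as stand-ins for the Western European and Japanese code pages; the helper names `decode_with` and `utf16_units` are hypothetical. The exact characters a Japanese decoder produces depend on the encoding chosen, so the sketch only shows that the two readings differ and does not try to reproduce the slide's Japanese output.

```python
# Byte stream taken from the "Character Conversion" slide.
BYTE_STREAM = bytes.fromhex("44C446C5")


def decode_with(code_page: str, data: bytes = BYTE_STREAM) -> str:
    """Interpret the same bytes under the given code page.

    Bytes that are invalid in that code page become U+FFFD
    instead of raising an exception.
    """
    return data.decode(code_page, errors="replace")


def utf16_units(text: str) -> int:
    """Return how many 2-byte code units UTF-16 needs to store text."""
    return len(text.encode("utf-16-le")) // 2


if __name__ == "__main__":
    # Same bytes, two code pages, two different strings.
    print("cp1252:", decode_with("cp1252"))
    print("cp932: ", decode_with("cp932"))

    # A character inside the Basic Multilingual Plane needs one unit;
    # a supplementary character (here U+1D11E) needs a surrogate pair.
    print("units for 'A':      ", utf16_units("A"))
    print("units for U+1D11E:  ", utf16_units("\U0001D11E"))
```

The second half shows why UTF-16 and UCS2 are not quite the same thing: a strict 2-bytes-per-character view miscounts any character that is stored as a surrogate pair.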
Agenda: Overview, Introduction to Code Pages, Introduction to Unicode, Unicode and Siebel, Unicode Migrations

Siebel 7.5 Global Unicode-enabled Deployment
[Architecture diagram. Components shown: Mobile Clients, Web Clients, Local DB, the Internet, Web Server / Web Engine, Gateway Server, Actuate Report Servers, Application Server, File System, User Data, Siebel Database, OM, SRFs / Locales, Siebel Tools Repository. Language labels shown: ENU, FRA, ESN, JPN, ENG, FRS.]

Siebel and Unicode
• Unicode support from Siebel 7.5 on
  - This presentation refers to Siebel 7.5 or later
• Supporting characters from more than one Codepage
  - One Siebel Enterprise -> One Database -> One Codepage
  - Requires Unicode Database
• Single Executable encoding
  - All internal processing within Siebel in Unicode
  - Data converted as necessary
• Siebel still supports non Unicode databases
  - Western European (1252)¹
¹ See Siebel Systems Requirements and Supported Platforms Guide on SupportWeb for the latest information

What does Siebel v7.5 with Unicode look like?
[Screenshot]

Locales
• From Siebel 7.5 one physical server can support multiple locale-specific data formats because Siebel locale settings are independent of those used by the Server OS.
• For Mobile Clients, the locale can be changed by the user through the Regional Settings/Options in Windows Control Panel.
[Figure labels: Web Clients: ENU, DEU, ESN]

Actuate
• The Actuate Reports Server is fully Unicode enabled
• This enables consolidated handling of all languages and locales on a single server with the language selection at runtime:
[Screenshot]

Configuration Considerations
• Data Segregation
  - Which data is supposed to be entered in local languages and which in corporate language.
• Is the following view acceptable to end users?
[Screenshot]

Integration: Character Conversion
[Diagram: character conversion between a UCS2 Unicode database and Western Europe and Cyrillic code pages, using the characters Ä Å Æ and Д E Ж; characters without an equivalent in the target code page appear as “?”. Caption: Data Intact.]

Integration: Character Conversion
[Diagram: same conversion as on the previous slide.]
• Solution: Updates not allowed. Un-Displayable Data is read-only

Siebel eBusiness Application Integration
• Integration
  - Systems with a multitude of code pages
  - Even with Unicode, company may have different ‘flavors’
  - External partners may have data in different code pages
  - Often the cause of data corruption when moving to Unicode.
• Solution
  - Investigate ALL interfaces whilst planning Unicode migration
  - Re-develop Interfaces and upgrade external systems/middleware as necessary to support Unicode
  - Use the Siebel Transcode Business Service to validate data before sending it and do not send if it cannot be stored.
  - TEST, TEST and TEST again!

Siebel eBusiness Application Integration
• Transcode Business Service
  - Business Service to convert data between codepages
• Two modes
  - Validate: only checks if a conversion could be performed without character conversion error
  - Convert: converts data between code pages
• Can be employed in Workflow Processes
• Need to implement error handling

Data Growth
• SQL Server uses different data types for Unicode and Non-Unicode data
• Unicode data types require more space than Non-Unicode.

Data | Fixed Length | Variable Length | Large | Bytes per character
Non-Unicode | char | varchar | text | 1
Unicode (UCS-2) | nchar | nvarchar | ntext | 2

• Could cause rows to exceed SQL Server page size (8KB)!
  - May need to move some columns to extension tables

Database Growth moving to Unicode

Language | National Code Page | UTF-16/UCS2 | Expansion
English | 1 byte | 2 bytes | ~70%
Western European | 1 byte | 2 bytes | ~70%
Eastern European | 1 byte | 2 bytes | ~70%
Asian | 2 bytes | 2 bytes | 0%

varchar/char/text data expansion only!
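
Worked example (not part of the original deck): the row-size risk described on the Data Growth slide can be estimated with a minimal Python sketch. The bytes-per-character values come from the slide's table. The `Column` class, the helper names and the example table are hypothetical, and the 8,060-byte in-row limit is an assumption taken from SQL Server's documented limit, since the slide only says 8KB. Row overhead and `text`/`ntext` columns are ignored, so treat the result as a rough upper bound on declared character data.

```python
from dataclasses import dataclass

# Bytes per character, from the Data Growth table.
BYTES_PER_CHAR = {"char": 1, "varchar": 1, "nchar": 2, "nvarchar": 2}
# Non-Unicode type -> Unicode type.
UNICODE_TYPE = {"char": "nchar", "varchar": "nvarchar"}
# Assumption: SQL Server's documented in-row limit (the slide says 8KB).
ROW_LIMIT_BYTES = 8060


@dataclass(frozen=True)
class Column:
    name: str
    sql_type: str
    length: int  # declared length in characters


def to_unicode(col: Column) -> Column:
    """Return the Unicode equivalent of a character column."""
    return Column(col.name, UNICODE_TYPE.get(col.sql_type, col.sql_type), col.length)


def max_row_bytes(cols) -> int:
    """Maximum bytes of character data one row can hold."""
    return sum(BYTES_PER_CHAR[c.sql_type] * c.length for c in cols)


def exceeds_limit(cols) -> bool:
    """True if the declared character data alone passes the row limit."""
    return max_row_bytes(cols) > ROW_LIMIT_BYTES


# Hypothetical table definition, for illustration only.
EXAMPLE_TABLE = [
    Column("NAME", "varchar", 100),
    Column("COMMENTS", "varchar", 4000),
    Column("ACTIVE_FLG", "char", 1),
]

if __name__ == "__main__":
    migrated = [to_unicode(c) for c in EXAMPLE_TABLE]
    print("before:", max_row_bytes(EXAMPLE_TABLE), exceeds_limit(EXAMPLE_TABLE))
    print("after: ", max_row_bytes(migrated), exceeds_limit(migrated))
```

Running a check like this against the real schema before migration would point to the tables whose columns may need to move to extension tables.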
Note: Database expansion will vary substantially with the profile of the data.

Collation Sequences / Sort Order
What is the Collation Sequence / Sort Order?
• The way in which characters are ordered differs between locales.
• A collation sequence is not the same as a sort order. A collation sequence combines a Unicode and a non-Unicode sort order together with a non-Unicode code page, so it has a greater impact than a simple sort order.
• A collation sequence can also affect unique keys, because it affects string comparisons.

SQL Server Collation Sequences/Sort Order
• Siebel Enterprise Database (Development System)
  - Binary
• Siebel Enterprise Database (Production System)
  - Binary (Recommended). This is also the only supported collation sequence for UCS2 (Unicode)
  - Dictionary case-sensitive and case-insensitive

Agenda: Overview, Introduction to Code Pages, Introduction to Unicode, Unicode and Siebel, Unicode Migrations

Moving to Unicode
• New Install:
  - Create Unicode database
  - Select Unicode as code page during install.
• Migration from standard codepage
  - Unicode migration is not a simple matter of altering the codepage and REQUIRES the assistance of Siebel Expert Services to protect your data from corruption during the migration process. The length of this engagement will vary depending on the complexity of the environment and the size of the database.
  - Get details from TAM or Practice Manager

Preparation (Source database)
• Run DBCHCK to validate Siebel Schema matches repository
• Check for conversion errors
• Delete inactive repositories
• If Development Environment
  - Check In ALL Projects
• Create “awkward” test records
  - Use unusual characters
• Backup the source database

Preparation (Target database)
• Create new database with Code Page UCS-2
  - Should be in same SQL server instance as source database
• Increase storage space
  - Ensure large space for tempdb
• Run grantusr.sql to create default users and roles
• Migrate all users (not created during migration)

SQL Server – Unicode Migration (migrate.bat)
[Diagram: CP (Source) database with char / varchar columns -> schema.ddl -> UNI (Target) database with nchar / nvarchar columns, loaded through insert.sql]
1. ddldict - Generate schema.ddl with schema definitions from source (CP) database
2. ddlimp /Z Y - Apply schema.ddl to target (UNI) database through ddlimp with /Z Y option (converts char to nchar, varchar to nvarchar and text to ntext)
3. genload - Creates insert.sql file for migrating data from source to target.
4. insert.sql - Runs SQL statements loading the data.
     insert into target..table (col...) select col... from source..table

Preparation of ‘migrate.bat’
• Update environment specific variables
  - Users, Passwords, ODBC DSNs, etc
• Ensure ‘ddldict’ call includes ‘/A Y /T DCIER’
  - Not included in all versions of ‘migrate.bat’

Steps in ‘migrate.bat’ script
1. Create schema definition (ddldict)
2. Create physical schema for new database (ddlimp)
3. Create sequence (SQL)
4. Create clustered indexes (SQL)
5. Create scripts to migrate data (genload)
6. Run script to migrate data (insert.sql)
     insert into dbo.S_ETL_CTRYREGN (COUNTRY, REGION) select COUNTRY, REGION from <sourcedb>..S_ETL_CTRYREGN;

Steps in ‘migrate.bat’ script (contd.)
7. Create non-clustered indexes (ddlimp)
8. Update data type fields in Siebel tables (SQL)
     System Preference 'Enterprise DB Server Code Page' = 'utf-16'
     S_APP_VER.UNICD_DATATYPS_FLG = 'Y'
9. Create views (SQL)

Post Unicode Migration Tests
• Check migrated data
  - Dump out binary values of “awkward” data & check values are correct
  - View records for “awkward” data
• Check Unicode Database has sufficient spare space
• Re-run ‘dbchck’
• Ensure that Unicode characters can be entered and displayed correctly

Post Unicode Migration Tests
• Test connectivity via a Siebel Web Client.
• Test connectivity via a Siebel Tools Client.
• Generate a new database template.
• Create a new database extract.
• Initialize a new Tools local database and perform a GET operation.
• Create a new SRF by performing a full compilation.
• Generate browser scripts

Post Unicode Migration Steps
• Backup Unicode Database
• Update Web Template file (jctrl.css) with Unicode Font Details
  - i.e. For Windows 2000 all occurrences of ‘JFONT-FAMILY=Arial’ replaced by ‘JFONT-FAMILY=Tahoma’
• Direct Siebel Servers to Unicode Database
  - i.e. Update ODBC DSN settings
• Re-Extract Mobile Clients and Developer Databases
  - Carry out Full Repository Get on Developer Databases

Summary
• Overview
• Introduction to Code Pages
• Introduction to Unicode
• Unicode and Siebel
• Unicode Migrations

Additional Resources
• Siebel Expert Services Offerings
  - Global Deployment Workshop
  - Unicode Migration Workshop
  - Unicode Migration
  - Unicode Migration Validation
• Technical Note 455: How can EAI processes be enabled for Global Deployment?
• Alert 573: Tools Repository Compilation Performance Issue on Microsoft SQL Server Unicode Database in 7.5

Additional Resources
• Titus Unicode Charts, reference for Unicode codepoints:
  - http://titus.uni-frankfurt.de/unicode/unitest.htm
• Unicode Primer:
  - http://www.menteith.com/unicode/primer/
• Unicode Overview:
  - http://www.basistech.com/papers/unicode/overview.html
• Unicode Consortium Resources:
  - http://www.unicode.org

Any Questions….
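
Worked example (not part of the original deck): the post-migration check "dump out binary values of awkward data" can be prototyped outside the database with a minimal Python sketch. The helper names `hex_dump` and `survives` are hypothetical, and the sketch assumes that Unicode column values are compared as little-endian UTF-16 byte sequences. It is not the Siebel Transcode Business Service; `survives` only mirrors the idea of its Validate mode, checking whether text can be stored in a code page without a conversion error.

```python
def hex_dump(text: str, encoding: str) -> str:
    """Return the bytes of text in the given encoding as spaced hex pairs."""
    return " ".join(f"{b:02X}" for b in text.encode(encoding))


def survives(text: str, code_page: str) -> bool:
    """True if text can be stored in code_page and read back unchanged."""
    try:
        return text.encode(code_page).decode(code_page) == text
    except UnicodeEncodeError:
        # At least one character has no representation in this code page.
        return False


if __name__ == "__main__":
    # "Awkward" test characters: one Western European, one Cyrillic.
    for ch in ("\u00c4", "\u0414"):
        print(repr(ch), "utf-16-le:", hex_dump(ch, "utf-16-le"),
              "| fits cp1252:", survives(ch, "cp1252"))
```

Comparing such dumps for the same test records before and after migration gives a quick signal that a character was silently replaced during conversion.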
Siebel Global Deployments

Siebel 7.7 Enhancements for Global Deployments
• “Smart Charset”
  - Responds to email in the same code page (character set) in which it was received.
  - Useful for Asian browsers that do not support the Unicode encoding of emails
• “Symbolic String model” for translation of UI data
  - Application specific translation table for key terms
  - Translation stored only once and reused throughout application
  - Reduces the size of the Repository
  - Reduces the complexity of adding new languages to the User Interface
• Locale Management Utility (LMU) enhancement
  - Native LMU XLIFF support. XLIFF is an XML-based localization industry standard
  - Standards-based support for integration with localization tools greatly reduces localization engineering time spent on import/export file transformation tasks
