Project Title: UCPATH Data Dissemination Operational Data Store (DDODS)
Submitter:
Albert Course
Senior Applications Manager
ITS Data Services
[email protected]
Phone (510) 987-0761
Mobile (925) 348-4265
Project Leads: Micheal Schwartz, Enterprise Data Architect
Hooman Pejman, Data Services Data Architect
Team Members:
Steve Hunter, Java Developer
Stephen Dean, SOA Architect/Developer
Ric Carr, Systems Analyst
Sanketh Sangam, ETL Developer
Jerome McEvoy, Enterprise Architecture Manager
Deborah Hill, Project Manager
The Data Dissemination Operational Data Store (DDODS) is a UCPath product. It is designed to cull
information from a complex Human Capital Management & Payroll software package containing 20,000
tables, identify changed data, and relay pertinent, concise information to multiple UC institutions.
The primary objective of the DDODS is to provide HR & Payroll data for the Data Warehousing
processing needs of each UC Location (Campuses, Medical Centers).
The data provided to each UC Location is structurally and semantically identical, uses a less complex
data model consisting of approximately 200 tables (1%), and establishes a set of consistent data
definitions for this subject across UC. This level of data consistency across all UC Locations is a primary
objective of the UCPath project.
Secondary objectives of the DDODS include consumption of DDODS data by local business
applications, local operational reporting, and support for unique or localized data-shaping needs
(e.g., offsetting the need for new UCPath interfaces).
The DDODS comprises a number of enabling technologies that support the processing steps
required to determine data changes, identify the data that has to be sent to each Location, consistently
deliver the data in a secure manner, and load the change data into the local DDODS database
repositories built on varying database platforms (Microsoft, IBM and Oracle).
The DDODS was a collaborative effort, with all of the campuses participating in the design reviews. The
team participated in the development of a common data dictionary with approximately 200 tables and
5,000 data elements. The DDODS was delivered early so that locations could plan and design their data
warehouses and interfaces. The DDODS is also being used to validate conversion data at Wave 1
locations.
What is it
The DDODS is a product and service that distributes PeopleSoft data to campuses nightly and
will be supported and managed by ITS. The three main components of the product are the database,
the loader application, and the data dictionary. ITS is responsible for the design of the database and for providing
locations with DDL scripts to recreate the database locally on one of three database platforms: SQL
Server, Oracle or DB2. The loader application is a Java utility that locations can use to populate their
local DDODS. The data dictionary contains the published data definitions for all data elements in the
DDODS.
How does it work
The DDODS starts with a nightly Change Data Capture (CDC) process. The CDC
uses materialized views to snapshot the 200+ HCM tables and compares them with the prior day's
snapshot using Informatica. The Informatica program evaluates each table and writes the changes to
the ODS database at Oracle Managed Cloud Services (OMCS). The data includes an indicator for Insert,
Update or Delete, the batch number, and a date and time stamp. A description of the CDC process can be
found on GCDP\UC - Systems\ODS\May 9 Preparation Materials. The file name is "DDODS Change Data
Capture.pptx".
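
For illustration only, the comparison at the heart of the CDC step can be pictured as a diff of today's snapshot against yesterday's, keyed by primary key, with each difference tagged as an Insert, Update or Delete along with a batch number and timestamp. The Java sketch below shows that idea; it is not the Informatica implementation, and all class and field names are invented.

import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the CDC idea only: diff today's snapshot against
// yesterday's by key and emit Insert/Update/Delete records tagged with a
// batch number and timestamp. The production CDC runs in Informatica.
public class CdcSketch {

    enum Operation { INSERT, UPDATE, DELETE }

    record ChangeRecord(String key, Operation op, long batch,
                        LocalDateTime capturedAt, String row) { }

    static List<ChangeRecord> diff(Map<String, String> previous,
                                   Map<String, String> current, long batch) {
        List<ChangeRecord> changes = new ArrayList<>();
        LocalDateTime now = LocalDateTime.now();
        current.forEach((key, row) -> {
            String before = previous.get(key);
            if (before == null) {
                changes.add(new ChangeRecord(key, Operation.INSERT, batch, now, row));
            } else if (!before.equals(row)) {
                changes.add(new ChangeRecord(key, Operation.UPDATE, batch, now, row));
            }
        });
        previous.forEach((key, row) -> {
            if (!current.containsKey(key)) {
                changes.add(new ChangeRecord(key, Operation.DELETE, batch, now, row));
            }
        });
        return changes;
    }

    public static void main(String[] args) {
        Map<String, String> yesterday = Map.of("EMP001", "Smith|Dept10", "EMP002", "Jones|Dept20");
        Map<String, String> today = Map.of("EMP001", "Smith|Dept30", "EMP003", "Lee|Dept20");
        diff(yesterday, today, 42L).forEach(System.out::println);
    }
}
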
Once the Informatica job completes the updates to the ODS database, the Business Process Execution
Language (BPEL) process is launched. This process reads the newly loaded data and applies the
Affiliation Rules to determine which location or locations the data needs to be sent to.
These Affiliation Rules determine whether the data needs to go to all locations (lookup tables), to a
single location based on the person's job, or to multiple locations due to data processing requirements
(for example, UC Merced and UCOP data to UCLA). Once the BPEL processes generate all of the files
necessary for a location, the data is SFTP'd to each location. The process includes a control file that
identifies which files have been sent to a particular location and is used to launch the Java Loader.
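
The routing decision the Affiliation Rules make can be pictured in a few lines of code: lookup-table data fans out to every location, while person-level data goes to the location tied to the person's job plus any locations that process data on its behalf. The Java sketch below is a hypothetical rendering of that logic, not the actual BPEL process; the location codes and method names are invented.

import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the Affiliation Rules routing decision.
// The real implementation is a BPEL process; names here are invented.
public class AffiliationRoutingSketch {

    static final List<String> ALL_LOCATIONS =
            List.of("UCB", "UCLA", "UCM", "UCOP", "UCSD");

    // Lookup tables fan out to every location.
    static Set<String> routeLookupTable() {
        return new LinkedHashSet<>(ALL_LOCATIONS);
    }

    // Person-level data goes to the home location of the job, plus any
    // locations that process data on its behalf (e.g. UCM/UCOP -> UCLA).
    static Set<String> routePersonRow(String homeLocation) {
        Set<String> targets = new LinkedHashSet<>();
        targets.add(homeLocation);
        if (homeLocation.equals("UCM") || homeLocation.equals("UCOP")) {
            targets.add("UCLA");
        }
        return targets;
    }

    public static void main(String[] args) {
        System.out.println("Lookup table -> " + routeLookupTable());
        System.out.println("UCM job row  -> " + routePersonRow("UCM"));
    }
}
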
Because locations use a number of different databases, we used ERWin to generate the logical and
physical models. After surveying the locations, three target databases were selected: Oracle, Microsoft
SQL Server and IBM DB2. After a version of the DDODS is finalized, platform-appropriate DDL scripts are
generated for each location. These DDLs are sent to each location to allow them to build their local copy
of the ODS database. By managing the model centrally and delivering the DDLs, we ensure that each
location's database matches the central ODS database and reduce the workload for each location to
build these tables independently.
Locations use a number of different ETL tools for loading data; rather than standardize on a
particular ETL tool, we developed a loader program in Java.
As part of the installation at each location, the location can control a number of
parameters, including the source file directories and the target database. The loader program runs a
background process that checks for the existence of the "Control File" and then begins the data load into
the local ODS database. A log file tracks the success and/or failure of each data load,
and the loader can be restarted after error resolution.
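
As a rough sketch of how such a loader can behave (not the actual ITS loader; the directory names, control file name, and file layout below are invented), the polling loop amounts to watching for the control file, loading each data file it lists, logging the outcome, and removing the control file once the batch is done.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical sketch of a control-file polling loop for a local DDODS loader.
// The actual ITS loader differs; directories and file names are invented.
public class LoaderSketch {

    private static final Logger LOG = Logger.getLogger(LoaderSketch.class.getName());
    private static final Path INBOUND = Path.of("/data/ddods/inbound");   // assumed source directory
    private static final Path CONTROL = INBOUND.resolve("control.txt");   // assumed control file name

    public static void main(String[] args) throws Exception {
        while (true) {
            if (Files.exists(CONTROL)) {
                // The control file lists the data files delivered for this batch.
                List<String> dataFiles = Files.readAllLines(CONTROL);
                for (String name : dataFiles) {
                    try {
                        loadIntoLocalOds(INBOUND.resolve(name.trim()));
                        LOG.info("Loaded " + name);
                    } catch (IOException e) {
                        // Failures are logged so the load can be restarted after resolution.
                        LOG.severe("Failed to load " + name + ": " + e.getMessage());
                    }
                }
                Files.delete(CONTROL); // batch complete; wait for the next delivery
            }
            Thread.sleep(60_000); // poll once a minute
        }
    }

    private static void loadIntoLocalOds(Path dataFile) throws IOException {
        // Placeholder: a real loader would parse the file and apply
        // inserts/updates/deletes to the local DDODS database via JDBC.
        Files.size(dataFile);
    }
}
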
The DDODS data dictionary provides standard data definitions for over 5,000 data elements in both HCM
and the DDODS. The data dictionary will be accessible and searchable via a web interface for all
locations to use. The DDODS Data Dictionary is currently maintained on GCDP\UC Systems\ODS\DDODS Data Model and Dictionary, and the file name is
EA_DA_DDODS_DataDictionary_[date]_v9.0.xlsx, where the date indicates the as-of date within a given
version. We hope this serves as the foundation for other system-wide data dictionaries.
Timeframe
The DDODS will go live with the Wave 1 implementation in July 2014; however, it is currently in
use at Wave 1 locations in support of conversion. The design and test data has been sent to Wave 2 and
3 locations so that they can begin work on their local systems and data warehouses that will use data
from the DDODS. The DDODS will also be run as part of the SIT testing this fall.
Customer Comments
The DDODS plan to implement a standard solution for UCPath was effective and efficient. It is one of
the best UC-wide work efforts I have experienced in my 27 years of service to UC. The principles of the
plan were truly collaborative. Every UC location participated in the design of the UC-wide DDODS. Each
location had the opportunity to represent its requirements and ensure that they were met
through the design process. Each location would implement its ODS from the DDODS. From there, the
location had the flexibility to implement a Data Warehouse or report off the ODS.
This saved UCLA significant time and resources. The DDODS Team enforced best practices for ODS/DW
design and build. It also enforced consistency across UC through standardized naming and business
logic. Again, this saved UCLA time and money. UCLA could extend the build of the ODS into our
DW. UCLA chose to implement the DDODS with little to no modification. It gives UCLA the opportunity
to utilize the resources saved in the DW build and Tier 2 education. It also assisted UCLA in enforcing best-practice
ODS guidelines, which has been a struggle in the past.
The UC DDODS Team has been wonderful to work with. They are extremely skilled, knowledgeable, and
experienced in DW practices and major project implementations. They found a formula for success in
building a standard ODS with standardized naming and business rules in a truly collaborative way.
I highly recommend them for the Sautter Award.
Regards,
Donna M. Capraro
Director Information and Data Strategy
UCLA IT Services
Campus Data Warehouse
[email protected]
310 206-1624