Tracking the Enterprise Data Landscape
Todd Sicard
DAMA-MN, May 19, 2010
© 2010 Blue Cross and Blue Shield of Minnesota

Today’s Purpose
> Keeping track of hundreds of databases is difficult. Without a map, it means constant rediscovery (at best) or making mistakes (at worst).
> Today I’ll outline the metamodel of a rich topographic map of an enterprise-level data landscape that keeps track of hundreds of databases.

Who Am I?
> Todd Sicard – [email protected]
> Started at Blue Cross in 1993
> Enterprise Data Architect 2004-2009
> Enterprise Architect 2010+
> CDMP

Goal
Create an overall model of
- what data is stored where,
- whose data it is,
- when it arrives,
- where it came from, and
- which technology it uses.
> Collect, don’t forget
> Keep it one-person simple
> Useful, Usable, Used
[Diagram labels: Datastore, Information, Line of Business, Lineage, State, Technology]

This isn’t column level or table level…
This is database-level metadata.
Breadth before depth
Accuracy before precision
It had to be one-person-able.

By doing this, you will be able to…
1) Understand
2) Manage
3) Leverage
> You can’t leverage what you don’t manage, and you can’t manage what you don’t understand!

Datastore
> A datastore is any electronic (?) repository of structured (?) information.
  (Not all structured data is in a database)
  (Not all important data is always electronic)
  (Not all important data is structured)
> A list of all the logical names using the most common and accurate vernacular
> Data System: A collection of datastores.
  – Composition: Essential to the definition.
  – Aggregation: Non-essential to the definition, usually a collection of independent datastores.

Datastores
[Class diagram "Claims": «DataSystem» Claims; «DataStore» ARC-DB::ARC-DB; «DataStore» As-Paid DB::As-Paid DB]
ARC-DB: “As-Received Claim Database”. Started in 1995 as an MS Access DB, then converted to RDBMS. Contains 24 rolling months of claims data.
Business Owner: Warren Buffet
Business SME: Blarfengaar B.
Technical Owner: Bill Gates
Technical SME: Steve Hoberman

Information Models
> “What” data does it contain?
[Diagram labels: Information Models (Domain - DDM, Subject - SAM); Data Models (Concept - CDM, Entity - LDM, Table - PDM)]

Model        Purpose     Type        Description               Links
Information  Semantic    Domain      10-14 for the enterprise  Associations
Information  Semantic    Subject     8-12 per domain           Associations
Data         Structural  Conceptual  Key entities              Crow's foot
Data         Structural  Logical     Technology independent    Crow's foot
Data         Structural  Physical    Specific implementation   Crow's foot

Information Models
[Class diagram "Your Enterprise Data Domain Model" (Data Domain Model): «Domain» Customers, Subscribers, Health Benefits, Claims, Doctors]
[Class diagram "Claims SAM" (The Claim Subject Area Model): «Subject» Claims::Patient, Claims::Subscriber, Claims::Rendered Services, Claims::Provider, Claims::Adjudication]

Information Models: “Scope + 1”
[Class diagram "Claims SAM" (The Claim Subject Area Model “Plus One”): «Subject» Enrollment, Subscribers::Covered Person, Subscribers::Subscriber, Claims::Patient, Claims::Subscriber, Customers::Covered Customer, Claims::Rendered Services, Claims::Provider, Claims::Adjudication, Customers::Experience]

A Datastore’s Subject Area Model (SAM)
[Class diagram "Claims SAM": «Subject» Claims::Patient, Claims::Subscriber, Claims::Rendered Services, Claims::Provider, Claims::Adjudication]
[Class diagram "As-Paid DB": «DataStore» As-Paid DB, "Paid Medical Claims"; «SOR»; «Subject» Information::Claims::Subscriber, Information::Claims::Adjudication, Information::Claims::Provider, Information::Claims::Rendered Services, Information::Claims::Patient]

“Line of Business”
> “Whose” data is it?
A poor name for a mix of stuff:
  – Industry Subtypes
  – Corporate Legal Entity Structure
  – Product Lines
  – Market Segments
  – External Data Actors
[Diagram labels: Regulations, Industry, Core Corporations, Market Segments, External Data Actors, Product Lines, Affiliates and Partners, Etc.]

LOB’s…
[Use case diagram "Product Lines": Health Plans, Commercial, Individual, Small Group, Large Group, Public, Medicare]
[Use case diagram "Doctors": Medical Provider, Professional, Institution]

Datastore LOB
[Class diagram "As-Paid DB": «DataStore» As-Paid DB; «trace» Commercial (from Product Lines); «trace» Medical Provider (from Doctors). The "Product Lines" and "Doctors" diagrams are repeated alongside.]

Lineage - “Database, Database, Flow.”
> “Where” does the data come from?
No matter:
  – How it moves,
  – How it’s transformed,
  – How it’s rolled up,
  – How big it is,
  – How mangled it becomes…
…It’s just a data flow
[Diagram labels: A, B, Process, Retrieve, Load, MQ, Service]

Lineage = “Database, Database, Flow.”
Information moves from A to B… that’s all that matters!
[Diagram labels: A, B, Miracle]
[Class diagrams "ARC-DB" and "As-Paid DB": «flow» F0321 Medical Claim between Professional (from Doctors) and «DataStore» ARC-DB; «flow» F0123 Medical Claims between «DataStore» ARC-DB::ARC-DB and «DataStore» As-Paid DB]

State
> “When” does the data arrive?
> The relevant lifecycle of a piece of important data with lots of processing.
> Generic lifecycle:
  1. Creation,
  2. Formation,
  3. Maturity,
  4. Destruction
[State machine diagram "State": Initial, Received, Edit Outcome, Processed, Rejected, Adjudication Outcome, Paid, Denied, Final]

Datastore State
[State machine diagram "State": Initial, Received, Edit Outcome, Processed, Rejected, Adjudication Outcome, Paid, Denied, Final]
[Class diagrams "ARC-DB" and "As-Paid DB": «trace» links between «DataStore» ARC-DB, «DataStore» As-Paid DB and the states Received, Paid, Denied (from Claim)]

Technology
[Class diagram "Technology": «Technology» Structured Data, Structured Flat File, Unknown, Database Management System, Delimited, XML, Fixed Width, Hierarchical, Relational, Object, Oracle, DB2, SQL Server]

Datastore Technology
[Class diagram "As-Paid DB": «DataStore» As-Paid DB «use» «Technology» Technology::Oracle. The "Technology" diagram is repeated alongside.]

Deployment: Servers, Instances, etc.
> I didn’t go there
> Why not?
  – “One-person-able”
  – Breadth before depth.
  – Accuracy before precision.
  – Understand, Manage, then Leverage
  – Manage information at the Enterprise-level
> But it sure would be nice… maybe later

The Metamodel (UML Model)
[Class diagram "Enterprise Data Landscape Repository"]
Elements: «DataSystem» Data System; «DataStore» Data Store; «Technology» Data Tech (Technology Model); «Domain» Info Domain (Data Domain Model); «Subject» Info Subject (Domain Subject Area Model); LOB (LOB Model); LOB - External Data; State (Data State Model)
Relationships: Aggregation; Composition; Tech «use»; LOB (Line of Business) «trace»; SAM (Subject Area Model) «SOR»; Lineage «flow»; State «trace»

The Metamodel (ER Model)
Data System: Data System Name
Data Store: Data Store Name, Data System Name (FK), Tech (FK)
Data Store Lineage: Data Store Name (FK), From, To
Technology: Tech
Tech Model
Line of Business: LOB
LOB Model: LOB (FK)
Data Store LOB: Data Store Name (FK), LOB (FK)
Data Store SAM: Data Store Name (FK), Domain (FK), Subject (FK)
Subject Area: Domain (FK), Subject
Data Domain: Domain
Domain SAM: Domain (FK), Subject (FK)
Domain Model: Domain (FK)
Internal: Data Store Name (FK)
External: Data Store Name (FK)

Drawing the Pictures
> Datastore-centric: LOB, SAM, Lineage, Tech, State, Composition
> Reference (Process POV)
  – Claims Data Flow
> Project POV
  – In-scope Datastores (Scope + 1)
[Class diagram "Claims": Professional (from Doctors); «flow» F0321 Medical Claim; «DataStore» ARC-DB::ARC-DB; «flow» F0123 Medical Claims; «DataStore» As-Paid DB::As-Paid DB; «flow» F0314 Claim Payment Info; Medical Provider (from Doctors)]

Potential Users
> Warehouse architects
> Data modelers
> Data stewards
> DBAs
> Data leadership
> Enterprise architects
> Business continuity planners
> Disaster recovery planners
> Testers
> Internal audit
> Corporate attorneys
> Security architects

Enough talking… let’s see it.

The Tool
But only from a vendor-neutral perspective…
> Sparx Enterprise Architect
  – Corporate Edition, Standard License
  – www.sparxsystems.com.au

Thank you!
[email protected]
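
Appendix sketch (not part of the original deck): the ER metamodel and the “Database, Database, Flow” idea could be expressed as a few Python classes, roughly as below. The class and method names (`DataStore`, `Flow`, `Landscape`, `upstream`, `stores_for_subject`) are hypothetical. The sample records come from the Claims examples in the slides; the flow directions and the "Relational" technology for ARC-DB are an assumed reading of the flattened diagrams, not something the deck states outright.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Flow:
    """One lineage record: information moves from `source` to `target`."""
    flow_id: str   # e.g. "F0123"
    name: str
    source: str    # "From": a datastore or an external data actor
    target: str    # "To": a datastore


@dataclass
class DataStore:
    """Database-level metadata only: no tables, no columns."""
    name: str
    data_system: str
    technology: str = "Unknown"                  # Technology model entry
    lobs: set = field(default_factory=set)       # "whose" data it is
    subjects: set = field(default_factory=set)   # (domain, subject) pairs
    states: set = field(default_factory=set)     # lifecycle states held


class Landscape:
    """Hypothetical in-memory repository for the data landscape."""

    def __init__(self):
        self.datastores = {}
        self.flows = []

    def add_datastore(self, datastore):
        self.datastores[datastore.name] = datastore

    def add_flow(self, flow):
        self.flows.append(flow)

    def upstream(self, name):
        """Return every source feeding `name`, directly or indirectly."""
        seen = set()
        stack = [name]
        while stack:
            current = stack.pop()
            for flow in self.flows:
                if flow.target == current and flow.source not in seen:
                    seen.add(flow.source)
                    stack.append(flow.source)
        return seen

    def stores_for_subject(self, domain, subject):
        """Return the names of datastores whose SAM includes the subject."""
        return sorted(
            ds.name
            for ds in self.datastores.values()
            if (domain, subject) in ds.subjects
        )


# Sample data taken from the Claims slides (directions are assumed).
landscape = Landscape()
landscape.add_datastore(DataStore(
    name="ARC-DB",
    data_system="Claims",
    technology="Relational",   # assumption: "converted to RDBMS"
    states={"Received"},
))
landscape.add_datastore(DataStore(
    name="As-Paid DB",
    data_system="Claims",
    technology="Oracle",
    lobs={"Commercial", "Medical Provider"},
    subjects={
        ("Claims", "Patient"),
        ("Claims", "Subscriber"),
        ("Claims", "Provider"),
        ("Claims", "Rendered Services"),
        ("Claims", "Adjudication"),
    },
    states={"Paid", "Denied"},
))
landscape.add_flow(Flow("F0321", "Medical Claim", "Professional", "ARC-DB"))
landscape.add_flow(Flow("F0123", "Medical Claims", "ARC-DB", "As-Paid DB"))

print(landscape.upstream("As-Paid DB"))
print(landscape.stores_for_subject("Claims", "Patient"))
```

A flow here records only "from" and "to", in keeping with the deck's point that how the data moves or is transformed does not matter at this level. Defaulting `technology` to "Unknown" mirrors the «Technology» Unknown entry in the technology model, so a datastore can be collected before its platform is known.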