DAMA International Symposium &
Wilshire Meta-Data Conference
2004 May 2-6, Los Angeles, California
Conducting Database Design Project Meetings
Gordon C. Everest
Carlson School of Management
University of Minnesota
© 2004
[email protected]
Outline
• Database Design Project Meetings
– Initial Meeting(s) followed by extended series of meetings
– Process and Product
– Explaining the Objective, Purpose, Principles, and Benefits
of Data Modeling
– Based on many actual experiences, but focusing on one
– Concluding with Guidelines and Best Practices
Then expand our view:
• Interviews
• Accelerated Group Meetings
– Comparative study
– Lessons learned
• Advice from others
– Simsion, Moody, Moriarty, Barden
Data Modeling Project: The Context
[Diagram, summarized:]
• Enterprise Data Model: Global view; Set Priorities
• PLANNING & ANALYSIS: PRIORITIES, SCOPE
• Carve out a piece: Understandable; Doable; Priority; Greatest payoff
• DESIGN (MODELING): DATA; PROCESS / BEHAVIOR; USER INTERFACE; PLATFORM
• DESIGN "REPOSITORY"
• CONSTRUCTION (GENERATION): DATABASE
• OPERATION & MAINTENANCE
• Feedback: to flesh out piece by piece
The Process
• Global Architecture
– Inventory and set priorities
• Choose a User Application Area
• Obtain User Top Management Support
– INITIAL MEETING:
with user area managers and experts, IS Dept representative,
and database design expert/facilitator
– Explain the project, process, deliverables, benefits, and
expected project time duration (variable!)
– Obtain required commitment of people
• Conduct Kickoff Training Session
• Begin an Extended series of
Database Design Project Meetings
Initial Meeting(s)
First with User Top Management, then with User Domain Experts. 
EXPLAIN THE FOLLOWING:
• Objective of Data Modeling
– To accurately and completely model a chosen user domain
– Within a defined Scope
• Purpose of Data Modeling
– To understand the chosen user domain
– Prelude to building a database
• The Benefits of Data Modeling
• The Process
– Finding Entities (nouns) and Relationships (verbs)
– Adding attributes (roles in relationships), and constraints
• The Product
– Design documentation – diagram and supporting narrative
– Notational scheme
Modeling
MODEL = Abstract (Re).present.(ation)
[Diagram: Reality -> Knowledge in the head (mental models) -> MODELING PROCESS -> MODEL: Knowledge in the world (externalized, formalized, shared)]
What drives or guides the process?
The Modeling Process
[Diagram: Real World Universe of Discourse -> perception, selection/filtering -> MODELING PROCESS -> MODEL]
MODELING SCHEME:
• METHODOLOGY: Steps/Tasks + Milestones + Deliverables +
• REPRESENTATIONAL FORMS (the Syntax): Narrative, Graphical Diagram, Formal Language Statements
• Context, Constructs, Composition, Constraints
Data Modeling Constructs
What to look for:
[Diagram labels: ENTITY (OBJECT); RELATIONSHIP; IDENTIFIER; ATTRIBUTE (Data Item); characteristics; [ FOREIGN KEY ]]
Relative emphasis differentiates Data Modeling approaches,
e.g. ER modeling focuses on Entities and Relationships,
de-emphasizing or hiding Attributes.
Data Modeling Process
• PERCEIVE: Mental Model
• EXTERNALIZE: Conceptual Model (ORM/ER diagram)
• FORMALIZE (map): Logical Data Model (relational tables)
• IMPLEMENT: Physical Model (define database to a DBMS)
Objective of Data Modeling
(WHAT we are trying to do)
TO ACCURATELY AND COMPLETELY MODEL
SOME PORTION OF THE REAL WORLD
UNIVERSE OF DISCOURSE (UoD)
OF INTEREST TO SOME ORGANIZATION
OR COMMUNITY OF USERS.
Purpose of Data Modeling
(WHY we do it)
DUAL, CONFLICTING PURPOSES DRIVE THE PROCESS:
• Facilitate Human Communication, Understanding, Validation
– capture and present meaning, the semantics of a model
– direct representation of only essential model semantics
PRESENTATION CHARACTERISTICS:
– scoping and presenting subparts of a Model
– unfolding presentation at different levels of abstraction or detail
– visual prominence in proportion to semantic importance
SECONDARY:
• Basis for Implementation - defining & creating a Database
– complete in all the necessary details
– construction/generation able to be fully automated
Purpose of Modeling
Satzinger2e, SA&D, Fig 5.2, p.149.
FIRST STEP in the DESIGN phase of Systems Development (BUILDING)*
• Capture semantics – all relevant, important details
• Document – record and remember
• Understand – learn, raise questions, record answers, refine
• Communicate – shared with all interested parties
– Users, stakeholders, management, developers
• Validate – a complete and accurate representation
– Internal validation – consistent with the modeling rules
– External validation – Who can do this?
• Blueprint to Build
* Some say that Modeling begins in the Analysis phase.
Data Model to Database Realization
[Diagram: DATA MODEL -> DATABASE DEFINER -> Database Definition Language (DDL stmts) -> DataBase Management System -> DATABASE DEFINITION (“Schema”), which describes the DATABASE; data input -> DataBase Management System -> DATABASE]
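The path from data model to database can be sketched in a few lines, with Python's built-in sqlite3 module standing in for the DBMS. This is only an illustration: the parcel table, its columns, and the sample rows are hypothetical and are not taken from the project described in these slides.

```python
import sqlite3

# Hypothetical DDL, as a "database definer" might produce it from a data model.
DDL = """
CREATE TABLE parcel (
    parcel_id INTEGER PRIMARY KEY,
    county    TEXT NOT NULL,
    acres     REAL CHECK (acres > 0)
);
"""


def realize(ddl: str) -> sqlite3.Connection:
    """Submit DDL statements to the DBMS, which records the schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(ddl)
    return conn


conn = realize(DDL)

# The stored definition (the "schema") describes the database.
stored = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# Data input is accepted only if it conforms to the definition.
conn.execute("INSERT INTO parcel VALUES (1, 'Hennepin', 2.5)")
try:
    conn.execute("INSERT INTO parcel VALUES (2, 'Ramsey', -1.0)")  # violates CHECK
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The point of the sketch is the division of labor: the definition is stated once, and the DBMS then enforces it on every data input.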
Data Modeling Principles
• Done at the highest conceptual level
• Done at the schema level
augmented with sample data populations
• Involve all interested parties
(not just one department or application)
• Easier for users to learn data modeling
than for IS professionals/data modelers to
learn the business
• Capture all possible/expressible semantics
• Users (collectively) will always know more
• Be inclusive (within the defined Scope)
Stages of Data Modeling
Start at the highest Conceptual Level!
USER Domain Knowledge
CONCEPTUAL (ORM):
• Objects
• Obj. ID’s
• Roles/Relships
• (Fnl. Dep)
NO clustering => NO “attributes”
“LOGICAL”: ER (CLUSTERED, Attribs in Records) - - -> RELATIONAL
MultiValued, Nested - - -> Flat (1NF)
Ternaries - - -> Binary only
M:N - - -> 1:Many only
Normalized (2,3,4); Primary Keys
Relationships w/attributes - - -> Foreign Keys
Sub/SupTypes
PHYSICAL:
• Implementation in/for a DBMS
• Denormalize (for performance)
+ triggers, stored procedures
SCHEMA / DATABASE
Record-Based Data Modeling
• Commonly called Entity Relationship (ER) Modeling
• Attributes clustered into Entity Records (or Tables)
• Focus on Entities and Relationships (hence ER)
suppressing attributes in ER Diagrams
(hence no explicit representation of identifiers)
leaving open the nature of the intra-record structure.
• Most general case allows:
– “Nested” Multivalued attributes or repeating groups
Hence not in first normal form (1NF)
(should still satisfy other normal forms – 2NF, 3NF, …)
– Direct representation of M:N relationships between entities
– Attributed relationships (i.e., with attributes)
– Ternary (and higher) relationships
– Subtypes and supertypes
• Restricting all of the above gives the Relational Model
– Atomic (single-valued) attributes; binary relationships (FKey)
=> Often, ERDiagrams are Relational Table Diagrams
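As one illustration of the restriction to the Relational Model, the sketch below maps an M:N relationship with an attribute onto relational tables using Python's built-in sqlite3 module. The table and column names (orders, item, order_line_item, quantity) and the sample rows are assumptions made for the example, not a design taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE orders (order_no INTEGER PRIMARY KEY, order_date TEXT);
CREATE TABLE item   (item_no  INTEGER PRIMARY KEY, description TEXT);
-- The M:N relationship "order contains item" becomes its own table:
-- two 1:Many relationships, each carried by a foreign key.
CREATE TABLE order_line_item (
    order_no INTEGER NOT NULL REFERENCES orders(order_no),
    item_no  INTEGER NOT NULL REFERENCES item(item_no),
    quantity INTEGER NOT NULL,  -- attribute of the relationship itself
    PRIMARY KEY (order_no, item_no)
);
""")

conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, "2004-05-02"), (2, "2004-05-03")])
conn.executemany("INSERT INTO item VALUES (?, ?)",
                 [(10, "widget"), (20, "gadget")])
conn.executemany("INSERT INTO order_line_item VALUES (?, ?, ?)",
                 [(1, 10, 3), (1, 20, 1), (2, 10, 5)])

# Both directions of the M:N relationship are recoverable from the one table.
items_per_order = conn.execute(
    "SELECT order_no, COUNT(*) FROM order_line_item "
    "GROUP BY order_no ORDER BY order_no").fetchall()
orders_per_item = conn.execute(
    "SELECT item_no, COUNT(*) FROM order_line_item "
    "GROUP BY item_no ORDER BY item_no").fetchall()

# A line item pointing at a nonexistent order is rejected by the foreign key.
try:
    conn.execute("INSERT INTO order_line_item VALUES (99, 10, 1)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
```

In the more general record-based form the relationship could be drawn directly between the two entities; in the relational form it has to appear as a third table.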
Choosing a Relationship Notation
Everest-DM: p.224.
Candidate suggestions for ‘one-to-many’ (1:M):
[Figure: alternative notations, each drawn between ENTITY1 (PARENT) and ENTITY2 (CHILD): Bachman 1969; Nijssen 1974; Chen 1976; Kroenke; IDEF1X; SilverRun; and The “Fork”, Everest 1976. Labels used in the figure include 1, M, P, and y=f(x).]
CRITERIA:
• NOT imply direction, access path, or physical representation
• Visually intuitive to aid human understanding
• Printable
• International
Benefits of Data Modeling
• Users gain a better understanding of their area.
• Greater system success with user involvement.
• Platform for communication between users and designers.
• Separation of information-oriented specifications from
economic / performance / implementation considerations.
• Determines the content of the database.
• Solid base for information systems development.
• Database more viable/stable; Greater evolvability for
handling changes in the developed information system.
• A basis for integration
• Data modeling is a small part of the total IS development effort, but,
when done “right,” can reduce overall development costs and
downstream maintenance costs. When done poorly, the downstream
impacts can be disastrous and costly.
The Chosen User Area
After conducting a survey of existing applications and databases,
evaluating them, and setting priorities.
• Department of Transportation, Right of Way Division
• Functions:
– Appraisal, Direct Purchase, Leasing, Relocation, Sale,
Demolition, Reconveyance, Legal Owners, Condemnation
• Manageable Scope; Not too Complex
• Great Need; Potentially High Payoff
• 100 People; Mostly Manual Operations
• One Large COBOL file (1971) on Magnetic Tape
• 110,000 parcels of land; 250 attributes (Data Items)
• Several Manual Files on Floating Carts
The Data Modeling Process
GATHERING INFORMATION
• Once the SCOPE and OBJECTIVES are set
• and understanding the modeling constructs to use
How to determine the INFORMATION REQUIREMENTS?
• Where would you go?
• Where would you look?
• What would you look for?
• Who would you talk to?
• What would you ask?
Database Design Process – Two Approaches
BOTTOM-UP: DFDs, Sample forms, reports, files, ... -> LIST of data items (“Data Dictionary”) -> CLUSTER DATA ITEMS
TOP-DOWN: REALITY (User Domain of interest) -> Look, Listen; Perceive, Filter -> FIND ENTITIES (the pivotal construct in Data Modeling)
then: ADD RELATIONSHIPS -> “Conceptual Model Diagram”
DATABASE DESIGNER and USER-DOMAIN EXPERT: Ask questions; Talk, echo, validate
Different Kinds of Entity (Types)
• Independent / Base / Reference
WATSON2-ch.7, p.176-9.
– Exists / is of interest… for some duration of time
– Frequently the starting point; most important to users
• Dependent
– Depends on some other entity(s) for existence, and
– Perhaps for identification (Watson notation:
)
• Association (“Intersection”)
– Represents a Many:Many binary (or more) relationship
– May be something meaningful in the users’ world
• Event or “Transaction”
– A happening at a point in time
– Number of instances grows endlessly
• Summary – to contain summary (derived) information
• Generalization (“Aggregate”, Supertype) or
• Specialization (“Subordinate”, Subtype)
The Processing Continuum – Choosing Entities
The continuum, with examples:
• Transaction EVENTS (FLOW data): e.g. hire, fire; sales
• Standing ENTITIES (LEVEL/STATUS): e.g. Employee; Product Inventory
• AGGREGATIONS, DERIVATIONS (SUMMARY data): e.g. workforce growth; stockouts
• DESIGN ISSUE: calculating derived information
– at input/update time - when transaction event captured & recorded
– at output/retrieval time - when output data is requested
• Sometimes we don’t record event transactions at all
– of no interest
– just record the effect of the event transaction, e.g. marriage
• We don’t usually store summary data
– calculated at retrieval request time
– except in Data Warehousing/OLAP for better response time
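The design issue above can be made concrete with a small sketch using Python's built-in sqlite3 module. The product and sale tables, the function names, and the quantities are hypothetical. One function applies the effect of a transaction event at input time; the other derives summary data only when it is requested.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_no INTEGER PRIMARY KEY, on_hand INTEGER NOT NULL);
CREATE TABLE sale (
    sale_no    INTEGER PRIMARY KEY,
    product_no INTEGER REFERENCES product(product_no),
    qty        INTEGER NOT NULL
);
""")
conn.execute("INSERT INTO product VALUES (1, 100)")
conn.commit()


def record_sale(product_no: int, qty: int) -> None:
    """Input/update time: record the event and apply its effect to the
    standing entity (the inventory level) in the same transaction."""
    with conn:
        conn.execute("INSERT INTO sale (product_no, qty) VALUES (?, ?)",
                     (product_no, qty))
        conn.execute("UPDATE product SET on_hand = on_hand - ? WHERE product_no = ?",
                     (qty, product_no))


def total_sold(product_no: int) -> int:
    """Output/retrieval time: summary data is calculated on request, not stored."""
    row = conn.execute(
        "SELECT COALESCE(SUM(qty), 0) FROM sale WHERE product_no = ?",
        (product_no,)).fetchone()
    return row[0]


record_sale(1, 30)
record_sale(1, 25)
on_hand = conn.execute(
    "SELECT on_hand FROM product WHERE product_no = 1").fetchone()[0]
```

Here on_hand is the level/status maintained as events arrive, while total_sold(1) is derived from the recorded events each time it is asked for.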
Steps in the Modeling Process
The A B C D E F G procedure:
• Ask & Analyze
• Bounce Back & Forth with/among user domain experts
• Comprehend what they are saying; Verbalize
• Design - Diagram & Document in Dictionary
with narrative
• Evaluate against rules of construction & user experts
• Formalize in a Data Model (mapping for implementation)
• Generate a definition for implementation in a DBMS
List of Data Items (Bottom-up Design)
• UNORGANIZED, UNSTRUCTURED
e.g. the “Data Dictionary” derived from DFDs
ATTRIBUTES ...
Customer Number
Customer Name
Billing Address
Customer Phone
Shipping Address
Credit Limit
Salesperson ID
Salesperson Name
Salesperson Address
Salesperson Phone
Commission Rate
Order Number
Order Date
Ship Date
Terms
Gross Amount of Order
Inventory Item Number
Item Description
Price
Bin Location
Quantity Ordered
• ORGANIZED, CLUSTERED
ATTRIBUTES ... of … ENTITIES: CUSTOMER; SALESPERSON; ORDER; ITEM; ORDER LINE ITEM
Add RELATIONSHIPS: calls on; places; contains
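The clustering step can be sketched as a plain Python mapping from entities to data items, plus a check that every listed item lands in exactly one entity. The slide names the entities but does not say which item belongs where, so the assignment below is an assumption for illustration, and check_clustering is a hypothetical helper.

```python
from collections import Counter

DATA_ITEMS = [
    "Customer Number", "Customer Name", "Billing Address", "Customer Phone",
    "Shipping Address", "Credit Limit", "Salesperson ID", "Salesperson Name",
    "Salesperson Address", "Salesperson Phone", "Commission Rate",
    "Order Number", "Order Date", "Ship Date", "Terms", "Gross Amount of Order",
    "Inventory Item Number", "Item Description", "Price", "Bin Location",
    "Quantity Ordered",
]

# Assumed clustering: the placement of each item is a guess, not from the slide.
ENTITIES = {
    "CUSTOMER": ["Customer Number", "Customer Name", "Billing Address",
                 "Customer Phone", "Shipping Address", "Credit Limit"],
    "SALESPERSON": ["Salesperson ID", "Salesperson Name", "Salesperson Address",
                    "Salesperson Phone", "Commission Rate"],
    "ORDER": ["Order Number", "Order Date", "Ship Date", "Terms",
              "Gross Amount of Order"],
    "ITEM": ["Inventory Item Number", "Item Description", "Price", "Bin Location"],
    "ORDER LINE ITEM": ["Quantity Ordered"],
}


def check_clustering(items, entities):
    """Return (unassigned, duplicated) data items for a proposed clustering."""
    counts = Counter(attr for attrs in entities.values() for attr in attrs)
    unassigned = [item for item in items if counts[item] == 0]
    duplicated = sorted(attr for attr, n in counts.items() if n > 1)
    return unassigned, duplicated


unassigned, duplicated = check_clustering(DATA_ITEMS, ENTITIES)
```

An item that shows up as duplicated, or one that fits no entity, is a prompt for a question to the user domain experts. It should not be settled silently by the designer.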
The Product
Documentation
– produced according to a set of Guidelines (See Appendix to Everest paper)
– structured to facilitate incremental updates
- Hierarchical organization, dated, modular sections
• Scope and Objectives
– Use Cases; Major Processes (Setup, Retrieval & Reporting,
Update/Maintenance/Transaction processing, Archival)
• Global Data Model Diagram
– Top-down unfolding presentation
• Narrative Description of:
– Entities
– Relationships
– Attributes
• Formal Definition in a Data Dictionary / Repository
– Preferably using a CASE Tool
• Generated Schema (DDL Script) for a target DBMS
User Experiences and Activities
• Users got excited
• Learning and Self-Confidence grew
• Relationship with central IS support unit
• One user forged ahead early
• Anxious to buy equipment and install systems
• The Product: Documentation
- 40 entities, 400 pages
• Used a CASE Tool to support data modeling
Sample Data Model
(Excelerator 1.9)
Minnesota DOT Right of Way Database Structure, Gordon C. Everest
[Diagram: entities connected by one-to-many relationship lines; the lines carry frequency annotations such as “rare”, “10%”, “0-2”, and “usually 1”. The legend covers one-to-many, Dependent, Orphan, and Foreign ID relationships.]
Entities shown (name: description):
• AUTHMAP: Authorization Map
• MAINTDIST: Maintenance District
• COUNTY: County (Num | Code...)
• ROADSECT: Road Section (Cty# | RS#)
• AGREEMENT: Agreement
• PROJECTS: Project Actions
• RWPROJ: R/W PROJECT (900's or Dash #)
• COMORDACT: Commissioners Orders Action
• PMSSPROJ: PMSS Project
• FEDPROJ: Federal Project
• PARCEL: Interest in a Land Parcel
• COMMORDER: Commissioners Order
• INTHOLDER: Interest Holder
• PARTY INT: Party to Interest
• PARTY NAD: Party Name & Address
• APPACTION: Appraisal Action & Cert
• APPRAISAL: Appraisal
• APPRAISER: Appraiser
• COMREPORT: Commissioners Report
• EMDOMACT: Em Domain Action: St vs.
• PETITION: Petition & Lis Pendens
• FINALCERT: Final Certificate
• TRIALSETL: Trial and Settlement
• EDPARCTRK: EmDom Parcel Tracking
• CHARGEID: Charge Identifier
• LEASE: Lease
• COMMWORK: Commissioner Hours Worked
• COMMISSION: Commissioner
• COMASSIGN: Commissioner Assignment
• OCCUPANT: Occupant Relocation
• DIRPURCH: Direct Purchase
• SUPHOUSING: Supplemental Housing
• RELOCPMTS: Relocation Payments & Appls
• LESSEE: Lessee
• MEMBERS: Household Members
• OCCATTRNY: Occupant Attorney NAD
• IMPROVEMENT: Improvements on R/W Parcel
• REMOVCONT: Removal Contract
• SALESACT: Sales Action
• CONTRACTOR: Contractor
• OTHERBIDS: Other Bids
Data Modeling
GUIDELINES for GATHERING & RECORDING Information:
1. PERCEPTIONS IN MINDS OF KEY USERS
2. EXISTING FILES/SCREENS/FORMS/REPORTS ONLY CLUES
3. DOCUMENTATION GUIDELINES: Parts and organization; Diagramming conventions
4. GROWING THE DOCUMENTATION: ENTITIES FIRST
5. DISCOVERING ENTITIES: What is a file?
6. NAMING AND DESCRIBING ENTITIES
7. FOLLOWING THE RULES FOR LOGICAL DATABASE DESIGN
8. UNCONSTRAINED BY IMPLEMENTATION/SYSTEM LIMITATIONS
9. LOOKING FOR THE EXTREMES; NOT THE TYPICAL
10. SEEKING CONSENSUS AMONG THE USERS
Conducting Data Modeling Project Meetings
BEST PRACTICES:
• Get user top management support & commitment
• Don’t limit to a fixed deadline
• Get the “right” people to the table; ask what they ‘do’
• Set and agree on the project scope early
• Be inclusive in the design
• Break expectation that it will all be implemented
• Biweekly, ½ day meetings
• Focus on finding entities, relationships, & characteristics
• Grow the documentation (not meeting minutes)
following guidelines, modeling scheme, and notation
• Facilitator – an “outsider” (know the process, not the domain)
• Scribe – an “insider” (so organization takes ownership)
CAUTION: must be open, balanced, willing to record all viewpoints
• Use a data modeling CASE tool
to iterate on revisions to diagrams and documentation
Gathering Business User Requirements
from user domain experts (not IS people):
• Interviews
+ everyone gets heard
° one at a time
+ requires less interviewee time
° small group (homogeneous)
+ interaction stimulates ideas
• Facilitated Group Sessions
° Accelerated
+ less elapsed time (intensive 1-3 days)
+ creative brainstorming
+ raise issues
+ set priorities (voting)
? build consensus? resolve issues?
° Extended
+ advantage
+ to achieve common, accepted design
Interviews vs. Group Meetings
The “sweet” spots, plotted by # PARTICIPANTS (1, 2-3, 5-10, Many) against # MEETINGS (1, 2 (Follow-up), Many):
• Interviews: Managers, Executives, Visionaries
• ACCELERATED (“JAD” session): for brainstorming, straw votes, and setting priorities
• EXTENDED: for Database Design
Interviews: Preparation
• Understand Background
– the business - its strategic direction
– the industry - trends, competition
– the organization - formal and real organization charts
– the history - any prior initiatives
––> still IS people/interviewers DO NOT presume to know everything,
and DO pretend to know nothing (to ask the “dumb” questions).
• Select Interviewees
– horizontal and vertical cross section
– the visionaries; the thorns in the side; the power users
• Project Kick-Off Meeting
– with impacted users and their management
– introduced by user management sponsor
– convey commitment, scope, expectations, required user involvement
• Pre-Interview Letter
– from project sponsor: internal respected authority
– logistics and what to bring
• Plan a Structured Interview
Conducting the Interview
• Think through what you need to discover
• Prepare a single prompting sheet of topics/questions
• Lead Interviewer + Scribe + Observers
• REVIEW Project Purpose and Scope
• Let USERS TALK about what they DO, what they know
(stay within their comfort zone… initially)
• then LISTEN carefully for expressions of:
– vision, strategies, priorities, strengths, problems,
suggestions for improvement, …
• ASK the classic questions:
– why, how (much), who, where, when, what if, what then.
• FLAG the nouns and verbs
– Nouns become entities
– Verbs become relationships
Accelerated vs. Extended
Design Approaches
ACCELERATED:
• TASK: DATA PLANNING
• SCOPE: Division-wide data model
• SCHEME: Entity-Relationship Modeling
• APPROACH: Accelerated Workshop
• DURATION: 5 consecutive days
• PEOPLE, ORGS: 76 participants from Forestry + 10 other agencies
• LEADER/S: 2 facilitators (also as scribes)
EXTENDED:
• TASK: DETAILED DESIGN
• SCOPE: Forest inventory database
• SCHEME: Extended E-R Modeling
• APPROACH: Extended Project Meetings
• DURATION: Biweekly 1/2 day - 6 months
• PEOPLE, ORGS: 11 participants from Forestry, Fish & Wildlife
• LEADER/S: 1 facilitator (also as scribe)
Results: Accelerated Approach
for Data Planning & Modeling
TASK: TIME (days), Planned / Actual
• Intro / Kickoff / Training: 1 / 1
• Define ENTITIES: 1 / 2 1/2*
• Define ATTRIBUTES: 1 / 3/4
• Define RELATIONSHIPS: 1 / 3/4
• Partition and Prioritize: 1 / 0
*Difficult and contentious, so facilitators decided arbitrarily to move on. Entity definitions
were incomplete, missing, or poorly stated, with no consensus reached, which
hindered definition of attributes and relationships in the remainder of the workshop.
A global data model was not produced, nor were detailed design projects defined and prioritized.
The contractor promised to develop these later in the final report of the workshop.
User Surveys
• Data planning workshop was unsatisfactory; the final report
omitted the "where used" matrix, and the global data model
was useless. Contractor released. No user validation.
• Comparison of user survey results confounded by
contractor's apparent lack of experience, preparation,
organization and management of the workshop.
Novice facilitators in both approaches.
• Extended bi-weekly meetings: participants were willing to
do it on another project and felt this project was
completed, but were uncomfortable stating that a good data
model had been produced.
Lessons Learned
• Accelerated approach may be good for eliciting information
requirements and setting priorities, not for database design.
• Difficult to reach consensus with a broad scope ...
necessitating 76 participants.
• Experienced, prepared facilitator is critical...
the accelerated approach is unforgiving for the novice.
• Clearly define and communicate organizational goals,
expectations, and outcomes / deliverables.
• User domain experts: get the best; use as needed.
• Top management support to ensure good participation.
• Facilitator: expert in the process, but not the domain.
• Dedicated scribe from within the organization… to take ownership.
• Select the first design project with a manageable scope to
ensure success, and increase future mgmt and user buy-in.
• Consider using a blend of the two approaches.
Advice from Others
• Terry Moriarty
• Dan Moody
• Graeme Simsion
• Dick Barden
Conflicting Objectives in IS Development
T. Moriarty, “… Data Modelers!” Intelligent Enterprise (3:1), 2000 Jan.
• User Domain / Subject Matter Experts (SME’s)
//
Application Systems
• (Business) Process Analysis
• Process Models (DFD’s)
Object-Oriented Development
• Implement in OO Programming Languages
• Object Models; Use Cases
• (UML) Class Diagrams
• State Transition Diagrams
Data Warehousing / Data Marts
• Multi-dimensional models
Data Modeling
• Focus on Data
• Singular Objective
(NOT implementation)
• Precise Thinking
• Rich Semantics
-probe for hidden meaning
• (Shared) (Integrated)
“Enterprise” Models
• Normalized ER Diagrams
RELATIONAL
DATA MODELS
• Implementation
in RDBMS
What do Data Modelers bring to the table?
– Strengths and Perceived Disadvantages
How to get invited to the table, to be involved in IS Development?
Dan Moody’s “Seven Habits”
• IMMERSE yourself in the client/user environment
– See it for yourself
• CHALLENGE
– Generate alternatives, test the boundaries, find the exceptions
• GENERALIZE, discern the underlying similarities of entities
– Keep it simple, reduce the number of entities
• TEST out the model; have users validate the model
– Examine every relationship … in both directions
• LIMIT the Time and set the Scope up front
– Know when to stop
• INTEGRATE with existing systems and databases
– Keep an eye on the big picture
• COMPLETE – resolve ambiguities; handle the exceptions
– Follow the job through to completion
Daniel MOODY, “The Seven Habits of Highly Effective Data Modelers,”
Database Programming and Design (9:10), 1996 October, pages 57 – 64.
Summarized in: Richard Watson, Data Management, 2nd ed, Wiley, 1999, p.185.
Simsion’s Foundation Principles
G. Simsion, Database Programming & Design (9:2), 1996 Feb.
• Data Modeling is about Design
– Different designers may produce different solutions;
there is no single correct model for a given situation,
so quality criteria are needed to make an objective choice.
• Data Modeling is Important… and NOT Optional
– Data modelers believe it; problem is persuading other stakeholders
• Data Modeling is a Discipline… requiring expertise
– Requiring Training, Practice, Experience. … Users can’t model! But..
NOT just knowing and applying some rules and conventions;
witness the difficulty in using data modeling CASE tools
• Data Modelers use Patterns … DW is a dimensional model
– e.g., hierarchies, M:N, assemblies (ring fact), orders/warehouses
• Subtypes help… Level of Generalization is critical
• Logical DB Design is the Data Modeler’s Responsibility
– DBA’s for physical design, implementation in a DBMS, performance
• Corporate (Enterprise) Data Modeling is different
– Purpose – understand global architecture; integrate; set priorities
Data Modeling as Design
• “Data Modeling is a Design activity” – Graeme Simsion
• Analysis seeks to discover the (one) truth, represented in a model
• Our perceptions of reality differ; Modeling Schemes are imperfect
• Design involves Choice – e.g., Entity/Object Types; Sub/Supertypes
• Need Criteria
Criteria for Choosing a Quality Design
• Follows the rules of construction (“grammar”) of the
data modeling scheme.
• Accurate model of the users’ real-world domain of interest
• Complete… within the defined scope
• Enables enforcement of (business) rules
• Non-redundant
• Stable
• Flexible
• Extensible
• Understandable
• Simple
• Unambiguous
• Basis for an efficient, workable implementation
SOME OF THESE IN CONFLICT and INVOLVE TRADEOFFS.
“Baloney Detection Kit” – Dick Barden
Adapted from Carl Sagan, “The Fine Art of Baloney Detection,”
The Demon-Haunted World: Science as a Candle in the Dark, 1996.
1. Seek out independent confirmation of the ‘facts’
2. Encourage substantive debate on the evidence
3. Be fair to the process; treat each expert equally
4. Spin more than one way of looking at your UoD
5. Seek out others for critical feedback – challenge
6. Populate – gather example data for the facts
7. Everything in a chain of argument must fit
8. Use the simpler one when two equally model the data
9. Ask how the examples can be falsified
10. Can others understand and accept the model?
References
• Matthew H. Pelkki, Gordon C. Everest, Dietmar W. Rose,
“Using Accelerated and Extended Approaches for Data
Planning and Design,” The Compiler (13:3), 1995 Fall.
• Terry Moriarty, “Data Modeling is Dead! Long Live Data
Modelers!” Intelligent Enterprise (3:1), 2000 Jan 1.
• Daniel Moody, “Seven Habits of Highly Effective Data
Modelers,” Database Programming and Design, 1996 October.
• Graeme Simsion, “Data Modeling: Testing the Foundations,”
Database Programming & Design (9:2), 1996 February.
• Dick Barden, “Baloney Detection Kit,” Journal of Conceptual
Modeling (10), 1999 August. www.inconcept.com/jcm