Download First normal form

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Open Database Connectivity wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

SQL wikipedia , lookup

Concurrency control wikipedia , lookup

Ingres (database) wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Database wikipedia , lookup

ContactPoint wikipedia , lookup

Clusterpoint wikipedia , lookup

Relational algebra wikipedia , lookup

Database model wikipedia , lookup

Relational model wikipedia , lookup

Transcript
First normal form
First normal form (1NF) is a property of a relation in
a relational database. A relation is in first normal form if
the domain of each attribute contains only atomic values,
and the value of each attribute contains only a single value
from that domain.[1]
However, if we assume that two customers can in principle have the same name (although there are no such
in the current table), this database design does not meet
the more stringent requirements of third normal form and
possibly not even second normal form.[5]
Edgar Codd, in a 1971 conference paper, defined a relation in first normal form to be one such that none of the
domains of that relation should have elements which are 1.3
themselves sets.[2]
First normal form is an essential property of a relation
in a relational database. Database normalization is the
process of representing a database in terms of relations
in standard normal forms, where first normal is a minimal
requirement.
Another design for the same data makes use of two tables: a Customer Name table and a Customer Telephone
Number table.
Indeed, repeating groups of telephone numbers do not occur in this design. Instead, each Customer-to-Telephone
Number link appears on its own record. With Customer
ID as key, a one-to-many relationship exists between the
two tables. A record in the “parent” table, Customer
Name, can have many telephone number records in the
“child” table, Customer Telephone Number, but each
telephone number belongs to one, and only one customer.
It is worth noting that this design meets the additional requirements for second and third normal form.
First normal form enforces below criteria:
• Eliminate repeating groups in individual tables.
• Create a separate table for each set of related data.
• Identify each set of related data with a primary key
1
Examples
2 Atomicity
The following scenario illustrates how a database design
might violate first normal form.[3][4]
1.1
A design that also complies with higher
normal forms
Edgar F. Codd's definition of 1NF makes reference to the
concept of 'atomicity'. Codd states that the “values in the
domains on which each relation is defined are required
to be atomic with respect to the DBMS.”[6] Codd defines
an atomic value as one that “cannot be decomposed into
smaller pieces by the DBMS (excluding certain special
functions)"[7] meaning a field should not be divided into
parts with more than one kind of data in it such that what
one part means to the DBMS depends on another part of
the same field.
Domains and values
Suppose a designer wishes to record the names and telephone numbers of customers. He or she defines a customer table which looks like this:
The designer then becomes aware of a requirement to
record multiple telephone numbers for some customers.
He or she reasons that the simplest way of doing this is to
allow the “Telephone Number” field in any given record
Hugh Darwen and Chris Date have suggested that Codd’s
to contain more than one value:
concept of an “atomic value” is ambiguous, and that this
Assuming, however, that the Telephone Number column ambiguity has led to widespread confusion about how
is defined on some telephone number-like domain, such 1NF should be understood.[8][9] In particular, the notion
as the domain of 12-character strings, the representation of a “value that cannot be decomposed” is problematic,
above is not in first normal form.
as it would seem to imply that few, if any, data types are
atomic:
1.2
A design that complies with 1NF
• A character string would seem not to be atomic, as
the RDBMS typically provides operators to decompose it into substrings.
In the first normal form, the previous table can be represented in the following way.
1
2
4
• A fixed-point number would seem not to be atomic,
as the RDBMS typically provides operators to decompose it into integer and fractional components.
• An ISBN would seem not to be atomic, as it includes
language and publisher identifiers.
Date suggests that “the notion of atomicity has no absolute meaning":[10] a value may be considered atomic for
some purposes, but may be considered an assemblage of
more basic elements for other purposes. If this position is
accepted, 1NF cannot be defined with reference to atomicity. Columns of any conceivable data type (from string
types and numeric types to array types and table types)
are then acceptable in a 1NF table—although perhaps not
always desirable; for example, it would be more desirable
to separate a Customer Name field into two separate fields
as First Name, Surname.
REFERENCES
• A view whose definition mandates that results be returned in a particular order, so that the row-ordering
is an intrinsic and meaningful aspect of the view.[13]
This violates condition 1. The tuples in true relations are not ordered with respect to each other.
• A table with at least one nullable attribute. A nullable attribute would be in violation of condition
4, which requires every field to contain exactly one
value from its column’s domain. It should be noted,
however, that this aspect of condition 4 is controversial. It marks an important departure from Codd's
later vision of the relational model,[14] which made
explicit provision for nulls.[15]
4 References
First normal form, as defined by Chris Date, permits
relation-valued attributes (tables within tables). Date argues that relation-valued attributes, by means of which a
field within a table can contain a table, are useful in rare
cases.[11]
[1] Elmasri, Ramez and Navathe, Shamkant B. (July 2003).
Fundamentals of Database Systems, Fourth Edition. Pearson. p. 315. ISBN 0321204484. It states that the domain
of an attribute must include only atomic (simple, indivisible) values and that the value of any attribute in a tuple
must be a single value from the domain of that attribute.
3
[2] E. F. Codd (Oct 1972), Further normalization of the
database relational model, Courant Institute: PrenticeHall, ISBN 013196741X, A relation is in first normal
form if it has the property that none of its domains has
elements which are themselves sets.
1NF tables as representations of
relations
According to Date’s definition, a table is in first normal
form if and only if it is "isomorphic to some relation”,
which means, specifically, that it satisfies the following
five conditions:[12]
1. There’s no top-to-bottom ordering to
the rows.
2. There’s no left-to-right ordering to the
columns.
3. There are no duplicate rows.
4. Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
5. All columns are regular [i.e. rows have
no hidden components such as row IDs,
object IDs, or hidden timestamps].
Violation of any of these conditions would mean that the
table is not strictly relational, and therefore that it is not
in first normal form.
[3] studytonight.com
[4] stackoverflow.com
[5] Indeed, assuming that different customers can in principle
have the same name and that different customers can in
principle have the same telephone number, this database
design is not in second normal form since “First Name”
and “Surname” are non-prime attributes that depend only
on Customer ID, not the whole candidate key (Customer
ID, Telephone number). If we assume that different customers must have different telephone numbers then Telephone number is the only candidate key and the database
is in second normal form but not in third normal form
because “First Name” and “Surname” are non-prime attributes that depend on the non-prime attribute Customer
ID. On the other hand, if we assumed that different customers must have different names but can have the same
telephone number, this database design would be both in
second and third normal form because there would be two
candidate keys (Customer ID, Telephone number) and
(First Name, Surname, Telephone number) and no nonprime attributes.
[6] Codd, E. F. The Relational Model for Database Management Version 2 (Addison-Wesley, 1990).
Examples of tables (or views) that would not meet this
definition of first normal form are:
[7] Codd, E. F. The Relational Model for Database Management Version 2 (Addison-Wesley, 1990), p. 6.
• A table that lacks a unique key. Such a table would
be able to accommodate duplicate rows, in violation
of condition 3.
[8] Darwen, Hugh. “Relation-Valued Attributes; or, Will the
Real First Normal Form Please Stand Up?", in C. J. Date
and Hugh Darwen, Relational Database Writings 19891991 (Addison-Wesley, 1992).
3
[9] "[F]or many years,” writes Date, “I was as confused as
anyone else. What’s worse, I did my best (worst?) to
spread that confusion through my writings, seminars, and
other presentations.” Date, C. J. ["What First Normal
Form Really Means"] in Date on Database: Writings
2000-2006 (Springer-Verlag, 2006), p. 108
[10] Date, C. J. ["What First Normal Form Really Means"] p.
112.
[11] Date, C. J. ["What First Normal Form Really Means"] pp.
121–126.
[12] Date, C. J. ["What First Normal Form Really Means"] pp.
127–128.
[13] Such views cannot be created using SQL that conforms to
the SQL:2003 standard.
[14] “Codd first defined the relational model in 1969 and didn't
introduce nulls until 1979” Date, C. J. SQL and Relational
Theory (O'Reilly, 2009), Appendix A.2.
[15] The third of Codd’s 12 rules states that “Null values ...
[must be] supported in a fully relational DBMS for representing missing information and inapplicable information
in a systematic way, independent of data type.” Codd, E.
F. “Is Your DBMS Really Relational?" Computerworld,
October 14, 1985.
5
Further reading
• Litt’s Tips: Normalization
• Date, C. J., & Lorentzos, N., & Darwen, H. (2002).
Temporal Data & the Relational Model (1st ed.).
Morgan Kaufmann. ISBN 1-55860-855-9.
• Date, C. J. (1999), An Introduction to Database Systems (8th ed.). Addison-Wesley Longman. ISBN
0-321-19784-4.
• Kent, W. (1983) A Simple Guide to Five Normal
Forms in Relational Database Theory, Communications of the ACM, vol. 26, pp. 120–125
4
6 TEXT AND IMAGE SOURCES, CONTRIBUTORS, AND LICENSES
6
Text and image sources, contributors, and licenses
6.1
Text
• First normal form Source: https://en.wikipedia.org/wiki/First_normal_form?oldid=676829675 Contributors: General Wesc, Boson,
Rochkind, Jason Quinn, CanisRufus, Alxndr, Alansohn, Wtmitchell, Duplode, Crosbiesmith, Jacobolus, GregorB, Isnow, Mahlon, Heathcliff, Bgwhite, Gwernol, NawlinWiki, VinceBowdren, Nabav, Jgzheng, Closedmouth, SmackBot, Eibcga, Brick Thrower, Lordmwesh,
Jerome Charles Potts, Mfpinhal, Dfass, DanielLemire, Ambuj.Saxena, Dreftymac, Arun an, Funnyfarmofdoom, Jordan Brown, Morra,
Ebrahim, Davidhorman, Dougher, VictorAnyakin, Magioladitis, GermanX, STBot, Kwetal, Montchav, Svilenv, Brianga, Mystagogue,
Flyer22, SilverbackNet, LarRan, ClueBot, Bernard Ladenthin, Saravu2k, Rhododendrites, Burner0718, DumZiBoT, Jaan Vajakas, Addbot, M.r santosh kumar., Yobot, Fraggle81, Materialscientist, Eallik, Vegpuff, ReformatMe, Hobsonlane, Mark Renier, Hamidrizeh,
RBarryYoung, John of Reading, BillyPreset, Dixtosa, Aeonx, ClueBot NG,
, MerlIwBot, Whitmerj, Sespanta, Klilidiplomus, Plavozont, Alphama, Graphium and Anonymous: 149
6.2
Images
6.3
Content license
• Creative Commons Attribution-Share Alike 3.0