Database Normalize 数据库正则化
2019-06-27
Coding
Database
👋 ‍️‍️阅读
❤️ 喜欢
💬 评论

# Database Normalize 数据库正则化

Database normalization is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. It was first proposed by Edgar F. Codd as an integral part of his relational model. Normalization entails organizing the columns (attributes) and tables (relations) of a database to ensure that their dependencies are properly enforced by database integrity constraints. It is accomplished by applying some formal rules either by a process of synthesis (creating a new database design) or decomposition (improving an existing database design).

## Concepts

### Superkey 超键

A superkey or super-key is defined in the relational model of database organization as a set of attributes of a relation variable for which it holds that in all relations assigned to that variable, there are no two distinct tuples (rows) that have the same values for the attributes in this set.[1] It can be defined as a set of attributes of a relation schema upon which all attributes of the schema are functionally dependent.

### Candidate Key 候选键

A minimal superkey for that relation; that is, a set of attributes such that:

• the relation does not have two distinct tuples (i.e. rows or records in common database language) with the same values for these attributes (which means that the set of attributes is a superkey)
• there is no proper subset of these attributes for which (1) holds (which means that the set is minimal).

### Functionally Dependent 函数依赖

Given a relation R, a set of attributes X in R is said to functionally determine another set of attributes Y, also in R, (written $X → Y$) if, and only if, each X value in R is associated with precisely one Y value in R; R is then said to satisfy the functional dependency $X → Y$. Equivalently, the projection $\Pi _{X,Y}R$ is a function, i.e. Y is a function of X. In simple words, if the values for the X attributes are known (say they are x), then the values for the Y attributes corresponding to x can be determined by looking them up in any tuple of R containing x. Customarily X is called the determinant set and Y the dependent set. A functional dependency FD: $X → Y$ is called trivial if Y is a subset of X.

### Multivalued Dependency 多值依赖

if we denote by $(x,y,z)$ the tuple having values for $\alpha, \beta, R-\alpha -\beta$ collectively equal to $x, y, z$, correspondingly, then whenever the tuples $(a,b,c)$ and $(a,d,e)$ exist in {\displaystyle r} r, the tuples {\displaystyle (a,b,e)} (a,b,e) and $(a,d,c)$ should also exist in $r$.

### Join Dependency 连接依赖

A table $T$ is subject to a join dependency if $T$ can always be recreated by joining multiple tables each having a subset of the attributes of $T$. If one of the tables in the join has all the attributes of the table $T$, the join dependency is called trivial.

## 1NF

### Definition

The domain of each attribute contains only atomic (indivisible) values, and the value of each attribute contains only a single value from that domain.

### Sample

NameTypePhone
DeanCHD GUI1801234567
AshkanCHD Server9962653205,86010123
JasonD4C GUI1807654321
DavidD4C Server9962653208,86021230

Defects

• Find all GUI team members or find all CHD team members
• Dean get a new phone number or Ashkan need to remove a phone number

With 1NF

Click to show

NameProductType
DeanCHDGUI
AshkanCHDServer
JasonD4CGUI
DavidD4CServer

NamePhone
Dean1801234567
Ashkan9962653205
Ashkan86010123
Jason1807654321
David9962653208
David86021230

## 2NF

### Definition

Any non-prime attribute that is functionally dependent on any proper subset of any candidate key of the relation. A non-prime attribute of a relation is an attribute that is not a part of any candidate key of the relation.

### Sample

NameLocationTeamManagerTypeProjectName
WenZheSZPLTLiangSharkTankFindCode
XiaoLongSJPMGeorgeSharkTankSTAR
WenZheSZPLTLiangABCGolang

Defects

• Some data duplicate many times
• WenZhe transfer to another team
• If XiaoLong exit the project, after we delete the row. Where is XiaoLong? Who is his manager?

With 2NF

Click to show

NameLocationTeamManager
WenZheSZPLTLiang
XiaoLongSJPMGeorge

NameTypeProjectName
WenZheSharkTankFindCode
XiaoLongSharkTankSTAR
WenZheABCGolang

## 3NF

### Definition

No non-prime (non-key) attribute is transitively dependent of any key i.e. no non-prime attribute depends on other non-prime attributes. All the non-prime attributes must depend on the primary key only.

### Sample

NameLocationTeamManager
WenZheSZPLTLiang
ZhengPuSZPLTLiang
XiaoLongSJPMGeorge

Defects

• WenZhe transfer to another team
• PLT transfer to another manager
• Team without member has no manager

With 3NF

Click to show

NameLocationTeam
WenZheSZPLT
ZhengPuSZPLT
XiaoLongSJPM

LocationTeamManager
SZPLTLiang
SJPMGeorge

## BCNF

### Definition

For every dependency $X \rightarrow Y$, one of the following conditions hold true:

• $X \rightarrow Y$ is a trivial functional dependency (i.e., $Y$ is a subset of $X$)
• $X$ is a superkey for schema R

## 4NF

### Definition

For every one of its non-trivial multivalued dependencies $X \twoheadrightarrow Y$, $X$ is a superkey—that is, $X$ is a super key.

### Sample

RestaurantFoodDelivery Area
KFCBurgerW.Trimble
KFCBurgerE.Trimble
KFCRiceW.Trimble
KFCRiceE.Trimble
BKBurgerW.Trimble

Suppose that for each area they have same food

Defects

• KFC has new food: dumpling
• KFC has new Delivery Area: N.Trimble

With 4NF

Click to show

RestaurantFood
KFCBurger
KFCRice
BKBurger

RestaurantDelivery Area
KFCW.Trimble
KFCE.Trimble
BKW.Trimble

## 5NF

### Definition

Every non-trivial join dependency in that table is implied by the candidate keys.

### Sample

RestaurantFoodDelivery Area
KFCBurgerChina
KFCBurgerUS
KFCRiceChina
BKBurgerUS

Defects

• KFC has new food dumpling that only Chinese eat it
• KFC has new Delivery Area, North Korea, where need Rice but not Burger

With 5NF

Click to show

RestaurantFood
KFCBurger
KFCRice
BKBurger

RestaurantDelivery Area
KFCChina
KFCUS
BKUS

FoodDelivery Area
BurgerChina
BurgerUS
RiceChina

## 6NF

### Definition

No nontrivial join dependencies at all.

## Summary

1. Un-Normalize 看山是山
2. Normalize 看山不是山
3. De-Normalize (Not-Only-Normalize) 看山还是山(看山不只是山)