Examples: Normal Shapes in SQL

Oct 3, 2021
Examples: Normal Shapes in SQL

2. Normal shapes

The first relational model established by Dr. Codd has twelve rules. We’re just going to see the first rules, in a simplified form.

2.1. First normal form

The first normal form essentially defines the notion of atomic information. It states the following:

  • the row order does not carry information in a table;
  • the order of the columns does not carry information in a table;
  • a table cannot contain the same row several times;
  • each cell in a row carries only one piece of information, which cannot be divided;
  • a table exposes all its information in its rows and columns (no hidden information).

All of these principles are actually quite trivial. The fact that there cannot be a duplicate row implies that each row must have an identity , which supposes that one can always define a primary key on it. Usually, this primary key is carried by a particular column, but this is not always the case. The notion of primary key is defined by the second normal form.

Note that this form indicates that each cell can only carry one piece of information. Strictly speaking, a column numéro de téléphonein our table Marinsshould therefore not have several numbers separated by commas (a case that we never meet in reality, of course!).

2.2. Second normal form

The second normal form defines the notion of primary key.

A table verifying the second normal form must first verify the first normal form. If we define a primary key PKon this table, possibly made up of several columns, then all the columns of the table must depend only on the whole of PK, and not on one of its parts.

Theoretically, this statement may seem a little complicated, on an example it becomes very simple.

Table 10. Music database – first version

name Instrument Address
Albert Floor 12, rue des fruits
Albert Harpsichord 12, rue des fruits
André Clarinet 23, flower street
André Oboe 23, flower street
André Flute 23, flower street
Antoine Bassoon 34, vegetable street

A natural approach would be to take nomeach musician’s as the primary key. This is not possible, because this name should allow access to a single row, which is not the case.

A more complex approach could be to take the fields (nom, instrument)as the primary key. This second approach also fails, since the address we have added depends only on the name of the musician, and not on the instrument he is playing. We therefore violate the fact that a column cannot depend on a component (that is to say on a part) of a primary key.

This table therefore violates the second normal form, so it has to be reorganized. We have to split our information into two tables.

Table 11. Music base – second version: musicians table

name Address
Albert 12, rue des fruits
André 23, flower street
Antoine 34, vegetable street

Table 12. Music base – second version: practiced instruments table

name Instrument
Albert Floor
Albert Harpsichord
André Clarinet
André Oboe
André Flute
Antoine Bassoon

In this last table, the primary key is made up of the two columns of the table. Strictly speaking, it would even be necessary to split this second table into two tables, following the model we used to build the relationship between our sailors and their communities of birth.

2.3. Third normal form

The second normal form imposes on us that it is not possible that a column can depend on a part of the primary key defined on the table. The third imposes the complement on us: a column does not have the right to depend on a column that is not part of the primary key.

Let us take an example of a structure which respects the second normal form, and which nevertheless poses an immediate problem. We have a first table of sailors:

Example 59. Third normal form – Marines table

create   table sailors (name, date of birth)   primary   key (name);

These sailors embark on boats, which have a tonnage:

Example 60. Third normal form – Embarkations table

create   embarkation table (sailor, embarkation_date, ship, tonnage) 
     primary   key (sailor, embarkation_date);

The table embarquementsfollows the second normal form well: no column depends on a part of the primary key. However, the column bateau, which is not part of the primary key, actually serves as the key to the column tonnage, since it tonnageof course depends on the bateau. If we change the value of bateau, which is part of the primary key, we must also change that of tonnage.

In this, it violates the third normal form: a column cannot depend on anything other than the primary key of a table.

2.4. Normal forms of higher orders

There are still many ways to normalize data. The principle underlying all these normalizations is however always the same: uniqueness of the dependencies, uniqueness of the SQL actions to be carried out for the update of a single data.

It is also up to each developer to think about his problem, and to see which normal form is the easiest to solve his problem.

Leave a Reply

Your email address will not be published. Required fields are marked *