Deleting data is not always necessary or desired in a database. It sometimes even happens, in archive databases, that it is prohibited, except for purely technical reasons. However, in many cases erasing data that has become unnecessary becomes a necessity, if only for performance reasons.
Erasing data can become very problematic, even in (reasonably) simple basic structures. The main problem that this can pose is that of referential integrity, that is to say cross references between primary keys. In general, we refrain from deleting a data referenced by another. If this data is obsolete, but the first is not, there is no choice but to keep the first. It is like this that databases turn into heaps of data, gradually lose their structure, duplicates appear, and things become more and more unusable.
The general form of a
deleteis as follows.
A basic precaution before running a
deleteis to perform a
selectwith the same clause
where. This makes it possible to verify that the data which will be erased are indeed the correct ones, and to prevent possible disasters.
Note that we can perform sub-queries in the clause
As has been said, the presence of a foreign key can prevent the erasure of a record. There is a way to remedy this, when declaring this foreign key. If this key was created with the option
on cascade delete, then the deletion is possible, and all the data that references the data being deleted will also be deleted.
This mechanism is very practical provided it is perfectly mastered. Otherwise, an a priori harmless erasure can lead to the erasure of quantities of other data, without it being really possible to predict which in a simple manner. The option should be handled with the greatest care.
on cascade delete
The operation of
deletefollows a very precise order, very important in some cases. The server proceeds in two stages.
First, it first scans the table for the rows that meet the specified condition. It marks each of these lines, which will be erased later.
Secondly, it erases the marked lines. This erasure can moreover be a process common to all the tables, which empties the marked data at regular time intervals.
Let’s see on an example why this way of doing things is important. Suppose we want to erase sailors from our table, whose age is less than the average age of sailors.
Example 52. Erasure whose clause depends on what is erased
delete from Sailors where ddmort - ddnaissance & lt; ( select avg (ddmort - birth) from Sailors);
In this case, the nested query is first evaluated, and the result compared to the subtraction. These two operations are done on each line. If the erase was immediate, the result of the averaging would change with each row, the final result of the operation would change according to the order in which the rows were processed.
As the calculation is done in two passes, the process works as expected. First, the lines to be erased are marked, but are always taken into account in the calculation of the average. Once all the lines have been examined, the erasures are carried out.