The process of refining the structure of a database to minimise redundancy and improve integrity is called normalisation. When a database has been normalised, it is said to be in normal form. There are three normal forms:
First Normal Form
A database is in first normal form if there are no repeated fields. That means that there must only be one field for each item of data you want to score. The library database described on the previous page is not in first normal form because it has the fields book1, book2, book3, book4, book5, and book6, which all store the same information, i.e. the name of a book. First normal form not only facilitates searching, but is also more space efficient as there wouldn't be 5 empty fields being stored if borrowers only had one book.
Second Normal Form
A database is said to be in second normal form if it:
- is already in first normal form
- has no fields that aren't dependent on the whole of the key
That is to say that all fields are dependent on the whole of the key (where there is a compound key). So, take a look at the merits database at the top of the previous page. In fact, here it is again:
- Name
- Tutor group
- Teacher
- Subject
- Date
- Reason
- Head of year
- Tutor
There are no repeated fields, so it's in first normal form. If you made name the key field, however, you could see there would be fields that aren't dependent on the key. Subject, for example, is related to the teacher, not to the student. This database, therefore, is not in second normal form. Even if you were to make a compound key to include the date and the teacher giving the merit, it still wouldn't be in second normal before because subject wouldn't be dependent on the whole of the key.
Third Normal Form
A database is in third normal form if it:
- is already in second normal form
- it has no non-key dependencies
By non-key dependencies, we mean that there are no fields that are dependent on other fields that are not part of the key. For example, in the version of merits database above, where Name was the key, the subject was related to the teacher, and not to the student - that is a non-key dependency.
Real Life
But forget all that! It still sounds a bit complicated to me, and I've been developing relational databases (including professionally) for more than 10 years. Don't try to start at first normal form, and then go to second normal form, and then to third. Once you've got the hang of it, you'll knock out databases in third normal form without even thinking about it. The key thing is to think about what your entities (or tables) are going to be - if you pick the right ones it'll normalise itself.
The Normalisation Oath
Well, don't forget it all, however, because you may be asked about normalisation in an exam. Here's a little tip to remember the three stages - we call it the Normalisation Oath:
So, to get to third normal form, your non-repeating fields (first normal form) need to be dependent on the whole of the key (second normal form), and nothing other than the key (third normal form). It works for me!