This is a read-only copy of the MariaDB Knowledgebase generated on 2024-11-14. For the latest, interactive version please visit https://mariadb.com/kb/.

Database Normalization: 4th Normal Form

This article is intended to be read after the Boyce-Codd normal form article.

Let's look at the situation where redundancies can creep in even though a table is in Boyce-Codd normal form. Let's take the student / instructor / course example used in that article, but change one of the initial assumptions.

Assume that the following is true for the tables below:

  • Each instructor teaches only one course
  • Each course can have one or more instructors
  • Each student can have several instructors per course (this is different to the previous example)
  • Each student can take one or more courses

Student Course Instructor data, with several instructors per course

More data anomalies

Student          Course       Instructor
Conrad Pienaar   Biology      Nkosizana Asmal
Dingaan Fortune  Mathematics  Kader Dlamini
Gerrie Jantjies  Science      Helen Ginwala
Mark Thobela     Biology      Nkosizana Asmal
Conrad Pienaar   Science      Peter Leon
Alicia Ncita     Science      Peter Leon
Quinton Andrews  Mathematics  Kader Dlamini
Dingaan Fortune  Mathematics  Helen Ginwala

The data is the same as before, except that Helen Ginwala is teaching science to Gerrie Jantjies as well as mathematics to Dingaan Fortune, and Dingaan Fortune is being taught by both Helen Ginwala and Kader Dlamini for mathematics.

The only possible key is a combination of all three attributes, as shown below. No other combination will uniquely identify a particular record.

Three attributes as key

Student Course Instructor table
Student
Instructor
Course

But this still has some potentially anomalous behavior. The fact that Kader Dlamini teaches mathematics is still stored more than once, as is the fact that Dingaan Fortune takes mathematics. The real problem is that the table stores more than one kind of fact: a student-to-course relationship as well as a student-to-instructor relationship. You can avoid this, as always, by separating the data into two tables, as shown below:

Creating a table for the student to instructor relationship

Student Instructor table
Student
Instructor

Creating a table for the student to course relationship

Student Course table
Student
Course
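
The split into two tables can be illustrated with a small sketch. Python is not used in this article and is chosen here only for illustration; the names rows, repeated, student_instructor and student_course are hypothetical. The sketch counts how often a student-to-course fact repeats in the single table, then projects the sample rows onto the two column pairs.

```python
from collections import Counter

# Sample rows from the table above: (student, course, instructor).
rows = [
    ("Conrad Pienaar", "Biology", "Nkosizana Asmal"),
    ("Dingaan Fortune", "Mathematics", "Kader Dlamini"),
    ("Gerrie Jantjies", "Science", "Helen Ginwala"),
    ("Mark Thobela", "Biology", "Nkosizana Asmal"),
    ("Conrad Pienaar", "Science", "Peter Leon"),
    ("Alicia Ncita", "Science", "Peter Leon"),
    ("Quinton Andrews", "Mathematics", "Kader Dlamini"),
    ("Dingaan Fortune", "Mathematics", "Helen Ginwala"),
]

# In the single table, a student-to-course fact is stored once per instructor.
student_course_counts = Counter((s, c) for s, c, _ in rows)
repeated = {pair: n for pair, n in student_course_counts.items() if n > 1}

# Projecting onto two columns and collecting into a set keeps each fact once.
student_instructor = sorted({(s, i) for s, _, i in rows})
student_course = sorted({(s, c) for s, c, _ in rows})

print(repeated)
print(len(rows), len(student_instructor), len(student_course))
```

With the sample data, the only repeated student-to-course fact is that Dingaan Fortune takes mathematics, and the student-to-course projection has seven rows instead of eight. The sketch only shows the projection; it does not check whether the original rows can be rebuilt from the two tables.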

This situation exists when you have multiple multivalued dependencies. A multivalued dependency exists between two attributes when, for each value of the first attribute, there are one or more associated values of the second attribute. For each value of student, there are one or more associated values of course. This is the first multivalued dependency. Likewise, for each value of student, there are one or more associated values of instructor. This is the second multivalued dependency.
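
The two multivalued dependencies can be made visible by grouping the sample rows by student. This is a minimal sketch in Python, which this article does not otherwise use; courses_by_student and instructors_by_student are hypothetical names introduced only for this illustration.

```python
from collections import defaultdict

# Sample rows from the table above: (student, course, instructor).
rows = [
    ("Conrad Pienaar", "Biology", "Nkosizana Asmal"),
    ("Dingaan Fortune", "Mathematics", "Kader Dlamini"),
    ("Gerrie Jantjies", "Science", "Helen Ginwala"),
    ("Mark Thobela", "Biology", "Nkosizana Asmal"),
    ("Conrad Pienaar", "Science", "Peter Leon"),
    ("Alicia Ncita", "Science", "Peter Leon"),
    ("Quinton Andrews", "Mathematics", "Kader Dlamini"),
    ("Dingaan Fortune", "Mathematics", "Helen Ginwala"),
]

# Each student maps to a set of courses and, separately, a set of instructors.
courses_by_student = defaultdict(set)
instructors_by_student = defaultdict(set)
for student, course, instructor in rows:
    courses_by_student[student].add(course)
    instructors_by_student[student].add(instructor)

# Print both sets of associated values for every student.
for student in sorted(courses_by_student):
    print(
        student,
        sorted(courses_by_student[student]),
        sorted(instructors_by_student[student]),
    )
```

In the sample data, Conrad Pienaar is associated with two courses, and Dingaan Fortune is associated with two instructors, so both attributes are multivalued with respect to student.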

Thus, a table is in 4th normal form if:

  • it is in Boyce-Codd normal form
  • it does not contain more than one multivalued dependency