Normalization - Yousef's Notes
Normalization

Normalization

Normalization is the process to eliminate data redundancy and enhance data integrity in the table. Normalization also helps to organize the data in the database. It is a multi-step process that sets the data into tabular form and removes the duplicated data from the relational tables.

#First Normal Form (1NF): Ensuring Atomicity and Uniqueness

A table is in 1NF if:

  • Each column contains atomic (indivisible) values – meaning each field must contain only one value, not multiple values or lists.
  • Each row must be unique – so there are no duplicate rows. eg
student_id name courses
1 John Doe Math, Science
2 Jane Doe History, Literature
this table is not in 1NF because the courses column holds multiple values (Math, Science). To bring this table into 1NF, we need to split the multiple values into separate rows.
student_id name course
1 John Doe Math
1 John Doe Science
2 Jane Doe History
2 Jane Doe Literature
now each field has atomic values and each row is unique – so the table is in 1NF.

#Second Normal Form (2NF): Removing Partial Dependencies

A table is in 2NF if:

  • It is already in 1NF.
  • Every non-key column is fully dependent on the entire [[Primary Keys |primary key]], meaning there are no partial dependencies.
  • Partial dependency occurs when a non-key column depends only on part of a composite primary key (when the primary key consists of more than one column). Suppose you have a student_courses table like this:
student_id course_id student_name course_name
1 101 John Doe Math
1 102 John Doe Science
Here, the student_name and course_name columns depend only on part of the composite key (either student_id or course_id), but not both. This violates 2NF.

#Students Table:

student_id student_name
1 John Doe
2 Jane Doe

#Courses Table:

course_id course_name
101 Math
102 Science

#StudentCourses Table (Associating students with their courses):

student_id course_id
1 101
1 102
now, every non-key column is fully dependent on the entire primary key, and the table is in 2NF.

#Third Normal Form (3NF): Removing Transitive Dependencies

A table is in 3NF if:

  • It is already in 2NF.
  • There are no transitive dependencies, meaning no non-key column depends on another non-key column.
student_id student_name department department_location
1 John Doe Math Building A
2 Jane Doe History Building B
in this table, department_location depends on department, not directly on student_id. This is a transitive dependency and violates 3NF.

To remove the transitive dependency, split the table:

#Students Table:

student_id student_name department
1 John Doe Math
2 Jane Doe History

#Departments Table:

department department_location
Math Building A
History Building B