{"id":1708,"date":"2023-12-04T13:15:36","date_gmt":"2023-12-04T00:15:36","guid":{"rendered":"https:\/\/www.ronella.xyz\/?p=1708"},"modified":"2026-03-01T01:51:50","modified_gmt":"2026-02-28T12:51:50","slug":"understanding-database-normalization","status":"publish","type":"post","link":"https:\/\/www.ronella.xyz\/?p=1708","title":{"rendered":"Normalization in Relational Databases up to 3NF"},"content":{"rendered":"<p>Normalization is a design technique for relational databases that structures tables to reduce duplication and avoid data anomalies, usually by progressing through 1NF, 2NF, and 3NF. In OLTP systems, normalizing to 3NF is a strong default; you can selectively denormalize later for performance.<\/p>\n<hr \/>\n<h2>Why Normalize?<\/h2>\n<p>Normalization helps you:<\/p>\n<ul>\n<li>Avoid storing the same fact in many rows, which reduces inconsistent data.<\/li>\n<li>Prevent insert, update, and delete anomalies (for example, losing a customer when their last order is removed).<\/li>\n<li>Align the schema with real business rules, making constraints and changes easier to reason about.<\/li>\n<\/ul>\n<p>A classic outcome is breaking one big \u201cOrders\u201d sheet into <code>Customer<\/code>, <code>OrderHeader<\/code>, and <code>OrderLine<\/code> tables, each responsible for its own facts.<\/p>\n<hr \/>\n<h2>Functional Dependencies \u2013 The Simple View<\/h2>\n<p>A functional dependency simply states <strong>what decides what<\/strong> in a table.<\/p>\n<ul>\n<li>Written as <code>X \u2192 Y<\/code>.<\/li>\n<li>Read as: \u201cIf two rows have the same X, they must have the same Y.\u201d<\/li>\n<\/ul>\n<p>Examples:<\/p>\n<ul>\n<li><code>CustomerId \u2192 CustomerName<\/code><br \/>\nKnowing <code>CustomerId<\/code> uniquely fixes <code>CustomerName<\/code>.<\/li>\n<li><code>OrderId \u2192 OrderDate, CustomerId<\/code><br \/>\nKnowing <code>OrderId<\/code> uniquely fixes the date and customer for the order.<\/li>\n<\/ul>\n<p>A <strong>key<\/strong> is just a special case where the determinant decides all other columns:<\/p>\n<ul>\n<li>If <code>OrderId<\/code> is a primary key, <code>OrderId \u2192 all other columns in Order<\/code>.<\/li>\n<\/ul>\n<hr \/>\n<h2>First Normal Form (1NF)<\/h2>\n<p>1NF ensures your tables are truly relational: no lists or repeating groups.<\/p>\n<p>A table is in 1NF if:<\/p>\n<ul>\n<li>Each column holds atomic (indivisible) values, not sets or comma-separated lists.<\/li>\n<li>There are no repeating groups like <code>Phone1<\/code>, <code>Phone2<\/code>, <code>Phone3<\/code>.<\/li>\n<li>Each row is uniquely identifiable by a key.<\/li>\n<\/ul>\n<p>Example before 1NF:<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align: right;\">OrderId<\/th>\n<th>CustomerName<\/th>\n<th>Products<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align: right;\">1001<\/td>\n<td>Alice<\/td>\n<td>Mouse,Laptop<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><code>Products<\/code> contains multiple values.<\/p>\n<p>1NF version:<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align: right;\">OrderId<\/th>\n<th>CustomerName<\/th>\n<th>Product<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align: right;\">1001<\/td>\n<td>Alice<\/td>\n<td>Mouse<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: right;\">1001<\/td>\n<td>Alice<\/td>\n<td>Laptop<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Each cell is atomic, with one product per row.<\/p>\n<hr \/>\n<h2>Second Normal Form (2NF)<\/h2>\n<p>2NF applies when you have <strong>composite keys<\/strong>, and it removes dependencies on just part of the key.<\/p>\n<p>A table is in 2NF if:<\/p>\n<ul>\n<li>It is already in 1NF.<\/li>\n<li>Every non-key column depends on the whole key, not just part.<\/li>\n<\/ul>\n<p>Suppose:<\/p>\n<pre><code class=\"language-sql\">OrderLine(\n  OrderId,\n  ProductId,\n  OrderDate,\n  CustomerId,\n  ProductName,\n  UnitPrice,\n  Quantity\n)\n\nPrimary key: (OrderId, ProductId)<\/code><\/pre>\n<p>Typical \u201cdecides\u201d rules:<\/p>\n<ul>\n<li><code>(OrderId, ProductId) \u2192 Quantity<\/code><\/li>\n<li><code>OrderId \u2192 OrderDate, CustomerId<\/code><\/li>\n<li><code>ProductId \u2192 ProductName, UnitPrice<\/code><\/li>\n<\/ul>\n<p>Here:<\/p>\n<ul>\n<li><code>OrderDate<\/code> and <code>CustomerId<\/code> depend only on <code>OrderId<\/code>.<\/li>\n<li><code>ProductName<\/code> and <code>UnitPrice<\/code> depend only on <code>ProductId<\/code>.<\/li>\n<\/ul>\n<p>These are <strong>partial dependencies<\/strong>, so the table is not in 2NF.<\/p>\n<p>We decompose:<\/p>\n<pre><code class=\"language-sql\">OrderHeader(\n  OrderId    PK,\n  OrderDate,\n  CustomerId\n)\n\nProduct(\n  ProductId  PK,\n  ProductName,\n  UnitPrice\n)\n\nOrderLine(\n  OrderId    FK,\n  ProductId  FK,\n  Quantity,\n  PRIMARY KEY (OrderId, ProductId)\n)<\/code><\/pre>\n<p>Now each non-key column in <code>OrderLine<\/code> depends on the full key <code>(OrderId, ProductId)<\/code>.<\/p>\n<hr \/>\n<h2>Third Normal Form (3NF)<\/h2>\n<p>3NF removes <strong>transitive dependencies<\/strong>, where a non-key column depends on the key through another non-key column.<\/p>\n<p>A table is in 3NF if:<\/p>\n<ul>\n<li>It is in 2NF.<\/li>\n<li>No non-key column depends on another non-key column (no transitive dependency).<\/li>\n<\/ul>\n<p>Using <code>OrderHeader<\/code>:<\/p>\n<pre><code class=\"language-sql\">OrderHeader(\n  OrderId      PK,\n  OrderDate,\n  CustomerId,\n  CustomerName,\n  CustomerCity\n)<\/code><\/pre>\n<p>Assume:<\/p>\n<ul>\n<li><code>OrderId \u2192 OrderDate, CustomerId<\/code><\/li>\n<li><code>CustomerId \u2192 CustomerName, CustomerCity<\/code><\/li>\n<\/ul>\n<p>Then:<\/p>\n<ul>\n<li><code>OrderId \u2192 CustomerName, CustomerCity<\/code> via <code>CustomerId<\/code>.<\/li>\n<\/ul>\n<p><code>CustomerName<\/code> and <code>CustomerCity<\/code> are <strong>transitively<\/strong> dependent on <code>OrderId<\/code> through <code>CustomerId<\/code>, so this violates 3NF.<\/p>\n<p>We fix this by splitting customer details:<\/p>\n<pre><code class=\"language-sql\">Customer(\n  CustomerId   PK,\n  CustomerName,\n  CustomerCity\n)\n\nOrderHeader(\n  OrderId      PK,\n  OrderDate,\n  CustomerId   FK\n)<\/code><\/pre>\n<p>Now each table\u2019s non-key columns depend directly on its key.<\/p>\n<hr \/>\n<h2>Summary Table: 1NF, 2NF, 3NF<\/h2>\n<p>The table below captures the core rule and main problem each normal form addresses.<\/p>\n<table>\n<thead>\n<tr>\n<th>Normal Form<\/th>\n<th>Rule<\/th>\n<th>Main problem removed<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1NF<\/td>\n<td>Only atomic values; no repeating groups; unique rows.<\/td>\n<td>Multi-valued cells and spreadsheet-style repetition.<\/td>\n<\/tr>\n<tr>\n<td>2NF<\/td>\n<td>Already in 1NF and no partial dependency on part of a composite key.<\/td>\n<td>Redundancy caused by attributes tied to only part of the key.<\/td>\n<\/tr>\n<tr>\n<td>3NF<\/td>\n<td>Already in 2NF and no non-key depends on another non-key (no transitive dependency).<\/td>\n<td>Redundancy and anomalies from attributes that depend on other attributes, not directly on the key.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n","protected":false},"excerpt":{"rendered":"<p>Normalization is a design technique for relational databases that structures tables to reduce duplication and avoid data anomalies, usually by progressing through 1NF, 2NF, and 3NF. In OLTP systems, normalizing to 3NF is a strong default; you can selectively denormalize later for performance. Why Normalize? Normalization helps you: Avoid storing the same fact in many [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[58],"tags":[],"_links":{"self":[{"href":"https:\/\/www.ronella.xyz\/index.php?rest_route=\/wp\/v2\/posts\/1708"}],"collection":[{"href":"https:\/\/www.ronella.xyz\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ronella.xyz\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ronella.xyz\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ronella.xyz\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1708"}],"version-history":[{"count":4,"href":"https:\/\/www.ronella.xyz\/index.php?rest_route=\/wp\/v2\/posts\/1708\/revisions"}],"predecessor-version":[{"id":2155,"href":"https:\/\/www.ronella.xyz\/index.php?rest_route=\/wp\/v2\/posts\/1708\/revisions\/2155"}],"wp:attachment":[{"href":"https:\/\/www.ronella.xyz\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1708"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ronella.xyz\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1708"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ronella.xyz\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1708"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}