Unnormalized form

From HandWiki

In database normalization, unnormalized form (UNF), also known as an unnormalized relation or non-first normal form (N1NF or NF2),[1] is a database data model (organization of data in a database) which does not meet any of the conditions of database normalization defined by the relational model. Database systems which support unnormalized data are sometimes called non-relational or NoSQL databases. In the relational model, unnormalized relations can be considered the starting point for a process of normalization. It should not be confused with denormalization, where normalization is deliberately compromised for selected tables in a relational database.

History

In 1970, E. F. Codd proposed the relational data model, now widely accepted as the standard data model.[2] At that time, office automation was the major use of data storage systems, which led to the proposal of many NF2 data models, such as the Schek model, the Jaeschke models (non-recursive and recursive algebra), and the Nested Table Data (NTD) model.[1] In 1987, IBM organized the first international workshop devoted exclusively to this topic, held in Darmstadt, Germany.[1] Considerable research has since been published on addressing the shortcomings of the relational model. Since the turn of the century, NoSQL databases have become popular owing to the demands of Web 2.0.

Relational form

Normalization to first normal form requires the initial data to be viewed as relations.[3] In database systems, relations are represented as tables. The relational view implies some constraints on the tables:

  • No duplicate rows. In practice, this is ensured by defining one or more columns as primary keys.
  • Rows do not have an intrinsic order. While tables have to be stored and presented in some order, that order is unstable and implementation-dependent. If a specific ordering needs to be represented, it has to be in the form of data, e.g. a "number" column.
  • Columns have unique names within the same table.
  • Each column has a domain (or data type) which defines the allowed values in the column.
  • All rows in a table have the same set of columns.

This definition does not preclude columns having sets or relations as values, e.g. nested tables. This is the major difference from first normal form.
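The constraints above can be checked mechanically. The following is a minimal Python sketch, assuming a table is modelled as a list of dicts and a column's domain as a Python type; the names `freeze`, `is_relation` and `domains` are hypothetical and not taken from any library. A column whose values are themselves collections of rows passes the check, which illustrates that the relational view alone does not rule out nested tables.

```python
def freeze(value):
    """Convert a (possibly nested) value into a hashable form for comparison."""
    if isinstance(value, dict):
        return frozenset((k, freeze(v)) for k, v in value.items())
    if isinstance(value, (list, tuple, set, frozenset)):
        return frozenset(freeze(v) for v in value)
    return value


def is_relation(rows, domains):
    """Return True if `rows` satisfies the relational-view constraints.

    `domains` maps each column name to the Python type allowed in it
    (an assumed stand-in for a column's domain).
    """
    columns = set(domains)  # dict keys are unique, so column names are too
    for row in rows:
        # All rows must have the same set of columns.
        if set(row) != columns:
            return False
        # Every value must belong to its column's domain.
        if not all(isinstance(row[c], domains[c]) for c in columns):
            return False
    # No duplicate rows; comparing as a set also ignores row order.
    return len({freeze(row) for row in rows}) == len(rows)


domains = {"customer": str, "cust_id": int, "transactions": tuple}
customers = [
    {"customer": "Abraham", "cust_id": 1,
     "transactions": ({"tr_id": 12890, "amount": -87},)},
    {"customer": "Isaac", "cust_id": 2,
     "transactions": ({"tr_id": 12898, "amount": -21},)},
]

valid = is_relation(customers, domains)                       # nested column is accepted
with_duplicate = is_relation(customers + customers[:1], domains)  # repeated row is rejected
```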

NoSQL databases such as document databases typically do not conform to the relational view. For example, a JSON or XML database might support duplicate records and intrinsic ordering. Such databases can be described as non-relational. There are also database models that support the relational view but do not embrace first normal form.[4] Such models are called non-first normal form relations (abbreviated NFR, N1NF or NF2).
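As an illustration of the difference, the sketch below compares two JSON arrays first as ordered lists and then as relations. It is a minimal Python example under the assumption that a relation can be modelled as a set of rows; the helper `as_relation` and the sample records are hypothetical.

```python
import json

# JSON arrays as a document store might hold them: element order is
# significant and the same record may appear more than once.
doc_a = json.loads('[{"name": "Isaac"}, {"name": "Jacob"}, {"name": "Isaac"}]')
doc_b = json.loads('[{"name": "Jacob"}, {"name": "Isaac"}]')


def as_relation(records):
    """View records as a relation: an unordered, duplicate-free set of rows."""
    return {frozenset(record.items()) for record in records}


arrays_equal = doc_a == doc_b                               # False: order and duplicates count
relations_equal = as_relation(doc_a) == as_relation(doc_b)  # True: both hold the same two rows
```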

Example

Customer   Cust. ID   Transactions
Abraham    1          Tr. ID   Date          Amount
                      12890    14-Oct-2003   −87
                      12904    15-Oct-2003   −50
Isaac      2          Tr. ID   Date          Amount
                      12898    14-Oct-2003   −21
Jacob      3          Tr. ID   Date          Amount
                      12907    15-Oct-2003   −18
                      14920    20-Nov-2003   −70
                      15003    27-Nov-2003   −60

This table represents a relation in which one of the columns (Transactions) is itself relation-valued. This is a valid relation, but it does not conform to first normal form, which does not allow nested relations. The table is therefore unnormalized.
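One way to see what first normal form would require is to unnest the relation-valued column so that every row holds only atomic values. The Python sketch below does this for the example data; the dict-based representation, the column names and the helper `unnest` are assumptions made for illustration, and unnesting is only one possible route to first normal form.

```python
from datetime import date

# The example table, with Transactions held as a nested list of rows.
customers = [
    {"customer": "Abraham", "cust_id": 1, "transactions": [
        {"tr_id": 12890, "date": date(2003, 10, 14), "amount": -87},
        {"tr_id": 12904, "date": date(2003, 10, 15), "amount": -50},
    ]},
    {"customer": "Isaac", "cust_id": 2, "transactions": [
        {"tr_id": 12898, "date": date(2003, 10, 14), "amount": -21},
    ]},
    {"customer": "Jacob", "cust_id": 3, "transactions": [
        {"tr_id": 12907, "date": date(2003, 10, 15), "amount": -18},
        {"tr_id": 14920, "date": date(2003, 11, 20), "amount": -70},
        {"tr_id": 15003, "date": date(2003, 11, 27), "amount": -60},
    ]},
]


def unnest(rows, nested_col):
    """Emit one flat row per nested row, repeating the outer columns."""
    flat = []
    for row in rows:
        outer = {k: v for k, v in row.items() if k != nested_col}
        for inner in row[nested_col]:
            flat.append({**outer, **inner})
    return flat


flat_rows = unnest(customers, "transactions")
for row in flat_rows:
    print(row["customer"], row["cust_id"], row["tr_id"], row["date"], row["amount"])
```

The flattened result has one row per transaction, with the customer name and ID repeated on each row.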

Modern applications

Today, companies such as Google, Amazon and Facebook deal with large amounts of data that are difficult to store efficiently. They use NoSQL databases, which are based on the principles of the unnormalized relational model, to deal with the storage issue.[5] Examples of NoSQL databases include MongoDB, Apache Cassandra and Redis. These databases are more scalable and easier to query, as they do not involve expensive operations like JOIN.[citation needed]
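The point about JOIN can be illustrated with a small Python sketch that reads the same data from two layouts: flat tables linked by a key, and documents that embed their transactions. The data structures and the functions `amounts_by_join` and `amounts_by_lookup` are hypothetical and do not reflect the API of any particular database; the sketch shows only the shape of the access path, not measured performance.

```python
# Normalized layout: two flat tables linked by cust_id.
customers = [
    {"cust_id": 1, "customer": "Abraham"},
    {"cust_id": 2, "customer": "Isaac"},
]
transactions = [
    {"tr_id": 12890, "cust_id": 1, "amount": -87},
    {"tr_id": 12904, "cust_id": 1, "amount": -50},
    {"tr_id": 12898, "cust_id": 2, "amount": -21},
]


def amounts_by_join(name):
    """Match customers to transactions on cust_id (a hand-written join)."""
    return [
        t["amount"]
        for c in customers if c["customer"] == name
        for t in transactions if t["cust_id"] == c["cust_id"]
    ]


# Unnormalized layout: each customer document embeds its transactions.
documents = {
    "Abraham": {"cust_id": 1, "transactions": [
        {"tr_id": 12890, "amount": -87},
        {"tr_id": 12904, "amount": -50},
    ]},
    "Isaac": {"cust_id": 2, "transactions": [
        {"tr_id": 12898, "amount": -21},
    ]},
}


def amounts_by_lookup(name):
    """Read the embedded transactions with a single key lookup."""
    return [t["amount"] for t in documents[name]["transactions"]]
```

Both functions return the same amounts for a given customer; the second reaches them without matching rows across tables.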

References

  1. Kitagawa, Hiroyuki; Kunii, Tosiyasu L. (1990-02-06). The Unnormalized Relational Data Model. Springer. pp. 1, 5, 7, 10. ISBN 978-4-431-70049-4.
  2. "IBM Archives: Edgar F. Codd". April 23, 2003. https://www-03.ibm.com/ibm/history/exhibits/builders/builders_codd.html. 
  3. Codd, E. F. (1970). A Relational Model of Data for Large Shared Data Banks. IBM Research Laboratory, San Jose, California.
  4. Arisawa, H.; Moriya, K.; Miura, T. (1983). "Operations and the Properties on Non-First-Normal-Form Relational Databases". VLDB 1983.
  5. Moniruzzaman, A. B. M.; Hossain, Syed Akhter (2013). "NoSQL Database: New Era of Databases for Big data Analytics - Classification, Characteristics and Comparison". International Journal of Database Theory and Application 6.