In 1970, Dr. Edgar Codd of IBM developed a
Relational Model of Data. In the model, data would be stored in simple linear
files, these simple linear files are called ?relations? or 'tables?. One of the
best improvements of the relational model over its predecessors was its
simplicity. Rather than having to know dozens of DML commands, the relational
model introduced a declarative language called Structured Query Language (SQL)
to simplify data access and manipulation.
Codd chose to call his language Structured Query
Language because it is not structured, it is not only for queries (SQL can
update), and it is not a language (it is embedded in languages).
The tables are two dimensional arrays of ?rows?
and ?columns?. Rows are sometimes called 'tuples? (rhymes with ?couples?) and
columns are sometimes called ?attributes?. A ?record? is a row of a table, and
a ?field? is a column, in a row of a table. A table will always have a field or
several fields that make a ?primary key? for a table. In a relational database,
the tables are independent, unlike hierarchical and network models that are
pointer connected. Tables basically correspond to segment types in hierarchical
and record types in the network models. Relational tables can contain only one
type of record, and each record has a fixed number of fields that are all
explicitly named. There is no predetermined sequence of records in a table and
duplicate records are not allowed in a table, also in a table, fields are
distinct and repeating groups are not allowed (see figure 2-10).
Figure 2-10 A sample relational chart
A primary key uniquely identifies a row in a
table, the key can be made up of one or more fields. A foreign key allows you
to join two or more tables together by using a key field in one table with a non
key field in another table.
Relational databases made the following
improvements over hierarchical and network databases:
Simplicity
- The concept of tables with rows and columns is extremely simple and easy to
understand. End users have a simple data model. Complex network diagrams used
with the hierarchical and network databases are not used with a relational
database.
Data Independence
- Data independence is the ability to modify data structures (in this case,
tables) without affecting existing programs. Much of this is because tables are
not hard-linked to one another. Columns can be added to tables, tables can be
added to the database, and new data relationships can be added with little or no
restructuring of the tables. A relational database provides a much higher
degree of data independence than do hierarchical and network databases.
Declarative Data Access
- The SQL user specifies what data they want, then the embedded SQL (Structural
Query Language), a procedural language determines how to get the data. In
relational database access, the user tells the system the conditions for the
retrieval of data. The system then gets the data that meets the selection
conditions in the SQL statements. The database navigation is hidden from the
end user or programmer, unlike a CODASYL DML language, where the programmer had
to know the details of the access path.
Dr. Codd also introduced with the relational
database, the concept of Structured Query Language (SQL), also known as
'sequel?. SQL was much more than a query language. SQL is really a data
sub-language that supports end-users, programmers, database administrators, and
security administrators.
There are two ways to think of SQL. The first
says that SQL can be thought of as having three categories of function:
Define, Manipulate, and Authorize. Define is the DDL that does create, drop,
and alter functions. Manipulate is the DML that does select, insert, update,
and delete functions. Authorize is the control that does grant and revoke
functions. Within the DML functions, the other school teaches that SQL has
three DML functions; Select, Project and join. A select reduced the length of a
table by filtering out unwanted rows, a project shrinks the width of the table
by filtering out unwanted columns, and a join is used to relate two or more
independent tables that share a common column.
The most important point about SQL is that it
allowed programmers and end users a simple, easy way to add, change, and extract
data from a relational database. Any two or more tables could be joined
together on the fly at run time using their primary and/or foreign keys. There
are no pointers or hard links from one table to another.