 |
|
Relational Database Objects and Pointers
Oracle Database Tips by Donald Burleson
|
Before we begin out discussion of
pointers, it is important to understand exactly what a pointer
represents and how it is implemented in object databases. In
pre-relational databases, each record in the database had a distinct
address. These addresses were the numbers that corresponded to a
physical database block. Also included in the address was the ?offset?
or displacement of the target record into the block. For example, an
address of 665:2 would refer to the second record in database block
number 665. Once defined, these addresses could be stored inside other
records, essentially allowing one record to point to another record.
These pointers became the foundation of establishing relationships
between entities in pre-relational times.
For the object/relational databases, there is
the ability to create a distinct object identified (OID) to uniquely identify
each row within an object/relational table. These OIDs are guaranteed to remain
unique by the database software, and like pointers, OIDs can be embedded into
columns, providing the ability to point to other rows in the database.
The issue of establishing relationships between
data items is not a new concept. As we discussed in Chapter 2, many of the
pre-relational databases employed linked-list data structures, which create
embedded pointers in the prefix of each occurrence of a database entity. These
pointers were used to establish the one-to-many and many-to-many relationships
between the entities.
Although the design of the pointer-based
databases was very elegant in the sense that foreign keys were not needed to
establish data relationships, there were serious problems with implementation.
Network databases such as CA-IDMS and hierarchical databases such as IMS are
very difficult to navigate because the programmer must be aware of the location,
name and types of each pointer in the database.
Even worse, the use of pointers for establishing
data relationships makes structural changes a nightmare. Because the data
relationships are "hard linked" with embedded pointers, the addition of a new
pointer requires special utility programs to "sweep" each and every effected
entity in the database. As each record is located, the prefix of the entity is
restructured to accommodate the new pointers. While the ?pure? object-oriented
databases have this problem of restructuring, the object/relational databases
avoid this problem because the SQL ALTER TABLE syntax can be used to add the new
pointer column to the entity without restructuring all of the rows in the table.
Database navigation with Pointers
One major feature of the Relational database
model is their requirement that rows be addressed by the contents of their data
values (with the exception of relational databases that support the ROW_ID
construct). Now, within the object/relation model, there will be an alternative
access method for rows, such that rows may be identified by either their data
values, or by their object ID's (OIDs). For example:
SELECT
customer_stuff
FROM
customer
WHERE
customer_ID = 38373;
In the object/relational model, we can also use
SQL to address rows by their OIDs, thereby allowing pointer-based database
navigation:
SELECT
customer_stuff
FROM
customer
WHERE
OID = :host_variable;
The object-oriented database models as well as
the object/relational models require the concept of "currency", so that records
may be addressed independently from their data. As we remember from Chapter 2,
the CODASYL model supported the declaration of abstract "sets" to relate classes
together, and also supports the notion of "currency", whereby a record may be
accessed without any reference to the record's data.
This is the core difference between the database
architectures. While a ?pure? relational database will rely on the SQL
optimizer to retrieve the requested rows from the database, the object-oriented
databases and the object/relational databases may ?navigate? the database, one
entity at a time. As such, these architectures must be able to support
navigation, and the programmers must be able to recognize where they are within
the database.
A navigational programmer is required to clearly
describe the access path that the database will use to service the request. The
access path is clearly described in their code, and the programmer can
graphically show the path with an object/relational diagram (as described in
Chapter 3). In SQL however, the access path in not evident and is hidden
because the access is determined by the SQL optimizer, usually at run time. The
SQL optimizer interrogates the system tables to see if the "target" relational
tables have indexes, an then determines the optimal access path to service the
SQL request. The SQL optimizer uses several access methods, including
sequential scans, sequential pre-fetch, and index scans. Only by running the SQL
EXPLAIN utility can the programmer see the access path to the data.
Now that we understand the conceptual history
behind the use of pointers in database management, let's move on to take a look
at how they are implemented in the object/relational databases. While the
?pure? relational model was defined as being devoid of pointers, the new
conventional wisdom is to uniquely identify each row in each table, and allow
these unique Object Identifiers (OIDs) to be stored within tables, essentially
acting as pointers to other relational rows. The following section will examine
how this works.