 |
|
Array of Pointers in C++
Oracle Database Tips by Donald Burleson
|
Pointer
architectures
There is another problem relating to the pointer
architectures of C++ applications. Unlike databases that conform to standard
pointer architectures (such as the CODASYL DBTG model), object-oriented
applications are unconstrained by any "conventions" or rules, about a uniform
method for establishing pointer relationships.
The Committee on Development of Applied Symbolic
Languages (CODASYL) formed a database task group (the DBTG) in the 1970's to
address the problem of diverse database standards. The CODASYL DBTG was
commissioned to develop a set of "rules", or a model for database management
systems, just as the ODMG group is doing with object-oriented databases. The
CODASYL DBTG developed what is called the "Network model" for databases. The
CODASYL model became the framework for a new generation of commercial database
systems such as the IDMS database from Cullinane Corporation, and the MDBS2
database.
Regardless of conventions, there are two pointer
methods that are predominant in commercial systems and a discussion of the
merits of each method will clarify the issues.
The first pointer method closely follows the
CODASYL DBTG model in which each relationship (sometimes called a "set")
contains circular two-way linked list pointers representing NEXT, PRIOR, and
OWNER. For a many-to-many relationship such as ORDER and ITEM we see linkages
to a junction object. This is shown pictorially in figure 11.3.
Figure 11.3 A Set occurrence diagram
In this diagram, we can follow the NEXT pointer
and see that order 123 contains 19 items and then follow the owner pointer to
see that these items are Pads. We can then continue along the NEXT chain to an
object with a quantity of "3", and follow the owner pointer in the other set to
see that these items are pencils. Continuing along the original NEXT chain we
hit 12 items, and following the owner pointer, we see that they are 12 pens.
Conversely, we could navigate from the pen object to see all orders that contain
pens. Following the pens NEXT pointer, we see a quantity of 1 and we could
follow the owner pointer of this object to get order 456 (not shown).
Continuing along the pen's NEXT chain we encounter 12 pens, and following the
owner pointer we see that it is in order 123.
With this two-way linked list approach, very
long relationships can be represented without the overhead of having a very long
array of pointers in the owner object. For example, if a customer places an
average of 5000 orders, the resulting array in the owner record would be
extremely large (> 2000 bytes). We also have some control over what the
relational database folks call referential integrity (RI). By using RI we can
enforce business rules, for example, no orders may be placed unless they are
from a valid customer, or no customer may be deleted if they have outstanding
orders. The two-way linked list approach helps us to enforce these business
rules. In the first example, an ORDER object cannot be inserted unless
"currency" has been established with a CUSTOMER because the linked-list
algorithm requires the establishment of NEXT and PRIOR pointer in adjacent
objects. In the delete example, a CUSTOMER destructor may not be called if the
NEXT pointer points to a valid ORDER object.
Another popular method for representing
relationships is to create an array of pointers in the owner object. This array
contains one bucket for each member object in the relationship. Each member
object in turn contains a pointer back to the owner object. Figure 11.4
illustrates this type of pointer architecture.
Figure 11.4 Using arrays of pointers
The method of using arrays of pointers has some
benefits over two-way linked lists. If the sets are small, say less than 100
members for each owner, sorting of members can be achieved without re-linking
all of the members. For example, if we have 100 orders for a customer and we
wish to sort the array of pointers such that the orders are retrieved in
ORDER_DATE sequence, we can easily sweep the orders and internally sort the
array into the desired order, all without altering any of the ORDER objects.
This type of pointer twizzling can be very useful over the traditional
linked-list method in which the pointers within each object would need to be
readjusted.
Unfortunately, the concept of "arrays of
pointers" is very difficult to map into a relational table. Most relational
databases do not support repeating groups within tables and there is no direct
way to represent an array within a relational table.
When this pointer architecture is encountered
there are two alternatives; the application can be re-written to employ
linked-lists or another relational table can be created to hold the array of
pointers. Neither solution is ideal, and it is often easier to re-write the
application with linked-list pointer structures.
Remember, the idea is to create a parallel set
of logical pointers to be used by the relational tables, but we will not need to
alter any of the existing pointer structures in the C++ application. However,
it can be very confusing when the in-memory object navigation is performed with
arrays of pointers while the access to persistent object is done with
linked-list pointers.
The most
elegant solution is to have the C++ application and the relational database
using a similar pointer architecture. This way when an object is requested (if
it is not already in memory), a common "call" can be made to the relational
database to fetch the object.