Array of Pointers in C++

Oracle Database Tips by Donald Burleson

Pointer architectures

There is another problem relating to the pointer architectures of C++ applications. Unlike databases that conform to standard pointer architectures (such as the CODASYL DBTG model), object-oriented applications are unconstrained by any "conventions" or rules, about a uniform method for establishing pointer relationships.

The Committee on Development of Applied Symbolic Languages (CODASYL) formed a database task group (the DBTG) in the 1970's to address the problem of diverse database standards. The CODASYL DBTG was commissioned to develop a set of "rules", or a model for database management systems, just as the ODMG group is doing with object-oriented databases. The CODASYL DBTG developed what is called the "Network model" for databases. The CODASYL model became the framework for a new generation of commercial database systems such as the IDMS database from Cullinane Corporation, and the MDBS2 database.

Regardless of conventions, there are two pointer methods that are predominant in commercial systems and a discussion of the merits of each method will clarify the issues.

The first pointer method closely follows the CODASYL DBTG model in which each relationship (sometimes called a "set") contains circular two-way linked list pointers representing NEXT, PRIOR, and OWNER. For a many-to-many relationship such as ORDER and ITEM we see linkages to a junction object. This is shown pictorially in figure 11.3.

Figure 11.3 A Set occurrence diagram

In this diagram, we can follow the NEXT pointer and see that order 123 contains 19 items and then follow the owner pointer to see that these items are Pads. We can then continue along the NEXT chain to an object with a quantity of "3", and follow the owner pointer in the other set to see that these items are pencils. Continuing along the original NEXT chain we hit 12 items, and following the owner pointer, we see that they are 12 pens. Conversely, we could navigate from the pen object to see all orders that contain pens. Following the pens NEXT pointer, we see a quantity of 1 and we could follow the owner pointer of this object to get order 456 (not shown). Continuing along the pen's NEXT chain we encounter 12 pens, and following the owner pointer we see that it is in order 123.

With this two-way linked list approach, very long relationships can be represented without the overhead of having a very long array of pointers in the owner object. For example, if a customer places an average of 5000 orders, the resulting array in the owner record would be extremely large (> 2000 bytes). We also have some control over what the relational database folks call referential integrity (RI). By using RI we can enforce business rules, for example, no orders may be placed unless they are from a valid customer, or no customer may be deleted if they have outstanding orders. The two-way linked list approach helps us to enforce these business rules. In the first example, an ORDER object cannot be inserted unless "currency" has been established with a CUSTOMER because the linked-list algorithm requires the establishment of NEXT and PRIOR pointer in adjacent objects. In the delete example, a CUSTOMER destructor may not be called if the NEXT pointer points to a valid ORDER object.

Another popular method for representing relationships is to create an array of pointers in the owner object. This array contains one bucket for each member object in the relationship. Each member object in turn contains a pointer back to the owner object. Figure 11.4 illustrates this type of pointer architecture.

Figure 11.4 Using arrays of pointers

The method of using arrays of pointers has some benefits over two-way linked lists. If the sets are small, say less than 100 members for each owner, sorting of members can be achieved without re-linking all of the members. For example, if we have 100 orders for a customer and we wish to sort the array of pointers such that the orders are retrieved in ORDER_DATE sequence, we can easily sweep the orders and internally sort the array into the desired order, all without altering any of the ORDER objects. This type of pointer twizzling can be very useful over the traditional linked-list method in which the pointers within each object would need to be readjusted.

Unfortunately, the concept of "arrays of pointers" is very difficult to map into a relational table. Most relational databases do not support repeating groups within tables and there is no direct way to represent an array within a relational table.

When this pointer architecture is encountered there are two alternatives; the application can be re-written to employ linked-lists or another relational table can be created to hold the array of pointers. Neither solution is ideal, and it is often easier to re-write the application with linked-list pointer structures.

Remember, the idea is to create a parallel set of logical pointers to be used by the relational tables, but we will not need to alter any of the existing pointer structures in the C++ application. However, it can be very confusing when the in-memory object navigation is performed with arrays of pointers while the access to persistent object is done with linked-list pointers.

The most elegant solution is to have the C++ application and the relational database using a similar pointer architecture. This way when an object is requested (if it is not already in memory), a common "call" can be made to the relational database to fetch the object.

��