 |
|
The
Problem of Pointers: Logical vs. Physical
Oracle Database Tips by Donald Burleson
|
Logical Pointers
The usage of pointers to navigate object
relationships within a traditional object-oriented application may also have to
be re-worked if one plans to use a relational database. In a non-persistent C++
applications, an object is created with the NEW operator, memory is allocated
and the address of that object is captured. Of course, this address is not
permanent, and a subsequent run of the application may find the same object in
another memory address. For this reason, a method must be determined to replace
the "physical" memory address pointers with "logical" pointers to the object.
(Figure 11.1)
Logical pointers generally assign an object-ID
which are called an "OID", (rhymes with "annoyed"). The OID is associated to an
object when it is instantiated (the OO term for object creation), and will use
this OID as a logical pointer when navigating data relationships. There are
many ways to create logical OID's. Some OODB's make their OID's by relating to
the logical container (similar to a database "page") in which the object
resides, including a byte offset into the container. For example, if a CUSTOMER
object hashes to container number 123 and is assigned space number 4 within that
container, then the OID would have a value of 123-4.
The question of access to relational data is
simplified if we mark a data item as being the primary key for the object. In
this fashion, we can always retrieve the object from disk using SQL, and then
re-establish the OID after we have re-instantiated the object.
The main goal of this technique is to take an
object and stuff its data and pointers into a row of a relational table. Of
course, the relational table will not be able to make any sense of these
pointers, and the relational table structure would not contain any of the
"foreign keys" that we are accustomed to using for relational navigation. SQL
joins cannot be used against these tables, but queries against individual tables
can still be performed.
Figure 11.1 The duality of pointers
In other words, an object uses internal
navigation through its pointers to other objects, while the relational database
must join primary keys with foreign keys to establish a relationship. Compare
the following navigation between C++ objects and the relational equivalent.
ordersForCustomer() { // list all orders for a customer
cout
<< "\n\nOrder summary for customer " << custName << " \n";
int
i;
for(i=0;i<orderCount;i++) {
cout << "\n Order
= " << orderList[i]->orderNum;
};
Select
order_number from customer, order
where
customer_num = :custno and
customer.customer_num = order.customer_num
Here we see that the navigation between objects
is quite different from the navigation between rows in a relational database.
We must keep in mind that associations between objects are pointers to addresses
in the memory of the C++ program. Since each object pointers refers to a RAM
memory location of an associated object, we must address the question about how
to create "parallel universes" of associations, one for the relational database
and another for the in-memory objects.
At first glance it may be tempting to simply
store the in-memory pointers for the objects into rows of a relational table.
These in-memory addresses are established by C++ in the object constructor, and
the actual memory address is dependent upon the prior objects that the program
has constructed. However, since the base register for the C++ program will be
different for each execution of the program, we can never count on these
in-memory addresses to be the same between executions of the program. In other
words, the in-memory address for the customer ABC object will always be
different, each time the constructor is invoked.
When customer ABC is constructed for the first
time, we must store the object in our relational database, independent of any
internal C++ pointers. When the object is retrieved by a later execution of the
C++ program, the row is called from disk into the database buffer, and then
transferred into C++ memory with a memory allocation (using the C++ malloc or
new operators). It is safe to assume that each time an object is retrieved from
disk and allocated to the C++ program, it will have a new memory address. So
then, how can we manage internal associations between objects?
Consider the following example. Here we have
one customer, ABC, who has two orders, order 123 and order 456. The customer
object will have an internal array of pointers to orders, and will store the
in-memory pointers for order 123 and order 456. Each order, in turn, will store
an up-level object pointer to point to customer ABC. (figure 11.2)
Figure 11.2 Pointer structures between objects
Next, we will want to make these objects
persistent, and store them as rows in our relational database. We will want to
store customer ABC as a row in the customer table, and orders 123 and 456 will
each store as rows in the order table. Since we will store "ABC" as the
primary key for the customer and as a foreign key for each order, we can be
assured that the logical relationship between these relational rows will be
maintained. It would be senseless to try to store the array of order pointers,
since their values would be meaningless to a subsequent run, but we could store
information that there are exactly two orders for this customer.
But what happens when customer ABC is retrieved
by a subsequent run of our C++ program? We can easily use SQL to retrieve the
customer row from the table and malloc space in the programs memory for customer
ABC, but how can we establish the pointers to orders 123 and 456? They have not
yet been retrieved, and will not have addresses until we manually call them in
from the database and re-instanciate them. At this point, customer ABC's array
of pointers to orders contains nothing but NULL pointers, since the orders have
not been retrieved from memory. However, we could write a routine that would
check to see if the pointer is NULL, and then go to the database and retrieve
all orders for the customer with an SQL statement. In this case we know that
each order is identified by a unique order number, but it may not always be so
easy to find a key for the sub-objects. In either case, we must somehow keep
the relational keys for each order that has been placed by the customer. There
are two approaches to this:
1. Create a second array in each customer
object to hold the relational keys for each order. This array would parallel
the array of pointers to orders in the customer object.
2. Create a global table of array references
with the following fields:
OWNER_TABLE OWNER_KEY MEMBER_TABLE
MEMBER_KEY
customer ABC order 123
customer ABC order 456
3. Remove all arrays of pointer, and use
linked-list structures.
Using method 1 or 2, orders 123 and 456 can now
be brought into the memory of the C++ program by calling the appropriate SQL.
After the SQL has returned the rows, the objects are re-instantiated and the
pointers re-assigned to the customer and order objects.