There is a huge ongoing debate about the benefits of adding objects into
mainstream database management. For years, a small but highly vocal group of
object-oriented bigots preached the object revolution with an almost religious
fervor. However, object-oriented databases languished in the backwaters until
the major relational database vendors made a commitment to incorporate objects
into their relational engines. And now that object-relational databases are
becoming a reality, Oracle developers are struggling to understand how Oracle's
new object extensions are going to change their lives.
Oracle Corp.'s commitment to objects is not just acknowledgment of a fad; the
benefits of being able to support objects are very real, and Oracle's exciting
new extensions are going to create the new foundation for database systems of
the 21st century.
When I first glanced at the Oracle documentation, the object extensions
seemed mundane. The new features essentially consist of user-defined datatypes,
pointers to rows, and the ability to couple data with behavior using methods.
However, these new additions are anything but mundane - they are going to change
the way that databases are designed, implemented, and used.
Objects and Relational Databases
Relational databases must be able to directly represent the real world.
Advocates for object orientation argue that it does not make sense to dismantle
your car when you arrive at home each night only to reassemble it every time you
want to drive it. With Oracle7, you must assemble aggregates using SQL joins
every time you want to see them. With Oracle8, database designers will no longer
need to model their applications at their most atomic levels.
The new "pointer"
construct allows for the creation of aggregate objects, and it will no longer be
necessary to create Oracle "views" to see composite objects. In addition,
relational designers will be able to model the real world at all levels of
aggregation and not just at the third-normal-form level.
The ability to prebuild real-world objects will enable Oracle designers to
model the world as it exists, without re-creating objects from their pieces each
time they are needed. These real-world objects also have ramifications for
Oracle's SQL. Rather than having to join numerous tables together to create an
aggregate object, the object will have an independent existence, even if it is
composed entirely of pieces from atomic tables. (See Figure 1.)
This modeling
ability also implies a whole new type of database access. Instead of using SQL,
you can "navigate" Oracle databases by going from row to row, chasing the
pointer references, without ever having to join tables together. This
navigational data access enables Oracle designers to create faster links between
tables and avoid some of the time-consuming SQL join operations that plague some
systems.
Finally, the ability of Oracle to tightly couple data and behavior will
change everything. Rather than having all of your process logic in external
programs, the process code will move into the Oracle database, and the Oracle
engine will manage both the data and the processes that operate on the data.
These "methods" were first introduced into the object-oriented model to provide
encapsulation and reusability. Encapsulation refers to the requirement that all
data inside an object can only be modified by invoking one of its methods. By
associating pretested and reliable methods with an object, the Oracle object
"knows" how to behave, and the methods always function in the same manner
regardless of the target objects.
Reusability is achieved through the
elimination of the "code hunt." Before methods were introduced, Oracle
programmers had to scan through Pro*C programs or stored procedures searching
for their desired code. With methods, the developer needs to know only the name
of the class associated with the object, and the list of methods can easily be
displayed. Just as the introduction of reusable parts changed the way American
manufacturing functioned, the introduction of reusable code will change the way
that Oracle systems are constructed and maintained.
This reusability does not
come without a price, however. The structure of the aggregate objects must be
carefully defined, and the Oracle developer must give careful thought to the
methods that are associated with objects at each level of aggregation. Libraries
of reusable code will eventually be developed by application developers as well
as third-party vendors to be used within new Oracle systems.
Now that we know the compelling benefits of object-relational databases,
let's take a closer look at how Oracle8 has implemented these features. Oracle
has implemented the object-relational model in stages, is introducing objects in
Oracle 8.0, and will introduce inheritance in Oracle 8.2. User-Defined
Datatypes
The ability of Oracle to support object types (sometimes called user-defined
datatypes) has profound implications for Oracle design and implementation.
User-defined datatypes will enable the database designer to:
- Create aggregate datatypes - Aggregate datatypes are datatypes that
contain other datatypes. For example, you could create a type called
FULL_ADDRESS that contains all of the subfields necessary for a complete
mailing address.
- Nest user-defined datatypes - Datatypes can be placed within other
user-defined datatypes to create data structures that can be easily reused
within Oracle tables and PL/SQL. For example, you could create a datatype
called CUSTOMER that contains a datatype called CUSTOMER_DEMOGRAPHICS, which
in turn contains a datatype called JOB_HISTORY, and so on.
Pointers and Oracle
One of the new user-defined datatypes in the object-relational model is a
"pointer" datatype. Essentially, a pointer is a unique reference to a row in a
relational table. The ability to store these row IDs inside a relational table
extends the traditional relational model and enhances the ability of an
object-relational database to establish relationships between tables. The new
abilities of pointer datatypes include:
- Referencing "sets" of related rows in other tables - It is now possible to
violate first normal form and have a cell in a table that contains a pointer
to repeating table values. For example, an EMPLOYEE table could contain a
pointer called JOB_HISTORY_SET, which in turn could contain pointers to all of
the relevant rows in a JOB_HISTORY table. This technique also lets you
prebuild aggregate objects, such that you could preassemble all of the
specific rows that comprise the aggregate table.
- Allow "pointers" to nondatabase objects in a flat file - For example, a
table cell could contain a pointer to a flat file that contains a nondatabase
object such as a picture in .gif or .jpeg format.
- The ability to establish one-to-many and many-to-many data relationships
without relational foreign keys - This would alleviate the need for relational
JOIN operations, because table columns could contain references to rows in
other tables. By dereferencing these pointers, you could retrieve rows from
other tables without ever using the time-consuming SQL JOIN operator.
Now that you have a high-level understanding of these Oracle8 features, let's
take a closer look at how they are implemented. Basic User-Defined Datatypes
One of the shortcomings of the relational model is the requirement to model
all data at its smallest level. For example, if you want to select all of the
address information for a customer, you must manipulate STREET_ADDRESS,
CITY_ADDRESS, and ZIP_CODE as three separate statements. With abstract
datatyping, you can create a new datatype called FULL_ADDRESS and manipulate it
as if it were an atomic datatype. Although this may seem like a huge
improvement, it is interesting to note that prerelational databases supported
this construct, and the Cobol language had ways to create data "types" that were
composed of subtypes. For example, in Cobol you could define a full address as
follows:
05 CUSTOMER-ADDRESS.
07 STREET-ADDRESS PIC X(80).
07 CITY-ADDRESS PIC X(80).
07 ZIP-CODE PIC X(5).
You could then move the customer address as if it were an individual
entity:
MOVE CUSTOMER-ADDRESS TO PRINT-REC.
MOVE SPACES TO CUSTOMER-ADDRESS.
By the same token, Oracle will let you define a CUSTOMER_ ADDRESS
datatype.
CREATE TYPE full_name (
first_name varchar2(20),
middle_init char(3),
last_name varchar2(50));
CREATE TYPE full_address (
street_address varchar2(20),
city_name varchar2(20),
state_name char(2);
zip_code varchar2(5));
You could then treat CUSTOMER_ADDRESS as a valid datatype and use it to
create tables and select data:
CREATE TABLE customer (
customer_name full_name,
customer_address full_address,
. . .
);
Now that you've defined the Oracle table, you can reference CUSTOMER_ADDRESS
in your SQL just as if it were a primitive datatype:
SELECT DISTINCT customer_address FROM CUSTOMER;
INSERT INTO customer VALUES (
customer_name ('ANDREW','S.','BURLESON'),
customer_address('123 1st st.','Minot, ND','74635');
UPDATE CUSTOMER (full_address) VALUES
'('171 Glenbrook Rd','Rochester','NY', '14616')
WHERE customer_name.last_name = 'Burleson';
Note that the SQL changes when you use user-defined datatypes. Following is
the syntax to select a component of FULL_ADDRESS:
SELECT full_address.zip_code
WHERE full_address.zip_code LIKE '144%';
Nesting of User-Defined Datatypes
Now let's take this concept one step further and consider how to nest
user-defined datatypes within other datatypes. As a basic example, let's create
a datatype that encapsulates all of the data in a table:
CREATE TYPE customer_stuff (
full_name customer_name,
home_address customer_address
business_address customer_address);
With the CUSTOMER_STUFF type defined, defining the table is simple:
CREATE TABLE customer (customer_data customer_stuff);
By using this type of user-defined datatype, you are essentially duplicating
the object-oriented concept of encapsulation. That is, you are placing groups of
related datatypes into a container that is completely self-contained and has the
full authority of the innate relational datatypes such as INT and CHAR:
SELECT customer_stuff.customer_name.zip_code
FROM customer
WHERE customer_stuff.customer_name.zip_code
LIKE '144%';
Using Pointer References with Oracle8
The ability to define datatypes that contain pointers to rows in other
database tables will profoundly change the way you create and maintain
databases. These extensions to the relational model will enable a cell in a
table to reference a list of values or another entire table. This ability will
let designers define and implement "aggregate objects" that contain pointer
references to the components, rather than having to define a relational view on
the data. This will also enable database designers to more effectively model the
real world, reduce overhead, and provide a way to attach "methods" or behaviors
to aggregate objects.
Repeating Groups and Object Types
Now let's look at how repeating values appear in Oracle8. Oracle PL/SQL uses
the VARRAY construct to indicate repeating groups, so you can use the VARRAY
mechanism to declare your job history item. The following creates a TYPE called
JOB_HISTORY with a maximum of three values:
CREATE TYPE customer_address (
street_address varchar2(20),
city_address varchar2(20),
zip_code char(5));
CREATE TYPE job_details (
job_dates varchar2(80),
job_employer_name varchar2(80),
job_title varchar2(80)
job_address customer_address);
CREATE TYPE job_list (
VARRAY(3) OF REF job_details);
Now that we've defined the datatypes, the following code creates the Oracle
table using the datatypes:
CREATE TABLE customer (
customer_name full_name,
cust_address customer_address,
prior_jobs job_list);
CREATE TABLE job_history (
job_stuff job_details);
Now that we've created a repeating list within our Oracle definition, in
addition to dereferencing the datatype (you dereference a user-defined datatype
by prefixing the data name with all of the owner datatypes that are defined
within the table), we also need to subscript the PRIOR_JOBS to tell Oracle which
one we want:
SELECT customer.prior_jobs.job_title(3)
FROM CUSTOMER
WHERE customer.customer_name.last_name
LIKE 'JONES%';
Establishing Data Relationships to Other Tables with Pointers Now that we
have violated first normal form and upset Chris Date, let's carry this argument
one step further. If it is possible to allow for repeating values within a table
cell, why can't we include a reference to an entirely new table? Imagine a
database that allows nesting of tables within tables such that a single cell of
one table is a pointer to another whole table. This concept may seem foreign on
the surface, but it's not too hard to understand if you consider that many
real-world "things," or objects, are made up of subparts.
It has always been a shortcoming of the relational database that only atomic
things can be represented directly, and relational "views" are required to
assemble aggregate objects. Object technology professors always like to make fun
of the relational model's inability to represent aggregate objects, stating that
it makes no sense to rebuild objects from their components each time you want to
reference the object.
At last, nested object types let Oracle users represent
real-world "things" without resorting to views. Let's take a look at how this
type of recursive data relationship might be represented within Oracle. The
following code creates a TYPE definition for a list of orders. This list of
pointers to orders might become a column within an Oracle table:
CREATE TYPE order_set
AS TABLE OF order;
CREATE TYPE customer_stuff (
customer_id integer,
customer_full_name full_name,
customer_full_address customer_address,
. . .
order_list order_set);
CREATE TABLE customer OF customer_stuff;
The following shows the new style of table-creation syntax. Both of the
following table declarations are identical, except that the CREATE TABLE OF
syntax will establish object IDs (OIDs), so that other tables may contain
references to rows in the customer table. Without OIDs:
CREATE TABLE customer (cust_data customer_stuff);
With OIDs:
CREATE TABLE customer OF customer_stuff;
In either case, we have now defined a pointer column called ORDER_LIST in the
CUSTOMER table. This pointer will point to a list of pointers. Each cell of this
list will contain pointers to rows in the ORDER table. (See Figure 2.) Now, here
comes the cool part. We can now "prejoin" with the ORDER table to add the order
for this customer:
UPDATE customer
SET order_list (
SELECT REF(order)/* this returns the OID's from all order rows */
FROM order
WHERE order_date = SYSDATE
AND order.customer_id = (123)
)
The following query shows how you can navigate between tables without having
to join tables together:
SELECT DEREF(order_list)
FROM customer
WHEREcustomer_id = 123;
/* this will return 3 rows from the order table */
Because we have dereferenced the ORDER_LIST datatype, Oracle will access the
VARRAY and return the data values for all of the OIDs that are in the array - in
this case, the three rows from the order table. The important point is that we
have navigated between tables without ever performing a SQL join. Consider the
possibilities.
You would never need to embed the foreign key for the CUSTOMER
table in the ORDER record, because you could store the pointers in each customer
row. Of course, you would never be able to perform a relational join between the
CUSTOMER and ORDER tables, but this would not really make any difference as long
as you have maintained the ability to navigate between customers and orders with
pointers.
Of course, these are one-way pointers from CUSTOMER to ORDER, and you
don't have a method to get from the ORDER table to the CUSTOMER table unless you
embed a pointer to point to the row that contains the customer for each order.
You could do this by creating an "owner" reference inside each order row that
contains the OID of the customer who placed the order. What it All Means
Although the syntax extensions for Oracle8 seem very minor, the ramifications
for Oracle database design and administration are monumental. You can now design
systems that support all of the major object-oriented features, including
pointer-based data navigation, coupled data and behaviors, polymorphism,
and-upcoming in Oracle 8.2-support for inheritance. The new object features will
revolutionize the nature of Oracle systems and herald a new age of database
design in which the promises of object orientation will become a reality.
Designing with Oracle Methods
Successfully coupling Oracle9i data and methods
requires the use of method prototypes. The following example shows
a hierarchy of methods. If we use the data type definitions in the
data dictionary and the pseudocode from the mini-spec it will
facilitate our discussion. The example references a set of data flow
diagrams. The fill_order process is described in level one,
including all of the lower level data flow diagrams each process is
listed, including the sub-methods within the process:
1 -
fill_order
1.1 - check_customer_credit
1.2 - check_inventory
1.2.1 - check_stock_level
1.2.2 - generate_backorder_notice
1.2.3 - decrement_inventory
1.2.4 - prepare_packing_slip
1.3 - prepare_invoice
1.3.1 - compute_order_cost
1.3.2 - compute_shipping_charges
1.3.3 - add_handling_charge
1.3.4 - assemble_invoice
This hierarchy helps us visualize how methods
are nested within other methods. After the hierarchy has been
developed, we can translate these processes into the database
classes.
The lowest level data flow diagrams represent
functional primitives, processes that cannot be decomposed into
smaller processes. It is obvious that functional primitive processes
become methods, but does this mean that they will never have
subcomponents? With the exception of standalone methods, such as a
compute_shipping_charges method, a method will never have
subcomponents if the analyst has designed the process properly.
Using the primitive processes, we can design a
method that accepts the same values as noted on the DFD and returns
those values to the retrieving program. For example, we might have a
process called compute_shipping_charges that accepts a
valid_in_stock_order as input. The process gathers the weight and
cost of the items, computes the charges, and returns the shipping
charge and total weight.
A prototype is essentially a formal definition
of a method that describes all of the input and output data flows.
The accepted form for a prototype is:
return_data_type Method_name
(input_data_name_1 input_data_type_1,
input_data_name_2 input_data_type_2,
. . .);
Let's review the data types that methods can
use before going into further detail. These data types can return a
data value or they can be an input parameter for a method: