 |
|
Basic User-Defined Data Types
Oracle Database Tips by Donald Burleson
|
One of the shortcomings of the
relational model was the requirement to model all data at their smallest
level. For example, if we want to select all of the address information
for a customer, we are required to manipulate street_address,
city_address, and zip_code as three separate statements. With abstract
data typing, we can create a new data type called full_address, and
manipulate it as if it were an atomic data type. While this may seem
like a huge improvement, it is interesting to note that pre-relational
databases supported this construct, and the COBOL language had ways to
create data "types" that were composed of sub-types. For example, in
COBOL, we could define a full address as follows:
05
CUSTOMER-ADDRESS.
07 STREET-ADDRESS PIC X(80).
07 CITY-ADDRESS PIC X(80).
07 ZIP-CODE PIC X(5).
We can then move the customer-address as if it
were an individual entity:
MOVE CUSTOMER-ADDRESS TO PRINT-REC.
MOVE SPACES TO CUSTOMER-ADDRESS.
By the same token, an object database will allow
the definition of a customer_address data type.
CREATE TYPE customer_address (
street_address char(20),
city_address char(20),
state_name char(2)
CONSTRAINT STATE_NAME IN ('CA','NY','IL','GA','FL'),
zip_code number(9)) DEFAULT VALUE 0;
There wee see that the data type definition
contains much more than just the data and the data size. We may also assign
default values to the data types and specify value constraints. The default
value and constraint checks happen at object creation time, and this insures
that the database designer has complete control over the data definition as well
as the values that are inserted into the data type.
We could then treat customer_address as a valid
data type, using it to create tables and select data:
CREATE TABLE customer (
full_name cust_name,
full_address customer_address,
. . .
);
Now that the Oracle table is defined, we can
reference full_address in our SQL just as if it were a primitive data type:
SELECT DISTINCT full_address FROM CUSTOMER;
INSERT INTO
customer VALUES (
full_name ('ANDREW','S.','BURLESON'),
full_address('123 1st st.','Minot, ND','74635');
UPDATE CUSTOMER
(full_address) VALUES ' ';
Note that the SQL select statements change when
accessing rows that contain user-defined data types. Here is the SQL syntax to
select a component of full_address:
SELECT full_address.zip_code
FROM customer
WHERE
full_address.zip_code LIKE '144%';
Lists of repeating data items
For many years the idea of repeating data items
within an object has been repugnant to database designers. The tenets of data
normalization dictated that the removal of repeating data items was the very
first step toward a clean data model.
The introduction of lists of values into
relational databases was first done by the UniSQL database. At the time, this
non-first normal form modeling ability was treated with suspicion as raised the
ire of the titans of relational modeling. However, these repeating groups
became more respectable, and even C.J. Date introduced a new concept into the
relational model called a "set", to allow this construct to fit into the
relational paradigm. Today, we recognize that there are specific instances
where the introduction of a repeating group will enhance a database design.
In order to use repeating groups with a design,
the following should be true:
1. The data items should be small in size.
2. The data item should be static and rarely
changed.
3. The repeating data should never need to be
queried as a "set".
To illustrate this principle, consider the
following example. Suppose that we are designing a University database and we
notice that a student may take the ACT exam up to three times, and our database
must capture this information. Without repeating groups, we have two choices:
1. We can create unique columns within our
student table, naming each repeating group with a subscript:
create table student (
sudent_ID number(5),
. . .
act_score_one number(3),
act_score_two number(3),
act_score_three number(3))
2. We could "normalize" out the repeating groups and move them into another
table:
create table act_score (
student_ID number(5),
act_score number(3));
Now, let's take a look at how the repeating group might be implemented:
CREATE TYPE act_type as VARRAY(3) OF act_score;
Here we see that we have defined a data
structure that can use an implied subscript to reference the data. For example,
to insert the test scores for Rick Tytler, we could enter:
INSERT INTO
student
act_score(
VALUES
500
556
621)
WHERE
student_last_name = 'Tytler';
Now to select the test score for Rick Tytler, we
could query the act_scores by referring to the subscript:
SELECT
act_score(1),
act_score(2),
act_score(3)
FROM
student
WHERE
student_last_name = 'Tytler';
Now that we understand the concept of repeating
values within a database object or a relational table and the object-oriented
extensions to SQL to handle these structures, let's take a look at the
advantages and disadvantages of this approach:
The primary advantages to repeating groups are
immediately available when the object or row is fetched. Less disk space is
consumed, since we do not have to create another table to hold the act scores.
Remember, if we create another table, we will need to redundantly duplicate the
student_ID for each and every row of the act_score table.
The main disadvantage of repeating groups is
that they cannot easily be queried as a distinct set of values. In other words,
we would not be able to query to see all students who have received an ACT score
greater than 500.
DECLARE c1 AS CURSOR FOR
SELECT * FROM STUDENT;
FOR score IN c1
LOOP
Fetch c1 into :score;
FOR i = 1 to 3
LOOP
IF act_score(i) > 500
THEN
PRINT student_name
END IF
END LOOP
END LOOP
Another alternative to using a cursor would be
to use the SQL union operator. As you can see, this involves creating a
temporary table to hold all of the values into a single column.
CREATE TABLE TEMP as
(
SELECT act_score(1) FROM student
UNION
SELECT act_score(1) FROM student
UNION
SELECT act_score(1) FROM student
)
SELECT act_score FROM temp WHERE act_score >
500;
On the other hand, if we were to remove the
repeating group of act_scores and place them in a table called act_score, we
could then query the table to easily get the list of students:
SELECT student_name
FROM student, act_score
WHERE
act_score.stident_ID = student.student_ID
AND
act_score > 500;
One other confounding problem with repeating
groups within objects is that we do not know in advance how many cells will
contain data. Therefore we would need to test to see how many of the values
are present. As, such special code must be added to test whether the column is
NULL as shown in the following example:
FOR
i - 1 to 3
LOOP
IF act_score(i) IS NOT NULL
then
. . .
END LOOP