Hierarchical databases were IBM's first
database, called IMS (Information Management System), which was released in
1960. Hierarchical databases are generally large databases with large amounts
of data. A Hierarchical database is easy to understand, because we deal with
hierarchies every day. Think about work, you have executives, then managers,
then supervisors, then workers and so on. Basically a hierarchy is a method of
organizing data into ranks, with each rank having a higher precedence than those
below it. A hierarchy can be thought of as a tree, or as some call it, an
"inverted" tree (see figure 2.5). Inverted files, or file inversion has nothing
to do with turning anything upside-down. Rather, it refers to the process of
creating and index. For example, many old timers still call creating an index
on customer_number as, "inverting a file on customer_number".
A hierarchy is just an arrangement of "things"
called nodes, and the nodes are connected by lines or "branches". You can think
of these lines or branches as a connection to the next level of more specific
information. The highest node is called the root node, and queries must pass
through this node on their way down the hierarchy. In our example, (Figure 2-5)
STORE is the root node. Every node, except the root node, is connected upward to
only one "parent" node. Nodes have a parent-child relationship, and a parent
node is directly above the child node. We also see that the node called
CUSTOMERS is a parent of DRINKS. Since a child node is always one level
directly below its parent node, the DRINKS node is a child of the CUSTOMER
node. Note that a parent node can have more than one child node, but a child
may only have one parent.
Figure 2-5 A sample hierarchical chart
When we talk about a hierarchical database, the
nodes that we talked about become "segment types". A segment type is simply a
user defined category of data, and each segment contains fields. Each segment
has a key field, and the key field is used to retrieve the data from the
segment. There can be one or more fields in a segment, and most segments also
contain multiple search fields.
Expanding on figure 2-5, let's add some data
fields to our Store segment. Let's begin by asking, what we need to know about
each store? The store name, address and phone number would be good to know.
So, we would add these three fields to Store segment. We then would ask, how do
we want to retrieve the records from our Store segment. If we said that store_name and phone_number, we would make
"Store_Name" the key field, and "Phone_Number" a search field.
IMS is well suited for modeling systems in which
the entities (segments) are composed of descending one-to-many relationships.
Relationships are established with "child" and "twin" pointers, and these
pointers are embedded into the prefix of every record in the database.
In figure 2-5, we have six segments:
1.
store segment
2.
customer
3.
employees
4.
drinks
5.
snacks
6.
fuel
We also have four hierarchical paths:
1.
store, customers, drinks;
2.
store, customers, snacks;
3.
store, customers, fuel;
4.
store, employee.
A hierarchical path is how segment types are
retrieved, the path is like an imaginary line that begins at the root segment
and passes through segment types until it reaches the segment type at the bottom
of the inverted tree. One advantage to a hierarchical database is that if you
only wanted information on stores, the program would only have to know the
format and access the store segment. You would not have to know that any of the
other five segments even exist, what their fields are, or what the relationship
between the segments is.
Hierarchical databases have rigid rules in
relationships and data access. For example, all segments have to be accessed
through the parent segment. The exception to this is, of course, the root
segment because it has no parent. As an example, to retrieve data on fuel in
figure 2.5, you would start at the store segment, then the customer segment, and
then get the fuel segment.
The IMS database has concurrent control, and a
full backup and recovery mechanism. The backup and recovery protects the system
from a failure of IMS itself, an application program, a database failure, and a
operating system, network control program etc. failure. The recovery mechanism
for application programs stores "before" and "after" images of each record which
was changed, and these images could be used to "roll-back" the database if a
transaction failed to complete. If there was a disk failure the images could be
"rolled-forward". IMS was used with CICS to develop the first on-line database
systems for the mainframe.
Three main advantages of hierarchical databases
are a large base with a proven technology that has been around for years, the
ease of using a hierarchy or tree structure, and the speed of the system. Some
disadvantages of hierarchical databases are because of rigid rules in
relationships, insertion and deletion can become very complex, access to a child
segment can only be done through the parent segment (start at the root segment).
While IMS is very good at modeling hierarchical data relationships, complex data
relationships such as many-to-many and recursive many-to-many, like BOM
(Bill-of-Material) relationships had to be implemented in a very clumsy fashion,
by using "phantom" records. The IMS database also suffered from its complexity.
To become proficient in IMS you need months of training and experience. As a
result, IMS development remains very slow and cumbersome.
Just like hierarchical databases use pointers to
logically relate records together, database object implementations also use
pointers to link objects. Generally, database objects are linked using their
object ID's just like hierarchical records are linked with pointers. It is also
common to see naturally occurring hierarchies to be represented with
permutations of the child-twin pointer methods.