However, with direct access, there was a need to
process records based on more than a single key. The need for additional
indexing led to the development of two new access methods: ISAM
(Indexed Sequential Access Method) and VSAM (Virtual Storage Access Method).
You can think of a computer index the same way
you think of an index in a book. You use an index in a book to find what you
want quickly, and the same is true of a computer index. An index always contains
two fields: the symbolic key and the corresponding storage
address for the record. The index is a separate file from the master file to
which it refers, and it contains only the record key and the storage
address. To find a record, the program scans the index and then retrieves
the record from the file at the location specified by the index (Figure 2-4).
The index makes it possible to read the file sequentially or randomly. Also,
indexing activities are handled by indexing software, and the programmer does
not have to build or maintain the indexes.
Figure 2-4 A sample index retrieval
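The lookup described above can be sketched in a few lines of Python. This is only an illustration, not how ISAM or VSAM is actually implemented: the 32-byte record length, the sample keys, and the function names build_master and fetch are all assumptions made for the example, and an in-memory buffer stands in for the master file on disk.

```python
import io

RECORD_SIZE = 32  # hypothetical fixed record length, in bytes


def build_master(records):
    """Write records back-to-back and build an index of key -> storage address."""
    master = io.BytesIO()  # stands in for the master file on disk
    index = {}             # stands in for the separate index file
    for key, payload in records:
        index[key] = master.tell()  # address where this record starts
        master.write(f"{key}|{payload}".encode("ascii").ljust(RECORD_SIZE))
    return master, index


def fetch(master, index, key):
    """Look the key up in the index, then read the record at that address."""
    master.seek(index[key])
    return master.read(RECORD_SIZE).decode("ascii").rstrip()


master, index = build_master(
    [("C103", "Jones"), ("A001", "Smith"), ("B042", "Lee")]
)

# Random access: go straight to one record by its key.
direct = fetch(master, index, "B042")

# Sequential access: walk the index in key order.
sequential = [fetch(master, index, key) for key in sorted(index)]
```

Note that the records were written in arrival order, not key order. The index alone supplies both the direct lookup and the key-ordered pass through the file.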
ISAM, like physical sequential files, stores the
records back-to-back, making for very efficient use of disk space. However,
unlike physical sequential files, ISAM files must be stored on disk, since the disk
addresses are needed to create the indexes. The physical location of records
within ISAM is not important since the indexes take care of the access to the
records. A single ISAM file may have many dozens of indexes, each allowing the
records to be retrieved in some pre-defined order. In some cases, the size of the
indexes will exceed the size of the base file.
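As a rough sketch of how several indexes can impose different orders on one set of records, consider the Python fragment below. The employee fields, the sample data, and the helper names build_index and read_in_order are invented for illustration, and a list position stands in for a disk address.

```python
# Records stored back-to-back in arrival order; the list position
# stands in for the disk address of each record.
records = [
    {"emp_id": 30, "name": "Jones", "dept": "Sales"},
    {"emp_id": 10, "name": "Smith", "dept": "Audit"},
    {"emp_id": 20, "name": "Adams", "dept": "Sales"},
]


def build_index(records, field):
    """Return (key, address) pairs sorted by key for one field."""
    return sorted((rec[field], addr) for addr, rec in enumerate(records))


def read_in_order(records, index):
    """Retrieve the records in the order defined by the given index."""
    return [records[addr] for _, addr in index]


# Two indexes over the same base file; the records themselves never move.
by_id = build_index(records, "emp_id")
by_name = build_index(records, "name")

ids_in_order = [r["emp_id"] for r in read_in_order(records, by_id)]
names_in_order = [r["name"] for r in read_in_order(records, by_name)]
```

Each additional index repeats every key and address, which is why the indexes taken together can grow larger than the base file.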
Note: It is also possible to create indexes on
BDAM files, so that developers enjoy both fast retrieval and storage of
records, as well as indexing for artificial sequencing of records.
Another popular access method is the Virtual
Storage Access Method (VSAM). It combines the best features of QSAM
and ISAM and adds a few new features. VSAM, like its cousin ISAM, allows
physical sequential files to be indexed on multiple data items. By having
multiple indexes, data can be retrieved directly in several ways and you can
access data anywhere in the file using a different index.
Shortcomings of flat files
Flat file systems had many problems and
shortcomings. One was the difficulty of sharing data. Generally, each
department developed its own systems, with its own file structures and
programming languages. Because of this, it was very difficult for
departments to share data and information.
Duplication of data was a problem, since many departments within a company would
often duplicate the same information, leading to higher storage costs. Also, if
one department updated data and another did not, discrepancies would result, and
the values would not be uniform at each location.
Maintenance problems also occurred because of
the number of programs, programming languages, duplication of data, etc. If a
file structure ever changed, trying to identify all of the programs that needed
modification was almost impossible. In addition, flat files possessed no real
backup and recovery methods. Programmers had to write programs to back up a
system before updates started. If a failure occurred, the files were corrupted,
and the programmers had to restore from the backup tape or disk and start over.
Another perplexing problem was that there was no standard method for accessing
the data. One application might require Cobol while another used Fortran, and so
on.
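The hand-written backup and restore routine mentioned above might look something like the following Python sketch. It is a modern illustration under stated assumptions, not a reconstruction of any period program: the file name master.dat, the .bak suffix, and the functions update_with_backup and failing_update are all hypothetical.

```python
import os
import shutil
import tempfile


def update_with_backup(master_path, update):
    """Copy the master file aside, run the update, and restore on failure."""
    backup_path = master_path + ".bak"
    shutil.copy2(master_path, backup_path)  # back up before updates start
    try:
        update(master_path)
        return True
    except Exception:
        # The file may be half-written, so put the backup copy back.
        shutil.copy2(backup_path, master_path)
        return False


def failing_update(path):
    """Simulate an update run that dies partway through."""
    with open(path, "a") as f:
        f.write("partial record")
    raise RuntimeError("simulated failure mid-update")


workdir = tempfile.mkdtemp()
master_path = os.path.join(workdir, "master.dat")
with open(master_path, "w") as f:
    f.write("A001 Smith\n")

ok = update_with_backup(master_path, failing_update)
with open(master_path) as f:
    restored = f.read()
```

After the restore, the whole update run has to start over, because nothing records how far the failed run got.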
Now that we understand the basic file storage
methods, let's take a look at how they can be combined to form what is known as a
database management system.