Pre-Disk Data Storage
On the earliest commercial
computers, drum storage was far too expensive to hold large volumes
of transaction data. Consequently, transactions were keyed onto
punched cards, and these cards were sorted and copied to a daily
transaction tape. This daily transaction tape was then processed
against the previous day’s sorted master tape (see Figure 1.1).
Figure 1.1 Transaction processing with magnetic tapes.
In this fashion, the data processing
site collected a historical archive of each day’s transactions.
These daily transaction tapes were used as the input to statistical
programs that read and aggregated the transactions according to
predefined rules, writing the aggregate summaries onto another tape
(see Figure 1.2). These tapes, in turn, were used by managers to
answer decision support queries similar to the queries serviced by
today’s data warehouses.
Figure 1.2 Data aggregation on magnetic tapes.
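The aggregation pass described above can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout, the field names (`region`, `amount_cents`), and the function name `aggregate_transactions` are hypothetical, and an in-memory list stands in for the sequentially read tape.

```python
from collections import defaultdict

def aggregate_transactions(transactions, key_fields):
    """Read the transactions once, in order, and total amounts per group.

    key_fields plays the role of the "predefined rule": it names the
    fields whose values define each summary group.
    """
    totals = defaultdict(int)
    for txn in transactions:
        group = tuple(txn[field] for field in key_fields)
        totals[group] += txn["amount_cents"]
    return dict(totals)

# Hypothetical daily transaction "tape" (field names are assumptions).
daily_tape = [
    {"region": "EAST", "amount_cents": 1250},
    {"region": "WEST", "amount_cents": 400},
    {"region": "EAST", "amount_cents": 750},
]

# The summary would be written to another tape; here it is a dict.
summary = aggregate_transactions(daily_tape, ["region"])
```

A single sequential pass matches the constraints of tape, which cannot be read efficiently in any order other than the one in which it was written.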
Early Disk-Based Data Storage
Prior to the development of early
commercial databases such as IMS, many "database" systems were
nothing more than a loose conglomeration of flat-file storage
methods on magnetic disks and drums. The term flat file
includes physical-sequential storage as well as the indexed
sequential access method (ISAM) and the virtual storage access
method (VSAM). Early flat-file systems such as ISAM and VSAM were
actually little more than physical-sequential files with indexes
stored on disks or drums.
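The idea of a physical-sequential file with an index can be illustrated with a minimal sketch. It is not a model of any real access method: the block size, the sample records, and the names `build_index` and `lookup` are assumptions made for the example. Records are kept in key order, grouped into fixed-size blocks, and a small index holds the highest key in each block.

```python
from bisect import bisect_left

BLOCK_SIZE = 3  # records per block; an arbitrary illustrative value

def build_index(sorted_records, block_size=BLOCK_SIZE):
    """Split key-ordered records into blocks and index each block's highest key."""
    blocks = [sorted_records[i:i + block_size]
              for i in range(0, len(sorted_records), block_size)]
    index = [block[-1][0] for block in blocks]
    return blocks, index

def lookup(blocks, index, key):
    """Use the index to pick one block, then scan that block sequentially."""
    pos = bisect_left(index, key)  # first block whose highest key >= key
    if pos == len(blocks):
        return None  # key is larger than every key in the file
    for record_key, payload in blocks[pos]:
        if record_key == key:
            return payload
    return None

# Hypothetical records, already sorted by key.
records = [(101, "Adams"), (105, "Baker"), (110, "Chen"), (120, "Diaz"),
           (131, "Evans"), (140, "Ford"), (152, "Gray")]
blocks, index = build_index(records)
found = lookup(blocks, index, 131)
```

The index narrows the search to one block, so only a short sequential scan is needed instead of reading the whole file from the start.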
The data access methods used by
these early disk systems were very primitive when compared to
today’s commercial databases. One of the most common disk access
methods was BDAM (Basic Direct Access
Method). BDAM was used for data records that required fast access
and retrieval of information. BDAM uses a hashing algorithm,
which takes a symbolic key and converts it into a location address
on a disk (a disk address). Unfortunately, the range of addresses
generated by hashing algorithms requires careful management. Because
a hashing algorithm always produces the same address each time it reads
a given key value, duplicate keys have to be avoided. BDAM file
structures also consume large amounts of disk storage. Because
records are randomly distributed across the disk device, it is
common to see hashed files with more unused spaces than occupied
spaces. In most cases, a BDAM file is considered "logically" full if
more than 70 percent of the space contains data records.
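The hashing scheme described above can be sketched as follows. This is a toy model, not BDAM itself: the class `HashedFile`, the slot count, and the use of `zlib.crc32` as the hash function are all assumptions chosen for illustration, and each list slot stands in for one disk address. The 70 percent threshold is the figure quoted in the text.

```python
import zlib

LOGICALLY_FULL = 0.70  # threshold taken from the text above

class HashedFile:
    """Toy direct-access file: each slot stands in for one disk address."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots

    def address_of(self, key):
        # Same key in, same address out, every time.
        return zlib.crc32(key.encode("utf-8")) % len(self.slots)

    def store(self, key, record):
        addr = self.address_of(key)
        if self.slots[addr] is not None:
            # Either a duplicate key or a different key that hashed here.
            raise ValueError(f"address {addr} is already occupied")
        self.slots[addr] = (key, record)
        return addr

    def fetch(self, key):
        entry = self.slots[self.address_of(key)]
        if entry is not None and entry[0] == key:
            return entry[1]
        return None

    def load_factor(self):
        used = sum(slot is not None for slot in self.slots)
        return used / len(self.slots)

    def is_logically_full(self):
        return self.load_factor() > LOGICALLY_FULL


# Hypothetical customer file with ten addresses.
customers = HashedFile(num_slots=10)
customers.store("CUST-1001", "Adams")
```

Because the address is computed directly from the key, a fetch touches exactly one slot. The cost is the one the text describes: the file must be sized well beyond the number of records, or stores begin to land on occupied addresses.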