Software Magazine
March 1995It seems as though modern
society is constantly on the lookout for the best
and brightest. Perhaps no place is this more true
than with new technology. As this tendency relates
to software: It is rare for firms to continue using
tools after more productive offerings have come
along.
Database management systems (DBMSs), however, seem
to be an exception. It is hardly uncommon to see a
large, mission-critical system running on a
25-year-old database architecture. In fact,
researchers note that the majority of large data
centers have yet to buy relational systems, and many
have no immediate plans to abandon their "prerelational"
database engines. Neither advances in the speed of
relational databases nor the widespread availability
of migration tools has persuaded these organizations
to test the relational waters.
Furthermore, says Tony Percy, "many nonrelational
systems will still be running the worlds business as
they do today in the year 2000, and even 80% of the
current IMS database systems under MVS will still be
in production in the year 2000." This prediction,
said Percy, vice president and research center
director, systems software technology at Gartner
Group, a Stamford, Conn., consultancy, has been
consistently echoed in the firm's surveys. "Whereas
relational database revenue has now overtaken
prerelational database revenue, there is no doubt
that legacy systems are going to be quite durable,"
he said.
As the relational database model enters its second
decade, even the most conservative IS shops consider
relational database products mature and safe. Why,
then, do some large organizations cringe at the
thought of database migration? One reason may be
that some shops have thousands of programs and have
invested man-centuries in the development of their
mission-critical databases.
According to Percy, there are other reasons why
these systems persevere. The first is that they "run
efficiently and are meeting highly predictable
workloads where consistent response time is
required, and they've been highly tuned," he said.
"Secondly, the [lost] opportunity costs of
reimplementing systems is enormous, so even if
people knew how they might implement them better m a
relational database, they haven't got the time,
money, resources or skills to do it."
A third reason, he said, relates to the learning
curve: "Unless people adopt different techniques of
designing their data and process models, they end up
abusing the models and fail at their migration."
It is true, says E.F Codd, inventor of the
relational database model, that a migration to
relational systems is no small task. "It is a
serious problem to convert between databases such as
IMS and relational because a lot of the activities
with IMS were developed by application programmers
instead of DBAs [database administrators].
"An example is the handling of missing values," he
said. "With IMS there could be umpteen different
ways that a missing value could be represented. When
the system moves to a relational database, this
becomes a nightmare especially when multiple columns
are retrieved by an SQL query. With a relational
database, the DBA defines the rules for missing
values."
Legacy Roots Go Deep
C.J. Date, a database specialist based in
Healdsburg, Calif., and an originator of the
relational database model, understands the "if it
ain't broke, don't fix it" attitude prevalent at
many large shops. "Legacy systems have a huge amount
of inertia. One of the first things I learned as a
programmer is that you don't fix a working system,'
said Date.
Indeed, many sites keep their nonrelational systems
simply because they see no compelling reason to
migrate. Eric Hasjford, manager of software
applications at Colorado Community Colleges and
Occupational Education Systems, in Denver, is very
happy with his prerelational architecture. "We use
VAX clusters with indexed RMS files, and roll data
into local databases. We use the Focus tool [from
Information Builders Inc. (IBI), New York City], and
we can join, merge and do all data manipulation
without SQL or a relational database. We see no
reason to migrate."
Hasjford has also implemented a mechanism that
simulates client/ server data access. "We've given
users menu options that meet their needs," he said.
"We have an update interface where they pull data
from RMS files or a Focus database and put them into
a spreadsheet." Users are then free to play with the
data and create batch transactions to upload their
updated spreadsheet data back to the RMS files on
the VAX, he explained.
Like Hasjford, others say their prerelational
systems suit them fine. Still, say proponents, it is
clear that relational technology has benefited
applications over the last 10 years. Three major
benefits are:
* Declarative data access: Relational SQL allows
queries that are independent of the navigation
method; data structures do not have to be manually
navigated one record at a time. An SQL optimizer
interrogates the query and determines the most
efficient way to return the requested information.
* Flexibility: Unlike their predecessors, relational
databases allow new tables to be freely added to a
system. Users need only provide the column names to
join old tables with the new, and the new table
instantly becomes part of the system.
* Responsiveness to change: Prior to relational SQL,
the modification of data structures was a painful
process. If a new data item was added to a system,
the entire system had to be taken offline while the
DBA ran a utility to visit every affected record,
physically restructuring the record and writing it
back into the database. With relational technology,
data columns can be added to a table with a simple
SQL command.
With the introduction of SQL came a radically new
way to query database information. Unlike the
navigational languages of hierarchical and network
databases, SQL does not require that a user
understand internal data structures to perform a
query. This declarative feature of SQL allows
database navigation to be done automatically by an
SQL optimizer. The optimizer gathers statistics from
the database dictionary and chooses the most
effective path for retrieving the data.
Indeed, when asked what distinguishes relational
databases from other databases, many people would
say SQL. Others cite the inherent flexibility of the
relational model over older databases such as the
hierarchical database and the Codasyl network
database. C.J. Date, however, does not believe that
flexibility or SQL are the most compelling reasons
for migrating to relational technology. "There is
much more to relational databases than SQL. The
relational model offers a sound logical foundation.
It is not an ad hoc architecture," he said.
Logical foundation or not, the University of New
Mexico in Albuquerque is simply not willing to cast
aside the huge investment it has in its large CA-IDMS
database from Computer Associates (CA) International
Inc., Islandia, N.Y The DBMS handles critical
administrative areas, from student scheduling to
financial aid. Although the university is examining
relational databases, they see no immediate need to
undergo an expensive conversion, said Fran McKee,
the university's manager of database administration.
"Our immediate goal is to keep IDMS and introduce
client/server," said McKee. "We are using Ca-Visual
Express to begin client/server development, and
Visual Express communicates with IDMS directly with
SQL. Since we already have SQL with the CA-IDMS SQL
option and we are getting our feet wet [with SQL],
we see no reason to move to a relational database."
Although McKee said programmers are encouraged to
create small relational databases for new systems,
our investment in the IDMS network is huge and we
will stay where we are." She acknowledges that
changes made to the university's databases mean that
the system needs to be taken down each Sunday for
restructuring. But, she said, "its a cost of doing
business."
This need to restructure is one of the cited
downsides of prerelational data architectures, which
include flat file, hierarchical or network. With
hierarchical, network and object databases, data
relationships are established by creating pointers
to link related information. If new data
relationships are required after the database has
been populated, the database must be restructured,
and each affected record must be altered to add a
pointer for the new relationship. This is a
time-consuming operation that requires the system to
be taken offline for the duration of the
restructuring.
Prerelational proponents would counter that
relational technology. has its own problems. Early
relational databases were criticized for their
consumption of processing resources; many were
painfully slower than their hierarchical and network
counterparts. IS shops that had highspeed
requirements were unable to migrate to relational
because they could not support the performance
requirements. Users with transaction volumes that
exceed 500 transactions/sec. still find many
relational databases lacking in performance.
Migration Procession
For those shops that decide that relational
databases are worth the headaches associated with
getting there, there are certain basic steps that
underlie any migration. What's involved is the
migration of data into relational tables and
migration of the application programs to SQL. To
migrate data, it must be offloaded into flat files.
Definitions must be offloaded and converted to table
definitions. Then the data must be imported into
relational tables.
To migrate applications, the database manipulation
language must be mapped to SQL, and parallel and
acceptance testing performed.
Though every migration is different, there are two
general approaches. One migrates applications first,
moving the data into relational tables as the final
step. The other approach adds SQL to the
prerelational database, in what is called a "phased
migration."
In a phased migration, an SQL wrapper is placed over
the nonrelational engine. This way, the applications
can be rewritten over time. This technique is best
for properly normalized databases where there is a
one-to-one correspondence between the nonrelational
record types and the relational tables.
With a "cutover migration," the process begins with
conversion of the nonrelational data. The entire
database (or a subset of data) is stored in tables
that parallel the production system. This technique
is advisable for non-normalized data structures such
as Vsam or Isam that require extensive redesign
before SQL can be used. Following normalization and
conversion of the data, parallel applications are
rewritten to use SQL, and the new system is tested
in parallel with the old system.
Tools that help legacy shops abandon their aging
platforms and enter the relational world cover a
range of functionality, from data migration to data
modeling. Data migration tools include table
normalization tools and table conversion utilities.
Phased migration tools include SQL wrappers and
relational gateway products.
Of the available migration tools on the market,
there are few that facilitate a phased migration
approach. For Atlanta-based World-Span, which
manages reservations for major airlines, the answer
was Informix-SE from Informix Software Inc., Menlo
Park, Calif.
WorldSpan had an order entry system running on Vsam
that they wanted to migrate to DB2. "We were looking
at a very long migration, so we evaluated
Informix-SE," which functions as an SQL wrapper,
allowing WorldSpan to use SQL against Isam files
without immediately changing their file structures,
said Bill Duncan, a project manager at WorldSpan.
"It was the flexibility of the Informix
import/export utility that led the way for us."
Duncan said they easily moved the data from their
mainframe, and used Informix database tools to
develop the new order entry application. Once the
application was running, they normalized the Isam
files, and redesigned the system from the bottom up
using the Informix Online relational database and
various Unix programming tools.
Though WorldSpan migrated the system twice, Duncan
felt it was the only choice. "If we had to struggle
with normalization and the initial migration, we
would have been spinning our wheels. This
intermediate phase with Informix-SE was not that
significant. It allowed us to get off the mainframe
fast, and get the system running under Unix," he
said.
There are those, however, that claim these types of
tools can be abused unless the migration team plans
to normalize Isam files into relational table
structures. One of these is Joe Degido, an Informix
systems engineer, who warns about using tools like
Informix-SE solely for its SQL capabilities without
redesigning the Isam structures. "When using C-Isam
as a migration tool, you can expect that the C-Isam
files are unnormalized and have repetitive keys. You
need to look at normalizing the files into
relational tables.
"If someone said 'I'm going to move my enterprise
forward by staying with C-Isam,' I'd say they were
out of their mind," he added. "Without a full
migration to SQL they couldn't use other helpful
[application development] tools such as Gupta
[provides]."
Like Degido, Joe Celko, SQL guru, and a consultant
at Atlanta-based OSoft Corp., warns against moving
Isam files into relational tables without
redesigning the Isam structures. "Without redesign,
the system will run like molasses and updates will
be very difficult. The redundancy in unnormalized
systems leads to anomalies that can require hundreds
of hours of manual code checking."
However, added Celko, "you can't quickly change
thousands of production programs, and SQL wrappers
have become a necessary evil." In fact, said Celko,
they have become so popular that the National
Institute for Standards and Technology (Nist) is in
the process of creating standards for the
cross-access internals of these products.
Necessary evil, maybe, says Ken Orr, but he
nonetheless warns against reliance on SQL wrappers.
"I have been teaching database design for 25 years,
and the proper redesign of the data structures is
essential to good performance," said Orr, principal
of The Ken Orr Institute, Topeka, Kan. "I teach that
the key to a successful conversion is database
design. If you get that wrong, then everything
becomes more difficult.
"The immaculate conception of database design from
the specifications is a falsehood," Orr continued.
"If the system is not redesigned for the relational
model you get into intellectual cul-de-sacs."
For those shops migrating legacy systems, many new
concepts, database products and tools will be
introduced. Those moving from the mainframe to a
Unix-or PC-based environment are right to be
concerned about the transfer of skills.
Ray Koukari, director of MIS at Arctco Inc., Thief
River Falls, Minn., managed a massive conversion of
a legacy system from a mainframe to a PC/LAN network
with Tricord ES 5000 data servers and a Btrieve
extended relational database engine from Btrieve
Technologies Inc., Austin, Texas. The system had 230
concurrent users and 5OGb of data.
"We had a large Unisys mainframe running the DMS-2
database that we used as a flat-hierarchy. We have a
very large manufacturing system that was eight years
old and badly in need of a rewrite. We couldn't
respond to users' needs and we couldn't provide easy
to use and easy to understand applications."
According to Koukari, the old system used
character-based screens that forced them to use
complex codes. He said training time took anywhere
from six months to a year.
"We wanted to keep all of our existing programmers
since they knew the business. We initially tried
training in Visual C++, but the programmers
protested and we tried Visual Basic where we had
great success. Our system was so tightly coupled
with the business that there was no way that we
could do a phased migration, so we went on a
two-and-ahalf year effort to rewrite all of our
programs," he said.
When they began the rewrite in 1992, there were few
tools to aid in migrating legacy systems so "we did
everything with in-house software," said Koukari.
"We initially expected this conversion to cost more
than our mainframe, but we found that we are saving
hundreds of thousands of dollars in maintenance
costs alone. We are also having better performance
on the PC/LAN than we could ever get from the
mainframe."
Although the conversion was painful at times, said
Koukari, "we are very happy that we migrated."
Not everyone moving to relational technology is as
concerned with skills transfer. For some, the
introduction of new technology provides a chance to
introduce new blood as well. One such firm is Abril
Group, Sao Paulo, Brazil, which is facing a massive
conversion of a large IMS database to relational.
"We looked at options, and it is easier to rewrite
the new system using new concepts than it is to
migrate," said Dorival Dourado Jr., Abril's
technology manager. Dourado works with a system that
manages a critical part of the largest
communications conglomerate in Brazil.
The system consumes 45Gb, has 7,000 programs with
25,000 lines of Cobol code, and a staff of 40
programmers. According to Dourado, Abril sees this
migration as an opportunity to redesign the system
and change the programming team. "We see a lot of
problems retraining our programmers. They all know
CICS, Cobol and IMS and it may be difficult for them
to learn new concepts. It is a very difficult
transition since our staff has old concepts of
system architecture," he said.
He wants to outsource the system redesign,
retraining only the best programmers. "We need new
skills for network management, performance and
database administration."
Dourado envisions a complete transition into a new
environment. "We want to stay with Cobol, since most
of our logic can be ported, but we are looking at
Oracle with Unix and a Novell network with PC front
ends," he said.
Which One's Right?
Even with dozens of tools in the marketplace, it may
be no easy task to find the right one. One company
on the lookout for a migration tool is Agris Corp.,
a Roswell, Ga., company that markets agri-business
management software. According to Karl Swager, a
systems development engineer, the company has
decided to undertake a massive conversion to
relational database technology within its software
products. "We ship a product to manage agri-business
and we want a single product that we can ship to all
of our customers, from giant agri-business concerns
to mom and pop feed stores," said Swager. "We are
looking at a 30 to 35 manyear conversion effort to
create a relational product that is
platform-independent and has a GUI front end."
Agris, said Swager, wants to provide its customers
with a graphical environment that is as independent
and flexible as possible. "We will bundle a database
for the low-end users, but larger customers will
have their own databases and we will use ODBC," he
said.
Despite their ambitious plans, Swager is skeptical
about using a migration tool. "We are evaluating
migration tools and have narrowed them down to three
or four products, but we haven't found one that is
perfect for our needs."
Migration tools are notoriously buggy and we have
more control by doing it ourselves. Our resources
are our people, and they are our tools," said Frank
Smith, an information technologist for Racal Datacom
in Sunrise, Ma. "We've bought our 3090 mainframe and
are heavily invested in our hardware and databases.
We are beginning to move into relational databases
slowly, moving our financial component from IDMS
onto an AS/400 using [IBI's] EDA/SQL for
connectivity."
Claims about the lack of robust migration tools may
be valid. Observers say, however, that options are
expanding.
For example, Unidata Inc., Denver, offers a "nested
relational" engine that assists IMS and IDMS users
in migrating to relational platforms. This nested
relational technology allows for repeating data
groups to be stored in a single cell of a table,
thereby reducing the need to perform SQL join
operations and increasing the speed of the migrated
system. Kenan Technologies, Cambridge, Mass., offers
Multiway, a multidimensional database product with
similar features.
As such database technology advances into the next
century, large shops will find their paths to
relational a bit easier to tread. Even so,
speculation is sure to continue concerning when, if
ever, prerelational databases win end up m the
technological scrap heap. |