Subject: On Legacy Applications and Previous Work
From: Paul Robinson <PAUL@TDR.COM>
Organization: Tansin A. Darcos & Company, Silver Spring, MD USA
Date: Sun, 6 Mar 1994 14:25:30 GMT
Most of the time on here I see discussions of new work being done in the
development of translator applications. (I say that rather than
'compilers' because a lot of the discussion involves tools like Yacc that
are used to build compilers).
I notice that most, if not nearly all, of the material deals with the
latest techniques for improving code generation, for improving the way
compilers are designed and built, and for implementing the newest ideas.
I'm going to make some comments that I think need to be thought about in
this forum, since people here are probably directly involved in the
creation and maintenance of compilers. (I am assuming people use the
information here for more than just academic background, and that some
real work goes on.)
I wouldn't have thought much about real issues until my eyes were opened.
I happened to purchase a copy of Edward Yourdon's "Death of the American
Programmer." People may not agree with all of his ideas, but he made a
couple of very important points that I do agree with. You may scoff at
the points I am about to make, but would anyone in 1971 have believed a
prediction that Japanese cars would almost kill the U.S. auto industry
within ten years? Two years later the oil crisis struck.
This article focuses rather heavily on COBOL, but I suspect that in all
but the newest places, more than 1/2 of all pre-existing programs are
COBOL applications more than five years old.
In this article I use the term "application group" for several programs
that are used together as a complete solution to a problem. While there
may be independent programs in the group that stand alone, these programs
often pass information to and from each other and usually share files.
Here are the points to be made:
1. Most programmers - my personal estimate is 90% - will spend 75% of
their time maintaining "legacy applications", i.e. the old programs
in use at companies that are the "crown jewels" of the company.
Where a company has a financial application system - payroll,
accounts payable, and accounts receivable, along with perhaps other
programs - the company may have a large chunk of their
people doing nothing but maintenance on that application group.
If they have, say, ten people doing maintenance on that particular
application group, and have been doing so for perhaps ten years,
based on salaries and costs, they have more than five million dollars
tied up in that package. Further, if it has 10 people working on it,
chances are you would need that many MORE people if you wanted to
create a new one to do the same thing.
And if 10 people are working on this one application group, that's
probably half of the programming staff (remember, this one
application group is the lifeblood of the company, if it was
unavailable the company would probably go bankrupt in a matter of
weeks (if even that long)). Therefore adding 10 more people (assuming
you could find them) would be a major shock to the programming
staff (as it would be at any place that suddenly had a 50% increase
in personnel), perhaps causing people to move elsewhere!
Take the numbers any way you want, whether it's 10 people or 40 (as
Yourdon gives in one example), most of the people in the office are
busy doing "life support", usually centering on the care and feeding
of that critical application group that keeps the company alive.
Further, unless you want a shoddy piece of junk that is just as bad as
the application group that is there, there needs to be quality built
into it. That means a new set of applications to replace the current
ones would take a while. If it took, say, two years to create a
replacement for the current application group, we may be looking at,
in effect, paying ten people for two years before they produce any
actual work which is usable by the company. I doubt that very many
companies can justify a 3/4 million dollar investment to replace
something that already works and still spend the same amount keeping
the old system alive until the new one is finished.
Oh, yes, if they do replace modules as they create them there might
be some improved functionality, but can anyone see doubling the staff
working on an application so that eventually, down the line, the new
staff can be eliminated and the old staff reduced in size because
less maintenance is required? Beyond the fact that people are still
needing things to be done in the interim? Or that a manager would
spend money on more people to rewrite the applications to enable the
company to reduce the amount of unpaid overtime? (Too many places
rely on 'Spanish Prisoner' accounting of programmers, i.e. their cost
figures for programmers count on the regular and chronic unpaid
overtime those programmers put in.)
We often refer to these legacy applications as "old COBOL dinosaurs"
(or old FORTRAN dinosaurs, but most places that have significant
maintenance issues are in business applications; if places still had
a lot of RSTS/E systems around, we might talk about old BASIC
dinosaurs).
But this sort of mudslinging is a disservice: these applications
still do useful work, they are critical to the functioning of the
organization, which probably literally depends for its life on the
application group, and the organization CANNOT AFFORD to replace
them. Further, these large application groups - perhaps a dozen main
programs and hundreds of modules, consisting of perhaps 2 million
lines of code - are usually so complicated that nobody really knows
how they work anymore; rewriting them is out of the question when
you don't even know what they do.
Therefore, what we need - what is desperately needed in the real
world - are better tools to perform maintenance on existing
applications.
2. Do you know how much money a programmer with a college degree and
C programming experience makes in China? About $180 a month. And
this is not "slave wages" - I understand it's about the equivalent
of what a minister in a cabinet post makes there. Maintenance
programmers in the U.S. today - and possibly in other Western
nations - are therefore very close to the exact position of U.S. auto
workers in the 1960s or 1970s, before Japan came along and cleaned
our clock for us.
Let's not forget that India is doing a lot of transshipment of
programmers here, where they can do better quality work than we do
for less money. I won't argue over whether this is good or bad;
the fact is, it is happening now. Even if there wasn't a single
imported programmer allowed into the country, software is not a
product that is capital intensive; the people don't have to be where
the work is.
Which leads to the next step: sending the specifications overseas and
having a shop there write the code, then bringing it back here to have
local programmers adjust it for local functionality (e.g. to ensure
the commas are in the right place, etc.). The cost of doing work like
this is probably an order of magnitude less than having it done
entirely within the U.S.
Don't be too smug if you're in Europe. India has had 350 years of
British rule; a lot of people there speak English well and have had
extensive education which was developed by England. How can you
tell whether a programmer who wrote a program did so from a terminal
in Cambridge or one in Calcutta? Especially if the program is
published by an EC-based company with a London address.
But don't think cost is everything: if labor cost were the only
consideration in the creation of a product, Haiti would have
almost no unemployment.
Do you know why someone working on a shop floor in Detroit makes
ten times as much money as someone doing the identical work on
a shop floor in Mexico? Because the American has the education
and tools to make him ten times as productive.
Western programmers have not had a "Japanese Import" problem yet
to shake them up into improving the way they create programs.
A lot of people doing COBOL coding in shops, who haven't taken a
programming course in ten years, are marked men, waiting to be
cut down by the same thing that forced layoffs in Detroit: foreign
imports of better quality that cost less.
Did anyone notice that the premier graphics package, Corel Draw!,
was created by a company in Canada? The game "Tetris" was developed
by two programmers in Moscow. Minor inroads, but they are the hints
of future developments.
If we continue the way things are, eventually, as Japan did to the
auto industry in the 1970s/1980s, some hungry foreign country is
going to eat our lunch in software development. If we change now,
it will be easier. If we wait until we are forced to do so, it
will mean that the same gut-wrenching changes - and programmer
unemployment - will hit the software industry as struck Detroit.
We can change this. And those who develop the tools to help today's
programmers stay competitive can make money hand over fist.
To ensure that we are worth the high wages we are getting, we have
to be much more productive. Most people who are programming are
already overworked as it is. They do not need to work harder; what
they need is to work smarter.
- We need better tools, and ways to make maintenance - which is still
70% or more of most people's work - easier to accomplish.
- We need to increase the amount of code which is reused. From the
programming courses on, copying from other people has been frowned
upon; people who write programs by borrowing other code get the
Rodney Dangerfield treatment (which is why maintenance programmers
are held in such low esteem). Yet the best shops are able to
reuse 80% of their code. Most shops are lucky to do 20% reuse.
Does anyone seriously think that rewriting the COBOL overhead -
four divisions and various statements - every single time a program
is created makes any sense? No, and most people have a skeleton
frame of the program to start with. Yet that practice in and of
itself shows that having usable pre-written code to pick from
makes people more productive.
- We need to get better CASE tools. Computers can manipulate text
quite well; it's time to develop better program generators and
precompilers. Who is more productive? Someone who uses a full
screen painter like Borland Resource Workshop to create MS-Windows
panels, or someone who codes them as text statements?
Why then, can't we go further and allow people to pick functions
and create a program graphically? Yes, I know there are some
applications out there that do provide this, but in many cases
they are only of use to you if you are creating new work, not
maintaining old work. And they often don't provide enough
functionality; many times you still have to go in and edit the
generated source to get the results right; and they usually do not
support people adding their own statements. If I manually edit a
.RC file and add a new item, the Borland Resource Workshop will
create the on-screen window, dialog box and controls to match the
specifications. Such capability is rarely found in program
generators.
- We need to develop better repositories to store source code and
the other things that go along with it, such as documentation,
together with full-screen browsing and good dictionary and definition
specifications. Nobody is going to want to look for a routine
in a 3,000 page listing of routines; on-line browsing has got to be
available.
- Good cross-reference and indexing tools are needed. When was the
last time you saw a cross-reference tool for DBASE or Pascal, or
even saw a PC-Based compiler for any language that created cross
reference or variable usage tables? Probably the best cross
reference tool I've ever seen is the FORMAT program which was
written by someone in Germany (the code is in Pascal; the comments
are in German). It's a Pascal cross-reference, pretty printer and
indexer all in one.
It does one very nice thing that I've not seen in some mainframe
COBOL compilers. For each procedure, it lists the procedures that
it calls by line numbers, and also lists the name and line number
of every procedure that calls it. Listings like this are CRUCIAL
for maintenance of a program, where you are somewhere in a program
and trying to figure out how it got there.
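As a rough sketch of the kind of "calls" / "called by" listing described above, here is a small Python example. It is not how FORMAT works (I have only the description above to go on); the toy "PROC name" / "CALL name" syntax, the function name cross_reference and the sample program are all made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical toy syntax: "PROC name" opens a procedure, "CALL name" invokes one.
PROC_RE = re.compile(r"^\s*PROC\s+([\w-]+)", re.IGNORECASE)
CALL_RE = re.compile(r"\bCALL\s+([\w-]+)", re.IGNORECASE)

def cross_reference(source):
    """Return (defined_at, calls, called_by) for the toy language above."""
    defined_at = {}                # procedure -> line where it is defined
    calls = defaultdict(list)      # caller -> [(callee, line of the call)]
    called_by = defaultdict(list)  # callee -> [(caller, line of the call)]
    current = None
    for lineno, line in enumerate(source.splitlines(), start=1):
        header = PROC_RE.match(line)
        if header:
            current = header.group(1).upper()
            defined_at[current] = lineno
            continue
        if current is None:
            continue  # text before the first procedure is ignored
        for callee in CALL_RE.findall(line):
            callee = callee.upper()
            calls[current].append((callee, lineno))
            called_by[callee].append((current, lineno))
    return defined_at, dict(calls), dict(called_by)

SAMPLE = """\
PROC MAIN
  CALL READ-INPUT
  CALL COMPUTE-PAY
PROC READ-INPUT
  CALL LOG-ERROR
PROC COMPUTE-PAY
  CALL LOG-ERROR
PROC LOG-ERROR
"""

if __name__ == "__main__":
    defined_at, calls, called_by = cross_reference(SAMPLE)
    for name in sorted(defined_at):
        print(f"{name} (defined at line {defined_at[name]})")
        print("  calls:    ", calls.get(name, []))
        print("  called by:", called_by.get(name, []))
```

A real tool would need a proper parser for the language in question; the point is only that both directions of the listing fall out of a single pass over the source.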
Do you (or your company) want to make some money and provide something
desperately needed for 70% of the work done? Stop thinking about
writing the next YACC and consider the needs of the poor maintenance
programmer out there.
- Create a good "macro" type library for COBOL or FORTRAN or BASIC
or any of the older languages that people will be continuing
to do maintenance in twenty years from now. Even if it's only a
preprocessor, it will help in doing maintenance on existing
applications, by allowing heavier use of pre-written routines.
C has a fairly good macro language with parameter substitution.
COBOL has, at best, some COPY constructs that allow some changing of
prior values, and usually all people use it for is to copy
declarations of variables. We need to encourage people to be able
to copy procedures and even inline code segments.
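To make the preprocessor idea concrete, here is a minimal sketch in Python of a macro expander with parameter substitution. The %DEFINE / %END / %name(...) directives are hypothetical syntax invented for this example; they are not part of any COBOL standard or of the COPY statement, and the function names are mine.

```python
import re

# Hypothetical directive syntax (not part of any COBOL standard):
#   %DEFINE name(param, ...)   starts a macro body
#   %END                       ends it
#   %name(arg, ...)            expands the body with arguments substituted
DEFINE_RE = re.compile(r"^\s*%DEFINE\s+([\w-]+)\s*\(([^)]*)\)\s*$")
INVOKE_RE = re.compile(r"^(\s*)%([\w-]+)\s*\(([^)]*)\)\s*$")

def split_list(text):
    """Split a comma-separated parameter or argument list."""
    return [item.strip() for item in text.split(",") if item.strip()]

def substitute(line, params, args):
    """Replace whole-word occurrences of each parameter in a single pass."""
    if not params:
        return line
    mapping = dict(zip(params, args))
    names = "|".join(re.escape(p) for p in params)
    pattern = re.compile(r"(?<![\w-])(" + names + r")(?![\w-])")
    return pattern.sub(lambda m: mapping[m.group(1)], line)

def expand_macros(source):
    """Return source with macro definitions removed and invocations expanded."""
    macros = {}   # name -> (parameter names, body lines)
    output = []
    lines = iter(source.splitlines())
    for line in lines:
        definition = DEFINE_RE.match(line)
        if definition:
            body = []
            for body_line in lines:          # consume lines up to %END
                if body_line.strip() == "%END":
                    break
                body.append(body_line)
            macros[definition.group(1)] = (split_list(definition.group(2)), body)
            continue
        call = INVOKE_RE.match(line)
        if call and call.group(2) in macros:
            indent, name, raw_args = call.groups()
            params, body = macros[name]
            args = split_list(raw_args)
            if len(args) != len(params):
                raise ValueError(f"{name}: expected {len(params)} arguments, got {len(args)}")
            output.extend(indent + substitute(b, params, args) for b in body)
            continue
        output.append(line)
    return "\n".join(output)

SAMPLE = """\
%DEFINE ADD-TO(TARGET, AMOUNT)
ADD AMOUNT TO TARGET
IF TARGET > 99999 MOVE 99999 TO TARGET
%END
    MOVE 0 TO WS-TOTAL
    %ADD-TO(WS-TOTAL, WS-PAY)
"""

if __name__ == "__main__":
    print(expand_macros(SAMPLE))
```

The output of such a preprocessor is ordinary source that the existing compiler accepts unchanged, which is what makes this approach usable on an existing application.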
- We need good "smart" compilers and linkers as well as support
for handling the overhead of compilation by the machine. The
first time I ever saw a "make" routine and how it was smart
enough to know what had been edited and needed to be recompiled,
what had been recompiled and needed to be relinked, blew me away.
I had been doing this sort of thing manually on mainframes.
Also, Borland's compilers strip unused procedures, and for objects,
unused methods, automatically.
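The core of what "make" does here can be sketched in a few lines of Python. This is an illustration of the idea, not the algorithm of any particular make implementation; the rule table, the file names and the function names are hypothetical, and the sketch assumes the dependencies contain no cycles.

```python
import os

def read_mtimes(paths):
    """Collect modification times for the files that currently exist."""
    return {p: os.path.getmtime(p) for p in paths if os.path.exists(p)}

def plan_rebuild(rules, mtimes, goal):
    """Return the targets to rebuild, in build order, to bring goal up to date.

    rules:  target -> list of files it is built from (hypothetical rule table)
    mtimes: file -> modification time; a missing key means the file is absent
    """
    plan = []
    decided = {}   # name -> True if it has to be rebuilt

    def visit(name):
        if name in decided:
            return decided[name]
        deps = rules.get(name)
        if deps is None:                       # a plain source file
            if name not in mtimes:
                raise FileNotFoundError(f"no rule to make {name}")
            decided[name] = False
            return False
        rebuilt = [visit(dep) for dep in deps]  # visit every dependency first
        newest = max((mtimes.get(dep, 0) for dep in deps), default=0)
        stale = name not in mtimes or any(rebuilt) or newest > mtimes[name]
        if stale:
            plan.append(name)
        decided[name] = stale
        return stale

    visit(goal)
    return plan

if __name__ == "__main__":
    rules = {
        "payroll.exe": ["payroll.obj", "taxes.obj"],
        "payroll.obj": ["payroll.cbl"],
        "taxes.obj": ["taxes.cbl"],
    }
    # Made-up timestamps: taxes.cbl was edited after taxes.obj was compiled.
    mtimes = {"payroll.cbl": 100, "payroll.obj": 150,
              "taxes.cbl": 300, "taxes.obj": 200, "payroll.exe": 250}
    print(plan_rebuild(rules, mtimes, "payroll.exe"))
```

With the timestamps shown, only the module whose source changed is recompiled, and the executable is relinked because one of its inputs was rebuilt.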
- Create better checkout and merge capability for applications
which are worked on by multiple people. Checkout facilities for
mainframes and (now appearing) for PCs do help, but they are
still weak in giving good inclusion capability for procedures
and code fragments. And they usually don't have facilities for
tying documentation files to program sources.
- Integrate these functions so that recompiling a program triggers a
rerun of the cross-reference, which replaces the old cross reference
file if it is stored in the archive, and also replaces old binaries.
- Support archive members at the compiler level; it is easy to
encourage people to keep hundreds of 500 byte or smaller files
if they can store them in a .ZIP, .ZOO or .ARC archive for
subsequent extraction and thus keep all the related files together
(and save space) but not if these small files are stored as single
files using 4K or 8K each.
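Two things in the point above are easy to show in code: the space lost when every small file takes a whole allocation unit, and reading one member straight out of an archive without unpacking it. The following Python sketch uses the standard zipfile module; the member name, the copybook text and the helper names are made up for the example, and a compiler would of course do this internally rather than through a script.

```python
import io
import zipfile

def allocated_bytes(file_sizes, cluster_size):
    """Disk space used when each file occupies a whole number of clusters."""
    return sum(-(-size // cluster_size) * cluster_size for size in file_sizes)

def build_archive(members):
    """Pack {name: text} into an in-memory .ZIP archive (for the demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, text in members.items():
            archive.writestr(name, text)
    return buffer.getvalue()

def read_member(archive_bytes, member_name):
    """Return the text of one archive member without extracting it to disk."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return archive.read(member_name).decode("ascii")

if __name__ == "__main__":
    sizes = [500] * 200                      # two hundred 500-byte fragments
    print("raw bytes:      ", sum(sizes))
    print("on 4K clusters: ", allocated_bytes(sizes, 4096))

    archive = build_archive({"PAYCALC.CPY": "ADD WS-PAY TO WS-TOTAL.\n"})
    print(read_member(archive, "PAYCALC.CPY"), end="")
```

Two hundred 500-byte fragments hold 100,000 bytes of text but occupy 819,200 bytes when each one takes a 4K cluster, which is the waste the archive avoids.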
- Start creating libraries of pre-written, tested procedures and
code segments that people can use when writing applications or when
doing maintenance on existing ones. Sure, the usual library
routines supplied with a compiler provide some of these, but
people often need source because they want a specific function, but
they want it to do less than the prewritten one does, or less
combined with other functionality.
For example, I might have a small table, say 600 entries of
15 bytes each, which I read in either from another program's output
or from data supplied by an individual. I get better performance doing
searches if the table is in order. I do not want to order the person
supplying the information to alphabetize it (and it might be coming
from another program which cannot be told to do so). Further, I might
be using only part of the data in a file that is then passed on to
yet another program, which expects the file a certain way,
so I can't change the input to pre-sort it.
In this case, I do not need (and probably don't want to invoke) a
full-blown disk-based sort for a table this small, that might only
be 10K in size; what I need is a reasonably good in-memory sort
routine. Yet for a table this small, even a dictionary sort is
sufficient, since even a worst case is only going to be a second or
so of running time, insignificant if this is, for example, the
initialization of a program that runs for minutes or hours. A
reasonable dictionary sort requires two loops and about 5-10 lines
of code.
What usually happens in a case like this is that the programmer
spends time rewriting a dictionary sort because there is nothing
available to give him this functionality. And if he needs to do some
work on the table while sorting, even if there is a built-in simple
sort, he can't use it because he can't change the process. Where
he might just need a 10-line sort to insert into a 20-line procedure,
he wastes considerable time writing a simple sort because the
feature he needs isn't there. If it is there, is there a way he
can find it? To find it, he needs well-indexed libraries, and he
needs on-line search capability.
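For a sense of how small such a routine is, here is a sketch in Python of a two-loop in-memory sort over fixed-width records. I am using an insertion sort as a stand-in for the simple sort described above, and the 6-byte key and the sample entries are assumptions made for the example; only the 15-byte entry size comes from the text.

```python
RECORD_SIZE = 15   # bytes per entry, as in the example above
KEY_SIZE = 6       # assumption: the leading bytes of each entry are the sort key

def split_records(buffer, record_size=RECORD_SIZE):
    """Cut a flat byte buffer into fixed-width records."""
    if len(buffer) % record_size:
        raise ValueError("buffer is not a whole number of records")
    return [buffer[i:i + record_size] for i in range(0, len(buffer), record_size)]

def insertion_sort(records, key_size=KEY_SIZE):
    """Sort records in place by their leading key bytes (stable, two loops)."""
    for i in range(1, len(records)):
        current = records[i]
        j = i - 1
        # Shift larger entries one slot to the right to open a gap.
        while j >= 0 and records[j][:key_size] > current[:key_size]:
            records[j + 1] = records[j]
            j -= 1
        records[j + 1] = current
    return records

if __name__ == "__main__":
    table = b"SMITH 000012345" b"JONES 000067890" b"ADAMS 000000042"
    for record in insertion_sort(split_records(table)):
        print(record.decode("ascii"))
```

In Python itself one would just call sorted(); the routine is spelled out to show the size of what a library of source fragments would need to supply in a language without one, and because the loop body is where extra per-entry work could be added while sorting.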
- Make cross-compiling easier. With the increase in processor
speed and size of disks and memory, it's only a matter of time
until 370 emulators start bringing those old COBOL programs onto
the new "mainframe" which is the size of a Tower microcomputer. If
people can move from MVS COBOL to MS or Ryan McFarlane COBOL, it
will make their job easier, more so if the program will run in
exactly the same way as the original program, without the user
having to rewrite any standard construct.
Also, there should be support for changing what were VSAM files
into the PC equivalent, B-trees, or into data files compatible
with standard formats, such as DBASE, Oracle, SQL, or any other
environment which is being made available on PCs. Don't forget
record locking on Novell to replace the standard ENQ/DEQ on
the mainframe.
Also, if he's got an inquiry program for CICS, he needs to develop
a way to change his screens that were painted so that he can take
the original IBM Assembler specification and translate that into
a panel in MS-Windows or OS/2, or even a DOS text-mode screen.
- If someone is debugging a program, they should be able to use
full-screen debugging of the source language, not debugging in
assembler. The debugger should support the data structures in
the output file and allow their analysis in the same way as the
source language does, so that I don't have to say 3e48:1023 and
see 3000, instead I should be able to reference "CURRENT-PAY" and
if it's COMP-1, see 480. If that's not unique, of course, then
I have to say "CURRENT-PAY in PAY-RECORD".
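The name lookup described above can be sketched as follows in Python. The symbol table layout, the offsets, the record names other than PAY-RECORD and the 16-bit big-endian storage are all assumptions made for the example; how a given USAGE clause is actually stored depends on the compiler, and a real debugger would read this information from the compiler's output.

```python
import struct

# Hypothetical symbol table a compiler could emit for the debugger:
# (item name, containing group) -> (offset in the data area, struct format)
SYMBOLS = {
    ("EMPLOYEE-ID", "PAY-RECORD"): (0, ">i"),
    ("CURRENT-PAY", "PAY-RECORD"): (4, ">h"),     # assumed 16-bit binary field
    ("CURRENT-PAY", "AUDIT-RECORD"): (16, ">h"),
}

class AmbiguousReference(Exception):
    """Raised when an unqualified name matches more than one data item."""

def resolve(reference, symbols=SYMBOLS):
    """Resolve 'NAME' or 'NAME IN GROUP' to a single symbol-table key."""
    parts = reference.upper().split()
    if len(parts) == 3 and parts[1] in ("IN", "OF"):
        key = (parts[0], parts[2])
        if key not in symbols:
            raise KeyError(f"{reference}: no such data item")
        return key
    if len(parts) != 1:
        raise ValueError(f"cannot parse reference: {reference!r}")
    matches = [key for key in symbols if key[0] == parts[0]]
    if not matches:
        raise KeyError(f"{reference}: no such data item")
    if len(matches) > 1:
        groups = ", ".join(sorted(group for _, group in matches))
        raise AmbiguousReference(f"{parts[0]} is not unique; qualify it with IN {groups}")
    return matches[0]

def read_item(memory, reference, symbols=SYMBOLS):
    """Decode the named data item from a raw memory image."""
    offset, fmt = symbols[resolve(reference, symbols)]
    (value,) = struct.unpack_from(fmt, memory, offset)
    return value

if __name__ == "__main__":
    memory = bytearray(32)                       # stand-in for the data area
    struct.pack_into(">i", memory, 0, 1234)
    struct.pack_into(">h", memory, 4, 480)
    struct.pack_into(">h", memory, 16, 475)
    print(read_item(memory, "EMPLOYEE-ID"))
    print(read_item(memory, "CURRENT-PAY in PAY-RECORD"))
```

The unqualified reference "CURRENT-PAY" raises AmbiguousReference with this table, which mirrors the need to qualify a name that is not unique.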
- We need tools like FORMAT for other languages. With object
oriented languages referencing procedures indirectly through
access to variables, "CALL TO" and "CALL BY" listings are badly
needed. And they need to support multi-module programs.
Yourdon also mentions one other thing that is going to "blow the socks
off" your average programmer: Object Oriented Cobol. The specifications
(in 1992 when his book was released) were just then being formulated. I
try to keep up on the literature, applications and new developments, and
yet I still don't understand Object Orientation that well; I shudder to
think what this will do to someone who hasn't read a textbook in five
years.
Like it or dislike it, COBOL is not going to wither away any time soon;
there's probably some $50 billion in programming assets tied up in it.
Like the difference between C and C++ or Objective C, you may not
recognize it by the time they get finished with it, but I suspect it will
still be around twenty years from now unless every one of those large
companies is able to move to micros and has its COBOL programs
translated into other languages.
Look at what has happened to BASIC: it's now the macro language for MS Word.
Paul Robinson - Paul@TDR.COM
[Yourdon's book is well worth reading. It's not specifically about compilers
but has a lot to say about the way that programmers do and don't use the
tools they have available.-John]