On Legacy Applications and Previous Work

Paul Robinson <PAUL@TDR.COM>
Sun, 6 Mar 1994 14:25:30 GMT

From comp.compilers

Newsgroups:	comp.compilers
From:	Paul Robinson <PAUL@TDR.COM>
Keywords:	tools, comment
Organization:	Tansin A. Darcos & Company, Silver Spring, MD USA
Date:	Sun, 6 Mar 1994 14:25:30 GMT

Most of the time on here I see discussions of new work being done in the
development of translator applications. (I say that rather than
'compilers' because a lot of the discussion involves tools like Yacc that
are used to build compilers).

I notice how much - if not most or almost all - of the material deals with
the implementation of the latest techniques in improving code generation,
improving the way compilers are built and designed, and the latest
techniques for implementing the newest ideas.

I'm going to make some comments that I think need to be thought about in
this forum, since people here are probably directly involved in the
creation and maintenance of compilers. (I am assuming people use the
information here for more than just academic background, and that some
real work goes on.)

I wouldn't have thought much about real issues until my eyes were opened.
I happened to purchase a copy of Edward Yourdon's "Death of the American
Programmer." People may not agree with all of his ideas, but he made a
couple of very important points that I do agree with. You may scoff at
the points I am about to make, but would anyone in 1971 have believed a
prediction that Japanese cars would almost kill the U.S. auto industry
within ten years? Two years later the oil crisis struck.

This article focuses rather heavily on COBOL, but I suspect that in all
but the newest places, more than 1/2 of all pre-existing programs are
COBOL applications more than five years old.

In this article I use the term "application group" to refer to several
programs which are used together as part of a complete solution to a
problem. While there may be independent programs in the group that stand
alone, these programs often pass information to and from each other and
usually share files.

Here are the points to be made:

1. Most programmers - my personal estimate is 90% - will spend 75% of
        their time maintaining "legacy applications", i.e. the old programs
        still in use by companies that are the "crown jewels" of the
        organization.

        Where a company has a financial application system - payroll,
        accounts payable, and accounts receivable, along with perhaps other
        programs - the company may have a large chunk of their
        people doing nothing but maintenance on that application group.

        If they have, say, ten people doing maintenance on that particular
        application group, and have been doing so for perhaps ten years,
        based on salaries and costs, they have more than five million dollars
        tied up in that package. Further, if it has 10 people working on it,
        chances are you would need that many MORE people if you wanted to
        create a new one to do the same thing.

        And if 10 people are working on this one application group, that's
        probably half of the programming staff. (Remember, this one
        application group is the lifeblood of the company; if it were
        unavailable, the company would probably go bankrupt in a matter of
        weeks, if even that long.) Therefore adding 10 more people (assuming
        you could find them) would be a major shock to the programming
        staff, as it would be at any place that suddenly had a 25% increase
        in personnel, perhaps causing people to move elsewhere!

        Take the numbers any way you want, whether it's 10 people or 40 (as
        Yourdon gives in one example), most of the people in the office are
        busy doing "life support", usually centering on the care and feeding
        of that critical application group that keeps the company alive.

        Further, unless you want a shoddy piece of junk that is just as bad as
        the application group that is there, there needs to be quality built
        into it. That means a new set of applications to replace the current
        ones would take a while. If it took, say, two years to create a
        replacement for the current application group, we may be looking at,
        in effect, paying ten people for two years before they produce any
        actual work which is usable by the company. I doubt that very many
        companies can justify a 3/4 million dollar investment to replace
        something that already works and still spend the same amount keeping
        the old system alive until the new one is finished.

        Oh, yes, if they do replace modules as they create them there might
        be some improved functionality, but can anyone see doubling the staff
        working on an application so that eventually, down the line, the new
        staff can be eliminated and the old staff reduced in size because
        less maintenance is required, while people still need things to be
        done in the interim? Or that a manager would
        spend money on more people to rewrite the applications to enable the
        company to reduce the amount of unpaid overtime? (Too many places
        rely on 'Spanish Prisoner' accounting of programmers, i.e. their
        cost figures for programmers count on the regular and chronic unpaid
        overtime they put in.)

        We often refer to these legacy applications as "old COBOL dinosaurs"
        (or old FORTRAN dinosaurs, but most places that have significant
        maintenance issues are in business applications. If places still had
        a lot of RSTS/E systems around, we might talk about old BASIC
        dinosaurs).

        But this sort of mudslinging is a disservice: these applications
        still do useful work, they are critical to the functioning of the
        organization, which probably literally depends for its life on the
        application group, and the organization CANNOT AFFORD to replace
        them. Further, these large application groups - perhaps a dozen main
        programs and hundreds of modules, consisting of perhaps 2 million
        lines of code - are usually so complicated that nobody really knows
        how they work anymore; rewriting them is out of the question when
        you don't even know what they do.

        Therefore, what we need - what is desperately needed in the real
        world - are better tools to perform maintenance on existing
        applications.

2. Do you know how much money a programmer with a college degree and
        C programming experience makes in China? About $180 a month. And
        this is not "slave wages" - I understand it's about the equivalent
        of the pay of a minister in a cabinet post there. As such,
        maintenance programmers in the U.S. today - and possibly in other
        Western nations - are very close to the exact position of U.S. auto
        workers in the 1960s or 1970s, before Japan came along and cleaned
        our clock for us.

        Let's not forget that India is doing a lot of transshipment of
        programmers here, where they can do better quality work than we do
        for less money. I won't argue over whether this is good or bad;
        the fact is, it is happening now. Even if there weren't a single
        imported programmer allowed into the country, software is not a
        product that is capital intensive; the people don't have to be where
        the work is.

        Which leads to the next step: sending the specifications overseas and
        having a shop there write the code, then bringing it back here to have
        local programmers adjust it for local functionality (e.g. to ensure
        the commas are in the right place, etc.). The cost of doing work like
        this is probably an order of magnitude less than having it done
        entirely within the U.S.

        Don't be too smug if you're in Europe. India has had 350 years of
        British rule; a lot of people there speak English well and have had
        extensive education which was developed by England. How can you
        tell whether a programmer who wrote a program did so from a terminal
        in Cambridge or one in Calcutta? Especially if the program is
        published by an EC-based company with a London address.

        But don't think cost is everything: if labor cost was the only
        consideration of the creation of a product, Haiti would have had
        almost no unemployment.

        Do you know why someone working on a shop floor in Detroit makes
        ten times as much money as someone doing the identical work on
        a shop floor in Mexico? Because the American has the education
        and tools to make him ten times as productive.

        Western programmers have not had a "Japanese Import" problem yet
        to shake them up into improving the way they create programs.
        A lot of people doing COBOL coding in shops, who haven't taken a
        programming course in ten years, are marked men, waiting to be
        cut down by the same thing that forced layoffs in Detroit: foreign
        imports of better quality that cost less.

        Did anyone notice that the premier graphics package, Corel Draw!,
        was created by a company in Canada? The game "Tetris" was developed
        by two programmers in Moscow. Minor inroads, but they are the hints
        of future developments.

        If we continue the way things are, eventually, as Japan did to the
        auto industry in the 1970s/1980s, some hungry foreign country is
        going to eat our lunch in software development. If we change now,
        it will be easier. If we wait until we are forced to do so, it
        will mean the same gut-wrenching changes - and programmer
        unemployment - will hit the software industry as struck Detroit.

        We can change this. And those who develop the tools to help today's
        programmers stay competitive can make money hand over fist.

        To ensure that we are worth the high wages we are getting, we have
        to be much more productive. Most people who are programming are
        already overworked as it is. They do not need to work harder; what
        they need is to work smarter.

        - We need better tools, and ways to make maintenance - which is still
            70% or more of most people's work - easier to accomplish.

        - We need to increase the amount of code which is reused. From the
            programming courses on, copying from other people has been frowned
            upon; people who write programs by borrowing other code get the
            Rodney Dangerfield treatment (which is why maintenance programmers
            are held in such low esteem). Yet the best shops are able to
            reuse 80% of their code. Most shops are lucky to do 20% reuse.
            Does anyone seriously think that rewriting the COBOL overhead -
            four divisions and various statements - every single time a program
            is created makes any sense? No, and most people have a skeleton
            frame of the program to start with. Yet that practice in and of
            itself shows that having usable pre-written code to pick from
            makes people more productive.

        - We need to get better CASE tools. Computers can manipulate text
            quite well; it's time to develop better program generators and
            precompilers. Who is more productive? Someone who uses a full
            screen painter like Borland Resource Workshop to create MS-Windows
            panels, or someone who codes them as text statements?

            Why then, can't we go further and allow people to pick functions
            and create a program graphically? Yes, I know there are some
            applications out there that do provide this, but in many cases
            they are only of use to you if you are creating new work, not
            maintaining old work. And they often don't provide enough
            functionality; many times you still have to go in and edit the
            generated source to get the results right; and they usually do not
            support people adding their own statements. If I manually edit an
            .RC file and add a new item, the Borland Resource Workshop will
            create the on-screen window, dialog box and controls to match the
            specifications. Such capability does not often obtain with program
            generators.

        - We need to develop better repositories to store source code, and
            the other things that go along with it, such as documentation.
            And full-screen browsing with good dictionary and definition
            specifications. Nobody is going to want to look for a routine
            in a 3,000 page listing of routines; on-line browsing has got to be
            available.

        - Good cross-reference and indexing tools are needed. When was the
            last time you saw a cross-reference tool for DBASE or Pascal, or
            even saw a PC-based compiler for any language that created
            cross-reference or variable usage tables? Probably the best
            cross-reference tool I've ever seen is the FORMAT program which was
            written by someone in Germany (the code is in Pascal; the comments
            are in German). It's a Pascal cross-reference, pretty printer and
            indexer all in one.

            It does one very nice thing that I've not seen in some mainframe
            COBOL compilers. For each procedure, it lists the procedures that
            it calls by line numbers, and also lists the name and line number
            of every procedure that calls it. Listings like this are CRUCIAL
            for maintenance of a program, where you are somewhere in a program
            and trying to figure out how it got there.
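
To make the "calls" / "called by" listing concrete, here is a minimal
sketch of the idea. It is not the FORMAT program and does not reproduce
its output; it works on Python source rather than Pascal or COBOL, it
only notices direct calls by plain name, and the names cross_reference
and SAMPLE are made up for this illustration.

```python
import ast
from collections import defaultdict


def cross_reference(source):
    """Return (calls, called_by) for the top-level functions in `source`.

    calls[f]     -> sorted list of (callee, line number of the call)
    called_by[f] -> sorted list of (caller, line number of the call)
    Assumption: only direct calls by plain name, e.g. helper(), are
    detected; method calls and indirect calls are ignored.
    """
    tree = ast.parse(source)
    defined = {n.name for n in tree.body if isinstance(n, ast.FunctionDef)}
    calls = defaultdict(list)
    called_by = defaultdict(list)
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id in defined):
                calls[func.name].append((node.func.id, node.lineno))
                called_by[node.func.id].append((func.name, node.lineno))
    for table in (calls, called_by):
        for entries in table.values():
            entries.sort()
    return dict(calls), dict(called_by)


# Hypothetical three-procedure program used only to exercise the sketch.
SAMPLE = '''\
def read_table():
    return []

def sort_table(rows):
    return sorted(rows)

def main():
    rows = read_table()
    return sort_table(rows)
'''

if __name__ == "__main__":
    calls, called_by = cross_reference(SAMPLE)
    for name in sorted(calls):
        print(name, "calls", calls[name])
    for name in sorted(called_by):
        print(name, "is called by", called_by[name])
```

The "called by" table is the one that answers the maintenance question
of how control got to a given procedure; a real tool would also have to
follow calls across modules and through indirect references.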

        Do you (or your company) want to make some money and provide something
        desperately needed for 70% of the work done? Stop thinking about
        writing the next YACC and consider the needs of the poor maintenance
        programmer out there.

        - Create a good "macro" type library for COBOL or FORTRAN or BASIC
        or any of the older languages that people will be continuing
        to do maintenance in twenty years from now. Even if it's only a
        preprocessor, it will help in doing maintenance on existing
        applications, by allowing heavier use of pre-written routines.

        C has a fairly good macro language with parameter substitution.
        COBOL has, at best, some COPY constructs that allow some changing of
        prior values, and usually all people use it for is to copy
        declarations of variables. We need to encourage people to be able
        to copy procedures and even inline code segments.

        - We need good "smart" compilers and linkers as well as support
        for handling the overhead of compilation by the machine. The
        first time I ever saw a "make" routine and how it was smart
        enough to know what had been edited and needed to be recompiled,
        what had been recompiled and needed to be relinked, blew me away.
        I had been doing this sort of thing manually on mainframes.
        Also, Borland's compilers strip unused procedures, and for objects,
        unused methods, automatically.

        - Create better checkout and merge capability for applications
        which are worked on by multiple people. Checkout facilities for
        mainframes and (now appearing) for PCs do help, but they are
        still weak in giving good inclusion capability for procedures
        and code fragments. And they usually don't have facilities for
        tying documentation files to program sources.

        - Integrate these functions so that recompiling a program triggers a
        rerun of the cross-reference, which replaces the old cross reference
        file if it is stored in the archive, and also replaces old binaries.

        - Support archive members at the compiler level; it is easy to
        encourage people to keep hundreds of 500 byte or smaller files
        if they can store them in a .ZIP, .ZOO or .ARC archive for
        subsequent extraction and thus keep all the related files together
        (and save space) but not if these small files are stored as single
        files using 4K or 8K each.

        - Start creating libraries of pre-written, tested procedures and
        code segments that people can use when writing applications or when
        doing maintenance on existing ones. Sure, the usual library
        routines supplied with a compiler provide some of these, but
        people often need source because they want a specific function, but
        they want it to do less than the prewritten one does, or less
        combined with other functionality.

        For example, I might have a small table, say 600 entries of
        15 bytes each. I read this in from either another program or
        supplied by an individual. I get better performance doing searches
        by putting it in order. I do not want to order the person supplying
        the information to alphabetize it (and it might be coming from
        another program which cannot be told to do so), and further I might
        be using only part of the data passed on in a file that is yet passed
        on to yet another program, which expects the file a certain way,
        so I can't change the input to pre-sort it.

        In this case, I do not need (and probably don't want to invoke) a
        full-blown disk-based sort for a table this small, that might only
        be 10K in size; what I need is a reasonably good in-memory sort
        routine. Yet for a table this small, even a dictionary sort is
        sufficient, since even a worst case is only going to be a second or
        so of running time, insignificant if this is, for example, the
        initialization of a program that runs for minutes or hours. A
        reasonable dictionary sort requires two loops and about 5-10 lines
        of code.

        What usually happens in a case like this is that the programmer
        spends time rewriting a dictionary sort because there is nothing
        available to give him this functionality. And if he needs to do some
        work on the table while sorting, even if there is a built-in simple
        sort, he can't use it because he can't change the process. Where
        he might just need a 10-line sort to insert into a 20-line procedure,
        he wastes considerable time writing a simple sort because the
        feature he needs isn't there. If it is there, is there a way he
        can find it? To find it, he needs well-indexed libraries, and he
        needs on-line search capability.

        - Make cross-compiling easier. With the increase in processor
        speed and size of disks and memory, it's only a matter of time
        until 370 emulators start bringing those old COBOL programs onto
        the new "mainframe" which is the size of a tower microcomputer. If
        people can move from MVS COBOL to MS or Ryan-McFarland COBOL, it
        will make their job easier, more so if the program will run in
        exactly the same way as the original program, without the user
        having to rewrite any standard construct.

        Also, there should be support for changing what were VSAM files
        into the PC equivalent, B-Trees, or into data files compatible
        with standard files, such as DBASE, Oracle, SQL, or any other
        environment which is being made available on PCs. Don't forget
        record locking on Novell to replace the standard ENQ/DEQ on
        mainframes.

        Also, if he's got an inquiry program for CICS, he needs to develop
        a way to change his screens that were painted so that he can take
        the original IBM Assembler specification and translate that into
        a panel in MS-Windows or OS/2, or even a DOS text-mode screen.

        - If someone is debugging a program, they should be able to use
        full-screen debugging of the source language, not debugging in
        assembler. The debugger should support the data structures in
        the output file and allow their analysis in the same way as the
        source language does, so that I don't have to say 3e48:1023 and
        see 3000. Instead, I should be able to reference "CURRENT-PAY" and,
        if it's COMP-1, see 480. If that's not unique, of course, then
        I have to say "CURRENT-PAY in PAY-RECORD".

        - We need tools like FORMAT for other languages. With object
        oriented languages referencing procedures indirectly through
        access to variables, "CALL TO" and "CALL BY" listings are badly
        needed. And they need to support multi-module
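
As an illustration of the small in-memory sort discussed in the table
example above, here is a minimal sketch of the kind of pre-written,
reusable fragment a library could offer. It assumes that the two-loop
sort in question is a simple selection sort, and it assumes a made-up
record layout (a 5-byte code followed by a 10-byte name, 15 bytes in
all); the name sort_table and its key parameter are hypothetical.

```python
import random


def sort_table(table, key=lambda entry: entry):
    """Sort `table` in place with a two-loop selection sort.

    O(n^2) comparisons, which is acceptable for a table of a few
    hundred entries. `key` picks the part of each entry to compare,
    so the caller can sort on a field without changing the records.
    """
    n = len(table)
    for i in range(n - 1):
        smallest = i
        for j in range(i + 1, n):
            if key(table[j]) < key(table[smallest]):
                smallest = j
        if smallest != i:
            table[i], table[smallest] = table[smallest], table[i]
    return table


if __name__ == "__main__":
    rng = random.Random(1994)
    # 600 fixed-width records of 15 bytes each (assumed layout:
    # 5-digit code + 10-character name padded with blanks).
    records = [
        f"{rng.randrange(100000):05d}{'ITEM':<10}".encode("ascii")
        for _ in range(600)
    ]
    sort_table(records, key=lambda rec: rec[:5])
    print(records[0], records[-1])
```

Because the source is available, a maintenance programmer could paste
the two loops into an existing procedure and add whatever extra work is
needed on each entry during the pass, which a sealed library routine
would not allow.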

Yourdon also mentions one other thing that is going to "blow the socks
off" your average programmer: Object Oriented Cobol. The specifications
(in 1992 when his book was released) were just then being formulated. I
try to keep up on the literature, applications and new developments, and
yet I still don't understand Object Orientation that well; I shudder to
think what this will do to someone who hasn't read a textbook in five
years.

Like it or dislike it, COBOL is not going to wither away any time soon;
there's probably some $50 billion in programming assets tied up in it.
Like the difference between C and C++ or Objective C, you may not
recognize it by the time they get finished with it, but I suspect it will
still be around twenty years from now unless every one of those large
companies is able to move to micros and have its COBOL programs
translated into other languages.

Look at what has happened to BASIC: it's the macro language for MS Word.

---
Paul Robinson - Paul@TDR.COM
[Yourdon's book is well worth reading. It's not specifically about compilers
but has a lot to say about the way that programmers do and don't use the
tools they have available.-John]
--
