Re: Testing approaches?

Nick Roberts <nick.roberts@acm.org>
8 Nov 2003 01:45:03 -0500

          From comp.compilers

Related articles
Testing approaches? jeyrich@gmx.net (2003-10-31)
Re: Testing approaches? nick.roberts@acm.org (Nick Roberts) (2003-11-08)
Re: Testing approaches? dietz@dls.net (Paul F. Dietz) (2003-11-11)
From: Nick Roberts <nick.roberts@acm.org>
Newsgroups: comp.compilers
Date: 8 Nov 2003 01:45:03 -0500
Organization: Compilers Central
References: 03-10-140
Keywords: testing
Posted-Date: 08 Nov 2003 01:45:03 EST

Joern Eyrich wrote:


> In the preparation for an upcoming project, a document written by
> someone who is no longer available mentioned that one of the modules
> essentially behaves like a compiler and should be tested like one.
>
> I can imagine that there might be techniques that have proven
> themselves suitable to this particular kind of software.
>
> I tried to google for some information on the topic of testing
> approaches for compilers, but wasn't able to find anything specific.
>
> Do you know of/can you point me to any documentation about this?


Though probably only at the fringe of your interest, I can point you
to a web page which links to evaluation tools for the Ada language:


        http://www.adaic.org/compilers/eval.html


I'd suggest that all compiler testing broadly goes along the following lines.


There are three basic categories of test: static error; dynamic error;
semantics.


The static error category is for checks that the compiler detects the
error (or warning) conditions that it is supposed to, at compile
time. Typically you have a test 'harness', which is often a script (or
script generator). You have a whole set (often hundreds or
thousands) of small source code files (snippets), each of which has a
different kind of deliberate static error that the compiler is
supposed to detect. The test harness must submit each source file to
the compiler and check that the compiler's response (output messages,
output listing, or output log file) is correct. It may be necessary to
have subjective (human) evaluation of the (accuracy and usefulness of
the) diagnostics produced.
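

To give a concrete (if rough) idea, a minimal harness of this kind
might look something like the Python sketch below. The compiler
command ('mycc'), the directory name, the '.c' extension, and the
convention of naming the expected diagnostic in an 'EXPECT:' comment
on the first line of each snippet are all just assumptions for the
sake of illustration:

    import subprocess, sys
    from pathlib import Path

    COMPILER = ["mycc", "-c"]         # hypothetical compiler command
    SNIPPETS = Path("static_errors")  # one bad file per test

    failures = 0
    for src in sorted(SNIPPETS.glob("*.c")):
        # Assumed convention: the first line of each snippet is a
        # comment naming a fragment of the diagnostic we expect,
        # e.g. "// EXPECT: undeclared identifier".
        first = src.read_text().splitlines()[0]
        expected = first.split("EXPECT:", 1)[1].strip()
        result = subprocess.run(COMPILER + [str(src)],
                                capture_output=True, text=True)
        diagnostics = result.stdout + result.stderr
        # The compilation must be rejected, and the diagnostics must
        # mention what the snippet says they should.
        if result.returncode == 0 or expected not in diagnostics:
            failures += 1
            print(f"FAIL {src.name}: wanted {expected!r}")

    print(f"{failures} failure(s)")
    sys.exit(1 if failures else 0)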


The dynamic error category is for checks that the emitted code from a
compiler detects the run time errors it is supposed to. Typically you
have a large set of correct (compilable) source code programs that are
designed to deliberately do something, when run, which is supposed to
be detected as an error. Again you may have some kind of test harness,
which must compile each program, run it, and then check that the
program's behaviour is correct (it outputs the right error
message). It may be necessary for the harness to have the ability to
catch 'crashes' and recover from them (to continue testing).
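

Only as a hedged sketch again (same invented conventions as before: a
hypothetical 'mycc' command, a made-up directory layout, and the
expected run-time message named in an 'EXPECT:' comment), such a
harness might run roughly as follows; the timeout and return-code
handling are what let it survive hangs and crashes and carry on
testing:

    import subprocess, sys
    from pathlib import Path

    COMPILER = ["mycc"]             # hypothetical compiler command
    TESTS = Path("dynamic_errors")  # valid programs that must fail at run time

    failures = 0
    for src in sorted(TESTS.glob("*.c")):
        exe = src.with_suffix("")
        build = subprocess.run(COMPILER + [str(src), "-o", str(exe)],
                               capture_output=True, text=True)
        if build.returncode != 0:
            failures += 1
            print(f"FAIL {src.name}: did not compile")
            continue
        # Assumed convention: the first line of each test names the
        # run-time error message we expect, e.g. "// EXPECT: overflow".
        first = src.read_text().splitlines()[0]
        expected = first.split("EXPECT:", 1)[1].strip()
        try:
            # The timeout lets the harness recover from hangs; a crash
            # (signal) shows up as a negative return code on POSIX.
            run = subprocess.run([str(exe)], capture_output=True,
                                 text=True, timeout=10)
            output = run.stdout + run.stderr
            if run.returncode == 0 or expected not in output:
                failures += 1
                print(f"FAIL {src.name}: wanted {expected!r}")
        except subprocess.TimeoutExpired:
            failures += 1
            print(f"FAIL {src.name}: hung")

    print(f"{failures} failure(s)")
    sys.exit(1 if failures else 0)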


The semantics category is the most difficult, and is for checks that
the code emitted from a compiler does what the specification (language
standard) says it should, in the absence of actual errors. Again,
typically there will be a very large set of programs, although it is
possible to have a small set of programs (or just one) that combine(s)
many or all of the tests. It is possible for (each of) the programs to
check itself; there may be no need for a separate test harness. The
nature of each test will differ according to what is being
tested. Some may check that a certain value is produced by a
predefined or library function. Others may do complicated mathematical
manipulations to test the accuracy of mathematical operations. Some
may test that certain operations are performed sufficiently quickly.
And so on.
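

To give a flavour of the self-checking style, here is a small Python
sketch of the kinds of checks such a program makes. A real suite
would of course write these in the language accepted by the compiler
under test, and the particular library calls, tolerance, and timing
bound here are purely illustrative:

    import math, sys, time

    failed = []

    # Check that a predefined/library function gives the required value.
    if math.floor(2.75) != 2:
        failed.append("floor")

    # Check the accuracy of an arithmetic operation within a tolerance.
    if abs(math.sqrt(2.0) ** 2 - 2.0) > 1e-12:
        failed.append("sqrt accuracy")

    # Check that an operation completes sufficiently quickly
    # (the one-second bound is purely illustrative).
    start = time.perf_counter()
    sum(range(1_000_000))
    if time.perf_counter() - start > 1.0:
        failed.append("summation too slow")

    # Self-checking: the exit status tells the caller whether the
    # test passed, so no separate harness is strictly needed.
    if failed:
        print("FAILED:", ", ".join(failed))
        sys.exit(1)
    print("PASSED")
    sys.exit(0)

Because each program reports its verdict through its output and exit
status, anything from a shell loop to a full harness can run it
without needing to know what it tests.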


In addition, it is likely that certain other things will have to be
tested, probably by hand. This might include testing that all options,
flags, arguments, etc., to the compiler do what they are supposed to
(and are correctly documented, well designed, sufficient to
requirements, etc.). It might even be appropriate to test that the
compiler handles internal errors (unimplemented features, etc.)
gracefully.


A full evaluation ought to include the compiler's documentation, which
should be comprehensive and correct. Ideally, it should also be well
written, well organised (indexed etc.), and well presented.


Finally, compilers often have different modes, and it may be necessary
to have different or modified tests to properly test all the various
different modes.


You may have already got the impression that testing compilers is a
big job, and it is. But it is also typically very important, since a
compiler producing bad code is one of life's more insidious threats to
the world. In the worst cases, we can imagine nuclear power stations
melting down, planes falling out of the sky, and so on. Even if these
are unlikely, there are few things more annoying to a user than
finding that their compiler is misbehaving.


--
Nick Roberts

