Testing approaches? firstname.lastname@example.org (2003-10-31)
Re: Testing approaches? email@example.com (Nick Roberts) (2003-11-08)
Re: Testing approaches? firstname.lastname@example.org (Paul F. Dietz) (2003-11-11)
From: "Paul F. Dietz" <email@example.com>
Date: 11 Nov 2003 14:35:28 -0500
Posted-Date: 11 Nov 2003 14:35:27 EST
Nick Roberts wrote:
> The semantics category is the most difficult, and is for checks that
> the code emitted from a compiler does what the specification (language
> standard) says it should, in the absence of actual errors. Again,
> typically there will be a very large set of programs, although it is
> possible to have a small set of programs (or just one) that combine(s)
> many or all of the tests. It is possible for (each of) the programs to
> check itself; there may be no need for a separate test harness. The
> nature of each test will differ according to what is being
> tested. Some may check that a certain value is produced by a
> predefined or library function. Others may do complicated mathematical
> manipulations to test the accuracy of mathematical operations. Some
> may test that certain operations are performed sufficiently quickly.
> And so on.
A complementary approach is to randomly generate programs and test the
compiler on them. The two important problems are: how do you generate
random programs that will exercise the compiler sufficiently, and how
do you check that the compiled code does the right thing? The former is a
problem of random constrained AST generation, and the latter can
exploit the semantic constraints inherent in the compilation process
(for example, that a correct, conforming program has the same meaning
under different optimization settings, or with different correct
compilers, or under an interpreter.) Of course, the compiler crashing
or suffering an internal assertion failure is never correct.
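The differential idea above can be sketched in a few lines. This is a toy illustration, not a real compiler test rig: it randomly generates arithmetic ASTs for a tiny expression language, then checks that a direct tree-walking interpreter (the reference semantics) agrees with CPython's own compile/eval pipeline on every generated program.

```python
import ast
import random

def random_expr(depth=4):
    """Generate a random arithmetic AST node (integer constants, +, -, *)."""
    if depth == 0 or random.random() < 0.3:
        return ast.Constant(value=random.randint(-10, 10))
    op = random.choice([ast.Add(), ast.Sub(), ast.Mult()])
    return ast.BinOp(left=random_expr(depth - 1), op=op,
                     right=random_expr(depth - 1))

def interpret(node):
    """Reference semantics: evaluate the AST by direct tree walking."""
    if isinstance(node, ast.Constant):
        return node.value
    left, right = interpret(node.left), interpret(node.right)
    if isinstance(node.op, ast.Add):
        return left + right
    if isinstance(node.op, ast.Sub):
        return left - right
    return left * right

def differential_test(trials=1000):
    """Compare the interpreter against compiled evaluation on random ASTs."""
    for _ in range(trials):
        tree = ast.Expression(body=random_expr())
        ast.fix_missing_locations(tree)
        compiled_result = eval(compile(tree, "<random>", "eval"))
        # Any disagreement is a bug in one of the two evaluators;
        # dump the offending AST so the failure is reproducible.
        assert compiled_result == interpret(tree.body), ast.dump(tree)

differential_test()
```

In a real setting the two "evaluators" would be, say, the same compiler at -O0 and -O2, or two independent compilers, and the generator would have to be constrained to emit only well-defined, conforming programs.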
Random testing has had a bad name in some testing literature, but
being able to generate and run millions of tests automatically is very
attractive (even if those tests are, individually, not as good as
well-planned manually written tests) -- and it becomes even more
attractive as computer power becomes ever cheaper. Random testing
will tend to find a different set of bugs than spec-driven conformance
tests (for example, optimization or feature interaction bugs
vs. improper implementation of a specific language feature), so both
approaches should be used for higher reliability.
Unit testing is also important (IIRC, statistics on escaped defects
show that inadequate unit testing is the leading reason those defects
were not caught earlier).
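As a sketch of what unit testing one compiler pass in isolation looks like, here is a hypothetical constant-folding pass (the names `fold` and the tuple AST are made up for the example, not any real compiler's API) exercised directly, without running the whole compiler:

```python
import unittest

def fold(node):
    """Constant-fold a nested ('+' | '*', left, right) expression tree.
    Integers are constants; any other leaf (e.g. a variable name) is opaque."""
    if isinstance(node, int):
        return node
    if not isinstance(node, tuple):
        return node
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return left + right if op == '+' else left * right
    return (op, left, right)

class FoldTests(unittest.TestCase):
    def test_fully_constant(self):
        # 2 + (3 * 4) folds all the way down to 14.
        self.assertEqual(fold(('+', 2, ('*', 3, 4))), 14)

    def test_leaves_variables_alone(self):
        # A variable leaf blocks folding of the enclosing node.
        self.assertEqual(fold(('+', 'x', 1)), ('+', 'x', 1))

if __name__ == "__main__":
    unittest.main(argv=["fold_tests"], exit=False)
```

Because the pass is called directly on small hand-built trees, odd corner cases are far cheaper to reach here than through whole-program tests.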
Whatever testing approach you take, you should monitor test coverage
(the fraction of branches or statements in the program that the tests
actually execute) to assess the quality of the testing strategy and to
suggest where additional testing effort is needed. It may be easier
to write unit tests to cover odd paths than to write integration tests
to do the same.
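A minimal sketch of how statement coverage can be measured, assuming plain CPython: `sys.settrace` records which lines of a function run under a given test input, so the lines that never appear point at untested paths.

```python
import sys

def measure_coverage(func, *args):
    """Run func(*args) and return the set of its line numbers executed."""
    executed = set()

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is func.__code__:
            executed.add(frame.f_lineno)
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return executed

def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"

# One test input exercises only one branch; adding an input for the
# other branch strictly grows the set of executed lines.
only_positive = measure_coverage(classify, 5)
both = only_positive | measure_coverage(classify, -1)
```

Real tools (gcov, `coverage.py`, and the like) do the same bookkeeping at branch granularity and across whole programs, but the principle is just this.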
Eventually, any testing strategy suffers from Beizer's 'Pesticide
Paradox' (the bugs become resistant to it, becoming located in the
parts of 'bug space' the testing method doesn't reach), so it makes
sense to use many different approaches at once.