m5: macro processor

Tue, 20 Oct 1992 15:38:53 GMT

          From comp.compilers

Related articles
m5: macro processor Dain.Samples@UC.Edu (1992-10-20)
Macro processors gjditchf@plg.uwaterloo.ca (1992-10-21)

Newsgroups: comp.compilers
From: Dain.Samples@UC.Edu
Organization: Compilers Central
Date: Tue, 20 Oct 1992 15:38:53 GMT
Keywords: macros

(I think this will be of interest to readers of comp.compilers since so
many compilers (viz, C and C++) have macro processors as an implicit part
of the language system.)

I have not found cpp or m4 to be adequate for some of the pre-processing I
needed to do for a system I wrote as part of my dissertation work. So I
created m5: I started with the basic macro processing paradigm that cpp
and m4 use, but added the controls and parameter passing mechanisms that I
needed.

I am pleased with the result, am still using the processor, and thought
that others might like to take a look and play around with it.

The m5 source is configurable for both Unix and DOS environments.

The PostScript of the User's manual and the source code can be fetched by
anonymous ftp from thor.ece.uc.edu:pub/dain/m5. The abstract,
introduction, and a comparison with m4 are included below to whet or kill
your appetite, as the case may be. Please forgive the occasional
LaTeXisms in what follows.

Also, I'm unaware of any name conflicts with m5, but I wouldn't be at all
surprised if there was at least one other m5 out there.

User's Guide to the M5 Macro Language: $2^{nd}$ edition
A. Dain Samples

The author can be reached at the
University of Cincinnati, Dept. of ECE, ML\#30, Cincinnati, Ohio
45221-0030; or at Dain.Samples@uc.edu.


M5 is a powerful, easy to use, general purpose macro language. M5's
syntax allows concise, formatted, and easy to read specifications of
macros while still giving the user control over the appearance of the
resulting text. M5 macros can have named parameters, can have an
unbounded number of parameters, and can manipulate parameters as a single
unit. M5 provides separate macro name spaces called `pools' that simplify
the creation and management of context-dependent macros and dynamic macros
(macros defined by macros based on the input). Several examples
demonstrate m5's power and flexibility, including a random string
generator that uses BNF grammars to specify the form of the strings; an
implementation of a Turing machine; and a demonstration of how m5 can
process \LaTeX\ input and programming language source in the same file to
simplify code maintenance and documentation.



Macro processors are an extremely useful means of transforming text, and
are a standard part of many language systems. For example, the C language
system has a macro preprocessor as a {\em de facto\/} part of the language
standard. However, the general UNIX\footnote{Unix is a trademark of
AT\&T.} user does not have access to a powerful general purpose macro
processor. The macro processors distributed with UNIX ({\em cpp} and {\em
m4\/}) do not qualify on either point.

The phrase `powerful, general purpose macro processor' should mean that
the processor implements a robust set of functions\footnote{ At a minimum,
it should be Turing equivalent; see appendix~\ref{turing} for a proof that
m5 is Turing equivalent. I assert that {\it cpp\/} is not Turing
equivalent, and have not bothered to check {\it m4\/}, though I suspect it
is.}. The macro language should implement scopes, much as modern
programming languages allow name overloading through scopes. Debugging
should be easy. The macro language should also give the programmer great
flexibility in dealing with macros with variable numbers of parameters.
Recursive macros should be easy to write. Perhaps most importantly, the
user should find it easy to read macro definitions the day after they are
written.

M5 implements a large set of pre-defined macros that simplify the
manipulation of text. It has a powerful scoping mechanism that allows the
user to define `pools' of macros (think of them as a file system
subdirectory) and a macro name search stack of such pools. For instance,
in a programming language source file, a user defined macro might expand
one way if it occurs in a global environment, and another in local
environments. M5 can be used to track nesting of function declarations,
begin/end blocks, or nesting of C's curly braces. Pools of macros can be
defined for the various contexts, and the appropriate pool used to control
macro expansion in the desired context. Pools are an associative data
structure mapping names to definitions and so are useful for collecting
and manipulating many different kinds of data. The example in
appendix~\ref{implementation} demonstrates the usefulness of pools: in the
implementation of the m5 processor, the names of parameters to macros can
be referenced both in \LaTeX\ documentation and in C code using the same
name, but each expands into something useful for that particular context;
the \LaTeX\ version of the macros sets the parameter names in a particular
font, while the C version expands to the code to access the runtime data
structure containing the value of the indicated parameter.
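
To make the pool idea concrete, here is a minimal sketch in Python (not m5
syntax, and not how m5 itself is implemented) of named pools plus a search
stack. The class name, the pool names `latex' and `c', and the macro name
and bodies are all hypothetical, chosen only to mirror the
documentation-versus-code example above.

```python
class PoolStack:
    """Hypothetical model of named macro pools plus a search stack."""

    def __init__(self):
        self.pools = {}   # pool name -> {macro name -> definition}
        self.stack = []   # pool names; the last entry is searched first

    def define(self, pool, name, body):
        # Create the pool on first use, then record the definition.
        self.pools.setdefault(pool, {})[name] = body

    def push(self, pool):
        self.pools.setdefault(pool, {})
        self.stack.append(pool)

    def pop(self):
        return self.stack.pop()

    def lookup(self, name):
        # Search from the most recently pushed pool outward.
        for pool in reversed(self.stack):
            if name in self.pools[pool]:
                return self.pools[pool][name]
        return None


pools = PoolStack()
pools.define("latex", "argc", r"{\tt argc}")   # typeset the name
pools.define("c", "argc", "frame->argc")       # access a runtime structure

pools.push("latex")
in_latex = pools.lookup("argc")
pools.push("c")
in_c = pools.lookup("argc")
pools.pop()
back_in_latex = pools.lookup("argc")
```

The same name resolves differently depending on which pool is on top of
the stack, and popping a pool restores the earlier meaning.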

M5 debugging allows the user to easily track each step of a macro's
expansion and to control how much debug output is produced.

M5 macros can handle large numbers of named or anonymous parameters,
allowing recursive macros to be defined easily and naturally. M5 has many
control constructs to simplify the user's task, including {\tt Case}, {\tt
Match}, and several flavors of conditional tests.

M5 can be configured to scan various kinds of source files; it understands
C, C++, and Ada commenting conventions. The user can disable macro
definitions in contexts where their expansion would be a nuisance.

Finally, m5 allows the user precise control over the text that appears as
the result of macro expansion without making the definition of the macro
difficult to read. Macro definitions can be textually formatted to
reflect logical structure without sacrificing the readability of the
expanded text.

The latest version of m5 is available via anonymous ftp from the
University of Cincinnati at {\tt thor.ece.uc.edu:pub/dain/m5}. A
PostScript version of this document is available via anonymous ftp from
the Sequoia project at the University of California at Berkeley, as well
as via anonymous ftp from babbage.ece.uc.edu.

The following sections describe m5 in more detail. There has been a
strenuous effort to keep this documentation current with the
implementation. The source to m5 is written in C and m5; these m5 macros
automatically extract the cross reference manual in section~\ref{REFMAN}.
Please report any and all errors you may discover in the manual or m5 to
the author.

\section{Summary of M5 Features}

This section summarizes the attractive features of m5. For contrast, it
points out the major differences between m5 and m4. While the basic
paradigm is the same (macro invocation begins with the recognition of a
macro word optionally followed by a parenthesized list of comma separated
parameters), there are many differences\footnote{Most of these comments
are with respect to m4 as it existed about 1986-1988.}.
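
The shared paradigm can be sketched in a few lines of Python. This is an
illustration only, under simplifying assumptions: it is neither m4 nor m5,
it does not rescan expanded text, and it ignores quoting and comments. The
function names and the sample macros `Swap' and `Version' are hypothetical.

```python
import re

# A macro word: letters, underbar, or colon, then more of the same or digits.
WORD = re.compile(r"[A-Za-z_:][A-Za-z_:0-9]*")

def split_args(text, start):
    """Parse '(a, b, ...)' where text[start] == '('.
    Returns (args, index just past the closing parenthesis)."""
    args, depth, current = [], 0, []
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "(":
            depth += 1
            if depth > 1:            # keep nested parentheses in the argument
                current.append(ch)
        elif ch == ")":
            depth -= 1
            if depth == 0:
                args.append("".join(current).strip())
                return args, i + 1
            current.append(ch)
        elif ch == "," and depth == 1:   # only top-level commas separate args
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    raise ValueError("unbalanced parentheses")

def expand(text, macros):
    """Replace each known macro word (and optional argument list)."""
    out, i = [], 0
    while i < len(text):
        m = WORD.match(text, i)
        if not m:
            out.append(text[i])
            i += 1
            continue
        word, i = m.group(), m.end()
        if word not in macros:
            out.append(word)
            continue
        args = []
        if i < len(text) and text[i] == "(":
            args, i = split_args(text, i)
        out.append(macros[word](*args))
    return "".join(out)

macros = {"Swap": lambda a, b: f"{b} {a}", "Version": lambda: "2"}
result = expand("Swap(x, y) v Version other", macros)
nested = expand("Swap(f(a, b), c)", macros)
```

Here `result` is "y x v 2 other" and `nested` is "c f(a, b)": the
parenthesized list is optional, and commas inside nested parentheses do not
split parameters.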

{\bf Keywords, Predefined Macros:}
All m5 predefined macros begin with a capital letter ({\tt Eval},
{\tt Include}, {\tt Ifdef}, etc.). This is necessary if you're using
m5 in conjunction with other preprocessors (e.g., cpp)
which also use words such as define, include, and ifdef. However,
user defined macros may begin with any case of alphabetic character
(including underbar and colon)
followed by more of the same or digits. A small number of
non-alphanumeric characters can be defined as single-character macros.
See section~\ref{names}.

{\bf Debugging:}
M5 allows you to specify several different levels of debug output, from
printing the names of recognized macros, to showing the resulting text of
each macro expansion. When macros are not behaving as expected, inserting
a simple call to the Debug macro will allow you to quickly
determine where the problem lies. See Section~\ref{DEBUG}.

{\bf Comments, Cpp, C, and C++:}
In addition to m4's `\#' comment convention, m5 supports several other
ways for commenting m5 code. These additional comments provide control
over white space in output, allow the comment character to be tailored to
new environments (Section~\ref{CONFIG}), and permit scanning text that has
its own special comment conventions. M5 works very well with {\em cpp\/},
C, Ada, and C++. See Section~\ref{COMMENTS}.

{\bf Scoping:}
Macros are defined in {\em pools\/} and macro name resolution is
determined by the state of the {\em pool stack\/} (see
Section~\ref{POOLS}). This permits macro definitions to be visible only
when the user wants them to be visible. It is easy, for instance, to
define macros that are visible only when scanning files with a certain
suffix, or only after certain keywords have been identified in the text.
This is totally under the macro writer's control; Section~\ref{TIPS}
details an example.

{\bf Environment:}
M5 reads all variables in the environment and defines them in a
special `Environment' pool. For example, a macro could access the
environment variable HOME to define file names; the reference
{\tt HOME\twdl Environment} specifies that the macro is to
be found in the macro pool {\tt Environment}.
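
As a rough analogy in Python (again, not m5 syntax), loading the
environment into a pool amounts to copying every variable into a dictionary
stored under the pool's name. The function name, the sample HOME value, and
the file name below are hypothetical.

```python
import os

def load_environment_pool(pools, environ=None):
    """Copy every environment variable into a pool named 'Environment'."""
    environ = os.environ if environ is None else environ
    pools["Environment"] = dict(environ)
    return pools

# A fixed mapping stands in for the real environment so the example is
# reproducible; pass nothing to read os.environ instead.
pools = load_environment_pool({}, {"HOME": "/home/user"})
log_file = pools["Environment"]["HOME"] + "/m5.log"
```

With the sample value above, `log_file` is "/home/user/m5.log".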

{\bf Initialization:}
M5 reads .m5rc in the current directory as the first input file.

{\bf Extended parameter specifications:}
M5 allows an unlimited number of parameters, and has concise notations
for manipulating them. A macro's formal parameters can be named, making it
much easier to write, read, and debug macro definitions. See

{\bf Arithmetic expression evaluation:}
M5 adopted m4's method of arithmetic expression evaluation, but
increased the number of available numeric functions, and added number
formatting capabilities. See Section~\ref{EXPRS}.

{\bf Files:}
Files in m5 can be named, opened, and written to directly. See

A. Dain Samples, Dain.Samples@uc.edu, wk:(513)556-4783, hm:(513)771-5492
Dept of ECE, ML#30 University of Cincinnati, Cin'ti OH 45221-0030
