Re: Need an interesting topic for an undergraduate project on Compilers

BGB <cr88192@hotmail.com>
Sat, 06 Aug 2011 14:10:12 -0700

          From comp.compilers


From: BGB <cr88192@hotmail.com>
Newsgroups: comp.compilers
Date: Sat, 06 Aug 2011 14:10:12 -0700
Organization: albasani.net
References: 11-08-006
Keywords: courses
Posted-Date: 06 Aug 2011 20:59:50 EDT

On 8/6/2011 10:28 AM, amit karmakar wrote:
> I would like to have some suggestions as to what *new* and
> *innovative* project i can do which are based on compiler design.
> Also, considering the time i have to implement the compiler, i can
> think of cutting down work, like working on subset of a language. I
> would preferably not tend to work on only a specific part(phase) of
> compiler. It will be better if I implement a complete compiler for
> some architecture and see the executable running.


new+innovative and compilers don't often go together. another problem
is that terms like new/innovative/interesting/... depend highly on who
one is dealing with and their personal biases and preferences (a cool
idea for one person may be considered stale, boring, unworkable, ... by
another).




a few thoughts:


most traditional research into compilers has been in how to squeeze as
much performance as possible out of them. maybe one can look into trying
for new and interesting features instead.




rather than work on a subset of a language, it may make more sense to
work with a simpler language design.


for example, a fairly simple language is Scheme (except for a few edge
cases), where a person can often throw together a working implementation
fairly quickly (or at least that is IME with R5RS and earlier; dunno
about R6RS, as I was mostly no longer dealing with Scheme by that point,
and R6RS at the time looked a bit strange vs what came before).


a slightly less simple, but still relatively simple, language is
ECMAScript (the core language underlying JavaScript, ActionScript, ...).


probably not worth trying to implement up-front are languages like:
C or C++ (fairly complex languages to implement);
Java (a lot more hairy than it looks, syntax can be deceiving);
...




note that dynamic typing generally makes things much easier to implement
(static typing makes things faster, and is "closer to the metal", ...
but it doesn't make things easier).


a more recent language of mine is using a "soft typing" model, which
basically combines elements of static typing on top of an otherwise
dynamically-typed VM (potentially using types as optimization hints in
the codegen, but treating type-checking, behavioral semantics, and
optimization, as separate issues).
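
as a rough illustration of the "types as hints" idea (a minimal Python
sketch, not my actual VM; the names generic_add, int_add, select_add and
check_value are all made up for the example): the declared types only
pick which implementation the codegen uses, and checking is a separate
step.

```python
# hypothetical soft-typing sketch: annotations are optional hints and
# never change the result of a well-typed operation.

def generic_add(a, b):
    """Dynamic path: inspect the runtime values before adding."""
    if isinstance(a, str) and isinstance(b, str):
        return a + b
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a + b
    raise TypeError(f"cannot add {type(a).__name__} and {type(b).__name__}")

def int_add(a, b):
    """Fast path: both operands were declared int, so skip the checks."""
    return a + b

def select_add(left_hint, right_hint):
    """Codegen-time choice; a hint of None means 'no declared type'."""
    if left_hint is int and right_hint is int:
        return int_add
    return generic_add

def check_value(declared, value):
    """Type checking, kept separate from both semantics and optimization."""
    return declared is None or isinstance(value, declared)

add_typed = select_add(int, int)     # both operands declared int
add_dynamic = select_add(None, int)  # one operand left undeclared
```

both add_typed(2, 3) and add_dynamic(2, 3) give the same answer; only
the amount of runtime checking differs.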




personally, I like RPN / Stack-Machine style ILs (though I recently got
into a big argument over this with a person who, for whatever reason,
really dislikes stack-machine ILs despite them being well proven in the
JVM, .NET, AVM2, ...).


examples of stack-machine languages would include Forth, PostScript,
Factor, ... (PostScript has had a notable influence on the design of my
ILs).


the upside of stack machines is that they are fairly easy to produce
code for (it is often very straightforward to unwind an AST into a stack
machine format), are themselves relatively simple, and are very capable
despite their relative simplicity.
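
to show how straightforward the unwinding is, here is a minimal Python
sketch. the AST shape (nested tuples) and the instruction names (push,
load, add, sub, mul, div) are assumptions made up for the example, not
any real IL: a post-order walk emits the operands first and then the
operator, and a tiny interpreter runs the result.

```python
# hypothetical AST: a number, a variable name (str), or (op, left, right).

def compile_expr(node, out):
    """Post-order walk: emit both operands, then the operator."""
    if isinstance(node, (int, float)):
        out.append(("push", node))
    elif isinstance(node, str):
        out.append(("load", node))
    else:
        op, left, right = node
        compile_expr(left, out)
        compile_expr(right, out)
        out.append((op, None))
    return out

OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b,
}

def run(code, env):
    """Execute the stack code against a dict of variable values."""
    stack = []
    for op, arg in code:
        if op == "push":
            stack.append(arg)
        elif op == "load":
            stack.append(env[arg])
        else:
            b = stack.pop()  # right operand is on top of the stack
            a = stack.pop()
            stack.append(OPS[op](a, b))
    return stack.pop()

# (x + 2) * (y - 1)
ast = ("mul", ("add", "x", 2), ("sub", "y", 1))
code = compile_expr(ast, [])
result = run(code, {"x": 3, "y": 5})
```

the emitted code here is: load x, push 2, add, load y, push 1, sub, mul.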


a downside though is that they are relatively fussy about ordering
issues, and a general-purpose native codegen can get a bit hairy (mostly
due to ABI interfacing: for example, the SysV/AMD64 ABI is itself a
complex beast, and one has to effectively "pull a rabbit out of a hat"
to mesh it up directly with a stack machine IL). they are also far less
"du jour" with many people than other options, such as TAC-SSA
(Three Address Code - Static Single Assignment).
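
for comparison, straight-line stack code can be turned into three-address
form by "running" it with names instead of values. this is only a Python
sketch under assumptions: the (opcode, argument) instruction format and
the t0, t1, ... temporaries are made up for the example, and there is no
control flow, so no phi nodes are needed (each temporary is trivially
assigned exactly once).

```python
def stack_to_tac(code):
    """Symbolically execute stack code, naming each result with a fresh temp."""
    stack = []    # holds operand names/constants, not runtime values
    tac = []
    counter = 0
    for op, arg in code:
        if op == "push":
            stack.append(str(arg))
        elif op == "load":
            stack.append(arg)
        else:
            right = stack.pop()  # right operand is on top of the stack
            left = stack.pop()
            temp = f"t{counter}"
            counter += 1
            tac.append(f"{temp} = {op} {left}, {right}")
            stack.append(temp)
    return tac, stack.pop()

# stack code for (x + 2) * (y - 1)
code = [
    ("load", "x"), ("push", 2), ("add", None),
    ("load", "y"), ("push", 1), ("sub", None),
    ("mul", None),
]
tac, result_name = stack_to_tac(code)
```

the ordering fussiness shows up here as well: popping the two operands
in the wrong order silently swaps left and right.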


granted, things should be much simpler if one doesn't want to go about
trying to directly call into native (statically-compiled) code, but
instead uses special functions to marshal the calls (I have later found
that this strategy can be fairly transparent as well).
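
a minimal Python sketch of the "special function" approach, under
assumptions: VM values are modeled as hypothetical (tag, payload) pairs,
and box, unbox and call_native are names invented for the example. the
VM never calls the host function directly; one helper converts the
arguments, makes the call, and converts the result back.

```python
def unbox(value):
    """Strip the VM tag and hand the raw payload to native code."""
    tag, payload = value
    return payload

def box(payload):
    """Wrap a native result in a VM (tag, payload) pair."""
    if isinstance(payload, bool):  # test bool first: bool is a subclass of int
        return ("bool", payload)
    if isinstance(payload, int):
        return ("int", payload)
    if isinstance(payload, float):
        return ("float", payload)
    if isinstance(payload, str):
        return ("str", payload)
    raise TypeError(f"no VM representation for {type(payload).__name__}")

def call_native(fn, args):
    """Marshal VM arguments in, call the host function, marshal the result out."""
    return box(fn(*[unbox(a) for a in args]))

length = call_native(len, [("str", "hello")])
biggest = call_native(max, [("int", 3), ("int", 7)])
```

from the VM's point of view the native function then looks like any
other callable that takes and returns tagged values.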


also possibly useful is supporting eval/...


also, in my case, I am working to make the C interface fairly
transparent (marshaling calls and data-types and similar in both
directions, ideally eliminating nearly all cases of manually-written
boilerplate code).


ideally, the time of isolated languages and frameworks, and of languages
which don't have features like eval, will soon be nearing an end (this
doesn't mean I want many of the existing languages to go away, but
ideally most should have eval as a relatively common library feature, ...).




for example, my language has:
"native import C.foo;"


which allows importing libraries from C land (foo is a library name,
and a tool is used to mine information from C headers/...).


"native package C.foo { ...body... }"
allows exporting the code ("...body...") to C land (in this case, the
boilerplate is written automatically by a tool).




granted, yes, none of this is really terribly new or original, as most
of this has been around for decades.




as for languages containing some interesting ideas:
Scheme (nice core language design);
Self (nice object system, partly carried over in a limited form into
JavaScript);
PostScript (relatively clean stack-machine model);
ECMAScript / JavaScript (simplistic yet conventional syntax);
ActionScript (like JavaScript but more "grown up");
Erlang (concurrent programming features);
...




granted, to be original, one needs to be, errm, original.
like maybe try to come up with some new/interesting language feature or
idea to try exploring, or something interesting to do at the
compiler/codegen level, ...

