A compiler by itself may be interesting to the compiler designer, but it is not much use to anyone else. In practice a compiler is part of a software development environment, which includes assemblers, linkers, debuggers and graphical user interfaces to support software development. This page collects some random notes and links on these topics.
Jack Crenshaw's compiler tutorial
Few people seem to study compiler design, so I see postings asking, for example, how to generate code from an ANTLR parser. Much of the material on compiler design takes a lot of work to read, since it contains a lot of formalism. A low-key, largely non-technical introduction to compiler construction has been written by Jack Crenshaw, who currently writes an outstanding column for Embedded Systems.
Parsing Expressions By Recursive Descent by Theodore Norvell
ANTLR, the parser generator that I use, generates a recursive descent parser from an LL(k) grammar. Recursive descent is fine, but each precedence level in an expression grammar adds another nested rule call for every operand, so as the levels go up, the parsing speed can go down. This is not a big problem for a modern compiler on a modern computer system, since most of the time is spent in other phases (e.g., semantic analysis, intermediate code generation, optimization). Norvell proposes a technique called "precedence climbing" which reduces the time to parse expressions with precedence.
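To make the idea concrete, here is a minimal sketch of precedence climbing in Python. It is not Norvell's code and not what ANTLR generates. The operator table `BINARY_OPS`, the tokenizer and the tuple-based tree are all assumptions made for illustration. The point is that a single loop driven by a precedence table replaces the chain of one-rule-per-level functions that a plain recursive descent parser would use.

```python
import re

# Hypothetical operator table: symbol -> (precedence, associativity).
# A higher precedence number binds more tightly.
BINARY_OPS = {
    "+": (1, "left"),
    "-": (1, "left"),
    "*": (2, "left"),
    "/": (2, "left"),
    "^": (3, "right"),
}


def tokenize(text):
    """Return integer and operator tokens; other characters are skipped."""
    return re.findall(r"\d+|[-+*/^()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_primary(self):
        """Parse an integer literal or a parenthesized expression."""
        tok = self.advance()
        if tok == "(":
            tree = self.parse_expression(1)
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return tree
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"unexpected token: {tok!r}")
        return int(tok)

    def parse_expression(self, min_prec=1):
        """Parse operators whose precedence is at least min_prec."""
        left = self.parse_primary()
        while True:
            op = self.peek()
            if op not in BINARY_OPS:
                break
            prec, assoc = BINARY_OPS[op]
            if prec < min_prec:
                break
            self.advance()
            # A left-associative operator must not absorb another operator
            # of the same precedence on its right; a right-associative one may.
            next_min = prec + 1 if assoc == "left" else prec
            right = self.parse_expression(next_min)
            left = (op, left, right)
        return left


def parse(text):
    """Parse text into nested (operator, left, right) tuples."""
    parser = Parser(tokenize(text))
    tree = parser.parse_expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token: {parser.peek()!r}")
    return tree
```

With this table, `parse("1 + 2 * 3")` groups the multiplication first, `parse("8 - 3 - 2")` groups to the left, and `parse("2 ^ 3 ^ 2")` groups to the right. Adding an operator means adding a table entry rather than another grammar rule.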
Language verification test suites
Probably the best-known vendor is Plum-Hall. Another option is Modena Software, which has validation suites for Java and C++. They also have an optimization test suite that seems to compete with Nullstone. But for optimization test suites I recommend Nullstone.
Object code format and debug format
The pointers to the material on DWARF came from a comp.compilers posting by James Cownie. Any errors, of course, are my responsibility.
DWARF 2 debug format. SGI has published the open source for a set of DWARF 2 access libraries at http://reality.sgi.com/davea/objectinfo.html
There is apparently an informal group that is also working on DWARF. The evolving DWARF 2 Standard is published on the http://www.eagercon.com web pages.
The Concurrent Version System (CVS). This is an open source, network-transparent source control system (useful for doing source control on large software bases like compiler source). A book, Open Source Development with CVS by Karl Fogel, is one source for CVS documentation.
Linking
As languages have gotten more complex (e.g., C++) the complexity and sophistication of linkers and other post-compilation tools have followed. This includes resolution of generics, like templates, and managing name space issues for features like overloaded functions. Although computer memories today are larger than our wildest dreams twenty years ago, code bloat is still an issue. Post-processing tools that remove unreferenced code can help address this problem.
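The core idea behind removing unreferenced code can be sketched as a reachability walk over a call graph. The Python below is only an illustration under simplifying assumptions: the call graph is a hypothetical dictionary of function names, and every call is assumed to be direct. A real post-link tool works on object code and also has to account for things like indirect calls and functions whose addresses are taken, so this is not how any particular tool is implemented.

```python
from collections import deque


def reachable_functions(call_graph, entry_points):
    """Return the set of functions reachable from the entry points."""
    seen = set(entry_points)
    work = deque(entry_points)
    while work:
        fn = work.popleft()
        for callee in call_graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                work.append(callee)
    return seen


def strip_unreferenced(call_graph, entry_points):
    """Return a copy of the call graph containing only reachable functions."""
    live = reachable_functions(call_graph, entry_points)
    return {fn: callees for fn, callees in call_graph.items() if fn in live}


# Hypothetical program: debug_dump and old_parse are never called from main.
EXAMPLE_GRAPH = {
    "main": ["parse", "emit"],
    "parse": ["lex"],
    "lex": [],
    "emit": [],
    "debug_dump": ["emit"],
    "old_parse": ["lex"],
}
```

Calling `strip_unreferenced(EXAMPLE_GRAPH, ["main"])` keeps `main`, `parse`, `lex` and `emit`. The two functions that nothing reachable from `main` refers to are dropped, even though they themselves call live functions.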
Squeeze (http://www.cs.arizona.edu/squeeze)
Bjorn De Sutter writes in comp.compilers:
Squeeze is a post-link-time optimizer that removes unreachable code. It includes some very sophisticated analyses, but also some simpler analyses that we have implemented in post-link-time tools that take only some seconds and result in 20% code elimination. The system is written for Alpha code, but is generally applicable. It achieves roughly the same result on Compaq and gcc compiled code.
Ian Kaplan, April 14, 2000