Memory Allocation and Garbage Collection


True confession time: my web pages serve several purposes. One of these is to act as a notebook that I will have access to where ever I have a computer and an Internet connection. This page is under construction. In the short term the purpose of this page is to provide a draft of an article on memory and to provide links to garbage collection sources. In the longer term I hope to provide a polished article that will be of interest to other software engineers. This page was prompted by J.D. Marrow, who kindly sent me a reference to Hans-J. Boehm's Web page on garbage collection.

Dynamically Allocated Data Structures

Dynamicly created data structures like trees, linked lists and hash tables (which can be implemented as arrays of linked lists) are key to the construction of many large software systems. For example, a compiler for a programming language will maintain symbol tables and type information which is dynamically constructed by reading the source program. Many modern compilers also read the source program (e.g., parse it) and translate it into an internal tree form (commonly called an abstract syntax tree) that is also dynamically created. Graphics programs, like 3D rendering packages, also make extensive use of dynamic data structures. In fact it is rare to find any program that is larger than a couple of thousand lines that does not make use of dynamically allocated data structures.

Problems with Dynamic Allocation

To avoid consuming vast amounts of virtual memory and destroying performance through page swapping, many programs that use dynamic allocation also dynamically deallocate memory when the programmer believes it is no longer needed. This leads to a number of problems including:

Even carefully crafted code written by experienced software engineers tends to suffer from problems with memory leaks and references to deallocated memory. Products like like purify (for UNIX) and BoundsChecker (for Windows NT) are literally worth their weight in gold when trying to track down these errors.

Java and Garbage Collection

There are a variety of reasons that the Java programming language seems to be popular. Some of these reasons actually involve software engineering issues, rather than hype and religion. Java does not support explicit pointers (which are a source of a lot of complexity in C and C++) and it supports garbage collection. This can greatly reduce the amount programming effort needed to manage dynamic data structures. Although the programmer still allocates data structures, they are never explicitly deallocated. Instead, they are "garbage collected" when no live references to them are detected. This avoids the problem of having a live pointer to a dead object. If there is a live pointer to an object, the garbage collection system will not deallocate it. When an object is no longer referenced, its memory is automaticly recovered. So, in theory, Java avoids many of the problems listed above. The cost, however, is in performance. Automatic garbage collection is usually not as efficient as programmer managed allocation and deallocation. Also, garbage collectors tend to deallocate objects from a low level, which can hurt performance (e.g., deallocating an openGL display list at the element level). There is a rich body of work on garbage collector architecture and garbage collection algorithms (stemming largely from LISP runtime support, which makes heavy use of garbage collection). A lot of this work is aimed at avoiding the more severe pitfalls of garbage collection.

C/C++ and Garbage Collection

Many garbage collection systems rely on compiler support to help the system determine when a pointer goes out of scope, so the object the pointer points to can be deallocated. However, most C and C++ compilers don't support garbage collection. There are several packages that support garbage collection for C and C++. See, for example, Hans-J. Boehm's Web page on the Boehm-Demers-Weiser conservative garbage collector. Another version of the web page on the Hoehm-Demers-Weiser garbage collector is also available at a site hosted by Xerox PARC. This garbage collector is designed to be a replacement for malloc in C and new in C++. The Web pages also provides a reference list. So far I have not looked into the pros and cons of this software (as I mentioned above, this Web page is still under construction).

Pool Allocation

In pool allocation, objects are allocated from a pool of large blocks. When a point in the program is reached where this memory is no longer needed, the entire pool can be deallocated at once. For a discussion of pool allocation of C++ objects see Overloading New in C++.

Other Web pages

Ian Kaplan
Last updated, December, 2001


back to Notes on Software and Software Engineering

back to home page