The D Language: A Successor to C++

Software controls more and more systems that directly affect human life, from airplanes to factory automation. At a less critical scale, our economy is built around software, from retail to market trading systems.

Except for those rare instances when cash is shipped to places like Iraq on pallets, money exists as notations in the national and international banking system. Financial instruments like stocks, bonds, futures and options are traded on computer-mediated exchanges. The days when trading took place in "pits" by tough individuals screaming and spitting on each other are largely gone.

Recently a software error in the Knight Capital Group trading system cost the firm $440 million in something like 40 minutes. Knight Capital barely survived this debacle: the firm lost 70% of its equity, and some Knight Capital employees lost their fortunes.

That software is important should come as no surprise. A corollary, that software quality and maintainability are important, is an idea that is given more voice than practice.

We may never know what exactly happened in the software engineering process at Knight Capital, but the catastrophic software failure was almost certainly due to serious flaws in design, testing and the software release process.

Knight Capital was a market maker for a number of stocks and exchange-traded funds (ETFs). A market maker provides capital (liquidity) to the market to match buyers and sellers. In doing so, market makers profit from the spread between the buy price and the sell price.

The transactions processed by Knight were routed by computer-driven exchanges, which means that performance was an important issue for Knight. We can reasonably assume that the system that caused the failure was written in C++ (or, perhaps, C), perhaps running on a balanced cluster of Linux systems. The runtime overhead of the Java virtual machine would not be acceptable in such an environment.

C++ tends to be the language of choice for high performance applications in finance (or even for applications where performance is not critical). In implementing a critical application like the Knight software, the best tools should be used. Most leading edge language designers would not say that C++ is the best tool available.

There are a number of problems with C++, but the core problems that make reliable C++ software difficult to implement are the lack of memory bounds checking and the lack of memory "garbage collection". Every programming language that I know of that has been designed in the last decade or so includes these features.

Along with memory bounds checking and garbage collection, I've included a list of other features that would be nice to have in a language that replaces C++:

  1. As noted, the language should support bounds checking and should, at least optionally, support garbage collected data structures.
  2. The compiled native code should have performance that is as good as or better than C++.
  3. Support for C++/Java style objects (perhaps omitting multiple inheritance).
  4. Support for threads, or a similar construct to handle writing parallel software for multi-core systems.
  5. Support for linking to C libraries and drivers.
  6. Ideally, the language syntax should be as similar as possible to C++/Java to reduce the learning curve.
  7. The language should be open source so the compilers are always available.
  8. There should be support for multiple platforms (e.g., at a minimum, Windows and Linux).

There are programming languages that support bounds checking and garbage collection and that are compiled to native code.

The experience of developing a large, complex, performance-sensitive application, the Mozilla Firefox web browser, prompted some members of the project to experiment with the Rust programming language. Although Rust has the curly brackets of the C family of languages, it "differs significantly in syntactic and semantic details". This makes Rust harder to pick up for C++ and Java programmers. Also, Rust is a relatively new experimental language, so the compiler and runtime are not yet mature.

The Go language is also a natively compiled systems language that supports bounds checking and garbage collection. The language was originally designed by Robert Griesemer, Rob Pike and Ken Thompson (Rob Pike and Ken Thompson were both at Bell Labs during the UNIX days and made significant contributions). Apparently Google has used Go for some internal projects. I find Go uncompelling: it is a throwback to an earlier era of language design, that of module-structured languages like Modula-2 and Oberon (designed by Niklaus Wirth, who was at ETH Zurich).

Another option is the Excelsior JET Java-to-native compiler. The JET compiler was implemented by some talented compiler developers in Russia. It will compile all or most of Java into native code on Windows and Linux. The compiler is also quite mature, having been around for a decade or so. However, there are a couple of problems. JET is not open source, and Excelsior is the only company that sells a Java-to-native compiler. If Excelsior goes out of business, there are few alternatives (the GNU project has a Java-to-native compiler, but I'm not sure how robust it is).

The most promising candidate I know of for a C++ replacement is the D Programming Language (dlang.org) originally designed by Walter Bright.

Walter Bright developed one of the first C++ compilers that supported most or all of the language. The D language was Walter's response to the problems he recognized in C++. He spent several years developing D and has always made it available without a license fee. The compilers, runtime and language are now published with open source licenses. D has been around since 2001, so it's had time to mature.

The D language meets the wish list above, and more. There are now books on D and it is slowly catching on. Using D means escaping the constant memory errors that C++ code suffers from, with compiled code that is as fast as or faster than C++.
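To make a few of the wish-list items concrete, here is a minimal sketch showing a garbage-collected dynamic array, an out-of-bounds access being caught at run time, and a direct call to a C library function. The helper name outOfBounds is made up for this illustration, and the sketch assumes a default build: D compilers can disable bounds checks in some configurations (for example with -release), in which case the second call would not be caught.

```d
import core.exception : RangeError;
import core.stdc.string : strlen;
import std.stdio : writeln;

// Hypothetical helper: returns true if reading data[index] trips D's array bounds check.
// Assumes a default (non -release) build, where the check is enabled.
bool outOfBounds(const int[] data, size_t index)
{
    try
    {
        auto value = data[index]; // checked access
        return false;
    }
    catch (RangeError e)
    {
        // RangeError is an Error; it is caught here only for demonstration.
        return true;
    }
}

void main()
{
    // Dynamic array allocated on the garbage-collected heap; no explicit free.
    int[] samples = new int[](4);

    writeln(outOfBounds(samples, 2)); // false: index 2 is inside the array
    writeln(outOfBounds(samples, 4)); // true: one past the end is caught

    // Calling a C library function directly (wish-list item 5).
    writeln(strlen("hello")); // 5
}
```

In C++ the equivalent out-of-range read through operator[] is undefined behavior rather than a reported error, which is the class of bug the wish list is aimed at.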

Experience with the D Programming Language

Programming languages and environments are complex. A language can look good on paper and be problematic in practice. One example of this is the JVM-based Scala programming language. Some projects adopted Scala and found later that there were problems in the utopia of Scala. To some degree, such disillusionment is to be expected with any new programming language. In the end, the proof of a programming language is in its use.

I had a chance to experiment with the D programming language during an event where we had 24 hours to implement an application. Since I wanted to eat and sleep, this turned out to be more like 14 hours. One challenge in taking part in such an event is to choose a programming project that is challenging enough to be interesting but can still be completed in this relatively brief period of time.

The application that I chose was a TCP/IP socket server that would serve time series data to multiple clients. In this application multiple clients can be instantiated. Each client will request a different time series that includes a time stamp. The server will spawn off a socket and a thread to serve this time series, honoring the time delay specified by the time stamp. Among other things, this kind of software is useful in testing software that processes time series data. This includes applications like the Passive Remote Sensing Instrument Control application that processes device status time series.
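The thread-per-client design described above can be sketched roughly as follows. This is not the server's actual source; it is a minimal illustration under stated assumptions. The Sample record, the comma-separated wire format, the function names and the in-memory test data are all hypothetical, and the time stamps are assumed to be in milliseconds and never decreasing.

```d
import core.thread : Thread;
import core.time : msecs;
import std.conv : to;
import std.socket : InternetAddress, Socket, TcpSocket;
import std.stdio : writeln;

// Hypothetical record: a time stamp in milliseconds plus one value.
struct Sample
{
    long timestampMs;
    double value;
}

// Delay before each sample: the gap to the previous time stamp (0 for the first).
// Assumes the time stamps never decrease.
long[] delaysMs(const Sample[] series)
{
    long[] delays;
    foreach (i, s; series)
        delays ~= (i == 0) ? 0 : s.timestampMs - series[i - 1].timestampMs;
    return delays;
}

// Sends one series to one client, sleeping between samples to honor the time stamps.
void serve(Socket client, const Sample[] series)
{
    scope (exit) client.close();
    foreach (i, delay; delaysMs(series))
    {
        Thread.sleep(delay.msecs);
        client.send(series[i].timestampMs.to!string ~ "," ~ series[i].value.to!string ~ "\n");
    }
}

// A separate function gives each thread's delegate its own copy of `client`.
void startClientThread(Socket client, const Sample[] series)
{
    new Thread({ serve(client, series); }).start();
}

void main(string[] args)
{
    // Made-up data for illustration.
    auto series = [Sample(0, 1.5), Sample(250, 1.7), Sample(1000, 1.6)];

    if (args.length < 2)
    {
        writeln(delaysMs(series)); // no port given: just show the pacing, [0, 250, 750]
        return;
    }

    auto listener = new TcpSocket();
    listener.bind(new InternetAddress(args[1].to!ushort));
    listener.listen(16);
    while (true)
        startClientThread(listener.accept(), series); // one thread per client
}
```

Run with a port number as the only argument to accept connections; with no argument the program only prints the computed delays. In this sketch every client receives the same series, whereas the server described above lets each client request a different one.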

The architecture of the time series server is diagrammed below:

The D source code for the time series server can be found here. I've included the test data as well in a zip file.

Before taking part in this event, I had not programmed in D. I spent some time getting the D environment installed on my Windows laptop and setting up D in Eclipse (I was not able to get the debugger to work, however). I also spent several hours the evening before reading Andrei Alexandrescu's book The D Programming Language. But in keeping with the rules of the event, I didn't write any code before the starting time.

In addition to the book on D, I also made frequent reference to the library reference on dlang.org. Because the D language syntax is heavily influenced by C++ and Java, I was able to pick the language up rapidly and implement 400 lines of working time series server code in about 14 hours. There were some things that I could not figure out how to do with the D language or library (for example, converting digit strings to integer values) that I ended up writing my own support for.
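For reference, the D standard library does cover the digit-string case: the to template in std.conv converts a string to an integer and throws a ConvException on malformed input. The sketch below puts it next to a hand-written helper of the kind described above. digitsToInt is a hypothetical stand-in, not the code from the time series server.

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical hand-written helper: converts a string of decimal digits to an int.
// No sign handling or overflow checks; the input must contain only '0'..'9'.
int digitsToInt(string digits)
{
    int result = 0;
    foreach (char c; digits)
    {
        assert(c >= '0' && c <= '9', "not a decimal digit");
        result = result * 10 + (c - '0');
    }
    return result;
}

void main()
{
    writeln(digitsToInt("1234")); // 1234
    writeln("1234".to!int);       // 1234, via the standard library
}
```

The library version also handles a leading sign and checks for overflow, so it is usually the better choice once you know it exists.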

One of the barriers to adopting languages like Scala or Rust is the learning curve required for new people who join a project that uses these languages. My experience with D suggests that this learning curve is not terribly steep.

One of the biggest struggles I had was with the class library. Some of the classes have associated examples, although in some cases the examples are not well written (the TCP/IP server example contains a goto(!)). In other cases, D streams for example, there are few examples, and using them required some experimentation. I would have liked a "D Cookbook", but no such book has been published yet. The D language plug-in for Eclipse will show the functions available for an object, just as the Java environment does. This was useful in understanding which functions were available in a class hierarchy.

An important feature of a systems programming language is the performance offered by native compilation. I was not able to fully explore this in this project. The test code that I wrote for the time series server had 16 parallel time series streams and very little processing power was used. But a Java application might have exhibited similar performance.

Ian Kaplan
Last updated, August 2012