C as a Hardware Design Language

July 2001

One of the hardest things we all need to learn when starting out with HDL is that we're not programming - we're building hardware and arrays of gates. Having done a *lot* of C and applications programming before I started on VHDL and Verilog I can tell you it took a while to shake off the programmer in me and become a hardware developer. Applying general-purpose programming tactics to HDL too often makes too many gates and highly inefficient chip and logic layouts.

"motorsabbath" responding to a question posted on slashdot.org about hardware design in JHDL® (Java HDL), January 16, 2002

In the last few years there has been a lot of work on tools for VLSI layout and timing analysis. However, for high level design the last great productivity breakthrough was logic synthesis (e.g., tools like Synopsys' HDL Compiler). A number of software tools have been developed to improve productivity in various areas of design. Many of these tools have been developed by some of the brightest people working in the computer industry. So far, for various reasons, none of these tools has resulted in broad increases in productivity for VLSI designers. This is a real problem, since chip design complexity is approaching or exceeding the complexity of large software systems.

Anyone attending the annual Design Automation Conference, which is the EDA industry's "show and tell", will find an amazing amount of hype and what I tend to think of as a lot of naked emperors. One of these naked emperors is the idea of using algorithmic languages like C or C++ for chip design.

The idea behind C or C++ as a hardware design language is similar, in broad concept, to compiling behavioral Verilog or VHDL into logic netlists. There can be a huge distance between an algorithm specified in a language like C or behavioral Verilog or VHDL and the actual hardware design. A behavioral HDL compiler allows some behavioral HDL features to be compiled into logic netlists. But there is a constant tension between the desire to work at a high level and the need to control the details of the hardware architecture.

Like the vendors of behavioral HDL compilers, the companies that make tools that support C or C++ as a hardware design language claim some ability to compile algorithms into hardware. For C to be compiled into efficiently synthesizable HDL, a huge amount of architectural specification must be moved from the designer to the tool.

I've heard some tool vendors state that their C to Verilog compiler would allow digital signal processing (DSP) experts to design hardware. This is only true if the performance of the resulting hardware is not even remotely an issue. The idea that a C to Verilog compiler can be given to a DSP software engineer, who has no hardware design experience, and used to design a fast FPGA or ASIC based implementation of a DSP algorithm is naive, at best. In most cases the distance between an algorithm and its efficient expression in hardware is just too great. For example, elsewhere on these Web pages I discuss wavelet algorithms. Wavelets are highly parallel signal processing algorithms and can be efficiently implemented in hardware. However, the current generation of C to Verilog compilers would not produce much parallelism compiling a C version of the wavelet code into synthesizable Verilog.
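
To make the parallelism point concrete, here is a minimal sketch of one level of a Haar wavelet transform in C. It is not the wavelet code discussed elsewhere on these pages; the function name haar_step and the unnormalized average/difference form are assumptions chosen for illustration.

```c
#include <stddef.h>

/* One level of a Haar wavelet transform (average/difference form).
 * in:  n input samples, n assumed to be even
 * out: n/2 averages followed by n/2 differences
 * haar_step is a hypothetical name used only for this sketch. */
void haar_step(const double *in, double *out, size_t n)
{
    size_t half = n / 2;

    for (size_t i = 0; i < half; i++) {
        /* Each iteration reads only in[2i] and in[2i+1] and writes only
         * out[i] and out[half+i], so no iteration depends on another.
         * The loop still states the work as a sequence of steps. */
        out[i]        = (in[2 * i] + in[2 * i + 1]) / 2.0;
        out[half + i] = (in[2 * i] - in[2 * i + 1]) / 2.0;
    }
}
```

A hardware designer could choose to build n/2 add/subtract units that all operate in the same clock cycle, or one unit that is reused n/2 times, or something in between. The C source records none of these choices. It only gives a loop, and the architectural decision has to come from somewhere else.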

A great deal of effort and ingenuity has gone into the C to HDL compilers. They allow a wide range of algorithms to be compiled into synthesizable Verilog or VHDL. However impressive a feat it is to compile C into synthesizable logic, it is worth remembering that simply implementing an algorithm in hardware does not necessarily make it faster than writing the same algorithm in C and running it on a microprocessor. Custom hardware can be faster when the designer makes use of parallelism and hardware modules that are customized for the algorithm. For example, see Keshab K. Parhi's book VLSI Digital Signal Processing Systems, Wiley, 1999. The DSP designer must consider issues of sampling rate and calculation throughput, frequently using a parallel architecture to achieve the desired performance. In contrast, the hardware that results from C to HDL compilers is frequently so inefficient that it would not meet the timing requirements for most applications targeted at hardware implementation.

At one time I had a fairly long dialog with a vendor of a compiler that could translate C into Verilog. As a test case I gave them a copy of RSA Data Security's RC5 cryptographic algorithm, which I had rewritten to reflect some of the hardware modules I envisioned implementing. My version of the RC5 algorithm can be found here. The RAM table and the shifter (named ROT in the C code) are intended to be shared modules.
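
For readers who have not seen RC5, the following is a minimal sketch of the published algorithm in plain C. It is not the rewritten version mentioned above. The parameter choice (32-bit words, 12 rounds, 16-byte key) and all of the names (rc5_setup, rc5_encrypt, rc5_decrypt, rotl, rotr) are assumptions made for this sketch. The expanded key table S and the rotate functions are presumably the kind of thing that a RAM table and a shared shifter module would implement in hardware.

```c
#include <stdint.h>

enum {
    RC5_ROUNDS    = 12,                   /* assumed round count            */
    RC5_KEY_BYTES = 16,                   /* assumed key length in bytes    */
    RC5_KEY_WORDS = RC5_KEY_BYTES / 4,    /* key length in 32-bit words     */
    RC5_TABLE     = 2 * (RC5_ROUNDS + 1)  /* entries in expanded key table  */
};

/* Rotates by a data-dependent amount; only the low 5 bits of n are used. */
static uint32_t rotl(uint32_t x, uint32_t n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

static uint32_t rotr(uint32_t x, uint32_t n)
{
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Key expansion: fill the table S from the secret key. */
void rc5_setup(const uint8_t key[RC5_KEY_BYTES], uint32_t S[RC5_TABLE])
{
    uint32_t L[RC5_KEY_WORDS] = {0};
    uint32_t a = 0, b = 0;
    int i, j, k;

    /* Pack the key bytes into words, little-endian. */
    for (i = RC5_KEY_BYTES - 1; i >= 0; i--)
        L[i / 4] = (L[i / 4] << 8) + key[i];

    /* Initialize S with the RC5 magic constants for 32-bit words. */
    S[0] = 0xB7E15163u;
    for (i = 1; i < RC5_TABLE; i++)
        S[i] = S[i - 1] + 0x9E3779B9u;

    /* Mix the key into S in three passes over the larger array. */
    for (k = 0, i = 0, j = 0; k < 3 * RC5_TABLE; k++) {
        a = S[i] = rotl(S[i] + a + b, 3);
        b = L[j] = rotl(L[j] + a + b, a + b);
        i = (i + 1) % RC5_TABLE;
        j = (j + 1) % RC5_KEY_WORDS;
    }
}

/* Encrypt one 64-bit block held as two 32-bit words. */
void rc5_encrypt(const uint32_t S[RC5_TABLE], const uint32_t pt[2], uint32_t ct[2])
{
    uint32_t a = pt[0] + S[0];
    uint32_t b = pt[1] + S[1];

    for (int i = 1; i <= RC5_ROUNDS; i++) {
        a = rotl(a ^ b, b) + S[2 * i];
        b = rotl(b ^ a, a) + S[2 * i + 1];
    }
    ct[0] = a;
    ct[1] = b;
}

/* Decrypt one block by running the rounds in reverse. */
void rc5_decrypt(const uint32_t S[RC5_TABLE], const uint32_t ct[2], uint32_t pt[2])
{
    uint32_t a = ct[0];
    uint32_t b = ct[1];

    for (int i = RC5_ROUNDS; i > 0; i--) {
        b = rotr(b - S[2 * i + 1], a) ^ a;
        a = rotr(a - S[2 * i], b) ^ b;
    }
    pt[0] = a - S[0];
    pt[1] = b - S[1];
}
```

Note how little of this reads like hardware. There is one table, one rotate and a few adders, but the code says nothing about how many of each exist, how they are connected, or what happens on each clock.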

The C to Verilog compiler did an impressive job of translation, in that it was able to properly produce Verilog that implemented the same algorithm as the C code. This might be a good way to implement a prototype of the algorithm in FPGA, assuming that hardware efficiency and performance were not a concern. Given the speed of modern microprocessors, simply placing an algorithm in FPGA does not guarantee that it will be faster than a software implementation. The C to Verilog compiler vendor's results showed that the performance of many algorithms was not significantly improved by compiling them for FPGA. This was also true of the RC5 algorithm.

Synthesizable Verilog and VHDL are languages for specifying hardware design and architecture. When the RC5 algorithm is implemented in a synthesizable HDL, the designer is not just producing a translation to Verilog, but an architectural description for the hardware. The architecture chosen will be based on many factors, including how the chip will fit into a larger system. In the case of a cryptographic chip the designer might want to pipeline the design so that the chip can efficiently stream data. Synthesizable HDL can describe hardware architecture in a way that C cannot.

The hardware design for synthesizable logic implementing the RC5 algorithm would be much larger than the C code, since much more detail is specified. This includes busses and bus control logic between the modules and details about how the various components are sensitive to clock edges (e.g., does the module respond to the positive or negative edge of the clock). An example showing the logic implementing RC5 initialization and the RAM block can be found here. This corresponds to the "setup" part of the C code. This Verilog design includes a test bench to test the logic. However, even when this test bench is removed, the Verilog is considerably larger and more complicated than the C code.

When the limitations of a tool like the C to Verilog compiler were pointed out to one vendor, they claimed that designers don't have to worry about hardware efficiency. In an era of chips that can easily contain millions of logic gates, hardware is "cheap". In the same vein, the C to Verilog compiler vendor stated, computer memory is cheap, and no one would suggest implementing large software systems in assembly language.

The argument that hardware is cheap, so a designer can concentrate on the abstract algorithm and forget about hardware architecture, is not true for most systems that are implemented in hardware. Silicon resources may be abundant relative to the past, but many designs, whether in ASIC or FPGA, must meet demanding timing, power and architectural constraints. The only way to address these issues is with a well-chosen architecture. C as a hardware design language lacks the idioms (wires, delays, non-blocking assignment) to deal with hardware specification.
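
Non-blocking assignment is a good example of a missing idiom. The sketch below assumes two 8-bit registers that exchange values on every clock edge; the type regs_t and both function names are hypothetical. In Verilog the exchange is two non-blocking assignments. In C the same two statements do something different, and the register behavior has to be modeled by hand.

```c
#include <stdint.h>

/* Two registers that swap values on each clock edge. In Verilog:
 *
 *   always @(posedge clk) begin
 *       a <= b;
 *       b <= a;
 *   end
 */
typedef struct {
    uint8_t a;
    uint8_t b;
} regs_t;

/* Naive translation into C. The statements execute in order, so the
 * second one sees the value just written to a, and the swap is lost. */
void clock_edge_sequential(regs_t *r)
{
    r->a = r->b;
    r->b = r->a;   /* both fields now hold the old value of b */
}

/* Hand-built model of non-blocking semantics: compute every next value
 * from the current state first, then commit them all at once. */
void clock_edge_two_phase(regs_t *r)
{
    regs_t next;

    next.a = r->b;
    next.b = r->a;
    *r = next;
}
```

Starting from a = 1, b = 2, the sequential version leaves both registers at 2, while the two-phase version leaves a = 2, b = 1. The two-phase pattern works, but it is a convention the programmer must apply everywhere, not something the language provides.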

A C to Verilog compiler is not even attractive for test bench implementation. Test benches frequently test chip timing and events (e.g., delays and activation conditions). For example, a test bench that tests a design like a Universal Serial Bus or a PCI bus must test the bus response to events through time. Event and timing driven test benches are also used to test microprocessor designs, when testing features like cache access or MMU operation. Events and delays cannot be specified in C, nor in the synthesizable HDL generated by a tool like a C to Verilog compiler.

Simulation of large VLSI designs is becoming a problem in the EDA industry. Even when simulated on an FPGA based emulator, the design can take significant amounts of time to simulate. Setup and debug on an emulator can also be time consuming. Some of the proponents of C as a hardware design language have proposed C as a solution for this simulation bottleneck. Since compiled C runs much faster than simulated Verilog or VHDL, a hardware system specified in C would "simulate" much faster than the design specified in Verilog or VHDL.

There are several problems with this argument. If the designer cannot specify an efficient parallel architecture, it does not matter how fast the design simulates. The designer is not being given the tools they need. In order to get good hardware utilization and performance, architectural details must be specified in C. There have been a variety of proposals to extend C with various HDL-like constructs. But if you are going to use an HDL, you might as well use Verilog or VHDL. These languages are more expressive when it comes to hardware and they allow test benches to be written that have timing and event control. Verilog and VHDL "cycle based" simulators already exist which do not have all of the overhead of event based simulators.

The idea that compiled C is faster than compiled Verilog or VHDL is an illusion. The code generator for the Quickturn SimServer Verilog compiler (which I designed and implemented) generated PowerPC code of similar quality to C. The extra overhead incurred by Verilog or VHDL results from properly managing events and simulation semantics. An environment that uses C as a design language must either give up these semantics (e.g., by using a cycle based approach) or pay the same price.
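
The price of event semantics can be seen in even a toy model. The sketch below is a minimal time-ordered event queue written in C. It is an illustration only, not a description of how SimServer or any other simulator is built; the types sim_t and event_t, the function names and the fixed queue size are all assumptions.

```c
#include <stddef.h>

#define MAX_EVENTS 16   /* arbitrary capacity for this sketch */

/* A pending signal change: at simulation time `time`, *signal gets `value`. */
typedef struct {
    unsigned time;
    int     *signal;
    int      value;
} event_t;

typedef struct {
    event_t  q[MAX_EVENTS];  /* pending events, sorted by time */
    size_t   count;          /* number of pending events       */
    unsigned now;            /* current simulation time        */
} sim_t;

void sim_init(sim_t *s)
{
    s->count = 0;
    s->now = 0;
}

/* Schedule a signal change `delay` time units after the current time.
 * The queue stays sorted; events with equal times keep insertion order.
 * Returns 0 on success, -1 if the queue is full. */
int sim_schedule(sim_t *s, unsigned delay, int *signal, int value)
{
    unsigned t = s->now + delay;
    size_t i = s->count;

    if (s->count == MAX_EVENTS)
        return -1;

    /* Shift later events up to make room (insertion sort step). */
    while (i > 0 && s->q[i - 1].time > t) {
        s->q[i] = s->q[i - 1];
        i--;
    }
    s->q[i].time = t;
    s->q[i].signal = signal;
    s->q[i].value = value;
    s->count++;
    return 0;
}

/* Advance time to the earliest pending event and apply it.
 * Returns 1 if an event was processed, 0 if the queue was empty. */
int sim_step(sim_t *s)
{
    event_t e;

    if (s->count == 0)
        return 0;

    e = s->q[0];
    for (size_t i = 1; i < s->count; i++)
        s->q[i - 1] = s->q[i];
    s->count--;

    s->now = e.time;
    *e.signal = e.value;
    return 1;
}
```

Every signal change with a delay goes through sim_schedule and sim_step. That bookkeeping is written in C here, and it costs the same whether the design that drives it was originally expressed in C or in an HDL. A cycle based model avoids the queue, but only by giving up delays and ordering between events.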

There is always a danger in stating "it will never work". Skeptical claims are especially dangerous in the EDA industry, where there are very bright and talented people working on the software tools. C or C++ as a design language may have some utility in a cycle based design and simulation environment. Some companies have mandated a cycle based design style where only a single clock edge may be used. Even in such an environment, attempting to extend C or C++ as hardware design languages will probably prove to be too limiting for many design engineers.

Verilog and VHDL are old languages that were designed for VLSI design specification and simulation. They were not designed for logic synthesis. The EDA industry has avoided any serious discussion of replacing these languages because their use is so widespread. Discussion of new languages has also been difficult because the discussion has been clouded by hype. I am a software engineer, not a hardware engineer. The issues that I've brought up here should be obvious to anyone who has done RTL design. Yet I continue to see them glossed over, not only by startups selling C to HDL software, but by large companies that have proposed similar design approaches. The EDA industry will have a hard time moving forward in high level design when there are so many naked emperors and so little honest discussion.



Disclaimer and all that

This web page is an outgrowth of a fairly detailed set of email notes that I exchanged with a vendor of C to HDL translation software after they gave a presentation at Quickturn (at the invitation of the marketing department, which did not know enough to recognize what was hype and what was reality). Names have been omitted to protect the innocent and the guilty. The careful reader of these web pages will note that this web page was written about a year after I left Quickturn. However, the discussion is based on email I saved from this earlier period. None of this material has any direct relation to the work I was doing at Quickturn. The RC5 rewrite and the Verilog design were done at night and on weekends. The opinions stated here don't necessarily reflect those of Quickturn or Cadence. I doubt that Cadence would officially agree with my claim that naked emperors are running rampant in the EDA industry (an interesting image, you have to admit).

Ian Kaplan, July 2001
Revised: January 2002
