Subbuilds: build avoidance done right
October 21, 2009 - Eric Melski
I’ve heard it said that the best programmer is a lazy programmer. I’ve always taken that to mean that the best programmers avoid unnecessary work, by working smarter and not harder; and that they focus on building only those features that are really required now, not allowing speculative work to distract them.
I wouldn’t presume to call myself a great programmer, but I definitely hate doing unnecessary work. That’s why the concept of build avoidance is so intriguing. If you’ve spent any time on the build speed problem, you’ve probably come across this term. Unfortunately it’s been conflated with the single technique implemented by tools like ccache and ClearCase winkins. I say “unfortunately” for two reasons: first, those tools don’t really work all that well, at least not for individual developers; and second, the technique they employ is not really build avoidance at all, but rather object reuse. Because those tools co-opted the term build avoidance and associated it with such lackluster results, many people have become dismissive of build avoidance.
Subbuilds are a more literal, and more effective, approach to build avoidance: reduce build time by building only the stuff required for your active component. Don’t waste time building the stuff that’s not related to what you’re working on now. It seems so obvious I’m almost embarrassed to be explaining it. But the payoff is anything but embarrassing. On my project, after making changes to one of the prerequisite libraries for the application I’m working on, a regular incremental build takes 10 minutes; a subbuild incremental takes just 77 seconds.
Not bad! Read on for more about how subbuilds work and how you can get SparkBuild, a free gmake- and NMAKE-compatible build tool, so you can try subbuilds yourself.
What is a subbuild?
A subbuild is just the smallest part of a full build tree that must be built in order to completely build a single component of the build, including all its prerequisites. For example, my project consists of several applications and the libraries they depend on. Each of these components resides in a separate directory, and we use recursive make invocations to build everything. (Nota bene: if you have a non-recursive make then you probably already enjoy many of the benefits of subbuilds, but you should definitely still check out the other features of SparkBuild!)
The dependency graph for my project looks like this:
You can see that to build the agent component, for example, we only need to build the util, xml, and http libraries, and the agent application code, of course:
This subset defines the agent subbuild.
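As a minimal sketch of that definition, a subbuild can be computed as the transitive closure of a component's prerequisites, listed prerequisites first. The individual edges in the `DEPS` table are an illustrative assumption, since the graph itself is not reproduced in the text. Only the resulting agent subbuild (util, xml, http, agent) comes from the description above. The function name `subbuild` is hypothetical too.

```python
def subbuild(component, deps):
    """Return component plus all transitive prerequisites, prerequisites first.

    Assumes the dependency graph has no cycles.
    """
    order = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for prereq in deps.get(name, []):
            visit(prereq)
        order.append(name)  # appended only after all of its prerequisites

    visit(component)
    return order


# Hypothetical edges: assumed for illustration, not taken from the real project.
DEPS = {
    "util": [],
    "xml": ["util"],
    "http": ["util"],
    "ldap": ["util"],
    "agent": ["xml", "http"],
    "cm": ["xml", "ldap"],
}

print(subbuild("agent", DEPS))  # ['util', 'xml', 'http', 'agent']
```

Note that cm and ldap never appear in the result: they are not reachable from agent, so they are not part of its subbuild.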
Subbuilds and developers
What makes subbuilds really interesting for developers is the realization that usually you’re working on just one component at a time. For example, on any given day I might be working on the agent component, or the cm, but rarely both. Most of the edits I make will be on code in the agent directory, with occasional edits to the agent’s prerequisites. As I’m running through the edit-compile-test cycle, I have some choices about how to run the build. The most natural thing for me is to simply run make in the agent directory. After all, most of the changes I make are in that directory, so that will do the right thing most of the time. Of course, if I have made changes to any of the prerequisites, or if I resync with the source depot and pick up somebody else’s changes in one of those prerequisites, I’ll probably get a busted build.
The next most obvious approach is a rebuild from the root of my source tree. This ensures that I always update all the pieces I need for the agent, but at the cost of also building components that are irrelevant to my current focus: if I’m just trying to rebuild to run the agent’s unit tests, there’s no need for me to rebuild the cm application, or the ldap library.
The best choice is the agent subbuild, the minimum set of things that must be built to be sure that the agent component is fully up-to-date. But although it’s possible on a small project like this to execute the subbuild manually, it’s a nuisance, and on a bigger project it may not be practical or even possible. You need a build tool that can automatically determine which parts of the build make up the subbuild for any component, and then automatically execute that subbuild. That tool is SparkBuild emake.
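To make the "manual subbuild" option concrete, here is a rough sketch of running make by hand in each directory of the agent subbuild. The order of the three libraries is an assumption, and the helper names `subbuild_commands` and `run_subbuild` are hypothetical. The only real tool feature used is make's standard `-C <dir>` option. The script defaults to a dry run that just prints the commands.

```python
import subprocess


def subbuild_commands(components, make="make"):
    """Build one 'make -C <dir>' command per component, in the given order."""
    return [[make, "-C", directory] for directory in components]


def run_subbuild(components, dry_run=True):
    """Print each command; when dry_run is False, also execute it."""
    for cmd in subbuild_commands(components):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first component that fails to build
            subprocess.run(cmd, check=True)


# Assumed order for illustration: libraries first, the application last.
AGENT_SUBBUILD = ["util", "xml", "http", "agent"]
run_subbuild(AGENT_SUBBUILD)
```

The weakness is that the component list is maintained by hand: every time a dependency is added or removed, somebody has to remember to update it, and that is the nuisance an automatic tool removes.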
Subbuilds with SparkBuild
Subbuilds with SparkBuild start with a full build, during which emake captures information about which targets are produced by each submake. In subsequent builds, emake references that database anytime it can’t find a rule to build a particular target. If a match is found, emake runs the corresponding submake before proceeding. For example, the rule for the actual agent target looks like this:
$(OUT)/agent/agent: $(OUT)/agent/*.o $(OUT)/xml/xml.a $(OUT)/http/http.a
	g++ -o $@ $^
In a normal build, gmake would see the dependency on $(OUT)/xml/xml.a and use that file if it existed already, regardless of whether it was actually up-to-date; or report “no rule to make” if the file did not exist. With SparkBuild, emake checks the subbuild database for an entry matching $(OUT)/xml/xml.a and sees that it must run make in the xml directory before proceeding. Like magic, each of the agent’s prerequisites is updated without requiring me to take any action other than swapping emake --emake-subbuild-db=my.db for gmake in my build command-line.
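The lookup described above can be sketched in a few lines. This is only a conceptual model: it is not emake's actual database format or algorithm, and the names `SubbuildDB` and `plan_submakes`, as well as the example paths, are hypothetical. It records which directory's submake produced each target during a full build, then uses that record to decide which submakes to run for prerequisites that have no local rule.

```python
class SubbuildDB:
    """Toy model: maps each target to the directory whose submake produced it."""

    def __init__(self):
        self._producer = {}

    def record(self, directory, targets):
        """Called during the full build, once per submake."""
        for target in targets:
            self._producer[target] = directory

    def lookup(self, target):
        """Return the producing directory, or None if the target is unknown."""
        return self._producer.get(target)


def plan_submakes(prereqs, local_rules, db):
    """Return the directories to build first, in first-needed order."""
    to_run = []
    for prereq in prereqs:
        if prereq in local_rules:
            continue  # this makefile already knows how to build it
        directory = db.lookup(prereq)
        if directory is not None and directory not in to_run:
            to_run.append(directory)
    return to_run


db = SubbuildDB()
db.record("xml", ["out/xml/xml.a"])
db.record("http", ["out/http/http.a"])

prereqs = ["out/agent/main.o", "out/xml/xml.a", "out/http/http.a"]
print(plan_submakes(prereqs, {"out/agent/main.o"}, db))  # ['xml', 'http']
```

A prerequisite that has neither a local rule nor a database entry is simply skipped in this sketch. A real tool would presumably fall back to the usual "no rule to make" error at that point.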
Still not convinced that it’s worth a look? Here are some more concrete results comparing a few different build scenarios from my project. These comparisons assume that I’m actively working on the agent component, and that I ran either a standard incremental, from the root of the source tree, or a subbuild using SparkBuild emake:
|No changes, standard:||
|No changes, subbuild:||
|Changes in agent, standard:||
|Changes in agent, subbuild:||
|Changes in util, standard:||
|Changes in util, subbuild:||
Conclusion and Availability
Tools like ccache and ClearCase winkins have co-opted the term build avoidance, but in fact they do object reuse, not build avoidance, and they are not very useful for developer builds. Subbuilds are a simple but highly effective approach to build avoidance that saves significant time during developer builds by literally skipping parts of the build tree that are unrelated to your current focus.
If you want to try out subbuilds yourself, you can download SparkBuild from www.sparkbuild.com. It’s completely free (as in beer), so you’ve got nothing to lose… except those long coffee breaks, of course!