Etsy Icon>

Code as Craft

Building A Better Build: Our Transition From Ant To SBT main image

Building A Better Build: Our Transition From Ant To SBT

  image

A build tool is fundamental to any non-trivial software project.  A good build tool should be as unobtrusive as possible, letting you focus on the code instead of the mechanics of compilation.  At Etsy we had been using Ant to build our big data stack.  While Ant did handle building our projects adequately, it was a common source of questions and frustration for new and experienced users alike.  When we analyzed the problems users were having with the build process, we decided to replace Ant with SBT, as it was a better fit for our use cases.  In this post I’ll discuss the reasons we chose SBT as well as some of the details of the actual process of switching.

Why Did We Switch?

There were two perspectives we considered when choosing a replacement for Ant.  The first is that of a user of our big data stack.  The build tool should stay out of the way of these users as much as possible.  No one should ever feel it is preventing them from being productive, but instead that it is making them more productive.  SBT has a number of advantages in this regard:

  1. Built-in incremental compiler: We used the stand-alone incremental compiler zinc for our Ant build.  However, this required a custom Ant task, and both zinc and that task needed to be installed properly before users could start building.  This was a common source of questions for users new to our big data stack.  SBT ships with the incremental compiler and uses it automatically.

  2. Better environment handling: Because of the size of our code base, we need to tune certain JVM options when compiling it.  With Ant, these options had to be set in the ANT_OPTS environment variable.  Not having ANT_OPTS set properly was another common source of problems for users.  There is an existing community-supported SBT wrapper that solves this problem.  The JVM options we need are set in a .jvmopts file that is checked in with the code.

  3. Triggered execution: If you prefix any SBT command or sequence of commands with a tilde, it will automatically run that command every time a source file is modified.  This is a very powerful tool for getting immediate feedback on changes.  Users can compile code, run tests, or even launch a Scalding job automatically as they work.

  4. Build console: SBT provides an interactive console for running build commands.  Unlike running commands from the shell, this console supports tab-completing arguments to commands and allows temporarily modifying settings.

The other perspective is that of someone modifying the build.  It should be straightforward to make changes.  Furthermore, it should be easy to take advantage of existing extensions from the community.  SBT is also compelling in this regard:

  1. Build definition in Scala: The majority of the code for our big data stack is Scala.  With Ant, modifying the build requires a context switch to its XML-based definition.  An SBT build is defined using Scala, so no context switching is necessary.  Using Scala also provides much more power when defining build tasks.  We were able to replace several Ant tasks that invoked external shell scripts with pure Scala implementations.  Defining the build in Scala does introduce more opportunities for bugs, but you can use scripted to test parts of your build definition.
  2. Plugin system: To extend Ant, you have to give it a JAR file, either on the command line or by placing it in certain directories.  These JAR files then need to be made available to everyone using the build.  With SBT, all you need to do is add a line like addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2") to the build definition.  SBT will then automatically download that plugin and its dependencies.  By default SBT will download dependencies from Maven Central, but you can configure it to use only an internal repository:

    
    resolvers += resolver("internal-repo", "") externalResolvers <<= resolvers map { rs => Resolver.withDefaultResolvers(rs, mavenCentral = false) }
    
    
  3. Ease of inspecting the build: SBT has an "inspect" command that will provide a lot of information about a setting or task: the type, any related settings or tasks, and forward and reverse dependencies.  This is invaluable when trying to understand how some part of the build works.
  4. Debugging support: Most SBT tasks produce verbose debug logs that are not normally displayed.  The "last" command will output those logs, making tracking down the source of an error much easier.

This is not to say that SBT is perfect.  There are some aspects of SBT that are less than ideal:

  1. Terse syntax: The SBT build definition DSL is full of hieroglyphic symbols, such as the various assignment operators: <<=, <+=, and <++=.  This terse syntax can be intimidating for those that are not used to it.

  2. Support for compiling a subset of files: The Scala compiler can be slow, so our Ant build supported compiling only a subset of files.  This was very easy to add in Ant.  SBT required a custom plugin that delved into SBT’s internal APIs.  This does work, but we have already had to do some minor refactoring between SBT 0.13.1 and 0.13.2.  We have to accept that that such incompatibility is likely to occur between future minor versions.

  3. Less mature: SBT is a relatively new tool, and the plugin ecosystem is still growing.  As such, there were several times we had to write custom plugins for functionality that would likely already be supported by another tool.  We also experienced a couple of bugs in SBT itself during the switch.

At this point it’s fair to ask why SBT, and not Maven, Gradle, or some other tool.  You can certainly get features like these in other tools.  We did not do a detailed comparison of multiple tools to replace Ant.  SBT addressed our pain points with the Ant build, and it was already popular with users of our big data stack.

The Transition Process

It may sound cliché, but the initial hurdle for actually switching to SBT was gaining an in-depth understanding of the tool.  SBT’s terminology and model of a build is very different from Ant.  SBT’s documentation is very good and very thorough, however.  Starting the switch without having read through it would have resulted in a lot of wasted time.  It’s a lot easier to find answers online when you know how SBT names things!

The primary goal when switching was to have the SBT build be a drop-in replacement for the Ant build.  This removed the need to modify any external processes that use the build artifacts.  It also allowed us to have the Ant and SBT builds in parallel for a short time in case of bugs in the SBT build.  Unfortunately, our project layout did not conform to SBT’s conventions -- but SBT is fully configurable in this regard.  SBT’s "inspect" command was helpful here for discovering which settings to tweak.  It was also good that SBT supports using an external ivy.xml file for dependency management.  We were already using Ivy with our Ant build, so we were able to define our dependencies in one place while running both builds in parallel.

With the configuration to match our project layout taken care of, actually defining the build with SBT was a straightforward task.  Unlike Ant, SBT has built-in tasks for common operations, like compiling, running tests, and creating a JAR.  SBT’s multi-project build support also cut down on the amount of configuration necessary.  We have multiple projects building together, and every single Ant task had to be defined to build them in the correct order.  Once we configured the dependencies between the projects in SBT, it automatically guaranteed this.

We also took advantage of SBT’s aliasing feature to make the switch easier on users.  The names of SBT’s built-in tasks did not always align with the names we had picked when defining the corresponding task in Ant.  It’s very easy to alias commands, however:

addCommandAlias("jar", "package")

With such aliases users were able to start using SBT just as they had used Ant, without needing to learn a whole set of new commands.  Aliases also make it easy to define sequences of commands without needing to create an entirely new task, such as

addCommandAlias("rebuild", ";clean; compile; package")

The process of switching was not entirely smooth, however.  We did run into two bugs in SBT itself.  The first is triggered when you define an alias that is the prefix of a task in the alias definition.  Tab-completing or executing that alias causes SBT to hang for some time and eventually die with a StackOverflowError.  This bug is easy to avoid if you define your alias names appropriately, and it is fixed in SBT 0.13.2.  The other bug only comes up with multi-module projects.  Even though our modules have many dependencies in common, SBT will re-resolve these dependencies for each module.  There is now a fix for this in SBT 0.13.6, the most recently released version, that can be enabled by adding

updateOptions := updateOptions.value.withConsolidatedResolution(true)

to your build definition.  We saw about a 25% decrease in time spent in dependency resolution as a result.

Custom Plugins

As previously mentioned, we had to write several custom plugins during the process of switching to SBT to reproduce all the functionality of our Ant build.  We are now open-sourcing two of these SBT plugins.  The first is sbt-checkstyle-plugin.  This plugin allows running Checkstyle over Java sources with the checkstyle task.  The second is sbt-compile-quick-plugin.  This plugin allows you to compile and package a single file with the compileQuick and packageQuick tasks, respectively.  Both of these plugins are available in Maven Central.

Conclusion

Switching build tools isn’t a trivial task, but it has paid off.  Switching to SBT has allowed us to address multiple pain points with our previous Ant build.  We’ve been using SBT for several weeks now.  As with any new tool, there was a need for some initial training to get everyone started.  Overall though, the switch has been a success!   The number of issues encountered with the build has dropped.  As users learn SBT they are taking advantage of features like triggered execution to increase their productivity.