Byjean

Types and Craftspersonship

Bringing Play! 2 Heroku slug size under control

** EDIT 2013-10-11 **

The pull request has been merged. In the discussion, Josh Suereth suggested it should be possible to use sbt-native-packager, which replaced the dist command in Play 2.2, to make a better buildpack.

** EDIT 2013-08-19 **

I recommend reading the pull request discussion on GitHub. Basically, if you have a JVM-based application, your absolute minimum slug size will be around 77MB because you _must_ package your own JRE in your slug. The default JRE for the stack is not upgraded on a regular basis, leaving you exposed to security vulnerabilities.

I had been bothered by the slug size of my Play! 2 apps before, but never took the time to investigate. I couldn’t understand why sbt dist would yield a 34MB zip while Heroku would end up with a > 100MB archive. While deploying an upgrade to Play! 2.1.3, I noticed the slug had bloated to 142MB: I had to act.

The current Heroku buildpack uses sbt clean compile stage as its main command instead of sbt dist. I haven’t tried to change that, as I wanted something working fast, but I speculate that switching to sbt dist would be the best way to go for a Play! 2 app.

I cloned the official Heroku buildpack for Scala and added some debug output in bin/compile, using du -sh ./* and find . \! -type d | xargs ls -Slh, to try and understand where the bloat was coming from. To configure a custom buildpack for your app, all you have to do is run the following command:

$ heroku config:set BUILDPACK_URL=https://github.com/jeantil/heroku-buildpack-scala.git

Here is the output from du -sh ./*:

4.0K  ./.gitignore
8.0K  ./.ivy2
 77M  ./.jdk
 12K  ./.profile.d
251M  ./.sbt_home
4.0K  ./.travis.yml
4.0K  ./LICENSE
4.0K  ./Procfile
8.0K  ./README.md
 92K  ./app
 32K  ./conf
 55M  ./project
1.1M  ./public
4.0K  ./system.properties
 44M  ./target
 56K  ./test
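A listing like this is easier to scan when the largest entries come first. Here is a minimal sketch of a helper that could be dropped into bin/compile for that purpose. The name report_sizes is hypothetical and not part of the official buildpack, and the -mindepth and -maxdepth options assume GNU or BSD find.

```shell
#!/bin/sh
# Sketch only: report_sizes is a hypothetical helper name.
# Prints the size in KiB of every top-level entry of a directory,
# hidden entries included, largest first.
report_sizes() {
  dir=${1:-.}
  # find picks up dotfiles such as .jdk that a plain ./* glob would skip.
  find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + | sort -rn
}
```

Calling report_sizes "$BUILD_DIR" prints one line per entry, with the size in kilobytes in the first column.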

My first reaction was: 55MB in project?! Since I had run find on the whole directory, I was able to check what was in project, looking only for megabyte-sized artifacts. Here is what I found:

$ grep ./project deploy.log  | grep M
55M	./project
 14M Aug 15 08:19 ./project/boot/scala-2.10.0/lib/scala-compiler.jar
6.8M Aug 15 08:19 ./project/boot/scala-2.10.0/lib/scala-library.jar
3.1M Aug 15 08:19 ./project/boot/scala-2.10.0/lib/scala-reflect.jar
 11M Aug 15 08:19 ./project/boot/scala-2.9.2/lib/scala-compiler.jar
8.5M Aug 15 08:19 ./project/boot/scala-2.9.2/lib/scala-library.jar
1.2M Aug 15 08:19 ./project/boot/scala-2.9.2/org.scala-sbt/sbt/0.12.3/ivy-2.3.0-rc1.jar
2.0M Aug 15 08:19 ./project/boot/scala-2.9.2/org.scala-sbt/sbt/0.12.3/main-0.12.3.jar
1.1M Aug 15 08:23 ./project/target/streams/$global/update/$global/out
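Filtering the log with grep M works here, but it matches any line containing a capital M (README.md, for instance), not only sizes. As an alternative sketch, find can apply the size threshold itself. The name list_large_files and its default threshold of 1024 KiB are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: list_large_files is a hypothetical helper name.
# Usage: list_large_files DIR [MIN_KIB]
# Lists regular files under DIR that are larger than MIN_KIB kibibytes
# (default 1024, i.e. 1 MiB).
list_large_files() {
  dir=${1:-.}
  min_kib=${2:-1024}
  # -size +Nk keeps files strictly larger than N KiB; ls -S sorts each
  # batch handed over by find with the biggest files first.
  find "$dir" -type f -size +"${min_kib}k" -exec ls -lhS {} +
}
```

For example, list_large_files "$BUILD_DIR/project" would have listed the compiler and library jars shown above without going through the deploy log.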

Keeping two versions of the Scala compiler in the production slug serves no purpose. I haven’t checked why sbt places them here on Heroku and not on my workstation, but the first thing I added to my buildpack was:

  # Quote the path so a build directory containing spaces cannot break rm.
  if [ -d "$BUILD_DIR/project/boot" ] ; then
    echo "-----> Dropping project boot dir from the slug"
    rm -rf "$BUILD_DIR/project/boot"
  fi

If you read the output of du above carefully, you will have noticed a .jdk folder weighing in at 77MB. That’s right, the default buildpack will leave the JDK in the slug. Remove it, since it is only used for compilation (though the 2013-08-19 edit above explains why you may need to ship your own JRE after all):

  # is_play is the buildpack's own check for a Play! application.
  if is_play "$BUILD_DIR" && [ -d "$BUILD_DIR/.jdk" ] ; then
    echo "-----> Dropping jdk from the slug"
    rm -rf "$BUILD_DIR/.jdk"
  fi

Along the same lines, you can drop all the intermediate compilation artifacts with:

  if [ -d "$BUILD_DIR/target" ] ; then
    echo "-----> Dropping compilation artifacts from the slug"
    # The glob stays outside the quotes so the shell can still expand it.
    rm -rf "$BUILD_DIR"/target/scala-*
    rm -rf "$BUILD_DIR/target/streams"
    rm -rf "$BUILD_DIR/target/resolution-cache"
  fi
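For testing outside the buildpack, the three cleanup steps can be gathered into a single function. This is only a sketch: prune_slug and its optional keep-jdk argument are hypothetical names, the buildpack's is_play check is left out because it is defined elsewhere in the buildpack, and the keep-jdk switch exists only because the 2013-08-19 edit suggests you may want to keep a JRE in the slug.

```shell
#!/bin/sh
# Sketch only: prune_slug is a hypothetical helper, not part of the
# official buildpack.
# Usage: prune_slug BUILD_DIR [keep-jdk]
prune_slug() {
  dir=$1
  keep_jdk=${2:-}

  # Refuse an empty or missing directory so rm -rf never runs against
  # paths rooted at "/".
  if [ -z "$dir" ] || [ ! -d "$dir" ]; then
    echo "prune_slug: not a directory: '$dir'" >&2
    return 1
  fi

  if [ -d "$dir/project/boot" ]; then
    echo "-----> Dropping project boot dir from the slug"
    rm -rf "$dir/project/boot"
  fi

  # Skipped when the caller passes keep-jdk as second argument.
  if [ "$keep_jdk" != "keep-jdk" ] && [ -d "$dir/.jdk" ]; then
    echo "-----> Dropping jdk from the slug"
    rm -rf "$dir/.jdk"
  fi

  if [ -d "$dir/target" ]; then
    echo "-----> Dropping compilation artifacts from the slug"
    # The glob stays outside the quotes so the shell can expand it.
    rm -rf "$dir"/target/scala-*
    rm -rf "$dir/target/streams" "$dir/target/resolution-cache"
  fi
}
```

Running prune_slug against a scratch copy of a build directory, then checking the result with du -sh, is a cheap way to see what each step saves before touching bin/compile.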

And now my slug is back to 39MB, which is still a bit fat but not so bad. As I said in the introduction, the best option would probably be to change the buildpack to use the artifacts generated by sbt dist.

Until the pull request is accepted by the Heroku maintainers, you can fork my version of the buildpack. I suggest you don’t use my version directly, as I will not maintain my fork after the pull request is merged.