[–] JohnBooty link

Wow! This is potentially a real gain for us. We have a big in-house, monolithic Rails app.

My initial experiment was encouraging. Boot time in development mode went from ~23sec to ~16sec, and I only enabled it for the main engine that comprises about 85% of our codebase so the real gains might be larger.

Looking forward to seeing what it can do in production mode - our boot times there are horrendous and it's a big deal for things like cron jobs. Thank you to all those who worked on this.

reply

[–] burke link

The particularly frustrating thing, when I've started thinking about optimizing boot time at a VM level, is that it's near-impossible to "understand" what loading a file actually does, since it's all just evaluated in a single namespace.

It would be great if we somehow had a way to load a module-as-file without unknown side-effects, and without depending so deeply on the other contents of the global namespace.

But this is basically describing a complete overhaul of most of what makes ruby ruby, so... ¯\_(ツ)_/¯

reply

[–] chrisseaton link

Yes if there were special Ruby source files that only had classes and modules at the top level, and only defined nested classes and modules and methods in those, then it would be a lot quicker to load things.

reply

[–] burke link

Yep. But even then though, what if:

    class A
      B = "c".freeze
    end
And elsewhere:

    class String
      def freeze
        raise "because I can, that's why"
      end

      # or even method_added, TracePoint, ...
    end
It feels like something should be possible here, but it's a really steep uphill battle.

reply

[–] chrisseaton link

Right - that's why I said nothing but methods and nested classes - expressions in method bodies would be disallowed. And tracing and method added hooks and so on, yes.

You could say it was a separate language .rb-module or something, to make it formal.

reply

[–] dragonwriter link

> Right - that's why I said nothing but methods and nested classes - expressions in method bodies would be disallowed.

I presume you mean "expressions that aren't method definitions in class bodies" (though that's a problem, because of things like attribute declarations) rather than "expressions in method bodies", since methods with no expressions in their bodies would be pointless.

reply

[–] burke link

Oh, huh, yeah, that makes sense. That would totally work. Cool idea.

reply

[–] rurban link

I'm also planning to add lazy parsing to perl. Do you store the whole string of the method body or do you mmap your source files, and store only the mmap ptr, offset and length for the body?

reply

[–] tomstuart link

How difficult is lazy parsing for Ruby? How much parsing do you need to do just to find where the method body ends?

reply

[–] chrisseaton link

You basically have to do all the parsing, but you can delay creating the actual AST and other data structures like byte code, which for us is the really expensive bit.
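A toy sketch of the "delay the expensive part" idea, in plain Ruby rather than inside a VM. It keeps only the source text of each body and turns it into bytecode the first time it is called. `LazyMethods` and its methods are hypothetical names, and MRI's `RubyVM::InstructionSequence.compile` is only a stand-in here; this is not how TruffleRuby implements lazy parsing, and it skips the "find where the body ends" step entirely by being handed the body up front.

```ruby
# Hypothetical illustration of deferred compilation; MRI-specific.
class LazyMethods
  def initialize
    @sources = {}   # name => source text of a lambda literal
    @compiled = {}  # name => callable, filled in on first call
  end

  # Registering a body is cheap: only the string is stored.
  def define(name, lambda_source)
    @sources[name] = lambda_source
  end

  def compiled?(name)
    @compiled.key?(name)
  end

  # The expensive step (building the instruction sequence) happens here,
  # once, and never for bodies that are never called.
  def call(name, *args)
    body = @compiled[name] ||=
      RubyVM::InstructionSequence.compile(@sources.fetch(name)).eval
    body.call(*args)
  end
end

methods = LazyMethods.new
methods.define(:double, "->(x) { x * 2 }")
puts methods.compiled?(:double) # nothing compiled yet
puts methods.call(:double, 21)
puts methods.compiled?(:double)
```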

reply

[–] tomstuart link

Makes sense, thanks. So you end up redoing most of this (parsing) work later when you want the AST, but a) you might not have to redo it at all if the method's never actually called, and b) it's not the expensive bit anyway?

reply

[–] chrisseaton link

Yes.

But if you have source code that you know you will likely be requiring such as the standard library, you can do the initial parsing while compiling the Ruby VM, so you don't end up doing the parse twice at runtime.

Long term what we hope to do is to provide a build of the Ruby VM that includes the version of Rails you are using pre-parsed.

And then longer term we'd like to actually fully parse and initialise Rails (run the top level of the files which are loaded) during compilation, and freeze the heap and store it in the Ruby VM executable. When you run this special Ruby/Rails VM the Rails code is simply mmapped into your address space with all objects initialised and ready to go.

Obviously it'll require some tweaking to delay doing things like starting the web server, so that doesn't get run during compile time.

reply

[–] burke link

We've talked about trying to implement this strategy (load everything, dump/restore the heap) with MRI, but our thought experiment was on the scale of a fully-booted application. In that context, it gets difficult to determine how to proceed when an application source file has changed, since re-loading isn't safe in ruby.

It's a really interesting idea to pre-load the heap with just a set of libraries, which wouldn't be subject to as much change.

reply

[–] chrisseaton link

In the implementation of Ruby that I work on, TruffleRuby, we've been exploring lazy parsing, where the parser will find a method but not fully parse it until the method is called for the first time. I wonder if there are any other modifications you could make to the VM itself to improve startup time.

reply

[–] kellysutton link

Hi burke!

This definitely looks interesting. Boot times for majestic monoliths are a pain that I've experienced many times.

How does this fit in with zeus and/or spring?

How similar is this to bootscale? (https://github.com/byroot/bootscale) Or rails-dev-boost? (https://github.com/thedarkone/rails-dev-boost)

reply

[–] burke link

> How does this fit in with zeus and/or spring?

Two orthogonal optimizations: It definitely plays nice with spring, speeding up the pre-fork rather a lot, and the post-fork a little bit. I can't think of any reason it wouldn't also work with zeus, but I haven't tried it.

> How similar is this to bootscale?

The load-path-caching features are a minor evolution of bootscale. The major difference is that the caching is a little more aggressive in order to be confident enough to return definitive negative results when it thinks a feature is non-present on the load path.

I remember using rails-dev-boost years ago, but I can't really remember what it does (EDIT: Should be a similar story to Spring -- complementary optimizations)

reply

[–] kellysutton link

Cool. Thanks!

reply

[–] johne20 link

Nice work Burke! I saw the title, and I immediately thought that has to be Burke. Sure enough... :)

reply

[–] burke link

Old habits die hard :)

reply

[–] wolco link

Curious: when did Shopify go with Ruby/Rails? If I remember correctly, when the company was initially started they were looking for PHP developers. Was the original stack built in Ruby/Rails?

reply

[–] burke link

Nope, we've been Rails since before Rails was even public. I'm sure we've hired a PHP developer here and there over the years, but our core platform has always been Ruby.

reply

[–] ksec link

>Nope, we've been Rails since before Rails was even public.

How is that possible? Or you mean Ruby before Rails?

reply

[–] burke link

Tobi knows DHH; apparently we started building with rails before its public release. Don't quote me on that, but I've heard it in passing a few times.

reply

[–] halostatue link

Probably before Rails 1.0, certainly.

I don’t remember if Tobi was involved in the Ruby community in late 2004 when DHH introduced Rails at RubyConf 2004 (in DC), and I don’t remember him being at that conference, either.

But I do remember seeing Tobi involved in ruby-talk by early 2005.

But Shopify has always been a Rails shop.

reply

[–] girvo link

Might be thinking of Etsy, they (are/were?) PHP

reply

[–] daviding link

Does it work on a Heroku stack?

reply

[–] trustfundbaby link

Doesn't seem to work for me. Deploy fails with massive stacktrace

  remote:  !     Could not detect rake tasks
  remote:  !     ensure you can run `$ bundle exec rake -P` against your app
  remote:  !     and using the production group of your Gemfile.
  remote:  !     rake aborted!
  remote:  !     Errno::ENOSPC: No space left on device

reply

[–] burke link

load path caching should; compilation caching won't, for now at least.

reply

[–] burke link

I'm the primary author, can answer questions if you have any.

reply

[–] burke link

I've tossed around the idea of writing zeus again now that I actually understand the language I wrote it in. Spring is much simpler, but because of the manner in which it's loaded, it isn't capable of detecting certain types of file change, which reduces developer confidence in it.

Zeus is capable of detecting any sort of invalidating file change, but is pretty buggy (or at least was historically -- the Stripe guys improved it a lot after I stopped working on it).

reply

[–] ischi link

Still fairly buggy; some terminal issues should be fixed now, but reloading still has race conditions.

reply

[–] dobs link

Gave this a quick shot on my own monolithic app and it cut startup time almost in half. Impressive considering how easy it was to configure!

Startup time was one reason we started migrating away from Rails in a previous workplace, between frustrating startup time in development and test and occasional quirkiness of zeus and spring. Bootsnap would have been a godsend.

reply

[–] burke link

Sadly, the 2.3 requirement is inherent, since the RubyVM::InstructionSequence dump/load API was introduced in 2.3.0. However, you could probably still benefit from http://github.com/byroot/bootscale.
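For anyone curious what that API looks like, here is a minimal sketch of a compile cache built on it. `load_with_iseq_cache`, the cache directory, and the key scheme are hypothetical and chosen for illustration; bootsnap's real storage and invalidation are different. The binary format is tied to the Ruby build that produced it, so the version is folded into the key.

```ruby
require "digest"
require "fileutils"
require "tmpdir"

# Hypothetical helper: evaluate `path`, reusing cached bytecode when the
# file has not changed since it was last compiled.
def load_with_iseq_cache(path, cache_dir)
  stamp = "#{RUBY_VERSION}:#{File.expand_path(path)}:#{File.mtime(path).to_f}"
  cache_file = File.join(cache_dir, "#{Digest::SHA1.hexdigest(stamp)}.iseq")

  iseq =
    if File.exist?(cache_file)
      # Warm path: skip parsing and compiling, load the dumped bytecode.
      RubyVM::InstructionSequence.load_from_binary(File.binread(cache_file))
    else
      # Cold path: compile once and dump the result for next time.
      compiled = RubyVM::InstructionSequence.compile_file(path)
      FileUtils.mkdir_p(cache_dir)
      File.binwrite(cache_file, compiled.to_binary)
      compiled
    end

  iseq.eval
end

Dir.mktmpdir do |dir|
  source = File.join(dir, "sample.rb")
  File.write(source, "40 + 2\n")
  puts load_with_iseq_cache(source, File.join(dir, "cache")) # compiles
  puts load_with_iseq_cache(source, File.join(dir, "cache")) # loads from cache
end
```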

reply

[–] guu link

Have you considered trying mruby? That would allow you to ship standalone binaries.

https://github.com/hone/mruby-cli

reply

[–] burke link

I can't believe I hadn't thought of this. Could be a really useful idea!

reply

[–] jacobevelyn link

As a fellow Ruby CLI developer, I feel your pain exactly. I've been planning on exploring Traveling Ruby[^1] for exactly this reason (as well as the fact that telling users they need to `sudo gem install` something is non-ideal) but hadn't yet gotten around to it.

Out of curiosity, what's your tool(s)?

[^1]: https://github.com/JacobEvelyn/friends/issues/160

reply

[–] jitl link

I write an internal developer tool at Airbnb that is conceptually similar to Vagrant.

We get around the 'sudo gem' problem by distributing our tool as a git repo, then bootstrapping a vendored install of Bundler to manage our own little gem path using /usr/bin/ruby. We take care to remove most ruby-related env vars during init so we're safe from whatever crazy RBENV or RVM shenanigans are happening on the system. This setup works fine, but we don't get recent language perf improvements since we use system Ruby.

reply

[–] burke link

Heh. I write an internal developer tool at Shopify that is (somewhat) conceptually similar to vagrant. (https://twitter.com/burkelibbey/status/858013844626649092)

Like you, we distribute it as a git repo. We don't use bootsnap with it, but we have a few strategies that give us reasonable times:

    $ time ./bin/dev help up >/dev/null
    0.07s user 0.03s system 98% cpu 0.102 total
* We vendor every dependency (and try really hard to avoid them in the first place -- we have 5, only one of which is >5 source files), and prevent loading rubygems. Rubygems takes a long time to load. Our shebang is `/usr/bin/ruby --disable-gems`.

* Autoload everything. Our toplevel lib/dev.rb file is a whole-namespace autoload registry. Only a few other constants are defined there. Everything is loaded just by cascading through autoloads.

* Defer stdlib requires: We load most stdlib features within the method body from which they're used. Several stdlib features take a surprisingly long time to load.
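The autoload registry and the deferred stdlib requires could look roughly like this. It is a sketch under assumptions: `Dev::Up` and `Dev::Checksum` are made-up names, and the temporary directory stands in for a vendored `lib/` directory that a real tool would ship.

```ruby
require "tmpdir"

# Stand-in for the tool's lib/ directory (never cleaned up in this sketch).
lib_dir = Dir.mktmpdir("dev-lib")
Dir.mkdir(File.join(lib_dir, "dev"))
File.write(File.join(lib_dir, "dev", "up.rb"), <<~RUBY)
  module Dev
    class Up
      def self.call
        "up"
      end
    end
  end
RUBY
$LOAD_PATH.unshift(lib_dir)

module Dev
  # Whole-namespace registry: dev/up.rb is not read until Dev::Up is first
  # referenced, so `dev help` never pays for the `up` command.
  autoload :Up, "dev/up"

  module Checksum
    def self.sha256(text)
      # Deferred stdlib require: only commands that hash something load digest.
      require "digest"
      Digest::SHA256.hexdigest(text)
    end
  end
end

p Dev.autoload?(:Up) # still pending, nothing loaded yet
puts Dev::Up.call    # first reference triggers the load
```

Repeated `require` calls for an already-loaded feature are cheap, so putting the `require` inside the method body costs little after the first call.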

reply

[–] jitl link

I pulled these requires into the method bodies where they are used:

- openssl

- digest

- resolv

- rgl

- net/ssh

- (internal http client)

- (internal package manager)

Doing this saved about 40% of our CLI boot time:

    $ time /usr/local/bin/airlab > /dev/null
    /usr/local/bin/airlab > /dev/null  0.27s user 0.15s system 79% cpu 0.521 total
    $ git co jake--no-rubygems
    $ time /usr/local/bin/airlab > /dev/null
    /usr/local/bin/airlab > /dev/null  0.24s user 0.08s system 98% cpu 0.326 total

reply

[–] jitl link

Great tips! We should do the refactor work to switch to lazy-loading everything, but that will take some time. I can certainly get the Rubygems savings today though.

reply

[–] jacobevelyn link

Thanks for the tips! I'm definitely going to use these as well.

reply

[–] jitl link

This is awesome, and I'd love to use it for the command-line dev tools that I write. Unfortunately this gem requires Ruby 2.3+, but macOS built-in Ruby, which is the Ruby we target, is only 2.0.0.

Does anyone know of a good solution for prebuilt, relocatable Rubies on macOS that I could easily bundle with my tool? I'm reluctant to use Homebrew or another package manager like rbenv, where I'd have to implement a non-trivial bootstrap process. Phusion's travelling-ruby project would be perfect, but it's unmaintained.

I just want my CLI to boot in 0.05s without needing to change languages. Love Ruby, but getting decent perf takes a bit of effort.

reply

[–] burke link

The largest culprit for slow Ruby boot times is an O(n) number of syscalls over the LOAD_PATH each time `require` is called, so the total number of syscalls is essentially O(n^2) in the number of gems. The load-path-caching feature of bootsnap (cf. bootscale) fixes this, and accounts for a reduction from 25 to ~9.5 seconds. The iseq/yaml caching only accounts for the last ~3.5 seconds.
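To make the load-path idea concrete, here is a rough sketch: scan every load-path directory once, build a feature-to-path table, and answer each later lookup from the table, including a definitive "not there". `LoadPathIndex` is a hypothetical name, and this is not bootsnap's or bootscale's actual code; real implementations also handle native extensions and invalidate the table when directories change.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical illustration of load-path caching.
class LoadPathIndex
  def initialize(load_path)
    @index = {}
    load_path.each do |dir|
      # One directory walk per entry, done once, instead of probing every
      # entry again on every require.
      Dir[File.join(dir, "**", "*.rb")].sort.each do |path|
        feature = path[(dir.size + 1)..-1].sub(/\.rb\z/, "")
        @index[feature] ||= path # earlier load-path entries win
      end
    end
  end

  # Returns the absolute path, or nil as a definitive negative result.
  def resolve(feature)
    @index[feature]
  end
end

Dir.mktmpdir do |root|
  gem_a = File.join(root, "gem_a", "lib")
  gem_b = File.join(root, "gem_b", "lib")
  FileUtils.mkdir_p(File.join(gem_a, "a"))
  FileUtils.mkdir_p(gem_b)
  File.write(File.join(gem_a, "a", "util.rb"), "")
  File.write(File.join(gem_b, "b.rb"), "")

  index = LoadPathIndex.new([gem_a, gem_b])
  puts index.resolve("a/util")
  p index.resolve("missing")
end
```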

reply

[–] lathiat link

Aaron Patterson gave a really great talk detailing the process, called "Code Is Required". He's a great presenter: funny, and he often manages to explain fairly technical things very understandably. Highly recommend watching this (and his other stuff).

you can watch it here: https://www.youtube.com/watch?v=_bDRR_zfmSk

reply

[–] EvilTrout link

Discourse co-founder here.

I'm not sure about those stats you posted from 3 years ago since they aren't using the same `rake stats` numbers that are built in to Rails. Discourse's Rails app is currently 63k SLOC not including tests.

On my relatively fast computer booting takes 4s without bootsnap and 2.5s with it, which is a nice quality of life improvement.

reply

[–] jitl link

In this case, you need to analyze not only the application's source code, but also the size and quantity of its dependencies, which inflate Ruby's LOAD_PATH, which as discussed makes `require` slow. The issues raised here are typical for a large Ruby application with many gem dependencies.

I think it's safe to assume the author is using a reasonable SSD on a MacBook Pro, given that the iseq cache targets only macOS.

reply

[–] choward link

Does your Python app load all at once or lazily load as you hit different parts of the app?

reply

[–] dismantlethesun link

It loads all at once, part of the bootstrap does a check of all the mvc modules and templates.

I'm not 100 percent sure if third-party modules for background tasks get loaded at the same time, but they aren't part of my line-of-code count.

reply

[–] undefined link
[deleted]

reply

[–] burke link

FWIW, the machine that generated all of those times:

* MacOS Sierra

* 2.6 GHz Intel Core i7

* 16GB 2133 MHz LPDDR3

* 500GB SSD, whatever Apple ships.

reply

[–] est link

If you try zc.buildout, your Python code's start time will get significantly worse: it inserts a gazillion sys.path entries.

reply

[–] dismantlethesun link

I'm kinda shocked that Ruby boot times can be up to 25 seconds for a monolithic app.

A Python project I work on has 279,124 lines of code and boots up in 2.5 seconds.

Without downloading it, all I can find is Discourse had 60,000 lines of code 3 years ago [1]. Assuming as an extreme estimate they tripled their code size in 3 years, we have 180,000 LOC taking 6 seconds to boot up according to the article.

Is this normal for Ruby? Is the author using a spinning disk drive rather than an SSD?

[1] https://github.com/bleonard/rails_stats

reply

[–] burke link

Yep, you're probably using linux. The cache backend for compiled artifacts is filesystem extended attributes, which have a maximum size of 64MB on darwin, but as little as 4kB on some linux configurations (if they're even enabled, which they often are not).

Practically speaking, the compilation caching features are not supported on linux. Eventually we'll change the cache backend or add a different one that does work on linux.

reply

[–] Cerium link

Yes. I am using Linux. Thanks for the quick response.

reply

[–] Cerium link

Thanks for releasing this, I gave it a try.

Starting benchmark time: 13.05 seconds. With load_path_cache: 10.01 seconds

Sadly, with compile_cache on I'm getting an error. /vendor/bundle/ruby/2.3.0/gems/bootsnap-0.2.14/lib/bootsnap/compile_cache/iseq.rb:30:in `fetch': No space left on device (Errno::ENOSPC)

Any ideas on what causes this?

reply

[–] deedubaya link

Avoiding a flame war, it depends on what your goals are.

From a language standpoint: Ruby emphasizes developer happiness at the expense of some things, like performance/concurrency for example.

From a career standpoint: There is a lot of ruby in the world today. There will be lots of applications to maintain as the years go on, which is +1 from a career perspective. Lots of people will also continue to write new ruby software, because it's effective and easy to be productive in.

All the languages you mentioned + ruby are all good languages to learn for various reasons. All have their weaknesses and strengths. None of them are an effective hammer for every nail you'll encounter.

reply

[–] camus2 link

> There is a lot of ruby in the world today

If you live on the West Coast, certainly. Anywhere else in the world, absolutely not.

> Ruby emphasizes developer happiness at the expense of some things,

That's a strange statement. Plenty of developers enjoy writing PHP, or C++ or even Java. Ruby doesn't make developers more happy, by no serious metrics.

Ruby had its shot but wasted it because of the petulance, the arrogance, the immaturity and the toxicity of its community.

> None of them are an effective hammer for every nail you'll encounter.

Ruby (in fact Rails since that's really what it is all about) is clearly redundant in the era of light weight servers and thick clients.

reply

[–] paulddraper link

You are being unduly pessimistic (or maybe petulant/arrogant/toxic).

Ruby is the language for the second most popular HTTP MVC framework (Rails) and the first most common popular tool (Chef).

My biggest gripe with Ruby is that the community seems more amateurish than average. SO Ruby questions are like Javascript questions in 2008, with a lot of misinformation and the assumption that you were using jQuery (or Rails for Ruby). I'm sure there are a lot of Ruby experts that know how to program well. They just don't seem as common as, say, in the Python community.

reply

[–] tychver link

"I always thought Smalltalk would beat Java. I just didn't know it would be called 'Ruby' when it did so." - Kent Beck

There's a decent demand for Ruby developers worldwide. I worked in New Zealand before getting paid relocation to Germany as a Ruby dev and I'm still getting daily "come to London" LinkedIn spam for Ruby work.

"Developer happiness" is how Matz describes minimising friction between the developer and the language. Ruby aims to provide abstractions with the lowest possible cognitive overhead. Contrast this with Rust which aims to provide abstractions with the lowest possible performance overhead. Everything is a trade-off.

Ruby's stagnation has nothing to do with the community. Improving Ruby performance is extremely difficult because of the sheer flexibility of the language. It doesn't help that until fairly recently no one got paid to work on Ruby. Now Heroku are paying several of the core team.

The IBM OMR project is working on bringing a companion JIT to standard Ruby. Oracle are working on a ground up re-implementation of Ruby using a new technology called Truffle + Graal for implementing languages on top of the JVM with performance which will be on par with Go.

reply

[–] qmr link

PHP is an absolutely horrible language. I suggest learning ruby or python.

reply

[–] spo81rty link

And .NET Core/C#

reply

[–] quotha link

5.times { print "Odelay!" }

reply

[–] gabrielc link

or elixir/erlang.

reply

[–] zapt02 link

I think Ruby is essentially dead in the water. It just has no unique selling point. Python is better at small scripts, machine learning and mathematical applications. PHP7 has much better tooling for HTTP, the largest CMS systems and doesn't have any boot time to speak of. JS & Node is the new kid on the block with tons of great libraries being written for it. Why would you start to learn Ruby today?

reply

[–] Something1234 link

Using jekyll and maintaining existing stacks.

reply

[–] ausjke link

Considering PHP7, Java8/Kotlin, Go, C++17, Python3, Javascript/ES6 etc. these days, how will Ruby be doing in the long run? Any reason for newcomers to pick up Ruby instead of the mentioned list? I just started using PHP myself.

reply

[–] yxhuvud link

For starters, because it only works on mac.

reply

[–] omarforgotpwd link

Might have missed something, but why not just merge these changes into Rails?

reply

[–] burke link

No. I'm not opposed to it, but we don't use it at Shopify and I doubt the RubyVM::InstructionSequence API is compatible.

Bootscale should work, and the load-path-caching feature of bootsnap should work too, if you can get the gem to install.

reply

[–] misterbowfinger link

Are there plans to support JRuby?

reply

[–] undefined link
[deleted]

reply

[–] burke link

Honestly, it works well for us. DHH may have been a little rosier than necessary: there are some downsides, to be sure, but we can mitigate them to a large extent (e.g.: TFA), and we get a lot of benefit out of the architecture.

It is definitely a net positive for us. YMMV, of course.

reply

[–] noir_lord link

> In 2017 there is really no reason to defend a monolithic architecture.

I wonder if in 2019 I'll be seeing "In 2019 there is really no reason to defend a micro-services architecture".

The pendulum it keeps on swinging.

reply

[–] rhizome link

thin/thick clients all over again.

reply

[–] noir_lord link

Yep and others, I've been around programming long enough to have seen that come and go several times now.

reply

[–] iagooar link