Matthew Lindfield Seager

Matthew Lindfield Seager

So happy to solve a longstanding performance bug! 🎉

I finally solved a hugely annoying performance bug in one of my Rails apps! To give you an idea of how bad it had gotten, just before the fix an admin page I use 5-10 times every Saturday was averaging 21,114.2 milliseconds per load!!! 😱 Although really, when the times are that large, milliseconds is probably the wrong unit of measurement… That page was taking 0.00024 days to load!!! And the trend was only getting worse!

That same page is now averaging 22.4 milliseconds, about 3 orders of magnitude quicker!

I’d been trying to figure it out for months but I was hampered by a combined lack of:

  • motivation; public facing pages were “only” taking 2-4 seconds and no one had complained
  • expertise; I’ve never had to solve performance problems on my own before
  • tooling; I had inexplicable issues getting Rack Mini Profiler flamegraphs to work on the site, in development or on production 🤷‍♂️

The tooling problems were particularly frustrating so I owe a massive thank you to Nate Berkopec from Speedshop who not only puts out a ton of great content but was also very patient with my beginner questions about why my flamegraphs weren’t working. I didn’t end up figuring out that problem, but at Nate’s suggestion I switched to rbspy to create a flamegraph and within about an hour I’d figured out the problem that I’d previously spent months off and on trying to solve.

It turns out that every time I fetched a person from the database, my app was generating a random password for them. That process is a bit slow (possibly by design? to avoid timing attacks?? maybe???). Individually, 200 milliseconds to generate a password isn’t a big deal… but on the admin page I load a list of all the people, so every time a new person got added, the page slowed down by another 1/5 of a second 🤦‍♂️

In the end the fix was super simple, I now only generate the random password (and other default values) if the person isn’t already in the database:

# Before
after_initialize :set_default_values

# After
after_initialize :set_default_values, if: :new_record?

18 more characters to remove a 94,160% slow down! Plus now the user facing pages are down below 30 milliseconds too! 🥳

For future reference, here’s how I tracked down the issue:

  • Installed rbspy using homebrew
  • Obtained my web server process’ PID from tmp/pids/server/pid
  • Captured a trace on my Mac using sudo --preserve-env rbspy record --pid 86229 --format speedscope --subprocesses
  • [Not tested but could maybe combine the above two steps with something like sudo --preserve-env rbspy record --pid `cat tmp/pids/server/pid` --format speedscope --subprocesses]
  • Request the offending page or pages then go back and end the rbspy recording with control-c
  • Drag the resulting speedscope file from ~/Library/Caches/rbspy into speedscope
  • Found the offending call at the very bottom of the flamegraph “stalactites”