Profiling Node.js

It’s incredibly easy to visualize where your Node program spends its time using DTrace and node-stackvis (a Node port of Brendan Gregg’s FlameGraph tool):

  1. Run your Node.js program as usual.
  2. In another terminal, run:
    $ dtrace -n 'profile-97/execname == "node" && arg1/{
        @[jstack(100, 8000)] = count(); } tick-60s { exit(0); }' > stacks.out

    This will sample about 100 times per second for 60 seconds and emit results to stacks.out. Note that this will sample all running programs called “node”. If you want a specific process, replace execname == "node" with pid == 12345 (the process id).

  3. Use the “stackvis” tool to transform this directly into a flame graph. First, install it:
    $ npm install -g stackvis

    then use stackvis to convert the DTrace output to a flamegraph:

    $ stackvis dtrace flamegraph-svg < stacks.out > stacks.svg
  4. Open stacks.svg in your favorite browser.

You’ll be looking at something like this:

This is a visualization of all of the profiled call stacks. This example is from the “hello world” HTTP server on the Node.js home page under load. Start at the bottom, where you have “main”, which is present in most Node stacks because Node spends most on-CPU time in the main thread. Above each row, you have the functions called by the frame beneath it. As you move up, you’ll see actual JavaScript function names. The boxes in each row are not in chronological order, but their width indicates how much time was spent there. When you hover over each box, you can see exactly what percentage of time is spent in each function. This lets you see at a glance where your program spends its time.

That’s the summary. There are a few prerequisites:

  • You must gather data on a system that supports DTrace with the Node.js ustack helper. For now, this pretty much means illumos-based systems like SmartOS, including the Joyent Cloud. MacOS users: OS X supports DTrace, but not ustack helpers. The way to get this changed is to contact your Apple developer liason (if you’re lucky enough to have one) or file a bug report at bugreport.apple.com. I’d suggest referencing existing bugs 5273057 and 11206497. More bugs filed (even if closed as dups) show more interest and make it more likely Apple will choose to fix this.
  • You must be on 32-bit Node.js 0.6.7 or later, built --with-dtrace. The helper doesn’t work with 64-bit Node yet. On illumos (including SmartOS), development releases (the 0.7.x train) include DTrace support by default.

There are a few other notes:

  • You can absolutely profile apps in production, not just development, since compiling with DTrace support has very minimal overhead. You can start and stop profiling without restarting your program.
  • You may want to run the stacks.out output through c++filt to demangle C++ symbols. Be sure to use the c++filt that came with the compiler you used to build Node. For example:
    c++filt < stacks.out > demangled.out

    then you can use demangled.out to create the flamegraph.

  • If you want, you can filter stacks containing a particular function. The best way to do this is to first collapse the original DTrace output, then grep out what you want:
    $ stackvis dtrace collapsed < stacks.out | grep SomeFunction > collapsed.out
    $ stackvis collapsed flamegraph-svg < collapsed.out > stacks.svg
  • If you’ve used Brendan’s FlameGraph tools, you’ll notice the coloring is a little different in the above flamegraph. I ported his tools to Node first so I could incorporate it more easily into other Node programs, but I’ve also been playing with different coloring options. The current default uses hue to denote stack depth and saturation to indicate time spent. (These are also indicated by position and size.) Other ideas include coloring by module (so V8, JavaScript, libc, etc. show up as different colors.)

For more on the underlying pieces, see my previous post on Node.js profiling and Brendan’s post on Flame Graphs.


Dave Pacheco blogs at dtrace.org

This entry was posted in Uncategorized. Bookmark the permalink.

8 Responses to Profiling Node.js

  1. Looking forward to have DTrace enabled and 64-bit support in upcoming Node release, been using x64 in production in 6 months, would love to do such profiling to optimize the application logic as soon as possible.

  2. Dave Pacheco says:

    @David: Thanks for that feedback. What OS are you running on?

  3. NodeJs News says:

    (Sorry I hit the Post button to fast and leave an empty name and website)

    Hi,

    I wrote 2 articles with video about Dtrace and SmartOS (see below). Really thank’s to you for this tutorial. Anyway, we have tried to use SmartOS, but setup is a little tricky, even for guys like us who daily use Linux.

    Is SmartOS a full fork of OpenSolaris, and you follow there dev stream or does it stand beside ?

    Thx

    http://www.nodejs-news.com/NodeConf/dtraceconf2012-dtrace-in-nodejs
    http://www.nodejs-news.com/nodejs-tech/profiling-nodejs-app

    • Dave Pacheco says:

      @NodeJs News: Thanks for the comment, and sorry you’ve had trouble with SmartOS. If you haven’t already checked it out, there’s a pretty good community in #smartos and #illumos on IRC and the respective mailing lists. OpenSolaris is dead. That was Sun’s distribution, which Oracle killed in 2010. illumos (initially a fork of OpenSolaris) is now where the dev actually happens. SmartOS is an open source fork that closely follows illumos. I’ll be sure to check out those posts!

      • NodeJs News says:

        Thank’s for the quick answer ! We will dive into IRC for help request 😉 As we aren’t used to Opensolaris I wasn’t aware that Oracle ended it 😦
        Thx

  4. anup10000 says:

    This seems good. Is there anything planned on similar lines for guys like me who use Linux? Curious to know why it was done only for Solaris. Is there anything here, that a Linux user can take advantage of?

    • Dave Pacheco says:

      @anup1000: We used DTrace for this (rather than writing our own tool from scratch) because it already provides a robust framework for tracing both userland and kernel events in production. All we had to do was describe (in D) how to take a userland stack trace and print something meaningful. By using DTrace, we’re also able to capture userland stacktraces not just for profiling, but at arbitrary events in both the kernel in userland, like when the kernel scheduler takes a thread off-cpu, when a disk i/o is dispatched or completed, when an HTTP request is received, and so on. We’ve used this to debug several nasty non-performance issues (by identifying the JavaScript code that’s responsible for causing a system-level event).

      Besides illumos/Solaris-based systems, this approach could in principle work on OS X and BSD, since both have DTrace implementations, except that OS X’s doesn’t support ustack helpers at the moment (see above). I don’t know whether BSD’s does. There’s also a Linux DTrace port in progress.

      Brendan Gregg has made flame graphs using Linux’s SystemTap, with mixed results (see http://dtrace.org/blogs/brendan/2012/03/17/linux-kernel-performance-flame-graphs/). I don’t know whether SystemTap has anything analogous to the ustack helper for printing a human-readable name for JITed code (which normally just shows up as hex addresses, since they don’t correspond to functions in the binary’s symbol table).

  5. anup10000 says:

    Thanks Dave for the quick answer. I guess I will have to pull up a SmartOS instance to profile my app. I will need the “ustack helper” I believe.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s