
Factual’s API: A Good Fit for Node

Most people have heard the buzz surrounding Node. Words like “fast”, “scalable”, and “concurrency” come to mind. At Factual, we pride ourselves on using (and finding) the right tool for the job. We use everything from jQuery to Hadoop, PostgreSQL to MongoDB, and fortunately our engineering culture gives us the leeway to experiment a little. However, any technology we deploy has to be measured and justified. Qualities such as agility, performance, stability, and cost are all part of the equation.

We understand that there are many misinformed developers out there who think Node means instant scalability and performance. The truth is, as with many technologies, it solves a very specific problem. What does it solve? How well does it solve it? What are the trade-offs? This is our brief experience with Node and how it worked for us.

We’ve been using JavaScript since the very first iteration of our product: an interactive grid that allowed users to upload, update, and fuse datasets into usable visualizations. At the time, I created a Ruby library that let us build on some of the fundamentals of traditional object-oriented programming (now known as Mochiscript).

Since then, we’ve shifted our focus to curated data and an API that can deliver it quickly. This major shift in our product came with a whole new set of requirements. We now had to deliver responses in under 200ms while handling basics like authentication, permission checking, real-time statistics, and query processing. We also had to do this in an agile language that allowed us to build out features quickly in response to popular developer requests.

Since we had experience in Ruby, we built our first prototype on a barebones Sinatra stack. This gave us decent performance (+20ms on top of our datastores with 120 concurrent connections), but it still didn’t scale well enough for the traffic we were anticipating.

At this point, we were considering Java and/or Clojure (both of which are used by other teams at Factual) but decided to look into Node because we needed to be agile and were already quite familiar with JavaScript.

After porting Mochiscript from Ruby to Node and seeing just how fast Google’s V8 JavaScript engine performed, we decided to write a prototype and pit it against our Ruby version. The difference was outstanding: +10ms on top of our datastores while processing 400 concurrent connections. The combination of Node’s performance and the ability to use Mochiscript to help us scale our code base allowed us to go from prototype to production in a couple of months. After a few more optimizations, we’re now under +5ms per request.

Before we dive too deep into praising Node, let’s list out some of the tradeoffs we made:

  • Still an immature framework (we discovered a socket leak in earlier versions of its HTTP library)
  • Spaghetti code: callbacks galore. This can be mitigated with good design and by sticking to certain patterns, but it’s still not fun.
  • Debugging can be soul-crushing at times (mostly due to the first two problems)

Node was such a good fit for us because:

  1. We were IO bound
  2. We used very little CPU per request
  3. We were able to use evented programming to help us aggressively cache

The first two reasons are nothing new. In fact, the combination of the two seems to be the poster child of what Node does well, and they fit our problem set perfectly. The third reason, however, is the one we want to shed some light on.

Using Node can be a bit of a paradigm shift from what many engineers are accustomed to. Callbacks can get out of control. However, the closures JavaScript provides can be leveraged in other ways. One use we found extremely valuable was reducing the number of redundant database queries.

Consider the following example of getting user data from the database:
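A minimal sketch of such a lookup, in the callback style Node encourages. The names here (`db`, `getUser`) are ours, and the in-memory object stands in for a real datastore client:

```javascript
// Hypothetical callback-style user lookup. `db` is an in-memory
// stand-in for a real datastore client; a real query would do I/O here.
var db = {
  users: { "u42": { id: "u42", name: "Alice" } },
  query: function (id, callback) {
    // process.nextTick simulates the asynchronous nature of a real query
    process.nextTick(function () {
      callback(null, db.users[id] || null);
    });
  }
};

function getUser(id, callback) {
  db.query(id, function (err, user) {
    if (err) return callback(err);
    callback(null, user);
  });
}

getUser("u42", function (err, user) {
  console.log(user.name); // "Alice"
});
```

Every call to `getUser` goes back to the datastore, even for a user we just fetched.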

Now let’s add a caching layer to this:
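A sketch of the same lookup with an in-process cache in front of it; again, the names (`fetchUser`, `userCache`) are hypothetical. Because the cache lives in a closure shared by every request this Node process handles, repeat requests for the same user never touch the datastore:

```javascript
// Hypothetical cached lookup. The cache object lives in a closure
// shared by every request handled by this Node process.
var userCache = {};
var cacheMisses = 0; // instrumentation for illustration

// Stand-in for the real datastore query
function fetchUser(id, callback) {
  process.nextTick(function () {
    callback(null, { id: id, name: "user-" + id });
  });
}

function getUserCached(id, callback) {
  if (userCache.hasOwnProperty(id)) {
    // Cache hit: answer without touching the datastore
    return process.nextTick(function () {
      callback(null, userCache[id]);
    });
  }
  cacheMisses += 1;
  fetchUser(id, function (err, user) {
    if (err) return callback(err);
    userCache[id] = user; // populate the cache on the way out
    callback(null, user);
  });
}
```

Note that the cache hit still answers via `process.nextTick`, so callers always get their callback asynchronously, whichever path is taken.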

Since we have caching, we need a way to invalidate this.  The Redis pubsub feature is a great way to handle things like this for realtime updates to your cache:

For fun, we have a call that gives us stats on urls that were visited:
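We can sketch the idea with made-up names (`recordHit`, `urlStats`): in-process counters kept in a closure and bumped on every request, then exposed through a stats endpoint:

```javascript
// Hypothetical in-process hit counters for visited URLs. Because the
// evented model runs our callbacks on a single thread, plain object
// updates like this need no locking.
var urlStats = {};

function recordHit(url) {
  urlStats[url] = (urlStats[url] || 0) + 1;
}

// Snapshot for a stats endpoint
function getStats() {
  return urlStats;
}

// Each request handler would call recordHit(req.url); for example:
recordHit("/t/places/read");
recordHit("/t/places/read");
recordHit("/t/restaurants-us/read");
```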

Using these patterns was a byproduct of the evented model Node/JavaScript employs, and it gives us real flexibility in caching, real-time statistics, and on-the-fly configuration. Our experience with Redis and memcached for caching was great, but in order to squeeze every ounce of performance from our Node servers, we wanted to limit any redundant processing (even to the point where we didn’t want to parse JSON).

We understand there are many cases where Node is NOT the solution. In our case, however, it was a great fit. Since we started using Node about a year ago, it has matured tremendously and has great support from both Joyent and the rest of the open source community. We’ve started using it for various other internal projects, and it is becoming more of a general solution and less of a niche one. I encourage any developer to explore Node and see if it is a good fit for their problem. If nothing else, it’ll get you thinking about IO and how much time you spend waiting on it.