Factual Blog / Tagged:


A Day in the Life of a Factual Engineer: Polygon Compression

In this series of blog posts, Factual’s blog team asked engineers to describe what a typical day looks like.

Background: Chris Bleakley, our resident polygon and Lucene expert, had written meticulous documentation about the problem he was solving. The first paragraph read: “Because search times are dominated by the cost of deserializing JTS objects from when...

The Humongous nfu Survival Guide

GitHub: github.com/spencertipping/nfu

A lot of projects I’ve worked on lately have involved an initial big Hadoop job that produces a few gigabytes of data, followed by some exploratory analysis to look for patterns. In the past I would have sampled the data before loading it into a Ruby or Clojure REPL, but increasingly I’ve started to...

nfu: Command-line Numeric Fu

Note: Explore nfu on GitHub (github.com/spencertipping/nfu).

We often use the UNIX command line for ad-hoc data crunching. Most of the time we have the good sense to use a better tool after the first 100 characters or so, but sometimes we’ll just blow past the right margin with a string of sort, uniq -c, sort -nr,...