Data-Driven SXSW 2012: Vote Now

South by Southwest (SXSW) invites the Internet community at large to vote on the 3,285 panel and speaker proposals. This popular input is balanced with editorial and staff contributions to decide the 300+ presentations and panels that will form SXSW Interactive 2012.

The Panel Picker closes at midnight on Friday, Sept 2. Get your votes in for your favorite Big Data proposals. Here’s a small selection, including our own proposal with colleagues Pete Warden of Jetpac, George Oates from the Open Library, and Ian Eure from SimpleGeo:

Creating an Internet of Entities (Factual et al.) The Internet today consists of a morass of redundant content: the ~17m businesses and POI in the US, for example, are duplicated across more than 1.2 billion web pages on over 5 million domains. This tangle of duplicate, fragmentary, and often incorrect information ensures that unequivocally identifying a person, place or thing on the Internet will always be a challenge. The members of this panel are working to fix this, and will discuss their projects in the Library, Geo, and Big Data sectors that are creating an Internet where real-world things and concepts can be referenced unambiguously. The panel focuses on pragmatic, real-world examples: the panelists each highlight their specific experiences in creating platforms and apps that identify and disambiguate individual entities across applications and verticals, and describe both the pitfalls and benefits of working towards an Internet of Entities.

Disambiguation: Embrace wrong answers & find truth (Infochimps) Where are all the coffee shops in my neighborhood? Seemingly easy questions can become complex when you consider ambiguity. This one sounds simple until you consider that folks may define “coffee shop” differently and the boundaries of your “neighborhood” differently. One person’s Central Austin may be someone else’s South Dallas. How about, instead of working too hard to define the parameters in an attempt to completely remove the ambiguity, we look at what people do, interact with and talk about? We can watch what people do and decide from there what a coffee shop is and where the boundaries of your neighborhood are. It might not be the “truth”, but it can be darn close. When we learn to embrace ambiguity, not only can we still find the answers to our questions, but we can also find answers to questions we hadn’t even thought to ask.

Big Data: Powering the Race for the White House (Engage et al.) Despite the advent of new media, campaigns for President still measure the electorate in pretty much the same way they did 40 years ago, through traditional polls to landline phones. That could all change this year. The hottest job in today’s Presidential campaigns is the Data Mining Scientist – whose job it is to sort through terabytes of data and billions of behaviors tracked in voter files, consumer databases, and site logs. They’ll use the numbers to uncover hidden patterns that predict how you’ll vote, if you’ll pony up with a donation, and if you’ll influence your friends to support a candidate. This panel will delve deep into the world of real-time data on Presidential campaigns, showing how it’ll be used to make decisions on everything from the layout of a signup form to where to spend millions of advertising dollars in the closing days of a campaign. Forget about which candidate has the most likes on Facebook or followers on Twitter – and learn why 2012 will be the year of Big Data in American politics.

How to get super powers with crowdsourcing (Gigwalk) Ever looked at your e-mail inbox and thought to yourself “If only I could replicate me, maybe I could get all this stuff done?” Ever needed to validate data in 100 different locations – such as every billboard you’ve placed across America – all at once? Ever wondered which market in your neighborhood has the cheapest beer? Ever thought, hmmm wouldn’t it be funny if Moby Dick was translated into emoji emoticons? Well, you can do all of that and more – and you don’t really need super powers to make it happen. Just a crowd. A tech-savvy, distributed, connected crowd who can extend you. This presentation will provide a serious and sometimes humorous look into crowdsourcing and the impact it has on the work we do.

Freakin’ fast Cassandra: how do they do it? (Twitter, Acunu) Ryan King (System Engineer at Twitter) and Tom Wilkie (co-founder and VP engineering at Acunu) delve into the bowels of Apache Cassandra, the highly scalable second-generation distributed database in use at Twitter, Netflix and many others. In this talk, they’ll look at how Cassandra works and show you how to make it growl! This dual session will share the journey the Cassandra team at Twitter has taken to make Cassandra deliver on its promises, while Acunu will talk about the dramatic performance improvements that take place when you move some of the heavy lifting into the Linux kernel by using the open source storage engine for Big Data, codenamed Castle.

- Tyler Bell