Merry Christmas from Byte Mining!

To all of my readers and followers, I wish you a very Merry Christmas and a very joyous and safe Happy New Year! I will be spending the holidays coding on my new Motorola Droid X (goodbye AT&T!).

Some Lessons in Production Development (Hadoop) - Part 1

Wow. I can’t believe it has been a month since I have posted. On December 1, I started a new chapter in my life, working full time as a Data Scientist at the Rubicon Project. Needless to say, that has been keeping me occupied, as well as thinking about working on my dissertation. For the time, I am getting settled in here.

When I accepted this position, one of my hopes/expectations would be to become professionally competent and confident in C, Java, Python, Hadoop, and the software development process rather than relying on hobby and academic knowledge. That is something a degree cannot help with. It has been a great experience, although very frustrating, but that is expected when jumping into development professionally.
    

I am writing this post to chronicle what I have learned about using Hadoop in production and how it majorly differs from its use in my research and personal analysis.
To start, I was asked to check out a huge stack of code from a Subversion repository. But then what?

But you’re a Computer Scientist! This should be easy!

The first part is true, but there is a stark difference between a garden variety computer scientist and one that converts from [...]