It’s Been a While

These past three years have really flown. It’s time for me to finally get back to my roots and start blogging more, like I did previously.

My last post was about Strata 2013. During this time period, I was taking a break from working full-time to finish a Ph.D. dissertation that I had neglected during my previous two positions. I learned my lesson the hard way: never work externally if you want a Ph.D. in a reasonable amount of time! I quickly got my dissertation from an intro to the first 65 pages or so during this gap. I then received an offer from Facebook. I was ready to move to Silicon Valley and enjoy all the things I had been envious of for so many years: the perks, the culture of innovation and intelligence, and the technology community. This was an opportunity I could not pass up, and the dissertation went on the back burner for another two years as I spent the majority of my waking hours, both during the week and the weekend… and on holidays… coding in a frenzy. I was looking forward to living in a world where I was entrenched in the technology and data ecosystem. But…

The Grass Isn’t Greener on the Other Side

The technology community is definitely there and is obviously very strong, but it isn’t what I thought it would be. Due to the sheer size of the industry, meetups and other events were very impersonal compared to what I was used to in the LA area. Additionally, it seems that most of that original Silicon Valley startup energy has moved to San Francisco. To get to meetups, I would spend hours on shuttles, Caltrain, BART and Muni getting to SoMa, only to be disappointed by frequent company pitches instead of discussion of actual science and technology. Not all groups are like that; I attended plenty of meetups that were technical and whetted my appetite to learn more. Of course, there was also the question of whether I could even get into the meetup. The majority of the meetup groups I was a part of would fill up in a few hours for a hot topic or engaging speaker, with waiting lists sometimes 100 to 200 people long. The final blow was that my attendance assumed I could get away from my projects at work, which I really could not. My technology community ended up being the others at the company, which may have been helpful for my job, but gave me a narrower focus than I wanted, and was just one more thing that kept me at work. Meetups are not the only important thing in the technology community though. I did attend a few conferences such as ACM SIGCIKM, BayLearn, Strata 2014 (but for recruiting), and I spoke at PyData when it was held at Facebook. To be fully immersed in the technology community and experience, it seems one now needs to live in San Francisco, and San Francisco is definitely not the city for me. I am more of a Silicon Valley suburb type, but the energy wasn’t the same.

I am not alone when I say that I spent most of my waking life working. Since I had moved there for a job, and I didn’t have any roots, friends or family in the area, I thought it would make sense for me to do this. But working at this rate took a toll on me physically, mentally and emotionally. Although there is a lot to do in the Bay Area, there really wasn’t any time to do it because of the work culture. And people didn’t seem to have time for me for the same reason. This is not true for everyone, but I found it much more true in the Bay Area than anywhere else I have lived. To add to the long work hours, this is not the first time I have been an “overachiever” in life; it is something I have been afflicted with since high school (the 90s).

Not only is there stress from long hours and a lack of any outside world, the Bay Area is also extremely expensive; nobody argues against this point. My 850 square foot one-bedroom apartment in Mountain View is now on the market for $4200/month. Buying a house is typically not practical for new and mid-career engineers unless they have been at a large company for a while, have had a big payout from a startup, or are willing to have a longer commute from outside of the valley. A small one-bedroom house can easily list at over one million dollars in Mountain View and Palo Alto. Next, there are going to be n other bidders who also want the same house. It is incredibly common for already ridiculously overpriced houses in Palo Alto to sell for a lot more than the listing price. If you are single, you are going to need that tax writeoff, or you are in for a huge surprise at tax time. This lifestyle is not sustainable in the long run. And for me, it is an issue that keeps me from missing the area much.

On the other hand, many parts of the Bay Area are absolutely beautiful. From the green forests above Santa Cruz, to the pristine coastline from Monterey north, the green rolling hills in San Jose and the East Bay, to the bizarre other-worldly marshes along the bay. It was a dry two years so the weather was not all that different from LA.

I learned a valuable lesson. It’s true that the grass isn’t greener on the other side. A company can shower you with free meals, free rides and other perks (I even forget what they were… they ended up not being important), but all it does is keep you at work, and keep you engaged with only that one part of your life. Your “friends” end up being people at the company, which ends up being a bad thing in such a competitive environment. Other perks like an onsite doctor, dentist and physical therapist may sound nice, but they were not up to par with services I received elsewhere, and again are just ways to keep you at work. These things are gimmicks. They are good to entice people, they are good to make life convenient, but they really are just ways to keep you at work and pay you less.

Burn Out: Time to Reflect and Slow Down

When I returned to LA, I drove down Pacific Coast Highway and looked out to the ocean. As the orange winter sun beat on my face through the window, I could not believe it had been two years since I had taken that drive. That was not like me. I lived for the beach atmosphere and the sense of unwinding it provided me. At that moment, I realized that I wanted to slow down my “nerves”, not only back to my original levels, but even slower. I wanted to take time out for myself, not only to finish my research, but also to enjoy life, and think about what makes me happy both as a person and as a professional. I realized that in those two years, I had lost myself. With each lonely day, I lost more of my passion for my interests, and I did not have many hobbies other than working.

For the past ten years, I’ve spent time in the Eastern Sierra but not nearly enough. I was finally able to buy a vacation home in Mammoth and now have the time to enjoy it. I have spent the past several months hiking, mountain biking, snowshoeing and taking drives through some of nature’s finest beauty. The solitude and intrigue of the wilderness are very cleansing and good for the soul. I’ve also lost quite a bit of weight since leaving behind the free meals and becoming so active. I had always been a brisk walker, and always preferred to use my legs rather than my wheels, but this was the first time since college that I did vigorous heart-pumping activity on a routine basis. When I look back, I realize it was not just the past three years that have burned me out. It’s the entire way I have been living my adulthood.

Things Have Changed

I am not closing any doors in this post, but I have learned to value a sane work environment and work hours over perks and pure compensation. Rather than focus on compensation and working at a “hot” company, my intention is to do work that benefits the common good with respect to my interests, while providing me with the means to live, retire, and have funds available for my own hobbies and side projects. There is only one thing that I will not compromise on (ok, two): I must be able to wear shorts, and I must be able to have flexible hours. Whether or not to accept a position is now a lot more complex than looking at a company’s base product and having coding, machine learning and statistics in the job requirements/description. I do not want to spend 1% of my time doing machine learning using some basic model (e.g. Naive Bayes or Logistic Regression) and the other 99% scaling it to billions of observations. Rather, I would like to be able to explore more on the machine learning side, and learn new algorithms and methods for prediction and classification. This does not mean that I want to move away from the systems engineering stuff completely, but it will really depend on the product and the team rather than just the company.

For the time being, I am consulting and also mentoring a startup. I may continue to do that as my career, I may not. I have several ideas for startups that I may pursue, but I may not. Who knows, maybe I will return to the Bay Area (under the constraints I mentioned earlier), or I may not. I will probably return to being more active in the community like I used to be, but I have realized that there is a lot of noise, hype and ego in the blogosphere, the Twittersphere, and these conferences that cost thousands of dollars. There is something to be said for just doing my own thing. Maybe I have just grown as a professional, I don’t know. I just know that these things should be taken with a grain of salt.

Switching Fields?

After a lot of introspection, I want to take a look at some other fields outside of “pure tech” including but not limited to:

  • Environmental and activity geospatial data. After living in the mountains, I’ve become very interested in environmental data, particularly using time series, GPS telemetry and geospatial analysis. My interest in this field has applications ranging from efficient placement of snowmobiles for SAR operations, to action sports and activity intelligence, and even navigation.
  • Finance. Finance used to be on my list of “never ever.” After learning more about economics and Wall Street from my time in startups and Silicon Valley, I am also interested in some applications in finance. Machine learning is obviously very useful for automated investing, but data visualization has proven to be useful in manual transactions for me.
  • Education. My original draw to statistics was the field of psychometrics and the development of educational assessments. I am considering going back in this direction. I am also interested in the educational technology sector improving the delivery of educational materials and assessment of learning. Of course, I may go into teaching altogether, most likely at the college level, or as some type of training consultant.
  • Aviation: Airliners and drones. People who know me well know that I love airports, airlines and flying. Aviation uses a lot of different data science techniques. Drones are an emerging technology and routing drones in the sky has become a challenge that companies are working on. Routing, both for drones and airliners, uses geospatial/map data and network/graph data and takes into account many variables that affect flight, airspace congestion, and airport/ground resource usage. Wait times and queuing theory are also very important for runway operations. There is a lot of game theory, network analysis, and other data science involved in pricing and scheduling of airliner flights. All of these challenges are interesting to me.
  • “Internet of Things.” It annoys me that the emerging field of embedded systems, their development and data processing has become yet another cheap buzzword like “big data” or the misuse of the term “data science.” Devices such as the Raspberry Pi, Arduino and custom printed circuit boards allow the masses to create new data collection devices that unobtrusively fit anywhere data need recording. While the data itself is interesting, in this one particular case, I am actually more interested in the hardware, and pure engineering side rather than the data science side.
  • Security. Security is a rapidly growing field that has become pivotal not only for national security, but for privacy. It is a field that is very interesting to me, but one I know very little about, and thus is an option for a more ambitious change of field. I can see it being a field I would be more passionate about the more I learn about it. Security would be uncharted waters for me, but I do not see it as a field that will be disappearing anytime soon.

After typing up this list and re-reading it, I realize I still have the same level of passion I always did, and perhaps my soul needed to focus on something else for a while. Now I just have to choose which ones are the most rewarding, and which ones provide the best opportunities for me. In any job interview, there is always the “Do you have any questions for me/us?” question. Over the past several years, I have compiled a long list of questions. And if I do not like the answer, or if I can tell the interviewer is BSing the answer, abort! Perks and big names are not the key to happiness or a more fulfilled life; becoming a better person and being able to enjoy the process of life is.

Below are some pictures from my neighborhood!

  • The Village at Mammoth
  • The View from my Window
  • Twin Lakes near Tamarack Lodge, During a Snowstorm
  • Minaret Summit facing the White Mountains
  • Minaret Vista
  • Shadow Lake
  • Agnew Meadows
  • The Road to Red's Meadow Valley
  • Twin Lakes in Summer
  • Duck Lake
  • Mountain Bike!
  • Lakes George and Mary
  • Mountain Biking off the top of Mammoth Mountain
  • Rainbow Falls
  • Lake George

Summary of My First Trip to Strata #strataconf

In this post I am going to summarize some of the things that I learned at Strata Santa Clara 2013. For now, I will only discuss the conference sessions, as I have a much longer post about the tutorial sessions that I am still working on and will post at a later date. I will add to this post as the conference winds down.

The slides for most talks will be available here but not all speakers will share their slides.

This is/was my first trip to Strata so I was eagerly awaiting participating as an attendee. In the past, I had been put off by the cost and was also concerned that the conference would be an endless advertisement for the conference sponsors and Big Data platforms. I am happy to say that for the most part I was proven wrong. For easier reading, I am summarizing talks by topic rather than giving a laundry list schedule for a long day, and I am also skipping sessions that I did not find all that illuminating. I also do not claim 100% accuracy of this text as the days are very long and my ears and mind can only process so much data when I am context […]

Merry Christmas and Happy Holidays!

Wishing you all a very Merry Christmas, Happy Holidays and Happy New Year!

An update on me. In October, I began working at Riot Games, the developers of League of Legends. It has been an amazing experience and has occupied the majority of my free time as has my dissertation. My New Year’s resolution this year is to dust the cobwebs off this blog!

Have a safe holiday season!

Here in California, I will be having Christmas in the Sand

A New Data Toy -- Unboxing the Raspberry Pi

Last week I received two Raspberry Pis in the mail from AdaFruit and just now have some time to play with them. The Raspberry Pi is a minimal computer system that is about the size of a credit card. In the embedded systems community, the excitement is for obvious reasons, but I strongly believe that such a device can help collect and use data to help us make better decisions because not only is it a computer, but it is small and portable.

For development, Raspberry Pi can connect to a television (or other display) via HDMI or composite video (the “yellow” plug for those still stuck in the 1900s haha). A keyboard, mouse and other devices can be connected via two USB ports. A powered hub can provide support for even more devices. There are also various pins for connecting to a breadboard for analyzing analog signals, for a camera or for an external (or touchscreen) display. An SD Card essentially serves as the hard disk and probably a portion of the RAM. The more recent Model B ships with 256MB RAM. Raspberry Pi began shipping in February 2012 and these little guys have been very difficult to get a […]

Adventures at My First JSM (Joint Statistical Meetings) #JSM2012

During the past few decades that I have been in graduate school (no, not literally) I have boycotted JSM on the notion that “I am not a statistician.” Ok, I am a renegade statistician, a statistician by training. JSM 2012 was held in San Diego, CA, one of the best places to spend a week during the summer. This time, I had no excuse not to go, and I figured that in order to get my Ph.D. in Statistics, I have to have been to at least one JSM. […]

OpenPaths and a Progressive Approach to Privacy

OpenPaths is a service that allows users with mobile phones to transmit and store their location. It is an initiative by the New York Times that allows users to use their own data, or to contribute their location data for research projects and perhaps startups that wish to get into the geospatial space. OpenPaths brands itself as “a secure data locker for personal location information.” There is one aspect where OpenPaths is very different from other services like Google Latitude: only the user has access to his/her own data and it is never shared with anybody else unless the user chooses to do so. Additionally, initiatives that wish to use a user’s location data must ask the user personally via email (pictured below), and the user has the ability to deny the request. The data shared with each initiative provides only location, and not other data that may be personally identifiable such as name, email, browser, mobile type etc. In this sense, OpenPaths has provided a barebones platform for the collection and storage of location information. Google Latitude is similar, but the data stored on Google’s servers is obviously used by other Google services without explicit user permission.

The service is also opt-in, that […]

SIAM Data Mining 2012 Conference

Note: This would have been up a lot sooner but I have been dealing with a bug on and off for pretty much the past month!

From April 26-28 I had the pleasure to attend the SIAM Data Mining conference in Anaheim on the Disneyland Resort grounds. Aside from KDD2011, most of my recent conferences had been more “big data” and “data science” oriented, and I wanted to step away from the hype and just listen to talks that had more substance.

Attending a conference on Disneyland property was quite a bizarre experience. I wanted to get everything I could out of the conference, but the weather was so nice that I also wanted to get everything I could out of Disneyland. Seeing adults wearing Mickey ears and carrying Mickey-shaped balloons, and seeing girls dressed up as their favorite Disney princesses, screams “fun” rather than “business”, but I managed to make time for both.

The first two days started with a plenary talk from industry or research labs. After a coffee break, there were the usual breakout sessions followed by lunch. During my free 90 minutes, I ran over to Disneyland and California Adventure both days to eat lunch. I managed to […]

My Interview about the Statistics Major

Recently, I participated in an email interview about what being a Statistics major entailed, how I got interested in the field and the future of Statistics. I figured this might be of interest to those that are contemplating majoring in Statistics, or considering a career in Data Science.

Q1: Why did you decide to pursue a major in statistics in college?

A: “When I was a kid, I really enjoyed looking at graphs, plots and maps. My parents and I could not make out what was behind the interest. At the same time, I was also heavily interested in education. My mother was a teacher and the first set of statistics I ever encountered were standardized test scores. I strived to understand what the scores attempted to say about me, and why such scores and tests are so trustworthy. When the stakes increased with the AP and SAT exams, I began reading articles published by the Educational Testing Service and learned a ton about how these tests are constructed to minimize bias, and how scores are comparable across forms. It fascinated me how much science goes into these tests, but at the end of the day they are still just one factor […]

“Hold Only That Pair of 2s?” Studying a Video Poker Hand with R

Whenever I tell people in my family that I study Statistics, one of the first questions I get from laypeople is “do you count cards?” A blank look comes over their face when I say “no.”

Look, if I am at a casino, I am well aware that the odds are against me, so why even try to think that I can use statistics to make money in this way? I love numbers and math, but the stuff flows through my brain all day long (and night long), every day. If the goal is to enjoy myself and have fun, I do not want to sit there crunching probability formulas in my head (yes that’s fun, but it is also work). So that leaves me at the video Poker machines enjoying the free drinks. Another positive about video Poker is that $20 can sometimes last a few hours. So it should be no surprise that I do not agree with using Poker to teach probability. Poker is an extremely superficial way to introduce such a powerful tool and gives the impression that probability is a way to make a quick buck, rather than an important tool in science and society. The only […]

Merry Christmas 2011 From Byte Mining!

To all of my readers and followers, I wish you a very Merry Christmas and a very joyous and safe Happy New Year! This year, I am thankful for the community that has sprung up around Data Science and open-source data collection and processing. This blog is almost two years old, and like with Twitter, I have been able to communicate with many data scientists, enthusiasts and some of the most prolific contributors to the data science software community. I am thankful for all of the wonderful people I have met and have yet to meet, and for your comments and reading.