Estimating Population Size: Animals and Web Pages

Thinking about all of the things in the statistical world that we can estimate, the one that has always perplexed me is estimating the size of an unknown population \(N\). Usually when we compute estimates based on samples, we involve the size of the sample \(n\) somewhere, so we take “size” for granted: the size of a sample is known. We also make inferences based on sample statistics using theory such as the Central Limit Theorem, but we never seem to care about the population size \(N\); we either know it or assume it is infinite. But in fields like ecology and environmental studies, this attitude of gluttony is dangerous!

In the summers I spend a lot of time in the Eastern Sierra, where there are deer and bears. These animals are so large that we cannot consider their population in the area to be infinite (as we might with ants or bacteria). As a matter of fact, one of our local animal behavior specialists knows the exact number of bears that live in my mountain community. I had always just assumed he spent all day tracking down the local bears and marking them with radio collars. There must be some way […]

Highlights from My First NIPS

The first few hundred registrations received a mug.

As a machine learning practitioner in the Los Angeles area, I was ecstatic to learn that NIPS 2017 would be in Long Beach this year. The conference sold out in a day or two. It was held at the Long Beach Convention Center (and Performing Arts Center), very close to the Aquarium of the Pacific and about a mile from the Queen Mary. The venue itself was beautiful, and probably the nicest place I’ve ever attended a conference. It’s also the most expensive place I’ve ever attended a conference. $5 for a bottle of Coke? $11 for two cookies? But I digress.

I attended most of the conference, but as someone who has attended many conferences, I’ve learned that attending everything is not necessary and is counterproductive to one’s sanity. I attended the main conference and one workshop day, but skipped the tutorials, the Saturday workshops and the industry demos. The conference talks were livestreamed via Facebook Live at the NIPS Foundation’s Facebook page, and the recordings are also archived there.

This may make some question why one would actually want to attend the conference in person, but there are several reasons!

to talk with the authors of interesting […]

Some New Year Resolutions for (this) Data Scientist in 2017

I’ve never been very big on New Year’s resolutions. I’ve tried them in the past, and while they are nice to think about, they are always overly vague, difficult to accomplish in a year, trite, or just don’t get done (or attempted). This year I decided to try something different instead of just not making resolutions at all. I set out some professional goals for myself as a Data Scientist. So without further ado…

1. Don’t Complain about It, Fix It: Contribute to Open Source Software (More)

Open source software is only as good as its community and/or developer(s). Developers are human and typically cannot manage all bugs and feature requests themselves. My goal is to routinely contribute back to the community either with new features, or by fixing bugs that I discover. This not only helps the community at large, but also helps me as a software engineer. There is no better way to become an even better engineer than by wading through someone else’s code. While this is something I did all day every day at my $DAYJOB, I do it less while on my sabbatical.

Some of the projects I use the most and that I hope to contribute to are scikit-learn and […]