<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments for Byte Mining</title>
	<atom:link href="http://www.bytemining.com/comments/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.bytemining.com</link>
	<description>My thoughts on data mining, machine learning, programming languages, open-source software and general nerdery.</description>
	<lastBuildDate>Tue, 31 Jan 2012 00:17:15 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>Comment on My First Few Days with RStudio by MattNapsAlot</title>
		<link>http://www.bytemining.com/2011/03/my-first-few-days-with-rstudio/comment-page-1/#comment-2770</link>
		<dc:creator>MattNapsAlot</dc:creator>
		<pubDate>Tue, 31 Jan 2012 00:17:15 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=532#comment-2770</guid>
		<description>Great post. After stubbornly resisting graphical R IDEs for over 10 years, I finally switched from ssh/terminal + emacs to RStudio. Definitely the best R IDE ever made and I recommend it to anyone who hasn&#039;t tried it.

What finally got me to switch was the RStudio Server edition, which I run on my 12-core Mac Pro. I did some port-forwarding magic and now I can easily rejoin my session from anywhere. It&#039;s great that they even have an integrated shell tool so you can jump to the system shell whenever you need to. Great software!

I should mention that it was a bit tricky to get RStudio Server to build and run on OS X Lion. I documented the procedure on a wiki in case there are any Mac users out there who want to run RStudio Server: http://sagebionetworks.jira.com/wiki/display/SYNR/Building+RStudio+from+Source+%28OSx+Lion%29</description>
		<content:encoded><![CDATA[<p>Great post. After stubbornly resisting graphical R IDEs for over 10 years, I finally switched from ssh/terminal + emacs to RStudio. Definitely the best R IDE ever made and I recommend it to anyone who hasn&#8217;t tried it.</p>
<p>What finally got me to switch was the RStudio Server edition, which I run on my 12-core Mac Pro. I did some port-forwarding magic and now I can easily rejoin my session from anywhere. It&#8217;s great that they even have an integrated shell tool so you can jump to the system shell whenever you need to. Great software!</p>
<p>I should mention that it was a bit tricky to get RStudio Server to build and run on OS X Lion. I documented the procedure on a wiki in case there are any Mac users out there who want to run RStudio Server: <a href="http://sagebionetworks.jira.com/wiki/display/SYNR/Building+RStudio+from+Source+%28OSx+Lion%29" rel="nofollow">http://sagebionetworks.jira.com/wiki/display/SYNR/Building+RStudio+from+Source+%28OSx+Lion%29</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on &#8220;Hold Only That Pair of 2s?&#8221; Studying a Video Poker Hand with R by Ryan</title>
		<link>http://www.bytemining.com/2012/01/hold-only-that-pair-of-2s-studying-a-video-poker-hand-with-r/comment-page-1/#comment-2740</link>
		<dc:creator>Ryan</dc:creator>
		<pubDate>Thu, 19 Jan 2012 21:30:01 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=1013#comment-2740</guid>
		<description>Thanks. Fixed.</description>
		<content:encoded><![CDATA[<p>Thanks. Fixed.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on &#8220;Hold Only That Pair of 2s?&#8221; Studying a Video Poker Hand with R by Leonard de Assis</title>
		<link>http://www.bytemining.com/2012/01/hold-only-that-pair-of-2s-studying-a-video-poker-hand-with-r/comment-page-1/#comment-2739</link>
		<dc:creator>Leonard de Assis</dc:creator>
		<pubDate>Thu, 19 Jan 2012 21:10:02 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=1013#comment-2739</guid>
		<description>hi,

I guess you forgot to insert &#039;library(ggplot2)&#039; in your code, because when someone runs it, there&#039;s an error in the last row (chart using ggplot2 syntax)

Regards,

Leonard</description>
		<content:encoded><![CDATA[<p>hi,</p>
<p>I guess you forgot to insert &#8216;library(ggplot2)&#8217; in your code, because when someone runs it, there&#8217;s an error in the last row (chart using ggplot2 syntax)</p>
<p>Regards,</p>
<p>Leonard</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on &#8220;Hold Only That Pair of 2s?&#8221; Studying a Video Poker Hand with R by Ryan</title>
		<link>http://www.bytemining.com/2012/01/hold-only-that-pair-of-2s-studying-a-video-poker-hand-with-r/comment-page-1/#comment-2737</link>
		<dc:creator>Ryan</dc:creator>
		<pubDate>Wed, 18 Jan 2012 08:18:47 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=1013#comment-2737</guid>
		<description>Yes, you&#039;re right. The payout is more important. I did write some code to do that, but after seeing that every win was more likely under strategy 1, and me getting lazy, I didn&#039;t analyze it.</description>
		<content:encoded><![CDATA[<p>Yes, you&#8217;re right. The payout is more important. I did write some code to do that, but after seeing that every win was more likely under strategy 1, and me getting lazy, I didn&#8217;t analyze it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on &#8220;Hold Only That Pair of 2s?&#8221; Studying a Video Poker Hand with R by Richie Cotton</title>
		<link>http://www.bytemining.com/2012/01/hold-only-that-pair-of-2s-studying-a-video-poker-hand-with-r/comment-page-1/#comment-2735</link>
		<dc:creator>Richie Cotton</dc:creator>
		<pubDate>Tue, 17 Jan 2012 22:10:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=1013#comment-2735</guid>
		<description>This is awesome. Ta.</description>
		<content:encoded><![CDATA[<p>This is awesome. Ta.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on &#8220;Hold Only That Pair of 2s?&#8221; Studying a Video Poker Hand with R by Evan Sparks</title>
		<link>http://www.bytemining.com/2012/01/hold-only-that-pair-of-2s-studying-a-video-poker-hand-with-r/comment-page-1/#comment-2734</link>
		<dc:creator>Evan Sparks</dc:creator>
		<pubDate>Tue, 17 Jan 2012 19:00:44 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=1013#comment-2734</guid>
		<description>Very cool post, and nice, easy-to-read code.

You&#039;re pretty explicit about the question here - &quot;what proportion of hands am I likely to win?&quot; in each of the two situations. Of course, the decision of what to do in this situation should be &quot;which of these two options maximizes my expected profit?&quot; Because a full house pays out more than a pair, for example, the second strategy may be more profitable. The way to handle this would be to weight the outcomes by the payouts when you take the average, which would give you the average dollars expected from each strategy. Then, pick the strategy that gives you the highest expected payout.

As you say, though, all outcomes are more likely in strategy 1 than in strategy 2, so weighting the outcomes differently won&#039;t have an impact.</description>
		<content:encoded><![CDATA[<p>Very cool post, and nice, easy-to-read code.</p>
<p>You&#8217;re pretty explicit about the question here &#8211; &#8220;what proportion of hands am I likely to win?&#8221; in each of the two situations. Of course, the decision of what to do in this situation should be &#8220;which of these two options maximizes my expected profit?&#8221; Because a full house pays out more than a pair, for example, the second strategy may be more profitable. The way to handle this would be to weight the outcomes by the payouts when you take the average, which would give you the average dollars expected from each strategy. Then, pick the strategy that gives you the highest expected payout.</p>
<p>As you say, though, all outcomes are more likely in strategy 1 than in strategy 2, so weighting the outcomes differently won&#8217;t have an impact.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 by Ryan</title>
		<link>http://www.bytemining.com/2011/11/parsing-wikipedia-articles-wikipedia-extractor-and-cloud9/comment-page-1/#comment-2687</link>
		<dc:creator>Ryan</dc:creator>
		<pubDate>Sat, 24 Dec 2011 20:00:23 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=947#comment-2687</guid>
		<description>Awesome! I will send you an email when I get a chance. WikiHadoop has the potential to be extremely useful. Cloud9 was great, but since Java is not my preferred language, it was a pain to set up at first.</description>
		<content:encoded><![CDATA[<p>Awesome! I will send you an email when I get a chance. WikiHadoop has the potential to be extremely useful. Cloud9 was great, but since Java is not my preferred language, it was a pain to set up at first.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 by Ryan</title>
		<link>http://www.bytemining.com/2011/11/parsing-wikipedia-articles-wikipedia-extractor-and-cloud9/comment-page-1/#comment-2686</link>
		<dc:creator>Ryan</dc:creator>
		<pubDate>Sat, 24 Dec 2011 19:59:14 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=947#comment-2686</guid>
		<description>Hey Grant! Are these filters available online somewhere? I really should be using Lucene for all of this instead of reimplementing everything in Hadoop.</description>
		<content:encoded><![CDATA[<p>Hey Grant! Are these filters available online somewhere? I really should be using Lucene for all of this instead of reimplementing everything in Hadoop.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 by Grant Ingersoll</title>
		<link>http://www.bytemining.com/2011/11/parsing-wikipedia-articles-wikipedia-extractor-and-cloud9/comment-page-1/#comment-2641</link>
		<dc:creator>Grant Ingersoll</dc:creator>
		<pubDate>Thu, 15 Dec 2011 12:57:03 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=947#comment-2641</guid>
		<description>Hey Ryan,

Not sure how it compares, but a while back I wrote some tokenizers/token filters for Lucene that work on Wikipedia.  They aren&#039;t perfect, but if you know Lucene, it may not be too hard to extend them for your needs.  Naturally, you can then feed them into Lucene&#039;s n-gram capabilities and other filters to build up what you need.</description>
		<content:encoded><![CDATA[<p>Hey Ryan,</p>
<p>Not sure how it compares, but a while back I wrote some tokenizers/token filters for Lucene that work on Wikipedia.  They aren&#8217;t perfect, but if you know Lucene, it may not be too hard to extend them for your needs.  Naturally, you can then feed them into Lucene&#8217;s n-gram capabilities and other filters to build up what you need.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Taking R to the Limit, Part II &#8211; Large Datasets in R by Large Data Sets in R &#124; luiz p. c. de freitas</title>
		<link>http://www.bytemining.com/2010/08/taking-r-to-the-limit-part-ii-large-datasets-in-r/comment-page-1/#comment-2634</link>
		<dc:creator>Large Data Sets in R &#124; luiz p. c. de freitas</dc:creator>
		<pubDate>Tue, 13 Dec 2011 03:13:52 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=301#comment-2634</guid>
		<description>[...] Via Ryan Rosario&#8217;s Byte Mining, challenges of and solutions to performing analytics on 10 to 15 gig data sets using R. It&#8217;s a long deck, but Ryan covers a lot of very cool material. I&#8217;m looking forward to trying a few these myself. SlideShare below, and grab the PDF as reference. Taking R to the Limit (High Performance Computing in R), Part 2 &#8212; Large Datasets, LA R Users&#8217; Group 8/17/10  View more presentations from Ryan Rosario    This entry was posted in analytics and tagged analytics, big data, R, statistical programming by Luiz. Bookmark the permalink. [...]</description>
		<content:encoded><![CDATA[<p>[...] Via Ryan Rosario&#8217;s Byte Mining, challenges of and solutions to performing analytics on 10 to 15 gig data sets using R. It&#8217;s a long deck, but Ryan covers a lot of very cool material. I&#8217;m looking forward to trying a few these myself. SlideShare below, and grab the PDF as reference. Taking R to the Limit (High Performance Computing in R), Part 2 &#8212; Large Datasets, LA R Users&#8217; Group 8/17/10  View more presentations from Ryan Rosario    This entry was posted in analytics and tagged analytics, big data, R, statistical programming by Luiz. Bookmark the permalink. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 by Diederik van Liere</title>
		<link>http://www.bytemining.com/2011/11/parsing-wikipedia-articles-wikipedia-extractor-and-cloud9/comment-page-1/#comment-2618</link>
		<dc:creator>Diederik van Liere</dc:creator>
		<pubDate>Fri, 09 Dec 2011 19:11:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=947#comment-2618</guid>
		<description>Hi Ryan,

I am one of the authors and would love to get more detailed feedback on how we can improve this.
Best,
Diederik</description>
		<content:encoded><![CDATA[<p>Hi Ryan,</p>
<p>I am one of the authors and would love to get more detailed feedback on how we can improve this.<br />
Best,<br />
Diederik</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 by Ryan</title>
		<link>http://www.bytemining.com/2011/11/parsing-wikipedia-articles-wikipedia-extractor-and-cloud9/comment-page-1/#comment-2617</link>
		<dc:creator>Ryan</dc:creator>
		<pubDate>Fri, 09 Dec 2011 19:09:54 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=947#comment-2617</guid>
		<description>Thanks! We tried WikiHadoop, but it did not seem very generalizable. The authors seemed to only be familiar with using it for diffing revisions. It could be an extremely powerful project if the documentation were better and if it were not restricted to Hadoop 0.21+.</description>
		<content:encoded><![CDATA[<p>Thanks! We tried WikiHadoop, but it did not seem very generalizable. The authors seemed to only be familiar with using it for diffing revisions. It could be an extremely powerful project if the documentation were better and if it were not restricted to Hadoop 0.21+.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on My First Few Days with RStudio by Elda</title>
		<link>http://www.bytemining.com/2011/03/my-first-few-days-with-rstudio/comment-page-1/#comment-2580</link>
		<dc:creator>Elda</dc:creator>
		<pubDate>Wed, 30 Nov 2011 20:49:03 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=532#comment-2580</guid>
		<description>I am trying to plot a linear model and a regression line using GSS7210. I am using the variables educ, opknow, and earndes. How would I incorporate these things in order to make these two graphs?

Thanks</description>
		<content:encoded><![CDATA[<p>I am trying to plot a linear model and a regression line using GSS7210. I am using the variables educ, opknow, and earndes. How would I incorporate these things in order to make these two graphs?</p>
<p>Thanks</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 by Chris Han</title>
		<link>http://www.bytemining.com/2011/11/parsing-wikipedia-articles-wikipedia-extractor-and-cloud9/comment-page-1/#comment-2561</link>
		<dc:creator>Chris Han</dc:creator>
		<pubDate>Tue, 29 Nov 2011 06:52:50 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=947#comment-2561</guid>
		<description>After attempts to parse the Wikipedia dump myself, I ended up experimenting with DBpedia data (http://wiki.dbpedia.org/Downloads37) instead. The DBpedia data includes (after cleanup) Wikipedia article title, abstract, categories, redirects, and disambiguation, which might be enough for my use.

I am just wondering why DBpedia did not extract the full-text article content, but only the abstract.

As I am still halfway through playing with the DBpedia data, no conclusions can be made with regard to whether it has enough info for me.

Expecting to see more efforts in this space to make Wikipedia data more accessible for programmers, especially Python geeks.</description>
		<content:encoded><![CDATA[<p>After attempts to parse the Wikipedia dump myself, I ended up experimenting with DBpedia data (<a href="http://wiki.dbpedia.org/Downloads37" rel="nofollow">http://wiki.dbpedia.org/Downloads37</a>) instead. The DBpedia data includes (after cleanup) Wikipedia article title, abstract, categories, redirects, and disambiguation, which might be enough for my use.</p>
<p>I am just wondering why DBpedia did not extract the full-text article content, but only the abstract.</p>
<p>As I am still halfway through playing with the DBpedia data, no conclusions can be made with regard to whether it has enough info for me.</p>
<p>Expecting to see more efforts in this space to make Wikipedia data more accessible for programmers, especially Python geeks.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 by David</title>
		<link>http://www.bytemining.com/2011/11/parsing-wikipedia-articles-wikipedia-extractor-and-cloud9/comment-page-1/#comment-2560</link>
		<dc:creator>David</dc:creator>
		<pubDate>Tue, 29 Nov 2011 04:18:07 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=947#comment-2560</guid>
		<description>Hi Ryan,

I&#039;ve struggled a bit trying to get data from Wikipedia. For example, I&#039;d love to get a plain text data dump of all the Wikis, edit histories and discussion pages on S&amp;P 500 companies or of all the Wikis under &quot;Companies founded in year XYZ&quot;

It would be interesting to see if older companies have longer Wikis, or to do some kind of sentiment analysis on S&amp;P 500 companies, or to look at how frequently changes are made and at what byte size.

Any tips on how I&#039;d get started trying to get the data for this? I&#039;m technical enough to know PHP, but I don&#039;t know - this might be beyond me.

-David</description>
		<content:encoded><![CDATA[<p>Hi Ryan,</p>
<p>I&#8217;ve struggled a bit trying to get data from Wikipedia. For example, I&#8217;d love to get a plain text data dump of all the Wikis, edit histories and discussion pages on S&amp;P 500 companies or of all the Wikis under &#8220;Companies founded in year XYZ&#8221;</p>
<p>It would be interesting to see if older companies have longer Wikis, or to do some kind of sentiment analysis on S&amp;P 500 companies, or to look at how frequently changes are made and at what byte size.</p>
<p>Any tips on how I&#8217;d get started trying to get the data for this? I&#8217;m technical enough to know PHP, but I don&#8217;t know &#8211; this might be beyond me.</p>
<p>-David</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 by Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 &#171; Another Word For It</title>
		<link>http://www.bytemining.com/2011/11/parsing-wikipedia-articles-wikipedia-extractor-and-cloud9/comment-page-1/#comment-2558</link>
		<dc:creator>Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 &#171; Another Word For It</dc:creator>
		<pubDate>Tue, 29 Nov 2011 00:06:12 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=947#comment-2558</guid>
		<description>[...] Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 by Ryan Rosario. [...]</description>
		<content:encoded><![CDATA[<p>[...] Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 by Ryan Rosario. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 by Diederik van Liere</title>
		<link>http://www.bytemining.com/2011/11/parsing-wikipedia-articles-wikipedia-extractor-and-cloud9/comment-page-1/#comment-2557</link>
		<dc:creator>Diederik van Liere</dc:creator>
		<pubDate>Mon, 28 Nov 2011 23:19:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=947#comment-2557</guid>
		<description>Dear Ryan,

This is very cool indeed! One way to speed this up significantly is to use Wikihadoop and use your existing Python code for the mapper. Wikihadoop, in contrast to the Cloud9 package, is able to stream the full bzip2-compressed XML dump files using Hadoop Streaming. I am happy to help you if you are stuck. You can find Wikihadoop at: https://github.com/whym/wikihadoop

Best,
Diederik</description>
		<content:encoded><![CDATA[<p>Dear Ryan,</p>
<p>This is very cool indeed! One way to speed this up significantly is to use Wikihadoop and use your existing Python code for the mapper. Wikihadoop, in contrast to the Cloud9 package, is able to stream the full bzip2-compressed XML dump files using Hadoop Streaming. I am happy to help you if you are stuck. You can find Wikihadoop at: <a href="https://github.com/whym/wikihadoop" rel="nofollow">https://github.com/whym/wikihadoop</a></p>
<p>Best,<br />
Diederik</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 by Ryan</title>
		<link>http://www.bytemining.com/2011/11/parsing-wikipedia-articles-wikipedia-extractor-and-cloud9/comment-page-1/#comment-2556</link>
		<dc:creator>Ryan</dc:creator>
		<pubDate>Mon, 28 Nov 2011 21:08:49 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=947#comment-2556</guid>
		<description>Thanks for the link. I have always had trouble navigating the Freebase website.</description>
		<content:encoded><![CDATA[<p>Thanks for the link. I have always had trouble navigating the Freebase website.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Parsing Wikipedia Articles: Wikipedia Extractor and Cloud9 by Tom Morris</title>
		<link>http://www.bytemining.com/2011/11/parsing-wikipedia-articles-wikipedia-extractor-and-cloud9/comment-page-1/#comment-2555</link>
		<dc:creator>Tom Morris</dc:creator>
		<pubDate>Mon, 28 Nov 2011 20:38:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=947#comment-2555</guid>
		<description>You might want to check out Google/Freebase&#039;s weekly WEX dumps.  They&#039;ve done a bunch of the grunt work and publish the results on a regular basis.  In the past they&#039;ve made them available on EC2, which would save you the bandwidth charges, although I&#039;m not sure they still do that on a regular basis.

http://wiki.freebase.com/wiki/WEX
http://download.freebase.com/wex/latest/</description>
		<content:encoded><![CDATA[<p>You might want to check out Google/Freebase&#8217;s weekly WEX dumps.  They&#8217;ve done a bunch of the grunt work and publish the results on a regular basis.  In the past they&#8217;ve made them available on EC2, which would save you the bandwidth charges, although I&#8217;m not sure they still do that on a regular basis.</p>
<p><a href="http://wiki.freebase.com/wiki/WEX" rel="nofollow">http://wiki.freebase.com/wiki/WEX</a><br />
<a href="http://download.freebase.com/wex/latest/" rel="nofollow">http://download.freebase.com/wex/latest/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on My First Few Days with RStudio by Louis Aslett</title>
		<link>http://www.bytemining.com/2011/03/my-first-few-days-with-rstudio/comment-page-1/#comment-2553</link>
		<dc:creator>Louis Aslett</dc:creator>
		<pubDate>Wed, 23 Nov 2011 12:17:15 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=532#comment-2553</guid>
		<description>No probs.  Well, if it helps open port wise, then in that AMI I actually have standard port 80 rather than the usual 8787 for that reason: I know some corporate security policies don&#039;t like non-standard ports for HTTP.

If by security you meant encryption, then there are a few things I&#039;d like to do in future AMI releases, one being adding SSL support (though it would obviously be an unsigned cert because of changing EC2 IPs), but that&#039;ll be when I have the time!</description>
		<content:encoded><![CDATA[<p>No probs.  Well, if it helps open port wise, then in that AMI I actually have standard port 80 rather than the usual 8787 for that reason: I know some corporate security policies don&#8217;t like non-standard ports for HTTP.</p>
<p>If by security you meant encryption, then there are a few things I&#8217;d like to do in future AMI releases, one being adding SSL support (though it would obviously be an unsigned cert because of changing EC2 IPs), but that&#8217;ll be when I have the time!</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on My First Few Days with RStudio by Ryan</title>
		<link>http://www.bytemining.com/2011/03/my-first-few-days-with-rstudio/comment-page-1/#comment-2551</link>
		<dc:creator>Ryan</dc:creator>
		<pubDate>Wed, 23 Nov 2011 04:08:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=532#comment-2551</guid>
		<description>Thanks! I remember seeing a tweet about this. I look forward to trying it. My company and I use Amazon EC2 extensively and I&#039;ve been wanting to install RStudio Server on an EBS-backed instance, but keys and security make it difficult to open a port. I will have to try this!</description>
		<content:encoded><![CDATA[<p>Thanks! I remember seeing a tweet about this. I look forward to trying it. My company and I use Amazon EC2 extensively and I&#8217;ve been wanting to install RStudio Server on an EBS-backed instance, but keys and security make it difficult to open a port. I will have to try this!</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on My First Few Days with RStudio by Louis Aslett</title>
		<link>http://www.bytemining.com/2011/03/my-first-few-days-with-rstudio/comment-page-1/#comment-2550</link>
		<dc:creator>Louis Aslett</dc:creator>
		<pubDate>Wed, 23 Nov 2011 00:14:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=532#comment-2550</guid>
		<description>Just stumbled across this page. I’ve been maintaining AMIs specifically for RStudio Server for a few months now. Details are at http://www.louisaslett.com/RStudio_AMI/ in case they&#039;re any use ... it saves the setup time from a standard OS.</description>
		<content:encoded><![CDATA[<p>Just stumbled across this page. I’ve been maintaining AMIs specifically for RStudio Server for a few months now. Details are at <a href="http://www.louisaslett.com/RStudio_AMI/" rel="nofollow">http://www.louisaslett.com/RStudio_AMI/</a> in case they&#8217;re any use &#8230; it saves the setup time from a standard OS.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Hadoop Fatigue &#8212; Alternatives to Hadoop by Nikhil Prabhakar (@_nipra)</title>
		<link>http://www.bytemining.com/2011/08/hadoop-fatigue-alternatives-to-hadoop/comment-page-1/#comment-2548</link>
		<dc:creator>Nikhil Prabhakar (@_nipra)</dc:creator>
		<pubDate>Sun, 06 Nov 2011 18:09:36 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=794#comment-2548</guid>
		<description>Nice overview. LexisNexis link points to cascalog git repo.</description>
		<content:encoded><![CDATA[<p>Nice overview. LexisNexis link points to cascalog git repo.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Hadoop Fatigue &#8212; Alternatives to Hadoop by Ryan</title>
		<link>http://www.bytemining.com/2011/08/hadoop-fatigue-alternatives-to-hadoop/comment-page-1/#comment-2545</link>
		<dc:creator>Ryan</dc:creator>
		<pubDate>Tue, 18 Oct 2011 02:26:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=794#comment-2545</guid>
		<description>Oh yes. mrjob is great. I wrote about it here: http://www.bytemining.com/2010/11/exciting-tools-for-big-data-s4-sawzall-and-mrjob/
We used it at my previous company as well. Worked well. I hope Yelp continues to contribute.</description>
		<content:encoded><![CDATA[<p>Oh yes. mrjob is great. I wrote about it here: <a href="http://www.bytemining.com/2010/11/exciting-tools-for-big-data-s4-sawzall-and-mrjob/" rel="nofollow">http://www.bytemining.com/2010/11/exciting-tools-for-big-data-s4-sawzall-and-mrjob/</a><br />
We used it at my previous company as well. Worked well. I hope Yelp continues to contribute.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Hadoop Fatigue &#8212; Alternatives to Hadoop by albert</title>
		<link>http://www.bytemining.com/2011/08/hadoop-fatigue-alternatives-to-hadoop/comment-page-1/#comment-2544</link>
		<dc:creator>albert</dc:creator>
		<pubDate>Mon, 17 Oct 2011 16:40:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.bytemining.com/?p=794#comment-2544</guid>
		<description>At this year&#039;s Data Mining Camp, the revelation is Yelp&#039;s Python Hadoop framework mrjob.
http://engineeringblog.yelp.com/2010/10/mrjob-distributed-computing-for-everybody.html</description>
		<content:encoded><![CDATA[<p>At this year&#8217;s Data Mining Camp, the revelation is Yelp&#8217;s Python Hadoop framework mrjob.<br />
<a href="http://engineeringblog.yelp.com/2010/10/mrjob-distributed-computing-for-everybody.html" rel="nofollow">http://engineeringblog.yelp.com/2010/10/mrjob-distributed-computing-for-everybody.html</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>

