Accessing R from Python using RPy2

This past Tuesday I presented a short talk (which ran a bit long) on text mining at the Los Angeles R Users’ Group. Since I do most of my text mining in Python, I used the opportunity to discuss RPy2, an interface to R from Python. My slides are below:


Download/view slides here. Topics include:

  • Using Python with R, with a web mining example.
  • Web mining using pure R rather than Python.

Code for demonstration is here:

  1. offtopic_demo.py is a pure Python script that extracts data from a web forum and dumps it to disk. To actually use it, you will need to register for an account.
  2. RPy2_demo.py reads the forum data from disk and calls R from Python to perform some basic analysis.
  3. curljson_demo.R grabs some JSON data from the Twitter Search API using RCurl and converts it to R lists using rjson.
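
To give a flavor of what calling R from Python looks like, here is a minimal sketch. It is not the code from RPy2_demo.py; the function names post_lengths and r_mean_and_sd and the sample posts are made up for illustration. It assumes rpy2 is installed and that R was built as a shared library. The R functions mean and sd are looked up through rpy2.robjects and called on a FloatVector.

```python
def post_lengths(posts):
    """Pure-Python step: count whitespace-separated words in each post."""
    return [len(post.split()) for post in posts]


def r_mean_and_sd(values):
    """Hand a list of numbers to R and return (mean, sd) as Python floats.

    rpy2 is imported lazily so the pure-Python step above still works on a
    machine where R or rpy2 is missing.
    """
    import rpy2.robjects as robjects

    vec = robjects.FloatVector(values)  # convert the Python list to an R numeric vector
    r_mean = robjects.r["mean"]         # look up R's mean() by name
    r_sd = robjects.r["sd"]             # look up R's sd() by name
    # R returns length-one vectors, so take the first element of each.
    return r_mean(vec)[0], r_sd(vec)[0]


# Hypothetical stand-in for forum posts read from disk.
sample_posts = [
    "R is great for statistics",
    "Python is great for text munging",
    "why not both",
]
lengths = post_lengths(sample_posts)
```

With R available, r_mean_and_sd(lengths) should return the mean and standard deviation of the post lengths as computed by R itself.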

Video:


Running the code requires installing a few things first:

  • twill, a web-browsing tool that also installs a Python package. twill is a wrapper around mechanize, so it requires the mechanize package as well.
  • BeautifulSoup package for Python for HTML parsing.
  • R must be built as a shared library (configure with --enable-R-shlib); otherwise Python cannot call it.
  • RPy2, the Python interface to R.
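
A quick way to see which of the Python pieces are still missing is to ask the import system, as in the sketch below. The module names are assumptions on my part: BeautifulSoup 3 imports as BeautifulSoup, while newer releases import as bs4, so adjust the list to match what you installed. The helper missing_modules is a name invented for this example. Note that this only checks that the modules can be found; it does not verify that R itself was built as a shared library.

```python
import importlib.util

# Assumed import names for the packages listed above.
# "BeautifulSoup" is the BeautifulSoup 3 module name; newer versions use "bs4".
REQUIRED_MODULES = ["twill", "mechanize", "BeautifulSoup", "rpy2"]


def missing_modules(names):
    """Return the names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Still need to install:", ", ".join(missing))
    else:
        print("All required Python modules were found.")
```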

To see the main talk of the evening, click here.
