My First Few Days with RStudio

As most readers are probably aware, the free IDE for R, called RStudio, was recently released for general use, and it immediately made huge waves within the R community. IDE stands for Integrated Development Environment. IDEs typically provide a rich set of tools for developing in some target language. For standard programming languages like C++ (Visual Studio) and Java (Eclipse or NetBeans), IDEs contain:

  • an editor tailored to the target language. The editor typically has tab/auto-complete for variable names, functions, and class methods and properties, and also features syntax highlighting.
  • a multiple document interface (MDI) where there may be several documents opened in different tabs.
  • a window that interacts with the compiler, or a panel containing the console to the language, a la MATLAB, and even vanilla R’s GUI.
  • a debugger
  • a file browser and language reference.

RStudio fits this description very well, and makes modifications where appropriate. RStudio provides many features that are lacking in the standard R GUI, and improves on features that do not work properly in the Windows R GUI. Over the past few days, I have been doing all of my R analysis within RStudio, briefly with the Desktop version and mostly with the Server version. I will mostly discuss the Server version since that is what I have been using. It is identical (AFAIK) to the Desktop version, so you are not missing anything by using one rather than the other.

RStudio Server

The biggest win for me with RStudio is the Server edition. I can access my work on any system that can communicate with the server. The interface always looks the same, and all I need is a web browser to access it. Before RStudio Server Edition, I had to run two versions of R: R GUI on my local machine for graphics and presentation, and a headless R on a research server for processing, where the server contained my data and the rest of my workflow. I no longer need to run multiple versions of R in my workflow.

First, installation is miraculously easy. I only had a few very minor glitches to deal with. Armed with sudo access to a machine on a research cluster at work, I was able to simply download the RPM and install it using the instructions provided on the web site. Then, all I had to do was fire up a browser and go to

http://servername.com:8787

and I was asked for my login credentials. But I couldn’t get in. This server authenticates using LDAP, but all I had to do was replace the contents of /etc/pam.d/rstudio with the contents of /etc/pam.d/login and I was able to log in. But then there was an “unknown error.” Oh, the version of R that was installed was too old (2.8). I just did a yum upgrade R, and RStudio logged me in with no problems. What showed up on my browser screen was beautiful! It looked identical to the desktop version of RStudio.
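The “unknown error” caused by an outdated R can be spotted from R itself before installing anything. Here is a minimal sketch of such a check. The minimum version used below is a hypothetical placeholder, not a number taken from the RStudio documentation, so look up the real requirement for your release.

```r
# Hypothetical minimum version: a placeholder for illustration only.
minimum_needed <- package_version("2.11.1")

# Version of the R that is running this code.
installed <- getRversion()

# TRUE if the running R is older than the assumed minimum.
too_old <- installed < minimum_needed

# The 2.8 series mentioned above is older than the assumed minimum.
old_example <- package_version("2.8.0") < minimum_needed

if (too_old) {
  message("R ", installed, " is older than ", minimum_needed, "; upgrade first.")
}
```

Running this in the headless R on the server tells you whether an upgrade is needed before you ever see the login page.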


Once logged in, I somehow have access to ALL of my files on the remote server. I can load my data (typically produced by Hadoop) already residing on the server, and I can save output, graphs, data and even the R session itself on the remote server! All while just clicking buttons. No commands to remember, no screwed up PDF files, and most importantly…. no scping files back and forth from the server just to create a plot (X worked well, but had limitations)!

Things I Love about RStudio

I will have to go panel by panel, but even then I will have missed cool features. I also will not discuss features that are already present in the MacOS X R GUI and are repeated and beautified in RStudio.

The R command prompt still looks the same. At first, my reaction was “Damn, what am I supposed to do?” But when the GUI finished loading, the familiar R command prompt appeared in all its 1970ish glory. I immediately started typing commands and seeing fields in the other panes populate and change to display different usages. It left me with an “oh, I see” feeling.

Saves R sessions correctly, and when I return to RStudio, ALL of my work is there! I could never get the save session/image function to work in R GUI. I gave up several years ago. In RStudio, it works properly, but you don’t even need it because… when you leave RStudio and then return, everything is there! The workspace (variables, functions, data, etc), the scripts you were working on, the plots, even the last dang help screen you looked at!
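For comparison, this is roughly what saving and restoring by hand looks like in plain R. It is only a sketch of the base R functions save() and load(), not a description of how RStudio persists sessions internally. The file name and the object are made up for illustration.

```r
# Temporary file standing in for a workspace image (hypothetical path).
ws <- tempfile(fileext = ".RData")

# An object we want to survive the session.
x <- 1:5

# Write the named object to disk; save.image() would write the whole
# global workspace instead.
save(x, file = ws)

# Restore into a separate environment so nothing existing is overwritten.
restored <- new.env()
load(ws, envir = restored)

x_back <- get("x", envir = restored)
```

RStudio makes all of this unnecessary, which is exactly the point.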

The Stop Execution button in the console actually works. When executing a long running computation in R GUI (that’s the first mistake), it is sometimes necessary to cancel the computation either because I made an error, or because the computation is killing my system’s performance. In R GUI, particularly on MacOS X, the Stop Execution button did absolutely nothing, because there was typically a spinning beachball preventing me from clicking it. Hitting ESC also did not work. In RStudio, clicking Stop actually seems to break out of the madness.

Workspace panel. The workspace panel displays the variables, functions, data frames and other objects that reside in the current workspace, a la MATLAB. From this panel, one can also switch or save workspaces. The user can also import a dataset from a text file using a trivial wizard (a la SPSS, etc.), or from a web URL. The user can also clear the workspace. A frequently overlooked command to do the same from the command line is rm(list=ls()), but that is no longer necessary to remember!
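For anyone who has not seen the command-line version, here is a small sketch of what rm(list=ls()) does. I run it against a scratch environment (an assumption made purely so the example does not wipe your real workspace); at the prompt you would simply type rm(list = ls()).

```r
# Scratch environment so the sketch does not touch the real workspace.
scratch <- new.env()
assign("x", 1:10, envir = scratch)
assign("f", function(n) n + 1, envir = scratch)

# ls() lists the names of the objects that are currently defined.
before <- ls(envir = scratch)

# Remove everything that ls() reports: the same idea as rm(list = ls()).
rm(list = ls(envir = scratch), envir = scratch)

after <- ls(envir = scratch)
```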

Clicking on a data frame object in the workspace pane causes it to be displayed in a nice tabular format. It can also be printed to a local printer, or opened in a new window.

Clicking on a numerical value allows the user to change it by opening an in-place edit box. Clicking on other objects like lists, vectors and functions opens an edit window displaying the definition that created it.



Files panel. There is nothing really exciting to see here, except that by clicking the Upload button, I can upload files directly to the remote server just by selecting the file, without having to SCP!


Scripting panel. This is the second best feature of RStudio and has the same feeling as the stock script editor that ships with R. The largest difference is that the editor in RStudio is stable. On MacOS X, the R GUI editor tends to garble 2-3 rows of code together on every single scroll. RStudio’s editor also does a better job of indentation than R GUI. When opening a function, R GUI tends to indent the body properly, but inserts a closing } prematurely. RStudio’s editor also features auto-completion, a feature present at the command line of R GUI and R, but not in the editor of R GUI. The user can also save their script on the remote server, print code to a local printer, and search. Similar to MATLAB, the user can select one or more lines of code and run them by clicking the “Run Line(s)” button, rather than having to copy and paste lines. “Run All” is a point-and-click replacement for source.
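To make the comparison with source concrete, here is a sketch of running a whole script from the console. The script contents and the temporary file are made up for illustration, and I evaluate it in a separate environment only to keep the example tidy.

```r
# Write a tiny two-line script to a temporary file (hypothetical contents).
script <- tempfile(fileext = ".R")
writeLines(c("a <- 2", "b <- a * 21"), script)

# source() evaluates every line of the file in order. Passing an
# environment as `local` keeps the script's variables out of the
# global workspace.
env <- new.env()
source(script, local = env)

result <- get("b", envir = env)
```

At the prompt you would normally just call source("myscript.R") and let the variables land in your workspace.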

The “Source on Save” function is interesting. If enabled, RStudio will run/source the script each time the script is saved. Honestly, I do not find this feature all that useful except in the middle of debugging, and I find it dangerous otherwise. Suppose after a long 10-fold cross-validation computation there is an error that we want to fix. We fix the error and save the script. Do we really want to run the computation again? If R were a compiled language, then yes. Since R is not a compiled language, this feature is not entirely useful in concept.
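If you do leave the option on, one way to soften the cost of re-running everything is to cache the expensive step on disk. This is my own workaround sketch, not an RStudio feature. The function names, the fake cross-validation, and the cache file are all hypothetical.

```r
# Where the cached result lives (hypothetical temporary path).
cache_file <- tempfile(fileext = ".rds")

# Counter so we can see how often the slow step really runs.
runs <- 0

# Stand-in for a long cross-validation; returns a made-up mean accuracy.
slow_cv <- function() {
  runs <<- runs + 1
  Sys.sleep(0.1)
  mean(c(0.81, 0.79, 0.83))
}

# Compute once, then read the stored value on later calls.
cached <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  value <- compute()
  saveRDS(value, path)
  value
}

first  <- cached(cache_file, slow_cv)  # runs the slow step
second <- cached(cache_file, slow_cv)  # reads from disk instead
```

With a guard like this, sourcing the script again after a small fix only repeats the cheap parts.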

The “magic wand” icon contains what I suspect to be a growing collection of coding tools. Currently, the user can comment and uncomment a bunch of lines at once. This is particularly useful since, for some reason, there is no multiline comment flag in R. The user can also select a series of lines and wrap a function around them. This feature could be dangerous for those not familiar with coding but provides a very nice way to put a bunch of code into a function as an afterthought.
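As an illustration of wrapping lines in a function as an afterthought, here is a sketch written by hand. It shows the idea only and is not the exact text the magic wand generates. The variable and function names are made up.

```r
# Loose lines, as they might first appear in a script.
x <- c(2, 4, 6, 8)
centered <- x - mean(x)
scaled <- centered / sd(x)

# The same lines wrapped in a function afterwards. Everything the body
# uses must now come in as an argument, which is where the danger lies.
standardize <- function(values) {
  centered <- values - mean(values)
  centered / sd(values)
}

wrapped <- standardize(x)
```

If the wrapped lines silently depend on a variable that is not passed in, the function will pick it up from the workspace, and that kind of bug is hard to spot.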



Plot panel. By far my favorite part of RStudio is the plot panel! All plots are saved in this panel, and the user can move back and forth among plots that were already constructed. The Export button allows exporting a plot with user-defined dimensions and saving it to the local machine as a PNG, or even copying it to the local machine’s clipboard! Of course, the PDF button produces a PDF file of the plot that can be saved on the local machine. If the plots are all too much, we can click “Clear All” and start again with a clean slate.

But is it possible to create plots of a larger size? I am sure it is, but I did not spend much time looking.
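One route that does not depend on the plot panel at all is to open a graphics device with explicit dimensions from the console. The sketch below uses base R’s pdf() device, where width and height are given in inches. The output path is a temporary file chosen for illustration, and whether the panel itself offers a larger view is something I have not checked.

```r
# Output file (hypothetical temporary path).
out <- tempfile(fileext = ".pdf")

# Open a PDF device that is 12 by 9 inches.
pdf(file = out, width = 12, height = 9)

# Draw on the device using the built-in `cars` data set.
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")

# Close the device so the file is written out completely.
invisible(dev.off())

written <- file.exists(out)
```

The png() device works the same way, with width and height given in pixels by default.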


LaTeX and Sweave documents. From the File menu the user can create new documents including LaTeX and Sweave. Unfortunately, I cannot experiment more with these features because there is something amiss in my configuration. For students and researchers, having Sweave and LaTeX integrated with RStudio is a huge, huge, huge advantage. No longer must we copy/paste among different programs. To make the integration complete, BibTeX, Asymptote/TikZ/gnuplot whatever should be easily included by the user.

At any point if the user interface shows stale data, there is a Reload button to help you out by refreshing the entire RStudio interface.


Things that Need Improvement

I do not really have any complaints about RStudio; quite the opposite, actually. However, there are some things that do not seem to work. I should note, however, that I have not spent much (well, any) time debugging them. The developers are probably already working on some of them. Some of them are probably problems in my configuration and others are probably settings that I need to tweak.

No auto-completion of parentheses or quotation marks. This is a bummer, but not a deal breaker. On the other hand, as you type closing marks, RStudio highlights the matching mark.

The dataset view needs work. Columns can’t be resized. Other natural functionality that seems to be missing: column renaming (a call to names), sorting or ordering values by a column, and data manipulation (I didn’t say that). These missing features are a tad disappointing, but a hell of a lot better than displaying in the terminal.
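Until the viewer grows these features, renaming and sorting are one-liners at the console. Here is a sketch with a made-up data frame; the column names are hypothetical.

```r
# A small example data frame (made up for illustration).
df <- data.frame(a = c(3, 1, 2), b = c("x", "y", "z"))

# Rename column "a" to "score" with a call to names().
names(df)[names(df) == "a"] <- "score"

# Sort the rows by the renamed column using order().
sorted <- df[order(df$score), ]
```

Clicking on sorted in the workspace panel then shows the reordered table.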

The Install Packages function in the Packages panel does not work on our server’s configuration.

LaTeX cannot be found. Upon attempting to create a new LaTeX or Sweave document, I got a friendly notice (instead of a bizarre error message) saying that LaTeX is not installed. The problem is, it is installed and there does not seem to be anywhere in the GUI to configure its location. Additionally, some LaTeX templates would be useful.


In Conclusion…

My Workflow Before and After RStudio

[Figures: two workflow diagrams, “Before RStudio” and “After RStudio”.]

All in all, the biggest win for me with RStudio is the Server edition. I can access my work on any system that can communicate with the server. The interface always looks the same, and all I need is a web browser to access it. I no longer need to run multiple versions of R in my workflow.

The developers of this open source project seemed to get it right on the first try. How the hell is that possible??? So has anyone switched from the big R to the big blue ball?

29 comments to My First Few Days with RStudio

  • JL

    Great post on this new IDE. I’m still trying to figure out an ideal R setup, particularly for the situation you describe where all my data is stored on a remote server and I need to generate a lot of graphics using that data. I’ve messed around with RStudio myself and it seems very promising. One thing I’ve been wondering about though – from what I’ve read online it seems that if you want to really be a hardcore R user, a lot of people recommend investing in learning to use emacs+ess. Do you have any thoughts on emacs+ess vs RStudio, particularly when working with data on a remote server as you describe?

    • I have seen quite a din on Twitter about Emacs + ESS. I consider myself a hardcore R user, and I have no desire to use Emacs/ESS. I do not even know much about it unfortunately. But, part of the reason for that is that I am a vim supporter ;) . The allure of using Emacs/ESS seems to be that it makes use of the command-line version of R, and has some bling in the editor. It’s true though that Vim does not have a “special” mode like Emacs does (ESS).

      Plots were the deciding factor for me. It’s also the fact that I do not need a bunch of terminal windows, or frequent scping, or using sshfs to edit my files, and display plots.

  • Great post Ryan!
    I’ve been keeping an eye on the support forum of RStudio, and can’t wait to see their next major release.

    Cheers,
    Tal

  • iThink, iAct

    Thanks for the post, thanks for sharing your experience thus far!
    Awesome… and some fine details! Just like Tal, I too am looking forward to their next major release.

  • Yeah, RStudio is great, and after a few changes it would be even greater. I’m waiting for UTF-8 support; without it RStudio is useless for me.
    I’m wondering, can more than one user work with RStudio on server?

    • That’s a good point. I didn’t even consider UTF-8 (although it always crops up on me in Python). Yes, more than one user can use RStudio. It transparently uses the system’s authentication. Not exactly sure how they did that so easily, but it was very seamless. Each user logs in to an RStudio session using the username and password used by the server to authenticate the user for other services, like SSH. If another user logs in, I imagine RStudio just spawns off another process for the interface, and another process for R. Right now, I am the only one using RStudio on the server. If I find out that others have problems, I will write a followup post.

    • Hi Ieva, we’re definitely hoping to add UTF-8 support soon. We understand how useless RStudio is for many of our potential users until we add this capability.

      Yes, more than one user can work with RStudio server. Each user works out of his/her own home directory and has a separate R process. We have tested RStudio server extensively with many dozens of concurrent users and it runs quite smoothly at this point.

  • John

    Just a couple of notes about the Mac R GUI. It kept your multiple graphs automatically already. If you want to see your prior plots just bring the quartz window to the front and press command left arrow. Also, when that plot is in front now plotting goes into that space so you could add lines and such. In addition, the code editor also did multi-line commenting and uncommenting with command ‘ and option-command ‘.

  • I’m surprised about not being able to run up plots on the remote server; did you try getting a simple X server or VNC going?

    • Yeah, I’ve done plots over X. There tends to be a bit of a delay due to network latency, and the graphics always look a bit different…a bit retro. Since I do not manage the server (except for having sudo access), VNC isn’t really an option.

  • Forgot to mention, you can probably fix your latex problem by installing texinfo on the server. We (erroneously) look for the presence of texi2dvi instead of pdflatex. This’ll be fixed in the next version.

  • Piet

    Great article! And RStudio is great too, indeed! I totally agree about the “stop” button. In the R GUI (on OS X or Windows…), it never works! In RStudio it works like a charm!

    One thing you forgot to mention about the RStudio Server version is the ability to actually leave your session while R is working, and come back later to see if the computations are done. For instance, you launch your code at work and you want to see the results at home (am I the only one to do that? :-D )…

    It can even create plots in the plot area when you are not connected! When working in R over an ssh connection, if you leave your terminal, R is naturally shut down, so it is a problem to send remote jobs to R easily… OK, you can launch scripts with R called in a “screen” or “at” process, but this is far less convenient, for instance for getting your R session workspace back.

  • Holger

    Thanks so much for the post! Do you have any experience with Eclipse-StatET, and what are the pros of the RStudio desktop version compared with StatET?
    Thanks!

  • Simon

    I used to use Eclipse (under Ubuntu) but found it frustrating, maybe because I didn’t use it long enough, but setting up new projects was always a chore and it would crash occasionally.

    I’ve actually been using Kate (a KDE text editor I assume), which has been pretty great. It’s got a terminal built in, and I had to add some hot keys, but for the most part it worked pretty well.

    That said, RStudio is awesome. It’s really helped clean up my desktop, and it’s very pretty. The workspace panel is fantastic and having the plot panel built in, with the ability to export easily is a real advantage.

    You touched on one thing I’d like to see, automatic brace completion for functions and loops, but that’s not really a need. I’d really like to see the ability to collapse sections of the code bounded by braces though. If I’ve got functions or if statements in Kate I can click on their line number and everything in the braces gets hidden, which makes scooting through the code a bit easier during debugging.

    That said, RStudio, yay! We convinced a few labs at the University here to switch over, and they love it.

  • [...] RStudio (I actually don’t know anyone that uses it, but it looks promising based on this short review and a longer, very informative review.) [...]

  • amazonaws

    I installed RStudio Server on Amazon EC2, on a 64-bit Ubuntu instance (ami-da0cf8b3). It’s easy to install R and RStudio. I tested it with two users at the same time. Wonderful. It’s magic.

  • Just stumbled across this page. I’ve been maintaining AMIs specifically for RStudio Server for a few months now. Details are at http://www.louisaslett.com/RStudio_AMI/ in case they’re any use … it saves the setup time from a standard OS.

    • Thanks! I remember seeing a tweet about this. I look forward to trying it. My company and I use Amazon EC2 extensively and I’ve been wanting to install RStudio Server on an EBS backed instance, but keys and security make it difficult to open a port. I will have to try this!

      • No probs. Well, if it helps open port wise, then in that AMI I actually have standard port 80 rather than the usual 8787 for that reason: I know some corporate security policies don’t like non-standard ports for HTTP.

        If by security you meant encryption, then there are a few things I’d like to do in future AMI releases, one being adding SSL support (though it would obviously be an unsigned cert because of changing EC2 IPs), but that’ll be when I have the time!

  • Elda

    I am trying to plot a linear model and a regression line using GSS7210. I am using the variables educ,opknow, and earndes. How would I incorporate these things in order to make these two graphs?

    Thanks

  • Great post. After stubbornly resisting graphical R IDEs for over 10 years, I finally switched from ssh/terminal + emacs to RStudio. Definitely the best R IDE ever made and I recommend it to anyone who hasn’t tried it.

    What finally got me to switch was RStudio server edition which I run on my 12-core MacPro. I did some port-forwarding magic and now I can easily rejoin my session from anywhere. It’s great that they even have an integrated shell tool so you can jump to the system shell whenever you need to. Great software!

    I should mention that it was a bit tricky to get RStudio Server to build and run on OS X Lion. I documented the procedure on a wiki in case there are any Mac users out there who want to run RStudio Server: http://sagebionetworks.jira.com/wiki/display/SYNR/Building+RStudio+from+Source+%28OSx+Lion%29

  • Just stumbled upon this page and have to say it looks nice. I am taking a Statistics class and find RStudio Server to be a great way of getting my homework done, even on the road. With a MiFi router I can access my home server from everywhere I go! This is 5-star software, and your site is a good getting-started page.

    Support Open Source!

  • Andy

    Great post, Ryan! I am currently switching to RStudio and have found it a great IDE that is easy to use. I am curious whether you have run into one of my problems: when I invoke the system("./run") command, it says “sh: ./run Permission Denied”. Why don’t I have the same permissions as when I work in the R shell?

    Thanks again for the nice post.

    Andy

  • [...] will not duplicate the efforts of previous bloggers (see here and here) by providing an overview of RStudio’s best features, but I particularly love the [...]
