Introduction to RLink

R is a programming language and software environment for statistical computing and graphics. R is an open-source project, and a result of a large community effort. More information about R can be found at http://www.r-project.org.

RLink is a Wolfram System application that uses the J/Link and rJava/JRI Java libraries to link to R. It allows you to pass data between the Wolfram Language and R and to execute R code from within the Wolfram Language.

This tutorial gives a quick snapshot of what RLink allows you to do and of the typical RLink workflow. For a more in-depth account of RLink, consult the "RLink User Guide" and the tutorials "Functions" and "R Data Types in RLink", as well as the documentation pages for the individual functions of the RLink API, listed in "Reference".

Typical Workflow

You must load the application before you can use it.

Now you need to install the R runtime.
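
The two setup steps above can be sketched as follows. The original input cells are not reproduced in this text, so this is a minimal reconstruction that assumes the standard package context name RLink` and the RLink function InstallR called with no arguments.

```wolfram
(* Load the RLink package; the context name ends with a backtick. *)
Needs["RLink`"]

(* Start the R runtime that RLink communicates with. *)
InstallR[]
```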

You are now ready to work. A typical workflow iterates over three steps: send data from the Wolfram Language to R, execute some R code, and get the result back into the Wolfram Language. The rest of this section gives a condensed walk-through of that workflow.

As a first example, the following code sends a number to R and has R determine whether it is even or odd. First, pick a number.

Now send it to R.

Finally, test the condition.
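
The three steps just described might look like the sketch below. The names myNumber and mynum and the value 42 are illustrative choices, not taken from the original cells; RSet and REvaluate are the RLink functions named in this tutorial.

```wolfram
Needs["RLink`"]; InstallR[];  (* can be skipped if already done in this session *)

(* Step 1: pick a number in the Wolfram Language. *)
myNumber = 42

(* Step 2: send it to R, binding it to the R variable mynum. *)
RSet["mynum", myNumber]

(* Step 3: evaluate a string of R code; %% is the modulo operator in R. *)
REvaluate["mynum %% 2 == 0"]
(* returns {True}: a one-element logical vector, shown as a List *)
```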

RSet was used to send the data to R, assigning it to some variable (or expression) in the R workspace, and REvaluate was used to evaluate a string of R code and return the result back to the Wolfram Language. In fact, REvaluate combines the last two of the three steps mentioned, since it both evaluates the code and returns the result.

The result is wrapped in a List. This is because it is in fact a vector with a single logical element, from the viewpoint of R.

Things will not be much different if a list of numbers is considered, rather than a single number, because many R functions are naturally vectorized.
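
A sketch of the vectorized case, under the same assumptions as before (the R variable name mynums and the input Range[10] are illustrative):

```wolfram
Needs["RLink`"]; InstallR[];  (* can be skipped if already done in this session *)

(* Send a whole list; R receives it as an integer vector. *)
RSet["mynums", Range[10]];

(* The same R expression now tests every element at once. *)
REvaluate["mynums %% 2 == 0"]
```

The R code is unchanged from the single-number case, because %% and == operate elementwise on vectors.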

To call R functions with a more streamlined syntax, you can use RFunction. You can use it to have a more concise solution for the previous example, and avoid the step introducing a global variable.
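
One way this could look, as a sketch: RFunction is given a string of R code that evaluates to a function, and the resulting object is applied directly to Wolfram Language data. The name isEven is hypothetical.

```wolfram
Needs["RLink`"]; InstallR[];  (* can be skipped if already done in this session *)

(* Wrap an anonymous R function; no global R variable is needed. *)
isEven = RFunction["function(x) x %% 2 == 0"];

(* Arguments are converted to R, and the result is converted back. *)
isEven[Range[10]]
```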

Type Conversion and Internal Form of Expressions

First, load RLink:

Define a set of random numbers.

If you need to also keep the original numbers, you can get a little more sophisticated.
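
A possible version of these two steps is sketched below. It assumes the test in question is the even/odd check from the previous section; the names randomInts, pairUp, and result, and the range of the random integers, are illustrative.

```wolfram
Needs["RLink`"]; InstallR[];  (* can be skipped if already done in this session *)

(* Ten random integers between 1 and 100. *)
randomInts = RandomInteger[{1, 100}, 10]

(* For each element, build an R list holding the number and the test result. *)
pairUp = RFunction[
   "function(x) lapply(x, function(elem) list(elem, elem %% 2 == 0))"];

result = pairUp[randomInts]
```

Because lapply returns an R list whose elements are themselves lists, the value that comes back is more deeply nested than a Wolfram Language user might expect.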

In the Wolfram Language, the analogous code looks like the following and produces a list with a simpler structure.
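
As a sketch, assuming the predicate is an even/odd test (randomInts and pairs are illustrative names):

```wolfram
randomInts = RandomInteger[{1, 100}, 10];

(* Pair each number with the result of EvenQ; each pair is a flat two-element list. *)
pairs = {#, EvenQ[#]} & /@ randomInts
```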

This shows some of the differences in the type systems of R and the Wolfram Language. What was returned is a list of lists, each sublist having two elements: a one-element integer vector (original number), and a one-element logical vector (the value of the test predicate).

You can use the ToRForm function to see the internal RLink representation of data, which can be both input data from the Wolfram Language and the results obtained from R. In this particular case, the preceding description of the structure of the result is easy to confirm.
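
The sketch below applies ToRForm to a small hand-written expression with the same shape as the result described above, so that it does not depend on any earlier evaluation. The name sample and its contents are made up for illustration.

```wolfram
Needs["RLink`"]; InstallR[];  (* can be skipped if already done in this session *)

(* Two sublists, each holding a one-element integer vector and a one-element logical vector. *)
sample = {{{42}, {True}}, {{7}, {False}}};

(* Show the internal RLink representation of this data. *)
rform = ToRForm[sample]
```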

This is the internal form of the expression stored in result, which RLink uses to communicate data to and from R. You can use this form in RSet, RFunction, and REvaluate, just as you would the short form. Most of the time, however, the internal form is much less convenient to use.

You can also use the FromRForm function to perform an inverse transformation.
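
A minimal round-trip sketch, using a hand-written expression (the name sample and its contents are illustrative) rather than a value obtained from R:

```wolfram
Needs["RLink`"]; InstallR[];  (* can be skipped if already done in this session *)

sample = {{{42}, {True}}, {{7}, {False}}};

(* Convert to the internal form and back again. *)
roundTrip = FromRForm[ToRForm[sample]]

(* If FromRForm inverts ToRForm for this input, the comparison gives True. *)
roundTrip === sample
```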

In fact, the structure of an expression in short form can often be simplified. For example, the inner lists in the result could have been flattened, and RLink would still interpret it correctly.

You can see now that the full form of this list is the same.
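
The comparison can be sketched on a small hand-written expression (sample and flattened are illustrative names). Based on the description above, the two internal forms would be expected to coincide, but that is the text's claim rather than something verified here.

```wolfram
Needs["RLink`"]; InstallR[];  (* can be skipped if already done in this session *)

sample = {{{42}, {True}}, {{7}, {False}}};

(* Flatten each sublist: {{42}, {True}} becomes {42, True}. *)
flattened = Flatten /@ sample

(* Compare the internal forms of the nested and the flattened versions. *)
ToRForm[flattened] === ToRForm[sample]
```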

There are a few cases when the automatic type identification in RLink is ambiguous. One notable case is when you want to construct a list of elements of the same native type: in such a case it will be interpreted as a vector by default. It is possible to force the list interpretation; more details on that can be found on the documentation page for RList, and also in "Data Types".

Writing Your Own R Functions

You are not confined to R one-liners when working with RLink. To illustrate this point, the following example implements in R a Wolfram Language-style Split function that works on vectors.

First, load RLink:

This defines the function in question:
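
The original definition is not reproduced in this text. The sketch below is one possible implementation, not necessarily the one the tutorial used: it handles numeric vectors only, and the name mmaSplit is hypothetical. It marks the start of each run with diff, turns the marks into run numbers with cumsum, and groups the elements with split.

```wolfram
Needs["RLink`"]; InstallR[];  (* can be skipped if already done in this session *)

(* Define mmaSplit in the R workspace. An empty vector yields an empty list;
   unname drops the list names that split would otherwise attach. *)
REvaluate["
  mmaSplit <- function(x) {
    if (length(x) == 0) return(list())
    runId <- cumsum(c(TRUE, diff(x) != 0))
    unname(split(x, runId))
  }
"];
```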

Note the semicolon: it is generally used to suppress the output and the associated data transfer from R to the Wolfram Language.

You can test it now. First, create some random numbers with a likelihood of consecutive runs.

The following will split the list, using the function defined previously.
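
A sketch of both steps follows. So that it runs on its own, it redefines a compact, hypothetical mmaSplit for numeric vectors; the names testData and runs and the choice of integers between 1 and 3 are illustrative.

```wolfram
Needs["RLink`"]; InstallR[];  (* can be skipped if already done in this session *)

(* Compact stand-in definition, repeated here so that this cell is self-contained. *)
REvaluate["mmaSplit <- function(x) if (length(x) == 0) list() else unname(split(x, cumsum(c(TRUE, diff(x) != 0))))"];

(* A small value range makes repeated neighbors likely. *)
testData = RandomInteger[{1, 3}, 20]

(* Send the data to R and call the function there. *)
RSet["testData", testData];
runs = REvaluate["mmaSplit(testData)"]
```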

Here is an exactly equivalent Wolfram Language program.
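
In the Wolfram Language this is the built-in Split, sketched here on freshly generated data (testData is an illustrative name):

```wolfram
testData = RandomInteger[{1, 3}, 20];

(* Split groups runs of identical consecutive elements into sublists. *)
Split[testData]
```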

It is not necessary to give a function you define in the R workspace a name, since both R and RLink support anonymous functions. In particular, you could have defined the following.

And then you can use it.
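
A sketch of the anonymous variant, again using a hypothetical splitting function for numeric vectors. The Wolfram Language symbol splitFun is an illustrative name; nothing is assigned in the R workspace.

```wolfram
Needs["RLink`"]; InstallR[];  (* can be skipped if already done in this session *)

(* The R function exists only as the value wrapped by RFunction. *)
splitFun = RFunction[
   "function(x) if (length(x) == 0) list() else unname(split(x, cumsum(c(TRUE, diff(x) != 0))))"];

(* Apply it directly to Wolfram Language data. *)
testData = RandomInteger[{1, 3}, 20];
runs = splitFun[testData]
```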
