Introduction to RLink

R is a programming language and software environment for statistical computing and graphics. R is an open-source project, and a result of a large community effort. More information about R can be found at http://www.r-project.org.

RLink is a Wolfram System application that uses the J/Link and RJava/JRI Java libraries to link to R functionality. It lets you exchange data between the Wolfram Language and R and execute R code from within the Wolfram Language.

This tutorial gives a quick snapshot of what RLink allows you to do and of the typical RLink workflow. For a more in-depth account of RLink, consult the "RLink User Guide" and the tutorials "Functions" and "R Data Types in RLink", as well as the documentation pages for the individual functions of the RLink API, listed in "Reference".

Typical Workflow

You must load the application before you can use it.
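A minimal sketch of the loading step, assuming RLink is available under its standard context name:

```wolfram
(* Load the RLink application into the current session *)
Needs["RLink`"]
```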

Now you need to install the R runtime, which starts R and connects it to the Wolfram Language session.
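This step is typically a single call to InstallR. The sketch below assumes that the default settings are suitable for your system; any options for a custom R location are left out.

```wolfram
Needs["RLink`"];   (* make sure RLink is loaded *)
InstallR[]         (* install the R runtime into this session *)
```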

You are now ready to work. The workflow normally consists of iterations of the following three steps: send data from the Wolfram Language to R, execute some R code, and get the result back to the Wolfram Language. What follows is a brief, condensed walk through this typical RLink workflow.

As a starting example, the following code sends a number to R and has R determine whether it is even or odd. First, pick a number.

Now send it to R.

Finally, test the condition.
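The three steps above can be sketched as follows. The names myNum and isEven are chosen here for illustration only.

```wolfram
Needs["RLink`"];
InstallR[];   (* starts the R runtime; skip if it is already running *)

myNum = 42;                             (* step 1: pick a number *)
RSet["myNum", myNum];                   (* step 2: assign it to the R variable myNum *)
isEven = REvaluate["myNum %% 2 == 0"]   (* step 3: run R code and fetch the result *)
```

For an even number such as 42, the value returned to the Wolfram Language is {True}.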

RSet was used to send the data to R, assigning it to some variable (or expression) in the R workspace, and REvaluate was used to evaluate a string of R code and return the result back to the Wolfram Language. In fact, REvaluate combines the last two of the three steps mentioned, since it both evaluates the code and returns the result.

The result is wrapped in a List. This is because, from the viewpoint of R, it is a vector with a single logical element.

Things will not be much different if a list of numbers is considered, rather than a single number, because many R functions are naturally vectorized.
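A sketch of the same test applied to a list, assuming the integers 1 through 10 as sample data. The R expression is unchanged because %% and == work elementwise on vectors.

```wolfram
Needs["RLink`"];
InstallR[];   (* starts the R runtime; skip if it is already running *)

myNums = Range[10];
RSet["myNums", myNums];                     (* send the whole list as an R vector *)
evenFlags = REvaluate["myNums %% 2 == 0"]   (* one logical value per element *)
```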

To call R functions with a more streamlined syntax, you can use RFunction. You can use it to have a more concise solution for the previous example, and avoid the step introducing a global variable.
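One way this could look, with the R function written inline as a string. The name isEvenR is a Wolfram Language symbol picked for this sketch; nothing is assigned in the R workspace.

```wolfram
Needs["RLink`"];
InstallR[];   (* starts the R runtime; skip if it is already running *)

(* Wrap an anonymous R function so it can be applied like a Wolfram Language function *)
isEvenR = RFunction["function(x){x %% 2 == 0}"];
evenFlags = isEvenR[Range[10]]
```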

Type Conversion and Internal Form of Expressions

First, load RLink:

Define a set of random numbers.

If you need to also keep the original numbers, you can get a little more sophisticated.
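The following is a sketch of such a call, not necessarily the original example. It assumes ten random integers between 1 and 100 and uses R's lapply to pair each number with the result of the test; randomNums and result are illustrative names.

```wolfram
Needs["RLink`"];
InstallR[];   (* starts the R runtime; skip if it is already running *)

randomNums = RandomInteger[{1, 100}, 10];

(* For every element, build an R list holding the number and the test result *)
result = RFunction[
   "function(x){lapply(x, function(el) list(el, el %% 2 == 0))}"
 ][randomNums]
```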

In the Wolfram Language, the analogous code looks like the following and produces a list with a simpler structure.
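A Wolfram Language counterpart might be written as below, using a small fixed list in place of random data so that the structure is easy to see:

```wolfram
nums = {3, 8, 15};

(* Pair each number with the result of EvenQ; no wrapping of scalars occurs *)
pairs = {#, EvenQ[#]} & /@ nums
(* {{3, False}, {8, True}, {15, False}} *)
```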

This shows some of the differences in the type systems of R and the Wolfram Language. What was returned is a list of lists, each sublist having two elements: a one-element integer vector (original number), and a one-element logical vector (the value of the test predicate).

You can use the ToRForm function to see the internal RLink representation of data, which can be both input data from the Wolfram Language and the results obtained from R. In this particular case, the preceding description of the structure of the result is easy to confirm.
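As a sketch, ToRForm can be applied to a value of the shape described above. The sample data here is written by hand to mimic a result returned from R; it is an assumption, not output copied from a session.

```wolfram
Needs["RLink`"];
InstallR[];   (* starts the R runtime; skip if it is already running *)

(* Hand-written stand-in for a result: a list of {integer vector, logical vector} pairs *)
result = {{{42}, {True}}, {{7}, {False}}};

(* Show the explicit internal representation, with R lists and typed vectors spelled out *)
internal = ToRForm[result]
```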

This is the internal form of the expression stored in result, which RLink uses to communicate data to and from R. You can use this form in RSet, RFunction, and REvaluate, just like the short form. Most of the time, however, the internal form is much less convenient to use.

You can also use the FromRForm function to perform an inverse transformation.
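A round trip can be sketched like this, again with hand-written sample data standing in for a result obtained from R:

```wolfram
Needs["RLink`"];
InstallR[];   (* starts the R runtime; skip if it is already running *)

result = {{{42}, {True}}, {{7}, {False}}};   (* assumed sample data *)

internal = ToRForm[result];        (* short form -> internal form *)
roundTrip = FromRForm[internal]    (* internal form -> short form *)
```

If the two functions are inverses for this input, as the text indicates, roundTrip matches result.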

In fact, the structure of an expression in the short form can often be simplified. For example, the inner lists in the result could have been flattened, and RLink would still interpret the expression correctly.

You can see now that the full form of this list is the same.
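The comparison can be sketched as follows, assuming hand-written sample data in which nested is the shape returned from R and flat is the simplified version with the innermost lists removed:

```wolfram
Needs["RLink`"];
InstallR[];   (* starts the R runtime; skip if it is already running *)

nested = {{{42}, {True}}, {{7}, {False}}};   (* one-element vectors written explicitly *)
flat = {{42, True}, {7, False}};             (* same data with inner lists flattened *)

(* Compare the internal forms that RLink derives from the two spellings *)
sameFormQ = ToRForm[nested] === ToRForm[flat]
```

According to the text, both spellings are expected to map to the same internal form.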

There are a few cases when the automatic type identification in RLink is ambiguous. One notable case is when you want to construct a list of elements of the same native type: in such a case it will be interpreted as a vector by default. It is possible to force the list interpretation; more details on that can be found on the documentation page for RList, and also in "Data Types".

Writing Your Own R Functions

You are not confined to R one-liners when working with RLink. To illustrate this point, this section implements in R a function that works on vectors and behaves like the Wolfram Language Split function.

First, load RLink:

This defines the function in question:
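The definition below is one possible implementation, not necessarily the one originally shown. The R name mmaSplit is hypothetical. It labels each run of identical consecutive elements with a counter and then groups the elements by that label.

```wolfram
Needs["RLink`"];
InstallR[];   (* starts the R runtime; skip if it is already running *)

(* Define the function in the R workspace; the R code is passed as one string *)
REvaluate["
  mmaSplit <- function(x) {
    if (length(x) == 0) return(list())
    # TRUE marks the start of each run; cumsum turns the marks into run ids
    runId <- cumsum(c(TRUE, x[-1] != x[-length(x)]))
    # group by run id and drop the names that split() attaches
    unname(split(x, runId))
  }
"];
```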

Note the semicolon: it is generally used to suppress the output and the associated data transfer from R to the Wolfram Language.

You can test it now. First, create some random numbers with a likelihood of consecutive runs.

The following will split the list, using the function defined previously.
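A sketch of the test, assuming integers drawn from a small range so that repeats are likely. The R definition (under the hypothetical name mmaSplit) is repeated in compact form so that the snippet runs on its own.

```wolfram
Needs["RLink`"];
InstallR[];   (* starts the R runtime; skip if it is already running *)

(* Same illustrative definition as in the definition step, in compact form *)
REvaluate["
  mmaSplit <- function(x) {
    if (length(x) == 0) return(list())
    unname(split(x, cumsum(c(TRUE, x[-1] != x[-length(x)]))))
  }
"];

data = RandomInteger[{1, 3}, 20];   (* small range makes consecutive runs likely *)
RSet["testData", data];             (* send the data to R *)
runs = REvaluate["mmaSplit(testData)"]
```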

Here is an exactly equivalent Wolfram Language program.
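For comparison, the built-in Split performs the same grouping of identical consecutive elements. The sketch uses a fixed list so that the result can be read off directly; it stands in for, and may differ from, the program originally shown here.

```wolfram
data = {1, 1, 2, 2, 2, 3, 1, 1};

(* Group runs of identical adjacent elements *)
runs = Split[data]
(* {{1, 1}, {2, 2, 2}, {3}, {1, 1}} *)
```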

It is not necessary to give a function you define in the R workspace a name, since both R and RLink support anonymous functions. In particular, you could have defined the following.

And then you can use it.
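An anonymous version could be sketched as below. The Wolfram Language symbol splitRF is a name chosen for this example, and the R body is an illustrative implementation; no name is created in the R workspace.

```wolfram
Needs["RLink`"];
InstallR[];   (* starts the R runtime; skip if it is already running *)

(* Hold a reference to an anonymous R function on the Wolfram Language side *)
splitRF = RFunction["
  function(x) {
    if (length(x) == 0) return(list())
    unname(split(x, cumsum(c(TRUE, x[-1] != x[-length(x)]))))
  }
"];

data = {1, 1, 2, 2, 2, 3, 1, 1};
runs = splitRF[data]   (* apply it directly to Wolfram Language data *)
```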