Using R in Java with Rserve

R is a free software environment for statistical computing. Rserve is used as a TCP/IP server to run R libraries from within Java. In this tutorial, I want to show you how to use the power of R in a Java application using Rserve.

Rserve is a TCP/IP server which allows other programs to use R. Every connection has a separate workspace and working directory. Typically, you integrate the R backend to make plots, call statistical packages, etc.

Setup

For this example, I use the forecast package for R, a library providing forecast methods and error measures to forecast univariate time series.

To install this package, open R and click on Packages -> Install package(s). Select a download mirror and select forecast in the next window. R downloads the package and all dependencies. Also install the Rserve package.

Alternatively, run the following commands in your R console:

Install packages in R
install.packages("Rserve")
install.packages("forecast")

Rserve provides two client JARs, REngine.jar and RserveEngine.jar, which a Java program uses to communicate with R. You can download both files from the Rserve download page. Add them to your Java project's library folder.

Start Rserve

After installing Rserve, we have to start the server. Type the following commands in your R console to load the Rserve package and start the server.

Start the Rserve server
library(Rserve)
Rserve()

By default, the Rserve server listens on port 6311.

Call R from Java

Let’s start by creating a new Rserve process.

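// Needs java.io.IOException; SystemUtils comes from Apache Commons Lang
private static final String PATH_TO_R = SystemUtils.IS_OS_UNIX ? "R" : "C:\\Program Files\\R\\R-3.5.1\\bin\\x64\\R.exe";

synchronized void createRserveProcess(int port) throws IOException {
    // Pass each argument separately: Runtime.exec(String) splits on whitespace
    // and would hand the quote characters to R as part of the expression
    String expression = "library(Rserve);Rserve(port=" + port + ")";
    new ProcessBuilder(PATH_TO_R, "-e", expression).start();
}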

On Unix systems this is straightforward, as a single Rserve instance can serve multiple connections by forking a separate process for each one. On Windows, Rserve can't create a separate process by forking the current process. There we have to create a new Rserve process for each thread, each listening on a different port, and establish a new Rserve connection on the corresponding port.
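The per-thread port bookkeeping can be sketched roughly as follows. This is only an illustration: RservePortAllocator and portFor are hypothetical names, not part of Rserve, and the starting port is assumed to be the Rserve default of 6311.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hands out one Rserve port per worker thread (hypothetical helper, not part of Rserve). */
final class RservePortAllocator {
    // Next free port; each newly seen thread takes the current value and bumps it by one
    private final AtomicInteger nextPort;
    // Remembers which port was reserved for which thread id
    private final ConcurrentMap<Long, Integer> portByThread = new ConcurrentHashMap<>();

    RservePortAllocator(int firstPort) {
        this.nextPort = new AtomicInteger(firstPort);
    }

    /** Returns the port reserved for the given thread id, reserving a new one on first use. */
    int portFor(long threadId) {
        return portByThread.computeIfAbsent(threadId, id -> nextPort.getAndIncrement());
    }
}
```

A call such as createRserveProcess(allocator.portFor(Thread.currentThread().getId())) could then take the place of incrementing a shared port counter by hand.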

private static final String HOST = "localhost";
RConnection connection = new RConnection(HOST, PORT);

The port has to be the same as the one passed to createRserveProcess.
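One practical detail: launching an external process from Java returns immediately, so the Rserve server may not be listening yet when the connection is opened. A minimal sketch of waiting for the port is shown below. PortWaiter and waitForPort are hypothetical names, the timeouts are arbitrary, and the check is a plain TCP probe rather than an Rserve handshake.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Polls a TCP port until something accepts connections (hypothetical helper). */
final class PortWaiter {
    /**
     * Returns true as soon as host:port accepts a TCP connection,
     * or false if that does not happen within timeoutMillis.
     */
    static boolean waitForPort(String host, int port, long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        do {
            try (Socket socket = new Socket()) {
                // Short connect timeout so a single attempt cannot block for long
                socket.connect(new InetSocketAddress(host, port), 200);
                return true;
            } catch (IOException notReadyYet) {
                Thread.sleep(100); // nothing is listening yet, retry shortly
            }
        } while (System.currentTimeMillis() < deadline);
        return false;
    }
}
```

Calling PortWaiter.waitForPort(HOST, PORT, 5000) before new RConnection(HOST, PORT) would be one way to use it. Retrying the RConnection constructor itself in a similar loop should work as well.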

Then, we load our packages in R by making a call through the previously created connection.

synchronized void init(RConnection connection) throws RserveException {
    // use the first library path known to R
    // (cat() would only print the path and return NULL)
    connection.voidEval("library.path <- .libPaths()[1]");

    // load each package separately
    connection.voidEval("library(\"forecast\", lib.loc = library.path)");
}
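Once the package is loaded, a forecast can be requested by sending an R expression over the same connection. The helper below is only a sketch of how such an expression might be built. ForecastExpressions and pointForecast are hypothetical names, and auto.arima is just one of the models the forecast package offers. The block itself is plain string handling and does not need a running server.

```java
/** Builds R code for a point forecast (hypothetical helper for illustration). */
final class ForecastExpressions {
    /**
     * Returns R code that fits auto.arima to the numeric vector stored in the
     * R variable named by `variable` and yields the next `horizon` point forecasts.
     */
    static String pointForecast(String variable, int horizon) {
        // Only accept a plain variable name so that no arbitrary R code can be injected
        if (!variable.matches("[A-Za-z][A-Za-z0-9._]*")) {
            throw new IllegalArgumentException("not a plain R variable name: " + variable);
        }
        if (horizon < 1) {
            throw new IllegalArgumentException("horizon must be at least 1");
        }
        return "as.numeric(forecast(auto.arima(ts(" + variable + ")), h = " + horizon + ")$mean)";
    }
}
```

With a live connection, the series could be sent with connection.assign("x", series) for a double[] series, and the result read with connection.eval(ForecastExpressions.pointForecast("x", 3)).asDoubles().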

Finally, we have to tear down every connection we opened, one for each thread we started, and shut down the Rserve process behind it.

void tearDown() {
    for (RConnection connection : threads.values()) {
        try {
            // ask the server to exit while the connection is still open
            connection.shutdown();
        } catch (RserveException e1) {
            System.err.println("Error shutting down Rserve: " + e1.getMessage());
        } finally {
            connection.close();
        }
    }
}

To put it all together, see the following code snippet. First, we start an Rserve process. Then we create a connection to R and initialise our libraries. Finally, we store the connection in a map of threads so that we can tear it down later. We increase the port number so that the next Rserve process does not try to listen on the same port.

createRserveProcess(PORT);
RConnection connection = new RConnection(HOST, PORT);
init(connection);
threads.put(threadId, connection);

PORT++;

This Java implementation is part of a bigger forecast module. You can view the code on GitHub.