Month: July 2017

Surprising result when exploring Rcpp gallery

I’m starting to incorporate more Rcpp in my R work, and so decided to spend some time exploring the Rcpp Gallery. One example by John Merrill caught my eye. He provides a C++ solution to transforming an list of lists into a data frame, and shows impressive speed savings compared to as.data.frame.

This got me thinking about how I do this operation currently. I tend to rely on the do.call method. To mimic the example in the Rcpp example:

a <- replicate(250, 1:100, simplify=FALSE)
b <- do.call(cbind, a)

For fairness, I should get a data frame rather than a matrix, so for my comparisons, I do convert b into a data frame. I follow the original coding in the example, adding my method above into the mix. Comparing times:

res <- benchmark(as.data.frame(a),
                 CheapDataFrameBuilder(a),
                 as.data.frame(do.call(cbind, a)),
                 order="relative", replications=500)
res[,1:4]

The results were quite interesting to me 🙂

                              test replications elapsed relative
3 as.data.frame(do.call(cbind, a))          500    0.36    1.000
2         CheapDataFrameBuilder(a)          500    0.52    1.444
1                 as.data.frame(a)          500    7.28   20.222

I think part of what’s happening here is that as.data.frame.list expends overhead checking for different aspects of making a legit data frame, including naming conventions. The comparison to CheapDataFrameBuilder should really be with my barebones strategy. Having said that, the example does provide great value in showing what can be done using Rcpp.

Quirks about running Rcpp on Windows through RStudio

Quirks about running Rcpp on Windows through RStudio

This is a quick note about some tribulations I had running Rcpp (v. 0.12.12) code through RStudio (v. 1.0.143) on a Windows 7 box running R (v. 3.3.2). I also have RTools v. 3.4 installed. I fully admit that this may very well be specific to my box, but I suspect not.

I kept running into problems with Rcpp complaining that (a) RTools wasn’t installed, and (b) the C++ compiler couldn’t find Rcpp.h. First, devtools::find_rtools was giving a positive result, so (a) was not true. Second, I noticed that the wrong C++ compiler was being called. Even more frustrating was the fact that everything was working if I worked on a native R console rather than RStudio. So there was nothing inherently wrong with the code or setup, but rather the environment RStudio was creating.

After some searching the interwebs and StackOverflow, the following solution worked for me. I added the following lines to my global .Rprofile file:

Sys.setenv(PATH = paste(Sys.getenv("PATH"), "C:/RBuildTools/3.4/bin/",
            "C:/RBuildTools/3.4/mingw_64/bin", sep = ";"))
Sys.setenv(BINPREF = "C:/RBuildTools/3.4/mingw_64/bin/")

Note that C:/RBuildTools is the default location suggested when I installed RTools.

This solution is indicated here, but I have the reverse issue of the default setup working in R and not in the latest RStudio. However, the solution still works!!

Note that instead of putting it in the global .Rprofile, you could put it in a project-specific .Rprofile, or even in your R code as long as it is run before loading the Rcpp or derivative packages. Note also that if you use binary packages that use Rcpp, there is no problem. Only when you’re compiling C++ code either for your own code or for building a package from source is this an issue. And, as far as I can tell, only on Windows.

Hope this prevents someone else from 3 hours of heartburn trying to make Rcpp work on a Windows box. And, if this has already been fixed in RStudio, please comment and I’ll be happy to update this post.

 

Finding my Dropbox in R

I’ll often keep non-sensitive data on Dropbox so that I can access it on all my machines without gumming up git. I just wrote a small script to find the Dropbox location on each of my computers automatically. The crucial information is available here, from Dropbox.

My small snippet of code is the following:

if (Sys.info()['sysname'] == 'Darwin') {
  info <- RJSONIO::fromJSON(
    file.path(path.expand("~"),'.dropbox','info.json'))
}
if (Sys.info()['sysname'] == 'Windows') {
  info <- RJSONIO::fromJSON(
    if (file.exists(file.path(Sys.getenv('APPDATA'), 'Dropbox','info.json'))) {
      file.path(Sys.getenv('APPDATA'), 'Dropbox', 'info.json')
    } else {
    file.path(Sys.getenv('LOCALAPPDATA'),'Dropbox','info.json')
    }
  )
}

dropbox_base <- info$personal$path

I haven’t included the Linux option since I don’t really use a Linux box, but the Dropbox link above will show you where the info.json file lies in Linux. Also, if you have a business Dropbox account, you’ll probably need info$business$path.

Hope this helps!!!