Essential R

🚀 Essential tools and libraries for programming in R

About

This document serves as a personal list of

  • tools for package development,
  • good practices for programming,
  • and most frequently used packages

with an emphasis on data science, Bayesian stats and probabilistic ML.

R package development

A minimal R package development stack consists of at least the following tools:

  • goodpractice for advice on writing R packages
  • devtools for general package development
  • testthat for unit testing
  • roxygen2 for method documentation
  • covr to generate coverage reports
  • lintr for static code analysis
  • styler to automatically format code
  • usethis for automation of repetitive tasks during development
  • remotes to install R packages from Git repositories, CRAN and Bioconductor
  • pkgdown to generate a website for your package
  • rcmdcheck to check your package within R
  • profvis to visualize profiling data
  • bench to time R expressions
  • microbenchmark to also time R expressions
  • lobstr to pry open R
  • here to find files and folders

Many more can be found on GitHub at r-lib or rstudio.
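
As a rough illustration of the testing part of this stack, the sketch below runs a single testthat unit test interactively. The function `add_one()` is a hypothetical example and not part of any package listed here; in a real package the test would typically live in `tests/testthat/`.

```r
library(testthat)

# Hypothetical function under test (illustration only).
add_one <- function(x) {
  x + 1
}

# A unit test groups related expectations under one description.
test_that("add_one increments numeric input", {
  expect_equal(add_one(1), 2)
  expect_equal(add_one(c(1, 2)), c(2, 3))
  # Adding a number to a character string raises an error in R.
  expect_error(add_one("a"))
})
```

Inside a package, the same test would usually be run with `devtools::test()`.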

Programming

  • purrr for functional programming
  • magrittr for piping function calls
  • R6 for object-oriented programming with encapsulation
  • Rcpp (with RcppArmadillo, RcppEigen and BH) for integration of C++ code
  • reticulate for interfacing to Python
  • cpp11 as alternative (and complement) to Rcpp
  • compiler, doParallel, parallel, foreach for speeding up R
  • rlang as low-level API for programming in R
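
To give a feel for three of the packages above, here is a minimal sketch that combines purrr and magrittr for a functional pipeline and R6 for an encapsulated object. The `Counter` class and all variable names are made up for illustration.

```r
library(purrr)
library(magrittr)
library(R6)

# Functional programming with a pipe: square each element, then sum.
squares <- c(1, 2, 3) %>% map_dbl(~ .x^2)
total <- squares %>% sum()

# Hypothetical R6 class: state lives in the object, methods modify it in place.
Counter <- R6Class("Counter",
  public = list(
    count = 0,
    add = function(n = 1) {
      self$count <- self$count + n
      invisible(self)  # returning self allows method chaining
    }
  )
)

counter <- Counter$new()
counter$add()$add(2)
print(total)
print(counter$count)
```

Unlike ordinary R objects, R6 objects have reference semantics, which is why `add()` can update `count` without reassigning `counter`.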

C++

Some good reading on C++:

  • Scott Meyers: Effective C++
  • Scott Meyers: Effective Modern C++
  • Scott Meyers: Effective STL
  • Kurt Guntheroth: Optimized C++
  • David Vandevoorde: C++ Templates - the complete guide
  • Nicolai Josuttis: C++17 - the complete guide

Working with data

  • tidyverse (dplyr, tidyr, ...) for working with data in general
  • data.table as a fast alternative to R's native data frame
  • datastructures for advanced data structures
  • dbplyr to work with databases
  • RSQLite to work with SQLite databases
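
The following sketch shows how dplyr, dbplyr and RSQLite can fit together, assuming all three (plus DBI) are installed. It copies the built-in `mtcars` data set into a temporary in-memory SQLite database and queries it with dplyr verbs, which dbplyr translates to SQL behind the scenes. The table and variable names are chosen for illustration only.

```r
library(DBI)
library(dplyr)

# Temporary in-memory SQLite database; nothing is written to disk.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mtcars", mtcars)

# tbl() creates a lazy reference to the table; the query only runs on collect().
four_cyl <- tbl(con, "mtcars") %>%
  filter(cyl == 4) %>%
  summarise(n = n(), mean_mpg = mean(mpg, na.rm = TRUE)) %>%
  collect()

dbDisconnect(con)
print(four_cyl)
```

Calling `show_query()` instead of `collect()` prints the generated SQL, which can help when checking what is sent to the database.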

Machine learning and Bayesian statistics

  • rstan to fit Bayesian models
  • cmdstanr as light-weight interface to cmdstan
  • rstanarm for applied regression modeling
  • brms for multilevel models
  • bayesplot to visualize Bayesian inferences
  • loo for approximate LOO-CV and PSIS
  • projpred for projection predictive variable selection
  • rstantools for developing R Packages interfacing with Stan
  • posterior for working with output from Bayesian models
  • coda for summarizing and working with MCMC output
  • MCMCpack provides some utility functions
  • LaplacesDemon for even more Bayes utility
  • mgcv for generalized additive mixed models
  • tensorflow for numerical computation
  • tfprobability for statistical computation and probabilistic modeling
  • keras to work with neural networks
  • sparklyr for big data processing
  • kernlab for kernel-based machine learning
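
As a small example of working with MCMC output, the sketch below uses coda to summarize a chain. The draws are simulated with `rnorm()` as a stand-in for real sampler output, and the parameter names `alpha` and `beta` are made up.

```r
library(coda)

set.seed(1)
# Simulated draws standing in for real sampler output (assumption for illustration).
draws <- matrix(rnorm(2000), ncol = 2, dimnames = list(NULL, c("alpha", "beta")))

# Wrap the draws in an mcmc object so that coda's tools can be applied.
chain <- mcmc(draws)

post_summary <- summary(chain)  # means, standard deviations, quantiles
ess <- effectiveSize(chain)     # effective sample size per parameter

print(post_summary$statistics)
print(ess)
```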

Graphical models and causal inference

  • bnlearn for BN structure learning
  • pcalg for causal inference using graphical models
  • ggdag for visualizing DAGs
  • dagitty for analysis of structural equation models

Optimization

  • nloptr for non-linear optimization
  • CVXR for disciplined convex optimization
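
To illustrate non-linear optimization with nloptr, the sketch below minimizes a simple quadratic whose minimum is known in advance. The objective, its gradient and the starting point are made up for illustration; the L-BFGS algorithm is only one of several that nloptr offers.

```r
library(nloptr)

# Toy objective (assumption): squared distance to a known target point.
target <- c(1, -2)
objective <- function(x) sum((x - target)^2)
gradient <- function(x) 2 * (x - target)

res <- nloptr(
  x0 = c(0, 0),
  eval_f = objective,
  eval_grad_f = gradient,
  opts = list(algorithm = "NLOPT_LD_LBFGS", xtol_rel = 1e-8)
)

# The solution should be close to the target point.
print(res$solution)
```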

Visualization

Reporting

  • shiny for interactive web applications
  • rmarkdown to generate websites, PDF documents, etc. from markdown files
  • bookdown for authoring books
  • tufte for Tufte-style documents
  • xaringan for HTML presentations

Other

  • igraph to work with graphs
  • drake for building pipelines

Author

Simon Dirmeier [email protected]

License

This work by Simon Dirmeier is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
