Translate SQL queries into R expressions
APACHE-2.0 License
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
queryparser translates SQL queries into lists of unevaluated R expressions.
⚠️ Most R users should not directly use queryparser. Instead, use it through tidyquery. |
---|
For an introduction to tidyquery and queryparser, watch the recording of the talk "Bridging the Gap between SQL and R" from rstudio::conf(2020).
Install the released version of queryparser from CRAN with:
install.packages("queryparser")
Or install the development version from GitHub with:
# install.packages("remotes")
remotes::install_github("ianmcook/queryparser")
Call the function parse_query()
, passing a SELECT
statement enclosed in quotes as the first argument:
library(queryparser)
parse_query("SELECT DISTINCT carrier FROM flights WHERE dest = 'HNL'")
Queries can include the clauses SELECT
, FROM
, WHERE
, GROUP BY
, HAVING
, ORDER BY
, and LIMIT
:
parse_query(
" SELECT origin, dest,
COUNT(flight) AS num_flts,
round(SUM(seats)) AS num_seats,
round(AVG(arr_delay)) AS avg_delay
FROM flights f LEFT OUTER JOIN planes p
ON f.tailnum = p.tailnum
WHERE distance BETWEEN 200 AND 300
AND air_time IS NOT NULL
GROUP BY origin, dest
HAVING num_flts > 3000
ORDER BY num_seats DESC, avg_delay ASC
LIMIT 2;"
)
Set the argument tidyverse
to TRUE
to use functions from tidyverse packages including dplyr, stringr, and lubridate in the R expressions:
parse_query("SELECT COUNT(*) AS n FROM t WHERE x BETWEEN y AND z ORDER BY n DESC", tidyverse = TRUE)
queryparser will translate only explicitly allowed functions and operators, preventing injection of malicious code:
parse_query("SELECT x FROM y WHERE system('rm -rf /')")
queryparser does not currently support:
WITH
clause (common table expressions)OVER
expressions (window or analytic functions)queryparser currently has the following known limitations:
tidyverse
is set to TRUE
. An example of this is COUNT(DISTINCT ...)
expressions with multiple arguments.IS NULL
) have unparenthesized expressions as their operands, R will interpret the resulting code using a different order of operations than a SQL engine would. When using an expression as the operand to a logical operator, always enclose the expression in parentheses.queryparser is not intended to:
INSERT
or UPDATE
)SELECT
statements passed to itThe sqlparseR package (CRAN) provides a wrapper around the Python module sqlparse.