R package for calculation of 22 CAnonical Time-series CHaracteristics
R package for the calculation of 22 CAnonical Time-series CHaracteristics. The package is an efficient implementation that calculates time-series features coded in C.
knitr::opts_chunk$set(
comment = NA, fig.width = 7, fig.height = 5, cache = FALSE)
You can install the stable version of Rcatch22
from CRAN using the following:
install.packages("Rcatch22")
You can install the development version of Rcatch22
from GitHub using the following:
devtools::install_github("hendersontrent/Rcatch22")
You might also be interested in a related R package called theft
(Tools for Handling Extraction of Features from Time series) which provides standardised access to Rcatch22
and 5 other feature sets (including 3 feature sets from Python libraries) for a total of ~1,200 features. theft
also includes extensive functionality for processing and analysing time-series features, including automatic time-series classification, top performing feature identification, and a range of statistical data visualisations.
Please open the included vignette within an R environment or visit the detailed Rcatch22
Wiki for information and tutorials.
With features coded in C, Rcatch22
is highly computationally efficient, scaling nearly linearly with time-series size. Computation time in seconds for a range of time series lengths is presented below.
library(dplyr)
library(purrr)
library(ggplot2)
library(microbenchmark)
library(Rcatch22)
set.seed(123)
lengths <- c(10^1, 10^2, 10^3, 10^4, 10^5)
generate_sims <- function(n){
tmp <- c(cumsum(rnorm(n, mean = 0, sd = 1)))
sim <- as.data.frame(summary(microbenchmark(catch22_all(tmp), times = 10, unit = "s"))) %>%
dplyr::select(c(mean)) %>%
mutate(ts_length = n)
return(sim)
}
sims <- lengths %>%
purrr::map_df(~ generate_sims(n = .x))
# Draw graphic
sims %>%
ggplot() +
geom_smooth(aes(x = ts_length, y = mean), formula = y ~ x, method = "lm",
se = FALSE, size = 1.15, colour = "#1f78b4") +
geom_point(aes(x = ts_length, y = mean), size = 3.5, colour = "#1f78b4") +
labs(x = "Time series length",
y = "Computation time (s)") +
scale_x_log10(limits = c(1e1, 1e5),
breaks = scales::trans_breaks("log10", function(x) 10^x),
labels = scales::trans_format("log10", scales::math_format(10^.x))) +
scale_y_log10(breaks = scales::trans_breaks("log10", function(x) 10^x),
labels = scales::trans_format("log10", scales::math_format(10^.x))) +
theme_bw() +
theme(panel.grid.minor = element_blank(),
legend.position = "bottom",
axis.text = element_text(size = 12))
An option to include the mean and standard deviation as features in addition to catch22
is available through setting the catch24
argument to TRUE
:
features <- catch22_all(x, catch24 = TRUE)
A DOI is provided at the top of this README. Alternatively, the package can be cited using the following:
citation("Rcatch22")
Please also cite the original catch22 paper: