Fast spatial data matching in R
MCMC for ‘Big Data’ with Stan
This post is an extension (and a translation to R) of PyMC-Labs’ benchmarking of MCMC for “Big Data”.
The Stan code was updated to use within-chain parallelization and compiler optimization for faster CPU sampling. With these changes, Stan achieved sampling speeds similar to PyMC’s JAX + GPU solution, purely on CPU.
Data wrangling with data.table and the Tidyverse
Bayesian Rock Climbing Rankings
This post is a translation to R of Ethan Rosenthal’s blog post on modeling rock climbing route difficulty using a Bayesian IRT (Item Response Theory) model.
The original Stan code was updated to use within-chain parallelization and compiler optimization for faster CPU sampling.
Several data processing solutions are showcased, using either data.table or dbplyr (with a DuckDB backend), with timings to compare their speed.