This vignette demonstrates the Approximate Bayesian
Computation Adaptive algorithm using tidyabc. Like
ABC-SMC, ABC-Adaptive is an iterative method that refines the parameter
estimates across multiple waves. However, it differs fundamentally in
how it generates proposals for the next wave.
Instead of using a fixed perturbation kernel around previous particles (as in SMC), ABC-Adaptive fits empirical proposal distributions to the weighted posterior sample from the previous wave. New particles for the next wave are then sampled directly from these fitted distributions. This approach aims to better capture the shape of the current posterior approximation and can sometimes lead to faster convergence or more efficient exploration, particularly when the posterior shape deviates significantly from a normal distribution.
This example uses the same model, priors, and observed data as defined in the “Getting started” and “ABC Sequential Monte-Carlo” vignettes.
We reuse the simulation function (test_simulation_fn),
scoring function (test_scorer_fn), priors
(test_priors), and observed data
(test_obsdata) from the previous vignettes. We also run a
small trial fit to calculate the scoreweights as
recommended for SMC.
# example simulation
# Well be trying to recover norm and gamma parameters
# We'll use this function for both example generation and fitting
test_simulation_fn = function(norm_mean, norm_sd, gamma_mean, gamma_sd) {
A = rnorm(1000, norm_mean, norm_sd)
B = rgamma2(1000, gamma_mean, gamma_sd)
C = rgamma2(1000, gamma_mean, gamma_sd)
return(
list(
data1 = A + B - C,
data2 = B * C
)
)
}
test_scorer_fn = function(simdata, obsdata) {
return(list(
data1 = calculate_wasserstein(simdata$data1, obsdata$data1),
data2 = calculate_wasserstein(simdata$data2, obsdata$data2)
))
}
tmp = test_simulation(
test_simulation_fn,
test_scorer_fn,
norm_mean = 4, norm_sd=2, gamma_mean=6, gamma_sd=2,
seed = 123
)
truth = tmp$truth
test_obsdata = tmp$obsdata
test_priors = priors(
norm_mean ~ unif(0, 10),
norm_sd ~ unif(0, 10),
gamma_mean ~ unif(0, 10),
gamma_sd ~ unif(0, 10),
# enforces convex gamma:
~ gamma_mean > gamma_sd
)
test_priors
#> Parameters:
#> * norm_mean: unif(min = 0, max = 10)
#> * norm_sd: unif(min = 0, max = 10)
#> * gamma_mean: unif(min = 0, max = 10)
#> * gamma_sd: unif(min = 0, max = 10)
#> Constraints:
#> * gamma_mean > gamma_sd
ggplot(
tibble(data1 = test_obsdata$data1), aes(x=data1))+
geom_histogram(bins=50, fill="steelblue", color="white")+
xlab("A + B - C")
ggplot(tibble(data2 = test_obsdata$data2), aes(x=data2))+
geom_histogram(bins=50, fill="coral", color="white")+
xlab("B × C")
trial_fit = abc_rejection(
obsdata = test_obsdata,
priors_list = test_priors,
sim_fn = test_simulation_fn,
scorer_fn = test_scorer_fn,
n_sims = 1000, # A smaller number for the trial run
acceptance_rate = 0.5 # A high acceptance rate for the trial to get diverse samples
)
#> ABC rejection, 1 wave.
# Calculate metrics, including recommended score weights
metrics = posterior_distance_metrics(trial_fit)
test_scorewt = metrics$scoreweights # Extract the recommended weightsNow we execute the ABC-Adaptive algorithm. The key difference lies in how new proposals are generated between waves.
adaptive_fit = abc_adaptive(
obsdata = test_obsdata,
priors_list = test_priors,
sim_fn = test_simulation_fn,
scorer_fn = test_scorer_fn,
n_sims = 1000,
# Number of simulations per wave
acceptance_rate = 0.5,
# Proportion of particles accepted each wave
scoreweights = test_scorewt,
# Use the weights calculated from the trial fit
parallel = FALSE,
# Disable parallel processing for vignette
knots = 7,
# Number of knots for fitting the empirical CDF (controls smoothness/complexity)
bw = 0.1,
# Bandwidth for the kernel used in empirical fitting (controls smoothness)
widen_by = 1,
# Factor to widen the proposal distribution if ESS is too low (default is 1, no widening)
distfit = "analytical"
)
#> ABC-Adaptive
#> Adaptive waves: ■ 1% | wave 1 ETA: 5m
#> Adaptive waves: ■ 1% | wave 2 ETA: 5m
#> Adaptive waves: ■■ 2% | wave 4 ETA: 5m
#> Adaptive waves: ■■ 3% | wave 6 ETA: 5m
#> Adaptive waves: ■■ 4% | wave 9 ETA: 5m
#> Adaptive waves: ■■■ 5% | wave 11 ETA: 5m
#> Adaptive waves: ■■■ 6% | wave 13 ETA: 5m
#> Adaptive waves: ■■■ 7% | wave 15 ETA: 5m
#> Adaptive waves: ■■■ 8% | wave 18 ETA: 5m
#> Converged on wave: 19
summary(adaptive_fit)
#> ABC adaptive fit: 19 waves - (converged)
#> Parameter estimates:
#> # A tibble: 4 × 4
#> # Groups: param [4]
#> param mean_sd median_95_CrI ESS
#> <chr> <chr> <chr> <dbl>
#> 1 gamma_mean 5.969 ± 0.078 5.969 [5.787 — 6.208] 313.
#> 2 gamma_sd 1.974 ± 0.123 1.972 [1.703 — 2.367] 313.
#> 3 norm_mean 3.985 ± 0.181 3.985 [3.540 — 4.501] 313.
#> 4 norm_sd 2.030 ± 0.446 2.076 [0.703 — 2.927] 313.n_sims, acceptance_rate,
scoreweights: These function similarly to
SMC.
knots: This parameter controls the
complexity of the empirical distribution fitted to each parameter’s
posterior sample in each wave. A higher number of knots allows the
fitted distribution to capture more detail and potentially sharper
features, but requires more data (simulations) to estimate reliably and
is computationally more expensive. A lower number provides a smoother,
more regularized fit. The default (often around 20-30) is often a good
starting point, but 7 is used here for illustration.
bw (bandwidth): This parameter
affects the smoothing kernel used during the empirical CDF fitting
(within the empirical_data or empirical_cdf
functions). A smaller bw leads to a less smooth,
potentially more detailed fit (closer to the sample histogram), while a
larger bw produces a smoother, more regularized fit. It
influences the trade-off between fitting the sample closely and avoiding
overfitting.
widen_by: ABC-Adaptive includes a
mechanism to prevent particle degeneracy (where the Effective Sample
Size, ESS, drops too low). If the ESS falls below a threshold (e.g.,
200), the algorithm increases the acceptance rate (loosens the tolerance
epsilon) and applies the widen()
function to the empirical proposal distribution fitted from the
previous wave’s particles. The widen_by parameter controls
the scale factor passed to the
widen() function, determining how much the dispersion of
the proposal distribution is increased. A value of 1 (the default) means
no widening is applied via the widen() function; the
adjustment happens only through the acceptance rate. Values greater than
1 will actively increase the spread of the proposal distribution based
on the logit-space scaling transformation described previously
(specifically,
expit((logit(qx) - logit(qmedian)) * scale + logit(qmedian))
where scale = widen_by).
The summary output shows the final posterior
statistics aggregated across all waves.
The standard posterior plot shows the marginal distributions for each parameter.
Compared to the SMC fit (using the same model and data), you might
observe differences in the shape of the posterior densities. Because the
empirical fitting process often respects the bounds of the prior
(especially when the prior is used as a link function, as mentioned in
the posterior_fit_empirical documentation), the resulting
posteriors can appear “sharper” or more constrained near the prior
boundaries, especially if the true posterior is close to these limits.
This is in contrast to SMC, where the Gaussian perturbation kernel can
sometimes produce smoother, more Gaussian-like tails.
The plot_convergence() function is equally important for
the Adaptive algorithm.
This plot displays the same metrics as for SMC across waves: 1.
abs_distance: The tolerance
epsilon should decrease over waves, ideally converging
towards a stable value. 2. ESS (Effective Sample
Size): Unlike SMC, the ESS in Adaptive might show a different
pattern. It often rises as the proposal distribution adapts to the
posterior, but it might not plateau as distinctly. The lack of a clear
plateau doesn’t necessarily indicate non-convergence; it might
suggest that the algorithm could continue to converge faster if
run for more waves, as the proposals become increasingly efficient. The
absence of a plateau can be due to the direct adaptation of the proposal
distribution shape. 3. IQR_95_redn (95% CI
Reduction): This should generally decline, indicating shrinking
uncertainty in parameter estimates. 4.
rel_mean_change (Relative Mean Change):
This should also decline, indicating stabilizing central estimates.
The Adaptive method often converges in fewer waves than SMC for the given example, as seen in the faster stabilization of distance metrics and parameter changes. The ESS behaviour, lacking a clear plateau, reflects the different proposal mechanism.
The plot_evolution() function shows how the estimated
parameter distributions change over the waves.
This plot visualizes how the empirical proposal distributions adapt and refine the posterior approximation wave by wave, potentially showing sharper features emerging compared to SMC, especially near prior boundaries.
Finally, plot_correlations() helps understand the
relationships between parameters in the final posterior sample.
This heatmap shows the correlation coefficients between pairs of parameters based on the final set of weighted particles, similar to the SMC output. The uniform shape of the priors is more apparent in the adaptive fit.
ABC-Adaptive offers an alternative iterative approach to ABC
inference by directly modelling the posterior distribution from the
previous wave’s particles to generate new proposals. This can lead to
different convergence characteristics and posterior shapes compared to
SMC, particularly when dealing with posteriors constrained by prior
bounds. Understanding the knots and bw
parameters is key to controlling the fidelity and smoothness of the
empirical distribution fits used internally.