This vignette compares the three estimator families exposed by the package. It is a practical guide rather than a universal ranking of methods.

All estimators share the same basic inputs: the data, the bandwidth vector b, optional grid bounds, and, for ASH and GLBFP, the shift vector m.

library(GLBFP)

x <- cbind(rnorm(200), rnorm(200, sd = 1.25))
b <- c(0.75, 0.9)
m <- c(2, 2)
point <- c(0, 0)

fits <- list(
  ASH = ash(point, x, b = b, m = m),
  LBFP = lbfp(point, x, b = b),
  GLBFP = glbfp(point, x, b = b, m = m)
)

vapply(fits, function(z) z$estimation, numeric(1))
#>       ASH      LBFP     GLBFP 
#> 0.1240741 0.1395065 0.1349684

Grid estimates can be compared through the common *_estimate() interface.

grid_ash <- ash_estimate(x, b = b, m = m, grid_size = 15)
grid_lbfp <- lbfp_estimate(x, b = b, grid_size = 15)
grid_glbfp <- glbfp_estimate(x, b = b, m = m, grid_size = 15)

comparison <- data.frame(
  method = c("ASH", "LBFP", "GLBFP"),
  mean_density = c(
    mean(grid_ash$densities),
    mean(grid_lbfp$densities),
    mean(grid_glbfp$densities)
  ),
  max_density = c(
    max(grid_ash$densities),
    max(grid_lbfp$densities),
    max(grid_glbfp$densities)
  )
)

comparison
#>   method mean_density max_density
#> 1    ASH   0.02629630   0.1574074
#> 2   LBFP   0.02434245   0.1441655
#> 3  GLBFP   0.02439168   0.1476182

Practical starting rules

As a first pass:

  • use LBFP when a simple linear blend frequency polygon is sufficient;
  • use GLBFP when a tunable shifted linear blend estimator is desired;
  • use ASH when an averaged shifted histogram representation is desired.
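
To make the role of the shift parameter in the ASH rule concrete, here is a minimal one-dimensional sketch in base R. It follows the textbook construction (average m histograms whose origins are shifted by h/m) and is not the package's implementation; the names ash_1d, h, x1, and ash_demo are hypothetical and exist only for this illustration.

```r
# Illustrative 1D averaged shifted histogram evaluated at a single point t.
# h is the bin width and m the number of shifted histograms.
# Textbook construction only; not the GLBFP package's code.
ash_1d <- function(t, x, h, m) {
  shifts <- (seq_len(m) - 1) * h / m        # origins 0, h/m, ..., (m - 1) * h / m
  per_shift <- vapply(shifts, function(origin) {
    bin_t <- floor((t - origin) / h)        # bin that contains t
    bin_x <- floor((x - origin) / h)        # bin of each observation
    sum(bin_x == bin_t) / (length(x) * h)   # histogram height at t
  }, numeric(1))
  mean(per_shift)                           # average over the shifted histograms
}

set.seed(1)
x1 <- rnorm(500)
ash_demo <- c(
  m1 = ash_1d(0, x1, h = 0.75, m = 1),      # m = 1 is an ordinary histogram
  m4 = ash_1d(0, x1, h = 0.75, m = 4)
)
ash_demo
```

With m = 1 the sketch reduces to a plain histogram; larger m averages over more bin origins, which reduces the dependence of the estimate on where the bins happen to start.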

The bandwidth vector b usually matters more than small changes in m. Use compute_bi_optim() as a reproducible starting point, then inspect sensitivity around that value. This helper implements a plug-in bandwidth choice motivated by the optimal cell-width calculation for multivariate frequency polygons in Carbon and Duchesne (2024).
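
One way to inspect sensitivity around a starting bandwidth is to rescale b by a few factors and re-evaluate the estimate at a fixed point. The sketch below is an assumption-laden illustration: bandwidth_sensitivity() and product_kernel() are hypothetical helpers, and the Gaussian product-kernel estimator is only a stand-in so that the code runs without the package.

```r
# Hypothetical helper: re-evaluate a point estimate while scaling b.
# `estimator` is any function(point, x, b) that returns a single number.
bandwidth_sensitivity <- function(estimator, point, x, b,
                                  factors = c(0.5, 0.75, 1, 1.25, 1.5)) {
  estimates <- vapply(
    factors,
    function(f) estimator(point, x, b * f),
    numeric(1)
  )
  data.frame(factor = factors, estimate = estimates)
}

# Stand-in estimator (not part of GLBFP): Gaussian product-kernel density.
product_kernel <- function(point, x, b) {
  # Centre each column at the evaluation point, then scale by its bandwidth.
  z <- sweep(sweep(x, 2, point, "-"), 2, b, "/")
  mean(exp(-0.5 * rowSums(z^2))) / (prod(b) * (2 * pi)^(ncol(x) / 2))
}

set.seed(1)
x_demo <- cbind(rnorm(200), rnorm(200, sd = 1.25))
sens <- bandwidth_sensitivity(product_kernel, c(0, 0), x_demo, b = c(0.75, 0.9))
sens
```

To apply the same idea to a package estimator, pass a wrapper that follows the call pattern used earlier in this vignette, for example function(point, x, b) glbfp(point, x, b = b, m = m)$estimation. Estimates that change sharply between neighbouring factors suggest that the chosen b deserves a closer look.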

For manuscript figures or numerical comparisons, report the selected b, the selected m, the grid definition, and the estimator family. This makes the result reproducible and avoids treating the default display as a statistical conclusion in itself.
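
As a small sketch of such a report, the settings can be kept in one list and formatted into a single line for a caption or table note. The object names settings and report_line are hypothetical, and the values simply repeat those used in the examples above.

```r
# Hypothetical record of the settings behind one reported estimate.
settings <- list(
  estimator = "GLBFP",
  b = c(0.75, 0.9),
  m = c(2, 2),
  grid_size = 15
)

# Format the record as one line suitable for a figure caption or table note.
report_line <- sprintf(
  "%s: b = (%s), m = (%s), grid_size = %d",
  settings$estimator,
  paste(settings$b, collapse = ", "),
  paste(settings$m, collapse = ", "),
  settings$grid_size
)
report_line
```

Keeping the record next to the code that produced the figure makes it harder for the reported values and the plotted values to drift apart.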