This vignette gives only the background needed to understand the software interface. It is not a replacement for a methodological article.

Histogram-based density estimation

Histogram-based estimators approximate a density by aggregating observations into bins. Frequency polygon estimators smooth the histogram idea through linear interpolation between neighboring bin heights. Averaged shifted histograms use several shifted grids and average the resulting estimates.
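
The three constructions can be sketched in one dimension with base R. This is an illustration only, not the package's implementation: the simulated sample, the bin width h, the number of shifts m, and the helper names freq_poly() and ash_1d() are all assumptions made for the example.

```r
# One-dimensional illustration in base R (not the package's code).
set.seed(1)
x <- rnorm(200)   # simulated sample (assumption)
h <- 0.5          # bin width (assumption)
n <- length(x)

# Histogram: share of observations per bin, divided by the bin width.
# The break sequence is padded so the outermost bins are empty.
breaks <- seq(floor(min(x)) - h, ceiling(max(x)) + h, by = h)
counts <- hist(x, breaks = breaks, plot = FALSE)$counts
hist_dens <- counts / (n * h)
mids <- breaks[-1] - h / 2

# Frequency polygon: linear interpolation between neighboring bin heights,
# taken at the bin midpoints, and zero outside the outermost midpoints.
freq_poly <- function(t) {
  approx(mids, hist_dens, xout = t, yleft = 0, yright = 0)$y
}

# Averaged shifted histogram: average m histograms whose origins are
# shifted by h / m. `t` is a single evaluation point.
ash_1d <- function(t, x, h, m) {
  shifts <- (seq_len(m) - 1) * h / m
  est <- vapply(shifts, function(s) {
    in_bin <- floor((x - s) / h) == floor((t - s) / h)
    sum(in_bin) / (length(x) * h)
  }, numeric(1))
  mean(est)
}

freq_poly(0)
ash_1d(0, x, h, m = 4)
```

With m = 1 the averaged shifted histogram in this sketch reduces to an ordinary histogram with origin zero, which makes the role of the shifts easy to check by hand.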

These ideas are discussed in standard references on multivariate density estimation, including Scott (1992), and in work on oversmoothing bounds for nonparametric density estimates, such as Terrell and Scott (1985).

Estimators implemented in the package

The package separates pointwise and grid-based evaluation.

| Estimator | Pointwise function | Grid function |
| --- | --- | --- |
| Averaged Shifted Histogram | ASH() | ASH_estimate() |
| Linear Blend Frequency Polygon | LBFP() | LBFP_estimate() |
| General Linear Blend Frequency Polygon | GLBFP() | GLBFP_estimate() |

Each pointwise function evaluates the density at a single supplied location. The corresponding grid function evaluates the same estimator on a regular or user-supplied grid and is intended for visualization and reproducible numerical summaries.
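
The relationship between the two kinds of function can be sketched as follows. The names density_at() and density_on_grid() are hypothetical, and the estimator inside is a plain multivariate histogram standing in for the real ones. The sketch does not reproduce the signatures of ASH(), LBFP(), GLBFP(), or their _estimate() counterparts, which should be taken from the package's help pages.

```r
# Hypothetical stand-in for a pointwise estimator: a d-dimensional histogram
# with bin widths b and origin at zero. Not the package's ASH(), LBFP(),
# or GLBFP().
density_at <- function(x0, data, b) {
  same_bin <- function(row) all(floor(row / b) == floor(x0 / b))
  sum(apply(data, 1, same_bin)) / (nrow(data) * prod(b))
}

# Grid evaluation: apply the same pointwise estimator to every grid row.
density_on_grid <- function(grid, data, b) {
  apply(grid, 1, density_at, data = data, b = b)
}

set.seed(1)
data <- matrix(rnorm(400), ncol = 2)   # 200 simulated points in 2D
b <- c(0.5, 0.5)                       # bin widths (assumption)

# Regular grid; a user-supplied matrix of locations would work the same way.
grid <- as.matrix(expand.grid(x = seq(-2, 2, by = 1), y = seq(-2, 2, by = 1)))
dens <- density_on_grid(grid, data, b)

head(cbind(grid, density = dens))
```

Keeping the grid version as a thin wrapper around the pointwise version is one way to guarantee that both return the same value at any shared location.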

Bandwidth and shift parameters

The bandwidth vector b controls the scale of the bins. Smaller values follow local features more closely but give a noisier estimate, while larger values give a smoother estimate that can blur genuine structure. The helper compute_bi_optim() provides a plug-in starting value based on the optimal cell-width calculation for multivariate frequency polygons in Carbon and Duchesne (2024).
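
The effect of the bin scale can be illustrated with an ordinary one-dimensional histogram in base R. The sample, the two bin widths, and the helpers hist_heights() and roughness() are assumptions made for this sketch. It does not call compute_bi_optim(), whose arguments are not described here.

```r
# Illustrative comparison of two bin widths; all values are assumptions.
set.seed(1)
x <- rnorm(500)

# Histogram heights on a grid aligned at zero, padded with one extra bin
# on each side so every observation is covered.
hist_heights <- function(x, h) {
  k <- (floor(min(x) / h) - 1):(ceiling(max(x) / h) + 1)
  hist(x, breaks = k * h, plot = FALSE)$counts / (length(x) * h)
}

# Total variation between adjacent bin heights as a crude roughness measure.
roughness <- function(heights) sum(abs(diff(heights)))

rough_small <- roughness(hist_heights(x, h = 0.1))  # many narrow bins
rough_large <- roughness(hist_heights(x, h = 1.0))  # few wide bins
c(small = rough_small, large = rough_large)
```

For a sample like this one, the narrow-bin estimate is typically much rougher than the wide-bin estimate, which is the trade-off a plug-in starting value is meant to balance.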

The shift vector m must contain positive integers. Larger values increase the number of shifted components and therefore increase computational cost.
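
The cost statement can be made concrete with a small sketch. It assumes, as in the standard multivariate averaged shifted histogram, that one shifted grid is used for every combination of per-axis shifts, so the number of components is the product of the entries of m. Whether the package enumerates shifts in exactly this way is an assumption, and check_shifts() is a hypothetical validator rather than a package function.

```r
# Hypothetical validator; the package's own argument checks may differ.
check_shifts <- function(m) {
  ok <- is.numeric(m) && length(m) > 0 && all(is.finite(m)) &&
    all(m >= 1) && all(m == round(m))
  if (!ok) stop("`m` must be a vector of positive integers")
  as.integer(m)
}

m <- check_shifts(c(2, 3, 4))   # shifts per axis (assumption)
b <- c(0.5, 0.5, 0.5)           # bin widths per axis (assumption)

# One row per combination of per-axis shift indices (assumed layout),
# converted to offsets by multiplying column j by b[j] / m[j].
shift_index <- expand.grid(lapply(m, function(mi) seq_len(mi) - 1L))
offsets <- sweep(as.matrix(shift_index), 2, b / m, `*`)

nrow(offsets)   # 24, that is prod(m)
```

Under this assumption, doubling every entry of m in three dimensions multiplies the number of shifted components by eight.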

Scope of this package

The package provides an R implementation, documentation, examples, tests, pkgdown articles, and benchmark scaffolding for GLBFP workflows. This vignette does not claim a new theoretical result.

References

Scott, D. W. (1992). Multivariate Density Estimation: Theory, Practice, and Visualization. Wiley. doi:10.1002/9780470316849.

Carbon, M., and Duchesne, T. (2024). Multivariate frequency polygon for stationary random fields. Annals of the Institute of Statistical Mathematics, 76(2), 263-287. doi:10.1007/s10463-023-00883-5.

Terrell, G. R., and Scott, D. W. (1985). Oversmoothed Nonparametric Density Estimates. Journal of the American Statistical Association, 80(389), 209-214. doi:10.1080/01621459.1985.10477163.

The complete bibliographic record for the original GLBFP methodological article has not yet been verified in this repository. It should be added before using this vignette as source material for a journal article.