Commit ebb31cd

Merge pull request #52 from osorensen/copilot/update-set-smc-options-docs
Document set_smc_options with detailed parameter descriptions and examples
2 parents 0e66af8 + 7694ad3 commit ebb31cd

6 files changed: 236 additions and 43 deletions

.Rbuildignore

Lines changed: 1 addition & 0 deletions
@@ -4,3 +4,4 @@
 ^data-raw$
 ^dev$
 ^\.github$
+^\.git$

DESCRIPTION

Lines changed: 2 additions & 2 deletions
@@ -10,15 +10,15 @@ Maintainer: Oystein Sorensen <oystein.sorensen.1985@gmail.com>
 Description: Provides nested sequential Monte Carlo algorithms for performing
     sequential inference in the Bayesian Mallows model, which is a widely used
     probability model for rank and preference data. The package implements the
-    SMC² (Sequential Monte Carlo Squared) algorithm for handling sequentially
+    SMC2 (Sequential Monte Carlo Squared) algorithm for handling sequentially
     arriving rankings and pairwise preferences, including support for complete
     rankings, partial rankings, and pairwise comparisons. The methods are based
     on Sørensen (2025) <doi:10.1214/25-BA1564>.
 License: GPL-3
 Encoding: UTF-8
 LazyData: true
 Roxygen: list(markdown = TRUE)
-RoxygenNote: 7.3.2
+RoxygenNote: 7.3.3
 LinkingTo:
     Rcpp,
     RcppArmadillo

R/set_smc_options.R

Lines changed: 116 additions & 19 deletions
@@ -1,28 +1,125 @@
 #' Set SMC options
 #'
-#' @param n_particles Number of particles
-#' @param n_particle_filters Initial number of particle filters for each
-#'   particle
-#' @param max_particle_filters Maximum number of particle filters.
-#' @param resampling_threshold Effective sample size threshold for resampling
-#' @param doubling_threshold Threshold for particle filter doubling. If the
-#'   acceptance rate of the rejuvenation step falls below this threshold, the
-#'   number of particle filters is doubled. Defaults to 0.2.
-#' @param max_rejuvenation_steps Maximum number of rejuvenation steps. If the
-#'   number of unique particles has not exceeded half the number of particles
-#'   after this many steps, the rejuvenation is still stopped.
-#' @param metric Metric
-#' @param resampler resampler
-#' @param latent_rank_proposal latent rank proposal
-#' @param verbose Boolean
-#' @param trace Logical specifying whether to save static parameters at each
-#'   timestep.
+#' @description
+#' Configure the SMC2 (Sequential Monte Carlo Squared) algorithm for the
+#' Bayesian Mallows model. This function sets parameters that control the
+#' particle filter, resampling strategy, and diagnostic output.
+#'
+#' @param n_particles Integer specifying the number of particles to use in the
+#'   outer SMC loop. More particles generally improve approximation accuracy but
+#'   increase computational cost. Defaults to 1000.
+#' @param n_particle_filters Integer specifying the initial number of particle
+#'   filters for each particle in the inner loop. This controls the granularity
+#'   of the latent rank estimation. Defaults to 50.
+#' @param max_particle_filters Integer specifying the maximum number of particle
+#'   filters allowed. The algorithm can adaptively increase the number of
+#'   filters up to this limit when the acceptance rate is low. Defaults to
+#'   10000.
+#' @param resampling_threshold Numeric specifying the effective sample size
+#'   threshold for triggering resampling. When the effective sample size falls
+#'   below this threshold, the particles are resampled to avoid degeneracy.
+#'   Defaults to `n_particles / 2`.
+#' @param doubling_threshold Numeric threshold for particle filter doubling. If
+#'   the acceptance rate of the rejuvenation step falls below this threshold,
+#'   the number of particle filters is doubled (up to `max_particle_filters`)
+#'   to improve mixing. Should be between 0 and 1. Defaults to 0.2.
+#' @param max_rejuvenation_steps Integer specifying the maximum number of
+#'   rejuvenation MCMC steps to perform. The rejuvenation step helps maintain
+#'   particle diversity. The algorithm stops early if the number of unique
+#'   particles exceeds half the total number of particles. Defaults to 20.
+#' @param metric Character string specifying the distance metric to use for
+#'   comparing rankings. Options are `"footrule"` (default), `"spearman"`,
+#'   `"kendall"`, `"cayley"`, `"hamming"`, or `"ulam"`. The choice of metric
+#'   affects the likelihood function in the Mallows model.
+#' @param resampler Character string specifying the resampling algorithm.
+#'   Options are `"multinomial"` (default), `"residual"`, `"stratified"`, or
+#'   `"systematic"`. Different resamplers have different variance properties.
+#' @param latent_rank_proposal Character string specifying the proposal
+#'   distribution for latent ranks in the Metropolis-Hastings step. Options are
+#'   `"uniform"` (default) or `"pseudo"`. The `"pseudo"` option can provide
+#'   better proposals for partial rankings.
+#' @param verbose Logical indicating whether to print progress messages during
+#'   computation. Defaults to `FALSE`.
+#' @param trace Logical specifying whether to save static parameters (alpha,
+#'   rho, cluster probabilities) at each timestep. This is useful for
+#'   diagnostic purposes but increases memory usage. Defaults to `FALSE`.
 #' @param trace_latent Logical specifying whether to sample and save one
-#'   complete set of latent rankings for each particle and each timepoint.
+#'   complete set of latent rankings for each particle at each timepoint. This
+#'   can be used to inspect the evolution of rankings over time but
+#'   substantially increases memory usage. Defaults to `FALSE`.
+#'
+#' @details
+#' The SMC2 algorithm uses a nested particle filter structure:
+#' \itemize{
+#'   \item The outer loop maintains `n_particles` particles, each representing
+#'     a hypothesis about the static parameters (alpha, rho, cluster
+#'     probabilities).
+#'   \item The inner loop uses `n_particle_filters` particle filters per outer
+#'     particle to estimate the latent rankings.
+#'   \item When new data arrives, the algorithm updates the particle weights and
+#'     performs rejuvenation MCMC steps to maintain diversity.
+#'   \item If particle degeneracy occurs (effective sample size below
+#'     `resampling_threshold`), particles are resampled.
+#'   \item If the acceptance rate during rejuvenation is low (below
+#'     `doubling_threshold`), the number of particle filters is adaptively
+#'     doubled.
+#' }
+#'
+#' For computational efficiency with CRAN examples, use smaller values such as
+#' `n_particles = 100` and `n_particle_filters = 1`.
+#'
+#' @return A list containing all the specified options, suitable for passing to
+#'   [compute_sequentially()].
+#'
+#' @seealso [compute_sequentially()], [set_hyperparameters()]
 #'
-#' @return A list
 #' @export
 #'
+#' @examples
+#' # Basic usage with default settings
+#' opts <- set_smc_options()
+#'
+#' # Customize for faster computation (suitable for CRAN examples)
+#' opts_fast <- set_smc_options(
+#'   n_particles = 100,
+#'   n_particle_filters = 1
+#' )
+#'
+#' # Use with complete rankings data
+#' mod <- compute_sequentially(
+#'   complete_rankings,
+#'   hyperparameters = set_hyperparameters(n_items = 5),
+#'   smc_options = set_smc_options(n_particles = 100, n_particle_filters = 1)
+#' )
+#'
+#' # Customize resampling and metric
+#' opts_custom <- set_smc_options(
+#'   n_particles = 200,
+#'   n_particle_filters = 10,
+#'   resampler = "stratified",
+#'   metric = "kendall"
+#' )
+#'
+#' # Enable diagnostic output
+#' opts_trace <- set_smc_options(
+#'   n_particles = 100,
+#'   n_particle_filters = 1,
+#'   verbose = TRUE,
+#'   trace = TRUE
+#' )
+#'
+#' # For partial rankings with more rejuvenation steps
+#' mod_partial <- compute_sequentially(
+#'   partial_rankings[1:10, ],
+#'   hyperparameters = set_hyperparameters(n_items = 5),
+#'   smc_options = set_smc_options(
+#'     n_particles = 30,
+#'     n_particle_filters = 5,
+#'     max_rejuvenation_steps = 10,
+#'     latent_rank_proposal = "uniform"
+#'   )
+#' )
+#'
 set_smc_options <- function(
     n_particles = 1000, n_particle_filters = 50, max_particle_filters = 10000,
     resampling_threshold = n_particles / 2, doubling_threshold = .2,
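
Reviewer note: the new `@details` text describes two threshold rules, resampling when the effective sample size drops below `resampling_threshold` and doubling the number of particle filters when the rejuvenation acceptance rate drops below `doubling_threshold`. The sketch below illustrates those two rules in plain R. It assumes the standard effective sample size estimate, 1 / sum(w^2) for normalized weights. All three helper functions are hypothetical names made up for illustration; they are not part of BayesMallowsSMC2, and the package's internal (C++) implementation may differ in detail.

```r
# Hypothetical helpers for illustration only; not part of BayesMallowsSMC2.

# Effective sample size of a weight vector (standard estimate, assumed here).
# The weights are normalized first, so unnormalized input is fine.
effective_sample_size <- function(weights) {
  w <- weights / sum(weights)
  1 / sum(w^2)
}

# Rule 1: resample when the effective sample size is below the threshold.
needs_resampling <- function(weights, resampling_threshold) {
  effective_sample_size(weights) < resampling_threshold
}

# Rule 2: double the number of particle filters when the acceptance rate is
# below the threshold, capped at max_particle_filters.
next_n_particle_filters <- function(acceptance_rate, n_particle_filters,
                                    doubling_threshold = 0.2,
                                    max_particle_filters = 10000) {
  if (acceptance_rate < doubling_threshold) {
    min(2 * n_particle_filters, max_particle_filters)
  } else {
    n_particle_filters
  }
}

# Equal weights give the maximal effective sample size (no resampling needed);
# a single dominant weight gives an effective sample size of 1.
uniform_weights <- rep(1, 1000)
degenerate_weights <- c(1, rep(0, 999))

needs_resampling(uniform_weights, resampling_threshold = 500)     # FALSE
needs_resampling(degenerate_weights, resampling_threshold = 500)  # TRUE

next_n_particle_filters(0.1, 50)    # 100: low acceptance, so doubled
next_n_particle_filters(0.1, 8000)  # 10000: doubling is capped
next_n_particle_filters(0.5, 50)    # 50: acceptance is high enough
```

With the documented defaults (`n_particles = 1000`, `resampling_threshold = n_particles / 2`), the first rule would trigger once the effective sample size falls below 500.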

README.Rmd

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ knitr::opts_chunk$set(
 [![R-CMD-check](https://github.com/osorensen/BayesMallowsSMC2/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/osorensen/BayesMallowsSMC2/actions/workflows/R-CMD-check.yaml)
 <!-- badges: end -->
 
-BayesMallowsSMC2 provides functions for performing sequential inference in the Bayesian Mallows model using the SMC$^{2}$ algorithm.
+BayesMallowsSMC2 provides functions for performing sequential inference in the Bayesian Mallows model using the SMC2 algorithm.
 
 ## Installation
 

README.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
 <!-- badges: end -->
 
 BayesMallowsSMC2 provides functions for performing sequential inference
-in the Bayesian Mallows model using the SMC$^{2}$ algorithm.
+in the Bayesian Mallows model using the SMC2 algorithm.
 
 ## Installation
 

man/set_smc_options.Rd

Lines changed: 115 additions & 20 deletions
Some generated files are not rendered by default.
