
Commit 7a1735b

Copilot and osorensen committed
Document set_smc_options with verbose descriptions and examples
Co-authored-by: osorensen <21175639+osorensen@users.noreply.github.com>
1 parent 49b92db commit 7a1735b

1 file changed: +116 -19 lines changed

R/set_smc_options.R

Lines changed: 116 additions & 19 deletions
@@ -1,28 +1,125 @@
 #' Set SMC options
 #'
-#' @param n_particles Number of particles
-#' @param n_particle_filters Initial number of particle filters for each
-#'   particle
-#' @param max_particle_filters Maximum number of particle filters.
-#' @param resampling_threshold Effective sample size threshold for resampling
-#' @param doubling_threshold Threshold for particle filter doubling. If the
-#'   acceptance rate of the rejuvenation step falls below this threshold, the
-#'   number of particle filters is doubled. Defaults to 0.2.
-#' @param max_rejuvenation_steps Maximum number of rejuvenation steps. If the
-#'   number of unique particles has not exceeded half the number of particles
-#'   after this many steps, the rejuvenation is still stopped.
-#' @param metric Metric
-#' @param resampler resampler
-#' @param latent_rank_proposal latent rank proposal
-#' @param verbose Boolean
-#' @param trace Logical specifying whether to save static parameters at each
-#'   timestep.
+#' @description
+#' Configure the SMC² (Sequential Monte Carlo Squared) algorithm for the
+#' Bayesian Mallows model. This function sets parameters that control the
+#' particle filter, resampling strategy, and diagnostic output.
+#'
+#' @param n_particles Integer specifying the number of particles to use in the
+#'   outer SMC loop. More particles generally improve approximation accuracy but
+#'   increase computational cost. Defaults to 1000.
+#' @param n_particle_filters Integer specifying the initial number of particle
+#'   filters for each particle in the inner loop. This controls the granularity
+#'   of the latent rank estimation. Defaults to 50.
+#' @param max_particle_filters Integer specifying the maximum number of particle
+#'   filters allowed. The algorithm can adaptively increase the number of
+#'   filters up to this limit when the acceptance rate is low. Defaults to
+#'   10000.
+#' @param resampling_threshold Numeric specifying the effective sample size
+#'   threshold for triggering resampling. When the effective sample size falls
+#'   below this threshold, the particles are resampled to avoid degeneracy.
+#'   Defaults to `n_particles / 2`.
+#' @param doubling_threshold Numeric threshold for particle filter doubling. If
+#'   the acceptance rate of the rejuvenation step falls below this threshold,
+#'   the number of particle filters is doubled (up to `max_particle_filters`)
+#'   to improve mixing. Should be between 0 and 1. Defaults to 0.2.
+#' @param max_rejuvenation_steps Integer specifying the maximum number of
+#'   rejuvenation MCMC steps to perform. The rejuvenation step helps maintain
+#'   particle diversity. The algorithm stops early if the number of unique
+#'   particles exceeds half the total number of particles. Defaults to 20.
+#' @param metric Character string specifying the distance metric to use for
+#'   comparing rankings. Options are `"footrule"` (default), `"spearman"`,
+#'   `"kendall"`, `"cayley"`, `"hamming"`, or `"ulam"`. The choice of metric
+#'   affects the likelihood function in the Mallows model.
+#' @param resampler Character string specifying the resampling algorithm.
+#'   Options are `"multinomial"` (default), `"residual"`, `"stratified"`, or
+#'   `"systematic"`. Different resamplers have different variance properties.
+#' @param latent_rank_proposal Character string specifying the proposal
+#'   distribution for latent ranks in the Metropolis-Hastings step. Options are
+#'   `"uniform"` (default) or `"pseudo"`. The `"pseudo"` option can provide
+#'   better proposals for partial rankings.
+#' @param verbose Logical indicating whether to print progress messages during
+#'   computation. Defaults to `FALSE`.
+#' @param trace Logical specifying whether to save static parameters (alpha,
+#'   rho, cluster probabilities) at each timestep. This is useful for
+#'   diagnostic purposes but increases memory usage. Defaults to `FALSE`.
 #' @param trace_latent Logical specifying whether to sample and save one
-#'   complete set of latent rankings for each particle and each timepoint.
+#'   complete set of latent rankings for each particle at each timepoint. This
+#'   can be used to inspect the evolution of rankings over time but
+#'   substantially increases memory usage. Defaults to `FALSE`.
+#'
+#' @details
+#' The SMC² algorithm uses a nested particle filter structure:
+#' \itemize{
+#'   \item The outer loop maintains `n_particles` particles, each representing
+#'     a hypothesis about the static parameters (alpha, rho, cluster
+#'     probabilities).
+#'   \item The inner loop uses `n_particle_filters` particle filters per outer
+#'     particle to estimate the latent rankings.
+#'   \item When new data arrives, the algorithm updates the particle weights and
+#'     performs rejuvenation MCMC steps to maintain diversity.
+#'   \item If particle degeneracy occurs (effective sample size below
+#'     `resampling_threshold`), particles are resampled.
+#'   \item If the acceptance rate during rejuvenation is low (below
+#'     `doubling_threshold`), the number of particle filters is adaptively
+#'     doubled.
+#' }
+#'
+#' For computational efficiency with CRAN examples, use smaller values such as
+#' `n_particles = 100` and `n_particle_filters = 1`.
+#'
+#' @return A list containing all the specified options, suitable for passing to
+#'   [compute_sequentially()].
+#'
+#' @seealso [compute_sequentially()], [set_hyperparameters()]
 #'
-#' @return A list
 #' @export
 #'
+#' @examples
+#' # Basic usage with default settings
+#' opts <- set_smc_options()
+#'
+#' # Customize for faster computation (suitable for CRAN examples)
+#' opts_fast <- set_smc_options(
+#'   n_particles = 100,
+#'   n_particle_filters = 1
+#' )
+#'
+#' # Use with complete rankings data
+#' mod <- compute_sequentially(
+#'   complete_rankings,
+#'   hyperparameters = set_hyperparameters(n_items = 5),
+#'   smc_options = set_smc_options(n_particles = 100, n_particle_filters = 1)
+#' )
+#'
+#' # Customize resampling and metric
+#' opts_custom <- set_smc_options(
+#'   n_particles = 200,
+#'   n_particle_filters = 10,
+#'   resampler = "stratified",
+#'   metric = "kendall"
+#' )
+#'
+#' # Enable diagnostic output
+#' opts_trace <- set_smc_options(
+#'   n_particles = 100,
+#'   n_particle_filters = 1,
+#'   verbose = TRUE,
+#'   trace = TRUE
+#' )
+#'
+#' # For partial rankings, with a lower cap on rejuvenation steps (10 instead
+#' # of the default 20) and the pseudo proposal
+#' mod_partial <- compute_sequentially(
+#'   partial_rankings,
+#'   hyperparameters = set_hyperparameters(n_items = 5),
+#'   smc_options = set_smc_options(
+#'     n_particles = 100,
+#'     n_particle_filters = 5,
+#'     max_rejuvenation_steps = 10,
+#'     latent_rank_proposal = "pseudo"
+#'   )
+#' )
+#'
 set_smc_options <- function(
     n_particles = 1000, n_particle_filters = 50, max_particle_filters = 10000,
     resampling_threshold = n_particles / 2, doubling_threshold = .2,
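
The documentation added in this commit describes two adaptive rules: resample when the effective sample size falls below `resampling_threshold`, and double the number of particle filters when the rejuvenation acceptance rate falls below `doubling_threshold`. The base R sketch below illustrates both rules in isolation. The helper names (`effective_sample_size`, `should_resample`, `next_n_particle_filters`) are hypothetical and are not taken from the package. The sketch assumes the usual definition of effective sample size, 1 / sum(w^2) over normalized weights, and assumes that doubling is capped at `max_particle_filters`. The package's internal implementation may differ.

```r
# Illustrative sketch only: these helpers are hypothetical and not part of the
# package. They mirror the rules described in the roxygen documentation.

# Effective sample size (assumed definition: 1 / sum of squared normalized
# weights). Accepts unnormalized weights and normalizes them first.
effective_sample_size <- function(weights) {
  w <- weights / sum(weights)
  1 / sum(w^2)
}

# Resample when the ESS drops below the threshold. The default mirrors the
# documented default of n_particles / 2.
should_resample <- function(weights, threshold = length(weights) / 2) {
  effective_sample_size(weights) < threshold
}

# Double the number of particle filters when the acceptance rate is low,
# capped at max_particle_filters (the cap behaviour is an assumption).
next_n_particle_filters <- function(n_particle_filters, acceptance_rate,
                                    doubling_threshold = 0.2,
                                    max_particle_filters = 10000) {
  if (acceptance_rate < doubling_threshold) {
    min(2 * n_particle_filters, max_particle_filters)
  } else {
    n_particle_filters
  }
}

# Evenly weighted particles: ESS equals the number of particles.
uniform_weights <- rep(1, 1000)
# One dominant particle: ESS collapses towards 1.
degenerate_weights <- c(1, rep(1e-6, 999))

resample_uniform <- should_resample(uniform_weights)
resample_degenerate <- should_resample(degenerate_weights)
filters_low_acceptance <- next_n_particle_filters(50, acceptance_rate = 0.1)
filters_ok_acceptance <- next_n_particle_filters(50, acceptance_rate = 0.5)
filters_at_cap <- next_n_particle_filters(8000, acceptance_rate = 0.1)
```

With evenly weighted particles no resampling is triggered, while a single dominant weight triggers it. A low acceptance rate doubles 50 filters to 100, and a request that would exceed the cap is clipped to `max_particle_filters`.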
