Commit db9d793

committed
compilling
1 parent 14b69ec commit db9d793

5 files changed: 469 additions & 40 deletions

DESCRIPTION

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ License: GPL-3
 Encoding: UTF-8
 LazyData: true
 Roxygen: list(markdown = TRUE)
-RoxygenNote: 7.3.2
+RoxygenNote: 7.3.3
 LinkingTo:
     Rcpp,
     RcppArmadillo

README.md

Lines changed: 178 additions & 5 deletions
@@ -9,7 +9,25 @@
 <!-- badges: end -->
 
 BayesMallowsSMC2 provides functions for performing sequential inference
-in the Bayesian Mallows model using the SMC$^{2}$ algorithm.
+in the Bayesian Mallows model using the nested sequential Monte Carlo
+(SMC²) algorithm. This package implements the methodology described in
+Sørensen et al. (2025) for learning from ranking and preference data
+that arrives sequentially over time.
+
+## Key Features
+
+- **Sequential Learning**: Process ranking and preference data as it
+  arrives over time
+- **Nested SMC² Algorithm**: Efficient particle-based inference for
+  complex parameter-latent state dependencies
+- **Multiple Data Types**: Support for complete rankings, partial
+  rankings, and pairwise preferences
+- **Flexible Distance Metrics**: Kendall, Cayley, Hamming, Footrule,
+  Spearman, and Ulam distances
+- **Mixture Models**: Multi-cluster Bayesian Mallows models for
+  heterogeneous populations
+- **Real-time Inference**: Online posterior updates without reprocessing
+  historical data
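
Two of the metrics listed above are easy to state directly: the Kendall distance counts the item pairs that two rankings order differently, and the footrule distance sums the absolute rank differences. The base-R sketch below illustrates both. `kendall_distance()` and `footrule_distance()` are hypothetical helpers written for this illustration; they are not functions exported by BayesMallowsSMC2, and the package's compiled implementation may differ.

```r
# Hypothetical helpers for illustration only (not part of BayesMallowsSMC2).
# r1 and r2 are rank vectors: element i holds the rank given to item i.

# Kendall distance: number of item pairs ordered differently by r1 and r2.
kendall_distance <- function(r1, r2) {
  stopifnot(length(r1) == length(r2), length(r1) >= 2)
  pairs <- utils::combn(length(r1), 2)
  d1 <- r1[pairs[1, ]] - r1[pairs[2, ]]
  d2 <- r2[pairs[1, ]] - r2[pairs[2, ]]
  sum(d1 * d2 < 0)
}

# Footrule distance: sum of absolute rank differences.
footrule_distance <- function(r1, r2) {
  stopifnot(length(r1) == length(r2))
  sum(abs(r1 - r2))
}

r1 <- c(1, 2, 3)
r2 <- c(3, 1, 2)
kendall_distance(r1, r2)   # 2
footrule_distance(r1, r2)  # 4
```

The two metrics penalize disagreement differently: Kendall depends only on pairwise order, while footrule depends on how far each item moves.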
 
 ## Installation
 
@@ -21,8 +39,163 @@ You can install the development version of BayesMallowsSMC2 from
 devtools::install_github("osorensen/BayesMallowsSMC2")
 ```
 
-## Usage
+## Quick Start
+
+Here’s a basic example of sequential ranking analysis:
+
+``` r
+library(BayesMallowsSMC2)
+
+# Generate synthetic ranking data
+set.seed(123)
+n_items <- 5
+n_users <- 20
+n_timepoints <- 10
+
+# Create sequential ranking data
+data <- expand.grid(
+  timepoint = 1:n_timepoints,
+  user = 1:n_users
+)
+
+# Add rankings for each item (1 = most preferred, 5 = least preferred)
+for(i in 1:n_items) {
+  data[[paste0("item", i)]] <- sample(1:n_items, nrow(data), replace = TRUE)
+}
+
+# Set up model parameters
+hyperparams <- set_hyperparameters(
+  n_items = n_items,
+  n_clusters = 2,  # Two preference groups
+  alpha_shape = 2,  # Precision prior
+  alpha_rate = 1
+)
+
+# Configure SMC² algorithm
+smc_opts <- set_smc_options(
+  n_particles = 500,  # Parameter particles
+  n_particle_filters = 100,  # Latent state filters per particle
+  metric = "kendall",  # Distance metric
+  verbose = TRUE
+)
+
+# Run sequential inference
+result <- compute_sequentially(
+  data = data,
+  hyperparameters = hyperparams,
+  smc_options = smc_opts
+)
+
+# Analyze results
+summary(result)
+plot(result)
+```
+
+## Algorithm Overview
+
+The nested SMC² algorithm operates on two levels:
+
+1. **Outer SMC (Parameter Level)**: Maintains particles representing
+   samples from the posterior distribution of static parameters
+   (precision α, modal rankings ρ, cluster probabilities τ)
+
+2. **Inner SMC (Latent State Level)**: For each parameter particle,
+   runs multiple particle filters to track the evolution of latent
+   rankings and cluster assignments over time
+
+This nested structure enables efficient inference in the complex joint
+parameter-latent state space while maintaining proper uncertainty
+quantification.
+
+## Data Formats
+
+### Complete Rankings
+
+``` r
+data <- data.frame(
+  timepoint = c(1, 1, 2, 2),
+  user = c(1, 2, 1, 2),
+  item1 = c(1, 2, 1, 3),  # Rankings for item 1
+  item2 = c(2, 1, 2, 1),  # Rankings for item 2
+  item3 = c(3, 3, 3, 2)  # Rankings for item 3
+)
+```
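
A complete ranking assigns each item a distinct rank, so the item columns of every row should form a permutation of 1 to the number of items. The sketch below checks that property before data is passed on. `is_complete_ranking()` is a hypothetical helper introduced here for illustration, not a BayesMallowsSMC2 function, and the column names are taken from the example above.

```r
# Hypothetical helper for illustration only (not part of BayesMallowsSMC2).
# Returns one logical per row: TRUE when the item columns of that row form
# a permutation of 1..n_items (no ties, no gaps, no missing values).
is_complete_ranking <- function(data, item_cols) {
  ranks <- as.matrix(data[, item_cols, drop = FALSE])
  n_items <- ncol(ranks)
  apply(ranks, 1, function(r) {
    !anyNA(r) && all(sort(r) == seq_len(n_items))
  })
}

data <- data.frame(
  timepoint = c(1, 1, 2, 2),
  user = c(1, 2, 1, 2),
  item1 = c(1, 2, 1, 3),
  item2 = c(2, 1, 2, 1),
  item3 = c(3, 3, 3, 2)
)

valid <- is_complete_ranking(data, c("item1", "item2", "item3"))
stopifnot(all(valid))
```

Drawing each item column independently, for example with `sample(..., replace = TRUE)`, will generally fail this check, because the ranks within a row are then not guaranteed to be distinct.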
+
+### Pairwise Preferences
+
+``` r
+# First precompute topological sorts
+prefs_matrix <- matrix(c(1, 2, 2, 3, 3, 1), ncol = 2, byrow = TRUE)
+topo_sorts <- precompute_topological_sorts(
+  prefs = prefs_matrix,
+  n_items = 3,
+  save_frac = 0.1
+)
+
+# Create preference data
+pref_data <- data.frame(
+  timepoint = c(1, 1, 2, 2),
+  user = c(1, 2, 1, 2),
+  top_item = c(1, 2, 3, 1),  # Preferred item
+  bottom_item = c(2, 3, 1, 3)  # Dispreferred item
+)
+
+# Run analysis
+result <- compute_sequentially(
+  data = pref_data,
+  hyperparameters = set_hyperparameters(n_items = 3),
+  smc_options = set_smc_options(n_particles = 200),
+  topological_sorts = topo_sorts
+)
+```
+
+## Performance Tuning
+
+The algorithm’s performance depends on several key parameters:
+
+- **n_particles**: More particles improve accuracy but increase
+  computational cost
+- **n_particle_filters**: More filters per particle improve latent state
+  estimation
+- **resampling_threshold**: Controls when to resample particles
+  (default: n_particles/2)
+- **metric**: Different distance metrics capture different aspects of
+  ranking disagreement
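
To make the role of `resampling_threshold` concrete, the sketch below computes the effective sample size (ESS) of a set of particle weights and compares it with `n_particles / 2`. That the package triggers resampling on an ESS criterion is an assumption based on common SMC practice; the README text only gives the default value. `effective_sample_size()` is a hypothetical helper, not a package function.

```r
# Hypothetical helper for illustration only (not part of BayesMallowsSMC2).
# ESS = 1 / sum(w^2) for normalized weights w; it equals the number of
# particles when weights are uniform and approaches 1 when one dominates.
effective_sample_size <- function(log_weights) {
  w <- exp(log_weights - max(log_weights))  # subtract max for stability
  w <- w / sum(w)
  1 / sum(w^2)
}

n_particles <- 500
resampling_threshold <- n_particles / 2  # default stated in the README text

set.seed(1)
log_weights <- rnorm(n_particles, sd = 2)  # synthetic weights, illustration only
ess <- effective_sample_size(log_weights)

# Assumed rule: resample when the ESS drops below the threshold.
needs_resampling <- ess < resampling_threshold
```

Under this assumed rule, a higher threshold resamples more often, which limits weight degeneracy but adds resampling noise and computation.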
+
+For large-scale problems, consider:
+
+``` r
+# High-performance configuration
+smc_opts <- set_smc_options(
+  n_particles = 1000,
+  n_particle_filters = 200,
+  max_particle_filters = 1000,
+  resampler = "systematic",
+  trace = FALSE,  # Reduce memory usage
+  trace_latent = FALSE
+)
+```
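
The `resampler = "systematic"` option above refers to systematic resampling, which uses a single uniform draw to place evenly spaced points on the cumulative weight distribution. The following is a textbook base-R sketch of that scheme. `systematic_resample()` is a hypothetical helper, and the package's internal implementation may differ in its details.

```r
# Hypothetical helper for illustration only (not part of BayesMallowsSMC2).
# Returns the indices of the particles selected by systematic resampling.
systematic_resample <- function(weights) {
  n <- length(weights)
  w <- weights / sum(weights)
  # One uniform draw, then n evenly spaced points in (0, 1).
  u <- (stats::runif(1) + seq_len(n) - 1) / n
  # For each point, pick the first particle whose cumulative weight exceeds it.
  idx <- findInterval(u, cumsum(w)) + 1L
  pmin(idx, n)  # guard against floating-point rounding in cumsum
}

set.seed(42)
weights <- c(0.1, 0.2, 0.3, 0.4)
ancestors <- systematic_resample(weights)
table(factor(ancestors, levels = seq_along(weights)))
```

Because the points are evenly spaced, a particle with normalized weight w is selected either floor(n * w) or ceiling(n * w) times, which typically gives lower variance than multinomial resampling.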
+
+## Citation
+
+If you use this package in your research, please cite:
+
+> Sørensen, Ø., Stein, A., Netto, W. L., & Leslie, D. S. (2025).
+> Sequential Rank and Preference Learning with the Bayesian Mallows
+> Model. *Bayesian Analysis*. DOI: 10.1214/25-BA1564.
+
+## Additional Resources
+
+- **Paper**: The foundational methodology paper with theoretical details
+  and empirical studies
+- **OSF Repository**: <https://osf.io/pquk4/> - Contains replication
+  code and additional examples
+- **Vignettes**: Detailed tutorials and case studies (coming soon)
+
+## Getting Help
 
-This package is under development, and is not yet well documented. For
-examples on how to use it, see the code in the OSF repository
-<https://osf.io/pquk4/>.
+- **Issues**: Report bugs and request features on [GitHub
+  Issues](https://github.com/osorensen/BayesMallowsSMC2/issues)
+- **Documentation**: Use `?function_name` for detailed help on specific
+  functions
+- **Examples**: See function documentation for comprehensive examples

man/compute_sequentially.Rd

Lines changed: 115 additions & 4 deletions
Some generated files are not rendered by default.