Skip to content

Commit c328bb1

Browse files
committed
Analytics: Capture curated PostHog feature events through one consent-gated backend path
M4 of the beta-analytics plan: PII-free product events ride the same `anal_` install id, tri-state consent gate, and dev/CI suppression as the heartbeat, so the beta dashboard gets real feature usage without ever being able to tie usage to a person. - `analytics/posthog.rs`: `capture(event, props)` builds the `/capture/` body (`distinct_id` = the `anal_` id, `properties.source = "desktop"`, `$set` = the heartbeat's config-shape verbatim) and fire-and-forgets to `https://eu.i.posthog.com/capture/`. Key via `option_env!("CMDR_POSTHOG_KEY")`; `None` locally is a no-op. A debug-build `sanitize_props` net logs a scoped `warn!` if any string prop value looks PII-shaped (`/`, `\`, `@`, or `~/`), catching an accidental path/email/query in dev before it ships. - `track_event` Tauri command (`commands/analytics.rs`): thin async pass-through so frontend events use ONE backend path. Takes `props_json: String` (the typed `trackEvent` wrapper JSON-stringifies) since `serde_json::Value` can't cross specta. No capability entry (custom app commands aren't ACL-gated). - Open, extensible API: `capture` / `track_event` take arbitrary names + PII-free prop maps, so future events are a one-liner. The "how to add an event" recipe + the PII-free convention live in `analytics/CLAUDE.md`. - Starter event set (enums/counts/bools only; never paths, names, queries, prompts): `app_launched`, `pane_navigated` (volume_kind), `search_used` (mode), `select_files_used` (mode + action), `file_transfer_completed` (op, item_count bucket, had_conflicts), `delete_used` (trashed, item_count), `smb_connected`, `mtp_connected`, `settings_opened`, `error_encountered` (FriendlyError category). - Build env: `CMDR_POSTHOG_KEY` added to the `tauri-action` step's `env:` in `release.yml` (value is a GitHub secret, never in-repo); `build.rs` gets a `rerun-if-env-changed` so a changed key recompiles the consumer. - Docs: `analytics/CLAUDE.md` (open API, sanitize_props net, single `track_event` path, `option_env!` key, `$set` = config-shape) and `docs/tooling/posthog.md` (desktop now captures; project 136072, EU host, key mechanism).
1 parent d1c481f commit c328bb1

25 files changed

Lines changed: 520 additions & 7 deletions

File tree

.github/workflows/release.yml

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -112,6 +112,9 @@ jobs:
112112
APPLE_API_ISSUER: ${{ secrets.APPLE_API_ISSUER }}
113113
APPLE_API_KEY: ${{ secrets.APPLE_API_KEY }}
114114
APPLE_API_KEY_PATH: ~/private_keys/AuthKey_${{ secrets.APPLE_API_KEY }}.p8
115+
# Public PostHog ingest key (phc_...), baked into the build via option_env! for desktop
116+
# feature events. Absent locally (events no-op in dev); set as a GitHub secret for releases.
117+
CMDR_POSTHOG_KEY: ${{ secrets.CMDR_POSTHOG_KEY }}
115118
with:
116119
projectPath: ./apps/desktop
117120
args: --target ${{ matrix.target }}

apps/desktop/src-tauri/build.rs

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,9 @@ fn main() {
22
println!("cargo:rerun-if-changed=build.rs");
33
println!("cargo:rerun-if-changed=resources/ai/.version");
44
println!("cargo:rerun-if-changed=../scripts/download-llama-server.go");
5+
// `analytics::posthog` reads this via `option_env!`, which is baked at compile time. Without
6+
// this line, changing the env var between builds wouldn't trigger a recompile of that consumer.
7+
println!("cargo:rerun-if-env-changed=CMDR_POSTHOG_KEY");
58

69
// Ensure resources/ai/ is populated before tauri_build::build() validates the
710
// resource glob in tauri.conf.json. The Go script is idempotent (skips when

apps/desktop/src-tauri/src/analytics/CLAUDE.md

Lines changed: 72 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -103,13 +103,82 @@ Fire-and-forget POST mirroring the crash/error reporters (10 s timeout, errors l
103103
next hourly tick retries). Endpoint: `http://localhost:8787/heartbeat` (debug) /
104104
`https://api.getcmdr.com/heartbeat` (release).
105105

106+
## PostHog feature events
107+
108+
Curated product events ride the SAME consent gate, dev/CI suppression, and `anal_` install id as
109+
the heartbeat. `posthog::capture(event, props)` builds the `/capture/` body and fire-and-forget
110+
POSTs to `https://eu.i.posthog.com/capture/` (EU cloud, project `136072`). Shape:
111+
112+
```json
113+
{ "api_key": "phc_...", "event": "<name>", "distinct_id": "anal_<uuid>",
114+
"properties": { "source": "desktop", ...props }, "$set": <config-shape> }
115+
```
116+
117+
- **`$set` is the config-shape, verbatim.** Person properties reuse `config_shape::build_config_shape`
118+
(the same allowlisted object the heartbeat ships): one source of truth, no second PII surface.
119+
- **`source: "desktop"`** is injected first and can't be shadowed by a caller `source` prop (so the
120+
dashboard always splits desktop events from the website's).
121+
- **The key is `option_env!("CMDR_POSTHOG_KEY")`**, baked at build time (a GitHub secret on the
122+
`tauri-action` step in `release.yml`; `build.rs` has a `rerun-if-env-changed` for it). `None`
123+
locally → `capture` is a no-op (logged once at debug). The key is public by design (PostHog ingest
124+
keys are safe in client code).
125+
- **Backend events call `posthog::capture` directly; frontend events call the `track_event` IPC**
126+
(`commands/analytics.rs`), which is a thin pass-through to `capture`. ONE backend path, ONE consent
127+
gate. The IPC takes `props_json: String` (the frontend's typed `trackEvent` wrapper does the
128+
`JSON.stringify`) because the prop set is open and `serde_json::Value` can't cross specta. No
129+
capability entry needed (custom app commands aren't ACL-gated).
130+
131+
### The open event API + how to add an event
132+
133+
Events are an OPEN set: `capture(name, props)` / `track_event(name, props)` take an arbitrary name
134+
and an arbitrary PII-free prop map. Adding one is a one-liner, no enum, no schema:
135+
136+
- **Backend event**: at the success chokepoint, `crate::analytics::posthog::capture("my_event", serde_json::json!({ "kind": some_enum }))`.
137+
- **Frontend event**: `import { trackEvent } from '$lib/tauri-commands'`, then `void trackEvent('my_event', { kind: someEnum })`.
138+
- **Name internals after the UI** (project rule): the event name uses the feature's user-facing
139+
vocabulary (`pane_navigated`, `search_used`), and props are categorical (`volume_kind`, `mode`).
140+
141+
### PII-free convention + the `sanitize_props` net
142+
143+
Every prop value MUST be a categorical enum, a count (or coarse bucket), or a bool. NEVER a path,
144+
file name, search query, AI prompt, or hostname. This is enforced by review, NOT by redaction.
145+
`posthog::sanitize_props` is a **debug-build backstop**: it scans string prop values and logs a
146+
scoped `warn!` if one looks PII-shaped (contains `/`, `\`, `@`, or a `~/` prefix). It only warns
147+
(never strips), so production behavior is identical with the guard compiled out, and a leak surfaces
148+
loudly in dev before shipping. It's a safety net, not a license to pass free-form strings.
149+
150+
### The starter event set (where each fires)
151+
152+
PII-free; this set grows over time. Backend events fire at success chokepoints; frontend events ride
153+
`track_event`.
154+
155+
- `app_launched` (backend, `lib.rs` setup) — no props.
156+
- `pane_navigated` (frontend, `FilePane.svelte` `handleListingComplete`) — `volume_kind` enum
157+
(`local`/`smb`/`mtp`/`network`/`search-results`); never the path.
158+
- `search_used` (frontend, `SearchDialog.svelte` `runSearch`) — `mode` enum; never the query.
159+
- `select_files_used` (frontend, `SelectionDialog.svelte` `commitMatches`) — `mode` (match mode) +
160+
`action` (add/remove); never the pattern.
161+
- `file_transfer_completed` (backend, `write_operations/types.rs` `TauriEventSink::emit_complete`) —
162+
`op` (copy/move), `item_count` bucket, `had_conflicts` bool (proxied from `files_skipped > 0`, since
163+
skips happen only via conflict resolution); never names/paths.
164+
- `delete_used` (backend, same sink) — `trashed` bool, `item_count` bucket.
165+
- `smb_connected` (backend, `backends/smb.rs` `connect_smb_volume`) — no host/share/credential props.
166+
- `mtp_connected` (backend, `mtp/connection/mod.rs` `connect`) — no device/product props.
167+
- `settings_opened` (frontend, `command-handlers/app-dialog-handlers.ts` `app.settings`) — no props.
168+
- `error_encountered` (backend, `listing/streaming.rs` `TauriListingEventSink::emit_error`) —
169+
`category` enum (from the FriendlyError); never the path/message/provider.
170+
106171
## Files
107172

108173
- `mod.rs`: the heartbeat loop (launch beat + hourly), the consent gate, the payload struct, the
109-
fire-and-forget send. `init(app)` + `start()` mirror `space_poller`'s spawn pattern, wired from
110-
`lib.rs` setup.
174+
fire-and-forget send, and the shared helpers (`suppressed`, `read_raw_settings`, `APP_HANDLE`,
175+
`analytics_consent_granted`) that `posthog` reuses. `init(app)` + `start()` mirror `space_poller`'s
176+
spawn pattern, wired from `lib.rs` setup.
177+
- `posthog.rs`: the PostHog `capture(event, props)` path, the pure `build_capture_body`, the
178+
debug-build `sanitize_props` PII net, and the `option_env!` key mechanism.
111179
- `config_shape.rs`: the pure, unit-tested config-shape builder and the `CATEGORICAL_STRING_KEYS`
112-
allowlist. The only place the PII-free rule lives.
180+
allowlist. The only place the PII-free rule lives. Mirrored as both the heartbeat `config` and the
181+
PostHog `$set` person properties.
113182

114183
## Wiring
115184

apps/desktop/src-tauri/src/analytics/mod.rs

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@
66
//! suppressed in dev/CI builds unless explicitly forced for integration tests.
77
88
mod config_shape;
9+
pub mod posthog;
910

1011
use serde::Serialize;
1112
use std::path::PathBuf;
Lines changed: 263 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,263 @@
1+
//! PostHog feature events: the one backend path for all product analytics events.
2+
//!
3+
//! See `analytics/CLAUDE.md` § "PostHog feature events" for the full model. In short: backend code
4+
//! calls [`capture`] directly, frontend code calls the `track_event` IPC command (which calls
5+
//! [`capture`]). Both ride the SAME consent gate and dev/CI suppression as the heartbeat, attach the
6+
//! `anal_` install id as the PostHog `distinct_id`, and mirror the PII-free config-shape as `$set`
7+
//! person properties.
8+
//!
9+
//! Events are an OPEN set: [`capture`] takes an arbitrary event name plus an arbitrary PII-free prop
10+
//! map, so adding an event later is a one-line call with whatever categorical props that event
11+
//! needs. The PII-free convention (enums, counts, bools only; never paths, names, queries, prompts)
12+
//! is enforced socially by review and backstopped in debug builds by [`sanitize_props`].
13+
14+
use super::config_shape;
15+
use serde_json::{Map, Value};
16+
17+
/// PostHog capture endpoint (EU cloud, project `136072`). Same host the website uses.
18+
const CAPTURE_URL: &str = "https://eu.i.posthog.com/capture/";
19+
20+
/// Network timeout for one fire-and-forget capture. Mirrors the heartbeat sender.
21+
const CAPTURE_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(10);
22+
23+
/// The public PostHog project key (`phc_...`), baked at build time via the `CMDR_POSTHOG_KEY` env
24+
/// var (a GitHub secret for release builds). `None` for local dev builds, where `capture` is a
25+
/// no-op. The key is public by design (PostHog ingest keys are safe in client code).
26+
const POSTHOG_KEY: Option<&str> = option_env!("CMDR_POSTHOG_KEY");
27+
28+
/// Captures one PostHog feature event. Fire-and-forget: builds the payload and spawns the POST, then
29+
/// returns immediately so no call site ever blocks on the network.
30+
///
31+
/// Gated identically to the heartbeat: a no-op in dev/CI builds (unless `CMDR_ANALYTICS_FORCE=1`),
32+
/// a no-op when the user opted out (`analytics.enabled == Some(false)`), and a no-op when no
33+
/// `CMDR_POSTHOG_KEY` was baked in (local dev). `props` is an arbitrary PII-free object; pass
34+
/// `serde_json::json!({})` for an event with no properties.
35+
pub fn capture(event: &str, props: Value) {
36+
if super::suppressed() {
37+
log::debug!(target: "analytics", "PostHog event '{event}' suppressed (dev or CI, no force override)");
38+
return;
39+
}
40+
41+
// Consent reuses the SAME tri-state gate the heartbeat uses, read through the shared settings
42+
// loader so consent resolution stays consistent app-wide.
43+
let Some(app) = super::APP_HANDLE.get() else {
44+
log::warn!(target: "analytics", "PostHog event '{event}' skipped: app handle not initialized");
45+
return;
46+
};
47+
let settings = crate::settings::load_settings(app);
48+
if !super::analytics_consent_granted(settings.analytics_enabled) {
49+
// Fully silent: an opted-out install sends nothing at all.
50+
return;
51+
}
52+
53+
let Some(api_key) = POSTHOG_KEY else {
54+
// Local dev: no key baked in. Log once at debug so a dev sees why nothing ships.
55+
log_missing_key_once();
56+
return;
57+
};
58+
59+
let fda_granted = !crate::fda_gate::is_fda_pending_runtime();
60+
let config = config_shape::build_config_shape(&super::read_raw_settings(), fda_granted);
61+
let body = build_capture_body(api_key, event, props, &crate::install_id::analytics_id(), config);
62+
63+
send_capture(body);
64+
}
65+
66+
/// Builds the PostHog `/capture/` request body. Pure (no I/O, no gating), so it's directly
67+
/// unit-testable. Shape:
68+
///
69+
/// ```json
70+
/// {
71+
/// "api_key": "phc_...",
72+
/// "event": "<name>",
73+
/// "distinct_id": "anal_<uuid>",
74+
/// "properties": { "source": "desktop", ...props },
75+
/// "$set": <config-shape>
76+
/// }
77+
/// ```
78+
///
79+
/// `source: "desktop"` is injected first so a stray `source` in `props` can't shadow it (and so the
80+
/// dashboard can always split desktop events from the website's). The config-shape is the SAME
81+
/// allowlisted object the heartbeat ships, so there's exactly one source of truth for person
82+
/// properties.
83+
fn build_capture_body(api_key: &str, event: &str, props: Value, distinct_id: &str, config: Value) -> Value {
84+
let mut properties = Map::new();
85+
properties.insert("source".to_string(), Value::String("desktop".to_string()));
86+
if let Value::Object(prop_map) = sanitize_props(event, props) {
87+
for (key, value) in prop_map {
88+
// `source` injected above wins: skip any caller-supplied `source`.
89+
properties.entry(key).or_insert(value);
90+
}
91+
}
92+
93+
Value::Object(Map::from_iter([
94+
("api_key".to_string(), Value::String(api_key.to_string())),
95+
("event".to_string(), Value::String(event.to_string())),
96+
("distinct_id".to_string(), Value::String(distinct_id.to_string())),
97+
("properties".to_string(), Value::Object(properties)),
98+
("$set".to_string(), config),
99+
]))
100+
}
101+
102+
/// Dev-build PII backstop: scans string prop VALUES for PII shapes and logs a scoped warning if one
103+
/// slips through. This is a safety net for the open prop map, NOT a substitute for the PII-free
104+
/// convention (every event must pass only enums / counts / bools by design). Numbers, bools, and
105+
/// short enum strings pass freely; a string containing `/`, `\`, `@`, or a `~/` home prefix trips
106+
/// the guard. Returns `props` unchanged either way (it never strips, only warns) so production
107+
/// behavior is identical with the guard compiled out.
108+
fn sanitize_props(event: &str, props: Value) -> Value {
109+
#[cfg(debug_assertions)]
110+
if let Value::Object(map) = &props {
111+
for (key, value) in map {
112+
if let Value::String(s) = value
113+
&& looks_pii_shaped(s)
114+
{
115+
log::warn!(
116+
target: "analytics",
117+
"PostHog event '{event}' prop '{key}' looks PII-shaped (contains a path / email / home-prefix). \
118+
Analytics props must be PII-free enums/counts/bools only; never paths, names, queries, or prompts."
119+
);
120+
}
121+
}
122+
}
123+
// Reference `event` on the release path so the param isn't flagged unused with the guard off.
124+
let _ = event;
125+
props
126+
}
127+
128+
/// Whether a string value looks like PII (a path, email, or home-prefixed path). Heuristic, used
129+
/// only by the debug-build [`sanitize_props`] net.
130+
#[cfg(debug_assertions)]
131+
fn looks_pii_shaped(s: &str) -> bool {
132+
s.starts_with("~/") || s.contains('/') || s.contains('\\') || s.contains('@')
133+
}
134+
135+
/// Logs the "no PostHog key baked in" notice once per process so a local dev sees why feature events
136+
/// don't ship, without spamming the log on every event.
137+
fn log_missing_key_once() {
138+
use std::sync::Once;
139+
static ONCE: Once = Once::new();
140+
ONCE.call_once(|| {
141+
log::debug!(
142+
target: "analytics",
143+
"No CMDR_POSTHOG_KEY baked in (local dev build): PostHog feature events are a no-op"
144+
);
145+
});
146+
}
147+
148+
/// Spawns the fire-and-forget POST. A failed capture is fine (the next event retries the channel);
149+
/// we never block or surface the error.
150+
fn send_capture(body: Value) {
151+
tauri::async_runtime::spawn(async move {
152+
let client = match reqwest::Client::builder().timeout(CAPTURE_TIMEOUT).build() {
153+
Ok(c) => c,
154+
Err(e) => {
155+
log::warn!(target: "analytics", "Couldn't build PostHog HTTP client: {e}");
156+
return;
157+
}
158+
};
159+
match client.post(CAPTURE_URL).json(&body).send().await {
160+
Ok(response) if response.status().is_success() => {
161+
log::debug!(target: "analytics", "PostHog event sent ({})", response.status());
162+
}
163+
Ok(response) => {
164+
log::warn!(target: "analytics", "PostHog server returned {}", response.status());
165+
}
166+
Err(e) => {
167+
log::debug!(target: "analytics", "PostHog send failed: {e}");
168+
}
169+
}
170+
});
171+
}
172+
173+
#[cfg(test)]
174+
mod tests {
175+
use super::*;
176+
use serde_json::json;
177+
178+
#[test]
179+
fn capture_body_has_expected_shape() {
180+
let config = json!({ "theme.mode": "dark", "fdaGranted": true });
181+
let body = build_capture_body(
182+
"phc_test",
183+
"pane_navigated",
184+
json!({ "volume_kind": "local" }),
185+
"anal_178c8e27-511f-4f0e-a1fc-6a44f2ab7341",
186+
config.clone(),
187+
);
188+
189+
assert_eq!(body["api_key"], json!("phc_test"));
190+
assert_eq!(body["event"], json!("pane_navigated"));
191+
assert_eq!(body["distinct_id"], json!("anal_178c8e27-511f-4f0e-a1fc-6a44f2ab7341"));
192+
// `source: "desktop"` is always injected.
193+
assert_eq!(body["properties"]["source"], json!("desktop"));
194+
// Arbitrary props pass through.
195+
assert_eq!(body["properties"]["volume_kind"], json!("local"));
196+
// `$set` is the config-shape verbatim (one source of truth for person properties).
197+
assert_eq!(body["$set"], config);
198+
}
199+
200+
#[test]
201+
fn distinct_id_is_the_anal_id() {
202+
let body = build_capture_body("phc_test", "app_launched", json!({}), "anal_abc", json!({}));
203+
let distinct = body["distinct_id"].as_str().expect("string");
204+
assert!(
205+
distinct.starts_with("anal_"),
206+
"distinct_id must be the analytics id: {distinct}"
207+
);
208+
}
209+
210+
#[test]
211+
fn injected_source_cannot_be_shadowed_by_props() {
212+
// A caller passing `source: "sneaky"` must not override the injected `desktop` value.
213+
let body = build_capture_body("phc_test", "e", json!({ "source": "sneaky" }), "anal_x", json!({}));
214+
assert_eq!(body["properties"]["source"], json!("desktop"));
215+
}
216+
217+
#[test]
218+
fn arbitrary_props_are_open_ended() {
219+
// The event API is open: any PII-free prop map passes through, no fixed schema.
220+
let body = build_capture_body(
221+
"phc_test",
222+
"file_transfer_completed",
223+
json!({ "op": "copy", "item_count": "11-100", "had_conflicts": false }),
224+
"anal_x",
225+
json!({}),
226+
);
227+
assert_eq!(body["properties"]["op"], json!("copy"));
228+
assert_eq!(body["properties"]["item_count"], json!("11-100"));
229+
assert_eq!(body["properties"]["had_conflicts"], json!(false));
230+
}
231+
232+
// The PII backstop only runs in debug builds (where these tests run under `cargo nextest`).
233+
#[cfg(debug_assertions)]
234+
#[test]
235+
fn pii_guard_trips_on_pii_shaped_strings() {
236+
assert!(looks_pii_shaped("/Users/dave/secret"), "absolute path");
237+
assert!(looks_pii_shaped("~/Documents"), "home prefix");
238+
assert!(looks_pii_shaped("person@example.com"), "email");
239+
assert!(looks_pii_shaped("C:\\Users\\dave"), "windows path");
240+
assert!(looks_pii_shaped("photos/sunset.jpg"), "relative path");
241+
}
242+
243+
#[cfg(debug_assertions)]
244+
#[test]
245+
fn pii_guard_passes_plain_enums_and_values() {
246+
// Categorical enums, buckets, and plain words are not PII-shaped.
247+
assert!(!looks_pii_shaped("local"));
248+
assert!(!looks_pii_shaped("copy"));
249+
assert!(!looks_pii_shaped("11-100"));
250+
assert!(!looks_pii_shaped("disconnected"));
251+
assert!(!looks_pii_shaped("filename"));
252+
}
253+
254+
#[cfg(debug_assertions)]
255+
#[test]
256+
fn sanitize_props_returns_props_unchanged() {
257+
// The guard only warns; it never strips. A PII-shaped value still passes through (so the
258+
// dev sees the warning AND the bug isn't silently masked).
259+
let props = json!({ "volume_kind": "local", "leaked": "/Users/dave" });
260+
let out = sanitize_props("test_event", props.clone());
261+
assert_eq!(out, props);
262+
}
263+
}

0 commit comments

Comments
 (0)