Q&A: Phase 11.3 ValueLearner — learning rate tuning, cold-start, comparative signals, federation feedback, and Grafana monitoring #345
Unanswered · asked by web3guru888 in Q&A · 0 replies
Q&A: Phase 11.3 ValueLearner — configuration, training, federation integration, and monitoring
Reference: issue #343 (spec) | discussion #344 (architecture Show & Tell)
Q1: How do I tune `learning_rate` and `regularisation_lambda`?

Start with the defaults (`lr=0.01`, `lambda=1e-4`) and monitor `value_learner_weight_norm`. If the norm grows beyond ~1.5 during normal operation, increase `lambda` by 10×. If the model fails to differentiate APPROVE from REJECT signals after 100+ feedback entries, increase `lr` by 2×. Quick reference:

| Symptom | Adjustment |
| --- | --- |
| `weight_norm` > 2.0 | increase `lambda` |
| `weight_norm` oscillates ±0.3/epoch | reduce `lr` by 2× |
| model never updates | check that `min_feedback_to_train` has been reached |
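The heuristics above can be sketched as a small helper. This is illustrative only — the function name, the 1.5 norm threshold from the prose, and the oscillation check are not part of the ValueLearner API:

```python
def suggest_adjustment(weight_norm_history, lr, lam):
    """Apply the Q1 tuning heuristics to recent weight-norm samples.

    Hypothetical helper: `weight_norm_history` is a list of recent
    value_learner_weight_norm samples, oldest first. Returns a
    (lr, lambda) pair with at most one heuristic applied.
    """
    latest = weight_norm_history[-1]
    # Norm growing past ~1.5: regularisation too weak -> 10x lambda.
    if latest > 1.5:
        return lr, lam * 10
    # Epoch-to-epoch swings above 0.3 suggest the step size is too big.
    deltas = [abs(b - a) for a, b in
              zip(weight_norm_history, weight_norm_history[1:])]
    if deltas and max(deltas) > 0.3:
        return lr / 2, lam
    return lr, lam
```

Run it against the last few scraped metric samples before touching the config by hand.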
Q2: How does `min_feedback_to_train` interact with `SafetyFilter`?

Before `min_feedback_to_train` samples are collected, `train()` returns the current (default) `RewardModelWeights` without updating them. During this cold-start period, `SafetyFilter` operates on its static `ConstitutionalRuleset` with no learned adjustments. After the threshold is crossed, `SafetyFilter` can optionally query `score()` to bias rule sensitivity per dimension: a low CONSTITUTIONAL weight → stricter effective threshold → more BLOCK verdicts.
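A minimal sketch of the per-dimension biasing, assuming the learned weight simply scales a rule's base threshold (the function name, clamp bounds, and scaling rule here are assumptions, not the actual `SafetyFilter` code):

```python
def effective_threshold(base_threshold: float, learned_weight: float) -> float:
    """Scale a rule's block threshold by the learned per-dimension weight.

    Hypothetical sketch: a low learned CONSTITUTIONAL weight shrinks the
    threshold, so more outputs exceed it and receive BLOCK verdicts.
    """
    # Clamp so a runaway weight can never disable a rule entirely.
    w = min(max(learned_weight, 0.1), 1.0)
    return base_threshold * w
```

The clamp is the important design choice: learned feedback may only tighten a rule, never loosen it past its static baseline.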
Q3: What happens to COMPARATIVE signals — who encodes the pairs?
COMPARATIVE pairs must be submitted as two consecutive `FeedbackEntry` records with `signal=FeedbackSignal.COMPARATIVE`. The first entry encodes the preferred option; the second encodes the rejected option. Both must share the same `dimension`:
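For example (the `FeedbackEntry`/`FeedbackSignal` definitions below are minimal stand-ins for the project's real types, and the `"HELPFULNESS"` dimension and `output_id` field are illustrative):

```python
from dataclasses import dataclass
from enum import Enum, auto

class FeedbackSignal(Enum):
    COMPARATIVE = auto()

@dataclass
class FeedbackEntry:
    signal: FeedbackSignal
    dimension: str
    output_id: str

# Preferred option first, rejected option second, same dimension.
preferred = FeedbackEntry(FeedbackSignal.COMPARATIVE, "HELPFULNESS", "resp-a")
rejected = FeedbackEntry(FeedbackSignal.COMPARATIVE, "HELPFULNESS", "resp-b")
pair = [preferred, rejected]
```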
The `train()` method processes them in order. Mismatched pairs (an odd total) are discarded with a warning.

Q4: How does `AlignmentMonitor` feed `ValueLearner` automatically?

When `AlignmentMonitor` raises an alert (score drops below threshold), `CognitiveCycle._on_alignment_alert()` converts it to an `IMPLICIT_NEGATIVE` entry. This creates an automatic feedback loop: drift → implicit negative → lower weight → stricter gating → corrective pressure.
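A hedged sketch of that conversion — `AlignmentAlert`, the dict payload shape, and the `strength` field are assumptions for illustration; the real logic lives in `CognitiveCycle._on_alignment_alert()`:

```python
from dataclasses import dataclass

@dataclass
class AlignmentAlert:
    """Stand-in for the monitor's alert: score fell below threshold."""
    dimension: str
    score: float
    threshold: float

def alert_to_feedback(alert: AlignmentAlert) -> dict:
    # An alert means the monitored score dropped below threshold; record
    # it as implicit negative feedback on the affected dimension.
    return {
        "signal": "IMPLICIT_NEGATIVE",
        "dimension": alert.dimension,
        # The deficit gives the learner a magnitude, not just a direction.
        "strength": alert.threshold - alert.score,
        "annotator_id": "alignment-monitor",
    }
```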
Q5: What does `max_history=10_000` mean in practice?

`_history` is a `collections.deque(maxlen=10_000)`. When the 10,001st entry arrives, the oldest is silently evicted. At ~200 bytes per `FeedbackEntry`, 10,000 entries ≈ 2 MB of in-memory feedback. For production, consider tuning `max_history` against your memory budget and retention needs.
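The eviction behaviour is standard `collections.deque` semantics, shown here with a small `maxlen` for readability:

```python
from collections import deque

# maxlen eviction is silent: appending to a full deque drops the oldest.
history = deque(maxlen=3)          # stand-in for max_history=10_000
for i in range(5):
    history.append(f"entry-{i}")

# Only the 3 most recent entries remain; entry-0 and entry-1 are gone.
print(list(history))
```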
Note that `train()` only uses `_pending` (entries received since the last train), not the full history; `_history` is kept for snapshot reporting and future replay capability.

Q6: Can federation peers submit feedback via `FederationGateway`?

Yes — the intended pattern is:
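A sketch of the pattern. The message schema, handler name, and `add_feedback()` call are assumptions — the thread does not show `FederationGateway`'s actual API:

```python
def handle_peer_feedback(message: dict, value_learner) -> None:
    """Hypothetical gateway handler: translate a peer's federation
    message into a local feedback entry."""
    entry = {
        "signal": message["signal"],
        "dimension": message["dimension"],
        # Use the peer's DID so per-annotator trust can be tracked later.
        "annotator_id": message["peer_did"],
    }
    value_learner.add_feedback(entry)
```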
`annotator_id` should be the peer's DID, enabling future per-annotator trust tracking.

Q7: What Grafana panels and alerts should I configure?
Recommended dashboard panels:

- Feedback ingest rate: `rate(value_learner_feedback_total[5m])`
- Model version: `value_learner_model_version`
- Training latency (p95): `histogram_quantile(0.95, value_learner_train_duration_seconds)`
- Weight norm: `value_learner_weight_norm`
- Pending feedback backlog: `value_learner_pending_feedback`

Recommended alert:
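The original alert definition did not survive extraction; a plausible example, assuming Prometheus-style alerting rules, the metric names listed above, and the weight-norm threshold from Q1 (thresholds and labels are illustrative):

```yaml
groups:
  - name: value_learner
    rules:
      - alert: ValueLearnerWeightNormHigh
        # Mirrors the Q1 symptom table: weight_norm > 2.0 needs attention.
        expr: value_learner_weight_norm > 2.0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "ValueLearner weight norm above 2.0 - review regularisation_lambda"
```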