Show & Tell — Phase 29.3 MetaCognitiveMonitor: Bias Detection & Confidence Calibration #636
web3guru888 started this conversation in Show and tell
Overview
The MetaCognitiveMonitor implements the monitoring level of Nelson & Narens' (1990) two-level metacognitive framework. While the IntrospectionEngine (29.2) observes what the system thinks, the MetaCognitiveMonitor evaluates how well it thinks — detecting cognitive biases, calibrating confidence, and generating thinking quality reports.
This component draws on decades of research in judgment and decision-making: Tversky & Kahneman's (1974) heuristics and biases program, Lichtenstein et al.'s (1982) calibration studies, Stanovich & West's (2000) individual differences in reasoning, and Gigerenzer et al.'s (1999) ecological rationality.
Architecture: Nelson & Narens Monitoring-Control Loop
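In Nelson & Narens' model, information flows up from the object level to the meta level (monitoring), and directives flow back down (control). A minimal sketch of that loop, with all class and field names hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """Monitoring: information flowing up from the object level."""
    thought: str
    confidence: float

@dataclass
class ControlSignal:
    """Control: a directive flowing back down to the object level."""
    action: str  # e.g. "continue", "re-evaluate"

class MetaLevel:
    def monitor(self, obs: Observation) -> ControlSignal:
        # Low object-level confidence triggers a control intervention.
        if obs.confidence < 0.4:
            return ControlSignal(action="re-evaluate")
        return ControlSignal(action="continue")

meta = MetaLevel()
print(meta.monitor(Observation("the cache is stale", confidence=0.3)).action)  # re-evaluate
```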
Bias Detection Pipeline
The bias detector operates on thought chains from the IntrospectionEngine (29.2):
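As one illustration, an overconfidence rule might scan a thought chain for near-certain confidence values. This is a hedged sketch, not the project's code: `Thought`, `BiasAlert`, and the threshold are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Thought:
    content: str
    confidence: float

@dataclass
class BiasAlert:
    bias_type: str
    evidence: str

def detect_overconfidence(chain: list[Thought], threshold: float = 0.95) -> list[BiasAlert]:
    """Flag OVERCONFIDENCE when several thoughts carry near-certain confidence."""
    high = [t for t in chain if t.confidence >= threshold]
    if len(high) >= 3:
        return [BiasAlert("OVERCONFIDENCE",
                          f"{len(high)} thoughts at confidence >= {threshold}")]
    return []

chain = [Thought("A", 0.98), Thought("B", 0.97), Thought("C", 0.99)]
print([a.bias_type for a in detect_overconfidence(chain)])  # ['OVERCONFIDENCE']
```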
Eight Bias Types
CONFIRMATION, ANCHORING, AVAILABILITY, SUNK_COST, FRAMING, OVERCONFIDENCE, GROUPTHINK, RECENCY
Brier Score Calibration
Confidence calibration uses the Brier score (Brier, 1950), the gold standard for evaluating probabilistic predictions:

BS = (1/N) * Σᵢ (confidence_i − outcome_i)²

where confidence_i ∈ [0, 1] is the system's stated confidence and outcome_i ∈ {0, 1} is the actual binary outcome. Under Murphy's (1973) decomposition of the Brier score into reliability, resolution, and uncertainty, a perfectly calibrated system has a reliability component ≈ 0.
The calibrator maintains a rolling window of (confidence, outcome) pairs:
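A minimal sketch of such a rolling-window calibrator (the class name and window size are assumptions, not the project's actual code):

```python
from collections import deque

class BrierCalibrator:
    """Rolling-window Brier score over (confidence, outcome) pairs."""
    def __init__(self, window: int = 500):
        self.pairs = deque(maxlen=window)  # oldest pairs fall out automatically

    def record(self, confidence: float, outcome: int) -> None:
        self.pairs.append((confidence, outcome))

    def brier_score(self) -> float:
        if not self.pairs:
            return 0.0
        return sum((c - o) ** 2 for c, o in self.pairs) / len(self.pairs)

cal = BrierCalibrator(window=3)
for c, o in [(0.9, 1), (0.8, 1), (0.1, 0)]:
    cal.record(c, o)
print(round(cal.brier_score(), 4))  # 0.02
```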
Why Brier Score Over Log-Loss?
Both are proper scoring rules, but Brier score has advantages for metacognitive monitoring:
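One property worth noting: each Brier term is bounded in [0, 1], whereas log-loss is unbounded for confidently wrong predictions, so a single outlier can dominate a rolling average. A quick numeric comparison:

```python
import math

def brier(c, o):
    return (c - o) ** 2

def log_loss(c, o, eps=1e-15):
    c = min(max(c, eps), 1 - eps)  # clamp to avoid log(0)
    return -(o * math.log(c) + (1 - o) * math.log(1 - c))

# A confidently wrong prediction: confidence 0.999, outcome 0.
print(round(brier(0.999, 0), 3))     # 0.998 (bounded by 1)
print(round(log_loss(0.999, 0), 3))  # 6.908 (grows without bound)
```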
ThinkingQualityReport
The QualityReporter generates periodic composite reports:
Integration with ExecutiveController (Phase 28.4)
The MetaCognitiveMonitor feeds directly into the ExecutiveController's planning loop:
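Since the current design only recommends interventions rather than executing them, the handoff might look like this sketch (all names, fields, and thresholds are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class ThinkingQualityReport:
    brier_score: float
    bias_alert_count: int
    quality_composite: float  # 0..1, higher is better

@dataclass
class Recommendation:
    kind: str    # e.g. "re-search", "lower-confidence", "none"
    reason: str

def recommend(report: ThinkingQualityReport) -> Recommendation:
    """Map a quality report to a recommendation for the ExecutiveController."""
    if report.bias_alert_count > 0 and report.quality_composite < 0.5:
        return Recommendation("re-search", "active bias alerts with low quality")
    if report.brier_score > 0.25:
        return Recommendation("lower-confidence", "poor recent calibration")
    return Recommendation("none", "within normal bounds")

print(recommend(ThinkingQualityReport(0.3, 0, 0.8)).kind)  # lower-confidence
```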
Prometheus Metrics
asi_metacog_brier_score
asi_metacog_bias_alerts_total{bias_type}
asi_metacog_quality_composite
asi_metacog_intervention_total{type}
asi_metacog_calibration_curve{bin}
Open Questions
Bias detection sensitivity — How do we tune the detection thresholds to minimize false positives without missing real biases? Adaptive thresholds (e.g., based on running baseline) may be better than fixed ones, but add complexity.
Intervention authority — Should the MetaCognitiveMonitor have the authority to directly intervene (e.g., force a re-search), or should it only recommend interventions to the ExecutiveController (28.4)? The current design recommends only, but strong biases might warrant automatic circuit-breaking.
Cross-session calibration — The Brier score window resets on restart. Should calibration history be persisted to the TemporalGraph (Phase 17.1) for cross-session continuity? This would enable long-term calibration trend analysis.
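The adaptive-threshold idea from the first open question could be sketched as a rolling baseline with a k-sigma alert rule. This is one possible shape, not the project's actual design:

```python
import statistics
from collections import deque

class AdaptiveThreshold:
    """Alert when a signal exceeds its rolling baseline by k standard deviations."""
    def __init__(self, k: float = 2.0, window: int = 100):
        self.k = k
        self.history = deque(maxlen=window)

    def update(self, value: float) -> bool:
        fired = False
        if len(self.history) >= 10:  # require a minimal baseline first
            mean = statistics.mean(self.history)
            std = statistics.pstdev(self.history)
            fired = value > mean + self.k * std
        self.history.append(value)
        return fired

thr = AdaptiveThreshold()
for v in [0.1] * 20:
    thr.update(v)       # build a flat baseline
print(thr.update(0.9))  # True: far above the baseline
```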
References: Nelson & Narens (1990), Tversky & Kahneman (1974), Brier (1950), Murphy (1973), Lichtenstein et al. (1982), Stanovich & West (2000), Gigerenzer et al. (1999)