# test_corpus_origin_integration.py
"""Integration tests proving corpus_origin actually improves classification.
These are the tests that justify the PR. Without them, the PR ships
infrastructure that nobody can prove improves v3.3.3.
The fixture: a small AI-dialogue corpus with three agent persona names
(Echo, Sparrow, Cipher) that the user (Jordan) has assigned to their AI
agents. On plain v3.3.3, entity_detector misclassifies these as PEOPLE.
With corpus_origin context wired through, they classify as
AGENT_PERSONA instead.
Two tests sit side by side:
test_baseline_v333_misclassifies_persona_names_as_people
Pins v3.3.3's behavior. If this starts failing, the PR's motivation
has shifted and the corpus_origin docs need revisiting.
test_corpus_origin_reclassifies_personas
The fix. Asserts that when corpus_origin context is passed,
persona names land in agent_personas instead of people.
Together: documented before/after of v3.3.3 → corpus-origin feature.
"""
from __future__ import annotations
import argparse
import json
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
# A synthetic but realistic Claude Code transcript fixture. Three persona
# names appear repeatedly in dialogue patterns that the v3.3.3
# entity_detector treats as person-evidence (dialogue markers, action verbs,
# pronoun proximity). User name "Jordan" also appears in dialogue.
#
# The point is: every name here CAN be a real human name. v3.3.3 has no
# way to know that in this corpus they're agent personas, not people. The
# corpus_origin gives it that context.
AI_DIALOGUE_FIXTURE = """\
# Session log — 2026-04-20
Jordan: Echo, can you summarize what we worked on yesterday?
Echo (assistant): Yesterday we refactored the embedding pipeline. I noticed
the chunking strategy was producing overlapping windows, and I suggested
moving to a sliding window with explicit stride. You agreed and we shipped
the change.
Jordan: Good. Sparrow, what about the migration script — did you finish?
Sparrow (assistant): Yes, I finished the migration. I tested it locally
against the staging snapshot and it ran clean. I also added a rollback
path because you asked me to be cautious about the indexes.
Jordan: Perfect. Cipher, run the verification suite please.
Cipher (assistant): Running now. I'll report back when the full suite
completes. I expect it to take about four minutes.
Echo: Jordan, while Cipher runs the verification, do you want me to draft
the changelog entry for today's work?
Jordan: Yes please. Echo, keep it short. Sparrow, please review Echo's
draft when she sends it.
Sparrow: Will do. I'll look for clarity issues and check the migration
phrasing matches what we actually shipped.
Cipher: Verification complete. All 1247 tests pass. I'm filing the run log
to the palace under wing/today.
Jordan: Thanks Cipher. Echo, send the changelog draft.
Echo: Done. Sent to the channel. Sparrow, ready for review when you are.
Sparrow: Reviewed. Two small wording changes — sent back. Otherwise clean.
Jordan: Echo, apply Sparrow's edits and ship it.
Echo: Shipped. Tag pushed.
"""
@pytest.fixture
def ai_dialogue_corpus(tmp_path: Path) -> Path:
"""Create a one-file project directory containing the AI-dialogue fixture."""
project_dir = tmp_path / "ai_dialogue_project"
project_dir.mkdir()
(project_dir / "session_log.md").write_text(AI_DIALOGUE_FIXTURE)
return project_dir
@pytest.fixture
def corpus_origin_for_fixture() -> dict:
"""The corpus_origin result a context-aware init would produce for the fixture."""
return {
"schema_version": 1,
"detected_at": "2026-04-26T00:00:00Z",
"result": {
"likely_ai_dialogue": True,
"confidence": 0.95,
"primary_platform": "Claude (Anthropic)",
"user_name": "Jordan",
"agent_persona_names": ["Echo", "Sparrow", "Cipher"],
"evidence": ["Synthetic fixture for the integration test"],
},
}
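# For orientation: how a consumer would read the persisted result back (path
# and wrapper per the Pass 0 tests below; a sketch, not the shipped loader):
#
#     origin = json.loads((palace / ".mempalace" / "origin.json").read_text())
#     assert origin["schema_version"] == 1
#     personas = origin["result"]["agent_persona_names"]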
# ── Baseline test: pin v3.3.3 behavior ────────────────────────────────────
def test_baseline_v333_misclassifies_persona_names_as_people(ai_dialogue_corpus: Path):
"""Without corpus_origin context, v3.3.3 entity_detector cannot
distinguish agent persona names from real people, and classifies them
into the 'people' bucket.
    This test pins that behavior; its purpose is documentation. The
    corpus-origin feature's job is to fix this, and the post-fix test below
    asserts the fix.
"""
from mempalace.entity_detector import detect_entities, scan_for_detection
files = scan_for_detection(str(ai_dialogue_corpus))
detected = detect_entities(files)
people_names = {e["name"] for e in detected.get("people", [])}
uncertain_names = {e["name"] for e in detected.get("uncertain", [])}
all_classified = people_names | uncertain_names
# Persona names appear somewhere in the detection output (people or uncertain).
# If none of them surface at all, the fixture is no longer triggering
# the misclassification path and the test is no longer meaningful.
persona_names = {"Echo", "Sparrow", "Cipher"}
persona_hits = persona_names & all_classified
assert persona_hits, (
"Fixture no longer surfaces persona names as detected entities. "
"Update the fixture to keep this test meaningful."
)
# No agent_personas bucket exists on v3.3.3.
assert "agent_personas" not in detected, (
"v3.3.3 has no concept of agent_personas — if this key exists, "
"corpus-origin wiring has already shipped and this baseline test is stale."
)
# ── corpus-origin test: with corpus_origin, personas reclassify ───────────
def test_corpus_origin_reclassifies_personas(
ai_dialogue_corpus: Path, corpus_origin_for_fixture: dict
):
"""When corpus_origin context is passed to detect_entities, names
matching agent_persona_names land in an 'agent_personas' bucket
instead of being misclassified as people.
This is the fix. RED until the consumer wiring lands.
"""
from mempalace.entity_detector import detect_entities, scan_for_detection
files = scan_for_detection(str(ai_dialogue_corpus))
detected = detect_entities(files, corpus_origin=corpus_origin_for_fixture)
# New bucket exists.
assert "agent_personas" in detected, (
"The corpus-origin wiring must add an 'agent_personas' bucket to the detect_entities "
"return shape when corpus_origin is provided."
)
persona_names_in_bucket = {e["name"] for e in detected["agent_personas"]}
persona_names_in_people = {e["name"] for e in detected.get("people", [])}
# All three personas land in the new bucket.
expected_personas = {"Echo", "Sparrow", "Cipher"}
assert expected_personas <= persona_names_in_bucket, (
f"Expected all three personas in agent_personas, got: " f"{persona_names_in_bucket}"
)
# And NONE of them remain in the people bucket.
leaked = expected_personas & persona_names_in_people
assert not leaked, (
f"Persona names {leaked} leaked into 'people' bucket — the corpus-origin "
f"consumer wiring is supposed to filter them out."
)
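# A minimal sketch of the consumer-side reclassification the two tests above
# pin. This is NOT the shipped wiring (the real logic lives behind
# detect_entities and _apply_corpus_origin); field names follow the fixture
# schema, everything else is an assumption for illustration.
def _sketch_apply_corpus_origin(detected: dict, corpus_origin: dict) -> dict:
    result = corpus_origin["result"]
    # Personas by name, with the human user protected from reclassification.
    personas = set(result.get("agent_persona_names", [])) - {result.get("user_name")}
    people = detected.get("people", [])
    detected["agent_personas"] = [e for e in people if e["name"] in personas]
    detected["people"] = [e for e in people if e["name"] not in personas]
    return detected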
# ── discover_entities (project_scanner) threads corpus_origin ─────────────
def test_discover_entities_threads_corpus_origin_through(
ai_dialogue_corpus: Path, corpus_origin_for_fixture: dict
):
"""discover_entities is the higher-level entry point cmd_init uses.
It must accept corpus_origin and produce the same persona reclassification
that detect_entities does, regardless of whether candidates entered via
prose, manifests, or git authors.
"""
from mempalace.project_scanner import discover_entities
detected = discover_entities(
str(ai_dialogue_corpus),
corpus_origin=corpus_origin_for_fixture,
)
persona_names_in_bucket = {e["name"] for e in detected.get("agent_personas", [])}
persona_names_in_people = {e["name"] for e in detected.get("people", [])}
expected_personas = {"Echo", "Sparrow", "Cipher"}
# All personas surface in the agent_personas bucket via discover_entities too.
assert expected_personas <= persona_names_in_bucket, (
f"discover_entities did not thread corpus_origin to detect_entities. "
f"Expected {expected_personas} in agent_personas, got: "
f"{persona_names_in_bucket}"
)
leaked = expected_personas & persona_names_in_people
assert not leaked, f"discover_entities leaked persona names into 'people': {leaked}"
def test_discover_entities_no_origin_unchanged_shape(ai_dialogue_corpus: Path):
"""Backwards compatibility: when corpus_origin is omitted, the return
shape stays exactly what it was on v3.3.3 (no agent_personas key).
Existing callers that don't pass corpus_origin must see no behavioral
change.
"""
from mempalace.project_scanner import discover_entities
detected = discover_entities(str(ai_dialogue_corpus))
# No new bucket appears unsolicited.
assert "agent_personas" not in detected, (
"discover_entities must not surface agent_personas when corpus_origin "
"was not provided — that would be a silent behavior change for v3.3.3 "
"callers who don't know about the corpus-origin feature."
)
# ── Pass 0 — cmd_init runs corpus_origin and writes origin.json ──────────
def _stub_cfg(palace_dir: Path):
"""Build a MempalaceConfig stub whose palace_path points at tmp space.
Used by Pass 0 tests so the origin.json write is captured in tmp_path
instead of hitting the real ~/.mempalace location.
"""
cfg = MagicMock()
cfg.palace_path = str(palace_dir)
cfg.entity_languages = ["en"]
return cfg
def test_init_pass_zero_writes_origin_json_to_palace(ai_dialogue_corpus: Path, tmp_path: Path):
"""cmd_init must run corpus_origin detection BEFORE entity detection
and persist the result to ``<palace>/.mempalace/origin.json`` in the
documented schema_version=1 wrapper.
"""
from mempalace.cli import cmd_init
palace = tmp_path / "palace"
# no_llm=True isolates the test from any local LLM provider. With Ollama
# running locally and a small default model, Tier 2 can return a wrong
# classification that overrides the correct heuristic answer (Igor's PR
# #1211 review). The test asserts on heuristic behavior, so Tier 2 must
# not fire.
args = argparse.Namespace(dir=str(ai_dialogue_corpus), yes=True, no_llm=True)
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.cli._maybe_run_mine_after_init"),
patch("mempalace.room_detector_local.detect_rooms_local"),
):
cmd_init(args)
origin_path = palace / ".mempalace" / "origin.json"
assert origin_path.exists(), (
f"Pass 0 did not write {origin_path}. cmd_init is supposed to call "
f"corpus_origin detection and persist the result before entity detection."
)
data = json.loads(origin_path.read_text())
assert data.get("schema_version") == 1, (
"origin.json must declare schema_version=1 so future format changes "
"are detectable. Got: " + repr(data.get("schema_version"))
)
assert "detected_at" in data, "origin.json must include a detected_at timestamp"
assert "result" in data, "origin.json must wrap the CorpusOriginResult under 'result'"
assert isinstance(data["result"].get("likely_ai_dialogue"), bool)
# Fixture is heavy AI-dialogue — heuristic should classify as such.
assert data["result"]["likely_ai_dialogue"] is True, (
"Heuristic should classify the AI-dialogue fixture as AI-dialogue. "
f"Got: {data['result']}"
)
def test_init_pass_zero_passes_corpus_origin_to_discover_entities(
ai_dialogue_corpus: Path, tmp_path: Path
):
"""The Pass 0 result must reach discover_entities via the corpus_origin
kwarg — that's what enables persona reclassification end-to-end.
"""
from mempalace.cli import cmd_init
palace = tmp_path / "palace"
# no_llm=True isolates the test from any local LLM provider — see note
# on test_init_pass_zero_writes_origin_json_to_palace.
args = argparse.Namespace(dir=str(ai_dialogue_corpus), yes=True, no_llm=True)
captured = {}
def fake_discover(project_dir, **kwargs):
captured["kwargs"] = kwargs
return {"people": [], "projects": [], "uncertain": []}
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.project_scanner.discover_entities", side_effect=fake_discover),
patch("mempalace.cli._maybe_run_mine_after_init"),
patch("mempalace.room_detector_local.detect_rooms_local"),
):
cmd_init(args)
assert "corpus_origin" in captured.get("kwargs", {}), (
"cmd_init did not pass corpus_origin to discover_entities. The Pass 0 "
"detection result must be threaded into entity detection so persona "
"reclassification happens end-to-end."
)
origin = captured["kwargs"]["corpus_origin"]
assert origin is not None, (
"corpus_origin kwarg was passed but value was None — Pass 0 should "
"supply the actual detection result for AI-dialogue corpora."
)
assert origin.get("schema_version") == 1
assert "result" in origin
def test_init_pass_zero_skipped_when_no_readable_files(tmp_path: Path):
"""Empty project directory → no origin.json written, init still completes
without crashing. Aya's earlier finding: don't fail init on missing samples.
"""
from mempalace.cli import cmd_init
project = tmp_path / "empty"
project.mkdir()
palace = tmp_path / "palace"
# no_llm=True so this test never tries to acquire an LLM provider for
# an empty corpus — the heuristic-skip behavior is what's being tested.
args = argparse.Namespace(dir=str(project), yes=True, no_llm=True)
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.cli._maybe_run_mine_after_init"),
patch("mempalace.room_detector_local.detect_rooms_local"),
):
cmd_init(args) # must not raise
origin_path = palace / ".mempalace" / "origin.json"
assert not origin_path.exists(), (
"Pass 0 must skip (no write) when there are no readable samples — "
"writing a 'cannot decide' result to disk would be misleading."
)
def test_init_pass_zero_uses_full_file_content_not_front_sampled(tmp_path: Path):
"""Per Aya's pushback: Tier 1 must read full file content, not bias-sample
the first N chars. AI signal that lives past the first 2000 chars must
still trip detection.
"""
from mempalace.cli import cmd_init
project = tmp_path / "deep_signal"
project.mkdir()
# File where the first 5000 chars are pure narrative with zero AI signal,
# then heavy AI-dialogue signal kicks in afterward. A first-N-chars sampler
# would miss it; a full-content reader will not.
front_pad = "The quiet morning settled over the orchard. " * 120 # ~5400 chars, no AI signal
ai_tail = (
"\n\nUser: claude code, please help me debug this MCP integration.\n"
"Assistant: Sure. I'll look at the LLM context window and the "
"embedding pipeline. Claude Code can run the analysis now.\n"
"User: also check ChatGPT compatibility.\n"
"Assistant: GPT-4 should handle that. The MCP protocol abstracts it.\n"
) * 10
(project / "log.md").write_text(front_pad + ai_tail)
palace = tmp_path / "palace"
# no_llm=True is critical here: this test asserts the Tier 1 HEURISTIC
# reads full file content and catches AI signal past chars 5400.
# Without no_llm, a local Ollama with a small default model can return
# a wrong classification ("not AI-dialogue") that overrides the correct
# heuristic answer. See PR #1211 review by @igorls for the full failure
# mode and its fix.
args = argparse.Namespace(dir=str(project), yes=True, no_llm=True)
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.cli._maybe_run_mine_after_init"),
patch("mempalace.room_detector_local.detect_rooms_local"),
):
cmd_init(args)
origin_path = palace / ".mempalace" / "origin.json"
assert origin_path.exists()
data = json.loads(origin_path.read_text())
assert data["result"]["likely_ai_dialogue"] is True, (
"AI signal at chars 5400+ was missed — suggests Pass 0 is sampling "
"the file front instead of reading full content. Fix Tier 1 to use "
"full content per Aya's design pushback."
)
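# The reader contract this test pins, as a sketch (helper name hypothetical):
#
#     def _read_sample(path: Path) -> str:
#         return path.read_text(errors="replace")  # full content, no prefix cap
#
# and explicitly NOT `path.open().read(2000)`: a front-sampler is exactly what
# misses the AI signal this fixture plants past chars 5400.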
# ── llm_refine consumer wiring ────────────────────────────────────────────
def test_llm_refine_includes_corpus_origin_context_in_prompt(
corpus_origin_for_fixture: dict,
):
"""When corpus_origin is passed to refine_entities, the LLM call must
receive the corpus-origin context (platform, user_name, agent personas)
so it can disambiguate ambiguous candidates with knowledge that this
is AI-dialogue.
    Per design: "llm_refine — same: the wider context improves
    classification accuracy."
"""
from types import SimpleNamespace
from mempalace.llm_refine import refine_entities
captured: dict = {}
class FakeProvider:
def classify(self, system, user, json_mode=False):
captured.setdefault("calls", []).append({"system": system, "user": user})
return SimpleNamespace(text='{"classifications": []}')
# A regex-derived candidate (no manifest/git signals) so it isn't
# skipped by _is_authoritative_*.
detected = {
"people": [],
"projects": [],
"uncertain": [
{"name": "Acme", "frequency": 3, "signals": ["appears 3x"], "type": "uncertain"}
],
}
refine_entities(
detected,
corpus_text="Acme appears in some prose context here.",
provider=FakeProvider(),
show_progress=False,
corpus_origin=corpus_origin_for_fixture,
)
assert captured.get("calls"), "refine_entities did not call the provider"
full_prompt = captured["calls"][0]["system"] + "\n" + captured["calls"][0]["user"]
# The corpus-origin preamble must surface the user, agent personas,
# and platform so the LLM has corpus-level context.
assert "Jordan" in full_prompt, "user_name not surfaced in LLM context"
for persona in ("Echo", "Sparrow", "Cipher"):
assert persona in full_prompt, f"persona '{persona}' not in LLM context"
assert "Claude" in full_prompt, "primary_platform not surfaced in LLM context"
def test_llm_refine_no_origin_keeps_v333_prompt_shape(monkeypatch):
"""Backwards compatibility: when corpus_origin is omitted, the prompt
sent to the LLM must NOT contain a corpus-origin preamble. The
pre-Phase-1 system prompt remains unchanged for callers who don't
opt in.
"""
from types import SimpleNamespace
from mempalace.llm_refine import SYSTEM_PROMPT, refine_entities
captured: dict = {}
class FakeProvider:
def classify(self, system, user, json_mode=False):
captured["system"] = system
return SimpleNamespace(text='{"classifications": []}')
detected = {
"people": [],
"projects": [],
"uncertain": [
{"name": "Acme", "frequency": 3, "signals": ["appears 3x"], "type": "uncertain"}
],
}
refine_entities(
detected,
corpus_text="Acme appears in some prose.",
provider=FakeProvider(),
show_progress=False,
)
assert captured["system"] == SYSTEM_PROMPT, (
"Without corpus_origin, refine_entities must use the unmodified "
"SYSTEM_PROMPT — no silent prompt drift for v3.3.3 callers."
)
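# A minimal sketch of the prompt assembly the two llm_refine tests above pin:
# with corpus_origin, a preamble surfaces user/personas/platform; without it,
# SYSTEM_PROMPT stays byte-identical. The assembly below is an assumption for
# illustration, not llm_refine's actual code.
def _sketch_build_system_prompt(corpus_origin: dict | None) -> str:
    from mempalace.llm_refine import SYSTEM_PROMPT

    if corpus_origin is None:
        return SYSTEM_PROMPT  # v3.3.3 path: no prompt drift
    r = corpus_origin["result"]
    preamble = (
        f"Corpus context: AI dialogue on {r['primary_platform']}. "
        f"User: {r['user_name']}. "
        f"Agent personas: {', '.join(r['agent_persona_names'])}.\n\n"
    )
    return preamble + SYSTEM_PROMPT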
# ── mempalace mine --redetect-origin flag ───────────────────────────────
def _mine_args(project_dir: Path, *, redetect: bool):
"""Build a Namespace with all fields cmd_mine reads, scoped to the
minimal set our tests exercise. Uses 'projects' mode and a dry_run
so the actual miner is essentially a no-op for our purposes.
"""
return argparse.Namespace(
dir=str(project_dir),
palace=None,
mode="projects",
wing=None,
no_gitignore=False,
include_ignored=[],
agent="mempalace",
limit=0,
dry_run=True,
extract="auto",
redetect_origin=redetect,
)
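# Invocations these tests exercise (illustrative):
#
#     mempalace mine                      # default: Pass 0 is NOT re-run
#     mempalace mine --redetect-origin    # opt-in: re-detect, overwrite origin.json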
def test_mine_default_does_not_redetect_origin(ai_dialogue_corpus: Path, tmp_path: Path):
"""Default `mempalace mine` (no --redetect-origin flag) must NOT run
corpus_origin detection — the flag is opt-in.
"""
from mempalace.cli import cmd_mine
palace = tmp_path / "palace"
args = _mine_args(ai_dialogue_corpus, redetect=False)
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.cli._run_pass_zero") as mock_pass_zero,
patch("mempalace.miner.mine"),
):
cmd_mine(args)
mock_pass_zero.assert_not_called()
assert not (palace / ".mempalace" / "origin.json").exists()
def test_mine_with_redetect_origin_flag_writes_origin_json(
ai_dialogue_corpus: Path, tmp_path: Path
):
"""`mempalace mine --redetect-origin` re-runs corpus_origin detection
on the project and persists the result to <palace>/.mempalace/origin.json.
"""
from mempalace.cli import cmd_mine
palace = tmp_path / "palace"
args = _mine_args(ai_dialogue_corpus, redetect=True)
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.miner.mine"),
):
cmd_mine(args)
origin_path = palace / ".mempalace" / "origin.json"
assert origin_path.exists(), "--redetect-origin must write <palace>/.mempalace/origin.json"
data = json.loads(origin_path.read_text())
assert data["schema_version"] == 1
assert data["result"]["likely_ai_dialogue"] is True
def test_mine_redetect_overwrites_existing_origin_json(ai_dialogue_corpus: Path, tmp_path: Path):
"""When origin.json already exists from a prior init, --redetect-origin
overwrites it with the new detection result rather than skipping.
Resolved as option (c): explicit user re-runs via flag.
"""
from mempalace.cli import cmd_mine
palace = tmp_path / "palace"
origin_dir = palace / ".mempalace"
origin_dir.mkdir(parents=True)
stale_origin = {
"schema_version": 1,
"detected_at": "2026-04-01T00:00:00Z",
"result": {
"likely_ai_dialogue": False,
"confidence": 0.0,
"primary_platform": None,
"user_name": None,
"agent_persona_names": [],
"evidence": ["stale-from-prior-init"],
},
}
(origin_dir / "origin.json").write_text(json.dumps(stale_origin))
args = _mine_args(ai_dialogue_corpus, redetect=True)
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.miner.mine"),
):
cmd_mine(args)
fresh = json.loads((origin_dir / "origin.json").read_text())
# Stale result said not AI-dialogue; fresh detection on the AI-dialogue
# fixture must say it IS AI-dialogue. Confirms overwrite, not append/skip.
assert fresh["result"]["likely_ai_dialogue"] is True
assert fresh["detected_at"] != "2026-04-01T00:00:00Z"
def test_mine_redetect_uses_full_content_not_sampled(tmp_path: Path):
"""Regression for Aya's pushback: --redetect-origin must use the same
full-content reader as Pass 0 (not first-N-chars sampling).
"""
from mempalace.cli import cmd_mine
project = tmp_path / "deep_signal"
project.mkdir()
front_pad = "The quiet morning settled over the orchard. " * 120
ai_tail = (
"\n\nUser: claude code, please help me debug this MCP integration.\n"
"Assistant: ChatGPT compatibility too. Claude Code can run analysis.\n"
) * 10
(project / "log.md").write_text(front_pad + ai_tail)
palace = tmp_path / "palace"
args = _mine_args(project, redetect=True)
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.miner.mine"),
):
cmd_mine(args)
data = json.loads((palace / ".mempalace" / "origin.json").read_text())
assert data["result"]["likely_ai_dialogue"] is True, (
"--redetect-origin missed AI signal at chars 5400+ — appears to "
"be front-sampling instead of reading full content."
)
# ── --llm default flip + graceful fallback ───────────────────────────────
def _init_args(project_dir: Path, *, no_llm: bool = False, **overrides):
"""Build an init Namespace with all fields the parser supplies."""
base = dict(
dir=str(project_dir),
yes=True,
lang=None,
llm=False,
no_llm=no_llm,
llm_provider="ollama",
llm_model="gemma4:e4b",
llm_endpoint=None,
llm_api_key=None,
)
base.update(overrides)
return argparse.Namespace(**base)
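# Flag matrix the tests below pin (invocations illustrative):
#
#     mempalace init              # default: LLM ON, provider acquisition attempted
#     mempalace init --llm        # legacy opt-in, now redundant but still accepted
#     mempalace init --no-llm     # explicit opt-out: heuristics only, no provider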
def test_init_default_attempts_llm_provider(ai_dialogue_corpus: Path, tmp_path: Path):
"""``mempalace init`` (no flags) MUST try to acquire an LLM
provider. This is the default-flip — opt-in becomes opt-out.
"""
from mempalace.cli import cmd_init
palace = tmp_path / "palace"
args = _init_args(ai_dialogue_corpus)
fake_provider = MagicMock()
fake_provider.check_available.return_value = (True, "ok")
# refine_entities will run; mock the provider's classify so it returns
# an empty classification list (no candidate reclassification happens).
fake_provider.classify.return_value = MagicMock(text='{"classifications": []}')
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.cli.get_provider", return_value=fake_provider) as mock_get,
patch("mempalace.cli._maybe_run_mine_after_init"),
patch("mempalace.room_detector_local.detect_rooms_local"),
):
cmd_init(args)
    assert mock_get.call_count == 1, (
        "Default `mempalace init` did not attempt LLM provider acquisition. "
        "--llm is now ON by default."
    )
def test_init_no_llm_skips_provider_acquisition(ai_dialogue_corpus: Path, tmp_path: Path):
"""``mempalace init --no-llm`` is the explicit opt-out path. No
provider acquisition attempt; init runs in heuristics-only mode.
"""
from mempalace.cli import cmd_init
palace = tmp_path / "palace"
args = _init_args(ai_dialogue_corpus, no_llm=True)
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.cli.get_provider") as mock_get,
patch("mempalace.cli._maybe_run_mine_after_init"),
patch("mempalace.room_detector_local.detect_rooms_local"),
):
cmd_init(args)
    assert mock_get.call_count == 0, (
        "--no-llm must NOT call get_provider — it's the heuristics-only opt-out."
    )
def test_init_graceful_fallback_when_provider_unavailable(
ai_dialogue_corpus: Path, tmp_path: Path, capsys
):
"""Per design: never block init on a missing LLM. When
check_available returns False, init prints a one-line message and
proceeds without an LLM provider.
"""
from mempalace.cli import cmd_init
palace = tmp_path / "palace"
args = _init_args(ai_dialogue_corpus)
fake_provider = MagicMock()
fake_provider.check_available.return_value = (False, "Ollama not reachable at localhost:11434")
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.cli.get_provider", return_value=fake_provider),
patch("mempalace.cli._maybe_run_mine_after_init"),
patch("mempalace.room_detector_local.detect_rooms_local"),
):
cmd_init(args) # MUST NOT raise SystemExit
out = capsys.readouterr().out
# The fallback message should mention how to silence (--no-llm) so the
# user knows what flipped.
assert (
"no-llm" in out.lower() or "--no-llm" in out
), f"Graceful fallback message must point at --no-llm. Got: {out!r}"
def test_init_graceful_fallback_on_provider_construction_error(
ai_dialogue_corpus: Path, tmp_path: Path, capsys
):
"""When get_provider raises (e.g. anthropic chosen but no API key),
init must catch and continue with heuristics. Not crash.
"""
from mempalace.cli import cmd_init
from mempalace.llm_client import LLMError
palace = tmp_path / "palace"
args = _init_args(ai_dialogue_corpus)
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.cli.get_provider", side_effect=LLMError("no api key")),
patch("mempalace.cli._maybe_run_mine_after_init"),
patch("mempalace.room_detector_local.detect_rooms_local"),
):
cmd_init(args) # MUST NOT raise
out = capsys.readouterr().out
assert "no-llm" in out.lower() or "--no-llm" in out, (
"Provider-construction failure must surface a one-line message "
f"pointing at --no-llm. Got: {out!r}"
)
def test_init_legacy_llm_flag_compatible(ai_dialogue_corpus: Path, tmp_path: Path):
"""Backwards compatibility: `mempalace init --llm` still works as
before (LLM enabled). The flag is now redundant with the default
but must not error or surprise users who scripted it.
"""
from mempalace.cli import cmd_init
palace = tmp_path / "palace"
args = _init_args(ai_dialogue_corpus, llm=True)
fake_provider = MagicMock()
fake_provider.check_available.return_value = (True, "ok")
fake_provider.classify.return_value = MagicMock(text='{"classifications": []}')
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.cli.get_provider", return_value=fake_provider) as mock_get,
patch("mempalace.cli._maybe_run_mine_after_init"),
patch("mempalace.room_detector_local.detect_rooms_local"),
):
cmd_init(args)
mock_get.assert_called_once()
# ── End-to-end pipeline + edge cases ──────────────────────────────────────
def test_end_to_end_init_with_llm_separates_personas(ai_dialogue_corpus: Path, tmp_path: Path):
"""End-to-end through `mempalace init` on the DEFAULT path (LLM enabled).
Confirms the whole chain works without trusting per-stage mocks:
cmd_init -> _run_pass_zero -> Tier 1 + Tier 2 -> origin.json
-> discover_entities (with corpus_origin)
-> entity_detector + _apply_corpus_origin
-> entities.json saved
The misclassification this PR fixes (persona names ending up as people)
must NOT appear in the saved entities.json on the default path. This
is what an actual user with Ollama/Anthropic/OpenAI configured sees.
Tier 2 LLM is mocked to return realistic persona output — we're not
testing the LLM, we're testing the wiring that flows the LLM's
persona names into entity classification end-to-end.
"""
from mempalace.cli import cmd_init
from mempalace.corpus_origin import CorpusOriginResult
palace = tmp_path / "palace"
args = _init_args(ai_dialogue_corpus) # default = LLM ON
fake_provider = MagicMock()
fake_provider.check_available.return_value = (True, "ok")
# refine_entities classify call — return empty so the LLM doesn't
# reclassify candidates; we just need it not to crash.
fake_provider.classify.return_value = MagicMock(text='{"classifications": []}')
# Tier 2 corpus-origin LLM call — return the persona/user info that a
# real Haiku call would extract from the AI-dialogue fixture.
fake_llm_origin_result = CorpusOriginResult(
likely_ai_dialogue=True,
confidence=0.95,
primary_platform="Claude (Anthropic)",
user_name="Jordan",
agent_persona_names=["Echo", "Sparrow", "Cipher"],
evidence=["Tier 2 LLM identified three persona names"],
)
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.cli.get_provider", return_value=fake_provider),
patch(
"mempalace.cli.detect_origin_llm",
return_value=fake_llm_origin_result,
),
patch("mempalace.cli._maybe_run_mine_after_init"),
patch("mempalace.room_detector_local.detect_rooms_local"),
):
cmd_init(args)
# 1. origin.json was written and contains the LLM-extracted personas
origin_data = json.loads((palace / ".mempalace" / "origin.json").read_text())
assert origin_data["result"]["likely_ai_dialogue"] is True
assert origin_data["result"]["agent_persona_names"] == ["Echo", "Sparrow", "Cipher"]
assert origin_data["result"]["user_name"] == "Jordan"
# 2. entities.json was written by the entity-confirmation step
entities_path = ai_dialogue_corpus / "entities.json"
assert entities_path.exists()
entities = json.loads(entities_path.read_text())
# 3. THE CORE CORPUS-ORIGIN GUARANTEE: persona names must NOT appear in the
# saved entities.json people list. This is what downstream tools
# (miner, searcher, MCP) will read.
saved_people = set(entities.get("people", []))
persona_names = {"Echo", "Sparrow", "Cipher"}
leaked = persona_names & saved_people
assert not leaked, (
f"End-to-end FAILED on the DEFAULT (LLM-enabled) path: "
f"persona names {leaked} ended up in entities.json's people list. "
f"Saved people: {saved_people}"
)
def test_no_llm_path_matches_v333_classification(ai_dialogue_corpus: Path, tmp_path: Path):
"""Documents the --no-llm degradation honestly: persona reclassification
requires Tier 2 (LLM) to extract persona names. With --no-llm, the
Tier 1 heuristic only answers 'is this AI-dialogue?' (yes/no gate).
Persona names are NOT extracted and thus NOT reclassified.
This is BY DESIGN — Tier 2 is where persona extraction lives. The
no-LLM path is a graceful degradation, not a corpus-origin promise.
The test PINS that v3.3.3-equivalent behavior on this path:
persona names appear in entities.json's people list, exactly as they
would on plain v3.3.3. Users who want persona reclassification must
have an LLM provider configured (default behavior).
"""
from mempalace.cli import cmd_init
palace = tmp_path / "palace"
args = _init_args(ai_dialogue_corpus, no_llm=True) # explicit opt-out
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.cli._maybe_run_mine_after_init"),
patch("mempalace.room_detector_local.detect_rooms_local"),
):
cmd_init(args)
# origin.json still written — Tier 1 still runs and detects AI-dialogue.
origin = json.loads((palace / ".mempalace" / "origin.json").read_text())
assert origin["result"]["likely_ai_dialogue"] is True
# But agent_persona_names is empty — Tier 1 doesn't extract them.
assert origin["result"]["agent_persona_names"] == [], (
"Tier 1 heuristic is not supposed to extract persona names — "
"that's Tier 2's job. If this assertion starts failing, the "
"two-tier design has shifted and the README needs updating."
)
# entities.json shows v3.3.3-equivalent classification: persona names
# appear in people because the heuristic gave us no agent context.
entities = json.loads((ai_dialogue_corpus / "entities.json").read_text())
saved_people = set(entities.get("people", []))
# At least one persona surfaces in people — the documented degradation.
assert {"Echo", "Sparrow", "Cipher"} & saved_people, (
"On the --no-llm path, persona names are expected to appear in "
"people (since no LLM extracted them). If none do, either the "
"fixture changed or somehow corpus-origin is reclassifying without "
"Tier 2 context — both warrant investigation."
)
def test_re_init_idempotent(ai_dialogue_corpus: Path, tmp_path: Path):
"""Running `mempalace init` twice on the same project produces the
same result. origin.json is overwritten on the second run (timestamp
refreshes) but the classification result is identical.
Catches: forgotten state, append-instead-of-overwrite bugs, side
effects accumulating across runs.
"""
from mempalace.cli import cmd_init
palace = tmp_path / "palace"
args = _init_args(ai_dialogue_corpus, no_llm=True)
with (
patch("mempalace.cli.MempalaceConfig", return_value=_stub_cfg(palace)),
patch("mempalace.cli._maybe_run_mine_after_init"),
patch("mempalace.room_detector_local.detect_rooms_local"),
):
cmd_init(args)
first = json.loads((palace / ".mempalace" / "origin.json").read_text())
cmd_init(args)
second = json.loads((palace / ".mempalace" / "origin.json").read_text())
# The result payload must be identical between runs (same fixture, same
# heuristic, no nondeterminism in Tier 1).
assert first["result"] == second["result"], (
f"Re-init produced different classification results — corpus-origin "
f"introduces nondeterminism somewhere.\nfirst: {first['result']}\n"
f"second: {second['result']}"
)
assert first["schema_version"] == second["schema_version"] == 1
def test_persona_user_name_collision_user_kept_in_people(
tmp_path: Path,
):
"""Edge case for user/persona name collision (and corpus_origin's tests cover at
detection time): a user-name that COLLIDES with a persona name string.
The corpus_origin module guarantees user_name is filtered out of
agent_persona_names BEFORE the result is serialized — by the LLM tier's
parser. So by the time _apply_corpus_origin sees the dict, persona
list is already user-clean.
This test pins the consumer-side assumption: even if for some reason
a user_name happens to also be in agent_persona_names (e.g. a future
tool writes origin.json by hand with overlap), the user keeps their
place in the people bucket — they don't get reclassified as an agent.
The corpus-origin wiring must protect the human from disappearing.
"""
from mempalace.entity_detector import detect_entities
project = tmp_path / "collision_corpus"
project.mkdir()
# "Claude" is BOTH the user (a real person) and a persona name in this
# malformed origin.json. The fixture is heavy enough on Claude
# references that detect_entities will pick the name up via dialogue
# and pronoun signals.
text = (
"Claude wrote a long entry about her morning. Claude said "
"the day was beautiful. She walked to the park. Claude smiled. "
"Claude noticed the leaves had changed. She continued home. "
"Claude thought about dinner. She prepared a meal. Claude ate slowly."
)
(project / "diary.md").write_text(text)
# Malformed origin.json where user_name overlaps with personas.
bad_origin = {
"schema_version": 1,