Commit c5c5a89

JihaoXin and claude committed
Update ARCHITECTURE and TODO docs with v0.2 features, add zh/ar translations
Rewrite ARCHITECTURE.md to reflect current state: 4-step Research pipeline, mixin-based orchestrator, skills system, conda isolation, DB state management, 9 agents. Add Chinese and Arabic translations. Update TODO.md with recently completed v0.2 items and correct test count (115). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 5b04dd1 commit c5c5a89

6 files changed, +837 -83 lines changed

ARCHITECTURE.md

Lines changed: 139 additions & 82 deletions
@@ -4,34 +4,66 @@
44

55
**Core idea**: Trust the AI's judgment; code handles execution and guardrails only.
66

7+
- **DB as source of truth** &mdash; project config and status live in SQLite; YAML is used only for per-agent runtime state
8+
- **Per-project isolation** &mdash; each project gets its own conda env, sandboxed HOME, and `PYTHONNOUSERSITE=1`
9+
- **Skills over hard-coded rules** &mdash; modular instruction sets (skills) are loaded at runtime to enforce best practices
10+
711
## Pipeline Overview
812

13+
ARK runs three phases in sequence:
14+
915
```
10-
┌─────────────────────────────────────────────────────────────┐
11-
│ Simplified Pipeline │
12-
├─────────────────────────────────────────────────────────────┤
13-
│ │
14-
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
15-
│ │ Reviewer │───▶│ Planner │───▶│ Execute │ │
16-
│ │ Review │ │ Decide │ │ Run │ │
17-
│ └──────────┘ └────┬─────┘ └──────────┘ │
18-
│ │ │
19-
│ ▼ │
20-
│ Planner outputs YAML: │
21-
│ actions: │
22-
│ - agent: experimenter │
23-
│ task: "..." │
24-
│ - agent: writer │
25-
│ task: "..." │
26-
│ │
27-
│ ┌──────────────────────────────────────────┐ │
28-
│ │ Memory (minimal) │ │
29-
│ │ - scores: [7.0, 7.2, 7.5, ...] │ │
30-
│ │ - is_stagnating() → bool │ │
31-
│ │ - GOAL_ANCHOR (constant) │ │
32-
│ └──────────────────────────────────────────┘ │
33-
│ │
34-
└─────────────────────────────────────────────────────────────┘
16+
┌─────────────────────────────────────────────────────────────────┐
17+
│ ARK Pipeline │
18+
├─────────────────────────────────────────────────────────────────┤
19+
│ │
20+
│ Phase 1: Research (4-step) │
21+
│ ┌──────────────┐ ┌─────────────┐ ┌─────────┐ ┌──────────┐ │
22+
│ │Deep Research │─▶│ Initializer │─▶│ Planner │─▶│Experiment│ │
23+
│ │(Gemini) │ │(bootstrap) │ │(plan) │ │(run) │ │
24+
│ └──────────────┘ └─────────────┘ └─────────┘ └──────────┘ │
25+
│ │
26+
│ Phase 2: Dev │
27+
│ ┌───────────────────────────────────────────────────────┐ │
28+
│ │ plan → experiment on Slurm → analyze → write draft │ │
29+
│ └───────────────────────────────────────────────────────┘ │
30+
│ │
31+
│ Phase 3: Review (iterative loop) │
32+
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
33+
│ │ Compile │─▶│ Review │─▶│ Planner │─▶│ Execute │──┐ │
34+
│ │ LaTeX │ │ Score │ │ Decide │ │ Run │ │ │
35+
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ │
36+
│ ▲ │ │
37+
│ └──── Validate ◀────────────────────────────────────┘ │
38+
│ (recompile) │
39+
│ │
40+
│ Loop until score ≥ threshold or human intervention │
41+
└─────────────────────────────────────────────────────────────────┘
42+
```
43+
44+
### Research Phase (4-step pipeline)
45+
46+
| Step | Agent | What Happens |
47+
|:-----|:------|:-------------|
48+
| 1 | Deep Research | Gemini literature survey, background knowledge gathering |
49+
| 2 | Initializer | Bootstrap conda env, install builtin skills, prepare citations |
50+
| 3 | Planner | Generate initial research plan from survey results |
51+
| 4 | Experimenter | Run first round of experiments based on plan |
52+
53+
### Review Loop
54+
55+
Each iteration runs 5 steps: Compile → Review → Plan → Execute → Validate.
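
The five steps and the stop condition can be sketched as a plain loop. This is a minimal illustration, not ARK's actual code: the `steps` dictionary, the function name `run_review_loop`, and the `max_iterations` cap (a stand-in for human intervention) are all hypothetical, and checking the threshold right after the Review step is an assumption.

```python
from typing import Callable, Dict, List


def run_review_loop(
    steps: Dict[str, Callable],
    threshold: float,
    max_iterations: int = 10,  # hypothetical cap standing in for human intervention
) -> List[float]:
    """Run Compile -> Review -> Plan -> Execute -> Validate until the score is high enough."""
    scores: List[float] = []
    for _ in range(max_iterations):
        steps["compile"]()            # build the current draft
        score = steps["review"]()     # reviewer returns a numeric score
        scores.append(score)
        if score >= threshold:        # assumed placement of the stop check
            break
        plan = steps["plan"](score)   # planner decides what to do next
        steps["execute"](plan)        # run the planned actions
        steps["validate"]()           # recompile and check the result
    return scores


# Usage with stubbed steps and made-up scores.
fake_scores = iter([7.0, 7.2, 7.5])
calls: List[str] = []
steps = {
    "compile": lambda: calls.append("compile"),
    "review": lambda: next(fake_scores),
    "plan": lambda score: {"actions": []},
    "execute": lambda plan: calls.append("execute"),
    "validate": lambda: calls.append("validate"),
}
history = run_review_loop(steps, threshold=7.5)
```

Passing the steps in as callables keeps the loop testable without a LaTeX toolchain or real agents.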
56+
57+
The Planner outputs structured YAML action plans:
58+
59+
```yaml
60+
actions:
61+
- agent: experimenter
62+
task: "Run perplexity validation experiment"
63+
priority: 1
64+
- agent: writer
65+
task: "Update Section 4.2"
66+
priority: 2
3567
```
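
Once the YAML above is parsed into a dictionary, a consumer might validate and order the actions as in the sketch below. The `Action` dataclass and `order_actions` function are hypothetical names, and treating a lower `priority` number as "runs first" is an assumption that the source does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Action:
    agent: str
    task: str
    priority: int


def order_actions(plan: dict) -> List[Action]:
    """Validate a parsed action plan and return its actions sorted by priority.

    Assumption: a lower priority number is executed first.
    """
    actions = []
    for raw in plan.get("actions", []):
        missing = {"agent", "task", "priority"} - raw.keys()
        if missing:
            raise ValueError(f"action is missing fields: {sorted(missing)}")
        actions.append(Action(raw["agent"], raw["task"], int(raw["priority"])))
    return sorted(actions, key=lambda a: a.priority)


# The plan as it would look after YAML parsing; listed out of order on purpose.
plan = {
    "actions": [
        {"agent": "writer", "task": "Update Section 4.2", "priority": 2},
        {"agent": "experimenter", "task": "Run perplexity validation experiment", "priority": 1},
    ]
}
ordered = order_actions(plan)
```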
3668
3769
## Core Components
@@ -52,7 +84,7 @@ class SimpleMemory:
5284
```
5385

5486
Additional features:
55-
- **Issue tracking**: Counts how many times each issue reappears across iterations
87+
- **Issue tracking**: Content-based dedup — counts how many times each issue reappears across iterations
5688
- **Repair validation**: Verifies that attempted fixes actually resolved the issue
5789
- **Strategy escalation**: Automatically bans ineffective methods and suggests alternatives
5890
- **Meta-debugging**: Triggers diagnostic when the system is stuck
@@ -61,50 +93,54 @@ Additional features:
6193

6294
Every agent invocation includes a constant "Goal Anchor" that describes the project's core objectives. This prevents agents from drifting off-topic over many iterations.
6395

64-
The Goal Anchor is project-specific and should be configured per project.
65-
66-
### 3. Planner Agent
96+
### 3. Orchestrator (`orchestrator.py`)
6797

68-
The **core decision-maker**. Outputs a structured action plan:
98+
Mixin-based design with 5 mixins:
6999

70-
```yaml
71-
actions:
72-
- agent: experimenter
73-
task: "Run perplexity validation experiment"
74-
priority: 1
75-
- agent: writer
76-
task: "Update Section 4.2"
77-
priority: 2
100+
```python
101+
class Orchestrator(ResearchMixin, DevMixin, ReviewMixin, FigureMixin, BaseMixin):
102+
# Dispatches to the correct phase based on mode
103+
# Syncs status to DB after each step
104+
# Handles Telegram notifications
78105
```
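
To illustrate how a mixin-based orchestrator can dispatch by mode, here is a self-contained sketch. It is not the real `orchestrator.py`: the method names (`run_research`, `run_dev`, `run_review`, `sync_status`), the mode strings, and the in-memory `log` that stands in for the DB sync are all assumptions, and `FigureMixin` is omitted for brevity.

```python
from typing import List


class BaseMixin:
    def __init__(self, mode: str) -> None:
        self.mode = mode
        self.log: List[str] = []

    def sync_status(self, step: str) -> None:
        # Stand-in for writing the step status to the DB.
        self.log.append(step)


class ResearchMixin:
    def run_research(self) -> None:
        self.sync_status("research")


class DevMixin:
    def run_dev(self) -> None:
        self.sync_status("dev")


class ReviewMixin:
    def run_review(self) -> None:
        self.sync_status("review")


class Orchestrator(ResearchMixin, DevMixin, ReviewMixin, BaseMixin):
    def run(self) -> None:
        # Map each (hypothetical) mode string to the phase method a mixin provides.
        handlers = {
            "research": self.run_research,
            "dev": self.run_dev,
            "review": self.run_review,
        }
        if self.mode not in handlers:
            raise ValueError(f"unknown mode: {self.mode!r}")
        handlers[self.mode]()


orchestrator = Orchestrator("dev")
orchestrator.run()
```

Each mixin contributes one phase while shared state and helpers live in `BaseMixin`, which is listed last so the phase mixins can rely on it through the method resolution order.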
79106

80-
### 4. Orchestrator (`orchestrator.py`)
107+
### 4. Skills System (`skills/`)
81108

82-
Minimal control flow:
109+
Modular instruction sets loaded at runtime:
83110

84-
```python
85-
def run_paper_iteration():
86-
# 1. Review
87-
review = run_agent("reviewer")
88-
score = parse_score(review)
89-
memory.record_score(score)
90-
91-
# 2. Stagnation detection
92-
if memory.is_stagnating():
93-
send_notification("Human intervention needed")
94-
95-
# 3. Planner decides + execute
96-
run_planner_cycle(review)
97-
98-
# 4. Visualize + commit
99-
run_figure_phase()
100-
compile_latex()
101-
git_commit()
102-
```
111+
| Skill | Purpose |
112+
|:------|:--------|
113+
| **research-integrity** | Anti-simulation: agents must run real experiments |
114+
| **human-intervention** | Escalation protocol via Telegram |
115+
| **env-isolation** | Per-project environment boundaries |
116+
| **figure-integrity** | Validates figures match actual data |
117+
| **page-adjustment** | Content density control within page limits |
118+
119+
Skills are auto-installed during pipeline bootstrap (Research Phase Step 2).
120+
121+
### 5. Environment Isolation (`webapp/jobs.py`)
122+
123+
Each project gets a sandboxed conda env:
124+
125+
- `provision_project_env()` clones the base env to `<project>/.env/`
126+
- `project_env_ready()` checks whether the env exists
127+
- Orchestrator runs with `HOME=<project_dir>`, `PYTHONNOUSERSITE=1`
128+
- Both CLI (`ark run`) and Web Portal auto-detect and use the project env
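
The environment variables listed above could be assembled for a child process roughly as follows. This is a sketch under assumptions: `isolated_env` is a hypothetical helper, not a function from `webapp/jobs.py`, and prepending `<project>/.env/bin` to `PATH` assumes a POSIX-style conda prefix layout.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def isolated_env(project_dir: Path, base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Build an environment mapping for a process confined to one project."""
    env = dict(os.environ if base_env is None else base_env)
    env["HOME"] = str(project_dir)   # sandboxed HOME
    env["PYTHONNOUSERSITE"] = "1"    # ignore the user site-packages directory
    # Assumed layout: the project env's executables live in <project>/.env/bin.
    path_parts = [str(project_dir / ".env" / "bin")]
    if env.get("PATH"):
        path_parts.append(env["PATH"])
    env["PATH"] = os.pathsep.join(path_parts)
    return env


env = isolated_env(Path("/tmp/demo_project"), {"PATH": "/usr/bin"})
```

The resulting mapping can be passed as the `env` argument of `subprocess.run`, so the parent process's own environment is left untouched.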
129+
130+
### 6. State Management (`webapp/db.py`)
131+
132+
SQLite is the source of truth for project config and status:
103133

104-
## Agent List (8 agents)
134+
- Project creation, config, phase status
135+
- Score history, cost tracking
136+
- CLI and webapp read/write the same DB
137+
- YAML files under `auto_research/state/` are for per-agent runtime state only
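
As an illustration of SQLite acting as the shared source of truth, the sketch below stores phase status and score history with the standard `sqlite3` module. The table names, columns, and helper functions are hypothetical and are not taken from `webapp/db.py`.

```python
import sqlite3
from typing import List

# Hypothetical schema: one row per project, one row per scored iteration.
SCHEMA = """
CREATE TABLE IF NOT EXISTS projects (
    name   TEXT PRIMARY KEY,
    phase  TEXT NOT NULL,
    status TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS scores (
    project   TEXT NOT NULL,
    iteration INTEGER NOT NULL,
    score     REAL NOT NULL,
    PRIMARY KEY (project, iteration)
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def set_status(conn: sqlite3.Connection, name: str, phase: str, status: str) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO projects(name, phase, status) VALUES (?, ?, ?)",
            (name, phase, status),
        )


def record_score(conn: sqlite3.Connection, name: str, iteration: int, score: float) -> None:
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO scores(project, iteration, score) VALUES (?, ?, ?)",
            (name, iteration, score),
        )


def score_history(conn: sqlite3.Connection, name: str) -> List[float]:
    rows = conn.execute(
        "SELECT score FROM scores WHERE project = ? ORDER BY iteration", (name,)
    )
    return [row[0] for row in rows]


conn = open_db()
set_status(conn, "demo", "research", "running")
set_status(conn, "demo", "review", "running")
record_score(conn, "demo", 1, 7.0)
record_score(conn, "demo", 2, 7.2)
```

Because a CLI and a web process would both open the same database file, either side sees the other's writes without a separate synchronization layer.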
138+
139+
## Agent List (9 agents)
105140

106141
| Agent | Role |
107142
|-------|------|
143+
| initializer | Bootstraps project: conda env, skills, citations |
108144
| reviewer | Reviews and scores the paper |
109145
| planner | Analyzes issues, generates action plan (paper & dev modes) |
110146
| experimenter | Designs, runs, and analyzes experiments |
@@ -114,30 +150,51 @@ def run_paper_iteration():
114150
| meta_debugger | System-level diagnosis |
115151
| coder | Implements code changes (dev mode) |
116152

117-
## Deprecated
118-
119-
- `events.py` — Event-driven system (replaced by Planner-based decisions)
120-
- Complex Memory tracking (issues, effective_actions, failed_attempts) — simplified
121-
122153
## File Structure
123154

124155
```
125156
ARK/
126-
├── orchestrator.py # Main loop
127-
├── memory.py # Memory system
128-
├── agents/ # Agent prompt templates
129-
│ ├── reviewer.prompt
130-
│ ├── planner.prompt
131-
│ ├── experimenter.prompt
132-
│ ├── researcher.prompt
133-
│ ├── writer.prompt
134-
│ ├── visualizer.prompt
135-
│ ├── meta_debugger.prompt
136-
│ └── coder.prompt
137-
├── state/ # Runtime state (gitignored)
138-
│ ├── action_plan.yaml
139-
│ ├── latest_review.md
140-
│ ├── findings.yaml
141-
│ └── memory.yaml
142-
└── logs/ # Execution logs (gitignored)
157+
├── ark/
158+
│ ├── orchestrator.py # Main loop (mixin-based)
159+
│ ├── pipeline.py # Research phase 4-step pipeline
160+
│ ├── memory.py # Score tracking, issue dedup, stagnation
161+
│ ├── agents.py # Agent invocation
162+
│ ├── execution.py # Agent execution and skill injection
163+
│ ├── cli.py # CLI commands (ark new/run/status/...)
164+
│ ├── compiler.py # LaTeX compilation
165+
│ ├── citation.py # DBLP/CrossRef citation verification
166+
│ ├── deep_research.py # Gemini Deep Research integration
167+
│ ├── telegram.py # Telegram notifications + human intervention
168+
│ ├── compute.py # Slurm/cloud compute backends
169+
│ ├── templates/agents/ # Agent prompt templates
170+
│ │ ├── initializer.prompt
171+
│ │ ├── reviewer.prompt
172+
│ │ ├── planner.prompt
173+
│ │ ├── experimenter.prompt
174+
│ │ ├── researcher.prompt
175+
│ │ ├── writer.prompt
176+
│ │ ├── visualizer.prompt
177+
│ │ └── coder.prompt
178+
│ └── webapp/
179+
│ ├── app.py # Flask app
180+
│ ├── db.py # SQLite models + state management
181+
│ ├── jobs.py # Job launch, conda env provisioning
182+
│ ├── routes.py # API routes + SSE
183+
│ └── static/app.html # SPA frontend
184+
├── skills/
185+
│ ├── index.json # Skill registry
186+
│ └── builtin/ # Built-in skills
187+
│ ├── research-integrity/
188+
│ ├── human-intervention/
189+
│ ├── env-isolation/
190+
│ ├── figure-integrity/
191+
│ └── page-adjustment/
192+
├── venue_templates/ # LaTeX templates per venue
193+
├── tests/ # 115 tests
194+
└── projects/ # Per-project directories (gitignored)
143195
```
196+
197+
## Deprecated
198+
199+
- `events.py` — Event-driven system (replaced by Planner-based decisions)
200+
- Complex Memory tracking (issues, effective_actions, failed_attempts) — simplified
