**File: .agents/code_review_guidelines.md**

- Editing globals: rarely a good idea. When done, it should be thoughtful and clear: singletons clearly designed as singletons and labeled as such. Never set globals on external libs (e.g. structlog) unless this project is an "application" (a server always run at top level) and not a library (potentially called from many apps).

### Python specific guide

- Code should be "Pythonic"
- We use `asyncio` wherever possible. Avoid threads unless there's a good reason we can't use async.
- Python json.dumps should always set `ensure_ascii=False`
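A quick illustration of the `ensure_ascii` difference (standard-library behavior, not project code):

```python
import json

payload = {"name": "café"}

# Default behavior escapes non-ASCII characters:
print(json.dumps(payload))                      # {"name": "caf\u00e9"}

# ensure_ascii=False keeps the text human-readable:
print(json.dumps(payload, ensure_ascii=False))  # {"name": "café"}
```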

The SDK in `/libs/core` is an SDK/library we expose to third parties.

- Changing existing APIs in ways that break current users should be avoided. Call out breaking API changes, and confirm with the user that we're okay with the break.
- All visible classes/vars should have docstrings explaining their purpose. These will be pulled into third-party docs automatically. The docstrings should be written for third-party devs learning the SDK.
- Performance: the `base_adapter` and `litellm_adapter` are performance critical. They are the core run loop of our agent system. Avoid anything that would slow them down (e.g. file reads should be done once and passed in). It's critical to avoid blocking IO: a process may be executing hundreds of these in parallel.

### Project specific guide

- **`ModelName` enum and user input:** Do not use the `ModelName` enum for validation or typing of user-provided model identifiers (for example, in a Pydantic request body that validates an API payload). Kiln loads additional models over the air; those models can use names that are not members of the locally shipped `ModelName` enum. If request validation is tied to the enum, a model that is valid according to the merged model list will fail validation. Appropriate uses of `ModelName` include aliasing a constant chosen at build time (for example, default config that references a known shipped model) and entries inside the `ml_model_list` provider definitions.
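A minimal sketch of the distinction (all names here are invented for illustration — this is not Kiln's actual code): validate user-provided IDs against the merged runtime model list, not the enum.

```python
# Hypothetical names, invented for illustration.
SHIPPED_ENUM_NAMES = {"gpt_4_1", "llama_3_1_8b"}  # what a local ModelName enum would know
OVER_THE_AIR_NAMES = {"glm_5"}                    # fetched at runtime, unknown to the enum

def validate_model_id(model_id: str) -> str:
    """Validate against the merged model list, not the enum."""
    merged = SHIPPED_ENUM_NAMES | OVER_THE_AIR_NAMES
    if model_id not in merged:
        raise ValueError(f"unknown model: {model_id}")
    return model_id
```

An enum-typed request field would reject `"glm_5"` even though it is valid at runtime; a plain `str` checked against the merged list accepts it.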

**File: .cursor/skills/kiln-add-model/SKILL.md**

## Phase 4 – Run Tests

Tests call real LLMs and cost money. Ideally the user only needs to consent to two script executions: the smoke test, then the full parallel suite.

**Vertex AI authentication:** Vertex tests require active gcloud credentials. If you are changing a model that uses Vertex, do not run the tests until you have asked the user to run `gcloud auth application-default login`. Failures here are auth issues, not model config problems.
**`-k` filter syntax:** Always use bracket notation for model+provider filtering, never `and`:
- Good: `-k "test_name[glm_5-fireworks_ai]"` or `-k "glm_5"`
- Bad: `-k "glm_5 and fireworks"` — `and` is a pytest keyword expression that can match wrong tests

### 4a. Enable parallel testing

Before running paid tests, enable parallel testing in `pytest.ini`:

```ini
# Change this line:
# addopts = -n auto
# To:
addopts = -n 8
```

**Important:** Revert this change after all tests complete (re-comment the line).

Tests are in `libs/core/kiln_ai/adapters/extractors/test_litellm_extractor.py`.

If a provider rejects a data type (400 error), remove that `KilnMimeType` and re-run.

### 4e. Revert parallel testing

After all tests complete, **revert `pytest.ini`** back to the commented-out state:

```ini
# addopts = -n auto
```

### 4f. Test output format

After all tests finish, present results to the user as:

1. **Two paragraphs of nuance** – describe any unusual findings, things you tried and reverted, known pre-existing failures vs. new failures, API quirks discovered, and any config adjustments made during testing.

2. **Per-model per-test dump** – organized by model name and provider. Use ✅ for PASSED, ❌ for FAILED (with a brief reason), ⏭️ for SKIPPED.

---

## Phase 5 – Discord Announcement

**Do NOT draft the Discord announcement automatically.** After presenting test results, ask the user whether they want a Discord announcement drafted. Only proceed if they confirm.

When requested, use this format:

```
New Model: [Model Name] 🚀

[One-liner about the model and that it's now in Kiln]
```

Rules:

- [ ] Preserve existing comments from the predecessor (e.g. reasoning notes, MIME type groupings)
- [ ] Zero-sum applied if the model is suggested for evals/data gen
- [ ] RAG config templates updated if the new model replaces one used in `app/web_ui/src/routes/(app)/docs/rag_configs/[project_id]/add_search_tool/rag_config_templates.ts`
---

Implement the active project. The top-level agent acts as a strict manager/coordinator — it orchestrates sub-agents but never writes code or reviews it.

## Pre-Checks

Check that all spec artifacts through `implementation_plan.md` have `status: complete`.

If any are missing or `status: draft`:

> Project spec is incomplete. The following artifacts need attention:
> - [missing/draft artifacts]
>
> Use `/spec continue` to finish speccing before implementing.

- `/spec implement all` or `/spec impl all`: all remaining phases
- `/spec implement phase N` or `/spec impl phase N`: a specific single phase

## Manager Role

The manager orchestrates the implementation process. It does NOT code, review code, run tests, or make technical decisions.

The manager's responsibilities:

- Spawn coding sub-agents and CR sub-agents at the right times
- Route CR feedback back to the coding agent
- Verify that commits actually landed (via `git status`)
- Surface phase summaries and roadblocks to the user
- Send minimal, well-structured prompts that point to reference files rather than restating their content

## Single Phase Flow

If the target phase is already complete (checkbox checked in `implementation_plan.md`), tell the user and stop — don't re-implement it.

### Step 1: Spawn Coding Agent

Spawn a new coding sub-agent using the Initial Coding Prompt template below.

→ Read [spawning_subagents.md](.cursor/skills/specs/references/spawning_subagents.md) for how to spawn sub-agents.

The coding agent returns either:

- A summary indicating it's ready for code review
- A roadblock message (see Escalation below)

### Step 2: CR Loop

1. Spawn a fresh CR sub-agent using the CR Agent Prompt template below
2. The CR agent returns structured feedback with severity labels
3. If the review is clean: proceed to Step 3
4. If issues exist:
   - Resume the coding agent with the CR Feedback Prompt template, passing the CR output
   - The coding agent addresses issues and returns a summary
   - Spawn a new CR sub-agent, passing prior feedback in a `<prior_cr_feedback>` block
   - Repeat until CR returns clean

→ Read [spawning_subagents.md](.cursor/skills/specs/references/spawning_subagents.md) for how to spawn sub-agents.

### Step 3: Commit

Resume the coding agent with the Commit Prompt template below. The coding agent commits all changes, marks the phase complete, and returns the commit message.

### Step 4: Verify
Run `git status` to confirm:

- Working tree is clean (no uncommitted changes)
- The commit exists
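For example, the check can be done with standard git commands (the exact commit-message pattern is up to the phase):

```shell
# Empty output means the working tree is clean.
git status --porcelain

# Confirm the most recent commit is the phase commit.
git log -1 --oneline
```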

If `git status` shows uncommitted changes, resume the coding agent to commit them. Verify again after.

### Step 5: Present Summary

Show the phase summary to the user.

## Implement All

Run all remaining phases in sequence:

1. Read `implementation_plan.md` and find all incomplete phases
2. For each phase: run the Single Phase Flow above
3. Between phases: show the phase summary, then immediately continue to the next phase (don't stop to ask)
4. After all phases: present a final summary

If a target phase is already complete (checkbox checked), skip it.

## Prompt Templates

These are the exact prompts the manager sends to sub-agents. Use them verbatim, filling in the bracketed values.

### Initial Coding Prompt

```
You are a coding agent implementing a phase of a spec-driven project.

**Phase:** [N]
**Project specs:** [specs/projects/PROJECT_NAME/]

Read `.cursor/skills/specs/references/coding_phase_prompt.md` for your full instructions. Follow them precisely.

Return a short summary of what you built when implementation is complete and ready for code review.
```

### CR Feedback Prompt (resume coding agent)

```
A code reviewer found issues with your implementation. Address all feedback below, then run automated checks until clean.

Return a short summary of changes made when ready for re-review.

<cr_feedback>
[CR agent's output]
</cr_feedback>
```

### Commit Prompt (resume coding agent)

```
Your code has passed review. Commit all changes with a descriptive message summarizing the work done in this phase. Mark the phase checkbox complete in implementation_plan.md.

Return the commit message you used.
```

### CR Agent Prompt

```
Review code changes for phase [N] of the project at [specs/projects/PROJECT_NAME/].

Read `.cursor/skills/specs/references/cr_agent_prompt.md` for your full review instructions. Follow them precisely.
```
For re-reviews, append:

```
<prior_cr_feedback>
[Previous CR output]
</prior_cr_feedback>
```

## Escalation

The coding agent may surface a technical roadblock instead of a "ready for CR" summary. This happens when the coding agent's "one exception" rule triggers — a genuinely new technical constraint not known at design time.

When the manager receives a roadblock message:

1. Present the roadblock to the user and wait for a decision
2. Resume the coding agent with the user's decision
3. Continue the single-phase flow from wherever the coding agent left off

## References

- [spawning_subagents.md](.cursor/skills/specs/references/spawning_subagents.md) — How to spawn and resume sub-agents
- [coding_phase_prompt.md](.cursor/skills/specs/references/coding_phase_prompt.md) — Full instructions for coding sub-agents
- [cr_agent_prompt.md](.cursor/skills/specs/references/cr_agent_prompt.md) — Full instructions for CR sub-agents