Commit 3bf27aa

Merge pull request #1182 from Kiln-AI/scosman/improve-api-docs
Improve API docs
2 parents 5676ef5 + 55193c9 commit 3bf27aa

84 files changed

+6089
-15077
lines changed

.agents/api_code_review.md

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
1+
# FastAPI / OpenAPI Standards
2+
3+
Our OpenAPI spec drives our SDK, Scalar docs, and agent tool use (Kiln Chat calls our APIs). Every endpoint must be well-documented and consistently named. Flag violations during code review.
4+
5+
**Required on every endpoint:**
6+
7+
1. **`tags=[...]`** on the route decorator. Every endpoint must belong to a tag group (e.g. `tags=["Projects"]`). Untagged endpoints break Scalar navigation and agent tool discovery. Prefer existing tags, creating new ones only when really needed. All tags should be documented in `tags_metadata` in `server.py`.
8+
2. **`summary=`** on the route decorator. A short, unique name for the operation. Summaries must be unambiguous — if two endpoints could share the same summary (e.g. "Edit Tags"), qualify them ("Edit Run Tags", "Edit Document Tags").
9+
3. **Docstring** on the handler function (optional if behavior is completely obvious from the path, method, and summary). When provided, docstrings should be terse — one sentence or a fragment. Never pad with filler like "This endpoint allows you to...". Longer descriptions (2–3 sentences) are warranted only when distinguishing easily confused endpoints, documenting non-obvious side effects, or noting prerequisites. Exclude if the `summary` string already covers the same level of detail.
10+
4. **`Path(description=...)`** on every path parameter, using `Annotated[str, Path(description="...")]` syntax. Recurring ID parameters must use consistent standard descriptions (e.g. `"The unique identifier of the project."`, `"The unique identifier of the task within the project."`).
11+
5. **`Query(description=...)`** on every query parameter.
12+
6. **`Field(description=...)`** on Pydantic model properties that aren't completely self-evident from name + type.
13+
7. **Class docstring** on Pydantic models used as API request/response bodies. These become the schema description in the OpenAPI spec, which agents and SDK users see when inspecting request/response types. Optional but suggested if non-obvious from name.
14+
15+
**Correct HTTP methods:**
16+
17+
- **GET** must be idempotent and side-effect-free. If an endpoint creates, modifies, or deletes data, it must not be GET. We previously had GET endpoints that established connections and ran evaluations — this is wrong and confuses both agents and humans.
18+
- **POST** for creation and actions that trigger execution.
19+
- **PATCH** for partial updates.
20+
- **DELETE** for deletion.
21+
- The only exception is SSE streaming endpoints, which must use GET due to browser `EventSource` constraints. These must have descriptions explicitly noting the mutation and the SSE reason.
22+
23+
**Naming and path conventions:**
24+
25+
- **Always use plural nouns** in path segments: `/tasks/{task_id}`, never `/task/{task_id}`. Same for `/projects`, `/specs`, `/evals`, `/runs`, `/prompts`, `/documents`, `/skills`, `/run_configs`, etc. We had inconsistencies where GET used plural but POST/PATCH/DELETE used singular — this is confusing and must be caught.
26+
- **Paths should be descriptive and intuitive.** Paths should follow REST conventions and be as clear as possible without docstrings. Paths and descriptions should distinguish similar-sounding endpoints. If a path could reasonably be improved, suggest a rename.
27+
- **Consistent path structure** for related resources. All operations on the same resource type should share a common path prefix (e.g. all run config operations under `/run_configs`, not split across `/task_run_config`, `/mcp_run_config`, `/run_config`). Avoid similar but different prefixes, which commonly trip up agents.
28+
- **No trailing slashes** on paths. Use `/run_configs` not `/run_configs/`. Trailing slashes cause inconsistency between endpoints and can break client routing.
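
The plural-noun and trailing-slash rules above are mechanical enough to check with a small helper. The following is only a sketch using the standard library; `PLURALS` and `path_issues` are hypothetical names, and the map would need extending to cover a project's real resource names.

```python
# Hypothetical singular -> plural map built from the resources named above.
PLURALS = {
    "project": "projects",
    "task": "tasks",
    "spec": "specs",
    "eval": "evals",
    "run": "runs",
    "prompt": "prompts",
    "document": "documents",
    "skill": "skills",
    "run_config": "run_configs",
}


def path_issues(path: str) -> list[str]:
    """Return human-readable convention violations for one route path."""
    issues = []
    if len(path) > 1 and path.endswith("/"):
        issues.append("trailing slash")
    for segment in path.strip("/").split("/"):
        # Skip path parameters such as {task_id}.
        if segment.startswith("{"):
            continue
        if segment in PLURALS:
            issues.append(f"singular segment '{segment}', use '{PLURALS[segment]}'")
    return issues


for candidate in (
    "/api/projects/{project_id}/tasks/{task_id}",
    "/api/projects/{project_id}/task/{task_id}",
    "/api/run_configs/",
):
    print(candidate, path_issues(candidate))
```

A check like this only catches exact singular segments; compound names such as `/task_run_config` still need a human reviewer.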
29+
30+
**Example of a well-documented endpoint:**
31+
32+
```python
@app.delete(
    "/api/projects/{project_id}",
    summary="Delete Project",
    tags=["Projects"],
)
async def delete_project(
    project_id: Annotated[
        str, Path(description="The unique identifier of the project.")
    ],
) -> dict:
    """Removes the project from Kiln but does not delete the files from disk."""
```
45+
46+
**What to flag in code review:**
47+
48+
- Missing `tags=` on any route decorator
49+
- Missing `summary=` on any route decorator
50+
- Missing `Path(description=...)` or `Query(description=...)` on any parameter
51+
- GET endpoints that perform mutations (unless SSE with documented justification)
52+
- Singular nouns in path segments where plural is standard
53+
- Ambiguous or duplicate summaries across endpoints
54+
- Trailing slashes on paths
55+
- Inconsistent path naming for the same resource type
56+
- Wordy or filler-padded docstrings ("This endpoint allows you to...")
57+
- Docstrings containing code artifacts, raw `Args:` blocks, or formatting that doesn't read as clean prose in OpenAPI
58+
- Pydantic models used in API request/response types (nested included) missing a class docstring, if the class name alone isn't obvious
59+
- Custom string types with validator-based constraints that don't surface in the OpenAPI schema. Use `StringConstraints` in the `Annotated` type definition so `minLength`/`maxLength` appear automatically (see `FilenameString`, `SkillNameString` for examples). Don't duplicate constraints in individual `Field()` calls.

.agents/code_review_guidelines.md

Lines changed: 9 additions & 0 deletions
@@ -25,6 +25,15 @@ The SDK in `/libs/core` is a SDK/library we expose to third parties. We code rev
2525
- All visible classes/vars should have docstrings explaining their purpose. These will be pulled into 3rd party docs automatically. The doc strings should be written for 3rd party devs learning the SDK.
2626
- Performance: the base_adapter and litellm_adapter are performance critical. They are the core run-loop of our agent system. We should avoid anything that would slow them down (file reads should be done once and passed in, etc). It's critical to avoid blocking IO - a process may be executing hundreds of these in parallel.
2727

28+
### FastAPI / OpenAPI Standards
29+
30+
If the change impacts API endpoints, read `.agents/api_code_review.md` for instructions on how to code review API endpoints.
31+
32+
Changes impacting APIs include:
33+
- adding/removing/modifying a FastAPI endpoint (`@app.get`, `@app.delete`, etc.)
34+
- adding/removing/modifying a Pydantic model which is used in an API endpoint as an input/return value (including nested models)
35+
2836
### Project specific guide
2937

3038
- **`ModelName` enum and user input:** Do not use the `ModelName` enum for validation or typing of user-provided model identifiers (for example in a Pydantic request body that validates an API payload). Kiln loads additional models over the air; those models can use names that are not members of the locally shipped `ModelName` enum. If request validation is tied to the enum, a model that is valid according to the merged model list will fail validation. Appropriate uses of `ModelName` include aliasing a constant chosen at build time (for example default config that references a known shipped model) and entries inside the `ml_model_list` provider definitions.
39+

app/desktop/studio_server/api_models/copilot_models.py

Lines changed: 7 additions & 5 deletions
@@ -8,16 +8,18 @@
88
class TaskInfoApi(BaseModel):
99
"""Task information for copilot API calls."""
1010

11-
task_prompt: str
12-
task_input_schema: str
13-
task_output_schema: str
11+
task_prompt: str = Field(description="The task's prompt.")
12+
task_input_schema: str = Field(description="The task's input JSON schema.")
13+
task_output_schema: str = Field(description="The task's output JSON schema.")
1414

1515

1616
class TaskMetadataApi(BaseModel):
1717
"""Metadata about the model used for a task."""
1818

19-
model_name: str
20-
model_provider_name: ModelProviderName
19+
model_name: str = Field(description="The name of the AI model used.")
20+
model_provider_name: ModelProviderName = Field(
21+
description="The provider hosting the model (e.g. OpenAI, Anthropic)."
22+
)
2123

2224

2325
class SyntheticDataGenerationStepConfigApi(BaseModel):

app/desktop/studio_server/copilot_api.py

Lines changed: 19 additions & 8 deletions
@@ -1,4 +1,5 @@
11
import logging
2+
from typing import Annotated
23

34
from app.desktop.studio_server.api_client.kiln_ai_server_client.api.copilot import (
45
clarify_spec_v1_copilot_clarify_spec_post,
@@ -47,7 +48,7 @@
4748
get_copilot_api_key,
4849
)
4950
from app.desktop.studio_server.utils.response_utils import unwrap_response
50-
from fastapi import FastAPI, HTTPException
51+
from fastapi import FastAPI, HTTPException, Path
5152
from kiln_ai.datamodel import TaskRun
5253
from kiln_ai.datamodel.basemodel import FilenameString
5354
from kiln_ai.datamodel.datamodel_enums import Priority
@@ -113,7 +114,7 @@ class CreateSpecWithCopilotRequest(BaseModel):
113114

114115

115116
def connect_copilot_api(app: FastAPI):
116-
@app.post("/api/copilot/clarify_spec")
117+
@app.post("/api/copilot/clarify_spec", tags=["Copilot"])
117118
async def clarify_spec(input: ClarifySpecApiInput) -> ClarifySpecApiOutput:
118119
api_key = get_copilot_api_key()
119120
client = get_authenticated_client(api_key)
@@ -139,7 +140,7 @@ async def clarify_spec(input: ClarifySpecApiInput) -> ClarifySpecApiOutput:
139140
detail="Unknown error.",
140141
)
141142

142-
@app.post("/api/copilot/refine_spec")
143+
@app.post("/api/copilot/refine_spec", tags=["Copilot"])
143144
async def refine_spec(input: RefineSpecApiInput) -> RefineSpecApiOutput:
144145
api_key = get_copilot_api_key()
145146
client = get_authenticated_client(api_key)
@@ -165,7 +166,7 @@ async def refine_spec(input: RefineSpecApiInput) -> RefineSpecApiOutput:
165166
detail="Unknown error.",
166167
)
167168

168-
@app.post("/api/copilot/generate_batch")
169+
@app.post("/api/copilot/generate_batch", tags=["Copilot"])
169170
async def generate_batch(input: GenerateBatchApiInput) -> GenerateBatchApiOutput:
170171
api_key = get_copilot_api_key()
171172
client = get_authenticated_client(api_key)
@@ -191,7 +192,7 @@ async def generate_batch(input: GenerateBatchApiInput) -> GenerateBatchApiOutput
191192
detail="Unknown error.",
192193
)
193194

194-
@app.post("/api/copilot/question_spec")
195+
@app.post("/api/copilot/question_spec", tags=["Copilot"])
195196
async def question_spec(
196197
input: SpecQuestionerApiInput,
197198
) -> QuestionSet:
@@ -219,7 +220,7 @@ async def question_spec(
219220
detail="Unknown error.",
220221
)
221222

222-
@app.post("/api/copilot/refine_spec_with_question_answers")
223+
@app.post("/api/copilot/refine_spec_with_question_answers", tags=["Copilot"])
223224
async def submit_question_answers(
224225
request: SubmitAnswersRequest,
225226
) -> RefineSpecApiOutput:
@@ -245,9 +246,19 @@ async def submit_question_answers(
245246
detail="Unknown error.",
246247
)
247248

248-
@app.post("/api/projects/{project_id}/tasks/{task_id}/spec_with_copilot")
249+
@app.post(
250+
"/api/projects/{project_id}/tasks/{task_id}/spec_with_copilot",
251+
tags=["Copilot"],
252+
)
249253
async def create_spec_with_copilot(
250-
project_id: str, task_id: str, request: CreateSpecWithCopilotRequest
254+
project_id: Annotated[
255+
str, Path(description="The unique identifier of the project.")
256+
],
257+
task_id: Annotated[
258+
str,
259+
Path(description="The unique identifier of the task within the project."),
260+
],
261+
request: CreateSpecWithCopilotRequest,
251262
) -> Spec:
252263
"""Create a spec using Kiln Copilot.
253264

app/desktop/studio_server/data_gen_api.py

Lines changed: 88 additions & 22 deletions
@@ -1,6 +1,6 @@
1-
from typing import Literal
1+
from typing import Annotated, Literal
22

3-
from fastapi import FastAPI, HTTPException
3+
from fastapi import FastAPI, HTTPException, Path, Query
44
from kiln_ai.adapters.adapter_registry import adapter_for_task, load_skills_for_task
55
from kiln_ai.adapters.data_gen.data_gen_task import (
66
DataGenCategoriesTask,
@@ -55,7 +55,7 @@ class DataGenSampleApiInput(BaseModel):
5555
topic: list[str] = Field(description="Topic path for sample generation", default=[])
5656
num_samples: int = Field(description="Number of samples to generate", default=8)
5757
gen_type: Literal["training", "eval"] = Field(
58-
description="The type of task to generate topics for"
58+
description="The type of data generation: eval or training."
5959
)
6060
guidance: str | None = Field(
6161
description="Optional custom guidance for generation",
@@ -122,9 +122,20 @@ class SaveQnaPairInput(BaseModel):
122122

123123

124124
def connect_data_gen_api(app: FastAPI):
125-
@app.post("/api/projects/{project_id}/tasks/{task_id}/generate_categories")
125+
@app.post(
126+
"/api/projects/{project_id}/tasks/{task_id}/generate_categories",
127+
summary="Generate Categories",
128+
tags=["Synthetic Data"],
129+
)
126130
async def generate_categories(
127-
project_id: str, task_id: str, input: DataGenCategoriesApiInput
131+
project_id: Annotated[
132+
str, Path(description="The unique identifier of the project.")
133+
],
134+
task_id: Annotated[
135+
str,
136+
Path(description="The unique identifier of the task within the project."),
137+
],
138+
input: DataGenCategoriesApiInput,
128139
) -> TaskRun:
129140
project = project_from_id(project_id)
130141
task = task_from_id(project_id, task_id)
@@ -155,9 +166,20 @@ async def generate_categories(
155166
categories_run = await adapter.invoke(task_input.model_dump())
156167
return categories_run
157168

158-
@app.post("/api/projects/{project_id}/tasks/{task_id}/generate_inputs")
169+
@app.post(
170+
"/api/projects/{project_id}/tasks/{task_id}/generate_inputs",
171+
summary="Generate Inputs",
172+
tags=["Synthetic Data"],
173+
)
159174
async def generate_samples(
160-
project_id: str, task_id: str, input: DataGenSampleApiInput
175+
project_id: Annotated[
176+
str, Path(description="The unique identifier of the project.")
177+
],
178+
task_id: Annotated[
179+
str,
180+
Path(description="The unique identifier of the task within the project."),
181+
],
182+
input: DataGenSampleApiInput,
161183
) -> TaskRun:
162184
project = project_from_id(project_id)
163185
task = task_from_id(project_id, task_id)
@@ -187,10 +209,19 @@ async def generate_samples(
187209
samples_run = await adapter.invoke(task_input.model_dump())
188210
return samples_run
189211

190-
@app.post("/api/projects/{project_id}/tasks/{task_id}/save_sample")
212+
@app.post(
213+
"/api/projects/{project_id}/tasks/{task_id}/save_sample",
214+
summary="Save Sample",
215+
tags=["Synthetic Data"],
216+
)
191217
async def save_sample(
192-
project_id: str,
193-
task_id: str,
218+
project_id: Annotated[
219+
str, Path(description="The unique identifier of the project.")
220+
],
221+
task_id: Annotated[
222+
str,
223+
Path(description="The unique identifier of the task within the project."),
224+
],
194225
task_run: TaskRun,
195226
) -> TaskRun:
196227
"""
@@ -202,12 +233,24 @@ async def save_sample(
202233
task_run.save_to_file()
203234
return task_run
204235

205-
@app.post("/api/projects/{project_id}/tasks/{task_id}/generate_sample")
236+
@app.post(
237+
"/api/projects/{project_id}/tasks/{task_id}/generate_sample",
238+
summary="Generate Sample",
239+
tags=["Synthetic Data"],
240+
)
206241
async def generate_sample(
207-
project_id: str,
208-
task_id: str,
242+
project_id: Annotated[
243+
str, Path(description="The unique identifier of the project.")
244+
],
245+
task_id: Annotated[
246+
str,
247+
Path(description="The unique identifier of the task within the project."),
248+
],
209249
sample: DataGenSaveSamplesApiInput,
210-
session_id: str | None = None,
250+
session_id: Annotated[
251+
str | None,
252+
Query(description="Optional session ID to group generated samples."),
253+
] = None,
211254
) -> TaskRun:
212255
task = task_from_id(project_id, task_id)
213256

@@ -260,12 +303,24 @@ async def generate_sample(
260303

261304
return run
262305

263-
@app.post("/api/projects/{project_id}/tasks/{task_id}/generate_qna")
306+
@app.post(
307+
"/api/projects/{project_id}/tasks/{task_id}/generate_qna",
308+
summary="Generate Q&A Pairs",
309+
tags=["Synthetic Data"],
310+
)
264311
async def generate_qna_pairs(
265-
project_id: str,
266-
task_id: str,
312+
project_id: Annotated[
313+
str, Path(description="The unique identifier of the project.")
314+
],
315+
task_id: Annotated[
316+
str,
317+
Path(description="The unique identifier of the task within the project."),
318+
],
267319
input: DataGenQnaApiInput,
268-
session_id: str | None = None,
320+
session_id: Annotated[
321+
str | None,
322+
Query(description="Optional session ID to group generated Q&A pairs."),
323+
] = None,
269324
) -> TaskRun:
270325
project = project_from_id(project_id)
271326
if not project:
@@ -303,12 +358,23 @@ async def generate_qna_pairs(
303358

304359
return qna_run
305360

306-
@app.post("/api/projects/{project_id}/tasks/{task_id}/save_qna_pair")
361+
@app.post(
362+
"/api/projects/{project_id}/tasks/{task_id}/save_qna_pair",
363+
summary="Save Q&A Pair",
364+
tags=["Synthetic Data"],
365+
)
307366
async def save_qna_pair(
308-
project_id: str,
309-
task_id: str,
367+
project_id: Annotated[
368+
str, Path(description="The unique identifier of the project.")
369+
],
370+
task_id: Annotated[
371+
str,
372+
Path(description="The unique identifier of the task within the project."),
373+
],
310374
input: SaveQnaPairInput,
311-
session_id: str,
375+
session_id: Annotated[
376+
str, Query(description="Session ID to group saved Q&A pairs.")
377+
],
312378
) -> TaskRun:
313379
"""
314380
Save a single QnA pair as a TaskRun. We store the task's system prompt
