Commit ee1484f

docs: add CLAUDE.md with project guidance
Capture architecture notes, dual-backend dispatch conventions, cache invalidation rules, and the schema evolution workflow for future Claude Code sessions.

Signed-off-by: JmPotato <github@ipotato.me>
1 parent fbfdd30 commit ee1484f

1 file changed

+126
-0
lines changed

CLAUDE.md

Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
1+
# CLAUDE.md
2+
3+
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
4+
5+
## Project
6+
7+
rsomhaP is a monolithic, server-rendered blog engine written in Rust. It is a rewrite of the Python project [Pomash](https://github.com/JmPotato/Pomash) and ships as a single binary backed by either a MySQL- or PostgreSQL-compatible database. Split DB config under `[database]` selects the backend via `backend = "mysql"` / `"postgres"`, while full `connection_url` / `DATABASE_URL` values still select the driver through their URL scheme. Legacy `[mysql]` config and `MYSQL_CONNECTION_URL` env input are still accepted as compatibility fallbacks, but new work should use `[database]` and `DATABASE_URL`.
8+
9+
## Commands
10+
11+
```sh
# Build / run (package name and binary name are both `rsomhap`)
cargo build
cargo run --release

# Quality gates (not CI-enforced, but expected before committing)
cargo fmt --check
cargo clippy -- -D warnings

# Tests
cargo test # compile + pure-logic tests; DB tests early-return when TEST_DATABASE_URL is unset
TEST_DATABASE_URL="mysql://user:pass@host:port/db" cargo test -- --test-threads=1
TEST_DATABASE_URL="postgres://user:pass@host:port/db" cargo test -- --test-threads=1
cargo test <test_name> # run a single test by name

# Docker
docker build -t rsomhap .
docker compose up # reads DATABASE_URL / MYSQL_CONNECTION_URL / UMAMI_ID from the environment
```
30+
31+
**All DB-backed tests live in `src/models/*` and must run serially.** They share real tables (`articles`, `tags`, `pages`, `users`) and truncate data between cases, so parallel execution corrupts state. Every DB test starts with `let Some(pool) = get_test_pool().await else { return; };` (where `get_test_pool` returns `Option<DbPool>`), so a plain `cargo test` without `TEST_DATABASE_URL` is a compile + pure-logic check only — it is not a substitute for running with a real DB before merging schema or model changes. The DB tests are **backend-agnostic**: they run against whichever backend `TEST_DATABASE_URL` points at. Run them against both MySQL and PostgreSQL before merging any change that touches `src/models/*`. Tests outside `src/models/*` (`is_safe_redirect`, `build_message_url`, `sort_out_tags`, `User` serde) are pure.
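
The early-return guard described above can be sketched with a synchronous, std-only stand-in. `FakePool`, `get_test_pool_from`, and `example_db_test` are hypothetical names for illustration; the real `get_test_pool` is async and returns `Option<DbPool>`.

```rust
use std::env;

/// Hypothetical stand-in for the real `DbPool`; it only records the URL.
#[derive(Debug, PartialEq)]
pub struct FakePool {
    pub url: String,
}

/// Returns `None` when `TEST_DATABASE_URL` is not set.
/// `lookup` abstracts the environment so the sketch can be exercised
/// without mutating the process environment.
pub fn get_test_pool_from(lookup: impl Fn(&str) -> Option<String>) -> Option<FakePool> {
    let url = lookup("TEST_DATABASE_URL")?;
    Some(FakePool { url })
}

/// Convenience wrapper that reads the real process environment.
pub fn get_test_pool() -> Option<FakePool> {
    get_test_pool_from(|key| env::var(key).ok())
}

/// Shape of a DB-backed test body: bail out quietly when there is no pool,
/// so the test counts as passed without touching a database.
pub fn example_db_test(lookup: impl Fn(&str) -> Option<String>) -> &'static str {
    let Some(_pool) = get_test_pool_from(lookup) else {
        return "skipped";
    };
    "ran"
}
```

Because a skipped test still reports success, a green `cargo test` says nothing about the SQL unless `TEST_DATABASE_URL` was set for that run.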
32+
33+
**The binary must be launched from the project root.** `TEMPLATES_DIR`, `STATIC_DIR`, and `CONFIG_FILE_PATH` in `src/app.rs` are hard-coded as relative paths (`templates`, `static`, `config.toml`). The Dockerfile copies these into the working directory for the same reason.
34+
35+
## Architecture
36+
37+
### Request pipeline
38+
`App::serve` in `src/app.rs` wires an Axum router in two tiers:
39+
40+
- **Public**: `/` (home, paginated via `/page/{num}`), `/article/{id_or_slug}`, `/articles`, `/tag/{tag}`, `/tags`, `/feed`, `/ping`, `/login`, `/logout`, and a catch-all `/{page}` that resolves a custom `Page` by title (case-insensitive, via `LOWER(title) = LOWER(?)`).
41+
- **Admin**: nested under `/admin`, wrapped with `login_required!(AppState, login_url = "/login")` from `axum-login`. The macro redirects unauthenticated users to `/login?next=...`, and `handlers::is_safe_redirect` confines that `next` to relative paths only. The redirect-safety tests in `handlers.rs` exist specifically to lock this behavior — do not weaken or bypass them when touching the login flow.
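
As a rough illustration of what "relative paths only" can mean for the `next` parameter, here is a minimal sketch. `is_relative_redirect` is a hypothetical function written for this note, not the project's `handlers::is_safe_redirect`, and the exact rules in the real code may differ.

```rust
/// Hypothetical relative-only redirect check (not the project's implementation).
pub fn is_relative_redirect(next: &str) -> bool {
    // Must be an absolute path on this site, which also rejects
    // `https://...` and other scheme-prefixed targets.
    if !next.starts_with('/') {
        return false;
    }
    // `//host` is protocol-relative, and some browsers treat `/\host` the same way.
    if next.starts_with("//") || next.starts_with("/\\") {
        return false;
    }
    // Reject control characters such as CR/LF.
    !next.chars().any(|c| c.is_control())
}
```

A caller would typically fall back to a fixed in-site path whenever the check returns `false`.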
42+
43+
Two slug footguns that any change to article routing needs to respect:
44+
- **`handler_article` parses `{id_or_slug}` as `i32` first.** A numeric-looking slug like `"2024"` always resolves as article id 2024 and never reaches the slug lookup. Reject or prefix numeric slugs in the editor if this ever becomes a product concern.
45+
- **`articles.slug` is `INDEX(slug)`, not `UNIQUE`.** `Article::get_by_slug` uses `fetch_one`, so if duplicate slugs ever exist the query silently returns whichever matching row the database yields first. Enforce uniqueness via the schema files (see *Models, schema, and backend dispatch* below) before treating slugs as primary identifiers.
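
The id-first parsing can be sketched as follows. `ArticleKey`, `classify`, and `is_slug_shadowed_by_id` are hypothetical names; only the "try `i32`, then fall back to slug" order comes from the notes above.

```rust
/// Hypothetical result of interpreting the `{id_or_slug}` path segment.
#[derive(Debug, PartialEq)]
pub enum ArticleKey {
    Id(i32),
    Slug(String),
}

/// Try the segment as an `i32` first; anything that fails to parse is a slug.
pub fn classify(id_or_slug: &str) -> ArticleKey {
    match id_or_slug.parse::<i32>() {
        Ok(id) => ArticleKey::Id(id),
        Err(_) => ArticleKey::Slug(id_or_slug.to_string()),
    }
}

/// One possible editor-side guard: a slug that parses as an `i32`
/// can never be reached through the slug lookup.
pub fn is_slug_shadowed_by_id(slug: &str) -> bool {
    slug.parse::<i32>().is_ok()
}
```

Note that a digit string too large for `i32` fails to parse and is therefore treated as a slug, so "numeric-looking" here really means "parses as `i32`".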
46+
47+
### Sessions and authentication
48+
Sessions use `tower_sessions::MemoryStore` with a per-process `Key::generate()` and `with_secure(false)` (see `src/app.rs:215`). Three load-bearing consequences:
49+
50+
1. **Every restart invalidates all sessions** — both the store and the signing key are fresh on boot. There is no persistent session storage.
51+
2. **`MemoryStore` does not scale horizontally.** `fly.toml`'s single-machine model (`min_machines_running = 0`, no replicas) is load-bearing for auth, not just a cost choice. Adding replicas requires swapping to a shared store (e.g. `tower-sessions-sqlx-store`) before the auth layer will work at all.
52+
3. **The session cookie is not `Secure`-flagged.** Production relies entirely on Fly.io's `force_https = true` at the edge. Never deploy rsomhaP behind a non-HTTPS-terminating proxy without first setting `with_secure(true)` and re-testing login.
53+
54+
`src/auth.rs` implements `axum_login::AuthnBackend` for `AppState`. Password verification (argon2, via `password-auth`) runs inside `task::spawn_blocking`; that offload is load-bearing for request concurrency, so preserve it if you ever refactor the auth path.
55+
56+
### AppState and caches
57+
`AppState` (in `src/app.rs`) is cloned into an `Arc` and serves both as Axum state and as the `axum_login::AuthnBackend`. It holds:
58+
59+
- `config: Config` — parsed once from `config.toml`, with env var overrides (`DATABASE_URL`, legacy `MYSQL_CONNECTION_URL`, `PLAUSIBLE_DOMAIN`, `UMAMI_ID`) layered on top in `Config::load_env_vars`. `DATABASE_URL` takes precedence when both DB env vars are present. When split DB fields are used, `[database].backend` chooses whether `Config::database_url()` builds a MySQL or PostgreSQL DSN.
60+
- `env: minijinja::Environment` — all templates loaded eagerly at startup from `templates/` via `add_template_owned`. **Template edits require a full restart**; the `minijinja` `loader` feature is enabled in `Cargo.toml` but is not wired up for hot reload.
61+
- `db: models::DbPool` — an enum wrapping either `sqlx::MySqlPool` or `sqlx::PgPool`. Backend is picked at `DbPool::connect` time from the URL scheme.
62+
- `feed_cache` + `page_titles_cache` — both `Arc<RwLock<...>>`.
63+
64+
**Cache invalidation is a correctness requirement.** Any handler that mutates an `Article` or `Page` must call both `state.refresh_feed_cache(...)` and `state.refresh_page_titles_cache()` (see `handler_edit_post` / `handler_delete_post` in `src/handlers.rs`). `page_titles_cache` is injected into every template render via `AppState::render_template` and drives the navbar, so a stale cache ships wrong navigation on every page.
65+
66+
**`refresh_feed_cache(force: bool)` semantics are non-obvious but load-bearing.** `force=false` treats `MAX(articles.updated_at)` as a change detector and skips re-rendering when the timestamp hasn't advanced. Inserts and updates are safe with `force=false` because both bump `updated_at`. **Deletions must use `force=true`** — deleting the most-recently-updated row can leave `MAX(updated_at)` unchanged or regressed, which would silently serve a stale feed that still contains the deleted article. `handler_delete_post` passes `true` for exactly this reason; classify any new mutation path accordingly.
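
The failure mode is easier to see in a toy model. The sketch below is a hypothetical in-memory stand-in (`FeedCache`, integer timestamps, a list of ids instead of rendered XML), not the project's code; it only reproduces the "skip unless `MAX(updated_at)` advanced" rule.

```rust
/// Hypothetical in-memory model of the feed cache.
#[derive(Default)]
pub struct FeedCache {
    /// Largest `updated_at` seen at the last render, if any.
    last_seen: Option<i64>,
    /// Stand-in for the rendered feed: the article ids it contains.
    pub rendered: Vec<u32>,
}

impl FeedCache {
    /// `articles` is a list of `(id, updated_at)` pairs.
    /// Returns `true` when the feed was re-rendered.
    pub fn refresh(&mut self, articles: &[(u32, i64)], force: bool) -> bool {
        let newest = articles.iter().map(|&(_, updated_at)| updated_at).max();
        // Change detector: without `force`, skip unless the maximum advanced.
        // (`None < Some(_)`, so an empty table never looks "newer".)
        if !force && newest <= self.last_seen {
            return false;
        }
        self.last_seen = newest;
        self.rendered = articles.iter().map(|&(id, _)| id).collect();
        true
    }
}
```

In this model, removing the newest article and calling `refresh(.., false)` leaves the deleted id in `rendered`, because the maximum regressed instead of advancing; only `refresh(.., true)` drops it.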
67+
68+
### Editable trait and generic admin handlers
69+
`src/utils.rs` defines the `Editable` trait (`update` / `insert` / `delete` / `get_redirect_url`, all taking `&DbPool`) and a custom `Entity<T>` extractor that merges the path `id` with the form body. The same handler functions `handler_edit_post::<Article>` / `handler_edit_post::<Page>` (and their delete counterparts) are instantiated in `App::serve` for both content types. When adding a new CRUD entity, implement `Editable` and `From<EditorForm>` and reuse these handlers — they already perform cache invalidation and map `Error::PageTitleExists` to a user-visible redirect.
70+
71+
The `utils::Path<T>` wrapper around `axum::extract::Path` exists specifically to render the `error.html` template on path rejection instead of Axum's default `400`. Prefer it over the bare extractor in any new handler that takes a path parameter.
72+
73+
### Models, schema, and backend dispatch
74+
`src/models/mod.rs` defines `DbPool`, an enum that wraps `sqlx::MySqlPool` or `sqlx::PgPool` and dispatches based on the final connection URL scheme (`mysql://` vs `postgres://` / `postgresql://`). For split config fields, `Config::database_url()` injects that scheme from `[database].backend`; for full `connection_url` / `DATABASE_URL` inputs, the supplied URL is used as-is. **Every model method takes `&DbPool` and `match`es once at the entry point**, then calls a backend-specific private helper that uses the concrete pool and the dialect-correct SQL. Transactions (`sqlx::Transaction<'_, sqlx::MySql>` vs `sqlx::Transaction<'_, sqlx::Postgres>`) live entirely inside those helpers; cross-backend transaction types never leak into trait signatures.
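
A std-only sketch of the "match once, then delegate" shape is shown below. `Backend`, `FakeDbPool`, and the SQL strings are illustrative assumptions: the real `DbPool` variants hold `sqlx` pools, and the real column lists are not reproduced here.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Backend {
    MySql,
    Postgres,
}

/// Pick the backend from the URL scheme, as described above.
pub fn backend_from_url(url: &str) -> Option<Backend> {
    let (scheme, _rest) = url.split_once("://")?;
    match scheme {
        "mysql" => Some(Backend::MySql),
        "postgres" | "postgresql" => Some(Backend::Postgres),
        _ => None,
    }
}

/// Hypothetical stand-in for `DbPool`; each variant only stores the URL.
pub enum FakeDbPool {
    MySql(String),
    Postgres(String),
}

impl FakeDbPool {
    pub fn connect(url: &str) -> Option<Self> {
        match backend_from_url(url)? {
            Backend::MySql => Some(FakeDbPool::MySql(url.to_string())),
            Backend::Postgres => Some(FakeDbPool::Postgres(url.to_string())),
        }
    }

    /// Public entry point: match once, then call a backend-specific helper.
    pub fn insert_article_sql(&self) -> &'static str {
        match self {
            FakeDbPool::MySql(_) => Self::insert_article_mysql(),
            FakeDbPool::Postgres(_) => Self::insert_article_postgres(),
        }
    }

    // Illustrative statements only; the `title` column is an assumption.
    fn insert_article_mysql() -> &'static str {
        "INSERT INTO articles (title) VALUES (?)"
    }

    fn insert_article_postgres() -> &'static str {
        "INSERT INTO articles (title) VALUES ($1) RETURNING id"
    }
}
```

Keeping the `match` at the entry point means each helper sees exactly one dialect, which is what makes the placeholder and `RETURNING` differences easy to review.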
75+
76+
**Why this codebase does not use an ORM.** This repo intentionally keeps SQL explicit instead of introducing SeaORM/Diesel-style abstractions. The model surface is small, the dialect differences are narrow but load-bearing (`LAST_INSERT_ID()` vs `RETURNING`, `INSERT IGNORE` vs `ON CONFLICT`, placeholder syntax, timestamp semantics), and hand-written dispatch keeps those differences reviewable at the exact call site without extra dependencies or hidden query generation. Do not introduce an ORM as a convenience refactor unless the underlying requirements materially change.
77+
78+
Schema DDL is split across two files, one per backend, with no incremental migrations:
79+
80+
- `src/models/mysql_schema.rs` — MySQL-flavored `CREATE TABLE IF NOT EXISTS` (with `AUTO_INCREMENT`, `DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP`, `CHARSET = utf8mb4`, inline `INDEX` / `UNIQUE INDEX`).
81+
- `src/models/postgres_schema.rs` — Postgres-flavored `CREATE TABLE IF NOT EXISTS` (with `SERIAL`, `TIMESTAMPTZ NOT NULL DEFAULT NOW()`, separate `CREATE INDEX IF NOT EXISTS` / `CREATE UNIQUE INDEX IF NOT EXISTS`). Postgres has **no** `ON UPDATE CURRENT_TIMESTAMP` equivalent — the target schema is identical in *columns*, but time bookkeeping diverges: see below.
82+
- `models::init_schema(&DbPool)` dispatches into the right one and issues the DDL through a transaction handle. Both files are idempotent via `IF NOT EXISTS`, so it is safe to call on every boot. Do not treat that as a cross-backend atomic migration guarantee: MySQL can implicitly commit DDL, so partial progress is still possible if a later statement fails.
83+
84+
**There is no schema-version table, no `run_migrations`, no error-code-catching.** The previous incremental migration pattern (catching MySQL error codes 1060/1061) is gone.
85+
86+
**Schema evolution workflow**:
87+
1. Edit both schema files (`src/models/mysql_schema.rs` and `src/models/postgres_schema.rs`) so they still describe the same target shape.
88+
2. Update every affected query in `articles.rs`, `pages.rs`, and `users.rs` for both backends.
89+
3. Add or update a test that proves repeated `init_schema` calls are still safe and do not destroy existing rows.
90+
4. Decide explicitly whether old deployed databases matter. Because this repo has no incremental migrations, changing the schema files only changes the target schema for fresh databases or manually migrated ones; an already-deployed database will not automatically gain a new column just because `init_schema` still runs on boot.
91+
92+
**`updated_at` bookkeeping is done at the application level** and does not depend on the database. Every `UPDATE` in the model files explicitly sets `updated_at = NOW()`, including `Page::update` and `User::modify_password`. MySQL's `ON UPDATE CURRENT_TIMESTAMP` column default remains as belt-and-suspenders but is load-bearing on **no** backend, so do not rely on it. Postgres has no equivalent trigger; **any new `UPDATE` you add must bump `updated_at` explicitly or it will silently go stale on Postgres.**
93+
94+
**`SELECT LAST_INSERT_ID()` vs `RETURNING id`.** For MySQL the `Article::insert` / `Page::insert` paths run an INSERT then a separate `SELECT LAST_INSERT_ID()` inside the same transaction. For Postgres they use `INSERT ... RETURNING id` via `query_scalar::<_, i32>`. Keep the pattern symmetric across backends — callers still receive an `i32` id.
95+
96+
**`INSERT IGNORE` vs `ON CONFLICT DO NOTHING`.** `User::insert` uses `INSERT IGNORE INTO users` on MySQL and `INSERT INTO users ... ON CONFLICT (username) DO NOTHING` on Postgres. Both are load-bearing for the admin bootstrap: `password_auth::generate_hash` is non-deterministic (random salt), so every boot produces a new hash. If this ever overwrote an existing row, the admin's manually changed password would be wiped on restart. **Keep the conflict-target explicit on Postgres** (`ON CONFLICT (username)`) so future additional unique indexes do not silently change behavior.
97+
98+
**Postgres bind-type pitfalls**:
99+
- `sqlx-postgres 0.8` does **not** implement `Encode<Postgres> for u32` (Postgres has no unsigned integer types). The public pagination API keeps `u32` but the internal helpers cast via `i64::from(u)` **before** multiplying to avoid any u32-level overflow, e.g. `let offset = i64::from(page.saturating_sub(1)) * i64::from(per_page);`. Do the same for any new `u32`-bearing query.
100+
- `COUNT(col)` returns `INT8` on Postgres and sqlx-postgres 0.8 is strict about type compatibility. If you materialize a COUNT result into Rust, use `i64` or cast it explicitly in SQL. `Tags::get_all_with_count` deliberately avoids returning the count at all because the count is only used for ordering.
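
The widening cast from the first bullet can be isolated into small helpers. `page_offset` and `page_limit` are hypothetical names; the expression inside `page_offset` is the one quoted above.

```rust
/// Convert a 1-based page number into an `OFFSET`, widening to `i64`
/// before multiplying so the product cannot wrap at `u32` width.
pub fn page_offset(page: u32, per_page: u32) -> i64 {
    i64::from(page.saturating_sub(1)) * i64::from(per_page)
}

/// `LIMIT` value in a type that the Postgres driver can bind.
pub fn page_limit(per_page: u32) -> i64 {
    i64::from(per_page)
}
```

For example, `page_offset(u32::MAX, 2)` yields `8589934588`, a value that does not fit in `u32`. The product of two arbitrary `u32` values can still exceed `i64::MAX` when both are near their maximum, so it is worth keeping `per_page` bounded at the call site.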
101+
102+
**Article tags are dual-stored**, and both copies must stay in sync:
103+
- `articles.tags` holds a comma-separated canonical string (used for display and for round-tripping through the editor).
104+
- The `tags` table holds normalized rows keyed by `article_id` (used by `GET /tag/{name}` via `INNER JOIN` and by `Tags::get_all_with_count`). The target schema no longer defines `created_at`/`updated_at` on `tags`; they were dead weight. Old databases that predate this cleanup may still physically have those columns until manually migrated or recreated.
105+
106+
`Article::insert` writes both; `Article::update` clears the `tags` rows via `clear_tags_{mysql,postgres}` and re-inserts them; `Article::delete` clears the `tags` rows inside the same transaction as the article delete. Any change to the CSV format (delimiter, normalization, case handling) must be applied atomically across both `Tags::insert_tags_{mysql,postgres}` helpers, `utils::sort_out_tags`, and `handler_article`'s display-time split on `,`.
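
To make the coupling concrete, the sketch below derives both stored forms from one normalized list. The normalization rules (trim, drop empties, de-duplicate, keep order) and the names `normalize_tags`, `to_csv`, and `to_rows` are assumptions for illustration; the real `utils::sort_out_tags` may behave differently.

```rust
/// Hypothetical normalization: split on `,`, trim, drop empties,
/// and de-duplicate while keeping first-seen order.
pub fn normalize_tags(raw: &str) -> Vec<String> {
    let mut seen: Vec<String> = Vec::new();
    for tag in raw.split(',') {
        let tag = tag.trim();
        if !tag.is_empty() && !seen.iter().any(|t| t.as_str() == tag) {
            seen.push(tag.to_string());
        }
    }
    seen
}

/// Canonical string for the `articles.tags` column.
pub fn to_csv(tags: &[String]) -> String {
    tags.join(",")
}

/// Rows for the `tags` table, keyed by `article_id`.
pub fn to_rows(article_id: i32, tags: &[String]) -> Vec<(i32, String)> {
    tags.iter().map(|t| (article_id, t.clone())).collect()
}
```

Deriving both representations from the same list is one way to keep them from drifting when the delimiter or normalization rules change.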
107+
108+
The admin user is bootstrapped via `User::insert`. On first boot the admin password equals the username and must be changed via `/admin/change_password`.
109+
110+
### Markdown & templates
111+
Markdown rendering uses `comrak` with a `syntect` adapter, exposed to templates as the `md_to_html` MiniJinja filter (configured in `AppState::build_env`). The syntax theme comes from `config.toml [style] code_syntax_highlight_theme`. Other custom filters: `truncate_str`, `to_lowercase`, `concat_url`, and `get_slug` (falls back to `id` when `slug` is empty — this is how permalinks keep working for articles that were created without a slug).
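
The `get_slug` fallback amounts to the rule below, shown as a plain function. `permalink_segment` is a hypothetical name and signature; the real filter is registered on the MiniJinja environment and receives template values.

```rust
/// Use the slug when present, otherwise fall back to the numeric id.
pub fn permalink_segment(slug: &str, id: i32) -> String {
    if slug.is_empty() {
        id.to_string()
    } else {
        slug.to_string()
    }
}
```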
112+
113+
### Config exposure to templates
114+
Not every field of `Config` is reachable from templates. `Config`, `Giscus`, `Analytics`, and `TwitterCard` manually implement `minijinja::value::Object` in `src/config.rs` and only whitelist specific fields. When adding a new config field that must be template-visible, update **both** the `get_value` *and* `enumerate` methods on the corresponding `Object` impl — missing `enumerate` breaks template iteration (`for`, `items()`) even when `get_value` works.
115+
116+
## Conventions
117+
118+
- Conventional Commits (`feat:`, `fix:`, `refactor:`, `docs:`, `test:`, `chore:`).
119+
- Sign off commits with `git commit -s` — don't hand-write `Signed-off-by:` lines, and never include `Co-Authored-By:`.
120+
- Use `git push --force-with-lease` rather than `--force`.
121+
122+
## Deployment
123+
124+
CI deploys to Fly.io on every push to `main` (`.github/workflows/fly-deploy.yml`). The workflow stages `DATABASE_URL` and `UMAMI_ID` as Fly secrets, then runs `fly deploy --remote-only`. The GitHub Actions secret must therefore be configured under the preferred `DATABASE_URL` name before deploys. The Fly app is configured in `fly.toml` (region `nrt`, internal port `5299`, `/ping` healthcheck, `min_machines_running = 0` with auto-suspend). Because the session store is in-process, suspensions and redeploys both log admin sessions out — acceptable for a single-admin blog, but a blocker for any future multi-tenant mode or multi-machine rollout.
125+
126+
**Database backend selection.** Split config-file fields (`username` / `password` / `host` / `port` / `database` under `[database]`) rely on `[database].backend = "mysql" | "postgres"` to decide which DSN scheme to build. If `connection_url` in the config or `DATABASE_URL` in the environment is present, that full URL wins and its own scheme still chooses the backend: `mysql://...` uses sqlx-mysql, `postgres://...` (or `postgresql://...`) uses sqlx-postgres. Legacy `[mysql]` and `MYSQL_CONNECTION_URL` are still accepted during upgrades, but treat them as compatibility paths rather than the steady-state interface.
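
The selection rules can be summarized in a small resolver. This is a sketch under stated assumptions: `SplitConfig` and `resolve_database_url` are hypothetical, the legacy `[mysql]` section is omitted, placing the legacy env var above the config `connection_url` is a reading of "env var overrides layered on top", and the DSN layout follows the `scheme://user:pass@host:port/db` shape used in the test commands.

```rust
/// Hypothetical mirror of the split `[database]` fields.
pub struct SplitConfig {
    pub backend: String,
    pub username: String,
    pub password: String,
    pub host: String,
    pub port: u16,
    pub database: String,
}

/// Resolve the final connection URL.
/// Assumed order: `DATABASE_URL`, then legacy `MYSQL_CONNECTION_URL`,
/// then the config `connection_url`, then a DSN built from split fields.
pub fn resolve_database_url(
    env_database_url: Option<&str>,
    env_mysql_connection_url: Option<&str>,
    config_connection_url: Option<&str>,
    split: &SplitConfig,
) -> Option<String> {
    // A full URL is used as-is; its own scheme picks the backend later.
    if let Some(url) = env_database_url
        .or(env_mysql_connection_url)
        .or(config_connection_url)
    {
        return Some(url.to_string());
    }
    // Only the split fields consult `backend` to choose the scheme.
    let scheme = match split.backend.as_str() {
        "mysql" => "mysql",
        "postgres" => "postgres",
        _ => return None,
    };
    // No percent-encoding of credentials is attempted in this sketch.
    Some(format!(
        "{}://{}:{}@{}:{}/{}",
        scheme, split.username, split.password, split.host, split.port, split.database
    ))
}
```

With no full URL supplied and `backend = "postgres"`, the function builds a `postgres://` DSN from the split fields; as soon as any full URL is present, `backend` is ignored.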
