Commit 3b9c719

fix(docker): hardening phase 3 — closes #1551
Implements the seven follow-up items tracked in #1551 after the phase 1+2 docker hardening pass. Skips item 4 (Better Auth user-not-found log demotion), which needs runtime reproduction first.

Compose hardening (docker/docker-compose.yml + docker/Dockerfile):
- /tmp tmpfs capped at 64m so misbehaving uploads/streams can't grow unbounded into container memory
- pids_limit raised 256 → 512 (Node + Better Auth + proxy fanout routinely burst past 100 pids)
- healthcheck switched from BusyBox wget to node -e fetch, which is guaranteed available in the runtime image
- node_modules *.md cleanup narrowed to README* only so packages reading nested markdown at runtime (e.g. js-yaml schema docs) keep working

Public stats opt-in:
- /api/v1/public/{usage,free-models,provider-tokens} now require MANIFEST_PUBLIC_STATS=true to serve. Default off, returns 404 so unauthenticated probes can't even tell the endpoints exist.

Honest HTTP statuses for tool callers:
- proxy-exception.filter and the proxy.controller catch block detect chat clients via body.stream === true OR Accept: text/event-stream
- Non-chat clients (curl/CI/monitors/non-streaming SDKs) now get real 401/400/500 with a structured { error: { message, type, code? } } envelope instead of a friendly HTTP-200 stub
- Chat clients continue to receive the friendly envelope

og: tag personalization:
- New rewriteOgTags helper rewrites the SPA index.html's hardcoded app.manifest.build to BETTER_AUTH_URL at boot time. SpaFallbackFilter performs the substitution once when caching index.html. Self-hosters' shared link previews now show their own URL.

/api/v1/messages status filter:
- New status query parameter accepting ok | error | rate_limited | fallback_error | errors (where 'errors' expands to the union of the three error variants). Lets the dashboard build an errors-only toggle.
- Count cache key extended so different statuses don't share counts.

CI smoke test:
- New .github/workflows/docker-smoke.yml builds the production image, boots the compose stack with read_only: true, waits for /api/v1/health, asserts SPA loads + public stats are gated off, then tears down. Catches regressions from any future code that silently writes to disk.
1 parent f4b3abe commit 3b9c719

23 files changed: +675 -51 lines changed
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+"manifest": patch
+---
+
+Docker hardening Phase 3 follow-ups: add a 64MB cap to the manifest container's `/tmp` tmpfs, raise `pids_limit` from 256 to 512, and switch the healthcheck from BusyBox `wget` to a `node -e fetch(...)` invocation that's guaranteed to exist in the runtime image. Narrow the Dockerfile's `node_modules` `*.md` cleanup to `README*` only so packages that read nested markdown at runtime (e.g. `js-yaml` schema docs) keep working. Gate `/api/v1/public/{usage,free-models,provider-tokens}` behind `MANIFEST_PUBLIC_STATS=true` (default off, returns 404) so self-hosted instances don't leak aggregate stats to unauthenticated callers. Detect non-chat callers in the proxy exception filter and the `chat/completions` catch block via `body.stream === true` / `Accept: text/event-stream`; non-chat clients now receive real `401`/`400`/`500` HTTP statuses with a structured error envelope while chat UIs continue to get the friendly HTTP-200 envelope. Rewrite `og:url` / `og:image` in the SPA's `index.html` from `BETTER_AUTH_URL` at boot so self-hosters' shared link previews show their own URL instead of `app.manifest.build`. Add a `status` query parameter to `/api/v1/messages` (`ok`, `error`, `rate_limited`, `fallback_error`, or `errors` for the union of the three error variants) so the dashboard can offer an "errors only" toggle. Add `.github/workflows/docker-smoke.yml` that boots the production compose stack with `read_only: true`, waits for `/api/v1/health`, and tears down, guarding against future code that silently writes to disk.
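The chat-client detection described in this changeset could look roughly like the sketch below. It is an illustration only, not the code in the proxy exception filter: `ProxyRequestLike`, `isChatClient`, and `resolveHttpStatus` are hypothetical names, and matching `text/event-stream` as a substring of the `Accept` header is an assumption.

```typescript
// Hypothetical, reduced request shape; the real filter reads from the framework request.
interface ProxyRequestLike {
  body?: { stream?: unknown };
  headers: Record<string, string | string[] | undefined>;
}

// A caller counts as a chat client when it asked for a streamed response
// (body.stream === true) or advertises that it accepts server-sent events.
function isChatClient(req: ProxyRequestLike): boolean {
  if (req.body?.stream === true) return true;
  const accept = req.headers['accept'];
  const value = Array.isArray(accept) ? accept.join(',') : accept ?? '';
  return value.toLowerCase().includes('text/event-stream');
}

// Chat clients keep the friendly HTTP-200 envelope; everyone else gets the real status.
function resolveHttpStatus(req: ProxyRequestLike, realStatus: number): number {
  return isChatClient(req) ? 200 : realStatus;
}
```

Under these assumptions a plain `curl` call with no `Accept` header and a non-streaming body would see the real `401`, while a streaming chat UI would still see `200`.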
Lines changed: 123 additions & 0 deletions
@@ -0,0 +1,123 @@
+name: Docker Smoke
+
+# Boots the production compose stack with `read_only: true` and verifies the
+# app comes up cleanly and answers /api/v1/health. Catches regressions from
+# code that silently writes outside /tmp at boot, which would crash users on
+# the locked-down default deployment.
+
+on:
+  pull_request:
+    branches: [main]
+    paths:
+      - ".github/workflows/docker-smoke.yml"
+      - "docker/Dockerfile"
+      - ".dockerignore"
+      - "docker/docker-compose.yml"
+      - "docker/.env.example"
+      - "docker/install.sh"
+      - "packages/backend/**"
+      - "packages/frontend/**"
+      - "packages/shared/**"
+      - "package.json"
+      - "package-lock.json"
+      - "turbo.json"
+  workflow_dispatch:
+
+permissions:
+  contents: read
+
+jobs:
+  smoke:
+    name: Compose smoke test (read-only)
+    runs-on: ubuntu-latest
+    timeout-minutes: 15
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: docker/setup-buildx-action@v3
+
+      - name: Build local image as manifestdotbuild/manifest:latest
+        uses: docker/build-push-action@v6
+        with:
+          context: .
+          file: docker/Dockerfile
+          push: false
+          load: true
+          tags: manifestdotbuild/manifest:latest
+          platforms: linux/amd64
+          cache-from: type=gha,scope=smoke-amd64
+          cache-to: type=gha,mode=max,scope=smoke-amd64
+
+      - name: Prepare .env from .env.example
+        working-directory: docker
+        run: |
+          cp .env.example .env
+          # Generate a real secret so BETTER_AUTH_SECRET=${VAR:?…} succeeds.
+          secret=$(openssl rand -hex 32)
+          # Replace the empty `BETTER_AUTH_SECRET=` line.
+          new=""
+          while IFS= read -r line || [[ -n "$line" ]]; do
+            if [[ "$line" == "BETTER_AUTH_SECRET=" ]]; then
+              new+="BETTER_AUTH_SECRET=${secret}"$'\n'
+            else
+              new+="$line"$'\n'
+            fi
+          done < .env
+          printf '%s' "$new" > .env
+
+      - name: Boot the stack
+        working-directory: docker
+        run: |
+          docker compose up -d
+          docker compose ps
+
+      - name: Wait for /api/v1/health
+        run: |
+          set -e
+          for i in $(seq 1 60); do
+            if curl -sSf http://127.0.0.1:3001/api/v1/health >/dev/null; then
+              echo "healthy after ${i}s"
+              exit 0
+            fi
+            sleep 2
+          done
+          echo "health never returned 200" >&2
+          (cd docker && docker compose logs manifest)
+          exit 1
+
+      - name: Verify health payload shape
+        run: |
+          response=$(curl -sS http://127.0.0.1:3001/api/v1/health)
+          echo "$response"
+          echo "$response" | grep -q '"status":"healthy"'
+
+      - name: Verify SPA index loads
+        run: |
+          curl -sSf http://127.0.0.1:3001/ -o /tmp/index.html
+          test -s /tmp/index.html
+          grep -q '<title>Manifest</title>' /tmp/index.html
+
+      - name: Verify public stats are gated off by default
+        run: |
+          status=$(curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:3001/api/v1/public/usage)
+          if [[ "$status" != "404" ]]; then
+            echo "expected 404 for /public/usage when MANIFEST_PUBLIC_STATS is unset, got $status" >&2
+            exit 1
+          fi
+
+      - name: Dump backend logs on success (for diff inspection)
+        if: success()
+        working-directory: docker
+        run: docker compose logs --tail=200 manifest
+
+      - name: Dump backend logs on failure
+        if: failure()
+        working-directory: docker
+        run: |
+          docker compose logs manifest || true
+          docker compose logs postgres || true
+
+      - name: Tear down
+        if: always()
+        working-directory: docker
+        run: docker compose down -v
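The workflow's "public stats are gated off by default" step expects a 404 while `MANIFEST_PUBLIC_STATS` is unset. The gate itself is not part of this excerpt; a minimal sketch of the behaviour the commit message describes might look like this. `publicStatsEnabled` and `publicStatsStatus` are hypothetical names, and treating only the exact string `true` as enabled is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: only the literal string "true" opts in; unset or any other value stays off.
function publicStatsEnabled(env: Env): boolean {
  return env['MANIFEST_PUBLIC_STATS'] === 'true';
}

// Status a public stats route would answer with: 404 hides the endpoint entirely
// from unauthenticated probes, 200 serves the aggregate stats.
function publicStatsStatus(env: Env): 200 | 404 {
  return publicStatsEnabled(env) ? 200 : 404;
}
```

Passing the environment in as a parameter, rather than reading a global, keeps the check easy to exercise in a unit test.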

docker/Dockerfile

Lines changed: 2 additions & 2 deletions
@@ -30,7 +30,7 @@ RUN --mount=type=cache,target=/root/.npm npm ci --omit=dev --ignore-scripts && \
     find . -path "*/node_modules/vite-node" -type d -exec rm -rf {} + 2>/dev/null; \
     find . -path "*/node_modules/rollup" -type d -exec rm -rf {} + 2>/dev/null; \
     find . -path "*/node_modules/@vitest" -type d -exec rm -rf {} + 2>/dev/null; \
-    find . -path "*/node_modules/*" \( -name "*.map" -o -name "*.md" \
+    find . -path "*/node_modules/*" \( -name "*.map" -o -name "README*" \
       -o -name "LICENSE*" -o -name "CHANGELOG*" -o -name "*.txt" \
       -o -name "Makefile" -o -name "*.gyp" -o -name "*.gypi" \
       -o \( -name "*.ts" ! -name "*.d.ts" \) \) -delete 2>/dev/null; \
@@ -68,7 +68,7 @@ COPY packages/backend/package.json packages/backend/
 EXPOSE 3001

 HEALTHCHECK --interval=30s --timeout=5s --start-period=45s --retries=3 \
-  CMD wget -qO- http://127.0.0.1:3001/api/v1/health || exit 1
+  CMD node -e "fetch('http://127.0.0.1:3001/api/v1/health').then(r=>process.exit(r.ok?0:1)).catch(()=>process.exit(1))"

 USER node

docker/docker-compose.yml

Lines changed: 7 additions & 3 deletions
@@ -51,20 +51,24 @@ services:
       postgres:
         condition: service_healthy
     healthcheck:
-      test: ["CMD-SHELL", "wget -qO- http://127.0.0.1:3001/api/v1/health || exit 1"]
+      test:
+        - "CMD"
+        - "node"
+        - "-e"
+        - "fetch('http://127.0.0.1:3001/api/v1/health').then(r=>process.exit(r.ok?0:1)).catch(()=>process.exit(1))"
       interval: 30s
       timeout: 5s
       start_period: 45s
       retries: 3
     read_only: true
     tmpfs:
-      - /tmp
+      - /tmp:size=64m
     security_opt:
       - no-new-privileges:true
     cap_drop:
       - ALL
     mem_limit: 1g
-    pids_limit: 256
+    pids_limit: 512
     networks:
       - internal
       - frontend

packages/backend/src/analytics/controllers/messages.controller.ts

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ export class MessagesController {
       limit: Math.min(query.limit ?? 50, 200),
       cursor: query.cursor,
       agent_name: query.agent_name,
+      status: query.status,
     });
   }

packages/backend/src/analytics/dto/messages-query.dto.spec.ts

Lines changed: 16 additions & 0 deletions
@@ -55,4 +55,20 @@ describe('MessagesQueryDto', () => {
     const errors = await validate(dto);
     expect(errors.length).toBeGreaterThan(0);
   });
+
+  it('accepts each known status value', async () => {
+    for (const status of ['ok', 'error', 'rate_limited', 'fallback_error', 'errors']) {
+      const dto = plainToInstance(MessagesQueryDto, { status });
+      const errors = await validate(dto);
+      expect(errors).toHaveLength(0);
+    }
+  });
+
+  it('rejects an unknown status value', async () => {
+    const dto = plainToInstance(MessagesQueryDto, { status: 'pending' });
+    const errors = await validate(dto);
+    expect(errors.length).toBeGreaterThan(0);
+    const flat = errors.flatMap((e) => Object.values(e.constraints ?? {}));
+    expect(flat.join('\n')).toMatch(/status must be one of/);
+  });
 });

packages/backend/src/analytics/dto/messages-query.dto.ts

Lines changed: 16 additions & 1 deletion
@@ -1,5 +1,14 @@
 import { Type } from 'class-transformer';
-import { IsNumber, IsOptional, IsString, Max, Min } from 'class-validator';
+import { IsIn, IsNumber, IsOptional, IsString, Max, Min } from 'class-validator';
+
+export const MESSAGE_STATUS_FILTER_VALUES = [
+  'ok',
+  'error',
+  'rate_limited',
+  'fallback_error',
+  'errors',
+] as const;
+export type MessageStatusFilter = (typeof MESSAGE_STATUS_FILTER_VALUES)[number];

 export class MessagesQueryDto {
   @IsOptional()
@@ -40,4 +49,10 @@ export class MessagesQueryDto {
   @IsOptional()
   @IsString()
   agent_name?: string;
+
+  @IsOptional()
+  @IsIn(MESSAGE_STATUS_FILTER_VALUES, {
+    message: `status must be one of: ${MESSAGE_STATUS_FILTER_VALUES.join(', ')}`,
+  })
+  status?: MessageStatusFilter;
 }
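To see what the `@IsOptional()` + `@IsIn(...)` pair accepts without pulling in `class-validator`, the same rule can be approximated in plain TypeScript. This is a sketch for illustration, not a replacement for the DTO: `isMessageStatusFilter` and `validateStatus` are hypothetical helpers, and only the value list and the error message text are taken from the diff.

```typescript
const MESSAGE_STATUS_FILTER_VALUES = [
  'ok',
  'error',
  'rate_limited',
  'fallback_error',
  'errors',
] as const;
type MessageStatusFilter = (typeof MESSAGE_STATUS_FILTER_VALUES)[number];

// Type guard: narrows an arbitrary query-string value to the allowed union.
function isMessageStatusFilter(value: unknown): value is MessageStatusFilter {
  return (
    typeof value === 'string' &&
    (MESSAGE_STATUS_FILTER_VALUES as readonly string[]).indexOf(value) !== -1
  );
}

// Returns null when the value is acceptable, otherwise the validation message.
// A missing value passes, mirroring the optional field on the DTO.
function validateStatus(value: unknown): string | null {
  if (value === undefined || value === null) return null;
  return isMessageStatusFilter(value)
    ? null
    : `status must be one of: ${MESSAGE_STATUS_FILTER_VALUES.join(', ')}`;
}
```

For example, `validateStatus('pending')` yields the "status must be one of" message, while `validateStatus('errors')` and `validateStatus(undefined)` both pass.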

packages/backend/src/analytics/services/messages-query.service.ts

Lines changed: 12 additions & 0 deletions
@@ -5,6 +5,9 @@ import { AgentMessage } from '../../entities/agent-message.entity';
 import { rangeToInterval } from '../../common/utils/range.util';
 import { addTenantFilter, formatTimestamp } from './query-helpers';
 import { TenantCacheService } from '../../common/services/tenant-cache.service';
+import type { MessageStatusFilter } from '../dto/messages-query.dto';
+
+const ERROR_STATUSES = ['error', 'fallback_error', 'rate_limited'] as const;
 import {
   DbDialect,
   detectDialect,
@@ -50,6 +53,7 @@ export class MessagesQueryService {
     limit: number;
     cursor?: string;
     agent_name?: string;
+    status?: MessageStatusFilter;
   }) {
     const tenantId = (await this.tenantCache.resolve(params.userId)) ?? undefined;
     const cutoff = params.range ? computeCutoff(rangeToInterval(params.range)) : undefined;
@@ -70,6 +74,12 @@ export class MessagesQueryService {
     if (params.agent_name)
       baseQb.andWhere('at.agent_name = :filterAgent', { filterAgent: params.agent_name });

+    if (params.status === 'errors') {
+      baseQb.andWhere('at.status IN (:...errorStatuses)', { errorStatuses: ERROR_STATUSES });
+    } else if (params.status) {
+      baseQb.andWhere('at.status = :statusFilter', { statusFilter: params.status });
+    }
+
     // Provider filter: prefer the stored provider column (populated by the
     // proxy from routing resolution), and fall back to inference for legacy
     // rows that pre-date the column.
@@ -262,6 +272,7 @@ export class MessagesQueryService {
     agent_name?: string;
     cost_min?: number;
     cost_max?: number;
+    status?: MessageStatusFilter;
   }): string {
     return [
       params.userId,
@@ -271,6 +282,7 @@ export class MessagesQueryService {
       params.agent_name ?? '',
       params.cost_min ?? '',
       params.cost_max ?? '',
+      params.status ?? '',
     ].join(':');
   }
 }
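The two behaviours this diff adds, expanding `errors` into the three concrete error statuses and including the status in the count cache key, can be isolated in a small standalone sketch. `expandStatusFilter` and `countCacheKey` are hypothetical helpers, and the cache key here is deliberately reduced to three fields; the real key joins more parameters.

```typescript
type MessageStatusFilter = 'ok' | 'error' | 'rate_limited' | 'fallback_error' | 'errors';

const ERROR_STATUSES = ['error', 'fallback_error', 'rate_limited'] as const;

// Resolve the filter to the concrete status values to match;
// undefined means "no status filter at all".
function expandStatusFilter(status?: MessageStatusFilter): readonly string[] | undefined {
  if (!status) return undefined;
  return status === 'errors' ? ERROR_STATUSES : [status];
}

// Reduced cache key for illustration: without the status component, a count
// cached for "ok" would be served back for "errors" on the same user and agent.
function countCacheKey(userId: string, agentName?: string, status?: MessageStatusFilter): string {
  return [userId, agentName ?? '', status ?? ''].join(':');
}
```

With this shape, `countCacheKey('u1', 'bot', 'ok')` and `countCacheKey('u1', 'bot', 'errors')` produce different keys, which is the property the extended cache key is meant to guarantee.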

packages/backend/src/common/filters/spa-fallback.filter.spec.ts

Lines changed: 44 additions & 5 deletions
@@ -4,9 +4,14 @@ jest.mock('../utils/frontend-path', () => ({
   resolveFrontendDir: jest.fn(),
 }));

+const SAMPLE_HTML =
+  '<html><head><meta property="og:url" content="https://app.manifest.build" />' +
+  '<meta property="og:image" content="https://app.manifest.build/og-image.png" /></head>' +
+  '<body>SPA</body></html>';
+
 jest.mock('fs', () => ({
   ...jest.requireActual('fs'),
-  readFileSync: jest.fn().mockReturnValue('<html><body>SPA</body></html>'),
+  readFileSync: jest.fn().mockReturnValue(SAMPLE_HTML),
 }));

 import { resolveFrontendDir } from '../utils/frontend-path';
@@ -38,19 +43,19 @@ describe('SpaFallbackFilter', () => {
   const exception = new NotFoundException();

   // Must re-import after mock setup to pick up the mocked module
-  function loadFilter() {
+  function loadFilter(betterAuthUrl?: string) {
     jest.resetModules();
     jest.mock('../utils/frontend-path', () => ({
       resolveFrontendDir: mockResolveFrontendDir,
     }));
     jest.mock('fs', () => ({
       ...jest.requireActual('fs'),
-      readFileSync: jest.fn().mockReturnValue('<html><body>SPA</body></html>'),
+      readFileSync: jest.fn().mockReturnValue(SAMPLE_HTML),
     }));
     const { SpaFallbackFilter } =
       // eslint-disable-next-line @typescript-eslint/no-require-imports
       require('./spa-fallback.filter') as typeof import('./spa-fallback.filter');
-    return new SpaFallbackFilter();
+    return new SpaFallbackFilter(betterAuthUrl);
   }

   describe('when index.html exists', () => {
@@ -67,7 +72,7 @@ describe('SpaFallbackFilter', () => {
       expect(res.setHeader).toHaveBeenCalledWith('Content-Type', 'text/html');
       expect(res.setHeader).toHaveBeenCalledWith('Cache-Control', 'no-cache');
       expect(res.status).toHaveBeenCalledWith(200);
-      expect(res.send).toHaveBeenCalledWith('<html><body>SPA</body></html>');
+      expect(res.send).toHaveBeenCalledWith(SAMPLE_HTML);
     });

     it('returns JSON 404 for GET to /api/ routes', () => {
@@ -99,6 +104,40 @@ describe('SpaFallbackFilter', () => {
     });
   });

+  describe('og tag rewriting', () => {
+    beforeEach(() => {
+      mockResolveFrontendDir.mockReturnValue('/mock/frontend');
+    });
+
+    it('rewrites og: tags when BETTER_AUTH_URL is provided', () => {
+      const filter = loadFilter('https://manifest.example.com');
+      const { host, res } = createMockHost('GET', '/');
+      filter.catch(exception, host);
+      const sent = (res.send as jest.Mock).mock.calls[0][0] as string;
+      expect(sent).toContain('content="https://manifest.example.com"');
+      expect(sent).toContain('content="https://manifest.example.com/og-image.png"');
+      expect(sent).not.toContain('https://app.manifest.build');
+    });
+
+    it('leaves og: tags alone when BETTER_AUTH_URL is empty', () => {
+      const filter = loadFilter('');
+      const { host, res } = createMockHost('GET', '/');
+      filter.catch(exception, host);
+      const sent = (res.send as jest.Mock).mock.calls[0][0] as string;
+      expect(sent).toContain('content="https://app.manifest.build"');
+    });
+
+    it('falls back to process.env when no constructor arg is provided', () => {
+      process.env['BETTER_AUTH_URL'] = 'https://from-env.example';
+      const filter = loadFilter();
+      const { host, res } = createMockHost('GET', '/');
+      filter.catch(exception, host);
+      const sent = (res.send as jest.Mock).mock.calls[0][0] as string;
+      expect(sent).toContain('content="https://from-env.example"');
+      delete process.env['BETTER_AUTH_URL'];
+    });
+  });
+
   describe('when index.html does not exist', () => {
     let filter: ReturnType<typeof loadFilter>;

packages/backend/src/common/filters/spa-fallback.filter.ts

Lines changed: 5 additions & 2 deletions
@@ -3,16 +3,19 @@ import { Response, Request } from 'express';
 import { join } from 'path';
 import { readFileSync } from 'fs';
 import { resolveFrontendDir } from '../utils/frontend-path';
+import { rewriteOgTags } from '../utils/og-rewrite';

 const API_PREFIXES = ['/api/', '/otlp/', '/v1/'];

 @Catch(NotFoundException)
 export class SpaFallbackFilter implements ExceptionFilter {
   private readonly indexContent: string | null;

-  constructor() {
+  constructor(betterAuthUrl?: string) {
     const frontendDir = resolveFrontendDir();
-    this.indexContent = frontendDir ? readFileSync(join(frontendDir, 'index.html'), 'utf-8') : null;
+    const raw = frontendDir ? readFileSync(join(frontendDir, 'index.html'), 'utf-8') : null;
+    const baseUrl = betterAuthUrl ?? process.env['BETTER_AUTH_URL'] ?? '';
+    this.indexContent = raw ? rewriteOgTags(raw, baseUrl) : null;
   }

   catch(exception: NotFoundException, host: ArgumentsHost) {
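The `rewriteOgTags` helper imported above lives in `../utils/og-rewrite`, which is not shown in this excerpt. A guess at a compatible implementation, written only from the behaviour the spec file asserts, might look like the sketch below. The name `rewriteOgTagsSketch`, the regular expression, and the trailing-slash trimming are all assumptions, not the actual helper.

```typescript
// Assumed shape of the tags in index.html:
//   <meta property="og:url" content="https://app.manifest.build" />
//   <meta property="og:image" content="https://app.manifest.build/og-image.png" />
const OG_TAG_PATTERN =
  /(<meta\s+property="og:(?:url|image)"\s+content=")https:\/\/app\.manifest\.build/g;

// Swap the hardcoded origin inside og:url / og:image for the self-hoster's base URL.
// An empty base URL leaves the HTML untouched, matching the "leaves og: tags alone" test.
function rewriteOgTagsSketch(html: string, baseUrl: string): string {
  const origin = baseUrl.trim().replace(/\/+$/, ''); // assumption: drop trailing slashes
  if (!origin) return html;
  // A replacer function avoids `$` sequences in the URL being read as replacement patterns.
  return html.replace(OG_TAG_PATTERN, (_match: string, prefix: string) => prefix + origin);
}
```

Because the filter computes `indexContent` once in its constructor, a helper like this would run a single time at boot rather than on every request.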
