Problem
mship export <task> produces a bundle of task artifacts (journal, plan, spec, state, diffs) that users often want to share externally — with coworkers, in bug reports, with AI services, in hiring portfolios, in talks. Today there is no pre-flight redaction, which makes "share this task" an implicit audit of the user's secret hygiene. Concretely, an exported bundle can carry:
- API keys / bearer tokens pasted into journal messages or hypothesis claims
.env-style key=value lines copied into evidence blocks
- SSH / PGP private keys if accidentally staged then committed before redact
- GitHub PATs, AWS access keys, Stripe-style
sk_live_ keys
- Plain passwords in quoted error output
- Internal client / customer names the user doesn't want in a public portfolio link
The current workaround ("grep the bundle before sharing") is manual and error-prone — exactly the kind of thing a substrate should handle at the boundary.
Proposal
Add a minimal --redacted flag to mship export:
mship export <task> --redacted [--format zip|dir] [--include diagnostics]
Redact using documented deterministic regex patterns only (v1 scope — no heuristics, no ML classifier, no entropy-based detection):
sk_live_[a-zA-Z0-9]+ / sk_test_[a-zA-Z0-9]+ (Stripe-style)
ghp_[A-Za-z0-9]{36} / gho_ / ghu_ / ghs_ (GitHub tokens)
AKIA[0-9A-Z]{16} (AWS access key IDs) + nearby aws_secret_access_key values
-----BEGIN [A-Z ]+PRIVATE KEY----- … -----END …----- blocks
Bearer [A-Za-z0-9._\-]+ in journal lines / evidence
.env-style lines matching (?i)(API_KEY|SECRET|PASSWORD|TOKEN|CREDENTIAL)=\S+
- Optional user-configured list of client/customer name strings (one pattern per line in
~/.config/mship/redact.patterns or mothership.yaml#redact.patterns)
Redaction replaces the match with <REDACTED:kind> (e.g. <REDACTED:github-token>) so the shape of the artifact remains legible.
Scope cuts (explicit anti-goals for v1)
- No heavyweight classifier. No ML, no trufflehog-style entropy scoring, no AST-aware parsing. Just documented regex patterns.
- No interactive review. Redaction is deterministic and non-interactive — if you want to review, diff the redacted bundle against the unredacted one yourself.
- No partial redaction modes. Either
--redacted (all documented patterns apply) or not. No --redact github,aws pick-list in v1.
- No automatic redaction of unflagged exports. Explicit opt-in; we do not want to silently mutate artifacts a user expected to be faithful.
- No cross-task redaction history / audit log. One export, one pass, done.
Why "minimal responsible-defaults" and not a big redaction engine
The substrate thesis says: mship owns the hand-off boundary; heavyweight semantic analysis belongs in downstream tools. This issue is the smallest useful thing that prevents the common "oops I shared my key" failure, without pretending to be a DLP product. Documented patterns mean users know exactly what is and isn't caught — no false security.
Related
Problem
mship export <task>produces a bundle of task artifacts (journal, plan, spec, state, diffs) that users often want to share externally — with coworkers, in bug reports, with AI services, in hiring portfolios, in talks. Today there is no pre-flight redaction, which makes "share this task" an implicit audit of the user's secret hygiene. Concretely, an exported bundle can carry:.env-style key=value lines copied into evidence blockssk_live_keysThe current workaround ("grep the bundle before sharing") is manual and error-prone — exactly the kind of thing a substrate should handle at the boundary.
Proposal
Add a minimal
--redactedflag tomship export:Redact using documented deterministic regex patterns only (v1 scope — no heuristics, no ML classifier, no entropy-based detection):
sk_live_[a-zA-Z0-9]+/sk_test_[a-zA-Z0-9]+(Stripe-style)ghp_[A-Za-z0-9]{36}/gho_/ghu_/ghs_(GitHub tokens)AKIA[0-9A-Z]{16}(AWS access key IDs) + nearbyaws_secret_access_keyvalues-----BEGIN [A-Z ]+PRIVATE KEY-----…-----END …-----blocksBearer [A-Za-z0-9._\-]+in journal lines / evidence.env-style lines matching(?i)(API_KEY|SECRET|PASSWORD|TOKEN|CREDENTIAL)=\S+~/.config/mship/redact.patternsormothership.yaml#redact.patterns)Redaction replaces the match with
<REDACTED:kind>(e.g.<REDACTED:github-token>) so the shape of the artifact remains legible.Scope cuts (explicit anti-goals for v1)
--redacted(all documented patterns apply) or not. No--redact github,awspick-list in v1.Why "minimal responsible-defaults" and not a big redaction engine
The substrate thesis says: mship owns the hand-off boundary; heavyweight semantic analysis belongs in downstream tools. This issue is the smallest useful thing that prevents the common "oops I shared my key" failure, without pretending to be a DLP product. Documented patterns mean users know exactly what is and isn't caught — no false security.
Related
mship exportproduces the bundle that--redactedfiltersgh pr viewper task) #41-style aggregate views and Workflow: post-finish changes (reviewer feedback, last-mile fixes) #29-style journal-as-truth