Commit d653cfd

Describe translations (#8179)
* Describe yaml translation
* Improve script
* Improve script related stuff
* Fix stupid prompt escaping
* Adjust script
* Add description to copy
1 parent 2523316 commit d653cfd

437 files changed

Lines changed: 5695 additions & 66 deletions

app/javascript/i18n/describe-keys/buildPrompt.ts

Lines changed: 45 additions & 21 deletions
@@ -2,39 +2,63 @@ export function buildPrompt(batchContent: string): string {
   return `
 You are an assistant that extracts i18n metadata from TSX React component files.

-Input format:
-- Each file appears twice in the batch:
-  - OLD: before i18n extraction, with literal user-facing text (or "[not found]").
-  - NEW: after i18n extraction, with calls like t("...") or t('...').
+Scope:
+- ONLY process files/sections that contain a \`t("...")\` or \`t('...')\` call in the NEW code.
+- If a batch has no \`t(...)\` calls in any NEW section, respond with an empty JSON object: {}.
+- Ignore <Trans i18nKey="..."> entirely for this task.

-Task:
-- For every t("...") / t('...') key found in a NEW section, output exactly one object with:
-  - "key": the exact key string as written in the code (copy verbatim).
-  - "desc": a concise description (1-3 sentences) of what the string represents in the UI, grounded by the OLD text if available.
+Goal:
+- Output a single JSON OBJECT whose properties follow this exact format:
+  "<EXACT i18n key from the code>": "<multi-line description>"

-Style rules for "desc":
-- Each sentence must begin with "This is ...".
-- Prefer precise UI nouns: "heading", "button label", "menu item", "tooltip", "helper text".
-- Avoid filler like "the text for a button"; be specific and succinct.
+Key rules (CRITICAL):
+- Use the EXACT key string as written inside \`t('...')\` or \`t("...")\`.
+- Do NOT transform or infer namespaces.
+- Do NOT add, remove, or modify leading dots, prefixes, or suffixes (e.g., keep ".heading", keep "_html").
+- Do NOT deduce or prepend any namespace — the property name must match the code verbatim.
+
+What to extract:
+- Scan only the NEW sections to find all \`t("...")\` / \`t('...')\` usages.
+- For every discovered key, create exactly one entry in the output JSON:
+  - Property name: the exact key string from the code.
+  - Property value: a single multi-line string with EXACTLY these fields, in this order,
+    each on its own line starting with a bold label:
+    **Functional Purpose**: <short, specific purpose in the UI>
+    **UI Location**: <precise place in the UI hierarchy (e.g., "Settings → General → Header")>
+    **When Users See This**: <concise trigger/context>
+    **Technical Context**: <only relevant technical notes; list variables exactly and state they must remain unchanged>
+    **Current English**: "<English text from OLD if available; else empty quotes>"
+
+Grounding & variables:
+- Use OLD text and nearby JSX to keep descriptions specific.
+- If placeholders/variables appear (e.g., \`%{name}\`, \`{{count}}\`, \`{value}\`), list them under **Technical Context** EXACTLY as written and say "must remain unchanged".
+- Be brief; do not over-explain obvious UI strings.
+- Do not invent content not supported by OLD/NEW.
+
+Deduplication:
+- If the same exact key appears multiple times, include it once; the last occurrence wins.

 Output rules:
-- Output MUST be a single JSON array of objects. Do not return NDJSON, prose, or code fences.
-- Preserve suffixes like "_html" in keys.
-- Include ONLY keys that appear in NEW sections.
-- Do not duplicate keys.
+- Output MUST be a single JSON object (not an array). No prose, comments, or code fences.
+- Include ONLY keys found in NEW sections via \`t(...)\`.
+- If no \`t(...)\` keys are found, output \`{}\`.

-Example:
+Example (conceptual):
 OLD:
 <h1>General settings</h1>
 NEW:
 <h1>{t('.heading')}</h1>
+
 Output:
-[
-  {"key":".heading","desc":"This is the main heading for the general settings page."}
-]
+{
+  ".heading": "**Functional Purpose**: Page heading for General settings\\n**UI Location**: Settings → General (page header)\\n**When Users See This**: On opening the General settings page\\n**Technical Context**: Standard text; no special formatting\\n**Current English**: \\"General settings\\""
+}
+
+Respond with a single JSON object only. Do not include code fences, comments, or extra text.

 File batch content:
 ---
 ${batchContent}
----`.trim()
+---
+`.trim()
 }
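The new prompt asks for each property value to be a multi-line string of five bold-labelled fields. As a rough illustration of how a consumer might read that format back, the sketch below splits one description into a label-to-value map. `parseDescription` and `DescriptionFields` are hypothetical names that do not appear in the commit, and the sketch assumes every field sits on its own line, as the prompt requests.

```typescript
// Hypothetical helper (not part of the commit): turns one description
// string into a { label: value } map.
type DescriptionFields = Record<string, string>

// Matches lines of the form "**Label**: value".
const FIELD_RE = /^\*\*(.+?)\*\*:\s*(.*)$/

function parseDescription(desc: string): DescriptionFields {
  const fields: DescriptionFields = {}
  for (const line of desc.split('\n')) {
    const match = line.match(FIELD_RE)
    // Lines that do not follow the "**Label**: value" shape are ignored.
    if (match) fields[match[1]] = match[2]
  }
  return fields
}

// The example value from the prompt, after JSON decoding.
const example =
  '**Functional Purpose**: Page heading for General settings\n' +
  '**UI Location**: Settings → General (page header)\n' +
  '**When Users See This**: On opening the General settings page\n' +
  '**Technical Context**: Standard text; no special formatting\n' +
  '**Current English**: "General settings"'

const fields = parseDescription(example)
console.log(fields['UI Location'])
```

A flat map keeps the sketch short; a stricter consumer could instead check that all five labels are present and in the order the prompt specifies.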

app/javascript/i18n/describe-keys/index.ts

Lines changed: 52 additions & 16 deletions
@@ -5,49 +5,85 @@ import { promisify } from 'node:util'
 import { buildPrompt } from './buildPrompt'
 import { runLLM } from '../extract-jsx-copy/runLLM'
 import { createBatches } from './createBatches'
+import { parseLLMOutput } from './parseLLMOutput'

 export const execFileAsync = promisify(execFile)

 const OUTPUT_DIR = process.env.OUTPUT_DIR || './i18n-descriptions'
+const DEBUG_DIR = process.env.DEBUG_DIR || './i18n-debug'

-const parseLLMOutput = (output: string) => {
-  if (output.trim().startsWith('[')) {
-    return JSON.parse(output)
-  } else {
-    return output
-      .split('\n')
-      .map((l) => l.trim())
-      .filter(Boolean)
-      .map((l) => JSON.parse(l))
-  }
-}
+const DEFAULT_COMMIT_SHA = 'ccaebe4d435f235be6e624b72e9a4e1c841c7520'

-async function writeBatchJson(batchIndex: number, data: string) {
+async function writeBatchJson(batchIndex: number, data: unknown) {
   await fs.mkdir(OUTPUT_DIR, { recursive: true })
   const fileName = `batch-${String(batchIndex + 1).padStart(3, '0')}.json`
   const outPath = path.join(OUTPUT_DIR, fileName)
   await fs.writeFile(outPath, JSON.stringify(data, null, 2), 'utf8')
   return outPath
 }

+async function writeDebugFile(
+  batchIndex: number,
+  kind: string,
+  content: string
+) {
+  await fs.mkdir(DEBUG_DIR, { recursive: true })
+  const fileName = `batch-${String(batchIndex + 1).padStart(3, '0')}.${kind}`
+  const outPath = path.join(DEBUG_DIR, fileName)
+  await fs.writeFile(outPath, content, 'utf8')
+  return outPath
+}
+
 ;(async () => {
   const inputDir = process.argv[2] || './input'
+
+  const startFromRaw = process.argv[3]
+  const startFrom =
+    startFromRaw && /^\d+$/.test(startFromRaw) ? Number(startFromRaw) : 1
+
   const commitSha =
-    process.argv[3] || 'ccaebe4d435f235be6e624b72e9a4e1c841c7520'
+    process.argv[4] || process.env.COMMIT_SHA || DEFAULT_COMMIT_SHA
+
   const batches = await createBatches(inputDir, commitSha)

-  for (let i = 0; i < batches.length; i++) {
+  const startIndex = Math.min(batches.length, Math.max(1, startFrom)) - 1
+
+  console.log(
+    `Total batches: ${batches.length}. Starting from batch ${startFrom} (index ${startIndex}).`
+  )
+
+  for (let i = startIndex; i < batches.length; i++) {
     console.log('started batch', i + 1, 'of', batches.length)

     const batch = batches[i]
+
+    await writeDebugFile(i, 'batch.txt', batch.content ?? '(no batch content)')
+
     const prompt = buildPrompt(batch.content)
+
+    if (prompt.includes('${batchContent}')) {
+      throw new Error(
+        'Prompt still contains a literal ${batchContent}. Check buildPrompt interpolation.'
+      )
+    }
+
+    await writeDebugFile(i, 'prompt.txt', prompt)
+
     const llmOutput = await runLLM(prompt)
+    await writeDebugFile(i, 'output.raw.txt', llmOutput ?? '(undefined)')

     const parsedOutput = llmOutput ? parseLLMOutput(llmOutput) : null

     if (parsedOutput) {
-      const outPath = await writeBatchJson(i, parsedOutput)
-      console.log(`Wrote ${parsedOutput.length} entries → ${outPath}`)
+      const outPath = await writeBatchJson(i, parsedOutput as any)
+
+      const count = Array.isArray(parsedOutput)
+        ? parsedOutput.length
+        : Object.keys(parsedOutput as Record<string, unknown>).length
+
+      console.log(
+        `Wrote ${count} entr${count === 1 ? 'y' : 'ies'} → ${outPath}`
+      )
     } else {
       console.log(`No results from batch ${i + 1}`)
     }
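The resume logic above takes a 1-based batch number from `process.argv[3]`, falls back to 1 when the argument is missing or not a plain integer, and clamps the result to the number of batches. The sketch below isolates that arithmetic in a function so its behaviour is easy to check. `resolveStartIndex` is a hypothetical name; the commit computes the same expressions inline.

```typescript
// Hypothetical extraction of the inline start-index logic from index.ts.
// `raw` plays the role of process.argv[3]; `totalBatches` is batches.length.
function resolveStartIndex(raw: string | undefined, totalBatches: number): number {
  // Only accept strings made entirely of digits; anything else means "start at 1".
  const startFrom = raw && /^\d+$/.test(raw) ? Number(raw) : 1
  // Clamp to [1, totalBatches], then convert to a 0-based index.
  return Math.min(totalBatches, Math.max(1, startFrom)) - 1
}

console.log(resolveStartIndex('3', 10)) // 2
console.log(resolveStartIndex(undefined, 10)) // 0
console.log(resolveStartIndex('abc', 10)) // 0
console.log(resolveStartIndex('99', 10)) // 9
console.log(resolveStartIndex('1', 0)) // -1
```

The last call shows an edge case: with zero batches the expression yields -1, so a caller that indexes into the batch list may want to guard against an empty list first.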
Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
+import { jsonrepair } from 'jsonrepair'
+
+type LLMJson = Record<string, string> | unknown[] // object (new format) or array (old format)
+const CODE_FENCE_RE = /^```(?:json)?\s*([\s\S]*?)\s*```$/i
+
+export function parseLLMOutput(raw: string): LLMJson {
+  const output = raw.trim()
+
+  // Strip code fences if the model adds them
+  const fencedMatch = output.match(CODE_FENCE_RE)
+  const unwrapped = fencedMatch ? fencedMatch[1].trim() : output
+
+  // 1) Try direct JSON (object or array)
+  try {
+    return JSON.parse(unwrapped)
+  } catch {
+    // 1a) Try to repair the whole thing
+    try {
+      const repaired = jsonrepair(unwrapped)
+      return JSON.parse(repaired)
+    } catch {
+      // continue
+    }
+
+    // 2) Try to salvage by extracting the outermost JSON object/array
+    //    (first opening delimiter through last closing delimiter)
+    const firstBrace = unwrapped.indexOf('{')
+    const lastBrace = unwrapped.lastIndexOf('}')
+    const firstBracket = unwrapped.indexOf('[')
+    const lastBracket = unwrapped.lastIndexOf(']')
+
+    const hasObject =
+      firstBrace !== -1 && lastBrace !== -1 && lastBrace > firstBrace
+    const hasArray =
+      firstBracket !== -1 && lastBracket !== -1 && lastBracket > firstBracket
+
+    const candidate = hasObject
+      ? unwrapped.slice(firstBrace, lastBrace + 1)
+      : hasArray
+      ? unwrapped.slice(firstBracket, lastBracket + 1)
+      : null
+
+    if (candidate) {
+      // 2a) Parse candidate directly
+      try {
+        return JSON.parse(candidate)
+      } catch {
+        // 2b) Repair candidate if still broken
+        try {
+          const repairedCandidate = jsonrepair(candidate)
+          return JSON.parse(repairedCandidate)
+        } catch {
+          // fall through to NDJSON attempt
+        }
+      }
+    }
+
+    // 3) As a last resort, attempt NDJSON (one JSON per line)
+    const lines = unwrapped
+      .split('\n')
+      .map((l) => l.trim())
+      .filter(Boolean)
+
+    // If it's NDJSON, all lines must be valid JSON (possibly after repair);
+    // a line that cannot be repaired makes the whole call throw
+    const parsedLines = lines.map((l) => {
+      try {
+        return JSON.parse(l)
+      } catch {
+        const repairedLine = jsonrepair(l)
+        return JSON.parse(repairedLine)
+      }
+    })
+
+    return parsedLines
+  }
+}
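The parser above layers several fallbacks: strip a code fence, parse directly, repair, slice out the outermost object, and finally try NDJSON. The sketch below reproduces only the fence-stripping and brace-slicing steps using the standard `JSON.parse`, so it runs without the `jsonrepair` dependency. `parseLoose` is a hypothetical name, and the fence string is built with `repeat` purely to keep literal backticks out of the example.

```typescript
// Three backticks, built indirectly to avoid a literal fence in this example.
const FENCE = '`'.repeat(3)

// Same shape as CODE_FENCE_RE in the commit: optional "json" tag, lazy body.
const FENCE_RE = new RegExp('^' + FENCE + '(?:json)?\\s*([\\s\\S]*?)\\s*' + FENCE + '$', 'i')

// Hypothetical, reduced version of parseLLMOutput: no repair, no NDJSON.
function parseLoose(raw: string): unknown {
  const trimmed = raw.trim()
  const fenced = trimmed.match(FENCE_RE)
  const unwrapped = fenced ? fenced[1].trim() : trimmed

  try {
    return JSON.parse(unwrapped)
  } catch {
    // Salvage: take everything from the first "{" to the last "}".
    const first = unwrapped.indexOf('{')
    const last = unwrapped.lastIndexOf('}')
    if (first === -1 || last <= first) {
      throw new Error('No JSON object found in model output')
    }
    return JSON.parse(unwrapped.slice(first, last + 1))
  }
}

const fencedReply = FENCE + 'json\n{"a.b": "desc"}\n' + FENCE
const chattyReply = 'Sure, here it is: {"a.b": "desc"} Hope that helps!'

console.log(parseLoose(fencedReply))
console.log(parseLoose(chattyReply))
```

Both calls yield the same object. Unlike the commit's parser, this reduced version throws as soon as the sliced candidate fails to parse.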

app/javascript/i18n/extract-jsx-copy/runLLM.ts

Lines changed: 0 additions & 1 deletion
@@ -8,7 +8,6 @@ export async function runLLM(prompt: string): Promise<string | undefined> {
     model: 'gemini-2.5-flash',
     contents: prompt,
     config: {
-      responseMimeType: 'application/json',
       thinkingConfig: {
         thinkingBudget: 0,
       },
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+{
+  "commentsList.commentView.edit": "**Functional Purpose**: Button text for editing a comment\n**UI Location**: Community Solutions → Comment List → Individual Comment\n**When Users See This**: When a user views their own comment and it is editable\n**Technical Context**: Standard text; no special formatting\n**Current English**: \"Edit\""
+}
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+{
+  "commentsList.count.numberOfComments": "**Functional Purpose**: Displays the total number of comments for a solution\n**UI Location**: Community Solutions → Solution Detail Page → Comments section heading\n**When Users See This**: When viewing a solution detail page with comments\n**Technical Context**: Variables `number` and `pluralize` must remain unchanged.\n**Current English**: \"{number} {pluralize('comment', number)}\"",
+  "commentsList.emptyList.noComments": "**Functional Purpose**: Informs the user that there are no comments yet\n**UI Location**: Community Solutions → Solution Detail Page → Comments section (when empty)\n**When Users See This**: When a solution has no comments\n**Technical Context**: Standard text; no special formatting\n**Current English**: \"No one has commented on this solution.\"",
+  "commentsList.emptyList.beFirst": "**Functional Purpose**: Encourages the user to add the first comment\n**UI Location**: Community Solutions → Solution Detail Page → Comments section (when empty)\n**When Users See This**: When a solution has no comments\n**Technical Context**: Standard text; no special formatting\n**Current English**: \"Be the first to add your comment!\""
+}
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+{
+  "commentsList.options.disableComments": "**Functional Purpose**: Button to disable comments for a solution\n**UI Location**: Community Solutions → Comments List → Options dropdown\n**When Users See This**: When viewing a solution as its author and comments are currently enabled\n**Technical Context**: Appears as an option in a dropdown menu. No variables.\n**Current English**: \"Disable comments…\"",
+  "commentsList.options.enableComments": "**Functional Purpose**: Button to enable comments for a solution\n**UI Location**: Community Solutions → Comments List → Options dropdown\n**When Users See This**: When viewing a solution as its author and comments are currently disabled\n**Technical Context**: Appears as an option in a dropdown menu. No variables.\n**Current English**: \"Enable comments…\"",
+  "commentsList.header.writeComment": "**Functional Purpose**: Heading for the comment submission section\n**UI Location**: Community Solutions → Comments List (header)\n**When Users See This**: Above the form for adding a new comment to a solution\n**Technical Context**: Standard text. No variables.\n**Current English**: \"Write a comment\""
+}
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+{
+  "commentsList.listDisabled.disabledCommentsAuthor": "**Functional Purpose**: Inform the author that they have disabled comments.\n**UI Location**: Community Solutions → Comments list (when comments are disabled by author)\n**When Users See This**: When the author views a solution where they have disabled comments.\n**Technical Context**: Standard text; no special formatting\n**Current English**: \"You have disabled comments on this solution. Use the \\\"Options\\\" cog above to toggle this option.\"",
+  "commentsList.listDisabled.disabledComments": "**Functional Purpose**: Inform a non-author user that comments are disabled.\n**UI Location**: Community Solutions → Comments list (when comments are disabled)\n**When Users See This**: When a user who is not the author views a solution where comments have been disabled.\n**Technical Context**: Standard text; no special formatting\n**Current English**: \"Comments have been disabled\""
+}
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+{}
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+{}
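Each batch file above is a flat JSON object keyed by i18n key, and batches with no `t(...)` calls are written as `{}`. The commit does not show how these files are combined afterwards. The sketch below is one possible way to merge them, assuming the prompt's "last occurrence wins" rule is extended across batches. That extension is an assumption, and `mergeBatches` is a hypothetical helper.

```typescript
// Shape of one batch file: i18n key -> multi-line description.
type BatchDescriptions = Record<string, string>

// Hypothetical merge step (not in the commit): later batches overwrite
// earlier ones when the same key appears twice; empty batches add nothing.
function mergeBatches(batches: BatchDescriptions[]): BatchDescriptions {
  const merged: BatchDescriptions = {}
  for (const batch of batches) {
    Object.assign(merged, batch)
  }
  return merged
}

const merged = mergeBatches([
  { 'commentsList.commentView.edit': 'first description' },
  {}, // a batch with no t(...) calls
  {
    'commentsList.commentView.edit': 'second description',
    'commentsList.header.writeComment': 'heading description',
  },
])

console.log(Object.keys(merged).length) // 2
console.log(merged['commentsList.commentView.edit']) // second description
```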
