
Feature request: terminology map for consistent translations #386

@VVytai

Description

Sometimes the translation of a character's name or a particular term changes several times within a movie, because some names and terms have multiple valid translations. This can confuse viewers.

Currently, we can manually provide a name list with --names, but that requires knowing all the names and terms beforehand.

Instead, we could instruct the LLM to output a list for us, something like:

Based on the translated subtitle, identify all key nouns, concepts and terminology that are crucial to the context and require a fixed translation, including but not limited to: character names/titles, organizations/brands, locations/events, unique objects/artifacts, technical/cultural terms, lyrics/poems/quotes, and idioms/expressions.

List them in a <terminology_map> block after the translated subtitles:

<terminology_map>original|translation
original|translation
original|translation</terminology_map>

I chose a simple | delimiter instead of JSON because there is less chance of formatting errors (if the LLM forgets a comma, quote, or closing brace, the entire JSON becomes invalid).
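To illustrate how tolerant the | format can be, here is a rough sketch of a parser for the block. The function name `parse_terminology_map` and the choice to silently skip malformed lines are my assumptions, not existing code in this project:

```python
import re

# Hypothetical helper: the tag name and the "original|translation" line format
# come from the proposal; everything else here is an assumption.
_BLOCK_RE = re.compile(r"<terminology_map>(.*?)</terminology_map>", re.DOTALL)


def parse_terminology_map(response: str) -> dict[str, str]:
    """Extract original -> translation pairs from an LLM response.

    Lines without a '|' or with an empty side are skipped, so a single
    malformed line does not invalidate the rest of the block.
    """
    match = _BLOCK_RE.search(response)
    if match is None:
        return {}
    terms: dict[str, str] = {}
    for line in match.group(1).splitlines():
        # Split on the first '|' only; anything after it is the translation.
        original, sep, translation = line.partition("|")
        original, translation = original.strip(), translation.strip()
        if sep and original and translation:
            terms[original] = translation
    return terms
```

A bad line costs only that one entry, whereas a missing quote in JSON would lose the whole list.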

We could then build a "terminology map" from it, deduplicate between prompts, and include the map in future prompts. This ensures consistent translations throughout the movie.
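A minimal sketch of the deduplicate-and-reuse step might look like this. The names `merge_terms` and `render_terms_for_prompt`, the "first translation wins" policy, and the instruction wording are all assumptions for illustration:

```python
def merge_terms(existing: dict[str, str], new: dict[str, str]) -> dict[str, str]:
    """Merge newly reported terms into the running map.

    Assumed policy: the first translation seen for a term is kept, so later
    batches cannot silently change an established translation.
    """
    merged = dict(existing)
    for original, translation in new.items():
        merged.setdefault(original, translation)
    return merged


def render_terms_for_prompt(terms: dict[str, str]) -> str:
    """Format the map for inclusion in the next prompt (hypothetical wording)."""
    if not terms:
        return ""
    lines = "\n".join(f"{original}|{translation}" for original, translation in terms.items())
    return (
        "Use these fixed translations:\n"
        f"<terminology_map>{lines}</terminology_map>"
    )
```

Keeping the first translation is only one option; letting the user edit the map, or preferring entries supplied via --names, could work just as well.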

The terminology map should be saved to and loaded from the project file. This allows users to resume a project and also reuse the translations across multiple projects (such as other episodes of a TV show).
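As a sketch of the persistence part: I don't know the exact layout of the project file, so this assumes it is a JSON object and stores the map under a hypothetical `terminology_map` key, leaving other keys untouched:

```python
import json
from pathlib import Path

# Assumption: the project file is a JSON object. The key name is hypothetical.
TERMS_KEY = "terminology_map"


def save_terms(project_path: Path, terms: dict[str, str]) -> None:
    """Write the map into the project file, preserving any other keys."""
    data = {}
    if project_path.exists():
        data = json.loads(project_path.read_text(encoding="utf-8"))
    data[TERMS_KEY] = terms
    project_path.write_text(
        json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8"
    )


def load_terms(project_path: Path) -> dict[str, str]:
    """Read the map back; a missing file or missing key gives an empty map."""
    if not project_path.exists():
        return {}
    data = json.loads(project_path.read_text(encoding="utf-8"))
    return dict(data.get(TERMS_KEY, {}))
```

Loading the map from another episode's project file would then be enough to carry the translations over to a new project.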

This feature should be opt‑in, since it may consume more tokens and may need a more capable model to work well.

This should help keep translations consistent. What do you think?
