Any way to rescue / omit any cases of non-workable languages without erroring an entire batch? #102

@DGaffney

Hello! First off, this is amazing work, thank you for this. The problem I'm running into is that I have a set of texts, batch_texts, that I want to encode. They could be in any language, and I want to translate them to English. What I see when I process a batch is something like:

INFO:easynmt.EasyNMT:Translate documents of language: en
INFO:easynmt.EasyNMT:Translate documents of language: ja
INFO:easynmt.EasyNMT:Translate documents of language: de
.........
INFO:easynmt.EasyNMT:Translate documents of language: et
INFO:easynmt.EasyNMT:Translate documents of language: la
WARNING:easynmt.EasyNMT:Exception: 'la'

And then it fails. I tried pulling out the supported_languages(), pre-checking all the texts, and omitting the ones in unsupported languages, but I still hit this error somewhere in the internals on some texts. Is there anything in the repo that can do an internal rescue, along the lines of "if it's an unsupported language or the case crashes for whatever reason, just return None"?
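
To make the request concrete, here is a minimal sketch of the behavior I have in mind, written as a wrapper outside the library. The helper name translate_or_none is hypothetical and not part of EasyNMT. It assumes only that the translate function accepts either a single string or a list of strings plus a target_lang keyword, and that it raises an exception on failure.

```python
import logging
from typing import Callable, List, Optional, Sequence

logger = logging.getLogger(__name__)


def translate_or_none(
    translate_fn: Callable[..., object],
    texts: Sequence[str],
    target_lang: str = "en",
) -> List[Optional[str]]:
    """Translate texts, returning None for any text that fails.

    translate_fn is assumed (hypothetically) to accept a string or a
    list of strings and a target_lang keyword argument.
    """
    texts = list(texts)
    try:
        # Fast path: translate the whole batch in a single call.
        return list(translate_fn(texts, target_lang=target_lang))
    except Exception as exc:
        logger.warning("Batch failed (%r); retrying one text at a time", exc)

    # Slow path: isolate failures so one bad text cannot sink the batch.
    results: List[Optional[str]] = []
    for text in texts:
        try:
            results.append(translate_fn(text, target_lang=target_lang))
        except Exception as exc:
            logger.warning("Could not translate a text (%r); returning None", exc)
            results.append(None)
    return results
```

With a loaded model this would presumably be called as translate_or_none(model.translate, batch_texts). The per-text fallback is slower than a single batched call, so it only runs after the batch has already failed.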
