Bug report: Text Encoding Brute Force: `inputType` is `"string"` instead of `"byteArray"` #2281

**Describe the bug**

The Text Encoding Brute Force operation declares `inputType = "string"`, which means CyberChef's framework passes it a UTF-8 decoded string. The operation then feeds this string to `codepage.utils.decode(charset, input)`, which interprets each character's `charCodeAt(0)` as a raw byte value. For non-ASCII input, the char codes no longer correspond to the original bytes, producing incorrect decoding for all code pages.

In `src/operations/TextEncodingBruteForce.mjs`:

```js
this.inputType = "string";  // BUG: should be "byteArray"
```

When `inputType` is `"string"`, the CyberChef framework calls `byteArrayToUtf8` on the input bytes before passing them to the operation's `run` method. The `run` method then passes this UTF-8 decoded string to `codepage.utils.decode` for each code page.

`codepage.utils.decode(cp, data)` has a string branch:

```js
if (typeof data === "string") return decode(cp, data.split("").map(cca));
// cca = x => x.charCodeAt(0)
```

This treats each UTF-16 code unit of the string as a byte value. For a string that was decoded from UTF-8, such as `"café"`, the `é` character has char code 233 (`0xE9`), but the original byte sequence was `[0xC3, 0xA9]` (two bytes). The single value `233` is then decoded under the target code page as if it were a single byte, which is wrong for every encoding.
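The byte loss described above can be reproduced without CyberChef or the `codepage` library. The following is a minimal sketch that only mimics the two steps (UTF-8 decode, then `charCodeAt(0)` per character) using Node built-ins; the variable names are illustrative and not taken from either code base.

```javascript
// Original bytes for "café" encoded as UTF-8.
const original = [0x63, 0x61, 0x66, 0xc3, 0xa9];

// Step 1: roughly what a "string" input type implies: decode the bytes as UTF-8.
const asString = new TextDecoder("utf-8").decode(Uint8Array.from(original));

// Step 2: what the string branch does: one value per UTF-16 code unit.
const recovered = asString.split("").map((ch) => ch.charCodeAt(0));

// Format a numeric array as space-separated two-digit hex.
const hex = (arr) => arr.map((b) => b.toString(16).padStart(2, "0")).join(" ");

console.log("original :", hex(original));  // 63 61 66 c3 a9
console.log("recovered:", hex(recovered)); // 63 61 66 e9
```

The two trailing bytes `c3 a9` collapse into the single value `e9`, so the code page decoder never sees the input that was actually supplied.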

**To Reproduce**

https://gchq.github.io/CyberChef/#recipe=From_Hex('Auto')Text_Encoding_Brute_Force('Decode')&input=NjMgNjEgNjYgYzMgYTk&oenc=65001


```js
// Assumes the CyberChef Node API package; adjust the import
// if running from a checkout of the repository.
import chef from "cyberchef";

// "café" encoded as UTF-8.
const input = Buffer.from([0x63, 0x61, 0x66, 0xc3, 0xa9]);

const out = chef.textEncodingBruteForce(input, { Mode: "Decode" });

console.log(
    "input bytes:",
    [...input].map((b) => b.toString(16).padStart(2, "0")).join(" ")
);
console.log("dish type:", out.type);
console.log("utf8:", JSON.stringify(out.value["UTF-8 (65001)"]));
console.log(
    "cp500:",
    JSON.stringify(out.value["IBM EBCDIC International (500)"])
);
```

Output I got:

```text
input bytes: 63 61 66 c3 a9
dish type: 6
utf8: "caf退"
cp500: "Ä/ÃZ"
```

The raw-byte CP500 decode can be checked with Python:

```bash
python3 - <<'PY'
print(bytes([0x63, 0x61, 0x66, 0xc3, 0xa9]).decode("cp500"))
PY
```

Expected output:

```text
Ä/ÃCz
```
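The difference between the observed and expected CP500 results can also be illustrated in isolation. This sketch assumes a hand-written partial CP500 table covering only the byte values that appear in this report (the mappings are the ones visible in the outputs above); `CP500_PARTIAL` and `decodePartialCp500` are hypothetical names and not part of CyberChef or `codepage`.

```javascript
// Partial CP500 (EBCDIC International) table: only the bytes used in this report.
const CP500_PARTIAL = new Map([
    [0x61, "/"],
    [0x63, "Ä"],
    [0x66, "Ã"],
    [0xa9, "z"],
    [0xc3, "C"],
    [0xe9, "Z"],
]);

// Hypothetical helper: decode numeric values with the partial table above.
function decodePartialCp500(values) {
    return values.map((b) => CP500_PARTIAL.get(b) ?? "?").join("");
}

const rawBytes = [0x63, 0x61, 0x66, 0xc3, 0xa9];

// Path described in this report: bytes -> UTF-8 string -> char codes -> decode.
const utf8String = Buffer.from(rawBytes).toString("utf8");
const charCodes = utf8String.split("").map((ch) => ch.charCodeAt(0));
const actual = decodePartialCp500(charCodes);

// Path expected for a brute-force decode: raw bytes go straight to the decoder.
const expected = decodePartialCp500(rawBytes);

console.log("via string:", actual);   // Ä/ÃZ
console.log("raw bytes :", expected); // Ä/ÃCz
```

The string path yields four characters ending in `Z` (from `0xE9`), while the raw-byte path yields five characters ending in `Cz` (from `0xC3 0xA9`), matching the two outputs shown above.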


