x.json2: @[raw] decode strips inter-token whitespace from the source slice instead of returning the verbatim bytes #26910

@enghitalo

Describe the bug

A field marked @[raw] in x.json2 does not preserve the verbatim source slice from the input. The decoder uses the JSON-element span computed by the scanner, which has already trimmed inter-token whitespace; the result is a re-formatted, whitespace-collapsed string that does not match the original bytes.

The legacy json module preserves the exact bytes between the field's value start and end (whitespace included).

V code

import x.json2

pub struct Dto {
pub:
	data ?string @[raw]
}

fn main() {
	raw_json := '{
        "data": { "test": 1 }
    }'
	dto := json2.decode[Dto](raw_json) or { return }
	println('[${dto.data?}]')
	// expected: [{ "test": 1 }]
	// actual:   [{"test":1}]
}

C backend result

// The @[raw] branch in decoder.decode_value does:
//   position := decoder.current_node.value.position
//   end := position + decoder.current_node.value.length
//   val.$(field.name) = decoder.json[position..end]
// `length` here is the trimmed token span recorded by check_json_format,
// not the literal source slice including separators.

Reproduction Steps

import x.json2

pub struct Dto {
pub:
	data ?string @[raw]
}

fn test_raw_preserves_source_bytes() {
	raw_json := '{
        "data": { "test": 1 }
    }'
	dto := json2.decode[Dto](raw_json)!
	assert dto.data? == '{ "test": 1 }'
}

Expected Behavior

PASS test_raw_preserves_source_bytes

Current Behavior

> assert dto.data? == '{ "test": 1 }'
  Left value (len: 10): `{"test":1}`
  Right value (len: 13): `{ "test": 1 }`

Possible Solution

Two options; either one fixes the symptom:

  1. In the scanner / check_json_format pass, record the end of every container value at its closing } / ] rather than at its last child token. Then decoder.json[position..end] recovers the literal source.
  2. In the @[raw] decode branch, ignore the linked-list span: starting at position, scan forward in decoder.json, tracking braces, brackets, and strings, to compute the true end index of the JSON value (a tiny inline tokenizer). This leaves the scanner output unchanged for everyone else.

Either way, the raw branch must clone() the slice if the decoded struct might outlive decoder.json (the current code already does that via string assignment from a slice; confirm there are no shared-underlying-buffer surprises after the fix).
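
A minimal sketch of what option 2 could look like, for illustration only. The helper name raw_value_end is hypothetical and not part of x.json2. It assumes `start` points at the first byte of the value and that the input has already been validated, so it only has to track nesting depth and string state:

```v
// Hypothetical helper (not part of x.json2): returns the index one past the
// last byte of the JSON value starting at `start`, or -1 if it is unterminated.
fn raw_value_end(src string, start int) int {
	mut depth := 0
	mut in_string := false
	mut i := start
	for i < src.len {
		c := src[i]
		if in_string {
			if c == `\\` {
				i++ // skip the escaped byte so \" does not close the string
			} else if c == `"` {
				in_string = false
				if depth == 0 {
					return i + 1 // top-level string value ends here
				}
			}
		} else if c == `"` {
			in_string = true
		} else if c == `{` || c == `[` {
			depth++
		} else if c == `}` || c == `]` {
			if depth == 0 {
				return i // scalar ended by the parent's closing delimiter
			}
			depth--
			if depth == 0 {
				return i + 1 // matching close of the value's own container
			}
		} else if depth == 0 && (c == `,` || c.is_space()) {
			return i // scalar (number, true, false, null) ended by a separator
		}
		i++
	}
	return -1
}

fn main() {
	src := '{ "data": { "test": 1 } }'
	start := src.index('{ "test"') or { return }
	end := raw_value_end(src, start)
	println('[${src[start..end]}]')
}
```

In the decoder, `start` would be the position already stored on the current node, and the returned index would replace position + length when slicing decoder.json.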

Additional Information/Context

The same root cause makes the cJSON option_raw test fail on x.json2. The difference in behaviour is small but observable when hashing, signing, or echoing the original payload back to a client.
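
To illustrate why this matters for hashing, here is a small sketch that digests the two strings from the failing assertion using crypto.sha256 from the standard library. The variable names are made up for the example:

```v
import crypto.sha256

fn main() {
	original := '{ "test": 1 }' // bytes as they appear in the input
	collapsed := '{"test":1}' // bytes currently returned for the @[raw] field
	println(sha256.hexhash(original))
	println(sha256.hexhash(collapsed))
	// The digests differ, so a signature computed over the original payload
	// cannot be verified against the decoded field.
	assert sha256.hexhash(original) != sha256.hexhash(collapsed)
}
```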

V version

V 0.5.1 1b3385cc34ff783e793d1a26a8ec5be587c80fe0.40b3711

Environment details (OS name and version, etc.)

|V full version      |V 0.5.1 1b3385cc34ff783e793d1a26a8ec5be587c80fe0.40b3711
|:-------------------|:-------------------
|OS                  |linux, Ubuntu 24.04 LTS
|Processor           |16 cpus, 64bit, little endian, AMD Ryzen 7 5800H with Radeon Graphics
|Memory              |8.17GB/30.7GB
|                    |
|V executable        |/home/hitalo/Documents/v/v
|V last modified time|2026-04-18 09:18:00
|                    |
|V home dir          |OK, value: /home/hitalo/Documents/v
|VMODULES            |OK, value: /home/hitalo/.vmodules
|VTMP                |OK, value: /tmp/v_1000
|Current working dir |OK, value: /home/hitalo/Documents/v
|                    |
|Git version         |git version 2.43.0
|V git status        |0.5.1-1006-g40b3711b-dirty
|.git/config present |true
|                    |
|cc version          |cc (GCC) 14.2.0
|gcc version         |gcc (GCC) 14.2.0
|clang version       |Ubuntu clang version 18.1.3 (1)
|tcc version         |tcc version 0.9.28rc 2025-02-13 HEAD@f8bd136d (x86_64 Linux)
|tcc git status      |thirdparty-linux-amd64 696c1d84
|emcc version        |emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.6 ()
|glibc version       |ldd (Ubuntu GLIBC 2.39-0ubuntu8.3) 2.39
