This document captures the UEBA wire conventions used by generated Baboon codecs. Everything is little‑endian and intentionally simple so all backend implementations (C#, Scala, Rust, TypeScript, Python, Kotlin, Java, Dart) produce identical binary output.
UEBA has a sister format, SICK, which is an indexed binary storage for JSON.
- Writers use
LEDataOutputStream; readers useLEDataInputStream. - No global envelope; each value is written directly in place (top‑level or nested).
- When index mode is enabled (
BaboonCodecContext.Indexed), codecs may emit a one‑byte header with index data; otherwise, codecs keep payloads compact.
bit: 1 byte, 0 or 1.- Signed integers (
i08,i16,i32,i64): little‑endian two’s‑complement of the respective width. - Unsigned integers (
u08,u16,u32,u64): little‑endian representation of the unsigned value (read/write via standard signed primitives and reinterpret). - Floats (
f32,f64): IEEE‑754 little‑endian. f128: serialized as .NETdecimal(16 bytes: lo, mid, hi, flags). Flags carry sign bit 31 and scale in bits 16–23; mantissa fits 96 bits.str: UTF‑8 bytes prefixed by a varint length (LEB128, 7 bits per byte, continuation bit 0x80).bytes: lengthi32then raw bytes.uid: 16 bytes in .NET GUID mixed‑endian layout (fields Data1/2/3 little‑endian, Data4 big‑endian).tsu/tso: twoi64values (milliseconds since epoch, offset millis) plus onei8“kind” flag.
opt[T]: onebytetag (0= empty,1= present). Present payload encodesT.lst[T]/set[T]:i32count, then elements in order.map[K, V]:i32count, then key/value pairs in order:KthenVper entry.
- Enums: encoded as
i32of the discriminant. - DTO/ADT/contract branches: fields encoded in declaration order.
- ADT branches: written as the branch payload followed by branch metadata when wrapped codecs are enabled; by default branches are emitted without an envelope (caller knows the concrete branch).
When BaboonCodecContext.Indexed is used and a codec implements BaboonBinCodecIndexed, payloads can start with a one‑byte header indicating index presence. Index entries are pairs of i32 offset/length per indexed element; offsets must be monotonically increasing. This is mainly used by generated codecs with generateIndexWriters enabled on C#.
- All numeric fields are little‑endian; do not rely on platform endianness.
- Strings and
ByteStringdiffer:struses varint length;bytes/ByteStringuse fixedi32length. - UUIDs follow .NET GUID layout; convert when interoperating with big‑endian UUIDs.
- Foreign types rely on user‑provided UEBA codecs; generated stubs must be replaced.
Schema:
data Payment {
amount: i32
note: opt[str]
tags: lst[u08]
}
Encoding of { amount = 42, note = Some("ok"), tags = [1,2] }:
00 2A 00 00 00 ; i32 amount = 42
04 01 ; opt tag = present
05 02 ; varint len("ok") = 2
06 6F 6B ; "ok"
08 02 00 00 00 ; lst count = 2
0C 01 02 ; elements
Field order is declaration order; there is no extra envelope.
data M { m: map[str, i32] }
{ m = { "a" -> 7, "b" -> 9 } }:
00 02 00 00 00 ; count = 2
04 01 61 ; key "a" length=1, bytes=61
06 07 00 00 00 ; value 7
0A 01 62 ; key "b" length=1, bytes=62
0C 09 00 00 00 ; value 9
data R { left: lst[u08], right: lst[u08] }
Indexed encoding layout (example):
00 01 ; header: index present (bit0)
01 02 00 ; index elements count (short)
03 05 00 00 00 ; offset of left payload
07 03 00 00 00 ; length of left payload
0B 08 00 00 00 ; offset of right payload
0F 02 00 00 00 ; length of right payload
13 ...left payload...
16 ...right payload...
Offsets are relative to the start of the payload (after the index section) and must be monotonically increasing.