Fuzzing Guide for Robocodec

Overview

This document describes the fuzzing infrastructure for the robocodec library, including setup instructions, usage guidelines, and best practices for finding bugs and security vulnerabilities.

What is Fuzzing?

Fuzzing is an automated testing technique that provides random, invalid, or unexpected data as inputs to a program. The goals are to:

- Find crashes: segmentation faults, panics, assertion failures
- Find hangs: infinite loops, deadlocks, slow operations
- Find memory leaks: unbounded memory growth
- Find logic errors: incorrect handling of edge cases

Fuzzing Architecture

┌─────────────────────────────────────────────────────────┐
│                   libFuzzer Engine                     │
│  - Generates random/mutated test cases                 │
│  - Monitors for crashes, hangs, leaks                  │
│  - Minimizes failing test cases                        │
└────────────────┬───────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│              Fuzz Target Function                      │
│  - Receives raw bytes from libFuzzer                   │
│  - Attempts to parse/decode data                       │
│  - Must handle panics gracefully                       │
└────────────────┬───────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│              Robocodec Parsers                         │
│  - MCAP parser (mcap_parser)                           │
│  - ROS1 bag parser (bag_parser)                        │
│  - RRF2 parser (rrd_parser)                            │
│  - CDR decoder (cdr_decoder)                           │
│  - Schema parser (schema_parser)                       │
└─────────────────────────────────────────────────────────┘
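
The three layers above meet in the fuzz target function. The following is a minimal sketch of that shape, not robocodec's actual target: `parse_untrusted` is a hypothetical stand-in for a parser entry point, and its one-length-byte format is invented for illustration. In a real cargo-fuzz target the body of `fuzz_one` would sit inside libfuzzer-sys's `fuzz_target!(|data: &[u8]| { ... })` macro instead of being called from `main`.

```rust
/// Hypothetical parser entry point; stands in for a robocodec parser.
/// Returns an error instead of panicking on malformed input.
fn parse_untrusted(data: &[u8]) -> Result<usize, String> {
    // Toy format (an assumption for illustration): one length byte, then payload.
    let (&len, rest) = data.split_first().ok_or("empty input")?;
    let len = len as usize;
    if rest.len() < len {
        return Err(format!("declared {} bytes, found {}", len, rest.len()));
    }
    Ok(len)
}

/// Body of a fuzz target: feed raw bytes to the parser and ignore parse errors.
/// Errors are expected for random input; only panics and hangs are findings.
fn fuzz_one(data: &[u8]) {
    let _ = parse_untrusted(data);
}

fn main() {
    // Stand-in for the inputs libFuzzer would generate.
    let inputs: [&[u8]; 3] = [b"", b"\x02ab", b"\xff"];
    for input in inputs {
        fuzz_one(input);
    }
    println!("all inputs handled without panicking");
}
```

The point of the split is that the parser reports malformed data through `Result`, so the target can discard errors and leave only genuine defects for the engine to flag.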

Quick Start

1. Initial Setup

Run the initialization script:

./scripts/fuzz_init.sh

Or manually:

# Install nightly Rust toolchain
rustup install nightly

# Install cargo-fuzz
cargo +nightly install cargo-fuzz --locked

# Build fuzz targets
cargo +nightly fuzz build

2. Run a Quick Fuzzing Check

make fuzz

This runs all fuzz targets for 30 seconds each, providing a quick check for obvious issues.

3. Run Specific Fuzz Targets

make fuzz-mcap    # Fuzz MCAP parser only
make fuzz-bag     # Fuzz bag parser only
make fuzz-cdr     # Fuzz CDR decoder only
make fuzz-schema  # Fuzz schema parser only

4. Extended Fuzzing Runs

For more thorough testing:

# Run all fuzzers for 1 minute each
make fuzz-all

# Run single fuzzer with custom options
cargo +nightly fuzz run mcap_parser -- \
    -timeout=10 \
    -max_total_time=300 \
    -jobs=4 \
    -dict=fuzz/dictionaries/mcap.dict

Fuzz Targets

mcap_parser

Tests the MCAP format parser with arbitrary byte sequences. Validates:

- Magic number detection
- Record parsing
- Chunk handling
- Compression/decompression
- Message indexing

Dictionary: fuzz/dictionaries/mcap.dict contains MCAP magic numbers, opcodes, and common strings.
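
As an illustration of the magic number check this target exercises, here is a small sketch. The 8-byte magic comes from the MCAP specification; the function name `has_mcap_magic` is hypothetical and is not taken from robocodec.

```rust
/// MCAP files begin (and end) with this 8-byte magic: 0x89, "MCAP", '0', CR, LF.
const MCAP_MAGIC: [u8; 8] = [0x89, b'M', b'C', b'A', b'P', 0x30, b'\r', b'\n'];

/// Hypothetical helper: true if `data` starts with the MCAP magic.
/// `starts_with` handles inputs shorter than the magic without panicking.
fn has_mcap_magic(data: &[u8]) -> bool {
    data.starts_with(&MCAP_MAGIC)
}

fn main() {
    let mut file = MCAP_MAGIC.to_vec();
    file.extend_from_slice(&[0x01, 0x00]); // arbitrary trailing bytes
    println!("valid prefix: {}", has_mcap_magic(&file));
    println!("truncated:    {}", has_mcap_magic(&file[..4]));
    println!("empty:        {}", has_mcap_magic(&[]));
}
```

Because random inputs almost never start with these eight bytes, a dictionary entry for the magic is what lets the fuzzer get past this check quickly.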

bag_parser

Tests the ROS1 bag format parser with arbitrary byte sequences. Validates:

- Header parsing
- Record parsing
- Chunk handling
- Message data extraction
- Connection tracking

Dictionary: fuzz/dictionaries/bag.dict contains bag magic, opcodes, and common message types.
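
Record parsing is where length-prefixed formats most often break under fuzzing. The sketch below assumes the ROS bag 2.0 layout (a `#ROSBAG V2.0` line, then records of the form header length, header, data length, data, with little-endian 32-bit lengths). The function names are hypothetical and do not reflect robocodec's API.

```rust
const BAG_MAGIC: &[u8] = b"#ROSBAG V2.0\n";

/// Reads a little-endian u32 length prefix followed by that many bytes.
/// Returns (payload, remaining input), or None if the input is truncated.
fn read_len_prefixed(input: &[u8]) -> Option<(&[u8], &[u8])> {
    let b = input.get(..4)?;
    let len = u32::from_le_bytes([b[0], b[1], b[2], b[3]]) as usize;
    let rest = &input[4..];
    if rest.len() < len {
        return None; // declared length exceeds the bytes actually present
    }
    Some(rest.split_at(len))
}

/// Splits one record into (header, data, remaining input).
fn read_record(input: &[u8]) -> Option<(&[u8], &[u8], &[u8])> {
    let (header, rest) = read_len_prefixed(input)?;
    let (data, rest) = read_len_prefixed(rest)?;
    Some((header, data, rest))
}

fn main() {
    // header_len=2, header="hi", data_len=1, data="x"
    let mut bag = BAG_MAGIC.to_vec();
    bag.extend_from_slice(&[2, 0, 0, 0, b'h', b'i', 1, 0, 0, 0, b'x']);
    let body = bag.strip_prefix(BAG_MAGIC).expect("magic present");
    println!("{:?}", read_record(body));
    // A truncated record yields None rather than an out-of-bounds panic.
    println!("{:?}", read_record(&[9, 0, 0, 0, b'h']));
}
```

Checking the declared length against the remaining input before slicing is the step a fuzzer typically finds missing.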

rrd_parser

Tests the RRF2 (Rerun Data) format parser with arbitrary byte sequences. Validates:

- Magic number detection
- Chunk parsing
- Arrow message handling
- Compression/decompression

cdr_decoder

Tests the CDR (Common Data Representation) decoder with arbitrary byte sequences. Validates:

- CDR header parsing
- Primitive type decoding
- Array handling
- String decoding
- Nested structure handling
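
To make the header and primitive decoding steps concrete, here is a sketch under stated assumptions: a 4-byte encapsulation header whose first two bytes select big-endian (`00 00`) or little-endian (`00 01`) plain CDR, and 4-byte alignment measured from the start of the payload. The names `parse_cdr_header` and `read_u32` are hypothetical and are not robocodec functions.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Endian {
    Big,
    Little,
}

/// Parses the 4-byte encapsulation header: 2-byte representation id + 2 option bytes.
/// Returns the byte order and the payload that follows the header.
fn parse_cdr_header(buf: &[u8]) -> Option<(Endian, &[u8])> {
    if buf.len() < 4 {
        return None;
    }
    let endian = match (buf[0], buf[1]) {
        (0x00, 0x00) => Endian::Big,    // CDR_BE
        (0x00, 0x01) => Endian::Little, // CDR_LE
        _ => return None,               // other encodings are not handled in this sketch
    };
    Some((endian, &buf[4..]))
}

/// Reads a u32 at or after `offset`, rounding up to a 4-byte boundary first.
/// Returns the value and the offset just past it; None on truncation or overflow.
fn read_u32(payload: &[u8], offset: usize, endian: Endian) -> Option<(u32, usize)> {
    let aligned = offset.checked_add(3)? & !3;
    let end = aligned.checked_add(4)?;
    let b = payload.get(aligned..end)?;
    let raw = [b[0], b[1], b[2], b[3]];
    let value = match endian {
        Endian::Big => u32::from_be_bytes(raw),
        Endian::Little => u32::from_le_bytes(raw),
    };
    Some((value, end))
}

fn main() {
    // Little-endian header, one u8 (0xAA), 3 padding bytes, then the u32 value 42.
    let msg = [0x00, 0x01, 0x00, 0x00, 0xAA, 0, 0, 0, 42, 0, 0, 0];
    let (endian, payload) = parse_cdr_header(&msg).expect("valid header");
    println!("{:?} {:?}", endian, read_u32(payload, 1, endian));
}
```

The checked arithmetic and `get` calls matter here: offsets derived from fuzzer-controlled lengths can otherwise overflow or index past the buffer.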

schema_parser

Tests the ROS/IDL schema parser with arbitrary text sequences. Validates:

- Type parsing
- Field declaration parsing
- Array notation parsing
- Comment handling
- Multi-file dependencies

Dictionary: fuzz/dictionaries/schema.dict contains common types, field names, and IDL keywords.
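
The field declaration, array notation, and comment handling items can be illustrated with a small sketch for ROS-style lines such as `float64[3] position  # meters`. This is a simplified, hypothetical parser (constants, default values, and bounded arrays are deliberately rejected), not robocodec's schema parser.

```rust
#[derive(Debug, PartialEq)]
enum ArrayKind {
    Scalar,
    Fixed(usize),
    Unbounded,
}

#[derive(Debug, PartialEq)]
struct Field {
    type_name: String,
    array: ArrayKind,
    name: String,
}

/// Parses "<type>[<n>] <name>  # comment" into a Field; None if malformed.
fn parse_field(line: &str) -> Option<Field> {
    // Drop a trailing comment, then expect exactly two tokens: type and name.
    let code = line.split('#').next()?.trim();
    let mut parts = code.split_whitespace();
    let type_token = parts.next()?;
    let name = parts.next()?;
    if parts.next().is_some() {
        return None; // constants and defaults are out of scope for this sketch
    }
    let (type_name, array) = match type_token.find('[') {
        None => (type_token, ArrayKind::Scalar),
        Some(open) => {
            let inner = type_token[open..].strip_prefix('[')?.strip_suffix(']')?;
            let kind = if inner.is_empty() {
                ArrayKind::Unbounded
            } else {
                ArrayKind::Fixed(inner.parse().ok()?)
            };
            (&type_token[..open], kind)
        }
    };
    if type_name.is_empty() {
        return None;
    }
    Some(Field {
        type_name: type_name.to_string(),
        array,
        name: name.to_string(),
    })
}

fn main() {
    println!("{:?}", parse_field("float64[3] position  # meters"));
    println!("{:?}", parse_field("uint8[] data"));
    println!("{:?}", parse_field("int32[x] bad"));
}
```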

Interpreting Results

Successful Run

A successful fuzzing run produces output like:

INFO: Seed: 1234567890
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2      INITED cov: 12 ft: 12 corp: 1/1b exec/s: 0 rss: 25Mb
#1024   NEW    cov: 145 ft: 234 corp: 15/234b exec/s: 512 rss: 45Mb
...

Key metrics:

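- cov: Number of code edges (or blocks) covered so far
- ft: Number of coverage features seen (edges, edge counters, value profiles); a finer-grained signal than cov
- corp: Number of inputs in the in-memory corpus and their total size (15/234b means 15 inputs totaling 234 bytes)
- exec/s: Fuzz target executions per second
- rss: Resident memory used by the fuzzing process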
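
When tracking these metrics across runs (for example in CI logs), it can help to extract them programmatically. The sketch below assumes the status line layout shown in the sample output above; `metric` is a hypothetical helper, not part of libFuzzer or cargo-fuzz.

```rust
/// Extracts the number following `key` in a libFuzzer status line,
/// e.g. key "cov:" in "#1024 NEW cov: 145 ft: 234 ...".
fn metric(line: &str, key: &str) -> Option<u64> {
    let mut tokens = line.split_whitespace();
    while let Some(tok) = tokens.next() {
        if tok == key {
            let value = tokens.next()?;
            // Keep only the leading digits so "45Mb" and "15/234b" parse as 45 and 15.
            let digits: String = value.chars().take_while(|c| c.is_ascii_digit()).collect();
            return digits.parse().ok();
        }
    }
    None
}

fn main() {
    let line = "#1024   NEW    cov: 145 ft: 234 corp: 15/234b exec/s: 512 rss: 45Mb";
    for key in ["cov:", "ft:", "corp:", "exec/s:", "rss:"] {
        println!("{:8} {:?}", key, metric(line, key));
    }
}
```

A cov value that stops growing early in a run is usually the first sign that the target is rejecting most inputs before reaching the parser.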

Crash Found

When a crash is found, libFuzzer will report:

==91234==ERROR: libFuzzer: deadly signal
SUMMARY: libFuzzer: deadly signal
artifact_prefix='fuzz/artifacts/mcap_parser/'; Test unit written to fuzz/artifacts/mcap_parser/crash-abc123def456

The crashing input is saved to fuzz/artifacts/<target>/crash-<hash>.

Handling Crashes

Reproduce the crash:

cargo +nightly fuzz run mcap_parser fuzz/artifacts/mcap_parser/crash-abc123

Minimize the crash input:

cargo +nightly fuzz tmin mcap_parser fuzz/artifacts/mcap_parser/crash-abc123

Debug the crash:
- Add debug prints to the fuzz target
- Use gdb or lldb to investigate
- Check for out-of-bounds access, use-after-free, etc.

Fix the bug and verify the fix:

# After fixing, verify the crash no longer occurs
cargo +nightly fuzz run mcap_parser fuzz/artifacts/mcap_parser/crash-abc123
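
Once a crash is fixed, the artifact can be kept as a regression input. One possible way to replay saved inputs outside the fuzzer is sketched below; `replay_dir` is a hypothetical helper, the directory path is only an example, and the closure is where a real parser call would go.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Runs `check` on the contents of every regular file in `dir`
/// and returns how many files were replayed.
fn replay_dir(dir: &Path, check: impl Fn(&[u8])) -> io::Result<usize> {
    let mut count = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_file() {
            let bytes = fs::read(&path)?;
            check(bytes.as_slice());
            count += 1;
        }
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    // Example location; adjust to the target being checked.
    let dir = Path::new("fuzz/artifacts/mcap_parser");
    if dir.is_dir() {
        let replayed = replay_dir(dir, |data| {
            // Placeholder: call the parser under test here; it must not panic.
            let _ = data.len();
        })?;
        println!("replayed {} artifact(s)", replayed);
    } else {
        println!("no artifacts directory at {}", dir.display());
    }
    Ok(())
}
```

Placed in an ordinary unit or integration test, a loop like this lets stable-toolchain CI catch a reintroduced crash without running the fuzzer.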

Advanced Usage

Using Seed Corpus

Provide existing test files as a seed corpus so the fuzzer starts from valid inputs and reaches deeper code paths sooner:

# Copy test files to corpus
cp tests/fixtures/*.mcap fuzz/corpus/mcap_parser/

# Run fuzzer with seed corpus
cargo +nightly fuzz run mcap_parser

Custom Dictionaries

Create dictionaries with format-specific magic numbers and common values:

# MCAP magic: 0x89 "MCAP0" CR LF
"\x89MCAP0\x0D\x0A"

# Common opcodes (comments must be on their own lines)
# Header
"\x01"
# Message
"\x05"

Run with dictionary:

cargo +nightly fuzz run mcap_parser -- -dict=fuzz/dictionaries/mcap.dict
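
Dictionary entries for binary tokens are easy to mistype by hand. A small sketch of generating them follows; it assumes libFuzzer's dictionary syntax of one double-quoted string per line with `\xNN` byte escapes, and `dict_entry` is a hypothetical helper rather than an existing tool.

```rust
/// Formats bytes as a libFuzzer dictionary entry: a double-quoted string
/// with every byte written as a \xNN escape.
fn dict_entry(bytes: &[u8]) -> String {
    let mut out = String::from("\"");
    for b in bytes {
        out.push_str(&format!("\\x{:02X}", b));
    }
    out.push('"');
    out
}

fn main() {
    // The two opcodes listed above, each preceded by a comment line.
    let tokens: [(&str, &[u8]); 2] = [("Header", &[0x01]), ("Message", &[0x05])];
    for (label, bytes) in tokens {
        println!("# {}", label);
        println!("{}", dict_entry(bytes));
    }
}
```

Escaping every byte, including printable ones, avoids having to special-case quotes and backslashes inside an entry.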

Parallel Fuzzing

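Run multiple fuzzing jobs in parallel. -jobs sets how many fuzzing jobs are run to completion and -workers sets how many worker processes run them at once; each job's output goes to a fuzz-<JOB>.log file: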

cargo +nightly fuzz run mcap_parser -- -jobs=4 -workers=4

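Sanitizers

cargo-fuzz selects the sanitizer through its own --sanitizer option (address, leak, memory, thread, or none). The option belongs to cargo fuzz itself, so it goes before the -- separator; it is not a libFuzzer flag, and RUSTFLAGS does not need to be set. Rust has no UndefinedBehaviorSanitizer support, so UBSan is not available here.

# Run the fuzzer with AddressSanitizer
cargo +nightly fuzz run --sanitizer address mcap_parser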

Integration with CI/CD

Add fuzzing to CI pipelines to catch regressions:

# GitHub Actions example
- name: Run fuzzers
  run: |
    ./scripts/fuzz_init.sh
    make fuzz
  continue-on-error: true  # Don't fail CI on fuzzing

- name: Upload corpus artifacts
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: fuzz-corpus
    path: fuzz/corpus/

Best Practices

1. Start with Short Runs

When developing new fuzz targets, start with short runs:

cargo +nightly fuzz run new_target -- -timeout=1 -runs=1000

2. Use Timeouts

Prevent infinite loops with timeouts:

cargo +nightly fuzz run mcap_parser -- -timeout=10

3. Limit Input Size

Prevent memory exhaustion:

cargo +nightly fuzz run mcap_parser -- -max_len=1048576  # 1 MB

4. Handle Panics Gracefully

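Use catch_unwind only around panics that are known and accepted, since anything caught this way is hidden from the fuzzer. Note that libfuzzer-sys installs a panic hook that aborts the process before unwinding begins, so in a cargo-fuzz target a panic is still reported as a crash even inside catch_unwind:

// The result is discarded: a caught panic is deliberately ignored here.
let _result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
    // Fuzzing logic here
}));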

5. Monitor Memory Usage

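Cap resident memory so that runaway allocations are reported as failures (libFuzzer's default limit is 2048 MB):

cargo +nightly fuzz run mcap_parser -- -rss_limit_mb=512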

6. Use Deterministic Seeds

For reproducible results:

cargo +nightly fuzz run mcap_parser -- -seed=12345

Troubleshooting

cargo-fuzz not found

Install cargo-fuzz:

cargo +nightly install cargo-fuzz --locked

Nightly toolchain issues

Update nightly:

rustup update nightly

Build errors

Ensure the fuzz target declares #![no_main] and defines its entry point with the fuzz_target! macro.

No crashes found

- Increase fuzzing time
- Use dictionaries for better coverage
- Add seed corpus from existing test files
- Check if the target is actually parsing the input

License

SPDX-License-Identifier: MulanPSL-2.0
