This guide explains how to contribute safely to the UNF Java package while preserving output stability.
UNF behavior is sensitive: even small canonicalization changes can alter signatures and break compatibility.
- src/main/java/org/dataverse/unf/: implementation
- src/test/java/org/dataverse/unf/UNF6UtilTest.java: fixture-driven unit tests
- src/test/resources/test/: expected-output fixtures by data type
- doc/: package documentation and examples
- JDK 17 (from pom.xml)
- Maven 3.x
Typical commands:
mvn clean test
mvn test
To run a single test class:
mvn -Dtest=UNF6UtilTest test
- Pick a target area (API overload, canonicalization logic, utility behavior, or tests).
- Read existing tests and corresponding fixture file(s) first.
- Implement the smallest change that works, and state its compatibility intent explicitly.
- Run the full test suite (mvn test).
- If behavior changes are intentional, update the expected values in the fixtures and document the rationale in the PR notes.
Most external callers use UNFUtil.calculateUNF(...) overloads.
UNFUtil performs input adaptation and delegates to UnfDigest.
UnfDigest:
- routes data to type-specific handlers,
- controls matrix orientation (trnps),
- prefixes output with UNF:<version>...,
- combines multiple UNFs via addUNFs(...).
- Numeric: UnfNumber + RoundRoutines
- String: UnfString + RoundRoutines/RoundString
- Boolean: UnfBoolean
- Bitfield: UnfBitfield + BitString
- Date/time: UNFUtil overloads + UnfDateFormatter
All handlers eventually:
- feed canonical bytes into SHA-256,
- truncate to 128 bits,
- Base64-encode,
- return with UNF prefix.
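The four steps above can be sketched with JDK classes only. This is an illustration, not the package's actual code: the class DigestSketch and its digest method are hypothetical, the prefix is passed in as a placeholder because the exact version string is not given here, and keeping the leading 16 bytes is an assumption about how the truncation is done.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;

class DigestSketch {
    // Hypothetical helper, not part of the UNF package.
    static String digest(byte[] canonicalBytes, String prefix) {
        try {
            // SHA-256 always yields 32 bytes.
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(canonicalBytes);
            // Assumption: truncation keeps the first 128 bits (16 bytes).
            byte[] truncated = Arrays.copyOf(hash, 16);
            // Standard Base64 with padding: 16 bytes become 24 characters.
            return prefix + Base64.getEncoder().encodeToString(truncated);
        } catch (NoSuchAlgorithmException e) {
            // Every compliant Java platform must provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "example\n".getBytes(StandardCharsets.UTF_8);
        // The prefix below is a placeholder, not a real version tag.
        System.out.println(digest(bytes, "UNF:<version>:"));
    }
}
```

Changing any single step (hash algorithm, number of bytes kept, Base64 variant, prefix) produces a different string for the same input, which is why each step is compatibility-sensitive.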
Treat these as compatibility-sensitive:
- RoundRoutines and RoundString formatting logic
- missing/null sentinel handling (UnfCons.missv, null-byte behavior)
- date/time normalization and timezone treatment
- digest truncation size and Base64 conversion path
- sorting/combining logic in UnfDigest.addUNFs(...)
Any change in these areas can alter emitted UNFs for existing data.
UNF6UtilTest reads each fixture file where:
- first line = expected UNF
- remaining lines = values to hash
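As a rough illustration of that layout, the sketch below splits a list of lines into the expected UNF and the values. FixtureSketch, Fixture, and parse are hypothetical names, not classes in the package, and how UNF6UtilTest treats blank lines or type conversion is not covered here.

```java
import java.util.List;

class FixtureSketch {
    // Hypothetical holder for one parsed fixture; not a class in the UNF package.
    static final class Fixture {
        final String expectedUnf;
        final List<String> values;

        Fixture(String expectedUnf, List<String> values) {
            this.expectedUnf = expectedUnf;
            this.values = values;
        }
    }

    // First line is the expected UNF, every following line is one value to hash.
    static Fixture parse(List<String> lines) {
        if (lines.isEmpty()) {
            throw new IllegalArgumentException("fixture must start with an expected UNF line");
        }
        return new Fixture(lines.get(0), List.copyOf(lines.subList(1, lines.size())));
    }

    public static void main(String[] args) {
        // Placeholder content; real fixtures live under src/test/resources/test/.
        Fixture f = parse(List.of("UNF:<version>:<expected>", "1.0", "2.5"));
        System.out.println(f.expectedUnf + " covers " + f.values.size() + " values");
    }
}
```

To try this against a real fixture, the lines could be loaded with Files.readAllLines(path, StandardCharsets.UTF_8) before calling parse.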
- Add or extend fixtures in src/test/resources/test/.
- Add clear unit coverage for new type branches or canonicalization cases.
- Include corner cases: null/missing, blanks, NaN/Infinity, timezone-bearing dates.
If you intentionally change canonicalization:
- explain why prior output was incorrect or incomplete,
- update fixtures explicitly,
- include migration/backward-compatibility notes in PR description.
- Preserve deterministic behavior.
- Prefer explicit conversions to avoid locale/platform drift.
- Keep algorithm constants centralized in UnfCons.
- Avoid introducing side effects in static state unless necessary.
- Keep public API overload behavior predictable and symmetric across types.
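The point about locale and platform drift can be made concrete with plain JDK calls. The sketch below is generic Java, not code from the package; LocaleSketch and its methods are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class LocaleSketch {
    // Explicit locale: the decimal separator is always '.'.
    static String formatExplicit(double value) {
        return String.format(Locale.ROOT, "%.2f", value);
    }

    // Explicit charset: the byte sequence does not depend on the platform default.
    static byte[] toBytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A German locale formats the same number with a comma.
        System.out.println(String.format(Locale.GERMANY, "%.2f", 1.5)); // prints 1,50
        System.out.println(formatExplicit(1.5));                        // prints 1.50
        System.out.println(toBytes("\u00e9").length);                   // prints 2
    }
}
```

If a formatted string like this ever feeds a digest, the comma variant hashes to a different value, so calls that rely on Locale.getDefault() or the default charset are worth flagging in review.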
- Forgetting that UnfDigest uses static mutable state (trnps, signature, fingerprint).
- Changing default precision (DEF_NDGTS, DEF_CDGTS) without documenting compatibility impact.
- Updating parsing/format rules for dates without fixture updates.
- Treating formatting cleanups as cosmetic; they may be algorithmic.
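The static-state pitfall is easy to hit in tests that flip a setting and never restore it. One defensive pattern is a save-and-restore wrapper, sketched below against a stand-in class. StaticStateSketch and transposeFlag are hypothetical and only mimic the shape of a static flag such as trnps; the real field's type, default, and accessors are not assumed here.

```java
import java.util.function.Supplier;

class StaticStateSketch {
    // Stand-in for a static mutable setting; NOT the real UnfDigest field.
    static boolean transposeFlag = true;

    // Runs body with a temporary flag value and always restores the previous one.
    static <T> T withTransposeFlag(boolean temporary, Supplier<T> body) {
        boolean saved = transposeFlag;
        transposeFlag = temporary;
        try {
            return body.get();
        } finally {
            transposeFlag = saved; // restored even if body throws
        }
    }

    public static void main(String[] args) {
        String seen = withTransposeFlag(false, () -> "inside=" + transposeFlag);
        System.out.println(seen + ", after=" + transposeFlag); // prints inside=false, after=true
    }
}
```

This only protects sequential code. It does not make shared static state safe under concurrent use, so tests that touch such state should not run in parallel.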
- Tests pass locally (mvn test).
- New or changed behavior is covered by tests.
- Fixture updates are intentional and explained.
- Backward-compatibility impact is explicitly stated.
- Public API changes (if any) are documented.
- Read UNFUtil for API shape.
- Read UnfDigest for top-level flow and UNF composition.
- Read UnfNumber and RoundRoutines for numeric canonicalization details.
- Use UNF6UtilTest plus fixture files to understand expected outputs quickly.