Fix #17: shortest unique signature for current function (with xref fallback) #26
Merged
Adds _wildcard_count helper and extends __lt__ so equal-length signatures rank by wildcard count next. XrefFinder picks this up automatically since it sorts via the same __lt__, which is a strict improvement (less-wildcarded sigs are more specific).
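The (size, wildcards) ordering described in this commit can be sketched in plain Python. `Sig` below is a hypothetical stand-in, not the plugin's `GeneratedSignature`; it assumes a signature is a tuple of byte values with `None` marking a wildcard.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Sig:
    """Hypothetical stand-in for GeneratedSignature (None = wildcard byte)."""
    pattern: Tuple[Optional[int], ...]

    def wildcard_count(self) -> int:
        return sum(1 for b in self.pattern if b is None)

    def __lt__(self, other: "Sig") -> bool:
        # Shorter wins; at equal length, fewer wildcards wins.
        mine = (len(self.pattern), self.wildcard_count())
        theirs = (len(other.pattern), other.wildcard_count())
        return mine < theirs


a = Sig((0x48, 0x8B, None, None, 0xC3))        # 5 bytes, 2 wildcards
b = Sig((0x48, 0x8B, 0x05, None, 0xC3))        # 5 bytes, 1 wildcard
c = Sig((0x48, 0x8B, 0x05, 0x10, 0xC3, 0x90))  # 6 bytes, 0 wildcards
ranked = sorted([c, a, b])                     # sorted() only needs __lt__
```

Because `sorted` relies only on `__lt__`, any caller that sorts candidates would pick up the new tie-break without further changes, which is consistent with the note about XrefFinder above.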
Wires up the action constant ahead of the radio button and dispatch branch. Existing values (0..3) are unchanged.
…function Iterates every instruction in the given function as a possible start point. Each inner search is bounded by the current best candidate's size, so as soon as a small candidate exists subsequent starts get pruned aggressively. An ideal candidate (size <= 5 bytes, 0 wildcards) ends the outer loop early. Degenerate short sigs (< 5 bytes) are rejected even when unique. Raises Unexpected if no candidate exists.
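The search loop described in this commit can be modelled on a plain byte buffer. This is a minimal sketch under stated assumptions, not the plugin's code: `shortest_unique`, `count_matches`, `rank`, and `NoCandidate` are hypothetical names, each instruction is given as a pre-masked tuple (`None` for a wildcard byte), and too-short candidates keep growing rather than being discarded outright.

```python
from typing import Optional, Sequence, Tuple

Pattern = Tuple[Optional[int], ...]  # None marks a wildcard byte
MIN_USEFUL = 5  # mirrors the 5-byte floor named in the commit message


class NoCandidate(Exception):
    """Hypothetical stand-in for the plugin's Unexpected error."""


def count_matches(image: bytes, pat: Pattern) -> int:
    """Count offsets in image where pat matches; None matches any byte."""
    return sum(
        all(p is None or image[i + k] == p for k, p in enumerate(pat))
        for i in range(len(image) - len(pat) + 1)
    )


def rank(pat: Pattern) -> Tuple[int, int]:
    return len(pat), sum(b is None for b in pat)


def shortest_unique(image: bytes, insns: Sequence[Pattern], max_len: int = 32) -> Pattern:
    best: Optional[Pattern] = None
    for start in range(len(insns)):
        # Bound the inner search by the best size found so far.
        limit = max_len if best is None else min(max_len, len(best))
        cand: Pattern = ()
        for insn in insns[start:]:
            cand += insn
            if len(cand) > limit:
                break      # pruned: cannot beat the current best
            if len(cand) < MIN_USEFUL:
                continue   # degenerate: too short to trust, keep growing
            if count_matches(image, cand) == 1:
                if best is None or rank(cand) < rank(best):
                    best = cand
                break
        # Ideal candidate: minimum useful size and no wildcards.
        if best is not None and len(best) <= MIN_USEFUL and None not in best:
            break
    if best is None:
        raise NoCandidate("no unique signature inside the function")
    return best


OTHER = bytes.fromhex("E8 20 00 00 00 90 C3")                # unrelated code
FUNC = bytes.fromhex("E8 10 00 00 00 90 B9 38 13 37 13 C3")  # target function
INSNS = [
    (0xE8, None, None, None, None),  # call rel32, operand wildcarded
    (0x90,),                         # nop
    (0xB9, 0x38, 0x13, 0x37, 0x13),  # mov ecx, 0x13371338
    (0xC3,),                         # ret
]
best = shortest_unique(OTHER + FUNC, INSNS)
```

In this toy image the first start point yields an 11-byte candidate, the second a 6-byte one, and the third the 5-byte `mov`, at which point the ideal-candidate check stops the outer loop.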
Adds a 5th radio button to the main form: 'Find shortest unique signature for current function'. Wires SigMakerPlugin.run to invoke MinimalFunctionSignatureGenerator on the containing function of the cursor; on Unexpected (no body-internal unique sig) falls back to XrefFinder.find_xrefs and prints the best xref candidate with a 'Xref signature into X (from Y):' annotation. UserCanceledError bubbles up to the existing handler for a clean cancel.
End-to-end against the compiled test binary inside the IDA container. Asserts the generator returns a unique signature, that its IDA-format text actually matches exactly one place, and that the match falls inside the original function body. Skips cleanly if the chosen function happens to have no body-internal unique sig.
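The three assertions this test makes (the text parses, it matches exactly one place, and the match lies inside the function) can be illustrated without IDA. The sketch below is an assumption-laden model: it works on raw offsets in a byte buffer rather than addresses, the helper names are hypothetical, and the parser accepts both `?` and `??` as wildcard tokens because the exact token the plugin emits is not shown here.

```python
from typing import List, Optional


def parse_sig(text: str) -> List[Optional[int]]:
    """Parse space-separated hex tokens; '?' or '??' becomes a wildcard."""
    return [None if tok in ("?", "??") else int(tok, 16) for tok in text.split()]


def find_matches(image: bytes, pat: List[Optional[int]]) -> List[int]:
    """Return every offset in image where pat matches."""
    n = len(pat)
    return [
        i
        for i in range(len(image) - n + 1)
        if all(p is None or image[i + k] == p for k, p in enumerate(pat))
    ]


def assert_unique_in_function(image: bytes, sig_text: str, start: int, end: int) -> int:
    hits = find_matches(image, parse_sig(sig_text))
    assert len(hits) == 1, f"expected one match, got {len(hits)}"
    assert start <= hits[0] < end, "match falls outside the function body"
    return hits[0]


# Toy image: 7 bytes of unrelated code, then a 12-byte function at offset 7.
IMAGE = bytes.fromhex("E8 20 00 00 00 90 C3" "E8 10 00 00 00 90 B9 38 13 37 13 C3")
hit = assert_unique_in_function(IMAGE, "90 B9 38 13 37 13", 7, 19)
ambiguous = find_matches(IMAGE, parse_sig("E8 ? ? ? ? 90"))
```

The wildcarded call pattern matches at two offsets, which is the kind of candidate the uniqueness assertion is meant to reject.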
An x86 immediate like the 0x13371338 in 'mov rcx, 0x13371338' is a literal value baked into the instruction encoding. It does not shift between binary builds, so wildcarding it just removes bytes that would otherwise have made the signature unique. MEM/FAR/NEAR still get wildcarded because those operands DO encode addresses that move between builds. This improves signature quality for every action that has wildcard_operands=True, not just the new FIND_FUNCTION_SIG action introduced earlier in this branch. It is also the difference that makes the function-search algorithm pick up short, distinctive constants like the example above as unique signatures. The change cannot be unit-tested under the existing harness because the MagicMock idaapi collapses all o_* operand-type constants to a single int, aliasing every BaseKind enum member onto one value. The behavior is exercised by the integration tests against the real IDA test binary.
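The effect of dropping `IMM` from the wildcardable set can be shown on the `mov rcx, 0x13371338` example. This is a sketch, not the plugin's `WildcardPolicy`: the `BaseKind` enum here is a hypothetical four-member subset, operands are given as hand-written `(kind, offset, size)` tuples, and `OLD_X86`/`NEW_X86` are illustrative names. The final `Collapsed` enum only demonstrates standard Python enum aliasing, which is the general mechanism behind the unit-test limitation described above.

```python
from enum import Enum
from typing import FrozenSet, List, Optional, Sequence, Tuple


class BaseKind(Enum):  # hypothetical subset; the plugin's enum may differ
    IMM = "imm"
    MEM = "mem"
    NEAR = "near"
    FAR = "far"


OLD_X86 = frozenset(BaseKind)                                     # IMM wildcarded
NEW_X86 = frozenset({BaseKind.MEM, BaseKind.NEAR, BaseKind.FAR})  # IMM kept

Operand = Tuple[BaseKind, int, int]  # (kind, byte offset, byte size)


def mask(code: bytes, operands: Sequence[Operand],
         wildcardable: FrozenSet[BaseKind]) -> List[Optional[int]]:
    """Replace operand bytes with None when their kind is wildcardable."""
    out: List[Optional[int]] = list(code)
    for kind, off, size in operands:
        if kind in wildcardable:
            out[off:off + size] = [None] * size
    return out


MOV = bytes.fromhex("48 C7 C1 38 13 37 13")  # mov rcx, 0x13371338
CALL = bytes.fromhex("E8 10 00 00 00")       # call rel32

kept = mask(MOV, [(BaseKind.IMM, 3, 4)], NEW_X86)
blanked = mask(MOV, [(BaseKind.IMM, 3, 4)], OLD_X86)
call_sig = mask(CALL, [(BaseKind.NEAR, 1, 4)], NEW_X86)


class Collapsed(Enum):
    # When two members share a value, the second becomes an alias of the first.
    IMM = 1
    MEM = 1
```

With every member aliased onto one value, a membership test such as `kind in wildcardable` cannot tell `IMM` from `MEM`, so a mocked harness of that shape has no way to observe the policy change.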
Targets the exact scenario from the reddit feedback that motivated the WildcardPolicy.for_x86 change: a 64-bit `mov rax, imm` instruction whose immediate must survive the wildcard_operands=True path. Finds the 10-byte encoding in the compiled test binary, runs SignatureMaker.make_signature with wildcard_operands=True and wildcard_optimized=True (the form defaults), and asserts the resulting signature contains the literal immediate bytes 0x28 and 0x15 as non-wildcard bytes. Without the for_x86() fix this test fails (immediate bytes blanked to ??); with it the test passes (immediate stays concrete).
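For reference, the 10-byte form mentioned above is the x86-64 `mov rax, imm64` encoding: a REX.W prefix (`48`), opcode `B8`, then the 8-byte little-endian immediate. The sketch below builds that encoding and renders it both ways. The immediate value is hypothetical (the test binary's actual constant is not given here), and `render` is an illustrative helper rather than the plugin's formatter.

```python
import struct


def encode_mov_rax_imm64(imm: int) -> bytes:
    """REX.W + B8 followed by the 64-bit immediate, little-endian."""
    return b"\x48\xB8" + struct.pack("<Q", imm)


def render(code: bytes, keep_immediate: bool) -> str:
    """Render as hex tokens; optionally blank the 8 immediate bytes."""
    tokens = [f"{b:02X}" for b in code]
    if not keep_immediate:
        tokens[2:] = ["?"] * 8
    return " ".join(tokens)


IMM = 0x1122334455667788  # hypothetical value, not the one in the test binary
code = encode_mov_rax_imm64(IMM)
with_fix = render(code, keep_immediate=True)
without_fix = render(code, keep_immediate=False)
```

A test in the spirit of the one described would then assert that chosen immediate bytes appear as concrete tokens in the rendered signature rather than as wildcards.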
Brings in PR #24 (issue #18 cancel), PR #25 (issue #22 partial on cancel), and PR #28 (1.7.0 release). The auto-merge handled src/sigmaker/__init__.py cleanly; tests/unit_test_sigmaker.py had overlapping test-class insertions at the same anchor, resolved by taking main's test file as the base and appending PR #26's three new test classes (TestMinimalFunctionSignatureGenerator, TestActionEnumAddsFunctionSig, TestGeneratedSignatureOrdering) before the if __name__ block.
mahmoudimus added a commit that referenced this pull request on May 27, 2026
* docs(changelog): add [1.7.1] entry for #26 (issue #17)

  Documents the new 'Find shortest unique signature for current function' action with xref fallback, the WildcardPolicy.for_x86() change that preserves x86 immediates, and the new (size, wildcards) ranking on GeneratedSignature.__lt__.

* docs(readme): clarify acknowledgements

  Make explicit that the initial port drew from @A200K's IDA-Pro-SigMaker, and that @kweatherman's sigmakerex is independent prior work within the SigMaker ecosystem. Members of the community later requested compatibility and feature parity with parts of sigmakerex's functionality (see #17), so the link is worth flagging directly. The long-form credits chain from sigmakerex's README is preserved verbatim below the paragraph.

* chore: bump version to 1.7.1

  Cuts the 1.7.1 release covering the shortest-unique-signature-for-current-function action with xref fallback (#26 / #17), the WildcardPolicy.for_x86 change that preserves x86 immediates, and the (size, wildcards) ranking on GeneratedSignature.__lt__. See CHANGELOG.md for the full set of changes.
Closes #17.
Background
Per issue #17, the existing `Create unique signature for current code address` action grows from wherever the cursor sits. That works for "I want a hook right here" use cases, but it is a poor fit for the much more common case of "give me a stable signature for this function so I can find it again after a binary update":

This PR adds a new form action that addresses both shortcomings.
Algorithm note
The minimal-function search iterates every instruction in the function as a possible start point, growing a signature from each until unique. The inner search is bounded by `max_single_signature_length` AND by the current best candidate's size, so as soon as a small candidate exists every subsequent start point gets pruned aggressively. An "ideal candidate" (size <= 5 bytes and zero wildcards) ends the outer loop early. Degenerate sigs (< 5 bytes) are rejected even when unique, since those are almost always a lone CALL with all-wildcard operand bytes that match thousands of places by accident.

When no body-internal unique sig exists, the orchestrator automatically falls back to `XrefFinder.find_xrefs(pfn.start_ea, config)`, which generates a unique signature rooted at each caller of the function and returns the best. The xref fallback path was already in the codebase; I just wire it up as the automatic next step.

The candidate ranking changed too. `GeneratedSignature.__lt__` now compares `(size, wildcards)` ascending rather than just `size`. Same-length sigs with fewer wildcards rank first. This is a strict improvement and the existing XREF action picks it up for free.

Wildcard policy: stop wildcarding x86 immediates
While reviewing this work I realized the function-search algorithm could not pick up obvious candidates like `mov rcx, 0x13371338` even when that immediate was unique across the binary, because `WildcardPolicy.for_x86()` included `BaseKind.IMM` in its wildcardable set. An x86 immediate is a literal value baked into the instruction encoding; it does not shift between binary builds, so wildcarding it only removes bytes that would have made the signature unique. `MEM`, `FAR`, and `NEAR` still get wildcarded because those operands DO encode addresses that move between builds.

This change improves signature quality for every action with `wildcard_operands=True`, not just the new `FIND_FUNCTION_SIG`. It is what lets the function-search algorithm pick up short, distinctive constants as unique signatures.

The change cannot be unit-tested under the existing harness because the mocked `idaapi` collapses every `o_*` constant to a single int, aliasing every `BaseKind` enum member onto one value. The behavior is exercised by the integration tests against the real IDA test binary.

Net behavior
main.

* `Function signature (offset +0x10 into function 0x140001040):` followed by the existing `Signature for 0x140001050: <bytes>` line and copies the bytes to the clipboard.
* `No unique signature inside function 0x...; trying xref signatures...`, then `Xref signature into 0x... (from 0x...):` followed by the existing display line.
* `Place cursor inside a function first.` and exits cleanly.
* `Operation cancelled by user`, no traceback.

Verification
* `tests/unit_test_sigmaker.py`
* `idapro-tests` (9.0/9.1)
* `idapro-tests-9.2`

Zero regressions. New test coverage:
* `TestGeneratedSignatureOrdering`: smaller-size beats larger; equal-size, fewer-wildcards beats more; equal-size-equal-wildcards is not strictly less; `_wildcard_count` helper works.
* `TestActionEnumAddsFunctionSig`: `FIND_FUNCTION_SIG == 4`; existing enum values unchanged.
* `TestMinimalFunctionSignatureGenerator`: returns the shortest-unique candidate; prune caps the inner search by current best; ideal-candidate early exit fires; raises `Unexpected` when no candidate exists; rejects degenerate sigs under 5 bytes; `MIN_USEFUL_SIG_BYTES == 5`.
* `test_minimal_function_signature_against_real_function` (integration): end-to-end against the compiled test binary; asserts the returned sig parses, matches exactly one place, and that the match falls inside the original function body. Skips cleanly when the chosen function has no body-internal unique sig.