Speed up Find-shortest-function-signature (110M-iteration is_unique fix + buffer cache)#31

Merged
mahmoudimus merged 9 commits into main from feat/function-sig-perf on May 28, 2026

@mahmoudimus

Background

A real-world Find shortest function signature run (issue #17 / PR #26) on a large database took 7.7 minutes for a single function. I profiled it in IDA with cProfile and found is_unique was enumerating every match of each candidate signature: 110+ million bin_search iterations, with idaapi.user_cancelled() polling alone costing 135 seconds (29% of the run). The short signatures produced during growth match hundreds of thousands of locations, and the old code counted all of them just to answer "is this unique?".

What I fixed

  1. is_unique bails at the second match. Uniqueness only depends on whether the count is 0, 1, or 2+, so there is no reason to enumerate every match. This is the dominant win: 110M iterations collapse to a few thousand. count_matches still enumerates fully, so the partial-on-cancel count from issue #22 is unchanged.
  2. The segment buffer is loaded once per generate(), not once per is_unique. Profiling showed InMemoryBuffer.load(SEGMENTS) was being re-run on every uniqueness check (84% of wall time after the first fix). I thread an optional cached buffer through is_unique / count_matches / find_all / _find_all_simd.
  3. Each instruction is decoded once, not O(N^2) times. MinimalFunctionSignatureGenerator now pre-decodes the whole function into a small list up front (_DecodedInstruction + _decode_function_for_anchors) and grows anchors over cached data, instead of re-walking and re-reading bytes for every anchor.
  4. The compiled _speedups extension loads in source/symlink layouts. When the plugin is loaded from a source tree while a pip-installed sigmaker namespace package without a matching compiled extension shadows it, the package-level import misses. I added a fallback that loads the _speedups extension sitting next to __init__.py, so SIMD_SPEEDUP_AVAILABLE is true in those layouts (and the shipped single-file plugin is unaffected, since it has no sibling _speedups).
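
The early bail in item 1 can be sketched outside IDA as follows. This is a minimal illustration, not the plugin's code: `iter_matches` is a hypothetical stand-in for the `bin_search` loop, it scans a plain `bytes` object, and it ignores wildcards and cancellation polling.

```python
from itertools import islice
from typing import Iterator


def iter_matches(data: bytes, pattern: bytes) -> Iterator[int]:
    """Lazily yield each offset where pattern occurs (hypothetical helper)."""
    pos = data.find(pattern)
    while pos != -1:
        yield pos
        pos = data.find(pattern, pos + 1)


def count_matches(data: bytes, pattern: bytes) -> int:
    """Full enumeration: needed when the real count is displayed."""
    return sum(1 for _ in iter_matches(data, pattern))


def is_unique(data: bytes, pattern: bytes) -> bool:
    """Pull at most two matches; a second match already answers the question."""
    first_two = list(islice(iter_matches(data, pattern), 2))
    return len(first_two) == 1
```

Because the generator is lazy, `is_unique` stops scanning as soon as a second match is found, while `count_matches` keeps its exhaustive behavior.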

I also added start_profiling / stop_profiling helpers, exposed as Edit/Plugins menu actions, so this kind of slowdown can be diagnosed in-IDA without a sys.path dance.
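
For readers who want something similar in their own plugin, a pair of helpers along these lines is one way to do it. The names `start_profiling` and `stop_profiling` come from this PR, but the bodies, the `top` parameter, and the returned report string are assumptions made for this sketch, not the merged implementation.

```python
import cProfile
import io
import pstats
from typing import Optional

# Module-level profiler so a menu action can start it and another can stop it.
_profiler: Optional[cProfile.Profile] = None


def start_profiling() -> None:
    """Begin collecting cProfile data; a second call while active is a no-op."""
    global _profiler
    if _profiler is None:
        _profiler = cProfile.Profile()
        _profiler.enable()


def stop_profiling(top: int = 20) -> str:
    """Stop collecting and return the top entries sorted by cumulative time."""
    global _profiler
    if _profiler is None:
        return ""
    _profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(_profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    _profiler = None
    return out.getvalue()
```

Calling `start_profiling()` before a slow search and printing the result of `stop_profiling()` afterwards is enough to surface a hot spot like the `user_cancelled()` polling described above.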

Changes

| Commit | What |
| --- | --- |
| _DecodedInstruction dataclass | Pre-decoded instruction container |
| _decode_function_for_anchors | One-shot function decode |
| Refactor MinimalFunctionSignatureGenerator | Grow anchors over pre-decoded data |
| benchmark_predecode_function_sig | Benchmark hook |
| Cache InMemoryBuffer per generate | One segment load per search |
| start/stop profiling helpers | In-IDA cProfile controls |
| Profiling menu actions | Edit/Plugins entries |
| Bail at second match in is_unique | The dominant fix |
| Sibling _speedups fallback | SIMD on in source/symlink layouts |
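
The decode-once idea behind the first three commits can be illustrated with a toy model. Everything here is hypothetical: the `DecodedInstruction` fields, the `insn_len` callback (a stand-in for IDA's instruction decoder), and the `is_unique` predicate are assumptions for the sketch and do not mirror the real `_DecodedInstruction` layout.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class DecodedInstruction:
    offset: int  # offset of the instruction inside the function
    raw: bytes   # the instruction's bytes, read exactly once


def decode_function(code: bytes,
                    insn_len: Callable[[bytes, int], int]) -> List[DecodedInstruction]:
    """Walk the function a single time and cache every instruction."""
    decoded, pos = [], 0
    while pos < len(code):
        size = insn_len(code, pos)
        if size <= 0:  # undecodable byte: stop instead of looping forever
            break
        decoded.append(DecodedInstruction(pos, code[pos:pos + size]))
        pos += size
    return decoded


def grow_anchor(insns: List[DecodedInstruction], start: int,
                is_unique: Callable[[bytes], bool]) -> Optional[bytes]:
    """Extend a signature one cached instruction at a time until it is unique."""
    sig = b""
    for insn in insns[start:]:
        sig += insn.raw
        if is_unique(sig):
            return sig
    return None
```

Trying every anchor then costs one decode pass plus list slicing, rather than a fresh decode walk per anchor.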

Net behavior

Measured on the largest function in the test binary, before and after:

| Build | Median wall time |
| --- | --- |
| Before (v1.7.1) | 0.554s |
| After | 0.104s |

On the real-world database that motivated this, the 7.7-minute run now completes effectively instantly. Signatures produced are byte-identical; only the work to reach them changed.

Verification

| Suite | Before | After |
| --- | --- | --- |
| Host unit | 148 OK | 166 OK |
| Docker idapro-tests (9.0/9.1) | 166 OK | 184 OK |
| Docker idapro-tests-9.2 | 166 OK | 184 OK |

Zero regressions on either image.

Notes

  • The public API surface that downstream consumers use (SignatureMaker.make_signature, XrefFinder.find_xrefs, SigMakerConfig kwargs, the Signature format specs) is unchanged. New parameters are optional with behavior-preserving defaults.
  • CREATE_UNIQUE (UniqueSignatureGenerator) still uses count_matches for its real-count display, so it does not get the early-bail. If it proves slow on very large databases, that is a separate follow-up.
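
As an illustration of "optional with behavior-preserving defaults", the cached-buffer parameter could look roughly like this. `SegmentBuffer`, `reader`, and the `buf` keyword are hypothetical names chosen for the sketch; the real code threads an `InMemoryBuffer` through `is_unique` / `count_matches` / `find_all`, and its exact signatures are not shown here.

```python
from typing import Callable, Iterable, Optional


class SegmentBuffer:
    """Hypothetical stand-in for an in-memory snapshot of the segments."""

    loads = 0  # counts how many times the expensive load ran

    def __init__(self, data: bytes) -> None:
        self.data = data

    @classmethod
    def load(cls, reader: Callable[[], bytes]) -> "SegmentBuffer":
        cls.loads += 1
        return cls(reader())


def is_unique(pattern: bytes, reader: Callable[[], bytes],
              buf: Optional[SegmentBuffer] = None) -> bool:
    """Old callers omit buf and get the previous load-per-call behavior."""
    if buf is None:
        buf = SegmentBuffer.load(reader)
    first = buf.data.find(pattern)
    return first != -1 and buf.data.find(pattern, first + 1) == -1


def generate(candidates: Iterable[bytes],
             reader: Callable[[], bytes]) -> Optional[bytes]:
    """Load the buffer once, then reuse it for every uniqueness check."""
    buf = SegmentBuffer.load(reader)
    for pattern in candidates:
        if is_unique(pattern, reader, buf=buf):
            return pattern
    return None
```

Existing call sites keep working unchanged, while `generate` pays for a single load regardless of how many candidates it tests.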

@mahmoudimus merged commit 8eb2a08 into main on May 28, 2026
2 checks passed