Commit b545e2f
Merge pull request #4055 from pratham-mcw:sgm-neon-optimization
stereo: performance optimization of sgm on Window-ARM64 #4055
### Pull Request Readiness Checklist
- This PR adds an ARM64 NEON intrinsics-based optimization for the `computeDisparityBinarySGBM` function in `stereo_binary_sgbm.cpp`.
- The new implementation uses NEON vector instructions (e.g., `vld1q_s16`, `vminq_s16`, `vqaddq_s16`), which operate on eight 16-bit values at a time. The code is guarded by the `CV_NEON` macro, so other platforms are unaffected.
- The change mirrors the existing SSE2 optimization for x64 and brings the same kind of performance benefit to ARM64.
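To illustrate how the intrinsics named above fit together, here is a minimal sketch of one SGM path-aggregation step over eight disparities. It is not the code from this PR: the function `aggregate8_sketch`, the helper `sat16`, the macro `SKETCH_HAVE_NEON`, and the parameter layout are all hypothetical, and the sketch uses the textbook recurrence `L(d) = C(d) + min(Lp(d), Lp(d-1)+P1, Lp(d+1)+P1, minLp+P2) - minLp` rather than OpenCV's exact buffer scheme. A scalar fallback stands in for the non-NEON path, in the same spirit as the `CV_NEON` guard.

```cpp
#include <algorithm>
#include <cstdint>

// Assumption: detect NEON directly; the real code relies on OpenCV's CV_NEON.
#if defined(__ARM_NEON) || defined(_M_ARM64)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
#else
#define SKETCH_HAVE_NEON 0
#endif

// Hypothetical helper: clamp to int16 range, matching the saturating
// behaviour of vqaddq_s16 / vqsubq_s16.
static inline int16_t sat16(int v)
{
    return static_cast<int16_t>(std::max(-32768, std::min(32767, v)));
}

// One aggregation step for 8 consecutive disparities.
// cost: matching costs C(d), 8 values.
// prev: previous-pixel path costs Lp(d); prev[-1] and prev[8] must be readable.
// minPrev: minimum of Lp over all disparities.
static void aggregate8_sketch(const int16_t* cost, const int16_t* prev,
                              int16_t P1, int16_t P2, int16_t minPrev,
                              int16_t* out)
{
#if SKETCH_HAVE_NEON
    const int16x8_t c    = vld1q_s16(cost);
    const int16x8_t l0   = vld1q_s16(prev);      // Lp(d)
    const int16x8_t lm   = vld1q_s16(prev - 1);  // Lp(d-1)
    const int16x8_t lp   = vld1q_s16(prev + 1);  // Lp(d+1)
    const int16x8_t p1   = vdupq_n_s16(P1);
    const int16x8_t mn   = vdupq_n_s16(minPrev);
    const int16x8_t jump = vqaddq_s16(mn, vdupq_n_s16(P2)); // minLp + P2

    int16x8_t m = vminq_s16(vqaddq_s16(lm, p1), vqaddq_s16(lp, p1));
    m = vminq_s16(vminq_s16(l0, m), jump);
    // C(d) + (best predecessor - minLp), both with saturation.
    vst1q_s16(out, vqaddq_s16(c, vqsubq_s16(m, mn)));
#else
    const int16_t jump = sat16(minPrev + P2);
    for (int d = 0; d < 8; ++d)
    {
        int16_t m = std::min(sat16(prev[d - 1] + P1), sat16(prev[d + 1] + P1));
        m = std::min(std::min(prev[d], m), jump);
        out[d] = sat16(cost[d] + sat16(m - minPrev));
    }
#endif
}
```

Both branches apply saturation at the same points, so they should produce identical results; a caller would invoke `aggregate8_sketch` once per block of eight disparities along each aggregation path.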
**Performance Improvements:**
- The optimization significantly reduces the runtime of SGM on Windows ARM64 targets.
- The table below shows timing comparisons before and after the optimization:
<img width="1047" height="199" alt="image" src="https://github.com/user-attachments/assets/0752cfc0-3c82-4595-8e3f-5d87cbdfdf96" />
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV.
- [x] The PR is proposed to the proper branch

1 parent 45dd594 commit b545e2f
1 file changed: +67 -0 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 147-149 | 147-149 | unchanged context |
| | 150-153 | 4 lines added |
| 150-152 | 154-156 | unchanged context |
| 420-422 | 424-426 | unchanged context |
| | 427-489 | 63 lines added |
| 423-425 | 490-492 | unchanged context |