refactor(paddr.c): update store_commit_queue_push logic #994

Open
skyhgzsh wants to merge 1 commit into OpenXiangShan:master from skyhgzsh:skyhgzsh-perf

@skyhgzsh skyhgzsh (Contributor) commented Apr 8, 2026

BREAKING CHANGE: store_queue diff is not backward compatible!

Copilot AI review requested due to automatic review settings April 8, 2026 07:22

Copilot AI left a comment

Pull request overview

Refactors the difftest store-commit queue encoding in paddr.c, changing how store events are normalized (address alignment), masked, and split before being pushed to the C++ queue. The PR is explicitly marked as a breaking change due to store_queue diff format incompatibility.

Changes:

  • Replaces prior store-commit push logic (including misalignment/vector special-casing) with a unified 8-byte-aligned encoding that may split stores across an 8B boundary.
  • Introduces mask_expand() to convert an 8-bit byte mask into a 64-bit per-byte data mask for masking shifted store data.
  • Adds stricter runtime validation of len (power-of-two unless cross_page_store).
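
The encoding summarized above can be illustrated with a minimal sketch. It is not the code from this PR: the `store_rec_t` layout and the `encode_store()` helper are hypothetical names introduced here, and only `mask_expand()` is named in the overview (its body below is an assumed implementation). The sketch assumes a store of 1 to 8 bytes, little-endian byte lanes, and shows how such a store could be normalized to an 8-byte-aligned address, masked, and split when it crosses an 8-byte boundary.

```c
#include <stdint.h>

/* Hypothetical record layout; the real queue entry in NEMU may differ. */
typedef struct {
  uint64_t addr;  /* 8-byte-aligned address */
  uint64_t data;  /* store data shifted into its byte lanes */
  uint8_t  mask;  /* one bit per valid byte lane */
} store_rec_t;

/* Expand an 8-bit byte mask into a 64-bit data mask:
 * bit i set in mask -> byte i of the result is 0xff. */
static uint64_t mask_expand(uint8_t mask) {
  uint64_t out = 0;
  for (int i = 0; i < 8; i++) {
    if (mask & (1u << i)) out |= 0xffull << (i * 8);
  }
  return out;
}

/* Encode a store of len bytes (assumed 1..8) at addr into one or two
 * 8-byte-aligned records. Returns the number of records written. */
static int encode_store(uint64_t addr, uint64_t data, int len,
                        store_rec_t out[2]) {
  unsigned off  = (unsigned)(addr & 7);   /* byte offset inside the 8B word */
  uint64_t base = addr & ~7ull;           /* aligned-down address */
  unsigned full = (1u << len) - 1;        /* len low bits set */

  /* Truncating to uint8_t drops the lanes that spill past byte 7. */
  uint8_t lo_mask = (uint8_t)(full << off);
  out[0].addr = base;
  out[0].mask = lo_mask;
  out[0].data = (data << (off * 8)) & mask_expand(lo_mask);
  if (off + (unsigned)len <= 8) return 1; /* fits in one aligned word */

  /* The remaining high bytes go to the next aligned word. */
  unsigned lo_bytes = 8 - off;            /* bytes already emitted */
  uint8_t hi_mask = (uint8_t)(full >> lo_bytes);
  out[1].addr = base + 8;
  out[1].mask = hi_mask;
  out[1].data = (data >> (lo_bytes * 8)) & mask_expand(hi_mask);
  return 2;
}
```

Under these assumptions, a 4-byte store of `0xAABBCCDD` at `0x1006` would yield two records: address `0x1000` with mask `0xC0`, and address `0x1008` with mask `0x03`.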

github-actions bot commented Apr 8, 2026

NEMU Performance Results

| Test | Instructions Executed | Estimated Host Throughput (instr/s) | Actual NEMU Throughput (instr/s) |
| --- | --- | --- | --- |
| bitmanip.bin | 5.157e+07 | 2.686e+07 | 2.979e+07 |
| coremark-riscv64-xs-rv64gc-o2.bin | 1.731e+08 | 1.938e+08 | 1.330e+08 |
| coremark-riscv64-xs-rv64gc-o3.bin | 1.726e+08 | 1.967e+08 | 1.858e+08 |
| coremark-riscv64-xs-rv64gcb-o3.bin | 1.697e+08 | 1.788e+08 | 1.751e+08 |
| amtest-riscv64-xs.bin | 8.673e+06 | 1.830e+07 | 1.590e+07 |
| aliastest-riscv64-xs.bin | 7.700e+06 | 1.787e+06 | 3.484e+06 |
| softprefetchtest-riscv64-xs.bin | 7.744e+06 | 3.413e+06 | 5.993e+06 |
| zacas-riscv64-xs.bin | 1.212e+07 | 5.339e+07 | 2.294e+07 |
| linux-hello | 1.856e+10 | 4.065e+07 | 4.808e+07 |
  • Host throughput is estimated based on 4GHz CPU frequency and IPC=2.5.
  • Actual throughput may vary based on the host CPU performance.

@skyhgzsh skyhgzsh force-pushed the skyhgzsh-perf branch 2 times, most recently from 355ca1a to dd287e4 Compare April 8, 2026 07:46
@skyhgzsh skyhgzsh requested a review from Copilot April 8, 2026 07:47

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.


github-actions bot commented Apr 8, 2026

NEMU Performance Results

| Test | Instructions Executed | Estimated Host Throughput (instr/s) | Actual NEMU Throughput (instr/s) |
| --- | --- | --- | --- |
| bitmanip.bin | 5.156e+07 | 2.686e+07 | 3.260e+07 |
| coremark-riscv64-xs-rv64gc-o2.bin | 1.731e+08 | 1.938e+08 | 1.799e+08 |
| coremark-riscv64-xs-rv64gc-o3.bin | 1.726e+08 | 1.967e+08 | 1.822e+08 |
| coremark-riscv64-xs-rv64gcb-o3.bin | 1.697e+08 | 1.788e+08 | 1.706e+08 |
| amtest-riscv64-xs.bin | 8.673e+06 | 1.830e+07 | 1.841e+07 |
| aliastest-riscv64-xs.bin | 7.697e+06 | 1.788e+06 | 3.801e+06 |
| softprefetchtest-riscv64-xs.bin | 7.747e+06 | 3.412e+06 | 5.900e+06 |
| zacas-riscv64-xs.bin | 1.212e+07 | 5.340e+07 | 1.840e+07 |
| linux-hello | 1.856e+10 | 4.065e+07 | 4.893e+07 |
  • Host throughput is estimated based on 4GHz CPU frequency and IPC=2.5.
  • Actual throughput may vary based on the host CPU performance.

github-actions bot commented Apr 8, 2026

NEMU Performance Results

| Test | Instructions Executed | Estimated Host Throughput (instr/s) | Actual NEMU Throughput (instr/s) |
| --- | --- | --- | --- |
| bitmanip.bin | 5.156e+07 | 2.686e+07 | 2.777e+07 |
| coremark-riscv64-xs-rv64gc-o2.bin | 1.731e+08 | 1.938e+08 | 1.809e+08 |
| coremark-riscv64-xs-rv64gc-o3.bin | 1.726e+08 | 1.967e+08 | 1.803e+08 |
| coremark-riscv64-xs-rv64gcb-o3.bin | 1.697e+08 | 1.788e+08 | 1.736e+08 |
| amtest-riscv64-xs.bin | 8.675e+06 | 1.830e+07 | 2.280e+07 |
| aliastest-riscv64-xs.bin | 7.697e+06 | 1.788e+06 | 2.921e+06 |
| softprefetchtest-riscv64-xs.bin | 7.747e+06 | 3.412e+06 | 6.263e+06 |
| zacas-riscv64-xs.bin | 1.212e+07 | 5.339e+07 | 2.058e+07 |
| linux-hello | 1.856e+10 | 4.065e+07 | 4.824e+07 |
  • Host throughput is estimated based on 4GHz CPU frequency and IPC=2.5.
  • Actual throughput may vary based on the host CPU performance.

@skyhgzsh skyhgzsh force-pushed the skyhgzsh-perf branch 2 times, most recently from 2d4da60 to 83cb867 Compare April 8, 2026 08:04

github-actions bot commented Apr 8, 2026

NEMU Performance Results

| Test | Instructions Executed | Estimated Host Throughput (instr/s) | Actual NEMU Throughput (instr/s) |
| --- | --- | --- | --- |
| bitmanip.bin | 5.156e+07 | 2.686e+07 | 2.809e+07 |
| coremark-riscv64-xs-rv64gc-o2.bin | 1.731e+08 | 1.938e+08 | 1.845e+08 |
| coremark-riscv64-xs-rv64gc-o3.bin | 1.726e+08 | 1.967e+08 | 1.822e+08 |
| coremark-riscv64-xs-rv64gcb-o3.bin | 1.697e+08 | 1.788e+08 | 1.673e+08 |
| amtest-riscv64-xs.bin | 8.673e+06 | 1.830e+07 | 1.669e+07 |
| aliastest-riscv64-xs.bin | 7.700e+06 | 1.787e+06 | 3.261e+06 |
| softprefetchtest-riscv64-xs.bin | 7.744e+06 | 3.413e+06 | 5.796e+06 |
| zacas-riscv64-xs.bin | 1.212e+07 | 5.340e+07 | 1.874e+07 |
| linux-hello | 1.856e+10 | 4.065e+07 | 4.766e+07 |
  • Host throughput is estimated based on 4GHz CPU frequency and IPC=2.5.
  • Actual throughput may vary based on the host CPU performance.

github-actions bot commented Apr 8, 2026

NEMU Performance Results

| Test | Instructions Executed | Estimated Host Throughput (instr/s) | Actual NEMU Throughput (instr/s) |
| --- | --- | --- | --- |
| bitmanip.bin | 5.156e+07 | 2.686e+07 | 2.745e+07 |
| coremark-riscv64-xs-rv64gc-o2.bin | 1.731e+08 | 1.938e+08 | 1.807e+08 |
| coremark-riscv64-xs-rv64gc-o3.bin | 1.726e+08 | 1.967e+08 | 1.836e+08 |
| coremark-riscv64-xs-rv64gcb-o3.bin | 1.697e+08 | 1.788e+08 | 1.716e+08 |
| amtest-riscv64-xs.bin | 8.673e+06 | 1.830e+07 | 1.636e+07 |
| aliastest-riscv64-xs.bin | 7.697e+06 | 1.788e+06 | 3.316e+06 |
| softprefetchtest-riscv64-xs.bin | 7.745e+06 | 3.413e+06 | 1.413e+07 |
| zacas-riscv64-xs.bin | 1.212e+07 | 5.340e+07 | 2.165e+07 |
| linux-hello | 1.856e+10 | 4.065e+07 | 4.775e+07 |
  • Host throughput is estimated based on 4GHz CPU frequency and IPC=2.5.
  • Actual throughput may vary based on the host CPU performance.

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


github-actions bot commented Apr 8, 2026

NEMU Performance Results

| Test | Instructions Executed | Estimated Host Throughput (instr/s) | Actual NEMU Throughput (instr/s) |
| --- | --- | --- | --- |
| bitmanip.bin | 5.156e+07 | 2.686e+07 | 2.974e+07 |
| coremark-riscv64-xs-rv64gc-o2.bin | 1.731e+08 | 1.938e+08 | 1.765e+08 |
| coremark-riscv64-xs-rv64gc-o3.bin | 1.726e+08 | 1.967e+08 | 1.860e+08 |
| coremark-riscv64-xs-rv64gcb-o3.bin | 1.697e+08 | 1.788e+08 | 1.710e+08 |
| amtest-riscv64-xs.bin | 8.673e+06 | 1.830e+07 | 1.277e+07 |
| aliastest-riscv64-xs.bin | 7.697e+06 | 1.788e+06 | 3.065e+06 |
| softprefetchtest-riscv64-xs.bin | 7.745e+06 | 3.413e+06 | 6.104e+06 |
| zacas-riscv64-xs.bin | 1.212e+07 | 5.340e+07 | 1.895e+07 |
| linux-hello | 1.856e+10 | 4.065e+07 | 4.893e+07 |
  • Host throughput is estimated based on 4GHz CPU frequency and IPC=2.5.
  • Actual throughput may vary based on the host CPU performance.

github-actions bot commented Apr 8, 2026

NEMU Performance Results

| Test | Instructions Executed | Estimated Host Throughput (instr/s) | Actual NEMU Throughput (instr/s) |
| --- | --- | --- | --- |
| bitmanip.bin | 5.156e+07 | 2.686e+07 | 2.990e+07 |
| coremark-riscv64-xs-rv64gc-o2.bin | 1.731e+08 | 1.938e+08 | 1.831e+08 |
| coremark-riscv64-xs-rv64gc-o3.bin | 1.726e+08 | 1.967e+08 | 1.834e+08 |
| coremark-riscv64-xs-rv64gcb-o3.bin | 1.697e+08 | 1.788e+08 | 1.693e+08 |
| amtest-riscv64-xs.bin | 8.673e+06 | 1.830e+07 | 1.802e+07 |
| aliastest-riscv64-xs.bin | 7.697e+06 | 1.788e+06 | 3.051e+06 |
| softprefetchtest-riscv64-xs.bin | 7.745e+06 | 3.413e+06 | 5.809e+06 |
| zacas-riscv64-xs.bin | 1.212e+07 | 5.340e+07 | 1.873e+07 |
| linux-hello | 1.856e+10 | 4.065e+07 | 4.904e+07 |
  • Host throughput is estimated based on 4GHz CPU frequency and IPC=2.5.
  • Actual throughput may vary based on the host CPU performance.

github-actions bot commented Apr 8, 2026

NEMU Performance Results

| Test | Instructions Executed | Estimated Host Throughput (instr/s) | Actual NEMU Throughput (instr/s) |
| --- | --- | --- | --- |
| bitmanip.bin | 5.157e+07 | 2.686e+07 | 2.641e+07 |
| coremark-riscv64-xs-rv64gc-o2.bin | 1.731e+08 | 1.938e+08 | 1.819e+08 |
| coremark-riscv64-xs-rv64gc-o3.bin | 1.726e+08 | 1.967e+08 | 1.856e+08 |
| coremark-riscv64-xs-rv64gcb-o3.bin | 1.697e+08 | 1.788e+08 | 1.716e+08 |
| amtest-riscv64-xs.bin | 8.673e+06 | 1.830e+07 | 1.519e+07 |
| aliastest-riscv64-xs.bin | 7.697e+06 | 1.788e+06 | 3.113e+06 |
| softprefetchtest-riscv64-xs.bin | 7.745e+06 | 3.413e+06 | 5.647e+06 |
| zacas-riscv64-xs.bin | 1.212e+07 | 5.339e+07 | 1.963e+07 |
| linux-hello | 1.856e+10 | 4.065e+07 | 4.878e+07 |
  • Host throughput is estimated based on 4GHz CPU frequency and IPC=2.5.
  • Actual throughput may vary based on the host CPU performance.

@huxuan0307 huxuan0307 (Contributor) left a comment

LGTM

github-actions bot commented Apr 8, 2026

NEMU Performance Results

| Test | Instructions Executed | Estimated Host Throughput (instr/s) | Actual NEMU Throughput (instr/s) |
| --- | --- | --- | --- |
| bitmanip.bin | 5.156e+07 | 2.686e+07 | 2.796e+07 |
| coremark-riscv64-xs-rv64gc-o2.bin | 1.731e+08 | 1.938e+08 | 1.758e+08 |
| coremark-riscv64-xs-rv64gc-o3.bin | 1.726e+08 | 1.967e+08 | 1.783e+08 |
| coremark-riscv64-xs-rv64gcb-o3.bin | 1.697e+08 | 1.788e+08 | 1.709e+08 |
| amtest-riscv64-xs.bin | 8.675e+06 | 1.830e+07 | 1.611e+07 |
| aliastest-riscv64-xs.bin | 7.697e+06 | 1.788e+06 | 3.414e+06 |
| softprefetchtest-riscv64-xs.bin | 7.747e+06 | 3.412e+06 | 6.175e+06 |
| zacas-riscv64-xs.bin | 1.212e+07 | 5.340e+07 | 1.876e+07 |
| linux-hello | 1.855e+10 | 4.067e+07 | 4.775e+07 |
  • Host throughput is estimated based on 4GHz CPU frequency and IPC=2.5.
  • Actual throughput may vary based on the host CPU performance.

github-actions bot commented Apr 8, 2026

NEMU Performance Results

| Test | Instructions Executed | Estimated Host Throughput (instr/s) | Actual NEMU Throughput (instr/s) |
| --- | --- | --- | --- |
| bitmanip.bin | 5.156e+07 | 2.686e+07 | 2.743e+07 |
| coremark-riscv64-xs-rv64gc-o2.bin | 1.731e+08 | 1.938e+08 | 1.815e+08 |
| coremark-riscv64-xs-rv64gc-o3.bin | 1.726e+08 | 1.967e+08 | 1.786e+08 |
| coremark-riscv64-xs-rv64gcb-o3.bin | 1.697e+08 | 1.788e+08 | 1.691e+08 |
| amtest-riscv64-xs.bin | 8.675e+06 | 1.830e+07 | 1.773e+07 |
| aliastest-riscv64-xs.bin | 7.697e+06 | 1.788e+06 | 3.528e+06 |
| softprefetchtest-riscv64-xs.bin | 7.747e+06 | 3.412e+06 | 8.553e+06 |
| zacas-riscv64-xs.bin | 1.212e+07 | 5.340e+07 | 2.001e+07 |
| linux-hello | 1.856e+10 | 4.065e+07 | 4.813e+07 |
  • Host throughput is estimated based on 4GHz CPU frequency and IPC=2.5.
  • Actual throughput may vary based on the host CPU performance.

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


github-actions bot commented Apr 9, 2026

NEMU Performance Results

| Test | Instructions Executed | Estimated Host Throughput (instr/s) | Actual NEMU Throughput (instr/s) |
| --- | --- | --- | --- |
| bitmanip.bin | 5.157e+07 | 2.686e+07 | 3.005e+07 |
| coremark-riscv64-xs-rv64gc-o2.bin | 1.731e+08 | 1.938e+08 | 1.779e+08 |
| coremark-riscv64-xs-rv64gc-o3.bin | 1.726e+08 | 1.967e+08 | 1.780e+08 |
| coremark-riscv64-xs-rv64gcb-o3.bin | 1.697e+08 | 1.788e+08 | 1.722e+08 |
| amtest-riscv64-xs.bin | 8.673e+06 | 1.830e+07 | 1.645e+07 |
| aliastest-riscv64-xs.bin | 7.699e+06 | 1.787e+06 | 3.348e+06 |
| softprefetchtest-riscv64-xs.bin | 7.745e+06 | 3.413e+06 | 5.913e+06 |
| zacas-riscv64-xs.bin | 1.212e+07 | 5.340e+07 | 1.934e+07 |
| linux-hello | 1.856e+10 | 4.065e+07 | 4.868e+07 |
  • Host throughput is estimated based on 4GHz CPU frequency and IPC=2.5.
  • Actual throughput may vary based on the host CPU performance.

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.


skyhgzsh added a commit to skyhgzsh/ready-to-run that referenced this pull request Apr 9, 2026
* NEMU commit: de7d035fd308bbd95312d58e4d0fe41b07c4b5ad
* NEMU configs:
    * riscv64-xs-ref_defconfig
    * riscv64-dual-xs-ref_defconfig
    * riscv64-xs-ref-debug_defconfig
    * riscv64-dual-xs-ref-debug_defconfig

Including:
  * refactor(paddr.c, macro.h): update store_commit_queue_push logic and add new macro([#994](OpenXiangShan/NEMU#994))
@skyhgzsh skyhgzsh force-pushed the skyhgzsh-perf branch 2 times, most recently from f4a5402 to dc8aa6c Compare April 10, 2026 03:39
@github-actions

NEMU Performance Results

| Test | Instructions Executed | Estimated Host Throughput (instr/s) | Actual NEMU Throughput (instr/s) |
| --- | --- | --- | --- |
| bitmanip.bin | 5.156e+07 | 2.686e+07 | 2.913e+07 |
| coremark-riscv64-xs-rv64gc-o2.bin | 1.731e+08 | 1.938e+08 | 1.712e+08 |
| coremark-riscv64-xs-rv64gc-o3.bin | 1.726e+08 | 1.967e+08 | 1.783e+08 |
| coremark-riscv64-xs-rv64gcb-o3.bin | 1.697e+08 | 1.788e+08 | 1.699e+08 |
| amtest-riscv64-xs.bin | 8.673e+06 | 1.830e+07 | 2.077e+07 |
| aliastest-riscv64-xs.bin | 7.697e+06 | 1.788e+06 | 3.268e+06 |
| softprefetchtest-riscv64-xs.bin | 7.745e+06 | 3.413e+06 | 4.977e+06 |
| zacas-riscv64-xs.bin | 1.212e+07 | 5.340e+07 | 1.863e+07 |
| linux-hello | 1.855e+10 | 4.067e+07 | 4.777e+07 |
  • Host throughput is estimated based on 4GHz CPU frequency and IPC=2.5.
  • Actual throughput may vary based on the host CPU performance.

@github-actions

NEMU Performance Results

| Test | Instructions Executed | Estimated Host Throughput (instr/s) | Actual NEMU Throughput (instr/s) |
| --- | --- | --- | --- |
| bitmanip.bin | 5.157e+07 | 2.686e+07 | 2.893e+07 |
| coremark-riscv64-xs-rv64gc-o2.bin | 1.731e+08 | 1.938e+08 | 1.742e+08 |
| coremark-riscv64-xs-rv64gc-o3.bin | 1.726e+08 | 1.967e+08 | 1.808e+08 |
| coremark-riscv64-xs-rv64gcb-o3.bin | 1.697e+08 | 1.788e+08 | 1.661e+08 |
| amtest-riscv64-xs.bin | 8.673e+06 | 1.830e+07 | 1.669e+07 |
| aliastest-riscv64-xs.bin | 7.700e+06 | 1.787e+06 | 3.178e+06 |
| softprefetchtest-riscv64-xs.bin | 7.745e+06 | 3.413e+06 | 5.495e+06 |
| zacas-riscv64-xs.bin | 1.212e+07 | 5.340e+07 | 2.053e+07 |
| linux-hello | 1.855e+10 | 4.067e+07 | 4.802e+07 |
  • Host throughput is estimated based on 4GHz CPU frequency and IPC=2.5.
  • Actual throughput may vary based on the host CPU performance.

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


…add new macro

* if a committed store crosses an 8-byte boundary,
the function store_commit_queue_push() splits
the store commit into two parts and pushes them
into the commit queue separately.

* add macro SAFE_BITMASK for generating a bitmask safely.

* add macro SAFE_BITMASKRANGE for generating a range-based bitmask safely.

* add macro IS_POW_OF_2 for checking whether a number is a power of 2.

BREAKING CHANGE: store_queue diff is not backward compatible!
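
For readers unfamiliar with why such macros are called "safe", here is one plausible shape for them. The macro names come from the commit message above, but the definitions, the `(hi, lo)` argument order, and the handling of a width of 64 are assumptions made for illustration; the definitions in macro.h may differ. The usual hazard is that `1ull << 64` is undefined behavior in C, so a naive `(1ull << n) - 1` breaks for a full-width mask.

```c
#include <stdint.h>

/* Assumed definition: mask with the low n bits set. A width of 64 or
 * more yields all ones instead of an undefined shift by >= 64.
 * n is assumed to be non-negative. */
#define SAFE_BITMASK(n) \
  ((n) >= 64 ? ~0ull : ((1ull << (n)) - 1))

/* Assumed definition: mask with bits hi..lo (inclusive) set,
 * for 63 >= hi >= lo >= 0. */
#define SAFE_BITMASKRANGE(hi, lo) \
  (SAFE_BITMASK((hi) - (lo) + 1) << (lo))

/* Assumed definition: nonzero iff x is a power of two; zero is rejected
 * because 0 & (0 - 1) == 0 would otherwise pass the bit trick. */
#define IS_POW_OF_2(x) ((x) != 0 && (((x) & ((x) - 1)) == 0))
```

With these definitions, a check such as `IS_POW_OF_2(len)` would accept store lengths 1, 2, 4, and 8 and reject 0, 3, or 6, which matches the kind of `len` validation described in the review overview.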
@skyhgzsh skyhgzsh requested a review from Copilot April 10, 2026 07:19
skyhgzsh added a commit to skyhgzsh/ready-to-run that referenced this pull request Apr 10, 2026
* NEMU commit: de7d035fd308bbd95312d58e4d0fe41b07c4b5ad
* NEMU configs:
    * riscv64-xs-ref_defconfig
    * riscv64-dual-xs-ref_defconfig
    * riscv64-xs-ref-debug_defconfig
    * riscv64-dual-xs-ref-debug_defconfig

Including:
  * refactor(paddr.c, macro.h): update store_commit_queue_push logic and add new macro([#994](OpenXiangShan/NEMU#994))
skyhgzsh added a commit to OpenXiangShan/ready-to-run that referenced this pull request Apr 10, 2026
* NEMU commit: de7d035fd308bbd95312d58e4d0fe41b07c4b5ad
* NEMU configs:
    * riscv64-xs-ref_defconfig
    * riscv64-dual-xs-ref_defconfig
    * riscv64-xs-ref-debug_defconfig
    * riscv64-dual-xs-ref-debug_defconfig

Including:
  * refactor(paddr.c, macro.h): update store_commit_queue_push logic and add new macro([#994](OpenXiangShan/NEMU#994))

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


@github-actions

NEMU Performance Results

| Test | Instructions Executed | Estimated Host Throughput (instr/s) | Actual NEMU Throughput (instr/s) |
| --- | --- | --- | --- |
| bitmanip.bin | 5.156e+07 | 2.686e+07 | 2.831e+07 |
| coremark-riscv64-xs-rv64gc-o2.bin | 1.731e+08 | 1.938e+08 | 1.769e+08 |
| coremark-riscv64-xs-rv64gc-o3.bin | 1.726e+08 | 1.967e+08 | 1.821e+08 |
| coremark-riscv64-xs-rv64gcb-o3.bin | 1.697e+08 | 1.788e+08 | 1.664e+08 |
| amtest-riscv64-xs.bin | 8.673e+06 | 1.830e+07 | 1.660e+07 |
| aliastest-riscv64-xs.bin | 7.700e+06 | 1.787e+06 | 3.245e+06 |
| softprefetchtest-riscv64-xs.bin | 7.745e+06 | 3.413e+06 | 6.132e+06 |
| zacas-riscv64-xs.bin | 1.212e+07 | 5.340e+07 | 2.278e+07 |
| linux-hello | 1.855e+10 | 4.067e+07 | 4.728e+07 |
  • Host throughput is estimated based on 4GHz CPU frequency and IPC=2.5.
  • Actual throughput may vary based on the host CPU performance.

3 participants