misc: Add benchmark filters to manual perf workflow by happy-lx · Pull Request #787 · OpenXiangShan/GEM5

happy-lx · 2026-03-12T10:16:27Z

Add a specific_benchmarks workflow input and pass it to parallel_sim.sh. Filter workloads with case-insensitive substring matching. Keep default behavior unchanged when filter is empty.

Change-Id: I7686d05c28aca7234aa9357be4d5bdf05d103236

Summary by CodeRabbit

New Features
- Added selective benchmark filtering to performance test workflows. Users can specify a comma-separated list of benchmarks for targeted testing scenarios, or leave empty to execute the complete benchmark suite. Enhanced error handling ensures invalid benchmark filters are properly reported and handled.

coderabbitai · 2026-03-12T10:16:59Z

Walkthrough

A new benchmark filtering input specific_benchmarks was added to the perf workflows and threaded into parallel_sim.sh, which now supports case-insensitive, comma-separated benchmark filtering and aborts if no workloads match.
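The filtering behaviour described above can be sketched roughly as follows. This is not the PR's `apply_benchmark_filter()`: the function name `filter_workloads_sketch`, the sample rows, and the assumption that the workload name is the first whitespace-separated field are all illustrative.

```bash
#!/bin/sh
# Hypothetical helper; not the PR's apply_benchmark_filter().
# stdin: workload rows (first field assumed to be the workload name).
# $1:    comma-separated filter string; empty means "keep everything".
# Exit status is 1 when no row is selected, so the caller can abort.
filter_workloads_sketch() {
    awk -v filters="$1" '
        BEGIN {
            n = split(filters, raw, ",")
            valid = 0
            for (i = 1; i <= n; i++) {
                p = raw[i]
                gsub(/^[ \t]+|[ \t]+$/, "", p)   # trim surrounding blanks
                if (p != "") patterns[++valid] = tolower(p)
            }
        }
        valid == 0 { print; kept++; next }       # no filter: pass through
        {
            name = tolower($1)
            for (i = 1; i <= valid; i++) {
                if (index(name, patterns[i]) > 0) { print; kept++; next }
            }
        }
        END { if (kept == 0) exit 1 }            # nothing selected: signal failure
    '
}

# Sample rows are made up for illustration; prints the mcf_100 and GCC_200 rows.
printf 'mcf_100 /ckpt/a 0.5\nGCC_200 /ckpt/b 0.3\nbwaves_1 /ckpt/c 0.2\n' |
    filter_workloads_sketch 'gcc, MCF'
```

A caller would typically check the exit status and stop before launching any parallel runs when it is non-zero.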

Changes

| Cohort / File(s) | Summary |
|---|---|
| Workflow templates: `.github/workflows/gem5-perf-template.yml`, `.github/workflows/manual-perf.yml` | Added `specific_benchmarks` workflow input (string, optional, default ""), forwarded from manual-perf to the template and passed into the gem5 execution step. |
| Performance script: `util/xs_scripts/parallel_sim.sh` | Added `apply_benchmark_filter()` and `filtered_workloads` handling; extended argument parsing to accept `benchmark_filters` (backwards-compatible), integrated filtering before parallel runs, and improved failure cleanup/abort behavior. |

Sequence Diagram(s)

sequenceDiagram
    participant User as Workflow Trigger
    participant Manual as manual-perf.yml
    participant Template as gem5-perf-template.yml
    participant Script as parallel_sim.sh
    participant Exec as Worker/Executor

    User->>Manual: trigger with specific_benchmarks
    Manual->>Template: pass specific_benchmarks via workflow_call
    Template->>Script: invoke parallel_sim.sh with benchmark_filters arg
    Script->>Script: apply_benchmark_filter() -> produce filtered_workloads
    Script->>Exec: execute filtered workload set (or abort if none)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

util: support config files in parallel_sim.sh #610: Changes to util/xs_scripts/parallel_sim.sh argument parsing and invocation logic (closely related to the script edits here).

Suggested labels

perf

Suggested reviewers

jensen-yan
Yakkhini
tastynoob

Poem

🐰 I hopped through workflows, nose so keen,
I filtered benchmarks, neat and lean.
Commas and cases I sort with care,
Only chosen tests now dare to dare—
Hooray for runs that finish fair! 🥕

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title uses the 'misc:' prefix, which is vague and doesn't clearly convey the main purpose. While it mentions 'benchmark filters' and 'manual perf workflow', the 'misc:' prefix masks the significance of adding filtering capability to the workflow. | Consider using a more specific prefix like 'feat:' or 'perf:' instead of 'misc:' to better reflect that this adds new functionality for filtering benchmarks in the performance workflow. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


Add a specific_benchmarks workflow input and pass it to parallel_sim.sh. Filter workloads in-memory with case-insensitive substring matching. Keep default behavior unchanged when filter is empty. Change-Id: I7686d05c28aca7234aa9357be4d5bdf05d103236

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

.github/workflows/gem5-perf-template.yml (1)

151-156: ⚠️ Potential issue | 🔴 Critical

Pass specific_benchmarks and extra_args through environment variables instead of directly interpolating into shell source.

`${{ inputs.specific_benchmarks }}` and `${{ inputs.extra_args }}` are user-controlled inputs that GitHub Actions expands into the script text before the shell runs. A value containing `"`, a backtick, or `$()` can break out of the quoting and execute arbitrary commands on the self-hosted runner. The same vulnerability exists in the metadata write, where specific_benchmarks is interpolated into an echo statement.

Proposed fix

       - name: XS-GEM5 - Run performance test
         # ${{ steps.config.outputs.comment }}
         env:
           GCBV_REF_SO: "/nfs/home/share/gem5_ci/ref/normal/riscv64-nemu-notama-tvalref-so"
           GCB_RESTORER: ""
           GEM5_HOME: ${{ github.workspace }}
           GEM5_BUILD_TYPE: fast
+          SPECIFIC_BENCHMARKS: ${{ inputs.specific_benchmarks }}
+          EXTRA_ARGS: ${{ inputs.extra_args }}
         run: |
           mkdir -p $GEM5_HOME/util/xs_scripts/test
           cd $GEM5_HOME/util/xs_scripts/test
           CONFIG_PATH="${{ inputs.config_path }}"
           if [[ "$CONFIG_PATH" == /* ]]; then
             echo "Error: config_path must be repo-root-relative, got absolute path: '$CONFIG_PATH'"
             exit 1
           fi
           if [[ "$CONFIG_PATH" != *.py ]]; then
             echo "Error: config_path must be a .py gem5 config file path, got: '$CONFIG_PATH'"
             exit 1
           fi
           CONFIG_PATH="$GEM5_HOME/$CONFIG_PATH"
           bash ../parallel_sim.sh "$(realpath "$CONFIG_PATH")" \
             ${{ steps.config.outputs.checkpoint_list }} \
             ${{ steps.config.outputs.checkpoint_root_node}} \
             spec_all \
-            "${{ inputs.specific_benchmarks }}" \
-            "${{ inputs.extra_args }}"
+            "$SPECIFIC_BENCHMARKS" \
+            "$EXTRA_ARGS"
       - name: Archive performance data
         if: always()
+        env:
+          SPECIFIC_BENCHMARKS: ${{ inputs.specific_benchmarks }}
         run: |
           # Create archive directory with timestamp and commit
@@
-          echo "specific_benchmarks: ${{ inputs.specific_benchmarks }}" >> "$TARGET_DIR/metadata.txt"
+          printf 'specific_benchmarks: %s\n' "$SPECIFIC_BENCHMARKS" >> "$TARGET_DIR/metadata.txt"
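The difference between the two forms can be reproduced locally with a small experiment. The sketch below only simulates the expansion step by building script text with string interpolation; the payload is a harmless, made-up value and the variable names other than `SPECIFIC_BENCHMARKS` are illustrative.

```bash
#!/bin/sh
# Simulates pasting untrusted text into script source versus passing it
# through an environment variable. The payload is harmless and hypothetical.
payload='x"; echo INJECTED; echo "'

# Unsafe: the value becomes part of the script text, as with ${{ ... }} expansion.
unsafe_script="echo \"filter: ${payload}\""
unsafe_out="$(sh -c "$unsafe_script")"

# Safe: the script text is fixed; the value arrives as data in the environment.
safe_out="$(SPECIFIC_BENCHMARKS="$payload" sh -c 'printf "filter: %s\n" "$SPECIFIC_BENCHMARKS"')"

printf 'unsafe form printed:\n%s\n' "$unsafe_out"
printf 'safe form printed:\n%s\n' "$safe_out"
```

In the unsafe form the embedded `echo INJECTED` runs as a separate command, while in the safe form the same characters are printed literally.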

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.github/workflows/gem5-perf-template.yml around lines 151 - 156, The step
currently interpolates user inputs directly into the shell command (the
parallel_sim.sh invocation using ${{ inputs.specific_benchmarks }} and ${{
inputs.extra_args }}) and into a metadata echo, which is unsafe; instead expose
those inputs as environment variables (e.g. SPECIFIC_BENCHMARKS and EXTRA_ARGS)
in the job/step env and then call parallel_sim.sh with quoted references
("$SPECIFIC_BENCHMARKS" "$EXTRA_ARGS"), and update the metadata write to echo
the env var with proper quoting (echo "$SPECIFIC_BENCHMARKS") so no
user-controlled string is expanded by the shell before execution; locate the
invocation of parallel_sim.sh and the metadata echo to make these changes.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@util/xs_scripts/parallel_sim.sh`:
- Around line 124-128: The awk filter currently lowercases and searches the
entire row via lower_line = tolower($0) and index(lower_line, patterns[i]) which
causes matches in checkpoint paths and numeric columns; change it to only
examine the workload name by using lower_line = tolower($1) (and keep the index
check against patterns[i]) so pattern matching is constrained to the first field
(workload_name) in the loop that uses variables lower_line, patterns and valid.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3ac2d9b9-cfa4-43e1-8b86-08bb50a9b390

📥 Commits

Reviewing files that changed from the base of the PR and between 280bf4a and 7149af1.

📒 Files selected for processing (3)

.github/workflows/gem5-perf-template.yml
.github/workflows/manual-perf.yml
util/xs_scripts/parallel_sim.sh

util/xs_scripts/parallel_sim.sh

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

util/xs_scripts/parallel_sim.sh (1)

124-128: ⚠️ Potential issue | 🟠 Major

Match filters against the workload name only.

Lines 125-127 currently search tolower($0), so a filter can match checkpoint paths or numeric columns instead of the workload identifier. With this file format, that will over-select unrelated rows and make specific_benchmarks behave unpredictably.

Suggested fix

         {
-            lower_line = tolower($0)
+            lower_name = tolower($1)
             for (i = 1; i <= valid; i++) {
-                if (index(lower_line, patterns[i]) > 0) {
+                if (index(lower_name, patterns[i]) > 0) {
                     print $0
                     next
                 }
             }
         }
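The over-selection described in this comment can be illustrated with a toy example. The rows and paths below are invented for illustration and assume a layout of workload name, checkpoint path, and a numeric column; the real list format may differ.

```bash
#!/bin/sh
# Hypothetical rows: workload name, checkpoint path, weight.
rows='bwaves_100 /nfs/ckpt/mcf_cache/bwaves_100 0.25
mcf_200 /nfs/ckpt/mcf_cache/mcf_200 0.50'

# Whole-line match: the pattern also hits the path column.
whole_line="$(printf '%s\n' "$rows" |
    awk -v pat='mcf' 'index(tolower($0), pat) > 0 { print $1 }')"

# First-field match: only the workload name is examined.
name_only="$(printf '%s\n' "$rows" |
    awk -v pat='mcf' 'index(tolower($1), pat) > 0 { print $1 }')"

printf 'whole line: %s\n' "$(printf '%s' "$whole_line" | tr '\n' ' ')"
printf 'name only:  %s\n' "$name_only"
```

With these rows the whole-line match selects both workloads because `mcf` appears in the path, while the first-field match selects only `mcf_200`.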

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@util/xs_scripts/parallel_sim.sh` around lines 124 - 128, The awk filter is
matching against the whole line (tolower($0)) so patterns can match checkpoint
paths or numeric columns; change it to match only the workload identifier field
instead (e.g., use tolower($1) or the actual workload field variable) so the
loop that checks patterns[i] and the print $0 only trigger when the workload
name matches; update the line "lower_line = tolower($0)" to "lower_line =
tolower($1)" (or the correct workload field) and keep the existing loop over
patterns[i] and valid.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/gem5-perf-template.yml:
- Around line 19-23: The workflow treats filtered runs the same as full-suite
runs; when inputs.specific_benchmarks is non-empty you must either skip
full-suite scoring or isolate artifacts/retention by carrying a sanitized filter
key. Update the logic that uses inputs.benchmark_type (and the scoring/summary
and archive/cleanup steps referenced by names like the scoring job, summary job,
and archive retention cleanup) to detect inputs.specific_benchmarks != "" and do
one of: 1) short-circuit the scoring/summary jobs when a filter is active, or 2)
compute a sanitized_filter (e.g., replace commas with dashes, remove unsafe
chars) and append it to artifact names, archive keys and retention-bucket
identifiers so filtered runs are stored/aged separately from baseline runs;
apply this change consistently where benchmark_type is used for naming/cleanup
and in the scoring/summary steps.
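One way the suggested sanitized filter key could be derived is sketched below. The function name `sanitize_filter_key`, the allowed character set, and the `perf-spec_all` artifact prefix are assumptions made for illustration, not part of the workflow.

```bash
#!/bin/sh
# Hypothetical helper: turn a user-supplied filter into a key that is safe
# to embed in artifact or archive names.
sanitize_filter_key() {
    # Lowercase, map commas to dashes, then drop every character outside
    # the assumed safe set [a-z0-9._-].
    printf '%s' "$1" | tr 'A-Z,' 'a-z-' | tr -cd 'a-z0-9._-'
    printf '\n'
}

filter='GCC, mcf'
suffix="$(sanitize_filter_key "$filter")"
# Append the key only when a filter is active, so unfiltered runs keep
# their existing names.
artifact_name="perf-spec_all${suffix:+-$suffix}"
printf '%s\n' "$artifact_name"
```

An empty filter yields an empty key, so baseline runs and filtered runs end up under different names without changing the baseline naming.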


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7bad3a43-d153-4c30-9709-e41169d181e9

📥 Commits

Reviewing files that changed from the base of the PR and between 7149af1 and 986a2e6.

📒 Files selected for processing (3)

.github/workflows/gem5-perf-template.yml
.github/workflows/manual-perf.yml
util/xs_scripts/parallel_sim.sh

🚧 Files skipped from review as they are similar to previous changes (1)

.github/workflows/manual-perf.yml

.github/workflows/gem5-perf-template.yml

github-actions · 2026-03-12T11:33:37Z

🚀 Coremark Smoke Test Results

| Branch | IPC | Change |
|---|---|---|
| Base (`xs-dev`) | `2.2607` | - |
| This PR | `2.2607` | ➡️ `0.0000` (`0.00%`) |

✅ Difftest smoke test passed!

happy-lx force-pushed the specific-bmk-support branch from 7149af1 to 986a2e6 Compare March 12, 2026 10:19

happy-lx requested a review from jensen-yan March 12, 2026 10:20

coderabbitai bot reviewed Mar 12, 2026


jensen-yan approved these changes Mar 12, 2026


coderabbitai bot reviewed Mar 12, 2026


happy-lx merged commit a6f4a18 into xs-dev Mar 13, 2026
8 of 12 checks passed

happy-lx deleted the specific-bmk-support branch March 13, 2026 02:18
