Skip to content

misc: Add benchmark filters to manual perf workflow#787

Merged
happy-lx merged 1 commit intoxs-devfrom
specific-bmk-support
Mar 13, 2026
Merged

misc: Add benchmark filters to manual perf workflow#787
happy-lx merged 1 commit intoxs-devfrom
specific-bmk-support

Conversation

@happy-lx
Copy link
Copy Markdown
Contributor

@happy-lx happy-lx commented Mar 12, 2026

Add a specific_benchmarks workflow input and pass it to parallel_sim.sh. Filter workloads with case-insensitive substring matching. Keep default behavior unchanged when filter is empty.

Change-Id: I7686d05c28aca7234aa9357be4d5bdf05d103236

Summary by CodeRabbit

  • New Features
    • Added selective benchmark filtering to performance test workflows. Users can specify a comma-separated list of benchmarks for targeted testing scenarios, or leave empty to execute the complete benchmark suite. Enhanced error handling ensures invalid benchmark filters are properly reported and handled.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 12, 2026

📝 Walkthrough

Walkthrough

A new benchmark filtering input specific_benchmarks was added to the perf workflows and threaded into parallel_sim.sh, which now supports case-insensitive, comma-separated benchmark filtering and aborts if no workloads match.

Changes

Cohort / File(s) Summary
Workflow templates
.github/workflows/gem5-perf-template.yml, .github/workflows/manual-perf.yml
Added specific_benchmarks workflow input (string, optional, default ""), forwarded from manual-perf to the template and passed into the gem5 execution step.
Performance script
util/xs_scripts/parallel_sim.sh
Added apply_benchmark_filter() and filtered_workloads handling; extended argument parsing to accept benchmark_filters (backwards-compatible), integrated filtering before parallel runs, and improved failure cleanup/abort behavior.

Sequence Diagram(s)

sequenceDiagram
    participant User as Workflow Trigger
    participant Manual as manual-perf.yml
    participant Template as gem5-perf-template.yml
    participant Script as parallel_sim.sh
    participant Exec as Worker/Executor

    User->>Manual: trigger with specific_benchmarks
    Manual->>Template: pass specific_benchmarks via workflow_call
    Template->>Script: invoke parallel_sim.sh with benchmark_filters arg
    Script->>Script: apply_benchmark_filter() -> produce filtered_workloads
    Script->>Exec: execute filtered workload set (or abort if none)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

perf

Suggested reviewers

  • jensen-yan
  • Yakkhini
  • tastynoob

Poem

🐰 I hopped through workflows, nose so keen,
I filtered benchmarks, neat and lean.
Commas and cases I sort with care,
Only chosen tests now dare to dare—
Hooray for runs that finish fair! 🥕

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title uses 'misc:' prefix which is vague and doesn't clearly convey the main purpose. While it mentions 'benchmark filters' and 'manual perf workflow', the 'misc:' prefix masks the actual significance of adding filtering capability to the workflow. Consider using a more specific prefix like 'feat:' or 'perf:' and removing 'misc:' to better reflect that this adds new functionality for filtering benchmarks in the performance workflow.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch specific-bmk-support

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@happy-lx happy-lx force-pushed the specific-bmk-support branch from 7149af1 to 986a2e6 Compare March 12, 2026 10:19
@happy-lx happy-lx requested a review from jensen-yan March 12, 2026 10:20
Add a specific_benchmarks workflow input and pass it to parallel_sim.sh.
Filter workloads in-memory with case-insensitive substring matching.
Keep default behavior unchanged when filter is empty.

Change-Id: I7686d05c28aca7234aa9357be4d5bdf05d103236
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/gem5-perf-template.yml (1)

151-156: ⚠️ Potential issue | 🔴 Critical

Pass specific_benchmarks and extra_args through environment variables instead of directly interpolating into shell source.

${{ inputs.specific_benchmarks }} and ${{ inputs.extra_args }} are user-controlled inputs expanded by GitHub Actions before the shell runs. Values containing ", backticks, or $() will break quoting and execute arbitrary commands on the self-hosted runner. The same vulnerability exists in the metadata write where specific_benchmarks is used in an echo statement.

Proposed fix
       - name: XS-GEM5 - Run performance test
         # ${{ steps.config.outputs.comment }}
         env:
           GCBV_REF_SO: "/nfs/home/share/gem5_ci/ref/normal/riscv64-nemu-notama-tvalref-so"
           GCB_RESTORER: ""
           GEM5_HOME: ${{ github.workspace }}
           GEM5_BUILD_TYPE: fast
+          SPECIFIC_BENCHMARKS: ${{ inputs.specific_benchmarks }}
+          EXTRA_ARGS: ${{ inputs.extra_args }}
         run: |
           mkdir -p $GEM5_HOME/util/xs_scripts/test
           cd $GEM5_HOME/util/xs_scripts/test
           CONFIG_PATH="${{ inputs.config_path }}"
           if [[ "$CONFIG_PATH" == /* ]]; then
             echo "Error: config_path must be repo-root-relative, got absolute path: '$CONFIG_PATH'"
             exit 1
           fi
           if [[ "$CONFIG_PATH" != *.py ]]; then
             echo "Error: config_path must be a .py gem5 config file path, got: '$CONFIG_PATH'"
             exit 1
           fi
           CONFIG_PATH="$GEM5_HOME/$CONFIG_PATH"
           bash ../parallel_sim.sh "$(realpath "$CONFIG_PATH")" \
             ${{ steps.config.outputs.checkpoint_list }} \
             ${{ steps.config.outputs.checkpoint_root_node}} \
             spec_all \
-            "${{ inputs.specific_benchmarks }}" \
-            "${{ inputs.extra_args }}"
+            "$SPECIFIC_BENCHMARKS" \
+            "$EXTRA_ARGS"
       - name: Archive performance data
         if: always()
+        env:
+          SPECIFIC_BENCHMARKS: ${{ inputs.specific_benchmarks }}
         run: |
           # Create archive directory with timestamp and commit
@@
-          echo "specific_benchmarks: ${{ inputs.specific_benchmarks }}" >> "$TARGET_DIR/metadata.txt"
+          printf 'specific_benchmarks: %s\n' "$SPECIFIC_BENCHMARKS" >> "$TARGET_DIR/metadata.txt"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/gem5-perf-template.yml around lines 151 - 156, The step
currently interpolates user inputs directly into the shell command (the
parallel_sim.sh invocation using ${{ inputs.specific_benchmarks }} and ${{
inputs.extra_args }}) and into a metadata echo, which is unsafe; instead expose
those inputs as environment variables (e.g. SPECIFIC_BENCHMARKS and EXTRA_ARGS)
in the job/step env and then call parallel_sim.sh with quoted references
("$SPECIFIC_BENCHMARKS" "$EXTRA_ARGS"), and update the metadata write to echo
the env var with proper quoting (echo "$SPECIFIC_BENCHMARKS") so no
user-controlled string is expanded by the shell before execution; locate the
invocation of parallel_sim.sh and the metadata echo to make these changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@util/xs_scripts/parallel_sim.sh`:
- Around line 124-128: The awk filter currently lowercases and searches the
entire row via lower_line = tolower($0) and index(lower_line, patterns[i]) which
causes matches in checkpoint paths and numeric columns; change it to only
examine the workload name by using lower_line = tolower($1) (and keep the index
check against patterns[i]) so pattern matching is constrained to the first field
(workload_name) in the loop that uses variables lower_line, patterns and valid.

---

Outside diff comments:
In @.github/workflows/gem5-perf-template.yml:
- Around line 151-156: The step currently interpolates user inputs directly into
the shell command (the parallel_sim.sh invocation using ${{
inputs.specific_benchmarks }} and ${{ inputs.extra_args }}) and into a metadata
echo, which is unsafe; instead expose those inputs as environment variables
(e.g. SPECIFIC_BENCHMARKS and EXTRA_ARGS) in the job/step env and then call
parallel_sim.sh with quoted references ("$SPECIFIC_BENCHMARKS" "$EXTRA_ARGS"),
and update the metadata write to echo the env var with proper quoting (echo
"$SPECIFIC_BENCHMARKS") so no user-controlled string is expanded by the shell
before execution; locate the invocation of parallel_sim.sh and the metadata echo
to make these changes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3ac2d9b9-cfa4-43e1-8b86-08bb50a9b390

📥 Commits

Reviewing files that changed from the base of the PR and between 280bf4a and 7149af1.

📒 Files selected for processing (3)
  • .github/workflows/gem5-perf-template.yml
  • .github/workflows/manual-perf.yml
  • util/xs_scripts/parallel_sim.sh

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
util/xs_scripts/parallel_sim.sh (1)

124-128: ⚠️ Potential issue | 🟠 Major

Match filters against the workload name only.

Lines 125-127 currently search tolower($0), so a filter can match checkpoint paths or numeric columns instead of the workload identifier. With this file format, that will over-select unrelated rows and make specific_benchmarks behave unpredictably.

Suggested fix
         {
-            lower_line = tolower($0)
+            lower_name = tolower($1)
             for (i = 1; i <= valid; i++) {
-                if (index(lower_line, patterns[i]) > 0) {
+                if (index(lower_name, patterns[i]) > 0) {
                     print $0
                     next
                 }
             }
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@util/xs_scripts/parallel_sim.sh` around lines 124 - 128, The awk filter is
matching against the whole line (tolower($0)) so patterns can match checkpoint
paths or numeric columns; change it to match only the workload identifier field
instead (e.g., use tolower($1) or the actual workload field variable) so the
loop that checks patterns[i] and the print $0 only trigger when the workload
name matches; update the line "lower_line = tolower($0)" to "lower_line =
tolower($1)" (or the correct workload field) and keep the existing loop over
patterns[i] and valid.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/gem5-perf-template.yml:
- Around line 19-23: The workflow treats filtered runs the same as full-suite
runs; when inputs.specific_benchmarks is non-empty you must either skip
full-suite scoring or isolate artifacts/retention by carrying a sanitized filter
key. Update the logic that uses inputs.benchmark_type (and the scoring/summary
and archive/cleanup steps referenced by names like the scoring job, summary job,
and archive retention cleanup) to detect inputs.specific_benchmarks != "" and do
one of: 1) short-circuit the scoring/summary jobs when a filter is active, or 2)
compute a sanitized_filter (e.g., replace commas with dashes, remove unsafe
chars) and append it to artifact names, archive keys and retention-bucket
identifiers so filtered runs are stored/aged separately from baseline runs;
apply this change consistently where benchmark_type is used for naming/cleanup
and in the scoring/summary steps.

---

Duplicate comments:
In `@util/xs_scripts/parallel_sim.sh`:
- Around line 124-128: The awk filter is matching against the whole line
(tolower($0)) so patterns can match checkpoint paths or numeric columns; change
it to match only the workload identifier field instead (e.g., use tolower($1) or
the actual workload field variable) so the loop that checks patterns[i] and the
print $0 only trigger when the workload name matches; update the line
"lower_line = tolower($0)" to "lower_line = tolower($1)" (or the correct
workload field) and keep the existing loop over patterns[i] and valid.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7bad3a43-d153-4c30-9709-e41169d181e9

📥 Commits

Reviewing files that changed from the base of the PR and between 7149af1 and 986a2e6.

📒 Files selected for processing (3)
  • .github/workflows/gem5-perf-template.yml
  • .github/workflows/manual-perf.yml
  • util/xs_scripts/parallel_sim.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/manual-perf.yml

@github-actions
Copy link
Copy Markdown

🚀 Coremark Smoke Test Results

Branch IPC Change
Base (xs-dev) 2.2607 -
This PR 2.2607 ➡️ 0.0000 (0.00%)

✅ Difftest smoke test passed!

@happy-lx happy-lx merged commit a6f4a18 into xs-dev Mar 13, 2026
8 of 12 checks passed
@happy-lx happy-lx deleted the specific-bmk-support branch March 13, 2026 02:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants