Eliminate Duplicate CI Runs & Add Post-Merge Validation #2643

Draft
TackAdam wants to merge 3 commits into opensearch-project:main from TackAdam:ci1

TackAdam (Collaborator) commented Apr 7, 2026

Description

  • Remove the push trigger from four PR-facing workflows so that each PR push no longer runs the full CI suite twice
  • Add a new post_merge_test.yml workflow for post-merge validation on main and release branches

Problem

All CI workflows except lint.yml trigger on both push and pull_request:

on: [pull_request, push]

When a commit is pushed to a branch with an open PR, GitHub fires both events, causing every workflow to run twice with the same commit SHA. This means:

| Workflow | Jobs per trigger | Duplicate runs per PR push |
|---|---|---|
| Test and Build (build-linux + build-windows-macos) | 3 | 6 total (3 wasted) |
| Integration Tests (8 Cypress groups) | 8 | 16 total (8 wasted) |
| FTR E2E Test | 1 | 2 total (1 wasted) |
| Verify Binary Install | 1 | 2 total (1 wasted) |
| Total | 13 | 26 total (13 wasted) |

Every PR push consumes ~2x the necessary CI compute.

Changes

Workflows modified (trigger only)

| File | Before | After |
|---|---|---|
| dashboards-observability-test-and-build-workflow.yml | on: [pull_request, push] | on: [pull_request] |
| integration-tests-workflow.yml | on: [pull_request, push] | on: [pull_request] |
| ftr-e2e-dashboards-observability-test.yml | on: [pull_request, push] | on: [pull_request] |
| verify-binary-install.yml | on: [push, pull_request] | on: [pull_request] |

No other changes to these files — all job definitions, steps, and matrix configurations remain identical.
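
As a rough illustration of why dropping push does not reduce PR coverage, the trigger change can be written out in long form. This is a sketch, not the literal diff: the modified files use the short `on: [pull_request]` form, and the `types` list below only spells out GitHub's default activity types for that event.

```yaml
# Before: a commit pushed to a PR branch fires both events.
# on: [pull_request, push]

# After: only the pull_request event starts the workflow.
on:
  pull_request:
    # GitHub's defaults, listed explicitly for illustration.
    # `synchronize` fires when new commits reach the PR branch,
    # so every PR update is still tested exactly once.
    types: [opened, synchronize, reopened]
```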

New workflow: post_merge_test.yml

Runs on push to main and release branches ([0-9]+.[0-9]+, [0-9]+.x). Contains three jobs:

  1. build-and-test-linux: Unit tests, coverage upload, and artifact build (mirrors the existing build-linux job from the test-and-build workflow)
  2. integration-tests: 7 Cypress test groups with fail-fast: false so one flaky group doesn't cancel the rest
  3. An issue-creation job that automatically opens an issue when a post-merge run fails
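
The trigger and job layout described above might look roughly like the following. This is a minimal sketch under stated assumptions: the workflow name, runner image, matrix group names, and step bodies are placeholders, not the contents of the actual post_merge_test.yml.

```yaml
# Hypothetical skeleton of post_merge_test.yml; step bodies are placeholders.
name: Post-Merge Tests   # assumed name

on:
  push:
    branches:
      - main
      - '[0-9]+.[0-9]+'   # release branches, e.g. 3.1
      - '[0-9]+.x'        # release lines, e.g. 3.x

jobs:
  build-and-test-linux:
    runs-on: ubuntu-latest        # assumed runner
    timeout-minutes: 60
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 1
      - run: echo "unit tests, coverage upload, and artifact build go here"

  integration-tests:
    runs-on: ubuntu-latest        # assumed runner
    timeout-minutes: 90
    strategy:
      fail-fast: false            # a failing group does not cancel the others
      matrix:
        group: [group_a, group_b] # placeholder names; the PR runs 7 groups
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 1
      - run: echo "run Cypress group ${{ matrix.group }}"
```

The branch patterns are quoted because a YAML scalar cannot start with an unquoted `[`.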

Additional improvements applied to the new workflow (not backported to existing workflows in this PR):

  • fetch-depth: 1 on all checkout steps (shallow clones)
  • timeout-minutes on all jobs (60 min build, 90 min integration, 5 min issue creation)
  • >> $GITHUB_OUTPUT instead of deprecated ::set-output
  • actions/setup-node@v4 instead of @v1
  • Bootstrap retry with 3 attempts (up from 2)
  • Plugin version derived from opensearch_dashboards.json at runtime instead of hardcoded
  • Cypress video artifacts uploaded only on failure() instead of always()
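
Several of the improvements listed above can be sketched as steps in a single job. The sketch collapses them into one job for brevity; the Node version, the `version` field lookup via `jq`, the `yarn osd bootstrap` command, and the artifact name and path are assumptions for illustration and may differ from the real workflow.

```yaml
jobs:
  example:                          # hypothetical job name
    runs-on: ubuntu-latest
    timeout-minutes: 60             # fail the job instead of letting it hang
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 1            # shallow clone

      - uses: actions/setup-node@v4
        with:
          node-version: '20'        # assumed version

      - name: Read plugin version at runtime
        id: plugin
        run: |
          # Assumes the manifest has a top-level "version" field.
          version=$(jq -r '.version' opensearch_dashboards.json)
          # Replaces the deprecated ::set-output command.
          echo "version=${version}" >> "$GITHUB_OUTPUT"

      - name: Bootstrap with up to 3 attempts
        run: |
          for attempt in 1 2 3; do
            yarn osd bootstrap && exit 0   # assumed bootstrap command
            echo "Bootstrap attempt ${attempt} failed"
          done
          exit 1                            # all three attempts failed

      - name: Upload Cypress videos
        if: failure()               # only when an earlier step failed
        uses: actions/upload-artifact@v4
        with:
          name: cypress-videos      # assumed artifact name
          path: cypress/videos      # assumed path
```

Later steps could read the derived version as `${{ steps.plugin.outputs.version }}`.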

What is NOT in this PR

  • No changes to job definitions, test groups, or step logic in existing workflows
  • No changes to Cypress config or test files
  • No changes to the lint, backport, stale, or other housekeeping workflows
  • The apm_test group is excluded from post-merge integration tests (it requires Prometheus setup and is better suited for the dedicated PR workflow)

CI Cost Impact

| Metric | Before | After |
|---|---|---|
| Workflow runs per PR push | 2× (push + pull_request) | 1× (pull_request only) |
| Wasted jobs per PR push | ~13 | 0 |
| Post-merge coverage | Identical duplicate of PR run | Focused Linux-only subset |

Risk Assessment

Low risk. The only change to existing workflows is removing the push trigger — no job logic is modified. The new post_merge_test.yml is additive and only runs post-merge, so it cannot affect PR checks.

If a regression is introduced and merged, the post-merge workflow catches it and auto-creates an issue. Previously, the duplicate push-triggered run would also catch it, but with no notification mechanism — failures on main could go unnoticed.
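
One way such a notification job could be wired up is shown below. It is a sketch only: the description does not say how the issue is created, so the job name, the use of the `gh` CLI, and the issue title and body are all assumptions.

```yaml
jobs:
  # Hypothetical job; the PR's actual issue-creation job may differ.
  create-issue-on-failure:
    needs: [build-and-test-linux, integration-tests]
    if: failure()                 # runs only when a needed job failed
    runs-on: ubuntu-latest
    timeout-minutes: 5
    permissions:
      issues: write               # required to open an issue
    steps:
      - name: Open an issue for the failed run
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          gh issue create \
            --repo "$GITHUB_REPOSITORY" \
            --title "Post-merge CI failed on ${GITHUB_REF_NAME} (${GITHUB_SHA::7})" \
            --body "Failed run: ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}/actions/runs/${GITHUB_RUN_ID}"
```

Linking the run URL in the issue body gives maintainers a direct path from the notification to the failing logs.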

References

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Adam Tackett and others added 3 commits April 7, 2026 16:40
Signed-off-by: Adam Tackett <tackadam@amazon.com>
Signed-off-by: Adam Tackett <tackadam@amazon.com>