ROSAENG-58088: ci: add fake_cluster to rosa-sts-ad profile#1178
ROSAENG-58088: ci: add fake_cluster to rosa-sts-ad profile#1178amandahla wants to merge 1 commit into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds profile-level custom properties: a ChangesCustom Properties Profile Feature
🎯 2 (Simple) | ⏱️ ~8 minutes 🚥 Pre-merge checks | ✅ 6✅ Passed checks (6 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
76a3023 to
0439334
Compare
ce008b4 to
48cff57
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/utils/profilehandler/profile.go (1)
4-57: 💤 Low valueConsider breaking long lines for linter compliance.
The static analysis tool flags multiple lines exceeding 120 characters (lines 9, 10, 15, 16, 33, 38, 46, 57). While these appear to be alignment-only reformatting changes, the linter violations could be addressed by:
- Splitting tags across multiple lines
- Using shorter field names (less desirable)
- Adjusting struct tag formatting
Since this affects existing fields and doesn't change functionality, this can be deferred if alignment consistency is preferred.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/utils/profilehandler/profile.go` around lines 4 - 57, Multiple struct field lines (e.g., MajorVersion, Version, Zones, StorageLB, Etcd, WorkerDiskSize, AdditionalSGNumber, DontWaitForCluster) exceed the 120-char linter limit due to long inline comments/tags; fix by moving the trailing inline comments off the field line into their own preceding or following comment lines (or shortening those comments) so the field name + tag line stays <120 chars without changing tags or field names (move comments for MajorVersion, Version, Zones, StorageLB, Etcd, WorkerDiskSize, AdditionalSGNumber, DontWaitForCluster).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/utils/profilehandler/profile.go`:
- Around line 4-57: Multiple struct field lines (e.g., MajorVersion, Version,
Zones, StorageLB, Etcd, WorkerDiskSize, AdditionalSGNumber, DontWaitForCluster)
exceed the 120-char linter limit due to long inline comments/tags; fix by moving
the trailing inline comments off the field line into their own preceding or
following comment lines (or shortening those comments) so the field name + tag
line stays <120 chars without changing tags or field names (move comments for
MajorVersion, Version, Zones, StorageLB, Etcd, WorkerDiskSize,
AdditionalSGNumber, DontWaitForCluster).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 2b0b950b-10bc-4566-b5c5-98b537d391eb
📒 Files selected for processing (3)
tests/ci/profiles/tf_classic_cluster_profiles.ymltests/utils/profilehandler/handler.gotests/utils/profilehandler/profile.go
|
/test e2e-presubmits-rosa-sts-advanced-critical-high-presubmit |
48cff57 to
f3c7732
Compare
|
/test e2e-presubmits-rosa-sts-advanced-critical-high-presubmit |
8a225fa to
f139e9c
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/utils/profilehandler/handler.go (1)
235-237: ⚡ Quick winMake fake-cluster detection tolerant to case and whitespace.
Line 236 only matches exact
"true", so values like"True"or" true "silently disable fake-cluster behavior.Suggested fix
func (ctx *profileContext) isFakeCluster() bool { - return ctx.profile.CustomProperties["fake_cluster"] == "true" + return strings.EqualFold(strings.TrimSpace(ctx.profile.CustomProperties["fake_cluster"]), "true") }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/utils/profilehandler/handler.go` around lines 235 - 237, The isFakeCluster method currently only returns true for an exact "true" value; update it to trim whitespace and do a case-insensitive comparison against "true" so values like " True " or "TRUE" are accepted. In the profileContext.isFakeCluster function, use strings.TrimSpace on ctx.profile.CustomProperties["fake_cluster"] and compare with strings.EqualFold (or strings.ToLower) to "true"; add the necessary import for the strings package. This preserves the same semantics but makes fake-cluster detection tolerant to case and surrounding whitespace.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/utils/profilehandler/handler.go`:
- Around line 894-898: The code dereferences clusterArgs.AccountRolePrefix
inside the block that calls ctx.PrepareKMSKey without ensuring AccountRolePrefix
is set, which can panic when Etcd/KMSKey is true but STS (and thus
AccountRolePrefix) is unset; update the guard on the if that uses
ctx.profile.Etcd || ctx.profile.KMSKey && !ctx.isFakeCluster() to also require
clusterArgs.AccountRolePrefix != nil (or equivalently check STS flag) before
calling ctx.PrepareKMSKey, and only dereference *clusterArgs.AccountRolePrefix
inside that guarded block so PrepareKMSKey is invoked with safe, initialized
arguments (functions/variables to locate: ctx.PrepareKMSKey,
clusterArgs.AccountName/ClusterName, clusterArgs.AccountRolePrefix,
ctx.profile.UnifiedAccRolesPath, ctx.profile.Etcd, ctx.profile.KMSKey,
ctx.isFakeCluster()).
---
Nitpick comments:
In `@tests/utils/profilehandler/handler.go`:
- Around line 235-237: The isFakeCluster method currently only returns true for
an exact "true" value; update it to trim whitespace and do a case-insensitive
comparison against "true" so values like " True " or "TRUE" are accepted. In the
profileContext.isFakeCluster function, use strings.TrimSpace on
ctx.profile.CustomProperties["fake_cluster"] and compare with strings.EqualFold
(or strings.ToLower) to "true"; add the necessary import for the strings
package. This preserves the same semantics but makes fake-cluster detection
tolerant to case and surrounding whitespace.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: terraform-redhat/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 1ea99c0b-b48d-40a1-8e46-53787995ef91
📒 Files selected for processing (4)
tests/ci/profiles/tf_classic_cluster_profiles.ymltests/tf-manifests/aws/security-groups/main.tftests/utils/profilehandler/handler.gotests/utils/profilehandler/profile.go
🚧 Files skipped from review as they are similar to previous changes (3)
- tests/ci/profiles/tf_classic_cluster_profiles.yml
- tests/tf-manifests/aws/security-groups/main.tf
- tests/utils/profilehandler/profile.go
f139e9c to
3e68ac1
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
203eef9 to
868d5ac
Compare
|
@amandahla: This pull request references ROSAENG-58088 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
868d5ac to
233a859
Compare
df0e524 to
df29c55
Compare
|
/test e2e-presubmits-rosa-sts-private-critical-high-presubmit |
1 similar comment
|
/test e2e-presubmits-rosa-sts-private-critical-high-presubmit |
9bf020f to
dcab47f
Compare
|
/test e2e-presubmits-rosa-sts-advanced-critical-high-presubmit |
Signed-off-by: Amanda Hager Lopes de Andrade Katz <amanda.katz@redhat.com>
dcab47f to
cbbd226
Compare
|
@amandahla: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
Closing since after our meeting we decided on follow a different approach by improving our unit tests + reviewing heavy e2e tests (moving them to periodic, for example). |
PR Summary
Adds
fake_cluster=truesupport to therosa-sts-adandrosa-sts-plCI profiles, replacing real AWS cluster provisioning in PR presubmit jobs with OCM-registered fake clusters. Setup time drops from 20+ minutes (plus frequent NAT Gateway quota failures) to under 5 minutes, andall previously failing Day2 tests now pass because the cluster exists in OCM.
Detailed Description of the Issue
The
e2e-presubmits-rosa-sts-advanced-critical-high-presubmitande2e-presubmits-rosa-sts-private-critical-high-presubmitprow jobs created real ROSA Classic clusters with full AWS infrastructure (VPC with 3 NAT Gateways, KMS key, proxy EC2 instance, security groups). Thiscaused two recurring problems:
ap-northeast-1CI account has a soft limit of 5 NAT Gateways per region; leaked resources from prior failed runs blocked new jobs entirely (0 tests reached execution).us-east-1CI account has a VPC limit; the private presubmit hit this limit and failed before any test ran.The terraform provider does not interact with AWS directly — it translates Terraform config into OCM API calls. AWS provisioning is OCM's responsibility and is already covered by
uhc-clusters-servicefull-cycle tests. Running provider tests against real AWS duplicates thatcoverage unnecessarily.
Setting
fake_cluster=truein clustercustom_propertiestells Hive to skip real provisioning while OCM still validates all STS roles, subnets, and configuration in preflight. All OCM API operations (create, read, update resources) work normally against a fake cluster, soDay1Post and Day2 test assertions are identical to the real-cluster case.
Known Limitations and Risks
rosa-sts-private-critical-high-presubmit— VPC Quota RiskOCM validates real subnets in preflight (before
fake_clusteris checked by Hive), so the private profile still requires a real VPC. Theus-east-1CI account has a VPC soft limit; leaked VPCs from previous failed PR runs can exhaust this quota and block the job.Workarounds:
us-east-1in theoex-aws-qeaccount.rosa-sts-private-critical-high-presubmitto a nightly periodic job — the private-link assertion is low-risk and does not need to block every PR.optional: trueso failures do not block merge.rosa-sts-advanced-critical-high-presubmit— Trust Bundle ValidationThe fake proxy implementation injects a dummy PEM certificate as the trust bundle. If OCM validates X.509 content strictly in preflight (not yet confirmed), the trust bundle constant in
FakeClusterTrustBundlemay need replacing with a real self-signed certificate.Related Issues and PRs
#Type of Change
Previous Behavior
rosa-sts-adandrosa-sts-plprofiles created real ROSA Classic clusters with full AWS infrastructure (VPC + NAT Gateways, KMS key, proxy EC2 instance, security groups).CreateClusterByProfiledue to NAT Gateway or VPC quota exhaustion, resulting in 0 tests running.CustomPropertieswere silently overwritten by global defaults.Behavior After This Change
custom_properties: {fake_cluster: "true"}, skipping real Hive/HyperShift provisioning.no_nat_gatewaypath (IGW + public/private subnets, no NAT Gateways) that satisfies OCM's 6-subnet Multi-AZ requirement without consuming NAT Gateway quota.PrepareProxy()provisioning are skipped for fake clusters; account roles and OIDC still run (OCM validates STS roles and subnets in preflight).proxy: true, dummy proxy values are injected directly into cluster args — OCM stores them and the test reads them back from OCM.rosa-sts-pl(private-link), only private subnets are passed andPrivate/PrivateLinkflags are set correctly.CustomPropertiesare merged on top of global defaults viahelper.MergeMaps.rosa-sts-adprofile addsautoscaling_enabled: trueandproxy: trueso tests 88408 and 67607 now run instead of skipping.rosa-sts-ad-realandrosa-sts-pl-realprofile are added for full real-AWS validation when needed.~> 5.0to fix a pre-existing v6 breakage (ingress_cidr_blocksrename).How to Test (Step-by-Step)
Preconditions
ap-northeast-1(advanced) andus-east-1(private).rosa token).fake_clustersupport present in the target OCM service.Test Steps
/test rosa-sts-advanced-critical-high-presubmiton a PR.SUCCESS!with 13 passing tests./test rosa-sts-private-critical-high-presubmitand confirm the same result.Expected Results
Both jobs pass. No NAT Gateways are created.
fake_cluster=trueis visible in the cluster'scustom_propertiesin the OCM API.Proof of the Fix
2062919008268587008—SUCCESS! 3m47.987639185s PASS, 13 passed, 0 failed.Test Coverage
ci/prow/e2e-presubmits-rosa-sts-advanced-critical-high-presubmit
Profile
rosa-sts-ad: fips=true, etcd=true, proxy=true, labeling=true, tagging=true, autoscaling=true, private_link=false, imdsv2=required, additional_sg=4, worker_disk=200,fake_cluster=trueSummary: 15 ✅ PASS / 0⚠️ gap / 4 🚫 always skipped / 0 ❌
ci/prow/e2e-presubmits-rosa-sts-private-critical-high-presubmit
Profile
rosa-sts-pl: fips=false, etcd=false, proxy=false, labeling=false, tagging=false, private_link=true, autoscaling=false, imdsv2=optional, additional_sg=0,fake_cluster=trueSummary: 13 ✅ PASS / 0⚠️ gap / 6 🚫 always skipped / 0 ❌
Breaking Changes
Developer Verification Checklist
[JIRA-TICKET] | [TYPE][(scope)][!]: <MESSAGE>.make install-hookshas been run in this clone.make pre-push-checkspasses.make fmt-checkpasses.make buildpasses.