[DO NOT MERGE] CVS-175243 test-hf-token-rate-limit#3692
[DO NOT MERGE] CVS-175243 test-hf-token-rate-limit#3692
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates GitHub Actions runner labels across CI workflows to use the *-st runner variants, and aligns the workflow rerun test fixtures with the updated labels.
Changes:
- Switched multiple workflow jobs from
aks-linux-*runners toaks-linux-*-strunners. - Updated sample/job runner labels embedded in the
workflow_reruntest log fixtures.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| .github/workflows/windows.yml | Update runs-on labels to aks-linux-medium-st for containerized jobs. |
| .github/workflows/manylinux_2_28.yml | Update several runs-on labels to *-st variants for build/test jobs. |
| .github/workflows/mac.yml | Update OpenVINO download job runner label to aks-linux-medium-st. |
| .github/workflows/linux.yml | Update multiple runner labels (including matrix-provided runner) to *-st variants. |
| .github/workflows/coverity.yml | Update Coverity workflow runner labels to aks-linux-*-st. |
| .github/workflows/cleanup_caches.yml | Update cache cleanup job runner label to aks-linux-4-cores-16gb-st. |
| .github/scripts/workflow_rerun/tests/data/logs_with_error/Samples _ Samples/system.txt | Update expected runner label strings in fixture data. |
| .github/scripts/workflow_rerun/tests/data/logs_with_error/Build _ Build/system.txt | Update expected runner label strings in fixture data. |
| run: | ||
| shell: bash | ||
| runs-on: aks-linux-medium | ||
| runs-on: aks-linux-medium-st |
There was a problem hiding this comment.
PR metadata appears out of sync with the actual changes: the title/description reference “test-hf-token-rate-limit” and still contain placeholders (e.g., “Fixes #(issue)”) and an unchecked checklist, but this PR only updates GitHub Actions runner labels (e.g., switching to *-st runners). Please update the PR title/description to reflect the runner-label migration intent and either fill out the checklist or explicitly mark items N/A/remove placeholders.
| run: | ||
| shell: bash | ||
| runs-on: aks-linux-medium | ||
| runs-on: aks-linux-medium-st |
There was a problem hiding this comment.
PR metadata still contains template placeholders (e.g., "Fixes #(issue)") and the checklist is entirely unchecked. Please update the PR description to reflect the actual motivation/impact of switching runner labels to "*-st" and mark the relevant checklist items (tests/docs/ticket coverage) before this is considered for merge.
| # This file defines code owners for the OpenVINO GenAI repository | ||
|
|
||
| # Default owners | ||
| # Default owners x |
There was a problem hiding this comment.
The trailing x in the comment looks accidental and makes the comment read incorrectly. Recommend changing it back to # Default owners.
| # Default owners x | |
| # Default owners |
|
|
||
| - name: Setup HF Token | ||
| if: ${{ matrix.test.run_condition }} | ||
| run: echo "HF_TOKEN=$(cat ${{ env.HF_TOKEN_PATH }})" >> "$GITHUB_ENV" |
There was a problem hiding this comment.
The HF token is exported via $GITHUB_ENV without masking first. Please mask the token value (e.g., ::add-mask::...) before exporting, to reduce risk of accidental exposure in logs.
| run: echo "HF_TOKEN=$(cat ${{ env.HF_TOKEN_PATH }})" >> "$GITHUB_ENV" | |
| run: | | |
| hf_token="$(cat "${{ env.HF_TOKEN_PATH }}")" | |
| echo "::add-mask::$hf_token" | |
| printf 'HF_TOKEN=%s\n' "$hf_token" >> "$GITHUB_ENV" |
| working-directory: ${{ env.INSTALL_DIR }} | ||
|
|
||
| - name: Setup HF Token | ||
| run: echo "HF_TOKEN=$(cat ${{ env.HF_TOKEN_PATH }})" >> "$GITHUB_ENV" |
There was a problem hiding this comment.
This step exports HF_TOKEN from a mounted file but doesn't mask it. Please add token masking before writing to $GITHUB_ENV (and consider stripping CR/LF) to prevent accidental disclosure in logs.
| run: echo "HF_TOKEN=$(cat ${{ env.HF_TOKEN_PATH }})" >> "$GITHUB_ENV" | |
| run: | | |
| hf_token="$(tr -d '\r\n' < "${{ env.HF_TOKEN_PATH }}")" | |
| echo "::add-mask::${hf_token}" | |
| echo "HF_TOKEN=${hf_token}" >> "$GITHUB_ENV" |
| merge-multiple: true | ||
|
|
||
| - name: Setup HF Token | ||
| run: echo "HF_TOKEN=$(cat ${{ env.HF_TOKEN_PATH }})" >> "$GITHUB_ENV" |
There was a problem hiding this comment.
HF_TOKEN is being populated from a file without masking. Add ::add-mask:: for the token value before exporting it to $GITHUB_ENV to reduce the chance of leaking it in subsequent logs.
| run: echo "HF_TOKEN=$(cat ${{ env.HF_TOKEN_PATH }})" >> "$GITHUB_ENV" | |
| run: | | |
| HF_TOKEN="$(cat "${{ env.HF_TOKEN_PATH }}")" | |
| echo "::add-mask::$HF_TOKEN" | |
| echo "HF_TOKEN=$HF_TOKEN" >> "$GITHUB_ENV" |
|
|
||
| - name: Setup HF Token | ||
| if: ${{ matrix.test.run_condition }} | ||
| run: echo "HF_TOKEN=$(cat ${{ env.HF_TOKEN_PATH }})" >> "$GITHUB_ENV" |
There was a problem hiding this comment.
The HF token is loaded into an environment variable without masking. If any later step logs the environment (or errors echo the value), the token could leak. Consider reading the token into a variable, emitting an ::add-mask:: command for it, and then writing it to $GITHUB_ENV (also trimming potential CR/LF).
| run: echo "HF_TOKEN=$(cat ${{ env.HF_TOKEN_PATH }})" >> "$GITHUB_ENV" | |
| run: | | |
| hf_token="$(tr -d '\r\n' < "${{ env.HF_TOKEN_PATH }}")" | |
| echo "::add-mask::$hf_token" | |
| printf 'HF_TOKEN=%s\n' "$hf_token" >> "$GITHUB_ENV" |
| # To switch to the primary HF token, change the filename suffix to remove `-secondary`: | ||
| # HF_TOKEN_PATH: /home/runner/secrets/huggingface/hf-token | ||
| HF_TOKEN_PATH: /home/runner/secrets/huggingface/hf-token-secondary |
There was a problem hiding this comment.
The PR description still contains template placeholders (e.g., Fixes #(issue)) and the checklist is left unchecked. Please update the PR description/checklist to reflect the actual change scope and testing before merging/reviewing further.
|
|
||
| - name: Setup HF Token | ||
| if: ${{ matrix.test.run_condition }} | ||
| run: echo "HF_TOKEN=$(cat ${{ env.HF_TOKEN_PATH }})" >> "$GITHUB_ENV" |
There was a problem hiding this comment.
The HF token is loaded into an environment variable without masking. To reduce risk of accidental leakage in logs, mask the token value (via ::add-mask::) before exporting it to $GITHUB_ENV, and consider trimming CR/LF from the file contents.
| run: echo "HF_TOKEN=$(cat ${{ env.HF_TOKEN_PATH }})" >> "$GITHUB_ENV" | |
| run: | | |
| HF_TOKEN="$(tr -d '\r\n' < "${{ env.HF_TOKEN_PATH }}")" | |
| echo "::add-mask::$HF_TOKEN" | |
| echo "HF_TOKEN=$HF_TOKEN" >> "$GITHUB_ENV" |
| merge-multiple: true | ||
|
|
||
| - name: Setup HF Token | ||
| run: echo "HF_TOKEN=$(cat ${{ env.HF_TOKEN_PATH }})" >> "$GITHUB_ENV" |
There was a problem hiding this comment.
Same as above: this job exports HF_TOKEN from a file but does not mask it first. Please add masking before writing it to $GITHUB_ENV (and consider stripping CR/LF) to avoid accidental log exposure.
| run: echo "HF_TOKEN=$(cat ${{ env.HF_TOKEN_PATH }})" >> "$GITHUB_ENV" | |
| run: | | |
| hf_token="$(tr -d '\r\n' < "${{ env.HF_TOKEN_PATH }}")" | |
| echo "::add-mask::$hf_token" | |
| printf 'HF_TOKEN=%s\n' "$hf_token" >> "$GITHUB_ENV" |
| merge-multiple: true | ||
|
|
||
| - name: Setup HF Token | ||
| run: echo "HF_TOKEN=$(cat ${{ env.HF_TOKEN_PATH }})" >> "$GITHUB_ENV" |
There was a problem hiding this comment.
Same as in linux.yml: this Setup HF Token step will fail with a generic cat error if the secret file isn't available. Add an explicit check with a clear message so failures point directly to missing/mis-mounted HF token secrets.
| run: echo "HF_TOKEN=$(cat ${{ env.HF_TOKEN_PATH }})" >> "$GITHUB_ENV" | |
| run: | | |
| hf_token_path="${{ env.HF_TOKEN_PATH }}" | |
| if [ ! -r "$hf_token_path" ]; then | |
| echo "Missing or unreadable HF token secret file: $hf_token_path. Check that the HF token secret is available and correctly mounted." >&2 | |
| exit 1 | |
| fi | |
| echo "HF_TOKEN=$(cat "$hf_token_path")" >> "$GITHUB_ENV" |
| - name: Setup HF Token | ||
| run: | | ||
| hf_token_path="${{ env.HF_TOKEN_PATH }}" | ||
| if [ ! -r "$hf_token_path" ]; then | ||
| echo "Missing or unreadable HF token secret file: $hf_token_path. Check that the HF token secret is available and correctly mounted." >&2 | ||
| exit 1 | ||
| fi | ||
| hf_token="$(tr -d '\r\n' < "$hf_token_path")" | ||
| echo "::add-mask::$hf_token" | ||
| printf 'HF_TOKEN=%s\n' "$hf_token" >> "$GITHUB_ENV" |
There was a problem hiding this comment.
This Setup HF Token step hard-fails if the token secret file isn't mounted/readable. Since genai_nodejs_tests can run on pull_request, forked PRs (which don't have org secrets) are likely to fail. Please guard the job/step with a repo/fork condition (or make the HF-token logic conditional) so CI remains usable for external contributors.
| submodules: recursive | ||
|
|
||
| - name: Download Build Artifacts | ||
| - name: Download Build Artifactsx |
There was a problem hiding this comment.
Step name has an extra trailing character (Artifactsx), which looks like an accidental typo and makes CI logs harder to scan. Rename it back to "Download Build Artifacts".
| - name: Download Build Artifactsx | |
| - name: Download Build Artifacts |
| - name: Setup HF Token | ||
| run: | | ||
| hf_token_path="${{ env.HF_TOKEN_PATH }}" | ||
| if [ ! -r "$hf_token_path" ]; then | ||
| echo "Missing or unreadable HF token secret file: $hf_token_path. Check that the HF token secret is available and correctly mounted." >&2 | ||
| exit 1 | ||
| fi | ||
| hf_token="$(tr -d '\r\n' < "$hf_token_path")" | ||
| echo "::add-mask::$hf_token" | ||
| printf 'HF_TOKEN=%s\n' "$hf_token" >> "$GITHUB_ENV" | ||
|
|
There was a problem hiding this comment.
Setup HF Token unconditionally hard-fails when the secret file isn't present. Since this workflow runs on pull_request, PRs from forks (or any environment without the /home/runner/secrets/huggingface mount) will fail even if the job could otherwise run. Consider guarding the job/step (e.g., with if: github.repository_owner == 'openvinotoolkit' or a similar repo/fork check), or make the step conditional on token availability and skip the HF-dependent parts when missing.
| - name: Setup HF Token | ||
| run: | | ||
| hf_token_path="${{ env.HF_TOKEN_PATH }}" | ||
| if [ ! -r "$hf_token_path" ]; then | ||
| echo "Missing or unreadable HF token secret file: $hf_token_path. Check that the HF token secret is available and correctly mounted." >&2 | ||
| exit 1 |
There was a problem hiding this comment.
Same issue as above: this Setup HF Token step hard-fails if the secret file isn't mounted. Because genai_nodejs_tests can run on pull_request, this can break CI for forked PRs. Please gate this step/job on a condition that guarantees the secret mount exists, or skip HF-dependent actions when the token isn't available.
| tmp=$(mktemp -d) | ||
| HF_HOME="$tmp" python -c " | ||
| from huggingface_hub import snapshot_download | ||
| path = snapshot_download('katuni4ka/tiny-random-phi3', cache_dir='$tmp') |
There was a problem hiding this comment.
The probe downloads katuni4ka/tiny-random-phi3, which appears to be hosted under an individual namespace. For CI stability, prefer a clearly owned/stable test repo (e.g., an org-owned tiny-random model) so the workflow doesn’t break if the upstream repo is renamed/deleted.
| path = snapshot_download('katuni4ka/tiny-random-phi3', cache_dir='$tmp') | |
| path = snapshot_download('hf-internal-testing/tiny-random-Phi3ForCausalLM', cache_dir='$tmp') |
| - name: Setup HF Token | ||
| if: ${{ matrix.test.run_condition }} | ||
| run: | | ||
| hf_token_path="${{ env.HF_TOKEN_PATH }}" | ||
| if [ ! -r "$hf_token_path" ]; then | ||
| echo "Missing or unreadable HF token secret file: $hf_token_path. Check that the HF token secret is available and correctly mounted." >&2 | ||
| exit 1 | ||
| fi | ||
| hf_token="$(tr -d '\r\n' < "$hf_token_path")" | ||
| echo "::add-mask::$hf_token" | ||
| printf 'HF_TOKEN=%s\n' "$hf_token" >> "$GITHUB_ENV" | ||
|
|
There was a problem hiding this comment.
The HF token setup script is duplicated in multiple jobs in this workflow (and also across workflows). To reduce maintenance burden and avoid drift, consider extracting it into a reusable composite action (e.g., under .github/actions) or a reusable workflow step and calling it from each job.
| - name: Verify HF Token | ||
| if: ${{ matrix.test.run_condition }} | ||
| # Persist the token to the HF cache so it's picked up by subprocesses and tools | ||
| # that read the on-disk token instead of the HF_TOKEN env var, then prove it works. | ||
| run: | | ||
| huggingface-cli login --token "$HF_TOKEN" | ||
| huggingface-cli whoami | ||
|
|
There was a problem hiding this comment.
huggingface-cli login will persist the token to disk. With HF_HOME pointing at /mount/caches/..., this can write credentials into a shared/persistent cache volume, which is a security and hygiene risk. Prefer validating the token without writing it to a shared cache (e.g., run huggingface-cli whoami --token "$HF_TOKEN" or call HfApi(token=os.environ[...]).whoami()), or set HF_HOME to an ephemeral directory for the login/verification step.
| - name: HF cache probe | ||
| if: ${{ matrix.test.run_condition }} | ||
| # Debug: show how much of the HF cache is already warm and whether the | ||
| # authenticated API call goes out over the network at all. | ||
| run: | | ||
| python -c " | ||
| from huggingface_hub import constants | ||
| print('HF_HOME=', constants.HF_HOME) | ||
| print('HF_HUB_CACHE=', constants.HF_HUB_CACHE) | ||
| " | ||
| echo "--- HF_HOME size ---" | ||
| du -sh "$HF_HOME" 2>/dev/null || true | ||
| echo "--- Top-level HF hub cache entries ---" | ||
| ls -1 "$HF_HOME/hub" 2>/dev/null | head -50 || true | ||
| echo "--- Authenticated whoami via API (forces network call) ---" | ||
| HF_HUB_VERBOSITY=debug python -c " | ||
| from huggingface_hub import HfApi | ||
| print(HfApi().whoami()) | ||
| " | ||
|
|
||
| - name: Force fresh HF download (debug only) | ||
| if: ${{ matrix.test.run_condition }} | ||
| # Use a throwaway cache dir so the request bypasses the warm shared cache | ||
| # and every GET/HEAD + rate-limit header is visible in the job log. | ||
| env: | ||
| HF_HUB_VERBOSITY: debug | ||
| HF_HUB_DISABLE_TELEMETRY: '1' | ||
| run: | | ||
| tmp=$(mktemp -d) | ||
| HF_HOME="$tmp" python -c " | ||
| from huggingface_hub import snapshot_download | ||
| path = snapshot_download('katuni4ka/tiny-random-phi3', cache_dir='$tmp') | ||
| print('Downloaded to:', path) | ||
| " | ||
| rm -rf "$tmp" |
There was a problem hiding this comment.
The added HF debug steps (HF cache probe / Force fresh HF download) run on every applicable CI run and significantly increase log noise and runtime, and they also introduce extra external network calls (which can worsen rate-limit issues). Please gate these steps behind an explicit opt-in (workflow_dispatch input / env flag) or remove them before merging into main CI.
| # To switch to the primary HF token, change the filename suffix to remove `-secondary`: | ||
| # HF_TOKEN_PATH: /home/runner/secrets/huggingface/hf-token | ||
| HF_TOKEN_PATH: /home/runner/secrets/huggingface/hf-token-secondary |
There was a problem hiding this comment.
HF_TOKEN_PATH is introduced but never referenced anywhere else in this workflow (no steps export HF_TOKEN from this file, and HF_TOKEN is no longer passed into the container). As a result, Hugging Face-authenticated downloads/tests will not receive a token. Add a step that reads the token from HF_TOKEN_PATH and exports it (or pass the needed env var into the container) so downstream tools can authenticate.
| # To switch to the primary HF token, change the filename suffix to remove `-secondary`: | ||
| # HF_TOKEN_PATH: /home/runner/secrets/huggingface/hf-token | ||
| HF_TOKEN_PATH: /home/runner/secrets/huggingface/hf-token-secondary |
There was a problem hiding this comment.
HF_TOKEN_PATH is defined but not used anywhere else in this workflow, and HF_TOKEN is no longer passed into the test containers. This likely means Hugging Face authentication is silently lost for tests that require it. Please add a step to export a token from HF_TOKEN_PATH (or reintroduce the required container env pass-through) so the intended token switch actually takes effect.
| image: openvinogithubactions.azurecr.io/ov_test/ubuntu_22_04_x64:${{ needs.openvino_download.outputs.docker_tag }} | ||
| volumes: | ||
| - /mount:/mount | ||
| - /home/runner/secrets/huggingface:/home/runner/secrets/huggingface:ro |
There was a problem hiding this comment.
Mounting /home/runner/secrets/huggingface from the host into the job container exposes that secrets directory to any commands executed in this workflow. Since this workflow runs on pull_request, this can enable secret exfiltration by untrusted PR code. Consider avoiding host-path secret mounts for PRs, or gating the mount/token use to trusted events (e.g., push/merge_group) or trusted actors.
| - /home/runner/secrets/huggingface:/home/runner/secrets/huggingface:ro |
| - /mount:/mount | ||
| - /home/runner/secrets/huggingface:/home/runner/secrets/huggingface:ro | ||
| - ${{ github.workspace }}:${{ github.workspace }} | ||
| options: -e HF_TOKEN | ||
|
|
There was a problem hiding this comment.
This workflow mounts /home/runner/secrets/huggingface from the host into the container. On pull_request runs, that can leak host secrets to PR-controlled code. Please avoid host-path secret mounts in PR workflows, or gate this mount/token usage to trusted events/actors only.
Description
CVS-175243
Fixes #(issue)
Checklist: