Commit 49054e1

CI: Add automatic dependent repo resolution logic.
The openEMS Project is a multi-repo project. Changes frequently affect multiple repos, with changes depending on each other. Forking a repo also means all related repos must be forked together, which creates a poor contribution experience, since the CI pipeline doesn't work properly in a fork otherwise.

This commit introduces automatic dependent-repo resolution logic to the CI/CD pipeline, with the following rules.

1. Want to develop a multi-repo change in CSXCAD and openEMS? Use the same branch name in both repos, and they'll be tested together.

2. Want to send Pull Requests to both CSXCAD and openEMS? Use the same branch name in your forks, and open two individual Pull Requests against CSXCAD and openEMS. If both Pull Requests are open and the source branch name is the same, they'll be tested together.

3. Want to fork openEMS and develop it downstream? You can fork only the repos you need. The CI/CD script automatically uses the project founder's repos as a fallback for non-forked dependencies. You can override these dependencies by forking more repos.

== Owner-then-Founder Dependency Lookup ==

If a git commit is pushed to a repo, or if a repo has received a Pull Request from a contributor, we need to look up the repo's dependencies in order to build it. We check all dependencies one by one: does the repo's owner also have the dependent repo under their GitHub account? If so, the owner's copy is used as the dependency. If not, the copy under the founder's account, thliebig, is used as the dependency.

This two-tier lookup solves several problems:

1. An experimenter can fork only openEMS without forking any dependencies. Since the CI/CD script falls back to the founder's repos for dependencies, their repo's CI/CD works automatically. If they want to make a change to CSXCAD and openEMS at the same time, they only need to fork CSXCAD as well, so that the "repo owner override" takes effect. At the same time, they can keep using the founder's fparser repo as a fallback without forking it.

2. Someone may send a Pull Request against a downstream fork rather than the project founder's repo. For example, I have a downstream openEMS fork named fasterEMS. If someone sends a Pull Request against my "fasterEMS/CSXCAD", my repo's CI/CD needs to use the same-owner repo "fasterEMS/fparser" as the dependency, not the project founder's "thliebig/fparser".

== Pull-Request Ganging ==

If a Pull Request "pr_primary" is submitted against the upstream repo "repo_primary", we check whether the same contributor has also opened another Pull Request "pr_dependency" against our dependency "repo_dependency", and whether both Pull Requests use the same source branch name, such as "feature". If both conditions are met, the two Pull Requests are "ganged": they are tested by CI/CD together. When testing "pr_primary" at "repo_primary", instead of using the default repos as dependencies, we check out the merge commit of "pr_dependency" at "repo_dependency" as its dependency. For more than 2 repos, the same "ganged PR" logic also applies.

If Pull-Request Ganging is activated, a warning is generated in the "Annotations" panel of the GitHub Actions Summary page, reminding developers to merge the dependent PR before the main PR.

== Branch Ganging ==

If a git commit is pushed to a non-default branch "feature" of repo "repo_primary", we check whether our dependency "repo_dependency" also has a branch named "feature". If so, the two repo branches are "ganged": they are tested by CI/CD together. When testing the branch "feature" of "repo_primary", instead of using the default branch "master" of "repo_dependency" as the dependency, we check out the branch "feature" of "repo_dependency". This allows testing multi-repo features together while they are in development. For more than 2 repos, the same "ganged branches" logic also applies.

If Branch Ganging is activated, a warning is generated in the "Annotations" panel of the GitHub Actions Summary page, reminding developers that a different branch of the dependent repo is used for this test.

== Implementation Details ==

This commit adds a new CI job called "Resolve Repo Dependencies", which is executed before all other jobs. This job calls the Python script "scripts/resolve_dependent_repos.py", which uses GitHub Actions variables and GitHub APIs to find the needed dependencies according to the conditions and contexts explained above.

After the script finishes, it generates several output variables that contain the repos and branches of all dependencies. These output variables are written to the file named by the environment variable "$GITHUB_OUTPUT" (provided by GitHub Actions). Later, all other jobs use these variables as their inputs for the "actions/checkout" steps.

Signed-off-by: Yifeng Li <tomli@tomli.me>
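
The rules in the commit message amount to a precedence order: founder default, then owner override, then branch ganging, then Pull-Request ganging. The following is a minimal, network-free sketch of that order, not the script added by this commit. The names `Dep`, `resolve`, and the three lookup callbacks are hypothetical and introduced only for illustration; the callbacks stand in for the GitHub API queries.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

FOUNDER = "thliebig"  # founder account named in the commit message


@dataclass
class Dep:
    name: str
    owner: str = FOUNDER
    ref: str = "master"  # branch name or merge-commit SHA to check out


def resolve(
    names: List[str],
    owner: str,
    branch: str,
    repo_exists: Callable[[str, str], bool],
    branch_exists: Callable[[str, str, str], bool],
    ganged_pr_sha: Callable[[str, str, str], Optional[str]],
) -> List[Dep]:
    """Hypothetical sketch; the callbacks replace real GitHub API lookups."""
    deps = []
    for name in names:
        dep = Dep(name)
        # Owner-then-founder: prefer the repo owner's copy if it exists.
        if owner != FOUNDER and repo_exists(owner, name):
            dep.owner = owner
        # Branch ganging: use a same-named branch of the dependency.
        if branch_exists(dep.owner, name, branch):
            dep.ref = branch
        # Pull-Request ganging: a matching open PR's merge commit wins.
        sha = ganged_pr_sha(dep.owner, name, branch)
        if sha:
            dep.ref = sha
        deps.append(dep)
    return deps


# Example: "fasterEMS" forked only CSXCAD and has a "feature" branch there.
forks = {("fasterEMS", "CSXCAD")}
branches = {("fasterEMS", "CSXCAD", "feature")}
deps = resolve(
    ["fparser", "CSXCAD"], "fasterEMS", "feature",
    repo_exists=lambda o, n: (o, n) in forks,
    branch_exists=lambda o, n, b: (o, n, b) in branches,
    ganged_pr_sha=lambda o, n, b: None,
)
for d in deps:
    print(d.owner, d.name, d.ref)
```

In this example fparser falls back to the founder's copy on "master", while CSXCAD resolves to the fork's "feature" branch. Injecting the lookups as callbacks keeps the sketch testable without spending API quota.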
1 parent c62051e commit 49054e1

2 files changed

Lines changed: 395 additions & 45 deletions

Lines changed: 290 additions & 0 deletions
@@ -0,0 +1,290 @@
#!/usr/bin/env python3
#
# Copyright (C) 2026 Yifeng Li <tomli@tomli.me>
#
# Permission to use, copy, modify, and/or distribute this software for
# any purpose with or without fee is hereby granted.
#
# THE SOFTWARE IS PROVIDED “AS IS” AND THE AUTHOR DISCLAIMS ALL
# WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE
# FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY
# DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN
# AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
# OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
#
#
# To test it locally, you can create a shell script:
#
#     #!/bin/bash
#     export GITHUB_API_URL="https://api.github.com"
#     export GITHUB_REPOSITORY_OWNER="thliebig"
#     export GITHUB_REPOSITORY="thliebig/CSXCAD"
#     export GITHUB_EVENT_NAME="pull_request"
#     export GITHUB_OUTPUT="/dev/null"
#     export GITHUB_FORK_OWNER="find the username who has an open pull request"
#     export GITHUB_HEAD_REF="the source branch of the same pull request"
#     export GITHUB_TOKEN="login is optional, but can bypass low API rate limit"
#
#     # or --project openEMS, --project QCSXCAD, --project AppCSXCAD
#     python3 resolve_dependent_repos.py --project CSXCAD
#
# Note that all variables but GITHUB_TOKEN and GITHUB_FORK_OWNER are default
# variables on GitHub Actions, while GITHUB_TOKEN and GITHUB_FORK_OWNER are
# non-standard variables set manually in the ".yml" file. GITHUB_REPOSITORY
# is read only when a same-named branch is found in a dependency.

import argparse
import os
from time import sleep

import json
import urllib.error
import urllib.request


PROJECT_FOUNDER = "thliebig"


def print_log(loglevel, string, title=None):
    if loglevel not in ["debug", "notice", "warning", "error"]:
        raise ValueError("Unsupported GitHub Actions loglevel: %s" % loglevel)

    parameters = ""
    if title:
        parameters = " title=%s" % title

    print("::%s%s::%s" % (loglevel, parameters, string))


def getenv(var):
    value = os.getenv(var)
    if not value:
        errormsg = "Unable to determine GitHub Actions variable $%s!" % var
        print_log("error", errormsg)
        raise ValueError(errormsg)
    else:
        return value


def get_default_repo_dependency(project):
    REPO = {
        "fparser": {
            "owner": PROJECT_FOUNDER,
            "name": "fparser",
            "branch": "master",
        },
        "CSXCAD": {
            "owner": PROJECT_FOUNDER,
            "name": "CSXCAD",
            "branch": "master"
        },
        "openEMS": {
            "owner": PROJECT_FOUNDER,
            "name": "openEMS",
            "branch": "master"
        },
        "QCSXCAD": {
            "owner": PROJECT_FOUNDER,
            "name": "QCSXCAD",
            "branch": "master"
        },
        "AppCSXCAD": {
            "owner": PROJECT_FOUNDER,
            "name": "AppCSXCAD",
            "branch": "master"
        },
    }

    # We could check all repos, but it wastes GitHub API calls.
    # Especially for local testing without logging in, the quota
    # is only 60 calls/hr. You can burn through it after just a few
    # tries. It also spams useless NOTICE messages per repo to the
    # GitHub Actions Summary.
    #
    # So we only check repos actually used by "project".
    PER_PROJECT_DEPENDENCY = {
        "CSXCAD": [REPO["fparser"]],
        "openEMS": [REPO["fparser"], REPO["CSXCAD"]],
        "QCSXCAD": [REPO["fparser"], REPO["CSXCAD"]],
        "AppCSXCAD": [REPO["fparser"], REPO["CSXCAD"], REPO["QCSXCAD"]],
    }
    return PER_PROJECT_DEPENDENCY[project]


def determine_repo_dependency(project):
    github_event_name = getenv("GITHUB_EVENT_NAME")
    if github_event_name not in ["pull_request", "push"]:
        print_log(
            "error",
            "Unsupported $GITHUB_EVENT_NAME: %s, using default dependencies." %
            github_event_name
        )
        return get_default_repo_dependency(project)

    if github_event_name == "pull_request":
        src_branch = getenv("GITHUB_HEAD_REF")
    elif github_event_name == "push":
        src_branch = getenv("GITHUB_REF_NAME")
    else:
        assert False

    repo_dependency = get_default_repo_dependency(project)

    # If this project is a fork, we need to check whether the
    # user or organization has also forked other repos. If they
    # did, rewrite the project owner's name to the fork owner's
    # name. This allows the fork itself to become a new "upstream"
    # with correct CI/CD, since forks can also accept commits
    # and Pull Requests.
    owner = getenv("GITHUB_REPOSITORY_OWNER")
    if owner != PROJECT_FOUNDER:
        for repo in repo_dependency:
            if reponame_exists(owner, repo["name"]):
                repo["owner"] = owner

    # Furthermore, if the dependency repo has a branch with the
    # same name as the current repo branch, use that branch instead.
    # This allows testing multi-repo features.
    for repo in repo_dependency:
        # The source branch is already the dependency's default branch:
        # nothing to override, and no "different branch" warning is due.
        if src_branch == repo["branch"]:
            continue

        if reponame_has_branch(owner, repo["name"], src_branch):
            repo["branch"] = src_branch

            # only warn about dependencies, not the current repo
            current_repo = repo["name"] == getenv("GITHUB_REPOSITORY").split("/")[1]

            if not current_repo:
                print_log(
                    "warning",
                    'repo %s/%s branch %s is used instead of the default branch.' %
                    (repo["owner"], repo["name"], src_branch),
                    title='%s/%s: different branch "%s" used' %
                    (repo["owner"], repo["name"], src_branch)
                )

    # If the event is a pull request...
    if github_event_name == "pull_request":
        contributor = getenv("GITHUB_FORK_OWNER")

        # check whether the contributor has forked other dependent repos.
        for repo in repo_dependency:
            # If they didn't, use the upstream dependency.
            if not reponame_exists(contributor, repo["name"]):
                continue

            # If they did, check whether they've submitted an open pull
            # request against the upstream, and the source branch uses
            # the same name
            pr = (
                pr_by_user_with_branch(
                    repo["owner"], repo["name"], contributor, src_branch
                )
            )
            if not pr:
                continue

            # If they also did, use that Pull Request's merge commit as
            # the dependency repo.
            repo["branch"] = pr["merge_commit_sha"]
            repo["pr_url"] = pr["html_url"]
            print_log(
                "warning",
                '%s/%s: %s must be merged first before merging this commit (%s)' %
                (repo["owner"], repo["name"], pr["html_url"], pr["title"]),
                title="Must-Merge Pull Request: %s" % pr["html_url"]
            )

    return repo_dependency


def url_open(url, expect_code=None):
    DEFAULT_HEADERS = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    RETRIES = 3
    error = None

    headers = DEFAULT_HEADERS.copy()
    try:
        token = getenv("GITHUB_TOKEN")
        headers["Authorization"] = "Bearer %s" % token
    except ValueError:
        # for local testing only
        print_log(
            "warning",
            "$GITHUB_TOKEN not found! API requests are rate-limited!"
        )

    request = urllib.request.Request(url, headers=headers)

    for i in range(RETRIES):
        try:
            response = urllib.request.urlopen(request)
            return response, response.getcode()
        except urllib.error.HTTPError as e:
            if e.code == expect_code:
                return None, e.code
            else:
                error = e
        except urllib.error.URLError as e:
            error = e
        sleep(1)

    raise error


def url_check_existence(url):
    response, code = url_open(url, expect_code=404)
    if code == 404:
        return False
    elif code == 200:
        return True
    else:
        raise RuntimeError("Unexpected HTTP status code: %d" % code)


def reponame_exists(owner, repo):
    API = getenv("GITHUB_API_URL")
    url = "%s/repos/%s/%s" % (API, owner, repo)
    return url_check_existence(url)


def reponame_has_branch(owner, repo, branch):
    API = getenv("GITHUB_API_URL")
    url = "%s/repos/%s/%s/branches/%s" % (API, owner, repo, branch)
    return url_check_existence(url)


def pr_by_user_with_branch(owner, repo, user, branch):
    API = getenv("GITHUB_API_URL")
    url = "%s/repos/%s/%s/pulls?head=%s:%s" % (API, owner, repo, user, branch)
    raw_response, code = url_open(url)
    pr = json.loads(raw_response.read().decode("UTF-8"))

    if pr:
        return pr[0]
    else:
        return None


def output_repo_dependency(repo_dependency):
    for repo in repo_dependency:
        repo_value = "%s/%s" % (repo["owner"], repo["name"])
        branch_value = repo["branch"]
        if "pr_url" in repo:
            pr_value = repo["pr_url"]
        else:
            pr_value = "None"

        print_log(
            "notice", "%s, pr: %s, branch: %s" % (repo_value, pr_value, branch_value),
            title="%s repo" % repo["name"]
        )

        with open(getenv("GITHUB_OUTPUT"), "a") as output:
            output.write("%s_repo=%s\n" % (repo["name"], repo_value))
            output.write("%s_branch=%s\n" % (repo["name"], branch_value))
            output.write("%s_pr_url=%s\n" % (repo["name"], pr_value))


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Resolve dependent repos')
    parser.add_argument("--project", type=str, required=True)
    args = parser.parse_args()

    repo_dependency = determine_repo_dependency(args.project)
    output_repo_dependency(repo_dependency)
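
As a rough illustration of what the script hands to later jobs, the sketch below reproduces the three `key=value` lines that `output_repo_dependency` appends per dependency, writing them to a scratch file instead of the real `$GITHUB_OUTPUT`. The helper `format_outputs` and the sample dependency are hypothetical; only the key names (`<repo>_repo`, `<repo>_branch`, `<repo>_pr_url`) are taken from the script.

```python
import os
import tempfile


def format_outputs(repo):
    """Return the key=value lines for one dependency (hypothetical helper)."""
    name = repo["name"]
    return [
        "%s_repo=%s/%s" % (name, repo["owner"], name),
        "%s_branch=%s" % (name, repo["branch"]),
        # The script writes the literal string "None" when no PR is ganged.
        "%s_pr_url=%s" % (name, repo.get("pr_url", "None")),
    ]


sample = {"owner": "thliebig", "name": "fparser", "branch": "master"}

# Create a scratch file, then append to it the way the script appends
# to the file named by $GITHUB_OUTPUT (mode "a").
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    path = f.name
with open(path, "a") as output:
    for line in format_outputs(sample):
        output.write(line + "\n")

with open(path) as output:
    written = output.read().splitlines()
os.remove(path)

for line in written:
    print(line)
```

How later jobs consume these values is defined in the workflow file, which is the other file changed by this commit and is not shown in this excerpt.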
