
Commit 9bf655c

Add docs about large monorepos
1 parent bdb7f9f commit 9bf655c

6 files changed

Lines changed: 155 additions & 11 deletions

Lines changed: 11 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,11 @@
+{
+  "changes": [
+    {
+      "type": "patch",
+      "comment": "Update readme with link to large repos performance info",
+      "packageName": "beachball",
+      "email": "elcraig@microsoft.com",
+      "dependentChangeType": "patch"
+    }
+  ]
+}

docs/.vuepress/config.ts

Lines changed: 1 addition & 0 deletions
@@ -31,6 +31,7 @@ export default defineUserConfig({
       '/concepts/groups',
       '/concepts/ci-integration',
       '/concepts/ai-integration',
+      '/concepts/large-repos',
     ],
   },
   {

docs/cli/options.md

Lines changed: 1 addition & 1 deletion
@@ -25,6 +25,6 @@ The options below apply to most CLI commands.
 | `--since` | | | only consider changes or change files since this git ref (branch name, commit SHA) |
 | `--verbose` | | | prints additional information to the console |

-[1]: ../overview/configuration#determining-the-target-branch-and-remote
+[1]: ../overview/configuration#specifying-the-target-branch-and-remote
 [2]: https://www.npmjs.com/package/cosmiconfig
 [3]: ../overview/configuration#scoping

docs/concepts/large-repos.md

Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
1+
---
2+
tags:
3+
- overview
4+
category: doc
5+
---
6+
7+
# Optimizing performance in large repos
8+
9+
Beachball has several options that can help improve performance in large to very large monorepos.
10+
11+
All the code snippets below reference `beachball.config.js`. The snippets omit some boilerplate for brevity, but the full config should look something like this (the separate typed declaration provides intellisense):
12+
13+
```js
14+
/** @type {Partial<import('beachball').RepoOptions>} */
15+
const config = {
16+
// your options
17+
};
18+
module.exports = config;
19+
```
20+
21+
## Specifying the remote branch
22+
23+
If no `branch` option is specified, or it doesn't include a remote (omitting the remote is recommended for GitHub repos because of forks), Beachball has to determine the correct remote for comparison using git operations and potentially `package.json` `"repository"`. You can reduce git operations by [providing certain settings](../overview/configuration#specifying-the-target-branch-and-remote). This most noticeably improves the performance of `beachball change` and `beachball check`.
24+
25+
## Concurrency
26+
27+
### Publish and hooks
28+
29+
**`concurrency`** (default: `1`) controls the maximum number of concurrent write operations during publish, including hook calls and `npm publish`. The default of `1` is conservative — if you don't use hooks, or your hooks are safe to run in parallel, increasing this can speed up publishing:
30+
31+
```js
32+
const config = {
33+
concurrency: 5,
34+
};
35+
```
36+
37+
Note that beachball respects topological order (package dependency order) regardless of this setting, so packages that depend on each other will still be published sequentially.
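
To make the interaction between `concurrency` and dependency order concrete, here is a minimal sketch. It is an illustration only, not Beachball's implementation: the `planPublishBatches` helper and the `deps` map are hypothetical, and a real scheduler may start a package as soon as its dependencies finish rather than working in fixed batches.

```js
// Illustrative only: NOT Beachball's actual scheduling code.
// `deps` (hypothetical data) maps each package name to the in-repo packages it depends on.
function planPublishBatches(deps, concurrency) {
  const remaining = new Map(Object.entries(deps).map(([name, d]) => [name, new Set(d)]));
  const batches = [];
  while (remaining.size > 0) {
    // Packages whose in-repo dependencies have all been published already
    const ready = [...remaining.keys()].filter((name) => remaining.get(name).size === 0);
    if (ready.length === 0) throw new Error('Dependency cycle detected');
    // Never run more than `concurrency` publishes at once
    const batch = ready.slice(0, concurrency);
    batches.push(batch);
    for (const name of batch) remaining.delete(name);
    for (const pending of remaining.values()) {
      for (const name of batch) pending.delete(name);
    }
  }
  return batches;
}

const batches = planPublishBatches({ a: [], b: [], c: ['a'], d: ['c'] }, 5);
console.log(batches); // [ [ 'a', 'b' ], [ 'c' ], [ 'd' ] ]
```

Even with `concurrency: 5`, `c` and `d` form a dependency chain and are published one after another; only the independent packages `a` and `b` benefit from the higher limit.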
38+
39+
### npm registry read
40+
41+
When syncing or publishing, beachball fetches version information from the npm registry for each package. In large monorepos with many packages, this can be slow.
42+
43+
**`npmReadConcurrency`** (default: `5`) controls how many registry reads happen at once. Increasing this can significantly speed up the fetch step:
44+
45+
```js
46+
const config = {
47+
npmReadConcurrency: 10,
48+
};
49+
```
50+
51+
## Reducing git repository size
52+
53+
Beachball's changelogs and change files can have a [shockingly large impact](https://github.com/microsoft/beachball/issues/978) on git repository size. Some of the related issues have been improved directly in git and/or Azure DevOps, but it's still highly recommended to enable some of these settings in a large repo.
54+
55+
### Disable `CHANGELOG.json` if not using
56+
57+
If you don't have a workflow that uses `CHANGELOG.json` (most common), set **`generateChangelog: 'md'`** to only generate `CHANGELOG.md`.
58+
After enabling, you must **manually** delete existing `CHANGELOG.json` files.
59+
60+
```js
61+
const config = {
62+
generateChangelog: 'md',
63+
};
64+
```
65+
66+
It's also possible to disable changelog generation entirely with `generateChangelog: false`, though this defeats one of the main points of the tool.
67+
68+
### Limit number of versions in changelog
69+
70+
Set **`changelog.maxVersions`** to limit how many versions are included in each package's changelog. This prevents the changelog's history from growing indefinitely. Older versions will still be available from git history, and a note will be added directing people to look there.
71+
72+
```js
73+
const config = {
74+
// You can experiment with values
75+
changelog: { maxVersions: 100 },
76+
};
77+
```
78+
79+
### Add hash to changelog file names
80+
81+
Enable **`changelog.uniqueFilenames`** to add a unique suffix to changelog filenames, based on the hash of the package name: e.g. `CHANGELOG-d7d39c3f.md`/`.json`. [Increasing filename uniqueness](https://github.com/microsoft/beachball/pull/996) can improve git performance. This has since been improved in Git itself, but it still doesn't hurt to enable.
82+
83+
When this is initially enabled, any existing changelog files will be renamed. If the package name (and therefore the hash) changes, renaming the file should also be handled automatically.
84+
85+
```js
86+
const config = {
87+
changelog: { uniqueFilenames: true },
88+
};
89+
```
90+
91+
## Skipping change commit hashes
92+
93+
By default, beachball records the git commit hash for each change in `CHANGELOG.json`, which adds overhead during bumping. You can disable this with **`changelog.includeCommitHashes`**:
94+
95+
```js
96+
const config = {
97+
changelog: { includeCommitHashes: false },
98+
};
99+
```
100+
101+
## Selectively skipping remote fetch
102+
103+
By default, beachball fetches from the remote before comparing changes. If there's a specific situation where you're **certain** the local branch is already up to date or are willing to accept the tradeoff for performance, you can skip this with `--no-fetch` (or `fetch: false` conditionally in the config).

docs/overview/configuration.md

Lines changed: 35 additions & 10 deletions
@@ -109,7 +109,7 @@ For the latest full list of supported options, see `RepoOptions` [in this file](
109109
[2]: https://github.com/microsoft/beachball/blob/main/src/types/ChangelogOptions.ts
110110
[3]: ../concepts/groups#version-groups
111111
[4]: https://github.com/microsoft/beachball/blob/main/src/types/BeachballOptions.ts
112-
[5]: #determining-the-target-branch-and-remote
112+
[5]: #specifying-the-target-branch-and-remote
113113
[6]: #glob-matching
114114

115115
### Glob matching
@@ -129,6 +129,7 @@ This option takes a list of patterns which are matched against package paths. Pa
129129
Example: with this config, `beachball` will only consider packages under `packages/foo` (excluding `packages/foo/bar`).
130130

131131
```json
132+
// in beachball.config.js or root package.json "beachball"
132133
{
133134
"scope": ["packages/foo/*", "!packages/foo/bar"]
134135
}
@@ -138,27 +139,51 @@ On the command line, this could be specified as `--scope 'packages/foo/*' --scop
138139

139140
> Note: if you have multiple sets of packages in the repo with different scopes, `groupChanges` is not supported.
140141
141-
### Determining the target branch and remote
142+
### Specifying the target branch and remote
142143

143144
The `branch` option is the official target branch to compare against when determining changes.
144145

145-
In GitHub repos where contributions may come from forks, you should use the **name only (no remote)** and specify `repository` in the repo root `package.json`. This allows finding the official remote by matching the URL (most formats are supported), regardless of what the user decided to call the remote. For example:
146+
#### Repos which may have forks
147+
148+
If you have a public GitHub repo or another situation where **any** contributions might come from a fork, `branch` should use the **name only (no remote)** (since users can choose arbitrary names for their own remote and the official remote):
146149

147150
```json
151+
// in beachball.config.js or root package.json "beachball"
152+
{
153+
"branch": "main"
154+
}
155+
```
156+
157+
To ensure Beachball can reliably determine which local remote name corresponds to the official remote, set `repository` in the repo root `package.json`:
158+
159+
```json
160+
// repo root package.json
148161
{
149-
"name": "my-repo",
150162
"repository": {
151163
"type": "git",
152-
"url": "https://github.com/my-org/my-repo"
153-
},
154-
"beachball": {
155-
"branch": "main"
164+
// your repository URL here (most formats are supported)
165+
"url": "https://github.com/microsoft/beachball"
156166
}
157167
}
158168
```
159169

160-
In private repos that use a single remote with branches instead of forks, you can either include a remote name (e.g. `branch: 'origin/main'`) if you're certain everyone will use the same remote name, or only include the branch name and specify `repository` as above.
170+
#### Repos with a single remote (mostly Azure DevOps)
171+
172+
If **all** your contributors use branches on a single remote (as opposed to forking), you can specify the remote name as part of the `branch` setting. This is almost always the model used in internal Azure DevOps repos. (Do NOT use this approach for public GitHub repos.)
173+
174+
```json
175+
// in beachball.config.js or root package.json "beachball"
176+
{
177+
"branch": "origin/main"
178+
}
179+
```
180+
181+
For safety as a fallback, it's still recommended to set `repository` in your repo root `package.json` as detailed above.
182+
183+
#### How Beachball determines the branch and remote
161184

162185
If `branch` isn't specified, the default branch name is the system default branch name (`main` or `master`).
163186

164-
If `branch` doesn't include a remote and it can't be determined from `package.json` `repository`, the fallback remote is `upstream` if defined, `origin` if defined, or the first defined remote.
187+
If `branch` doesn't include a remote (or isn't specified), Beachball will first look for a remote matching `package.json` `repository`. If that's not found, the fallback remote is `upstream` if defined, `origin` if defined, or the first defined remote.
188+
189+
All the fallback logic involves git operations, so especially in large repos, it's best to give Beachball a hint using one of the approaches specified above for efficiency.
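
The resolution order described above can be sketched as follows. This mirrors the documented order only and is not Beachball's source: `pickRemote` and `normalizeUrl` are hypothetical helpers, and the URL comparison here is deliberately simplistic (the real matching supports more URL formats).

```js
// Illustrative only: follows the documented fallback order, not Beachball's code.
// `remotes` maps local remote names to URLs; `repositoryUrl` is the
// package.json "repository" URL, if one is set.
function normalizeUrl(url) {
  // Simplistic normalization (assumption): ignore case, trailing slash, and ".git"
  return url.toLowerCase().replace(/\/$/, '').replace(/\.git$/, '');
}

function pickRemote(remotes, repositoryUrl) {
  const names = Object.keys(remotes);
  if (repositoryUrl) {
    const target = normalizeUrl(repositoryUrl);
    const match = names.find((name) => normalizeUrl(remotes[name]) === target);
    if (match) return match;
  }
  if (names.includes('upstream')) return 'upstream';
  if (names.includes('origin')) return 'origin';
  return names[0]; // first defined remote (undefined if there are none)
}

const remotes = {
  origin: 'https://github.com/some-user/beachball.git',
  official: 'https://github.com/microsoft/beachball.git',
};
const withRepository = pickRemote(remotes, 'https://github.com/microsoft/beachball');
const withoutRepository = pickRemote(remotes, undefined);
console.log(withRepository); // official
console.log(withoutRepository); // origin
```

In the fork scenario above, only the `repository` hint finds the official remote; without it, the fallback picks the contributor's own fork (`origin`).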

packages/beachball/README.md

Lines changed: 4 additions & 0 deletions
@@ -110,6 +110,10 @@ beachball publish -r http://localhost:4873 -t beta
110110

111111
In large monorepos, the process of fetching versions for sync or before publishing can be time-consuming due to the high number of packages. To optimize performance, you can override the concurrency for fetching from the registry by setting `options.npmReadConcurrency` (default: 5). You can also increase concurrency for hook calls and publish operations via `options.concurrency` (default: 1; respects topological order).
112112

113+
### Optimizing for large monorepos
114+
115+
If you have a large to very large monorepo, there are several configuration options and strategies that can help improve Beachball's performance. For details, see the [large repos guide](https://microsoft.github.io/beachball/concepts/large-repos.html).
116+
113117
### API surface
114118

115119
Beachball **does not** have a public API beyond the provided [options](https://microsoft.github.io/beachball/overview/configuration.html). Usage of private APIs is not supported and may break at any time.
