Modernize Sphinx docs toolchain, add RTD build, and expand API docs#1089
Modernize Sphinx docs toolchain, add RTD build, and expand API docs#1089adcroft merged 10 commits intoNOAA-GFDL:dev/gfdlfrom
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## dev/gfdl #1089 +/- ##
============================================
- Coverage 38.76% 37.31% -1.45%
============================================
Files 274 263 -11
Lines 92265 91065 -1200
Branches 17823 17795 -28
============================================
- Hits 35765 33984 -1781
- Misses 49796 50357 +561
- Partials 6704 6724 +20 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
d5bc339 to
9efee50
Compare
|
It seems to fix all of the outstanding problems, so I think it is the way to go. A link to example documentation would be good if available. The |
|
I edited the PR description to add a link. Peruse at your leisure... |
There was a problem hiding this comment.
Pull request overview
This PR modernizes the Sphinx documentation toolchain (moving to upstream Sphinx 8), vendors the Doxygen→Sphinx bridge extension in-tree, adds Read the Docs build support, and expands the generated API/source-browser documentation.
Changes:
- Upgrade docs dependencies (Sphinx 8.x) and pin upstream
VACUMM/sphinx-fortranto a reproducible commit. - Vendor and extend
autodoc_doxygenunderdocs/_ext/, including source-browser generation, performance fixes, and richer Fortran API rendering. - Add RTD build configuration and new API index pages (Functions, Source Files), plus related Makefile/Doxygen config updates.
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| docs/requirements.txt | Moves to Sphinx 8.x and pins upstream sphinx-fortran; documents new deps. |
| docs/conf.py | Enables vendored extension, RTD-safe paths, and two compatibility monkey-patches. |
| docs/apiref.rst | Adds links to new Functions and Source Files indexes. |
| docs/api/files.rst | New toctree for generated source-browser pages. |
| docs/_static/autodoxysource.css | Styling for the new source-browser directive output. |
| docs/_ext/autodoc_doxygen/init.py | Extension entrypoint; loads merged Doxygen XML + registers directives/config. |
| docs/_ext/autodoc_doxygen/xmlutils.py | Doxygen XML → RST formatting, math/ref handling, and performance improvements. |
| docs/_ext/autodoc_doxygen/autodoc.py | Fortran-domain autodoc documenters + [source] links. |
| docs/_ext/autodoc_doxygen/autodoxysource.py | New .. autodoxysource:: directive for per-file highlighted source listings. |
| docs/_ext/autodoc_doxygen/autosummary/init.py | Custom autosummary directive for Doxygen/Fortran objects. |
| docs/_ext/autodoc_doxygen/autosummary/generate.py | Generates module/page stubs + new function index + source stub pages. |
| docs/_ext/autodoc_doxygen/autosummary/templates/doxysource.rst | Template for per-file source stub pages. |
| docs/_ext/autodoc_doxygen/autosummary/templates/doxypage.rst | Template for generated Doxygen “page” content. |
| docs/_ext/autodoc_doxygen/autosummary/templates/doxymodule.rst | Template for module pages with types + functions/subroutines summaries. |
| docs/README.md | Updates documentation on the new toolchain, RTD behavior, and credits. |
| docs/Notes-sphinx-upgrade.md | New design/plan doc explaining the toolchain migration rationale. |
| docs/Makefile | Switches to sphinx-build -M and adds optional HTML equation post-processing hook. |
| docs/Doxyfile_rtd | Excludes CVMix and enables XML program listings for source browser. |
| docs/.gitignore | Ignores Python bytecode caches for vendored extension. |
| .readthedocs.yml | Adds a custom RTD build command enabling parallel Sphinx builds. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
9efee50 to
f08ebdd
Compare
This is piece 1 of the sphinx toolchain upgrade. The docs build
previously depended on four forks which haven't been updated since 2020.
The chain pinned the entire build to Sphinx 3.2.1, requiring a growing
list of transitive version ceilings (jinja2<3.1, sphinxcontrib_*<1.0.x,
alabaster<0.7.14, setuptools<82).
This commit handles the sphinxcontrib-autodoc_doxygen dependency. The
upstream (rmcgibbo) has been dormant since June 2021; the fork was
effectively a rewrite (~90% of xmlutils.py changed, both documenters
replaced, the whole domain switched from cpp to f). Monkey-patching
the upstream package was rejected because there was nothing stable to
patch against. Instead the fork's code is vendored in-tree at
docs/_ext/autodoc_doxygen/ so it can be debugged, blamed, and edited
like any other project source.
Changes vs the fork at tag 0.7.13:
- Renamed doxynamespace.rst template to doxymodule.rst and dropped
the unused C++ doxyclass.rst (DoxygenClassDocumenter was already
unregistered in the fork).
- Removed ~80 lines of dead code: commented-out pdb.set_trace lines,
the unregistered DoxygenClassDocumenter class, the dead
visit_ref_angus alternative, _import_by_name_original, and the
unused try-import of flint.
- Dropped Python 2 compatibility cruft: from __future__ imports,
from six import itervalues (replaced with dict.values()).
- Ported four Sphinx 3 -> 8 API changes:
* DoxygenAutosummary.get_items now passes self.bridge (the
DocumenterBridge) to documenter constructors instead of self,
so Documenter.__init__ finds directive.genopt.
* self.warn / self.directive.warn calls (removed in Sphinx 8)
replaced with sphinx.util.logging.getLogger(...).warning.
* Wrapped env.doc2path in os.fspath to silence the
RemovedInSphinx90Warning about str paths.
docs/conf.py: prepend _ext to sys.path and rename the extension in
the extensions list from sphinxcontrib.autodoc_doxygen to
autodoc_doxygen.
docs/conf.py also carries a monkey-patch for a parallel-build bug in
upstream VACUMM/sphinx-fortran's FortranDomain.merge_domaindata. The
upstream method references an undefined `outNames` (typo for
`ourNames`) and also unpacks the wrong tuple shape from modules and
objects dicts. With -j > 1 the merge silently fails and the f domain
ends up empty, losing every Fortran cross-reference. The patch is
heavily commented with a TODO to submit an upstream PR and drop the
workaround during piece 2 of the upgrade.
docs/requirements.txt: drop the sphinxcontrib-autodoc_doxygen git
dependency. The other three forks remain pending pieces 2, 3, 4.
Validation: built against a new venv.sph8 (Python 3.11 + Sphinx 8.2.3 +
sphinx-rtd-theme 3.1.0 + sphinxcontrib-bibtex 2.6.5 + upstream
VACUMM/sphinx-fortran master). `make html` produces 220 warnings vs
224 on the Sphinx 3 baseline. The f domain indexes 48 modules and
538 objects. Equation pages are effectively byte-identical (off only
by Sphinx 8's section-id hoist optimization and the pilcrow-as-CSS
theme change). Module pages are slightly larger because more
cross-references now resolve than in the baseline build.
Piece 2 of the sphinx toolchain upgrade. The jr3cermak/sphinx-fortran@1.2.2 fork existed to add Sphinx 3 API compatibility in 2020. Upstream VACUMM/sphinx-fortran has continued evolving since then (Sphinx logging API migration, parallel read support, modern setuptools, removal of the future dependency) and now works with Sphinx 8 directly. Upstream has not cut a PyPI release past 1.1.1 so we pin to a specific master commit for reproducibility. The pinned commit ships a broken FortranDomain.merge_domaindata that loses every f-domain object in parallel builds; the monkey-patch in conf.py from the previous commit handles that. We will revisit that patch and the pin together once an upstream PR is merged (see the TODO(piece-2) marker in conf.py setup). Validation: rebuilt docs with the pinned commit installed into venv.sph8. `make html` produces 220 warnings (identical to the pre-swap sph8 build), the f domain indexes 48 modules and 538 objects, mom.html has 36 internal cross-reference links, and f-modindex.html is generated. No behavior change from the sph8 build validated in the previous commit.
Piece 3 of the docs toolchain upgrade plan. This commit removes the last need for the patched sphinx fork and collapses the long list of transitive version ceilings it forced. Changes: - requirements.txt: drop git+https://github.com/jr3cermak/sphinx.git @v3.2.1mom6.4. Pin sphinx>=8,<9 from PyPI. Drop the eight ceiling lines (jinja2<3.1, sphinxcontrib_*<1.0.x, alabaster<0.7.14, setuptools<82.0.0) and the workaround comment around sphinx-fortran's broken requirements.txt; those existed only because Sphinx 3.2.1 dragged them in. Add lxml as an explicit dep (it is required by the vendored _ext/autodoc_doxygen extension to parse Doxygen XML). Keep six because the pinned sphinx-fortran commit imports it at module load time. - conf.py: add a monkey-patch for sphinx.util.math.wrap_displaymath that replicates the functional changes in the jr3cermak fork. The patch detects parts containing `\begin{equation}`, `\begin{eqnarray}`, or `\begin{align}` and emits them verbatim (no outer `\begin{equation}\begin{split}...\end{split} \end{equation}` wrapping). Without this, MOM6's math-heavy sources produce nested LaTeX environments that pdflatex chokes on. The patch is heavily commented with the rationale and a TODO marker for a possible upstream contribution. Verified by unit-testing the function directly (plain math still wrapped, explicit-environment math passed through verbatim) and by inspecting the generated _build/latex/ MOM6.tex: 0 double-wrapped equation/eqnarray/align blocks, 100 + 18 + 17 explicit environments preserved as top-level blocks. - conf.py: tighten exclude_patterns. Sphinx walks the entire source tree by default, which previously caused it to descend into local virtualenvs (`venv.sph3/`, `venv.sph8/`, `venv-3.11/`, etc.) and pick up LICENSE.rst, README.rst, and stray autosummary template files from inside site-packages. Each of those generated a "document isn't included in any toctree" warning and a corresponding stray .html file in the build output. The new exclusions cover venv*, venv-*, _build.*, and _ext/*/templates. Final build: 90 HTML files (down from 148, the lost 58 were all venv junk and template scaffolding) and 122 warnings (down from 220, with the same project- internal warnings as before plus the venv noise removed). - Makefile: add the equation post-processing hook to the html target, gated on UPDATEHTMLEQS=Y. The jr3cermak/sphinx fork carried this as a patch to sphinx/cmd/build.py so it ran after every sphinx-build invocation. Now invoked from the Makefile so it works against stock upstream Sphinx. The existing nortd target already had the same hook with different arguments (-p APIs -b doxygen vs -p html -b sphinx). The forked sphinx-fortran fix and the upstream PR for the math wrapping change are deferred to a follow-up. Validation: built from a fresh venv (venv.sph8.fresh) created by `python3.11 -m venv` and `pip install -r requirements.txt`. The install resolves cleanly with no manual intervention. `make html` succeeds with 122 warnings, indexes 48 modules and 538 f-domain objects, generates f-modindex.html, and produces 36 internal cross- reference links on mom.html. `make html UPDATEHTMLEQS=Y` runs the post-processing script successfully. `make latex` produces 0 double-wrapped math environments.
Piece 4 of the docs toolchain upgrade plan. With this commit the dependencies in docs/requirements.txt are only on maintained upstream sources. flint (marshallward/flint, vendored as jr3cermak/flint at 0.0.1) was historically pulled into the docs build with the intent of patching up Doxygen's incomplete parsing of Fortran functions with `result()` clauses. The fork's sphinxcontrib-autodoc_doxygen imported it at module load time but never actually called it; the import was a no-op behind a `try: import flint; except: pass`. We removed that import as part of piece 1's dead-code cleanup, so the vendored extension no longer touches flint at all. Verified empirically: uninstalled flint from venv.sph8.fresh and rebuilt. The build is byte-identical to the previous one with flint installed (122 warnings, 48 f-modules, 538 f-objects, 36 internal cross-reference links on mom.html, f-modindex.html present, 90 html files). Nothing in the current pipeline depends on flint. The two surviving "flint" mentions in docs/_ext/autodoc_doxygen/ are both in comments describing the original intent, not live code; they are left in place as historical context. If a future enhancement wants to actually fix Doxygen's result() clause parsing, that should be a separate piece of work and would not necessarily resurrect flint specifically. docs/README.md is updated to drop the flint bullet from the requirements list and the flint row from the credits table. Other stale entries in that section (sphinxcontrib_autodox-doxygen, the sphinx and sphinx-fortran fork rows) remain and will need a broader README sweep in a follow-up.
Consolidates four small fixes that followed the Sphinx 8 upgrade: - Switch all Makefile targets from `sphinx-build -b` to `-M` for consistent parallel build support. - Update docs/README.md to match the modernized toolchain (remove references to the four sphinx* forks, update requirements and credits sections). - Fix crash in autodoc_doxygen visit_image on empty <image> elements (node.text is None when doxygen produces <image/> without caption text). - Add __pycache__ and *.pyc to docs/.gitignore so the vendored extension's bytecode does not appear in git status.
Consolidates four fixes for the Read the Docs build environment: - Override RTD's default sphinx runner with a build.jobs.build.html entry in .readthedocs.yml that runs sphinx-build -M html -j auto, since RTD's high-level sphinx: key has no parallelism option. - Make doxygen_xml path resolution robust to cwd changes: resolve relative paths against app.confdir rather than the ambient cwd, which differs between local builds and RTD. - Switch Makefile html target from -j 4 to -j auto to match RTD. - Enable html_static_path = ['_static'] in conf.py so custom CSS files (autodoxysource.css) are copied into the build output. Previously commented out, which meant app.add_css_file() added the <link> tag but the file was never deployed.
Two instances of the same class of bug in the vendored
autodoc_doxygen extension, both scanning the entire merged doxygen
XML tree (1230 files, 109 MB with programlisting) on every call:
1. scanNode used `node.xpath('//latexonly')` etc. In XPath, `//`
starts from the document root, not from `node`. Since the
extension merges all XML into one tree, every call scanned
the whole tree. Fix: `//` -> `.//` (descendant-of-self).
Recovery: 6m33s -> 48.8s wall clock at full input scale.
2. visit_ref used `get_doxygen_root().findall('.//*[@id=X]')` to
resolve each prose <ref> by linearly scanning every element in
the merged tree. With XML_PROGRAMLISTING=YES the tree tripled
in size, making this the dominant cost (250s of 911s serial).
Fix: lazy {id: element} dict built once on first use, O(1)
lookup thereafter. Recovery: 911s -> 676s serial.
Generate one HTML page per Fortran source file with syntax highlighting and clickable cross-references. Clicking an identifier in the source jumps to its API documentation page; each API entry gets a [source] link back to the source listing. Implementation: - Enable XML_PROGRAMLISTING in Doxyfile_rtd (standalone commit originally, now folded in). - New autodoxysource directive in _ext/autodoc_doxygen/ that walks doxygen's <programlisting> XML, emitting per-line anchors, CSS-classed highlight spans, and pending_xref nodes for identifiers. - Stub generator produces 329 :orphan: source pages under api/generated/source/. - [source] links added to DoxygenMethodDocumenter, DoxygenTypeDocumenter, and DoxygenModuleDocumenter. - CSS styled to match the sphinx_rtd_theme's code blocks (font stack, size, colors, line gutter). - Node count optimization: coalesce text runs in _walk_highlight (10-30 nodes/line -> 3-5), set support_smartquotes=False on the source container to skip the smartquotes transform. Combined with the visit_ref fix, total build time with the source browser is 115s parallel (vs 360s before optimization).
Consolidates five improvements to the rendered API pages: - Show Fortran type declarations for function/subroutine parameters and derived-type members. Types are rendered as inline code (e.g. ``real, intent(in), optional``) by looking up the <param> elements on the parent <memberdef>. - Move [source] links from the top of each entry to the bottom, so the description and parameters appear first. - Fix functions (as opposed to subroutines) not being registered in the sphinx-fortran domain. format_name() was prepending the return type (e.g. "real") which sphinx-fortran's f_sig_re regex cannot parse, leaving function entries without anchors. - Add Functions and Source Files index pages to the API Reference, with cross-reference links to module pages and source browser pages respectively. - Exclude upstream CVMix sources from doxygen input to avoid warnings from undocumented third-party code.
Multi-line math content (e.g. \begin{align}...\end{align}) from
doxygen XML was emitted as:
.. math:: \begin{align}
\mathbf{v}
= \mathbf{u} + ...
\end{align}
RST requires directive body content to be indented. The
continuation lines at column 0 were parsed as regular text by
docutils, producing garbled LaTeX output with "Runaway argument"
and "Missing $" errors — 164 LaTeX errors on the Notation page
alone.
Fix: in visit_formula, detect multi-line math text and emit it
as a properly indented block:
.. math::
\begin{align}
\mathbf{v}
= \mathbf{u} + ...
\end{align}
Single-line math is unchanged (stays on the .. math:: line).
After fix: `make latexpdf` LaTeX errors drop from 164 to 10.
The remaining 10 are unrelated (9 Unicode Greek characters in
source comments that pdflatex cannot render, 1 duplicate label).
f08ebdd to
12560f2
Compare
Modernizes the docs build and adds new capabilities.
(sphinxcontrib-autodoc_doxygen, sphinx, sphinx-fortran, flint) that had pinned the build to Sphinx 3.2.1 and a growing
list of transitive version ceilings. End state: stock Sphinx 8 from PyPI, upstream VACUMM sphinx-fortran at a pinned
commit, autodoc_doxygen vendored in-tree at docs/_ext/ (expanding the original additions from jr3cermak). Two small
compatibility gaps are handled by commented monkey-patches in conf.py with TODO markers for upstream PRs: i) a math-wrap behavior from the sphinx fork and ii) a parallel-merge bug in upstream sphinx-fortran.
sphinx: key has no parallelism option); html_static_path is enabled so custom CSS actually deploys; doxygen_xml paths
are resolved against app.confdir so they work under RTD's cwd.
derived-type members; [source] links moved below the description; a bug that dropped functions (vs subroutines) from
the sphinx-fortran domain is fixed; new Functions and Source Files index pages; upstream CVMix sources excluded from
doxygen input.
bi-directionally linked with the API pages. Includes a multi-line LaTeX math indentation fix that dropped LaTeX errors
from 164 to 10 on the Notation page.
scanNode used // where it should have used .// (6m33s → 48.8s); visit_ref linearly scanned all elements to resolve
each , replaced with a lazy id→element dict (911s → 676s serial). End-to-end build with the new source browser is
115s parallel.
Commits are structured so each piece stands alone but this PR can be squashed. I have the necessary commits to create PRs for sphinx-fortran and sphinx which will later let us remove the patches in conf.py.
Testing:
mom.html)
navigate back
There's no way I could have fixed this in a short time: this PR was mostly debugged and implemented by AI.
Todo: