diff --git a/_config.yml b/_config.yml
index d374a10f..4f115d83 100644
--- a/_config.yml
+++ b/_config.yml
@@ -22,6 +22,7 @@ exclude:
   - vendor
   - Gemfile*
   - LICENSE
+  - gems
 
 defaults:
   -
diff --git a/_data/tool_and_resource_list.yml b/_data/tool_and_resource_list.yml
index d23c60b3..b5cdc96c 100644
--- a/_data/tool_and_resource_list.yml
+++ b/_data/tool_and_resource_list.yml
@@ -1237,3 +1237,65 @@
     CODECHECK tackles one of the main challenges of computational research by supporting codecheckers with a workflow, guidelines and tools to evaluate computer programs underlying scientific papers.
   url: https://codecheck.org.uk/
   catalog: RSQKit
+- id: pyright
+  name: Pyright
+  description: >-
+    Pyright is a fast type checker for Python, developed by Microsoft, with
+    support for type inference, strict mode, and integration with VS Code
+    and other editors via the Language Server Protocol.
+  url: 'https://github.com/microsoft/pyright'
+  catalog: RSQKit
+- id: lintr
+  name: lintr
+  description: >-
+    lintr is a static code analysis tool for R that checks for style,
+    syntax errors, and possible semantic issues, and integrates with
+    common R development environments.
+  url: 'https://lintr.r-lib.org/'
+  catalog: RSQKit
+- id: clang-tidy
+  name: clang-tidy
+  description: >-
+    clang-tidy is a clang-based C++ linter tool providing an extensible
+    framework for diagnosing and fixing typical programming errors, style
+    violations, and interface misuse.
+  url: 'https://clang.llvm.org/extra/clang-tidy/'
+  catalog: RSQKit
+- id: cppcheck
+  name: cppcheck
+  description: >-
+    cppcheck is a static analysis tool for C and C++ code that detects
+    bugs, undefined behaviour, and dangerous coding constructs without
+    requiring the code to compile.
+  url: 'https://cppcheck.sourceforge.io/'
+  catalog: RSQKit
+- id: fortran-linter
+  name: fortran-linter
+  description: >-
+    fortran-linter is a simple linting tool for Fortran source code that
+    checks for common style and correctness issues.
+  url: 'https://github.com/cphyc/fortran-linter'
+  catalog: RSQKit
+- id: flint
+  name: flint
+  description: >-
+    flint is a Fortran linter focused on enforcing coding standards and
+    detecting potential issues in Fortran source code.
+  url: 'https://github.com/JonasToth/flint'
+  catalog: RSQKit
+- id: jet-jl
+  name: JET.jl
+  description: >-
+    JET.jl is a Julia package that uses Julia's type inference to detect
+    potential errors and type instabilities in code without executing it,
+    functioning as a static analyser for Julia programs.
+  url: 'https://aviatesk.github.io/JET.jl/stable/'
+  catalog: RSQKit
+- id: aqua-jl
+  name: Aqua.jl
+  description: >-
+    Aqua.jl (Auto QUality Assurance for Julia packages) provides automated
+    quality checks for Julia packages, including ambiguity detection,
+    unbound type parameters, and stale dependency checks.
+  url: 'https://juliatesting.github.io/Aqua.jl/stable/'
+  catalog: RSQKit
diff --git a/pages/tasks/static_analysis.md b/pages/tasks/static_analysis.md
new file mode 100644
index 00000000..8c0ba1ac
--- /dev/null
+++ b/pages/tasks/static_analysis.md
@@ -0,0 +1,70 @@
+---
+title: "Using static analysis"
+description: "How to use static analysis tools to detect bugs, enforce coding standards, and improve the quality of research software."
+contributors: ["Shoaib Sufi"]
+page_id: static_analysis
+related_pages:
+  tasks: [ci_cd, code_review, languages_tools_infrastructures, reproducible_software_environments]
+quality_indicators: [has_no_linting_issues, uses_tool_for_warnings_and_mistakes, static_analysis_common_vulnerabilities]
+keywords: ["static analysis", "linting", "code quality", "type checking", "code smells", "pre-commit"]
+training:
+  - name: "EVERSE TeSS"
+    url: "https://everse-training.app.cern.ch"
+---
+
+## How do you use static analysis to improve the quality of your research software?
+
+This page provides an overview of static analysis, guidance on choosing appropriate tools, and pointers to further resources for common research software languages.
+
+### Description
+
+Static analysis is the automated examination of source code without executing it, used to detect bugs, enforce coding standards, identify security vulnerabilities, and flag code quality issues. When you are working on research software where correctness is critical and code often evolves rapidly across a team, static analysis catches whole classes of errors — type mismatches, undefined variables, unreachable code — before they reach your results. It is one of the lowest-effort, highest-return quality practices you can adopt.
+
+### Considerations
+
+- **Static analysis complements but does not replace your tests** — it catches structural and stylistic issues that your tests may not exercise, but cannot verify that your code produces scientifically correct results.
+- **Different tools serve different purposes** — linters enforce style and flag obvious errors; type checkers verify type correctness; security scanners look for known vulnerability patterns. You may need more than one.
+- **Your language matters** — tool ecosystems vary significantly. Python, R, Fortran, C/C++, and Julia each have different levels of static analysis support and different community norms.
+- **Strictness is configurable** — most tools can be tuned. Starting with a permissive configuration and tightening it over time is more practical than enforcing maximum strictness on an existing codebase all at once.
+- **Integration into your development workflow is what makes it stick** — static analysis run only occasionally has limited value; run automatically on every commit or pull request, it becomes a reliable quality gate.
+- **False positives are real** — all static analysis tools produce some false positives. You and your team should agree on which rules to enable and be prepared to suppress specific warnings with justification.
+- **Your existing codebase may have many pre-existing issues** — when introducing static analysis to legacy code, consider fixing issues incrementally rather than all at once, to avoid overwhelming your team.
+
+### Solutions
+
+**Conceptual guidance**
+
+- Decide what you want static analysis to do for your project: enforce a consistent style, catch bugs early, check types, or all three. This shapes which tools you need.
+- Treat static analysis warnings as a quality signal, not a bureaucratic hurdle. A warning that you always suppress without review is a warning that has stopped being useful.
+
+**Actionable steps**
+
+- **Choose a tool appropriate for your language and goals:**
+  - *Python*: {% tool "ruff" %} (fast linter and formatter), {% tool "mypy" %} or {% tool "pyright" %} (type checking), {% tool "bandit" %} (security)
+  - *R*: {% tool "lintr" %} (linting and style)
+  - *C/C++*: {% tool "clang-tidy" %} (linting and bug-finding), {% tool "cppcheck" %} (static analysis)
+  - *Fortran*: {% tool "fortran-linter" %}, {% tool "flint" %} (options are more limited than for modern languages)
+  - *Julia*: {% tool "jet-jl" %} (type-based error detection), {% tool "aqua-jl" %} (package quality checks)
+- **Start with the tool's default or recommended configuration** — don't spend time customising rules before you understand what the tool flags in your codebase.
+- **Integrate into your CI pipeline** — run static analysis automatically on every pull request or push. Most tools produce output that CI systems (GitHub Actions, GitLab CI, etc.) can surface as pass/fail checks.
+- **Add a pre-commit hook** — running a fast linter locally before you commit catches issues earlier and reduces CI noise. {% tool "precommit" %} is a widely used framework for managing this across languages.
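As a concrete sketch of the pre-commit step, a minimal `.pre-commit-config.yaml` along the following lines wires Ruff into every local commit; the `rev` pin shown is illustrative, so substitute the hooks and versions your project actually uses:

```yaml
# .pre-commit-config.yaml -- run Ruff's linter and formatter before each commit
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.9          # illustrative pin; use the latest tagged release
    hooks:
      - id: ruff         # lint, auto-applying fixes that are safe
        args: [--fix]
      - id: ruff-format  # apply consistent formatting
```

After `pre-commit install`, the hooks run on every `git commit`; `pre-commit run --all-files` checks the whole tree on demand.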
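To make the CI integration step concrete, a GitHub Actions workflow along these lines runs the linter as a pass/fail check on every push and pull request; the file path, Python version, and choice of Ruff are assumptions to adapt to your project:

```yaml
# .github/workflows/lint.yml -- fail the build when the linter reports issues
name: lint
on: [push, pull_request]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install ruff
      - run: ruff check .   # exits non-zero on findings, failing the check
```

GitLab CI and other systems follow the same pattern: install the tool, run it, and let a non-zero exit code mark the pipeline stage as failed.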
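Since suppressing specific warnings with justification comes up in both the considerations and the team conventions, here is a hedged Python sketch of what an inline suppression can look like; the rule code and scenario are illustrative, not a recommendation of specific rules:

```python
import os

# A rule disabled for one line, with the reason recorded beside it, keeps
# the rest of the file under full checking. S105 is an example rule code.


def read_api_token() -> str:
    """Return the API token, with a fallback for local development only."""
    # S105 (flake8-bandit, exposed through Ruff) flags possible hardcoded
    # passwords. Justification: the placeholder grants no access and is
    # never used outside local development.
    return os.environ.get("API_TOKEN", "dev-placeholder")  # noqa: S105
```

mypy supports the analogous `# type: ignore[<code>]` comment, and Pyright has `# pyright: ignore[<rule>]`; whichever syntax applies, recording the justification next to the suppression is what keeps it reviewable.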
+- **Agree team conventions** — document which tools you are using, which rules are enabled, and how suppressions should be handled. This prevents individual developers from working around warnings silently.
+- **Fix issues incrementally** — if you are introducing static analysis to an existing project, use the tool's baseline or ignore-file mechanism to suppress pre-existing issues, then address them in batches over time.
+
+## Further Reading
+
+The following are authoritative and highly regarded resources for going deeper on static analysis tools, strategies, and integration practices. Practical and tool-focused resources are listed first, followed by broader texts that provide strategic and theoretical context.
+
+- **[Ruff documentation](https://docs.astral.sh/ruff/)** — The reference documentation for Ruff, now the most widely adopted Python linter and formatter. Exceptionally well-written, it covers the rationale behind individual rules and serves as a practical model for how a modern static analysis tool should work. Useful even if you only skim it for its approach to rule configuration and suppression.
+
+- **[pre-commit framework documentation](https://pre-commit.com/)** — The definitive guide to wiring static analysis and other checks into your local development workflow across any language. Covers installation, hook configuration, and CI integration — the most direct path from "I have a tool" to "the tool runs automatically".
+
+- **[Software Engineering at Google](https://abseil.io/resources/swe-book) — Winters, Manshreck & Wright (O'Reilly, freely available online)** — Chapter 20 covers how to think about static analysis at scale, including false positive rates, developer trust, and rolling tools out to large codebases. The strategic framing maps directly onto the considerations above, particularly if you are introducing static analysis to an established project.
+
+- **[Continuous Delivery](https://continuousdelivery.com/) — Jez Humble & David Farley** — The authoritative text on building automated delivery pipelines with quality gates built in. If you want to understand why static analysis belongs in your CI pipeline rather than being run ad hoc, and how it fits alongside testing and other automated checks, this is the place to start.
+
+- **[Code Complete](https://www.microsoftpressstore.com/store/code-complete-9780735619678) — Steve McConnell (Microsoft Press)** — A foundational text on software construction quality. Not exclusively about static analysis, but provides the broader context for why code quality practices matter and how they interact. Most useful if you want to understand the principles behind the tools rather than just use them.
+
+## AI Disclosure
+
+This work was produced with the assistance of Claude Sonnet 4.6, under the strict editorial control and factual verification of the human author.