|
| 1 | +--- |
| 2 | +title: FAIR Research Software Principles |
| 3 | +--- |
| 4 | + |
| 5 | + |
| 6 | +## FAIR Research Software |
| 7 | + |
| 8 | +FAIR stands for Findable, Accessible, Interoperable, and Reusable and comprises a set of principles designed to |
| 9 | +increase the visibility and usefulness of your research to others. |
| 10 | +The FAIR data principles, first published [in 2016][fair-data-principles], are widely known and applied today. |
| 11 | +Similar [FAIR principles for software][fair-principles-research-software] have now been defined too. In general, they mean: |
| 12 | + |
| 13 | +- **Findable** - software and its associated metadata must be easy to discover by humans and machines. |
| 14 | +- **Accessible** - in order to reuse software, the software and its metadata must be retrievable by standard protocols, free and legally usable. |
| 15 | +- **Interoperable** - software must interact with other software by exchanging data and/or metadata through |
| 16 | + standardised protocols and application programming interfaces (APIs). |
| 17 | +- **Reusable** - software should be usable (can be executed) and reusable |
| 18 | + (can be understood, modified, built upon, or incorporated into other software). |
| 19 | + |
| 20 | +Each of the above principles can be achieved by a number of practices listed below. |
| 21 | +This is not an exact science, and the list below is by no means exhaustive, |
| 22 | +but any of the practices that you employ in your research software workflow will bring you |
| 23 | +closer to the gold standard of fully reproducible research. |
| 24 | + |
| 25 | +### Findable |
| 26 | +- Create a description of your software to make it discoverable by search engines and other search tools |
| 27 | +- Use standards (such as [CodeMeta][codemeta]) to describe your software with interoperable metadata (see [Research Software Metadata Guidelines][rsmd-1]) |
| 28 | +- Place your software in a public software repository (and ideally register it in a [general-purpose or domain-specific software registry][software-registries]) |
| 29 | +- Use a unique and persistent identifier, such as a DOI, for your software (e.g. by depositing your code on [Zenodo][zenodo]), |
| 30 | +which is also useful for citations - note that hosting your data/code on GitHub or a similar platform alone |
| 31 | +may not be enough, as such platforms may change their open access model or disappear completely in the future; archiving your code gives it a better chance of being preserved |
| 32 | + |
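As an illustration of the metadata practice above, here is a minimal sketch that builds a small CodeMeta-style description and serialises it as JSON (conventionally stored as `codemeta.json`). The function name `build_codemeta`, the project name and the `example.org` URL are hypothetical, and the CodeMeta 2.0 context URL is an assumption about the schema version you would target; a real description would usually carry more properties.

```python
import json


def build_codemeta(name, description, repository_url, licence_url):
    """Return a minimal CodeMeta-style metadata record as a dictionary."""
    return {
        # Assumption: CodeMeta 2.0 JSON-LD context; newer versions exist.
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "description": description,
        "codeRepository": repository_url,
        "license": licence_url,
    }


# Hypothetical project details, for illustration only.
metadata = build_codemeta(
    name="example-analysis-toolkit",
    description="Tools for cleaning and summarising example survey data.",
    repository_url="https://example.org/example-analysis-toolkit",
    licence_url="https://spdx.org/licenses/MIT",
)
codemeta_json = json.dumps(metadata, indent=2)
print(codemeta_json)
```

Keeping the record in a machine-readable file at the root of the repository is what lets registries and search tools pick it up without a human reading the README.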
| 33 | +### Accessible |
| 34 | +- Make sure people can obtain a copy of your software using standard communication protocols (e.g. HTTP, FTP) |
| 35 | +- Keep the code and its description (metadata) available even when the software is no longer actively developed (this includes earlier versions of the software) |
| 36 | + |
| 37 | +### Interoperable |
| 38 | +- Explain the functionality of your software and protocols for interaction with it |
| 39 | +- Use community-agreed standard formats for inputs and outputs of your software and its metadata (e.g. [CodeMeta][codemeta]) |
| 40 | +- Communicate with other software and tools via standard protocols and APIs |
| 41 | + |
| 42 | +### Reusable |
| 43 | +- Document your software (including its functionality, how to install and run it) to make it more understandable by |
| 44 | +others who may wish to reuse or extend it |
| 45 | +- Follow best practices for software development, e.g. structure your code using common patterns and use coding |
| 46 | +conventions to make your code readable and understandable by people |
| 47 | +- Test your software and make sure it works on different platforms/operating systems |
| 48 | +- Give a licence to your software clearly stating how it can be reused |
| 49 | +- State how to cite your software, so people can give you credit when they reuse it |
| 50 | + |
| 51 | +## Tools and practices for FAIR research software development |
| 52 | + |
| 53 | +There are various tools and practices that support the development of FAIR research software, contributing to each of the four FAIR principles. |
| 54 | +These tools and practices work together: no single tool or practice fully addresses a principle on its own, but each can |
| 55 | +contribute to several principles simultaneously. |
| 56 | + |
| 57 | +It is important to note that simply using these tools, without following good practice and guidance on how best to align |
| 58 | +their usage with the FAIR principles, is not enough to produce FAIR software. |
| 59 | +In addition, FAIR is not a [software quality metric](https://everse.software/RSQKit/rs_quality) even though it can improve software quality in several aspects - |
| 60 | +software may be FAIR, but still not very good in terms of its functionality. |
| 61 | + |
| 62 | +### Development environments |
| 63 | + |
| 64 | +Integrated development environments (IDEs), such as VS Code or PyCharm, help with reading, running, testing, and debugging code. |
| 65 | +Virtual environments complement them by letting us share our working environments with others, making it easier to reuse and extend our code. |
| 66 | +IDEs often provide integrations with other tools, e.g. version control and command line terminals, enabling you to do many tasks from a single environment, |
| 67 | +saving time in switching between different tools. |
| 68 | + |
| 69 | +### Command line terminals |
| 70 | + |
| 71 | +Command line terminals and shells (e.g. Bash, Git Bash) enable us to run and test our code without the graphical user interfaces (GUIs) that IDEs provide - |
| 72 | +this is sometimes needed for running our code remotely on servers and high-performance systems without a GUI provision, where time, |
| 73 | +memory and processing power are expensive or in high demand. |
| 74 | + |
| 75 | +Version control systems are typically provided as command line tools, so we often need a command line terminal to enter their commands and to access |
| 76 | +remote version control servers for backing up and sharing our work. |
| 77 | + |
| 78 | +Finally, command line tools are interoperable software that use standard protocols for passing parameters, inputs and outputs via the command line terminal. |
| 79 | +This makes it easier to integrate with other tools, allowing us to chain command line tools and build up complex and reproducible workflows and analysis pipelines |
| 80 | +using several programs in different steps. |
| 81 | +If we write our software so that it provides such an interoperable command line interface, we will be able to integrate it with other command line tools to |
| 82 | +automate and speed up our work. |
| 83 | + |
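To make the idea of an interoperable command line interface concrete, here is a minimal sketch of a filter-style tool: it takes its parameter as a command line argument, reads CSV from standard input and writes JSON to standard output. The function names `column_mean` and `main`, and the choice of computing a column mean, are illustrative assumptions rather than anything prescribed by the FAIR principles.

```python
import argparse
import csv
import json
import sys


def column_mean(lines, column):
    """Return the mean of a numeric CSV column, given the CSV as an iterable of lines."""
    values = [float(row[column]) for row in csv.DictReader(lines)]
    if not values:
        raise ValueError("no data rows found")
    return sum(values) / len(values)


def main(argv=None, stdin=sys.stdin, stdout=sys.stdout):
    """Parse arguments, read CSV from stdin, write a JSON summary to stdout."""
    parser = argparse.ArgumentParser(description="Mean of one CSV column, reported as JSON.")
    parser.add_argument("column", help="name of the numeric column to average")
    args = parser.parse_args(argv)
    try:
        result = {"column": args.column, "mean": column_mean(stdin, args.column)}
    except (KeyError, ValueError) as error:
        # Report problems on stderr and signal failure through the exit code,
        # so that other tools in a pipeline can react to it.
        print(f"error: {error}", file=sys.stderr)
        return 1
    json.dump(result, stdout)
    stdout.write("\n")
    return 0
```

Saved as a script that ends with a call to `sys.exit(main())` (say, a hypothetical `column_mean.py`), it could be chained with other tools, for example `cat data.csv | python column_mean.py temp`, because its inputs, outputs and exit code follow common command line conventions.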
| 84 | +### Standard input/output formats and communication protocols |
| 85 | + |
| 86 | +Using standard data exchange formats, input and output formats, and communication protocols helps create software that can more readily integrate |
| 87 | +with other tools into more complex pipelines - increasing its interoperability and reusability. |
| 88 | + |
| 89 | +### Version control tools |
| 90 | + |
| 91 | +Version control means knowing what changes were made to your code, when and by whom - promoting code ownership, responsibility and credit. |
| 92 | +When combined with software sharing and collaborative platforms such as GitHub or GitLab, it facilitates code publication, sharing and findability, |
| 93 | +teamwork and discussions about software and design decisions, provides backup facilities for your code and speeds up |
| 94 | +collaboration on shared code by allowing edits by more than one person at a time. |
| 95 | + |
| 96 | +### Code testing |
| 97 | + |
| 98 | +Testing helps ensure that your code is correct and does what it sets out to do. |
| 99 | +When you write code you often feel very confident that it is perfect, but when writing larger programs or code that is meant to do complex operations |
| 100 | +it is very hard to consider all possible edge cases or notice every single typing mistake. |
| 101 | +Testing also gives other people confidence in your code as they can see an example of how it is meant to run and be assured that it does work |
| 102 | +correctly on their machine - helping with code understanding and reusability. |
| 103 | + |
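As a small illustration of how tests document intended behaviour and catch edge cases, here is a sketch of a function with three tests written as plain functions containing assertions (a style that test runners such as pytest can collect). The function `celsius_to_kelvin` and its rule of rejecting temperatures below absolute zero are hypothetical examples, not part of any particular project.

```python
import math


def celsius_to_kelvin(celsius):
    """Convert a temperature from degrees Celsius to kelvin."""
    kelvin = celsius + 273.15
    if kelvin < 0:
        raise ValueError("temperature below absolute zero")
    return kelvin


def test_typical_value():
    # A typical input shows readers how the function is meant to be used.
    assert math.isclose(celsius_to_kelvin(25.0), 298.15)


def test_absolute_zero_is_allowed():
    # Edge case: the lowest valid input.
    assert math.isclose(celsius_to_kelvin(-273.15), 0.0, abs_tol=1e-9)


def test_below_absolute_zero_is_rejected():
    # Edge case: physically impossible input should fail loudly.
    try:
        celsius_to_kelvin(-300.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for a temperature below absolute zero")
```

Someone reusing the code can run these tests on their own machine to check that it behaves as described before building on it.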
| 104 | +### Coding conventions |
| 105 | + |
| 106 | +Following the coding conventions and style guides that the community has agreed upon for your programming language |
| 107 | +is an important practice: it helps others read your code and reuse or extend it in their own examples and applications. |
| 108 | + |
| 109 | + |
| 110 | +### Code licensing |
| 111 | + |
| 112 | +A licence is a legal document which sets down the terms under which the creator of a work (such as written text, |
| 113 | +photographs, films, music, software code) is releasing what they have created for others to use, modify, extend or exploit. |
| 114 | + |
| 115 | +It is important to state the terms under which software can be reused - without a licence, |
| 116 | +others generally have no legal permission to reuse the software at all. |
| 117 | +A common way to declare your copyright of a piece of software and the licence you are distributing it under is to |
| 118 | +include a file called LICENSE in the root directory of your code repository. |
| 119 | + |
| 120 | +Some good resources to check out for choosing a licence for your code: |
| 121 | + |
| 122 | +- [The open source guide][opensource-licence-guide] on applying, changing and editing licences. |
| 123 | +- [choosealicense.com][choosealicense] has some great resources to help you choose a licence that is appropriate for your needs, |
| 124 | +and can even automate adding the LICENSE file to your GitHub code repository. |
| 125 | + |
| 126 | +### Code citation |
| 127 | + |
| 128 | +We should add a citation file to our repository to provide instructions on how and when to cite our code. |
| 129 | +A citation file can be a plain text (CITATION.txt) or a Markdown file (CITATION.md), but there are certain benefits |
| 130 | +to using a special file format called the [Citation File Format (CFF)][cff], which provides a way to include richer |
| 131 | +metadata about code (or datasets) we want to cite, making it easy for both humans and machines to use this information. |
| 132 | + |
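To show what such a citation file might contain, here is a minimal sketch that renders the text of a `CITATION.cff` file with the core CFF keys (`cff-version`, `message`, `title`, `authors`) and optional `version` and `doi`. The helper name `render_cff`, the software title and the author name are hypothetical, and targeting CFF version 1.2.0 is an assumption.

```python
import json


def render_cff(title, authors, version=None, doi=None):
    """Render a minimal CITATION.cff document as text.

    `authors` is a list of (given_names, family_names) tuples.
    """
    # JSON string literals are also valid YAML double-quoted scalars,
    # so json.dumps gives us safe quoting without a YAML library.
    quote = json.dumps
    lines = [
        "cff-version: 1.2.0",  # assumption: CFF schema version 1.2.0
        'message: "If you use this software, please cite it as below."',
        f"title: {quote(title)}",
        "authors:",
    ]
    for given, family in authors:
        lines.append(f"  - given-names: {quote(given)}")
        lines.append(f"    family-names: {quote(family)}")
    if version is not None:
        lines.append(f"version: {quote(version)}")
    if doi is not None:
        lines.append(f"doi: {quote(doi)}")
    return "\n".join(lines) + "\n"


# Hypothetical software and author, for illustration only.
example = render_cff("Example Analysis Toolkit", [("Ada", "Example")], version="0.1.0")
print(example)
```

In practice the file is often written by hand; the point of the structured format is that both people and tools can read the same citation details.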
| 133 | +### Code- and project-level documentation |
| 134 | + |
| 135 | +Documentation comes in many forms - from **code-level documentation** including descriptive names of variables and functions and |
| 136 | +additional comments that explain lines of your code, to **project-level documentation** (including README, LICENSE, CITATION, CONTRIBUTING, etc. files) |
| 137 | +that helps others discover your software, explains the legal terms of reusing it, describes its functionality and how to install, run and contribute to it, |
| 138 | +to whole websites full of documentation with function definitions, usage examples, tutorials and guides. |
| 139 | +You may not need as much documentation as a large commercial software product, but making your code reusable relies on other people being able to understand |
| 140 | +what your code does and how to use it. |
| 141 | + |
| 142 | +### Software repositories and registries |
| 143 | + |
| 144 | +Having somewhere to share your code is fundamental to making it findable and accessible. |
| 145 | +Your institution might have a code repository, your research field may have a practice of sharing code via a specific website, archive or journal, |
| 146 | +or your version control system might include an online component that makes sharing different versions of your code easy. |
| 147 | +You should check the rules or guidelines of your institution, grant or domain on publishing code, as well as any licences of the code your software depends on or reuses. |
| 148 | + |
| 149 | +Some examples of commonly used software repositories and registries include: |
| 150 | + |
| 151 | +- general-purpose software repositories - [GitHub][github] and [GitLab][gitlab] |
| 152 | +- programming language-specific software repositories - [PyPI][pypi] (for Python) and [CRAN][cran] (for R) |
| 153 | +- software registries - [BioTools][biotools] (for biosciences) and [Awesome Research Software Registries][awesome-rs-registries], providing a list of research software registries (by country, organisation, domain and programming language) where research software can be registered to help promote its discovery |
| 154 | + |
| 155 | +### Persistent identifiers |
| 156 | + |
| 157 | +Unique persistent identifiers, such as **Digital Object Identifiers** (DOIs) provided by [Zenodo][zenodo], |
| 158 | +[FigShare][figshare], etc., or **SoftWare Heritage persistent IDentifiers** ([SWHID][swhid]) provided by [Software Heritage][software-heritage], |
| 159 | +and similar digital archiving services, and commits/tags/releases used by GitHub and similar code sharing platforms, |
| 160 | +help with findability and accessibility of your software, and can help you get credit for your work by providing citable references. |
| 161 | + |
| 162 | +### Tools for assessing FAIRness of software |
| 163 | + |
| 164 | +Here are some tools that can check your software and provide an assessment of its FAIRness: |
| 165 | + |
| 166 | +- [FAIRsoft evaluator][fair-rs-evaluator] |
| 167 | +- [FAIR software test][fair-rs-test] |
| 168 | +- [`How FAIR is your software`][howfairis] - a command line tool to evaluate a software repository's compliance with the FAIR principles |
| 169 | + |
| 170 | +### Summary |
| 171 | + |
| 172 | +The table below provides a summary of how different tools and practices help with the FAIR software principles. |
| 173 | + |
| 174 | +| Tools and practices | Findable | Accessible | Interoperable | Reusable | |
| 175 | +|------------------------------------------------------------------------------------------------------|----------|------------|---------------| -------- | |
| 176 | +| Virtual development environments | | | | x | |
| 177 | +| Integrated development environments (IDEs) | | | | x | |
| 178 | +| Command line terminals - automated and reproducible pipelines | | | x | x | |
| 179 | +| Standard data exchange formats - e.g. CSV, YAML                                                      |          |            | x             | x        | |
| 180 | +| Communication protocols - Command Line Interface (CLI) or Application Programming Interface (API) | | | x | x | |
| 181 | +| Version control tools | x | | | | |
| 182 | +| Code testing & correctness | | | | x | |
| 183 | +| Coding conventions | | | | x | |
| 184 | +| Code-level documentation (comments and docstrings, explaining functionality) | | | | x | |
| 185 | +| Project-level documentation & metadata (README, explaining functionality/installation/running, etc.) | | | x | x | |
| 186 | +| Licence - code sharing & reuse                                                                       |          |            |               | x        | |
| 187 | +| Citation - code reuse & credit | | | | x | |
| 188 | +| Software repositories & registries | x | x | | | |
| 189 | +| Unique persistent identifiers | x | x | | | |
| 190 | + |
| 191 | + |
| 192 | + |
| 193 | +[fair-principles-research-software]: https://www.nature.com/articles/s41597-022-01710-x |
| 194 | +[fair-data-principles]: https://www.nature.com/articles/sdata201618 |
| 195 | +[zenodo]: https://zenodo.org/ |
| 196 | +[software-registries]: https://github.com/NLeSC/awesome-research-software-registries |
| 197 | +[github]: https://github.com |
| 198 | +[biotools]: https://biotools.us/ |
| 199 | +[pypi]: https://pypi.org/ |
| 200 | +[cran]: https://cran.r-project.org/web/packages/ |
| 201 | +[gitlab]: https://about.gitlab.com/ |
| 202 | +[awesome-rs-registries]: https://github.com/NLeSC/awesome-research-software-registries |
| 203 | +[fair-rs-evaluator]: https://openebench.bsc.es/observatory/Evaluation |
| 204 | +[fair-rs-test]: https://github.com/marioa/fair-test?tab=readme-ov-file |
| 205 | +[codemeta]: https://codemeta.github.io/ |
| 206 | +[rsmd-1]: https://fair-impact.github.io/RSMD-guidelines/1.General/ |
| 207 | +[software-heritage]: https://www.softwareheritage.org/ |
| 208 | +[swhid]: https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html |
| 209 | +[figshare]: https://figshare.com/ |
| 210 | +[howfairis]: https://github.com/fair-software/howfairis/ |
| 211 | +[cff]: https://citation-file-format.github.io/ |
| 212 | +[opensource-licence-guide]: https://opensource.guide/legal/#which-open-source-license-is-appropriate-for-my-project |
| 213 | +[choosealicense]: https://choosealicense.com/ |