What is research software
Research software is software, in source code or compiled form, that supports scholarly research. It may be downloaded, compiled, executed and instantiated.
Why cite research software
Software is pervasive in research. A UK Research Software Survey of 1,000 randomly chosen researchers found that more than 90% of researchers regard software as important for their own research, and about 70% say their research would not be possible without it. A separate study of 40 papers published in Nature from January to March 2016 found that 32 of them explicitly mentioned software. These findings indicate that software plays an important role in research, and that it should therefore be treated in the same way as other research inputs and outputs that form part of the record of research, such as research data and paper publications.
Proper citation of software has the following benefits:
- Ensures scientific transparency and reasonable accountability of a researcher
- Aids scientific reproducibility through direct, unambiguous reference to the precise software used in a particular study
- Provides fair credit for software developers or researchers who spend time developing software
- Assists in tracking the use and reuse of software through reference in scientific literature and within other software
- Helps developers verify how their software is being used.
How to cite research software
Various international organisations have been working to develop guidelines for software citation. Examples of these include:
- Force11 Software Citation Principles
- DataCite Metadata Schema 4.1 (with additions to describe software and examples for software citation)
In general, software should be cited in a similar fashion to data and research papers.
The core required elements of a citation are:
- Author(s) - the people or organisations responsible for the intellectual work to develop the software.
- Publication Year - the year when the software was published to a repository or any other publication venue.
- Title - the formal title of the software/service.
- Version - the precise version of the software used. Careful version tracking is critical for accurate citation.
- Publisher - the repository where the software is held, archived, distributed, released or produced, ideally an institutional or disciplinary repository that provides curation of software over the long term. For example, Climate Data Gateway at NCAR, NASA Earth Exchange, Zenodo, GitHub.
- Locator/Identifier - a persistent identifier (PID) for the software, such as a DOI, Handle or ARK, that resolves to a landing page. Using a DOI is considered best practice for software citation. A DOI is a unique, persistent identifier that can be used to track software citation metrics and to link related research outputs such as journal articles and research data.
The DataCite DOI Citation Formatter is a simple online tool that uses a DOI to quickly format a citation in hundreds of different styles.
If a DOI/PID doesn’t exist, the URL can be used but must be used in conjunction with the access date.
- Access Date (optional) - ongoing development of software may not always be reflected in release dates and versions. It is important to indicate when the software was accessed, especially when it is referenced not through a DOI but through a URL indicating the software’s location.
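The core elements listed above can be captured in a small record type. The following is a minimal sketch in Python, not a standard schema: the class name, the field names and the prefix-based DOI check are all assumptions made for illustration. It only encodes the rule that a plain URL should be accompanied by an access date.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class SoftwareCitation:
    """Core citation elements. Field names are illustrative, not a standard."""
    authors: List[str]
    publication_year: int
    title: str
    version: str
    publisher: str
    identifier: str                      # DOI link or plain URL
    access_date: Optional[date] = None   # expected when identifier is a plain URL

    def has_doi(self) -> bool:
        # Simplifying assumption: a DOI is written as a doi.org link or with
        # a "doi:" prefix. Handles and ARKs are not detected by this sketch.
        prefixes = ("https://doi.org/", "http://doi.org/",
                    "https://dx.doi.org/", "http://dx.doi.org/", "doi:")
        return self.identifier.lower().startswith(prefixes)

    def missing_elements(self) -> List[str]:
        """Return the names of core elements that are empty or absent."""
        missing = []
        for name in ("authors", "title", "version", "publisher", "identifier"):
            if not getattr(self, name):
                missing.append(name)
        if not self.publication_year:
            missing.append("publication_year")
        # A URL without a DOI should be cited together with an access date.
        if self.identifier and not self.has_doi() and self.access_date is None:
            missing.append("access_date")
        return missing
```

Calling missing_elements() before formatting a citation gives a quick check of which elements still need to be looked up.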
Software citation format:
Creator (PublicationYear): Title. Version No. Publisher. [resourceTypeGeneral]. Identifier.
For software citations, set resourceTypeGeneral to Software.
For example:
Xu, C., & Christoffersen, B. (2017). The Functionally-Assembled Terrestrial Ecosystem Simulator Version 1. Los Alamos National Laboratory (LANL), Los Alamos, NM (United States). [Software]. https://doi.org/10.11578/dc.20171025.1962
Where the software is a library that was developed and run on a software platform, for example, a kinetic analysis software library with Matlab (TM) wrappers, it can be cited as follows:
Dowson, Nicholas; Baker, Charles; Raffelt, David; Smith, Jye; Thomas, Paul; Rose, Stephen; Salvado, Olivier (2014): InsightToolkit Kinetic Analysis (itkka) Software Library. v1. CSIRO. Software Collection. https://doi.org/10.4225/08/540E9A7D11EB0
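The DOI-based format above can be assembled programmatically from the individual elements. The sketch below is one possible way to do it in Python; the function name and parameters are hypothetical, and the output follows the generic template (including the bracketed resource type) rather than any repository's own recommended wording.

```python
from typing import Sequence


def format_software_citation(creators: Sequence[str], year: int, title: str,
                             version: str, publisher: str, identifier: str,
                             resource_type: str = "Software") -> str:
    """Build 'Creator (Year): Title. Version. Publisher. [Type]. Identifier'."""
    creator_text = "; ".join(creators)
    # No full stop is appended after the identifier, so that the DOI link
    # can be copied or clicked without a stray trailing character.
    return (f"{creator_text} ({year}): {title}. {version}. {publisher}. "
            f"[{resource_type}]. {identifier}")


citation = format_software_citation(
    creators=["Dowson, Nicholas", "Baker, Charles", "Raffelt, David",
              "Smith, Jye", "Thomas, Paul", "Rose, Stephen", "Salvado, Olivier"],
    year=2014,
    title="InsightToolkit Kinetic Analysis (itkka) Software Library",
    version="v1",
    publisher="CSIRO",
    identifier="https://doi.org/10.4225/08/540E9A7D11EB0",
)
print(citation)
```

Keeping the elements as separate arguments makes it straightforward to re-emit the same citation in another style later.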
Where the software does not have a DOI, but is accessible from a URL, a suggested citation format is as follows:
Creator (PublicationYear): Title. Version. Publisher. [resourceTypeGeneral]. URL. Access Date.
For example:
Jones E, Oliphant E, Peterson P, et al. (2001). SciPy: Open Source Scientific Tools for Python, [Software]. http://www.scipy.org/ [Online; accessed on 2018-07-26].
When the locator/identifier is a URL that does not point to the exact version used in the research, it is important to include an access date, as this may help to identify the version.
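The URL-based format can be sketched in the same way, with the access date as a required argument. This is an illustrative Python example only: the function name, the parameter names and the "[Accessed ...]" wording are assumptions, and version and publisher are left out of the output when they are not known rather than guessed.

```python
from datetime import date


def format_url_citation(creator: str, year: int, title: str, url: str,
                        accessed: date, version: str = "", publisher: str = "",
                        resource_type: str = "Software") -> str:
    """Build 'Creator (Year): Title. Version. Publisher. [Type]. URL [Accessed date].'"""
    parts = [f"{creator} ({year}): {title}."]
    # Version and publisher are skipped when unknown.
    for optional in (version, publisher):
        if optional:
            parts.append(f"{optional}.")
    parts.append(f"[{resource_type}].")
    parts.append(url)
    # ISO 8601 date (YYYY-MM-DD) avoids day/month ambiguity.
    parts.append(f"[Accessed {accessed.isoformat()}].")
    return " ".join(parts)


scipy_citation = format_url_citation(
    creator="Jones E, Oliphant E, Peterson P, et al.",
    year=2001,
    title="SciPy: Open Source Scientific Tools for Python",
    url="http://www.scipy.org/",
    accessed=date(2018, 7, 26),
)
print(scipy_citation)
```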
Some repositories provide a recommended format for citing software from that repository.
For example:
Boulder, Colorado: UCAR/NCAR/CISL/TDD. http://dx.doi.org/10.5065/D6WD3XH5
References:
- Hettrick, S. J., et al. (2014). UK Research Software Survey 2014 [Data set]. doi:10.5281/zenodo.14809
- Carver, J. C., Gesing, S., Katz, D. S., Ram, K., and Weber, N. (2018). Conceptualization of a US Research Software Sustainability Institute (URSSI). Computing in Science & Engineering, vol. 20, no. 3, pp. 4-9, May/Jun. 2018. doi:10.1109/MCSE.2018.03221924
- Smith, A. M., Katz, D. S., Niemeyer, K. E., FORCE11 Software Citation Working Group (2016). Software Citation Principles. PeerJ Computer Science 2:e86. doi:10.7717/peerj-cs.86