Data journals give researchers and data producers an opportunity to formally publish, and gain acknowledgement for, their research data outputs.
This guide is for librarians, researchers and data managers who are involved in the deposition, publication, curation and citation of published data. It includes examples of data journals and outlines data submission criteria. Products and services that may influence the scholarly impact of data journals are also briefly described.
What are data journals?
Data journals are publications whose primary purpose is to expose datasets by providing the infrastructure and scholarly reward opportunities that will encourage researchers, funders and data centre managers to share research data outputs.
Data journals have evolved from the more traditional journal model, in which articles describe datasets or include supplemental material that links to datasets. Data journals have more in common with those journals that publish articles or overlay papers that describe data, but take the concept a few steps further. Fundamentally, data journals seek to promote scientific accreditation and re-use, improve transparency of scientific method and results, support good data management practices and provide an accessible, permanent and resolvable route to the dataset.
Examples of data journals include:
- Geoscience Data Journal - published by Wiley and established in 2012
- Scientific Data - published by Nature and established in 2013
- Journal of Open Archaeology Data - published by Ubiquity and established in 2011
- Biodiversity Data Journal - published by Pensoft and established in 2013.
A paper by Candela et al. (2015) discusses data journals in depth and reviews over 100 data journals.
Why data journals?
As the primary purpose of data journals is to expose and share research data, this form of publishing may be of interest to researchers and data producers for whom data is a primary research output. It enables the author (or data producer) to focus on describing the data itself, rather than producing an extensive analysis of the data. In some cases, the publication cycle may be quicker than for traditional journals, and where there is a requirement to deposit data in an "approved repository", long-term curation of and access to the data are assured.
Publishing a data paper may be regarded as best practice in data management as it:
- includes an element of peer review of the dataset
- maximises opportunities for reuse of the dataset
- provides academic accreditation for data scientists as well as front-line researchers.
While individual publisher policies vary, it's worth noting that publishing data through a data journal does not necessarily prevent the publication of data analyses and research results in a traditional journal - along with a reference and links to the data journal paper. This provides readers with access to all relevant information about a piece of research and may result in citation of both the journal article and data paper.
Data journals, like traditional journals, have differing requirements for submission, review and publication. However, to give a sense of requirements for data journals and how they may differ from traditional journals, the following points are indicative:
- Depositing data: data may need to be deposited in an "approved repository" or with the journal itself. There may be restrictions or guidelines on file size and format as well as specific requirements for metadata or data description.
- Citation and identifiers: some data journals require that data be assigned a Digital Object Identifier (DOI) or other form of persistent identifier. There may also be a defined or recommended data citation format.
- Researcher profile: you may be asked to provide details of your research profile, organisation and affiliations.
- Copyright and licensing: in addition to copyright, you may also need to consider (and agree to) licensing and access conditions for the data to be published.
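As an illustration of the citation and identifier point above, the following minimal Python sketch assembles a data citation string and a resolvable DOI link from a few metadata fields. The "Creator (Year): Title. Publisher. Identifier" layout is an assumption modelled on a commonly recommended pattern; individual data journals define their own formats, which should take precedence. The `Dataset` class, the function names and the example values (including the DOI) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dataset:
    """Minimal descriptive metadata for a published dataset (hypothetical model)."""
    creators: List[str]
    year: int
    title: str
    publisher: str
    doi: str  # DOI, with or without a "doi:" or resolver prefix


def doi_url(doi: str) -> str:
    """Return the https://doi.org/ form of a DOI, stripping common prefixes."""
    cleaned = doi.strip()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    return "https://doi.org/" + cleaned


def format_citation(ds: Dataset) -> str:
    """Build a citation as 'Creator (Year): Title. Publisher. Identifier'.

    This layout is an assumed example format, not any journal's official style.
    """
    creators = "; ".join(ds.creators)
    title = ds.title.rstrip(".")  # avoid a doubled full stop after the title
    return f"{creators} ({ds.year}): {title}. {ds.publisher}. {doi_url(ds.doi)}"


# Hypothetical example record; the DOI below does not identify a real dataset.
example = Dataset(
    creators=["Smith, J.", "Lee, K."],
    year=2014,
    title="Example rainfall observations",
    publisher="Example Data Repository",
    doi="doi:10.5555/example-dataset",
)
print(format_citation(example))
```

Expressing the identifier as a full resolver link keeps the citation actionable wherever it is reproduced, which is the point of requiring a persistent identifier.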
Formal publication and citation of data supports the recognition of research data as a first-class research output. It also enables the generation of citation metrics for research data outputs. With products such as the Thomson Reuters Data Citation Index (DCI) capturing data citation metrics, the potential for formal recognition and reward mechanisms based on data publishing is enhanced. ANDS is working with Thomson Reuters to enable direct feeds between Research Data Australia (RDA) and the DCI, so that citations can be shown in RDA.
A number of data journals also support 'altmetrics', such as number of views, number of downloads, social media 'likes' and recommendations. These can be early indicators of the impact of data, before the long tail of formal citation metrics can be assessed.