
Best practice for creating collection records


Creating a collection record

Research Data Australia has been set up to register collections. Related parties, activities and services in Research Data Australia provide context and meaning for the collections. The collections registered are most commonly datasets, but they can also be of "collection" type ("compiled content created as separate and independent works"), such as museum or archive collections; information collections, such as registries, catalogues and indexes; or aggregated collections, such as are found in repositories.

Research Data Australia collection record examples

Step 1: Should I create a collection record at all?

Research Data Australia is a collections registry; as long as the entity you are intending to describe can be understood as a research collection, it can be described within Research Data Australia. That means what is being described:

  • is an aggregation of resources, and will be understood as a single aggregation of resources within its research context;
  • is not exclusively a set of documents produced as the output of research, although it can certainly consist of documents that are the subject matter of research;
  • has Australian relevance, either through involvement of Australian researchers, or Australian subject matter.

Step 2: How do I model my collection(s)?

A collection may be described as a self-standing entity, or it may be related to other collections. The most common relations between collections are hierarchical, for example, where a collection is derived from another collection, or part of another collection. Both these relations are supported within Research Data Australia.
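
As an illustration of the two hierarchical relations, the fragment below sketches how a collection record might point to a parent collection and to a source collection. The keys are fictional, and the relation types isPartOf and isDerivedFrom are assumed from the RIF-CS collection relation vocabulary; check them against the current vocabulary before use.

```xml
<!-- Fictional keys: replace with the keys of the related collection records -->
<relatedObject>
    <key>myuni.edu.au/collection/84000</key>
    <!-- This collection is part of the collection with the key above -->
    <relation type="isPartOf"></relation>
</relatedObject>
<relatedObject>
    <key>myuni.edu.au/collection/83555</key>
    <!-- This collection is derived from the collection with the key above -->
    <relation type="isDerivedFrom"></relation>
</relatedObject>
```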

Lateral relations between collections, such as "these collections have subject matter in common", "these collections are part of the same larger collection", "these collections have come out of the same research activity", and "these collections have the same primary collector" are not required in Research Data Australia. That is because faceted displays support this type of relation between collections, and the information is already captured in the system through the description of related objects. Step 12 explains relations between objects in Research Data Australia.

Step 3: Provide values for the elements of the collection record

For common elements see the Content Providers Guide. The steps below discuss issues that are specific to collections.

Step 4: Type

Collection type is required. There are five collection types: catalogueOrIndex, collection, dataset, registry and repository. More information about collection types

Step 5: Key

Keys are required and must be unique in Research Data Australia. The key identifies the collection metadata record in Research Data Australia. More information about keys
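
A minimal sketch of where the key and the collection type sit in a RIF-CS record is shown below. The group, key and originating source values are fictional, and the wrapper elements are an assumption based on the general RIF-CS record layout rather than something stated on this page.

```xml
<registryObjects xmlns="http://ands.org.au/standards/rif-cs/registryObjects">
    <!-- Fictional group, key and source: use the values for your own data source -->
    <registryObject group="My University">
        <!-- The key must be unique across Research Data Australia -->
        <key>myuni.edu.au/collection/84729</key>
        <originatingSource>http://www.myuni.edu.au</originatingSource>
        <!-- One of: catalogueOrIndex, collection, dataset, registry, repository -->
        <collection type="dataset">
            <!-- name, location, description and the other elements go here -->
        </collection>
    </registryObject>
</registryObjects>
```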

Step 6: Name

  • Collection names/titles should be as descriptive as possible. They should include keywords to provide context for non-specialist users, as well as information such as the nature of the data and spatial and temporal coverage.

    For example, a collection named "Pilbara" may be adequate in the context of a particular discipline database, but not in the general context of Research Data Australia. It would be more informative to provide a name like "Western Australian Geological Survey: Pilbara" or "Aboriginal Art Collection: Pilbara, 1950-1965".
  • Collection titles should be unique and generally should not use acronyms. If acronyms are used, they should be spelled out in an alternative title.
  • Collection names/titles must be provided in a single name part.  Name types for collections are "primary", "abbreviated" and "alternative". Each name type is contained in its own name element—this is necessary to ensure proper display of names in Research Data Australia.  More information about names

Examples

<name type="primary">
    <namePart>
        Franklin Voyage FR 04/2001 Acoustic Doppler Current Profiler Data
    </namePart>
</name>

<name type="abbreviated">
    <namePart>
        Franklin Voyage FR 04/2001 ADCP Data
    </namePart>
</name>

Step 7: Location

Collection location enables users to access the collection. This may mean direct access to the collection or mediated access via a contact person or organisation. Appropriate locations may include an electronic address or a physical address. Spatial location is less commonly used, since electronic and physical addresses fulfil the purpose of enabling access more directly. Note: do not confuse spatial location with spatial coverage.

An appropriate electronic address for a collection is a URI to a landing page in a repository and/or an email address of a person or organisation which can respond to enquiries about access. The electronic address may also lead directly to download of the collection. A physical address for the researcher's office/research centre might also be appropriate, particularly if access is mediated.

Spatial location describes where a collection is physically located, using geospatial coordinates such as latitude and longitude. This may be useful for physical collections such as museums and archives. More information about location
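
The fragment below is a sketch of a location that combines a landing page, a contact email for mediated access and a physical address. All addresses are fictional, and the type values (url, email, streetAddress, addressLine) are assumed from the RIF-CS vocabularies.

```xml
<location>
    <address>
        <!-- Landing page in a repository (fictional URL) -->
        <electronic type="url">
            <value>http://www.myuni.edu.au/collection:84729</value>
        </electronic>
        <!-- Contact who can respond to access enquiries (fictional address) -->
        <electronic type="email">
            <value>data-enquiries@myuni.edu.au</value>
        </electronic>
        <!-- Physical address, useful where access is mediated (fictional) -->
        <physical type="streetAddress">
            <addressPart type="addressLine">Research Data Office, My University, 1 Example Street, Exampleville</addressPart>
        </physical>
    </address>
</location>
```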

Currency of location information

Only use "Date From" or "Date To" for collection location information if you need to describe the period during which that location information was current. Date ranges should only be used where the address has changed and the older addresses are recorded in the metadata being provided. More information about date range
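
A sketch of a superseded address recorded alongside the current one follows. The URLs and dates are fictional, and the dateFrom and dateTo attributes on location, with date-time values, are an assumption about how the date range is expressed in RIF-CS.

```xml
<!-- Superseded address, retained with the period in which it was current (fictional values) -->
<location dateFrom="2005-01-01T00:00:00Z" dateTo="2010-12-31T23:59:59Z">
    <address>
        <electronic type="url">
            <value>http://old.myuni.edu.au/data/84729</value>
        </electronic>
    </address>
</location>
<!-- Current address: no date attributes are needed -->
<location>
    <address>
        <electronic type="url">
            <value>http://www.myuni.edu.au/collection:84729</value>
        </electronic>
    </address>
</location>
```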

Step 8: Coverage

Temporal and spatial coverage describe the locations in space and time to which collections relate: they are the location or time that something is about, not the location or time where something is.

  • Temporal coverage is included where there is a well-defined time period during which data was collected or observations made, or a time period that a collection is clearly linked to intellectually or thematically. The time period that the collection is linked to intellectually or thematically takes priority over the time period during which data was collected or observations made. If temporal coverage applies, preferably this is also reflected in the collection name and/or description.
  • Spatial coverage refers to the geographical area where data was collected or a place which is the subject of a collection. The "types" of spatial information which can be expressed include co-ordinates, codes and text. Note: do not confuse with spatial location. More information about Coverage
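
To illustrate, the coverage for the "Aboriginal Art Collection: Pilbara, 1950-1965" name used as an example in Step 6 might be sketched as below. The element layout and the attribute values (dateFrom, dateTo, W3CDTF, text) are assumptions drawn from the RIF-CS vocabularies; coordinates or codes could be supplied in additional spatial elements.

```xml
<coverage>
    <!-- The period the collection is about, not when the record was created -->
    <temporal>
        <date type="dateFrom" dateFormat="W3CDTF">1950</date>
        <date type="dateTo" dateFormat="W3CDTF">1965</date>
    </temporal>
    <!-- The place the collection is about, expressed here as free text -->
    <spatial type="text">Pilbara, Western Australia</spatial>
</coverage>
```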

Step 9: Description

Good quality collection descriptions will increase the chances of a collection being discoverable through search engines, as well as helping researchers decide if the collection is likely to be useful for them. The following principles are recommended:

Write for a generalist audience

  • Don't assume a reader has specialist knowledge. Write the description for a reader who has general familiarity with a research area but is not a specialist—this will make data more accessible for cross-disciplinary use.
  • Specialist acronyms or obscure jargon should be avoided or explained.
  • Include keywords that are obvious and usually implicit within a discipline, so that this information can be made explicit in the more generalist context of Research Data Australia.

Support search

  • Include important keywords within the text to make them accessible for search engines.
    • Inclusion of a paragraph beginning with Keywords: and followed by a list of keywords is acceptable. However, best practice would be to include the keywords as subjects.

Focus on the collection

  • Describe the collection, not the project (activity).
  • Describe the collection, not the publication.
    • Re-use of abstracts or research proposals can be a useful source for a description, as long as it is appropriate to the collection being described and is edited to read as a collection description. ANDS prefers that headings (such as 'Abstract' or 'Executive summary') not be imported along with the abstract if possible.

What to include

  • Include a description of the kinds of objects in the collection (for example, database, printed photographs, digital images, lab notes) and the basis of selection for objects included in the collection (for example, information about how data was collected or analysed), as well as describing what the collection is about.
  • Collection descriptions should be consistent with the assigned collection type. If describing a dataset, the description should be about that dataset, not just a general description of the research that created the dataset.
  • Collection descriptions should expand on the temporal and spatial coverage information included in those elements and be consistent with that information.
  • Do not include hyperlinks within the description field. These will not display as links in Research Data Australia. Instead, include links in the Related information part of the XML document.

Use of significance statements

  • Descriptions can also include significance statements (using type=significanceStatement). A significance statement is a statement describing the significance of a collection within its domain or context.
    • Significance statements are produced for many museum collections in Australia. These statements assist researchers, granting bodies and others in assessing the value of the collection and providing context to the collections. The production of significance statements and how significance is assessed and described is a fundamental part of curatorial practice. Significance statements may also be used by data providers other than museums to bring attention to important aspects of a collection within its discipline context.
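
The principles above could be applied as in the following sketch. The collection and all of its details are fictional, and the description type "full" is assumed from the RIF-CS description type vocabulary; significanceStatement is the type named above.

```xml
<!-- Fictional collection: describes the objects, how they were gathered and what they are about -->
<description type="full">
    This dataset contains hourly soil moisture readings recorded by buried probes at twelve
    farm sites. Readings are stored as comma-separated text files, one per site per year.
    Sites were selected to cover the main soil types of the study region.
</description>
<!-- Fictional significance statement placing the collection in its research context -->
<description type="significanceStatement">
    This is the longest continuous soil moisture record available for the study region and
    is used as a reference for calibrating satellite soil moisture estimates.
</description>
```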

More information about description

Step 10: Identifier

  • Persistent, unique identifiers such as DOIs are preferred for collections. All global identifiers used publicly for the collection should be included, as an aid to discovery and citation of collections. More information about DOIs and their role in supporting citation
  • Local collection identifiers that uniquely identify the collection within the domain of a specified authority, for example, an institutional repository, should also be included.

DOI example

<identifier type="doi">10.4225/02/4E9F69F7AE206</identifier>

This is a Digital Object Identifier minted by ANDS for a collection in Research Data Australia. See this record in Research Data Australia

Other examples (fictional)

<identifier type="local">collection:84729</identifier>

<identifier type="uri">http://www.myuni.edu.au/collection:84729</identifier>

More information about identifiers

Step 11: Subject

Provide a subject to allow Research Data Australia to associate a collection with a research field, and, indirectly, with other collections in the same field. Subjects are recommended in collection records.

The subject represents the primary topic or topics covered by the collection.

If you provide any subjects, you must provide a subject from the Australian and New Zealand Standard Research Classification (ANZSRC) 2008.  This is used as a common subject vocabulary across Research Data Australia. ANZSRC "Field of Research (FOR)" codes should be used whenever available. ANZSRC "Type of Activity (TOA)" and/or "Socio-economic Objective (SEO)" codes may also be used.

Preferably, terms from other vocabularies (e.g. LCSH, MeSH) and/or local subjects (keywords) should be used in addition to the ANZSRC codes. Under RIF-CS v1.3.0, Linked Data URLs for vocabulary terms should be provided where available. More information about subjects
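
As a sketch, subjects for the Franklin Voyage ADCP collection named in Step 6 might look like the fragment below. The choice of FOR code and keywords is illustrative only, and the type values anzsrc-for and local are assumed from the RIF-CS subject type vocabulary. Linked Data URLs for vocabulary terms are not shown.

```xml
<!-- ANZSRC 2008 Field of Research code 0405 (Oceanography) -->
<subject type="anzsrc-for">0405</subject>
<!-- Local keywords supplied in addition to the ANZSRC code (illustrative) -->
<subject type="local">acoustic Doppler current profiler</subject>
<subject type="local">ocean currents</subject>
```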

Step 12: Related object

Collection records are connected to activity, (other) collection, party and service records by using the keys for those other records.

Example

<relatedObject>
    <key>102.100.100/6676</key>
    <relation type="isManagedBy"></relation>
</relatedObject>

Primary relationships

Use the primary relationship opt-in function in the Data Source Account configuration page to link all records within your data source to a party record for your organisation. This will allow all your organisation's collections to be discoverable via the party record describing your organisation. More information about primary relationships

Bidirectional links

ANDS infers and displays bi-directional links between related objects in Research Data Australia. If a collection links to a party within the same data source, the party record does not need to link back to the collection; ANDS will display the inferred reverse link in Research Data Australia. If the party and collection are from different data sources, ANDS will only display the inferred reverse link if the receiving partner has opted in to allow bi-directional links. More information about bi-directional links

For manually supplied records, ANDS requires partners to provide links in both directions, to familiarise themselves with the link structures involved.

Related Parties

A collection must be related to at least one party. This is to allow discovery of the collection through the parties responsible for it, and to provide a contact point for queries from users.

If multiple parties within the institution have made a substantial contribution to the collection, the collection is related to all those parties.

Party records may also be available, through Trove, People and Organisations or existing records on Research Data Australia, for external collaborators who have made a substantial contribution to the collection. If so, those records are also included as related parties.

Related Collections

Hierarchical relations between collections are important to describe if they provide essential context for a collection. Lateral relations between collections are not usually necessary.

Related Activities

A collection must be related to an activity if it is the output of a well-defined funded project. There is no need to relate a collection to an activity if the collection was collected through an organisation's business-as-usual activities.

If you wish to include details of the funding for the collection, include this information in the activity's description (note) element.

Related Services

A collection should be related to a creation service if it is produced by an instrument or software, and other collections are also likely to be produced by the same instrument or software. This is so that collections produced through the same service can be related for discovery.

A collection should be related to a discovery service if it is exposed for discovery via a particular machine protocol, other than the keyword search found by default in portals.
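
The related party, activity and service links described above might be sketched as follows. All keys are fictional, and the relation types hasCollector, isOutputOf and isProducedBy are assumed from the RIF-CS collection relation vocabulary.

```xml
<!-- Party who collected the data (fictional key) -->
<relatedObject>
    <key>myuni.edu.au/party/1234</key>
    <relation type="hasCollector"></relation>
</relatedObject>
<!-- Funded project that produced the collection (fictional key) -->
<relatedObject>
    <key>myuni.edu.au/activity/5678</key>
    <relation type="isOutputOf"></relation>
</relatedObject>
<!-- Instrument or software service that created the data (fictional key) -->
<relatedObject>
    <key>myuni.edu.au/service/42</key>
    <relation type="isProducedBy"></relation>
</relatedObject>
```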

Step 13: Related Info

Include links to related information which is external to Research Data Australia, and which provides research context for understanding the collection. Related information types are "publication", "website", "reuseInformation" and "dataQualityInformation".

An example of related information to include is the identifier and title of a publication resulting from the data in the collection. More information about related information

Publication example

<relatedInfo type="publication">
    <identifier type="uri">http://hdl.handle.net/1959.3/95058</identifier>
    <title>Growth and change dynamics in open source software systems (PhD thesis)</title>
</relatedInfo>

Step 14: Rights

At least one description of rights, licences and/or access rights for a collection is required in a collection record. Rights information supports the re-use of collections. In RIF-CS v1.3.0, a hyperlink can replace a rights, licence or access rights description.

Rights statement: a formal statement about the intellectual property rights is given; the statement sets out information about rights held in and over the collection such as copyright and other intellectual property rights. The rights holder for the collection is indicated clearly.

Licence: the text of a legal document giving official permission to do something with a collection.

Access rights: a formal statement of access rights and constraints is used; the statement sets out information about who may access the collection, when access may occur (including any embargo), and uses that may be made of the collection. Information about access rights is kept separate from information about rights. More information about rights
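
A sketch of the three kinds of rights information in one rights element is given below. The rights holder and access conditions are fictional, the licence shown is only an example choice, and the rightsUri attribute is an assumption about how the hyperlink is carried in RIF-CS v1.3.0.

```xml
<rights>
    <!-- Who holds rights in the collection (fictional rights holder) -->
    <rightsStatement>Copyright My University 2011.</rightsStatement>
    <!-- Licence text with a link to the full legal terms -->
    <licence rightsUri="http://creativecommons.org/licenses/by/3.0/au/">Creative Commons Attribution 3.0 Australia</licence>
    <!-- Who may access the collection and when (fictional conditions) -->
    <accessRights>Embargoed until 1 January 2013, then openly accessible via the repository landing page.</accessRights>
</rights>
```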

Step 15: Citation

If a collection is published, the citation for the published collection should be included.

Citations support the re-use of and long-term access to collections. The full citation gives a dataset citation in a single full text string, while citation metadata gives a dataset citation split into machine-readable component parts that referencing software can import from Research Data Australia.

Citations for collections should include a URI or other resolvable identifier (e.g. DOI). More information about citation
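
A full citation might be recorded as in the sketch below. The author, title, publisher and URI are fictional, and the citationInfo and fullCitation element names and the style value are assumptions based on the RIF-CS schema; citation metadata split into component parts is not shown.

```xml
<citationInfo>
    <!-- Single text string, including a resolvable identifier (fictional citation) -->
    <fullCitation style="Harvard">Smith, J. (2011) Example Soil Moisture Dataset. My University. http://www.myuni.edu.au/collection:84729</fullCitation>
</citationInfo>
```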

Change history
1 Feb 2012: New, separate and expanded Collection Best Practice page

 

 

 

Please send any feedback on this page to guides@ands.org.au