Research Data Australia has been set up to register collections. Related parties, activities and services in Research Data Australia provide context and meaning for the collections. The collections registered are most commonly datasets, but they can also be of "collection" type ("compiled content created as separate and independent works"), such as museum or archive collections; information collections, such as registries, catalogues and indexes; or aggregated collections, such as are found in repositories.
Collection records created to best practice standards will:
The steps below discuss issues that are specific to collections. See Collection for general information.
|1 Feb 2012||New, separate and expanded Collection Best Practice page|
|2 Nov 2012||Added "metadata" as a type in Step 13: Related Info|
|20 Nov 2012||Added dates (collections) as Step 14|
|9 May 2013||Reordered steps to align with Release 10 interface changes|
|26 November 2013||Updated Related Info to include information about the element according to RIF-CS v1.5.0|
|28 March 2014||Added information at Step 9, about the display of multiple collection records in Research Data Australia in Release 12|
|15 May 2014||Modified contents; modified Step 3 to include information about what best practice means|
The purpose of a party record in Research Data Australia is to support discovery of research data collections and to provide context to those collections.
ANDS collaborated with the National Library of Australia to provide infrastructure for describing parties using the Trove service. All ANDS partners can use this infrastructure to create party records which can be harvested into Research Data Australia from Trove. See Trove and TIM and the ARDC Party Infrastructure for detail.
At least one party record must be related to each collection described in Research Data Australia (see Metadata Content Requirements).
If a record is already available for your party you do not need to create a new party record. Link to the existing record by including either the Research Data Australia party record key or the NLA Trove party identifier in the related object element of your records.
If the existing record is inadequate for your needs, provide your own description of the person or group in a separate party record. You cannot edit other partners' records, including the National Library of Australia’s (NLA) records.
See Step 8 for how to make sure that both your record and the existing record are treated as describing the same party.
Note: Most partners will need to contribute party records, because many researchers may not have records yet, or the records that do exist may be inadequate.
External researchers including international researchers
ANDS has altered its position on creating party records for researchers external to your research institution. Organisations can now create party records for Australian and international researchers external to an organisation or independent of any organisation, where these records do not already exist. The organisation should perform a search for such existing records before proceeding to creation (see Step 2). If a new party record is created, the record-creating organisation needs to take responsibility for including the name of the employing institution, if available.
ANDS suggests that you work with collaborating institutions to ensure that all researchers responsible for the generation of collections are appropriately acknowledged by having party records in Research Data Australia. The ARDC Party Infrastructure Project is intended to help partners access party records for external researchers.
At least one party should be related to a collection (see Metadata Content Requirements). If multiple parties can be related to a collection, the description should aim to link the collection to any party that will improve discovery substantively.
That means all known active collaborators on the project.
As a default, that would include all named researchers on the research grant application, but not support staff or research assistants.
Organisations are hierarchically organised, and you may have a choice of hierarchical level at which to represent your party (group). For example, you may link your collection to your research lab, to your department, your faculty, or your research institution. Which level of group is represented as responsible for collections is a matter of institutional policy.
Remember that Research Data Australia is not intended to represent group hierarchies. The default approach should be to represent the lowest-level group, with the most direct engagement with the collection. That means that the research lab or individual researchers are the parties of interest that need to be described and linked to a collection, in preference to departments or faculties. The name of a research lab, for example, may include its superior body's name as part of its name, e.g. Budawang University, Frontiers of Chemistry Research Laboratory.
Research Data Australia groups an institution's information for display using the Group attribute rather than requiring connection of all collection records to an institutional party record. However, contributors may make such connections if they wish.
Disambiguation: A party of type "group" is not the same as and has no relationship with the "group" attribute to which a Registry Object is linked for purposes of display in Research Data Australia. The "group" attribute is also used as the basis for enhanced contributor home pages. See Group for more information about this attribute.
Search to see if your party is already described before adding a new party record.
Where to search:
Type of party is required. There are three party types: person, group and administrativePosition.
Administrative position is a kind of party where the position name and contact information are present but the identity of the party filling the role is not specified.
Many data sources provide ANDS with party names such as "Data custodian", "Data Officer", or "Data Manager". There are over 1000 collection records (about 10% of the records in Research Data Australia) that contain role names for party records. Such role names are common in large data management environments.
Remember, a party of type "group" is not the same as and has no relationship with the "group" attribute.
Keys are required in party records. Keys must be unique in Research Data Australia.
Collection records link to party records using the Research Data Australia party record key. Alternatively, collection records link to the NLA party records using the NLA party identifier (see Step 8 for more about identifiers).
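As an illustration of linking by key, the following minimal sketch builds a related object element that points from a collection record to an existing party record. The key value, the relation type and the helper name `related_object` are hypothetical, and the RIF-CS namespace is omitted for brevity, so treat this as a sketch rather than a complete record.

```python
import xml.etree.ElementTree as ET


def related_object(key: str, relation_type: str) -> ET.Element:
    """Build a relatedObject element pointing at another record's key."""
    related = ET.Element("relatedObject")
    # The key of the existing party record (or, alternatively, the NLA
    # party identifier) goes in the key element.
    ET.SubElement(related, "key").text = key
    ET.SubElement(related, "relation", {"type": relation_type})
    return related


# Hypothetical key and relation type, used only for illustration.
link = related_object("example.edu.au/party/1234", "isManagedBy")
print(ET.tostring(link, encoding="unicode"))
```

The same element shape can be reused for any record that links to another record by key.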
Names for parties should be described by recording each name component in a separate name part. The type of name part is described by choosing from the following:
Only use Date range if the name has changed over time and older versions of the name have been recorded in the metadata being provided, such as when a research centre has changed its name since the related dataset was created.
|14 Oct 2011||New, separate and expanded Party Best Practice page|
|21 Nov 2011||RIF-CS v1.3.0 change information added: Party Type (administrativePosition), NameType (superior and subordinate), using primary relationships to link all collections to a party, existence dates, using termIdentifier in collections to describe parties as subjects.|
|6 Dec 2011||Added links to search for NLA party ID, Scopus author ID, & ThomsonReuters ResearcherID|
|27 Apr 2012||Added advice that party records for international researchers may be included in Trove|
|4 May 2012||Added links to new Trove index page|
|8 Jun 2012||Added administrativePosition party type examples|
|12 Jul 2012||Added more information about recording names (relocated from Name page)|
|10 Oct 2012||Clarified advice that initials should be included in Name part type="given"|
|6 Mar 2013||Added link and example for ORCID ID.|
|9 May 2013||Reordered steps to align with Release 10 interface|
|26 November 2013||Updated Related Info to include information about the element introduced in RIF-CS 1.5|
|24 March 2014||Updated Step 1 section "Which party records am I responsible for creating?" to explain that it is now acceptable to create party records for external researchers including international researchers|
|28 March 2014||Updated Step 8 with information about R12 and multiple parties|
The activity records in Research Data Australia enable the discovery of research data collections and provide context to a linked research data collection.
Create an activity record relating to a collection, if it can provide meaningful context to a collection.
Projects can always provide meaningful context as activity records: they are well-documented (because of funder requirements), with named investigators, a budget, and abstracts.
However, not all collections are gathered through easily-described activities. In the following cases, the activities are so hard to describe, or provide so little benefit for discovery, that there is no point in creating records:
Your activity may already be described in Research Data Australia. If you are collaborating on a project with other institutions, someone from the other institution may have already created an activity record for your project. If your project was funded by the ARC or NHMRC, the ARDC Activity Infrastructure is populating Research Data Australia with project records from these funders.
Check in Research Data Australia to see if your activity is already described, before adding a new activity record:
In the future, ANDS will be engaging with the ARC and NHMRC to provide Linked Data infrastructure for research projects. You will be able to do machine-to-machine and web search for project records, and get syndication of project descriptions and associated researchers.
Where consideration has been given to generating activity records from a university's grants database, it is important to be clear as to what the database is describing, grants or projects. If grants, then it may not be a suitable source of information about activities and you should consider creating your own activity record.
If a record is already available for your activity, create your own activity record only if you wish to add value to what is already there. (For example: recording project dates rather than grant dates, a more comprehensive description, a more up-to-date description, a description better reflecting your institution's concerns.) You cannot edit other partners' records, including the NHMRC and ARC records. If the existing record doesn't say enough about your project, you will need to create a separate activity record. Step 8: Identifiers explains how multiple records for the same project are treated.
If an existing activity record is adequate for your needs, then you do not need to create a new record. To link to the existing activity record from your party, service and collection records, use the key of the existing activity record to link to the activity through Related Object.
Most contributors to Research Data Australia will need to contribute some activity records as the ARC and NHMRC are not the only funders of research activity.
For elements common to other object classes, refer to the definitions of those fields. The steps below explain issues that are specific to activities.
Types of activity are project, program, course, award and event. Activity provides detailed descriptions of the various types.
Follow the Content Providers Guide, RIF-CS best practice guidelines for Key.
Do not use the NHMRC or ARC identifiers as keys for your own activity records: keys of the form http://purl.org/au-research/grants/arc/... and http://purl.org/au-research/grants/nhmrc/... are reserved for records from those funders.
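A local pre-submission check for this rule could look like the sketch below. The two prefixes are taken from the text above; the function name `is_reserved_key` and the sample keys are hypothetical.

```python
# Key prefixes reserved for records supplied by the ARC and NHMRC.
RESERVED_KEY_PREFIXES = (
    "http://purl.org/au-research/grants/arc/",
    "http://purl.org/au-research/grants/nhmrc/",
)


def is_reserved_key(key: str) -> bool:
    """Return True if the key falls in a funder-reserved key space."""
    # str.startswith accepts a tuple and matches any of its members.
    return key.strip().startswith(RESERVED_KEY_PREFIXES)
```

A contributor could run this over the keys of new activity records and choose a different key scheme for any record that is flagged.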
Try to give your activity a name that is distinct from related groups or collections. If the collection and activity have the same name, you can suffix "project" to the activity name, or "dataset" or "collection" to the collection name. The activity name is usually already registered with a funder and in the institution's research office, so it is preferable to change the collection name.
The description of projects is typically taken from the abstract sent to the funder in the grant application. However, grants and projects are not necessarily the same thing—the grant abstract describes what you intend to do, the project abstract describes what you have ended up doing. You may choose to adjust the wording of the activity description to reflect this difference. Activity provides more information on activities, projects and grants.
Description types "brief" and "full" give summaries of the activity. If you want to provide other information about the activity, such as funding details, give it in a description of type "note". Activity gives an example of a note recording grant details.
Include any activity identifiers, and if there is a public identifier for either the activity or the associated grant also provide that identifier.
Persistent identifiers in the form of PURLs have been minted for each ARC and NHMRC research grant, and will always resolve to public information about that research grant. Include these identifiers in your own activity records if the activity is funded by ARC or NHMRC (identifier type="purl").
The PURLs resolve to the corresponding project records in Research Data Australia. They will persist through any changes to the ARDC Activity Infrastructure, including any future search and syndication services provided directly by the ARC and NHMRC. The identifiers for ARC and NHMRC projects should also be on file in your research office (in a non-PURL format).
The preferred form of ARC and NHMRC identifiers is the resolvable PURL.
Note: Some legacy records may show ARC and NHMRC identifiers as strings (e.g. <identifier type="arc">DP0559024</identifier>, <identifier type="nhmrc">100009</identifier>)
Australian Research Council identifier example:
NHMRC identifier example:
Where two or more activity records, from the same or different data sources, share common identifiers, the records are treated as describing the same activity.
In Research Data Australia, the records are merged into a single search result and links to each of the merged records are displayed on the view page of each record.
This feature of Research Data Australia is described in detail in Step 8 of Best practice for creating party records. The description and examples on this page apply equally to multiple activity records.
If there is no common public identifier that can be used to bring activity records from different sources together, partners should negotiate with each other to agree on who should provide a single comprehensive record, or if possible work with the project funder to develop a common public identifier.
For an activity, relevant locations may include a physical address or an electronic address. An appropriate electronic address for an activity is a URL to the project web page(s). If a research program has an office, a physical address might also be appropriate. An electronic address is preferred.
Temporal coverage refers to the intellectual content of an activity; for example, a project about the First World War has the temporal coverage 1914–1918.
Do not use temporal coverage to provide start and finish dates of projects; a project about the First World War should not have a temporal coverage of 2007–2009. Use the Existence Dates element to address project start and end dates. See Step 13: Existence dates.
Provide a subject to allow Research Data Australia to associate an activity with a research field, and indirectly with other collections in the same field. ARC and NHMRC activity records contain ANZSRC-FOR (Field of Research) subject codes. In any activity records you create, provide at least one ANZSRC-FOR code.
ANDS infers and displays bi-directional links between related objects in Research Data Australia. If a collection links to an activity within the same data source, the activity record does not need to link back out to the collection; ANDS will display the inferred reverse link in Research Data Australia. If the activity and collection are from different data sources, ANDS will only display the inferred reverse link if the receiving partner has opted in to allow bi-directional links.
ARC and NHMRC activity records have enabled reverse links. If your records link to an ARC or NHMRC activity record, a link back to your record will be visible from the ARC or NHMRC record in Research Data Australia.
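The reverse-link rule described above can be sketched as follows. This is not the Research Data Australia implementation: the `Link` class, the data source names and the `opted_in_sources` set are hypothetical, and only the "isOutputOf"/"hasOutput" pair is modelled.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Only one relation pair is modelled here; others would be added the same way.
REVERSE = {"isOutputOf": "hasOutput", "hasOutput": "isOutputOf"}


@dataclass(frozen=True)
class Link:
    source_key: str
    source_ds: str   # data source of the record that declares the link
    relation: str
    target_key: str
    target_ds: str   # data source of the record being linked to


def inferred_reverse(link: Link, opted_in_sources: Set[str]) -> Optional[Link]:
    """Return the reverse link that would be displayed, or None."""
    reverse = REVERSE.get(link.relation)
    if reverse is None:
        return None
    same_source = link.source_ds == link.target_ds
    # Across data sources, the receiving partner must have opted in.
    if same_source or link.target_ds in opted_in_sources:
        return Link(link.target_key, link.target_ds, reverse,
                    link.source_key, link.source_ds)
    return None
```

With this rule, a collection that links to an activity in another data source only gains a visible link back if that data source appears in the opted-in set.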
Activities must be linked to a collection, through "hasOutput". By default, collections must link to activities, through "isOutputOf", unless it does not make sense to provide an activity record.
Activities must be linked to a party. This is to allow networks of researchers and collections to be clustered around research activities (through the "hasParticipant" relationship). This also makes it possible for users to get in touch with at least one party involved in the activity, as a contact point (through the "isManagedBy" relationship).
Research Data Australia is a collections registry: relations between activities and parties are only relevant if they improve collections discovery. For that reason, the other possible activity-party relations, "isFundedBy", "isOwnedBy" and "hasAssociationWith", should only be included if they improve the discoverability of collections. If you wish to include details of the funding relationship between an activity and a party, include this information in the activity's description type="note" element.
Research Data Australia is a collections registry: accordingly, relations between activities (such as "isPartOf", "hasPart" or "isFundedBy") are only relevant if they improve collections discovery.
Partners can still describe umbrella projects using "hasPart", if that will present a more coherent view of the research question and approach than do the component projects ("isPartOf").
For activities, relations of type "hasPart" and "isPartOf" should only be established between activities of the same type, that is, between two programs, or between two projects. Relations between different types of activities need a more specific relation. For example, a program may fund a project, and this should be described using a "Funds"/"isFundedBy" relation.
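The same-type rule for "hasPart" and "isPartOf" can be checked before records are submitted. The sketch below assumes activity types are available as plain strings; the function name is hypothetical.

```python
PART_RELATIONS = {"hasPart", "isPartOf"}


def part_relation_allowed(relation: str, from_type: str, to_type: str) -> bool:
    """Check the same-type rule for part relations between two activities."""
    if relation not in PART_RELATIONS:
        # The rule only constrains hasPart/isPartOf; other relations pass.
        return True
    return from_type == to_type
```

For example, a "hasPart" relation from a program to a project would be rejected, which signals that a more specific relation is needed.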
If both an umbrella and a component project are described, ANDS prefers the collection to be described as the output of the component project, rather than the umbrella project. This is consistent with ANDS' approach to granularity for services and institutions.
No relations are currently modelled between services and activities. The existing relations "isOutputOf" and "isFundedBy" between activities and collections could be extended to services, but this is beyond the requirements of Research Data Australia.
The Existence Dates element records the start and end dates of the existence of the activity being described. Record the Start Date and, where available, the End Date. End dates should be later than start dates.
It is important to note that grant dates for a project may not always align with activity start and end dates. For example, a project might extend beyond the funded dates.
Where the activity record is of type="project", the existence dates should reflect the start and finish dates of the project.
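The guidance on existence dates lends itself to a simple check before records are contributed. This is a minimal sketch; the function name is hypothetical and dates are assumed to be available as Python `date` objects.

```python
from datetime import date
from typing import Optional


def check_existence_dates(start: date, end: Optional[date] = None) -> None:
    """Raise ValueError if an end date is given and is not after the start."""
    # The end date is optional; when present it must be later than the start.
    if end is not None and end <= start:
        raise ValueError(f"end date {end} is not later than start date {start}")
```

The check passes silently for valid input, so it can be called on every activity record in a batch.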
|7 Jul 2012||First web publication as separate page (previously part of activity page)|
|12 April 2013||Included more extensive information about existenceDates and the differences between grant dates and project dates|
|9 May 2013||Reordered steps to align with Release 10 interface changes|
|28 March 2014||Updated Step 8 Identifiers with information about the display of multiple activity records for Release 12|
|19 June 2014||Content reviewed with minor changes.|
Services in the research domain support the creation or use of research collections and datasets.
ISO 2146 defines a service as 'a system (analogue or digital) that provides one or more functions of value to an end user'. Services can be web services, provided across the web and following a well-defined machine protocol, such as OAI-PMH Harvest or RSS Syndication; but they may also be provided by offline software (e.g. the functionality of software running a simulation, or creating annotations).
As with parties and activities, the ANDS Collections Registry gathers service descriptions in order to provide context for the collections it registers, and to enable discovery of related collections, rather than to serve as an exhaustive registry of research services. For that reason, the services described in the registry are always related to collections—whether the service exposes the collection, or was involved in creating the collection.
To be used, a service must be implemented. Therefore, a service must have a specific delivery method which makes it available to a client.
Delivery Methods include:
Web services are the most straightforward type of service to model: the definition of their function and scope is specified through statements of behaviour and data representation, and they have a well-defined protocol for interaction with service clients. These protocols can usually be indicated through the service type.
Other types of service are used to model instruments, software, and workflows. These tools often do not have well-defined protocols for interaction, so protocols need not be specified in their service description. These tools also have properties which are not captured by modelling them as services (e.g. asset numbers, operating systems): this partial representation is deliberate, because of the restricted scope of service descriptions.
Service descriptions in the ANDS Collections Registry are meant to convey only high-level, indicative information. More complete detail about data collection provenance should be provided in local metadata stores, and linked to as Related Info from the service description.
Instruments are modelled as offline services—although strictly speaking what services model is the capability of instruments to create data collections. Instruments are often housed in facilities, but facilities should be modelled in the ANDS Collections Registry as parties: they are the organisations which own the instruments. Instruments can be composed of individual sensors; both the large-scale and more fine-grained instrument may be of interest to users. Instruments can be related to each other in a partOf relationship. For example, a specific detector can be part of a Synchrotron beamline instrument, or of a radio telescope.
Whether to model both the instrument and its component sensors in the ANDS Collections Registry depends on whether it will be useful to discover collections through sensors, rather than just through the instrument. This is a policy decision for partners; some partners have already elected not to do so. The details of sensors used to gather the data should at any rate be recorded in local metadata stores.
To be used, a service must also be instantiated: there must be a particular instance of the service being described, rather than the class of all matching services, and it should be possible to name the location of the service, and the parties managing the service. For example, the ANDS Collections Registry would describe the Monash University ARROW repository OAI-PMH feed, rather than giving a generic description of the OAI-PMH protocol.
Treating services as instances means that there may be many service records in the ANDS Collections Registry that look quite similar—distinct sensors, for example, or distinct deployments of RSS. As long as each instance is associated with a collection registered with the registry, it is still appropriate to distinguish between the service instances.
Exceptionally, software services may be described as implementations, rather than instances. A record can describe the downloadable software for the service, rather than an instance of the software running on a specific machine. A separate record would still be expected for different versions of the same software, or for different implementations.
Depending on how services relate to collections, services can be classified as Creation services, Metadata services, Discovery services, or Reuse services.
Discovery services are typically web services; creation services typically have other delivery methods.
The kind of service (service type) is described by choosing from the following (ANDS is currently considering expanding this list):
The service names for creation & metadata services are deliberately generic (and are taken from the e-Framework, which is not research-specific). To apply them, use the following:
What is the input into the service?
No reuse services have been included in the current service type vocabulary. The service type vocabulary can be expanded as the community requires—subject to the constraint that it describes services in the ANDS Collections Registry, which are specific to registered collections.
Services may also have access policies. These are described in a separate element.
Researcher Fred from Notre Dame University uses the Brahe interferometer on the Farnell Radio Telescope, to gather observations on pulsar THX-1138. The observations are registered with ANDS as a collection.
The pulsar data collection represents raw data. The Tempo2 pulsar timing software is used to extract pulsar timing data from a range of observations, including THX-1138, and the resulting analyses are also registered with the ANDS Collections Registry.
The pulsar data collection is exposed for search through the SRU protocol. The web service allowing this search is hosted at the University of Launceston.
The following diagram illustrates the relations of the objects described in this scenario:
The date metadata describing a service was last changed in the source system can be recorded. See Date modified.
Metadata records describing services are grouped together on the Research Data Australia home page. The service category and service type are displayed. The hyperlink to a page or XACML document describing service access policies is displayed. Date modified is not displayed. All information is searchable.
Often a collection is tightly bound with its discovery service, so there can be confusion about whether to model it as a collection or a service. The purpose of the ANDS Collections Registry is to promote the discovery of collections, not of services. So an entity such as a repository or portal must have a relevant collection description contributed to the registry. It can also have a relevant service description contributed, if that service description adds sufficient value. A discovery service that does not provide access to a specific collection is not relevant to the ANDS Collections Registry, and likely needs to be modelled differently.
For example: a podcast is a collection of recordings, combined with a syndication service for accessing that collection. The podcast should be described for ANDS as a collection, since that is the aspect of the podcast most relevant to the Collections Registry. The RSS feed to the podcast can be added to the Collections Registry as an associated discovery service (syndicate-rss). But the podcast should not be described as a service instead of a collection.
HTTP-Search for a single keyword can be assumed as default search functionality for a collection. (This is the single search box on the home page of most collections.) If the ANDS Collections Registry already has a description of such a collection, then a single-keyword search need not be registered in the ANDS Collections Registry as a distinct service description.
Portals provide access to an aggregation of collections. A portal can be modelled as either a service or as a collection; if it is modelled as a service, its constituent collections should also be described in the ANDS Collections Registry.
The service type is a two-part string, with the first part specifying the service genre and the second part specifying the protocol (for example, syndicate-rss, harvest-oaipmh, search-sru). For creation and metadata services, which do not have generically used protocols, only the service genre is specified.
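The two-part structure of the service type can be unpacked with a small helper. This is a sketch only; the function name is hypothetical and no check against the service type vocabulary is attempted.

```python
from typing import Optional, Tuple


def split_service_type(service_type: str) -> Tuple[str, Optional[str]]:
    """Split a service type into (genre, protocol); protocol may be None."""
    # Split on the first hyphen only: genre before it, protocol after it.
    genre, sep, protocol = service_type.partition("-")
    return genre, (protocol if sep else None)
```

For a genre-only value, as used for creation and metadata services, the protocol part is returned as `None`.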
If there is a well-defined protocol for an instance of a creation or metadata service, the service description should provide that protocol information in the Related Info element. Added protocol information should also be provided in the Related Info element for discovery services, if there are local extensions to the service protocol that service users need to know.
The value for the service genre is taken from the set of service genres registered with the e-Framework. The protocol is taken from known services identified by initial Collections Registry content providers. New genre-protocol combinations may be added on application to the RIF-CS schema manager (contact firstname.lastname@example.org).
Software tools can have multiple types applicable out of the service type vocabulary: unlike web services, software tools can perform multiple functions. However the service description of software tools shall have a single type, reflecting the primary use of the tool in the research community.
For web services, the electronic address is a URI that provides access to the service: in particular, it is a URI that can be processed by a client following the service protocol (service endpoint).
If the service is syndicate-rss, for example, the location in the service description will be a URI that can be processed by an RSS reader.
Web services alone may use the <arg> element in addition to the <value> element, to differentiate between a base URL and the service arguments. This only applies to HTTP Query services, in which the service call URL contains service arguments. The <arg> element indicates whether each of the URL arguments is required or optional, whether they are plain text or embedded objects, and whether they are inline (embedded in the base URL) or key-value pairs in an HTTP query. The <arg> element does not describe the semantics of the arguments, and should not be treated as a substitute for linking to protocol documentation for the service.
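To show how a client might use a base URL together with argument declarations, the sketch below assembles a service call URL from key-value arguments and checks that required arguments are supplied. The `Arg` class, the argument names and the example URL are hypothetical; inline arguments and embedded objects are out of scope here.

```python
from dataclasses import dataclass
from typing import Dict, List
from urllib.parse import urlencode


@dataclass(frozen=True)
class Arg:
    name: str
    required: bool  # required or optional, as declared for the service


def build_query_url(base_url: str, args: List[Arg], values: Dict[str, str]) -> str:
    """Append declared key-value arguments to a base URL."""
    missing = [a.name for a in args if a.required and a.name not in values]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    # Only declared arguments are sent, in declaration order.
    pairs = [(a.name, values[a.name]) for a in args if a.name in values]
    query = urlencode(pairs)
    return f"{base_url}?{query}" if query else base_url
```

The meaning of each argument still has to come from the protocol documentation for the service, which this sketch does not model.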
If the electronic address type is "wsdl", the <value> element must be a URL pointing to the WSDL file. Human-readable descriptions of the service online should be recorded in the Related Info element instead. A physical address or electronic address (email) can be provided as a contact for arranging access to the service. Typically this will be the same address as for the party managing the service.
For software and workflows, the electronic address is likewise a URI that provides access to the service: in particular, it is a URI that the software or workflow can be downloaded from. In this case too, human-readable descriptions of the software should be recorded in Related Info instead. A physical address or electronic address (email) can be provided as a contact for arranging access to the service.
For offline services, a web address is not acceptable as a location. That is because an instrument home page does not provide direct access to the service, the way an RSS feed address or a search query does. Web pages about the service should be recorded in the Related Info element, just as they are for online services. A physical address or electronic address (email) should be provided instead; as above, the physical address is intended to allow users to gain access to the offline service (contact address).
Delivery Method will be suggested for inclusion in future versions of RIF-CS. As an interim measure, include the delivery method as a string without spaces (webservice, software, offline, workflow) in a description element of type "deliveryMethod".
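The interim convention for delivery method can be sketched as below. The four allowed strings come from the text above; the helper name is hypothetical and the RIF-CS namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Delivery method strings listed in the interim guidance (no spaces).
DELIVERY_METHODS = {"webservice", "software", "offline", "workflow"}


def delivery_method_description(method: str) -> ET.Element:
    """Build a description element of type deliveryMethod."""
    if method not in DELIVERY_METHODS:
        raise ValueError(f"unknown delivery method: {method!r}")
    element = ET.Element("description", {"type": "deliveryMethod"})
    element.text = method
    return element
```

Rejecting unknown strings early helps catch values written with spaces or capitals, such as "web service".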
Where two or more service records, from the same or different data sources, share common identifiers, the records are treated as describing the same service.
In Research Data Australia, the records are merged into a single search result and links to each of the merged records are displayed on the view page of each record.
This feature of Research Data Australia is described in detail in Step 8 of Best practice for creating party records. The description and examples on this page apply equally to multiple service records.
Note: “local” identifiers are not used to link multiple records together.
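The grouping rule can be sketched in Python as follows. This is only an illustration of the idea, not the Research Data Australia implementation: the record layout and function name are hypothetical, and matching on both identifier type and value is an assumption.

```python
def group_records(records):
    """Group record keys that share at least one non-local identifier.

    records: dict mapping a record key to a list of
             (identifier_type, identifier_value) pairs (hypothetical layout).
    Returns a sorted list of sorted groups of record keys.
    """
    parent = {key: key for key in records}

    def find(key):
        # Follow parent pointers to the representative of the group.
        while parent[key] != key:
            parent[key] = parent[parent[key]]
            key = parent[key]
        return key

    first_seen = {}
    for key, identifiers in records.items():
        for id_type, value in identifiers:
            if id_type == "local":
                continue  # "local" identifiers never link records together
            ident = (id_type, value)
            if ident in first_seen:
                parent[find(key)] = find(first_seen[ident])
            else:
                first_seen[ident] = key

    groups = {}
    for key in records:
        groups.setdefault(find(key), set()).add(key)
    return sorted(sorted(group) for group in groups.values())


example_records = {
    "svc-1": [("uri", "http://example.org/svc"), ("local", "42")],
    "svc-2": [("uri", "http://example.org/svc")],
    "svc-3": [("local", "42")],
}
example_groups = group_records(example_records)
```

In this example svc-1 and svc-2 are grouped because they share a URI, while svc-3 stays separate even though its local identifier matches that of svc-1.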
Most of the relations described below are bidirectional; for discovery to be most effective, they should be represented in RIF-CS in both directions. In particular, if a collection links to the creation service that produced it, the creation service should also link out to all the collections it has produced. This allows discovery of more collections.
Often information on relations is only available in one direction: the description of a collection will link to the service that produced it, but the description of the service does not have access to the collections that the service has produced. In such cases, it is desirable for ANDS to automatically generate bidirectional links between the objects. This functionality is forthcoming.
Currently the only relation modelled between services is hasPart/isPartOf. Creation services can often be modelled as part of another creation service, as with sensors and instruments, or individual services and service workflows. Metadata and Discovery services, on the other hand, are not normally modelled as forming part of other services.
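Until links are generated automatically, a provider could derive the reverse direction itself before publishing records. The Python sketch below shows one way to do so; it is not an ANDS tool, the triple layout and function name are hypothetical, and the inverse table covers only relation pairs named on this page.

```python
# Relation pairs named on this page; a real feed would need the full
# RIF-CS relation vocabulary.
INVERSE = {
    "hasPart": "isPartOf",
    "isPartOf": "hasPart",
    "supports": "isSupportedBy",
    "isSupportedBy": "supports",
    "isDerivedFrom": "hasDerivedCollection",
    "hasDerivedCollection": "isDerivedFrom",
}


def add_reverse_relations(relations):
    """Return the relations plus the reverse of each one with a known inverse.

    relations: set of (source_key, relation_type, target_key) triples
               (hypothetical layout).
    """
    result = set(relations)
    for source, relation, target in relations:
        inverse = INVERSE.get(relation)
        if inverse is not None:
            result.add((target, inverse, source))
    return result


one_way = {("collection-A", "isSupportedBy", "service-X")}
both_ways = add_reverse_relations(one_way)
```

Relations without an entry in the table are passed through unchanged, so nothing is lost when an inverse is unknown.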
Service descriptions must have a relationship to at least one collection. Depending on the service type, services and collections can have the following relations:
The supports/isSupportedBy relation is generic; the other relations are specialisations of this relation.
If a transform or assemble service is used to change collection A into collection B, the service operates on input collection A, and produces output collection B. (For collection discovery, the produces relation is more important than the operates on relation.) Collection A and collection B are related through the relation isDerivedFrom/hasDerivedCollection. This relation is distinct from partOf: if a collection is derived from another collection, the output is a new collection, and is not considered part of the old.
If service A is part of service B, and service A is related to a collection, then service B should not also be modelled as having the same relation to the collection. It is best practice in information science to link only to the most detailed level. For example, a collection would be linked only to the Brahe interferometer, and not to both the Brahe interferometer and the Farnell telescope. Users should navigate down from the Farnell telescope to discover collections associated with individual receivers.
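The rule of linking only at the most detailed level can be checked mechanically. The following Python sketch drops any service-to-collection link that merely repeats, at a parent service, a link already made by one of its parts. The data layout, function name and record keys are hypothetical.

```python
def most_specific_links(collection_links, part_of):
    """Keep only links made at the most detailed service level.

    collection_links: set of (service_key, relation_type, collection_key).
    part_of:          dict mapping a service key to the service it is part of.
    """
    def ancestors(service):
        # Walk up the isPartOf chain, stopping if a cycle is encountered.
        seen = set()
        while service in part_of and part_of[service] not in seen:
            service = part_of[service]
            seen.add(service)
        return seen

    redundant = set()
    for service, relation, collection in collection_links:
        for ancestor in ancestors(service):
            redundant.add((ancestor, relation, collection))
    return collection_links - redundant


links = {
    ("brahe-interferometer", "produces", "pulsar-data"),
    ("farnell-telescope", "produces", "pulsar-data"),
}
hierarchy = {"brahe-interferometer": "farnell-telescope"}
kept = most_specific_links(links, hierarchy)
```

Here only the link from the interferometer remains; the telescope is still reachable through the part relation between the two services.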
The following relations can be modelled between parties and services:
The relationship between a facility and its instruments is modelled through the isOwnerOf relation.
Note that the owner of a service is distinct from the owner of the associated collection. In the example above, the Norfolk Island Astronomical Commissariat owns the telescope that captured the pulsar data, but the pulsar data itself is owned by Notre Dame University.
No relations are currently modelled between services and activities. The existing relations isOutputOf and isFundedBy between activities and collections could be extended to services. However this level of detail is beyond the requirements of the ANDS Collections Registry, and is appropriate instead for a services registry.
The relation hasAssociationWith, as with other registry object classes, allows an unspecified relationship to be signalled between the service and the target object.
|April 2010||Consultation draft|
|26 October 2010||First web publication|
|25 January 2011||Complete revision to add creation and metadata services|
|14 April 2011||Added link to Access Policy (services only) page|
|28 March 2014||Add information to the best practice section, about the display of multiple service records in Release 12|