Research Data Australia has been set up to register collections. Related parties, activities and services in Research Data Australia provide context and meaning for the collections. The collections registered are most commonly datasets, but they can also be of "collection" type ("compiled content created as separate and independent works"), such as museum or archive collections; information collections, such as registries, catalogues and indexes; or aggregated collections, such as are found in repositories.
Collection records created to best practice standards will:
The steps below discuss issues that are specific to collections. See Collection for general information.
|1 Feb 2012||New, separate and expanded Collection Best Practice page|
|2 Nov 2012||Added "metadata" as a type in Step 13: Related Info|
|20 Nov 2012||Added dates (collections) as Step 14|
|9 May 2013||Reordered steps to align with Release 10 interface changes|
|26 November 2013||Updated Related Info to include information about the element according to RIF-CS v1.5.0|
|28 March 2014||Added information at Step 9, about the display of multiple collection records in Research Data Australia in Release 12|
|15 May 2014||Modified contents; modified Step 3 to include information about what best practice means|
|31 July 2015||Updated information|
The purpose of a party record in Research Data Australia is to support discovery of research data collections and to provide context to those collections.
ANDS collaborated with the National Library of Australia to provide infrastructure for describing parties using the Trove service. All ANDS partners can use this infrastructure to create party records which can be harvested into Research Data Australia from Trove. See Trove and TIM and the ARDC Party Infrastructure for detail.
At least one party record must be related to each collection described in Research Data Australia (see Metadata Content Requirements).
If a record is already available for your party you do not need to create a new party record. Link to the existing record by including either the Research Data Australia party record key or the NLA Trove party identifier in the related object element of your records.
Provide your own description of the person or group in a separate party record. You cannot edit other partners' records, including the National Library of Australia’s (NLA) records.
See Step 8 for how to make sure that both your record and the existing record are treated as describing the same party.
Note: Most partners will need to contribute party records, because many researchers may not have records yet, or the records that do exist may be inadequate.
External researchers including international researchers
ANDS has altered its position on creating party records for researchers external to your research institution. Organisations can now create party records for Australian and international researchers external to an organisation or independent of any organisation, where these records do not already exist. The organisation should perform a search for such existing records before proceeding to creation (see Step 2). If a new party record is created, the record-creating organisation needs to take responsibility for including the name of the employing institution, if available.
ANDS suggests that you work with collaborating institutions to ensure that all researchers responsible for the generation of collections are appropriately acknowledged by having party records in Research Data Australia. The ARDC Party Infrastructure Project is intended to help partners access party records for external researchers.
At least one party should be related to a collection (see Metadata Content Requirements). If multiple parties can be related to a collection, the description should aim to link the collection to any party that will improve discovery substantively.
That means all active, known collaborators on the project.
As a default, that would include all named researchers on the research grant application, but not support staff or research assistants.
Organisations are hierarchically organised, and you may have a choice of hierarchical level at which to represent your party (group). For example, you may link your collection to your research lab, to your department, your faculty, or your research institution. Which level of group is represented as responsible for collections is a matter of institutional policy.
Remember that Research Data Australia is not intended to represent group hierarchies. The default approach should be to represent the lowest-level group, with the most direct engagement with the collection. That means that the research lab or individual researchers are the parties of interest that need to be described and linked to a collection, in preference to departments or faculties. The name of a research lab, for example, may include its superior body's name as part of its name, e.g. Budawang University, Frontiers of Chemistry Research Laboratory.
Research Data Australia groups an institution's information for display using the Group attribute rather than requiring connection of all collection records to an institutional party record. However, contributors may make such connections if they wish.
Disambiguation: A party of type "group" is not the same as and has no relationship with the "group" attribute to which a Registry Object is linked for purposes of display in Research Data Australia. The "group" attribute is also used as the basis for enhanced contributor home pages. See Group for more information about this attribute.
Search to see if your party is already described before adding a new party record.
Where to search:
Type of party is required. There are three party types: person, group and administrativePosition.
Administrative position is a kind of party where the position name and contact information are present but the identity of the party filling the role is not specified.
Many data sources provide ANDS with party names such as "Data custodian", "Data Officer", or "Data Manager". There are over 1000 collection records (about 10% of the records in Research Data Australia) that contain role names for party records. Such role names are common in large data management environments.
Remember, a party of type "group" is not the same as and has no relationship with the "group" attribute.
Keys are required in party records. Keys must be unique in Research Data Australia. More information on creating keys for parties.
Collection records link to party records using the Research Data Australia party record key. Alternatively, collection records link to the NLA party records using the NLA party identifier (see Step 8 for more about identifiers).
Names for parties should be described by recording each name component in a separate name part. The type of name part is described by choosing from the following:
Only use Date range if the name has changed over time and older versions of the name have been recorded in the metadata being provided, such as when a research centre has changed its name since the related dataset was created.
Activity records in Research Data Australia (RDA) enable the description of research projects and programs as well as research grants and funding programs. Data collections are often the output of research activity and the description of related projects or grants can provide additional context. Since April 2015, RDA has offered users a specialised search option that enables the exploration of research activity in Australia. ARC and NHMRC research grants are recorded in RDA as activity records.
|7 Jul 2012||First web publication as separate page (previously part of activity page)|
|12 April 2013||Included more extensive information about existenceDates and the differences between grant dates and project dates|
|9 May 2013||Reordered steps to align with Release 10 interface changes|
|28 March 2014||Updated Step 8 Identifiers with information about the display of multiple activity records for Release 12|
|19 June 2014||Content reviewed with minor changes.|
|14 April 2015||Content updated to reflect changes implemented with Release 15|
|31 July 2015||Content updated to reflect changes implemented with Release 15|
Services in the research domain support the creation or use of research collections and datasets.
ISO 2146 defines a service as 'a system (analogue or digital) that provides one or more functions of value to an end user'. Services can be web services, provided across the web and following a well-defined machine protocol, such as OAI-PMH Harvest or RSS Syndication; but they may also be provided by offline software (e.g. the functionality of software running a simulation, or creating annotations).
As with parties and activities, the ANDS Collections Registry gathers service descriptions in order to provide context for the collections it registers, and to enable discovery of related collections, rather than to serve as an exhaustive registry of research services. For that reason, the services described in the registry are always related to collections—whether the service exposes the collection, or was involved in creating the collection.
To be used, a service must be implemented. Therefore, a service must have a specific delivery method which makes it available to a client.
Delivery Methods include:
Web services are the most straightforward type of service to model: the definition of their function and scope is specified through statements of behaviour and data representation, and they have a well-defined protocol for interaction with service clients. These protocols can usually be indicated through the service type.
Other types of service are used to model instruments, software, and workflows. These tools often do not have well-defined protocols for interaction, so protocols need not be specified in their service description. These tools also have properties which are not captured by modelling them as services (e.g. asset numbers, operating systems): this partial representation is deliberate, because of the restricted scope of service descriptions.
Service descriptions in the ANDS Collections Registry are meant to convey only high-level, indicative information. More complete detail about data collection provenance should be provided in local metadata stores, and linked to as Related Info from the service description.
Instruments are modelled as offline services—although strictly speaking what services model is the capability of instruments to create data collections. Instruments are often housed in facilities, but facilities should be modelled in the ANDS Collections Registry as parties: they are the organisations which own the instruments. Instruments can be composed of individual sensors; both the large-scale and more fine-grained instrument may be of interest to users. Instruments can be related to each other in a partOf relationship. For example, a specific detector can be part of a Synchrotron beamline instrument, or of a radio telescope.
Whether to model both the instrument and its component sensors in the ANDS Collections Registry depends on whether it will be useful to discover collections through sensors, rather than just through the instrument. This is a policy decision for partners; some partners have already elected not to do so. In either case, the details of sensors used to gather the data should be recorded in local metadata stores.
To be used, a service must also be instantiated: there must be a particular instance of the service being described, rather than the class of all matching services, and it should be possible to name the location of the service, and the parties managing the service. For example, the ANDS Collections Registry would describe the Monash University ARROW repository OAI-PMH feed, rather than giving a generic description of the OAI-PMH protocol.
Treating services as instances means that there may be many service records in the ANDS Collections Registry that look quite similar—distinct sensors, for example, or distinct deployments of RSS. As long as each instance is associated with a collection registered with the registry, it is still appropriate to distinguish between the service instances.
Exceptionally, software services may be described as implementations, rather than instances. A record can describe the downloadable software for the service, rather than an instance of the software running on a specific machine. A separate record would still be expected for different versions of the same software, or for different implementations.
Depending on how services relate to collections, services can be classified as Creation services, Metadata services, Discovery services, or Reuse services.
Discovery services are typically web services; creation services typically have other delivery methods.
The kind of service (service type) is described by choosing from the following (ANDS is currently considering expanding this list):
The service names for creation & metadata services are deliberately generic (and are taken from the e-Framework, which is not research-specific). To apply them, use the following:
What is the input into the service?
No reuse services have been included in the current service type vocabulary. The service type vocabulary can be expanded as the community requires—subject to the constraint that it describes services in the ANDS Collections Registry, which are specific to registered collections.
Services may also have access policies. These are described in a separate element.
Researcher Fred from Notre Dame University uses the Brahe interferometer on the Farnell Radio Telescope to gather observations on pulsar THX-1138. The observations are registered with ANDS as a collection.
The pulsar data collection represents raw data. The Tempo2 pulsar timing software is used to extract pulsar timing data from a range of observations, including THX-1138, and the resulting analyses are also registered with the ANDS Collections Registry.
The pulsar data collection is exposed for search through the SRU protocol. The web service allowing this search is hosted at the University of Launceston.
The following diagram illustrates the relations of the objects described in this scenario:
The date metadata describing a service was last changed in the source system can be recorded. See Date modified.
Metadata records describing services are grouped together on the Research Data Australia home page. The service category and service type are displayed. The hyperlink to a page or XACML document describing service access policies is displayed. Date modified is not displayed. All information is searchable.
Often a collection is tightly bound with its discovery service, so there can be confusion about whether to model it as a collection or a service. The purpose of the ANDS Collections Registry is to promote the discovery of collections, not of services. So an entity such as a repository or portal must have a relevant collection description contributed to the registry. It can also have a relevant service description contributed, if that service description adds sufficient value. A discovery service that does not provide access to a specific collection is not relevant to the ANDS Collections Registry, and likely needs to be modelled differently.
For example: a podcast is a collection of recordings, combined with a syndication service for accessing that collection. The podcast should be described for ANDS as a collection, since that is the aspect of the podcast most relevant to the Collections Registry. The RSS feed to the podcast can be added to the Collections Registry as an associated discovery service (syndicate-rss). But the podcast should not be described as a service instead of a collection.
HTTP-Search for a single keyword can be assumed as default search functionality for a collection. (This is the single search box on the home page of most collections.) If the ANDS Collections Registry already has a description of such a collection, then a single-keyword search need not be registered in the ANDS Collections Registry as a distinct service description.
Portals provide access to an aggregation of collections. A portal can be modelled as either a service or as a collection; if it is modelled as a service, its constituent collection should also be described in the ANDS Collections Registry.
The service type is a two-part string, with the first part specifying the service genre and the second part specifying the protocol (for example, syndicate-rss, harvest-oaipmh, search-sru). For creation and metadata services, which do not have generically used protocols, only the service genre is specified.
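The two-part structure described above can be illustrated with a minimal Python sketch. The function and class names are hypothetical, and splitting at the first hyphen is an assumption based on the examples given here, not a rule stated by RIF-CS.

```python
from typing import NamedTuple, Optional


class ServiceType(NamedTuple):
    genre: str
    protocol: Optional[str]  # None when only the genre is specified


def parse_service_type(value: str) -> ServiceType:
    """Split a service type string into genre and protocol.

    Assumption: the genre never contains a hyphen, so the string is
    split at the first hyphen only.
    """
    genre, sep, protocol = value.strip().partition("-")
    if not genre:
        raise ValueError(f"service type has no genre: {value!r}")
    if sep and not protocol:
        raise ValueError(f"service type has a hyphen but no protocol: {value!r}")
    return ServiceType(genre, protocol or None)
```

For a discovery service such as harvest-oaipmh this yields the genre "harvest" and the protocol "oaipmh"; for a creation service such as transform, the protocol is left empty.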
If there is a well-defined protocol for an instance of a creation or metadata service, the service description should provide that protocol information in the Related Info element. Added protocol information should also be provided in the Related Info element for discovery services, if there are local extensions to the service protocol that service users need to know.
The value for the service genre is taken from the set of service genres registered with the e-Framework. The protocol is taken from known services identified by initial Collections Registry content providers. New genre-protocol combinations may be added on application to the RIF-CS schema manager (contact firstname.lastname@example.org).
Software tools can have multiple types applicable out of the service type vocabulary: unlike web services, software tools can perform multiple functions. However, the service description of a software tool shall have a single type, reflecting the primary use of the tool in the research community.
For web services, the electronic address is a URI that provides access to the service: in particular, it is a URI that can be processed by a client following the service protocol (service endpoint).
If the service is syndicate-rss, for example, the location in the service description will be a URI that can be processed by an RSS reader.
Web services alone may use the <arg> element in addition to the <value> element, to differentiate between a base URL and the service arguments. This only applies to HTTP Query services, in which the service call URL contains service arguments. The <arg> element indicates whether each of the URL arguments is required or optional, whether they are plain text or embedded objects, and whether they are inline (embedded in the base URL) or key-value pairs in an HTTP query. The <arg> element does not describe the semantics of the arguments, and should not be treated as a substitute for linking to protocol documentation for the service.
If the electronic address type is "wsdl", the <value> element must be a URL pointing to the WSDL file. Human-readable descriptions of the service online should be recorded in the Related Info element instead. A physical address or electronic address (email) can be provided as a contact for arranging access to the service. Typically this will be the same address as for the party managing the service.
For software and workflows, the electronic address is likewise a URI that provides access to the service: in particular, it is a URI that the software or workflow can be downloaded from. In this case too, human-readable descriptions of the software should be recorded in Related Info instead. A physical address or electronic address (email) can be provided as a contact for arranging access to the service.
For offline services, a web address is not acceptable as a location. That is because an instrument home page does not provide direct access to the service, the way an RSS feed address or a search query does. Web pages about the service should be recorded in the Related Info element, just as they are for online services. A physical address or electronic address (email) should be provided instead; as above, the physical address is intended to allow users to gain access to the offline service (contact address).
Delivery Method will be suggested for inclusion in future versions of RIF-CS. As an interim measure, include the delivery method as a string without spaces (webservice, software, offline, workflow) in a description element of type "deliveryMethod".
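As an illustration of the interim measure, the following Python sketch builds such a description element with the standard library. The helper name is hypothetical, and the RIF-CS namespace and the enclosing service record are deliberately omitted, so this is a fragment rather than a complete, schema-valid record.

```python
import xml.etree.ElementTree as ET

# The four delivery method strings listed in the text above.
DELIVERY_METHODS = {"webservice", "software", "offline", "workflow"}


def delivery_method_element(method: str) -> ET.Element:
    """Build a description element of type "deliveryMethod".

    Assumption: namespace handling is left out for brevity; a real
    record would place this element inside a RIF-CS service object.
    """
    if method not in DELIVERY_METHODS:
        raise ValueError(f"unknown delivery method: {method!r}")
    element = ET.Element("description", {"type": "deliveryMethod"})
    element.text = method
    return element


fragment = ET.tostring(delivery_method_element("offline"), encoding="unicode")
print(fragment)  # <description type="deliveryMethod">offline</description>
```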
Where two or more service records, from the same or different data sources, share common identifiers, the records are treated as describing the same service.
In Research Data Australia, the records are merged into a single search result and links to each of the merged records are displayed on the view page of each record.
This feature of Research Data Australia is described in detail in Step 8 of Best practice for creating party records. The description and examples on this page apply equally to multiple service records.
Note: “local” identifiers are not used to link multiple records together.
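The grouping behaviour described above can be sketched in Python as follows. This is not the Research Data Australia implementation: the data layout, the function name, and the choice to match on both identifier type and value are assumptions made for illustration only.

```python
from collections import defaultdict


def group_records(records):
    """Group record keys that share at least one non-local identifier.

    records: dict mapping a record key to a list of
             (identifier_type, identifier_value) pairs (hypothetical layout).
    Returns a list of sets of record keys.
    """
    parent = {key: key for key in records}

    def find(key):
        # Follow parent pointers to the group representative,
        # shortening the path along the way.
        while parent[key] != key:
            parent[key] = parent[parent[key]]
            key = parent[key]
        return key

    first_seen = {}
    for key, identifiers in records.items():
        for id_type, value in identifiers:
            if id_type == "local":
                continue  # "local" identifiers never link records together
            ident = (id_type, value)
            if ident in first_seen:
                parent[find(key)] = find(first_seen[ident])
            else:
                first_seen[ident] = key

    groups = defaultdict(set)
    for key in records:
        groups[find(key)].add(key)
    return list(groups.values())
```

Grouping is transitive in this sketch: if record A shares an identifier with B, and B shares a different identifier with C, all three end up in one group. Whether the registry behaves the same way is not stated in this guide.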
Most of the relations described below are bidirectional; for discovery to be most effective, they should be represented in RIF-CS in both directions. In particular, if a collection links to the creation service that produced it, the creation service should also link out to all the collections it has produced. This allows discovery of more collections.
Often information on relations is only available in one direction: the description of a collection will link to the service that produced it, but the description of the service does not have access to the collections that the service has produced. In such cases, it is desirable for ANDS to automatically generate bidirectional links between the objects. This functionality is forthcoming.
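Until such links are generated automatically, a contributor could complete them locally before export. The Python sketch below shows the idea under stated assumptions: the triple layout and function name are hypothetical, and the inverse table contains only the relation pairs named on this page, not the full RIF-CS vocabulary.

```python
# Relation pairs named in this guide; extend the table from the
# RIF-CS vocabulary as needed (this subset is an assumption).
INVERSE = {
    "supports": "isSupportedBy",
    "hasPart": "isPartOf",
    "isDerivedFrom": "hasDerivedCollection",
}
# Make the table symmetric so lookups work in both directions.
INVERSE.update({inverse: relation for relation, inverse in list(INVERSE.items())})


def add_reverse_links(links):
    """Return the links plus the inverse of every link with a known inverse.

    links: iterable of (source_key, relation, target_key) triples.
    Relations without an entry in INVERSE are kept but not mirrored.
    """
    completed = set(links)
    for source, relation, target in list(completed):
        inverse = INVERSE.get(relation)
        if inverse is not None:
            completed.add((target, inverse, source))
    return completed
```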
Currently the only relation modelled between services is hasPart/isPartOf. Creation services can often be modelled as part of another creation service, as with sensors and instruments, or individual services and service workflows. Metadata and Discovery services, on the other hand, are not normally modelled as forming part of other services.
Service descriptions must have a relationship to at least one collection. Depending on the service type, services and collections can have the following relations:
The supports/isSupportedBy relation is generic; the other relations are specialisations of this relation.
If a transform or assemble service is used to change collection A into collection B, the service operates on input collection A, and produces output collection B. (For collection discovery, the produces relation is more important than the operates on relation.) Collection A and collection B are related through the relation isDerivedFrom/hasDerivedCollection. This relation is distinct from partOf: if a collection is derived from another collection, the output is a new collection, and is not considered part of the old.
If service A is part of service B, and service A is related to a collection, then service B should not also be modelled as having the same relation to the collection. It is best practice in information science to link only to the most detailed level. For example, a collection would be linked only to the Brahe interferometer, and not to both the Brahe interferometer and the Farnell telescope. Users should navigate down from the Farnell telescope to discover collections associated with individual receivers.
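A local check for this rule might look like the Python sketch below. The data layout, the function name and the lower-case keys are hypothetical; the sketch simply drops any link from a parent service that repeats a link already held by one of its parts.

```python
def prune_parent_links(collection_links, part_of):
    """Remove service-to-collection links that duplicate a part's link.

    collection_links: iterable of (service_key, relation, collection_key).
    part_of: dict mapping a service key to the key of the service it is
             part of (hypothetical layout for isPartOf relations).
    """
    def ancestors(service):
        # Walk up the isPartOf chain, stopping if a cycle is detected.
        seen = set()
        while service in part_of and part_of[service] not in seen:
            service = part_of[service]
            seen.add(service)
        return seen

    links = set(collection_links)
    redundant = set()
    for service, relation, collection in links:
        for ancestor in ancestors(service):
            redundant.add((ancestor, relation, collection))
    return links - redundant
```

With the scenario above, a link from the Farnell telescope to the pulsar collection would be dropped when the Brahe interferometer, which is part of the telescope, already has the same relation to that collection.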
The following relations can be modelled between parties and services:
The relationship between a facility and its instruments is modelled through the isOwnerOf relation.
Note that the owner of a service is distinct from the owner of the associated collection. In the example above, the Norfolk Island Astronomical Commissariat owns the telescope that captured the pulsar data, but the pulsar data itself is owned by Notre Dame University.
No relations are currently modelled between services and activities. The existing relations isOutputOf and isFundedBy between activities and collections could be extended to services. However this level of detail is beyond the requirements of the ANDS Collections Registry, and is appropriate instead for a services registry.
The relation hasAssociationWith, as with other registry object classes, allows an unspecified relationship to be signalled between the service and the target object.
|April 2010||Consultation draft|
|26 October 2010||First web publication|
|25 January 2011||Complete revision to add creation and metadata services|
|14 April 2011||Added link to Access Policy (services only) page|
|28 March 2014||Add information to the best practice section, about the display of multiple service records in Release 12|