GETTING STARTED
The research dataset description tool Qvain can be found at qvain.fairdata.fi. Qvain can also be accessed at etsin.fairdata.fi via the “Create/edit datasets” button in the page’s upper corner.
You need an active CSC customer account in order to use Qvain (instructions can be found here).
Once you have an active CSC customer account, you can log into the service at qvain.fairdata.fi by using your Haka, Virtu or your personal CSC account.
IDA, Qvain and Etsin use a common single sign-on/sign-off (SSO) service. This means that you will log into all of these Fairdata services when logging in once through IDA, Qvain or Etsin. Similarly, logging out will log you out of all of the services.
To log into Qvain, click the “Login” button in the page’s upper right corner. Choose “CSC Login” or “Haka Login” and follow the instructions.
Please note that your personal CSC customer account can be managed through MyCSC (e.g. creating an account or changing the CSC account’s password).
FRONT PAGE
Qvain’s front page lists all the datasets you have defined in the service. From the front page, you can either create a new dataset or edit an existing one.
Creating a new dataset
You can create a new dataset by clicking the “Create new dataset” button or by clicking “Create dataset” at the top of the page.
Dataset list
In the dataset list, you will find the following information regarding each dataset: the dataset’s title, status, owner (you / someone else), when the dataset was created as well as the actions available for each dataset.
You can filter the datasets by name using the search box above the dataset list.
The status of a dataset is either “Published” or “Draft”. Published datasets are displayed publicly to everyone in Etsin (even when access to the data included in a dataset is restricted, the metadata describing a published dataset is always public). Datasets with a “Draft” status are only visible to the person who created the dataset when they are logged in to Etsin.
If multiple versions of the dataset have been created, you can view the older versions by clicking the “>” in front of the latest version of the dataset. All versions of a dataset can be edited.
When you view a dataset through Etsin’s search, the latest version of the dataset is shown by default. You can also switch to view older versions of the dataset.
In the “Actions” section on the front page’s dataset list, you can:
- Click “Edit” to edit a published or draft dataset.
- See what the dataset looks like in Etsin by clicking “View in Etsin”.
- Share editing rights with other users by clicking “Editors”.
- Use an existing dataset as a template to create a new dataset by clicking “Use as template” from the “More” drop-down menu. In this case, Qvain copies all the descriptive information from the selected dataset to the fields of the new dataset form. NB! Files attached to a dataset will not be copied to the new dataset’s form when using the “Use as template” action.
- Create a new version of the dataset by clicking “Create new version” from the “More” drop-down menu. NB! When creating a new version of a dataset, old versions of the dataset will remain publicly visible in Etsin unless you delete them.
- Delete the dataset completely by clicking “Delete” from the “More” drop-down menu. If you delete a dataset that has a “Draft” status, the dataset will be permanently deleted. If, on the other hand, you delete a published dataset, it will be removed from Etsin’s search results and will no longer be visible in Qvain. The deleted material will, however, have a public description page in Etsin. You will be able to find this page in Etsin through the dataset’s permanent identifier.
DATASET DESCRIPTION
Below you will find instructions on filling out the dataset form.
Fields in Qvain
Mandatory fields are marked with an asterisk (*).
Data origin *
Before you can attach files to your dataset, you must select where the data comes from: either files stored in IDA, or URLs of an external service from which the files can be obtained. Make the selection by choosing either “IDA” or “Remote resources”.
Fairdata IDA files
If you attach files to your dataset from the IDA service, first select a project in IDA. You will then see all frozen files and folders for the project. Select the files and folders you want to attach to your dataset. If you select a folder, all files and subfolders in that folder will be attached to the dataset. Files can be attached to a dataset from only one IDA project. These selected files can be downloaded via Etsin after the dataset is published, unless there are restrictions on downloading the files under “Access type”. NB! Unfreezing or deleting frozen files in IDA will immediately and permanently deprecate all datasets to which those files are attached. A warning is presented in IDA if an unfreeze or delete action would deprecate any datasets.
DOI identifier
If you attach files from IDA, a DOI is generated for your dataset by default, but you can request a URN identifier instead. Once you have selected IDA as the data origin, a checkbox will appear that is ticked by default (you will get a DOI). By unticking the checkbox, you will get a URN instead. The identifier is generated automatically when you publish your dataset. A DOI will begin to resolve (act as a link in http format) to your dataset right after the dataset’s publication, and a URN will begin to resolve the day after publication.
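As a rough illustration of what "resolving" means, the sketch below turns an identifier into a web link. The resolver addresses (https://doi.org/ for DOIs and http://urn.fi/ for URNs), the function name and the example identifiers are assumptions made for this example and are not taken from Qvain itself.

```python
def resolver_url(identifier: str) -> str:
    """Return a web link for a DOI or URN identifier (illustrative only)."""
    ident = identifier.strip()
    lower = ident.lower()
    if lower.startswith("doi:"):
        # Assumption: DOIs resolve through the public doi.org resolver.
        return "https://doi.org/" + ident[4:]
    if lower.startswith("urn:"):
        # Assumption: Finnish URNs resolve through the urn.fi resolver.
        return "http://urn.fi/" + ident
    raise ValueError(f"Not a DOI or URN: {identifier!r}")


# Hypothetical identifiers, made up for this example.
print(resolver_url("doi:10.1234/example-dataset"))
print(resolver_url("urn:nbn:fi:example-1234"))
```

Keep in mind that, as described above, a freshly published URN link may not work until the day after publication.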
Cumulative dataset
A cumulative dataset is a dataset to which files are to be added after the dataset has been published. When a dataset is marked as cumulative, files can be added to it after publication without the need to create a new version of the dataset with a new identifier.
Cumulative datasets are clearly marked in Etsin and thus users are able to consider the nature of the dataset when referring to it.
NB! Files cannot be deleted from a cumulative dataset after publication.
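The file rules for cumulative and non-cumulative datasets can be summarised in a small sketch. The function and the operation names are hypothetical, and the assumption that files can be freely added and removed before publication is ours; the guide only describes what happens after publication.

```python
def allowed_file_changes(published: bool, cumulative: bool) -> set[str]:
    """Return the file operations possible without creating a new version."""
    if not published:
        # Assumption: a draft's file selection can still be changed freely.
        return {"add", "remove"}
    if cumulative:
        # Published cumulative dataset: files can be added but never deleted.
        return {"add"}
    # Published non-cumulative dataset: any file change needs a new version.
    return set()


print(allowed_file_changes(published=True, cumulative=True))   # {'add'}
print(allowed_file_changes(published=True, cumulative=False))  # set()
```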
Remote resources
Use a remote resource to attach files to the dataset which are not stored in the IDA service. Add the files one by one by clicking the button “Add remote resource”.
A popup window will open where you can fill in the information. Give each file a title and select a use category from the drop-down menu. It is also recommended to add either an access URL or a download URL to each file. In the “Access URL” field, you can enter the address of a web page that describes the file or its license. In the “Download URL” field, you can enter a link that, when selected, will start the file download immediately. Files from a remote resource are not saved to Qvain or Etsin, but are downloaded by the user from the specified remote location.
After you have filled in the information related to the remote resource, press the “Add remote resource” button. The added source will then appear under “Remote resources”, and you can edit or remove the sources from the same list, if necessary.
License *: The license defines how the data in the dataset can be used (metadata is automatically CC0 licensed). The recommended and default license for research data is CC BY 4.0 (Creative Commons Attribution 4.0). The license information is mandatory. Most common licenses are available. If you do not find the right license for your dataset, you can either choose “Licence Not Specified” or, instead of selecting a value from the drop-down menu, type in the URL of an existing license page.
Access Type *: This field defines how the data in your dataset can be accessed. Whichever option is selected does not affect the visibility of the dataset’s description (metadata) itself. Even if access to data is restricted, descriptive information about the published dataset is displayed in Etsin.
- Open: Anyone can download the files attached to your dataset.
- Embargo: Anyone can download the files attached to your dataset from a certain date onwards. If you leave the date empty, the data cannot be accessed at all. If you select “Embargo”, a field will appear on the form where you can specify when the embargo will end (the data will be available from that time onwards).
- Requires Login in Fairdata service: Users logged in to Fairdata services can download the files attached to your dataset (currently requires authentication with either Haka ID or CSC account).
- Restricted use: Files attached to your dataset cannot be downloaded at all.
If you select any option other than “Open”, a “Restriction Grounds” drop-down menu will appear on the form, in which you must specify the basis on which access is restricted. This information is also displayed in Etsin.
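The four access types above can be read as a simple decision rule. The following is a minimal sketch of that rule; the function name and the short access type labels ("open", "embargo", "login", "restricted") are hypothetical and do not reflect how Qvain or Etsin store this information.

```python
from datetime import date
from typing import Optional


def can_download(access_type: str,
                 logged_in: bool = False,
                 embargo_end: Optional[date] = None,
                 today: Optional[date] = None) -> bool:
    """Decide whether a user can download the files of a published dataset."""
    today = today or date.today()
    if access_type == "open":
        return True
    if access_type == "embargo":
        # No end date means the data cannot be accessed at all.
        return embargo_end is not None and today >= embargo_end
    if access_type == "login":
        # Requires a logged-in Fairdata user.
        return logged_in
    if access_type == "restricted":
        return False
    raise ValueError(f"Unknown access type: {access_type!r}")


print(can_download("embargo", embargo_end=date(2021, 3, 23),
                   today=date(2021, 3, 22)))  # False: embargo still active
```

In every case the dataset's description (metadata) stays public; the rule only concerns the attached files.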
Title *: Title for your dataset. It is possible to add a title both in Finnish and in English. You must enter a title in at least one language.
Description *: Description for your dataset. It is possible to add a description both in Finnish and in English. You must enter a description in at least one language. The description text supports most Markdown syntax (https://www.markdownguide.org/basic-syntax/).
You can, for example, include the following information in the free-text description:
- Implementation and subject(s) of the research
- Method of data collection and the tools used, and other information related to data collection
- Structure of the data and also descriptions of the variables used
- Note: you can also include descriptive files, such as README-type files, among the files linked from IDA
Issued date *: Date of formal issuance (publication) of the resource. This value does not affect or reflect the visibility of the dataset itself. If left empty, the current date is used as a default value.
Keywords *: Free-text keywords for your dataset. Please enter at least one keyword. Keywords affect how your dataset can be found in Etsin. You can enter multiple keywords at the same time, separated with commas (,). Press Enter or select “Add keyword” to add the keywords.
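To show how a comma-separated entry breaks down into individual keywords, here is a small sketch. It only illustrates the idea; the function is hypothetical, and dropping empty entries and repeats is our assumption rather than documented Qvain behaviour.

```python
def split_keywords(raw: str) -> list[str]:
    """Split a comma-separated string into trimmed, non-empty keywords."""
    keywords = []
    for part in raw.split(","):
        keyword = part.strip()
        # Skip empty entries and repeats (an assumption, see above).
        if keyword and keyword not in keywords:
            keywords.append(keyword)
    return keywords


print(split_keywords("climate, sea ice,  Arctic ,"))
# → ['climate', 'sea ice', 'Arctic']
```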
Subject Headings: Select subject headings for the dataset from the drop-down menu. Qvain suggests subject headings as you type text in the field. You can choose subject headings from the KOKO ontology maintained by the Finnish thesaurus and ontology service Finto, which also has English and Swedish translations of the terms.
Field of Science: Select a field of science for your dataset from the drop-down menu. You can add multiple fields of science. Qvain uses the Ministry of Education and Culture’s official fields of science classification.
Dataset Language: Select the language of the dataset. You can add multiple languages. You can choose from the languages listed in the ISO 639-3 standard.
Other Identifiers: If your dataset already has a permanent identifier (usually a DOI) created in another service, enter it here. It will appear as a link in Etsin. NB! Qvain automatically generates a permanent identifier (URN or DOI) for your dataset. Enter the identifier here only if the dataset already has an identifier generated elsewhere.
Added Actors: Persons or organizations involved in the research or production of the dataset. A single actor can have several roles. At least one “Creator” and one “Publisher” must be added to the dataset. You can add actors by clicking the “Add new actor” button.
- First select the actor type: either a person or an organization.
- Then select the roles the actor has. You can select multiple roles if the actor has more than one role in the research or production of the dataset.
  - Creator: A person or organisation who originally produced the dataset.
  - Publisher: Actor that has permission to distribute the dataset or who has made the dataset available. Usually a research organization.
  - Curator: A person (or organisation) who is responsible for ongoing maintenance of the dataset and keeping it available. Data curators are specialists who collect, organize, clean and transform data to make it accessible for organizations and individuals.
  - Rights holder: A person or organisation who holds the copyright, neighboring rights or moral rights of the dataset; usually the author of the data or the organization of the author.
  - Contributor: Any other person or organisation that has contributed significantly to the creation of the dataset (not quite a creator, but assisted in the process of creating the dataset).
- Fill in all other relevant details.
  - Organization information is mandatory if the actor is a person.
  - Use identifiers (PIDs) if possible, for example an ORCID or an organization code.
  - If you give an email address to an actor, Etsin users are able to send messages via Etsin without seeing the actual email address.
You can edit or delete added actors by clicking the edit and delete icons.
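The actor rules above (at least one Creator and one Publisher, and an organization for every person) can be expressed as a short check. This is a minimal sketch under our own assumptions: the `Actor` class, its field names and the lowercase role labels are hypothetical and are not Qvain's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Actor:
    name: str
    kind: str                                  # "person" or "organization"
    roles: set = field(default_factory=set)    # e.g. {"creator", "curator"}
    organization: str = ""                     # mandatory when kind == "person"


def validate_actors(actors: list) -> list:
    """Return a list of problems; an empty list means the rules are met."""
    problems = []
    all_roles = set()
    for actor in actors:
        all_roles |= actor.roles
    for required in ("creator", "publisher"):
        if required not in all_roles:
            problems.append(f"At least one {required} is required.")
    for actor in actors:
        if actor.kind == "person" and not actor.organization:
            problems.append(f"{actor.name}: organization is mandatory for a person.")
    return problems


# Hypothetical actors, made up for this example.
actors = [
    Actor("Jane Doe", "person", {"creator", "curator"}, "Example University"),
    Actor("Example University", "organization", {"publisher"}),
]
print(validate_actors(actors))  # → []
```

Note that one actor can carry several roles, so a single organization could in principle be both the creator and the publisher.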
Publications: Refer to publications that are relevant in understanding this dataset. You can either enter the publication information manually or search for the publication in the Crossref Service (crossref.org) using the search function.
Other materials: Refer to other materials that are relevant to this dataset.
The relation types you can use when referring to other materials:
Relation type | Description |
Cites / Is cited by | The dataset cites / is cited by the other material |
Is supplement to | The dataset is a supplement to the other material (this dataset uses some material as its background) |
Has next version / Has previous version | The dataset is the next / previous version of the other material |
Has part / Is part of | The dataset has part / is part of the other material |
Is compiled by | The data is compiled by the other material |
Is identical to | The data / dataset is identical to the other material |
Continues | The dataset supplements / continues the other material |
References | The dataset references the other material |
Is variant form of | The data / dataset is the same as the other material but, for example, in a different format |
Was derived from | The data / dataset is derived from the other material |
Relation | Any other relation between the dataset and the other material |
Geographical area (spatial coverage): Area covered by the dataset, e.g. places of observations. You can add multiple areas.
Time period (temporal coverage): Time span that is covered by the dataset, e.g. period of observations. You can add multiple periods. The period is added by clicking the “Add temporal coverage” button. You can enter the information either by selecting a time period from the calendar or by entering the start and end dates directly in the fields in the format “dd.mm.yyyy”, e.g. “23.03.2021”.
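If you prepare the dates outside Qvain, the “dd.mm.yyyy” format can be checked in advance. Below is a minimal sketch using Python's standard library; the function name is hypothetical, and rejecting a period whose end date precedes its start date is our assumption.

```python
from datetime import date, datetime


def parse_period(start: str, end: str) -> tuple[date, date]:
    """Parse a start and end date given as dd.mm.yyyy, e.g. 23.03.2021."""
    fmt = "%d.%m.%Y"
    start_date = datetime.strptime(start, fmt).date()
    end_date = datetime.strptime(end, fmt).date()
    if end_date < start_date:
        # Assumption: a period must not end before it starts.
        raise ValueError("End date is before start date")
    return start_date, end_date


start, end = parse_period("23.03.2021", "31.12.2021")
print(start.isoformat(), end.isoformat())  # → 2021-03-23 2021-12-31
```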
Infrastructure: Services or tools that are used to produce the dataset.
- Note! At the moment infrastructures cannot be selected. New infrastructures will be taken into use as soon as possible and they will be integrated with Research.fi.
History and events (provenance): Event or activity that was the subject of the dataset.
Project and funding: A project in which the dataset was created.
- Add title for project: The title or name of the project. It is possible to add a title both in Finnish and in English. You must enter a title in at least one language.
- Project identifier: An unambiguous reference to the resource within a given context. Recommended best practice is to identify the resource by means of a string conforming to a formal identification system.
- Participating organizations: Organization(s) which are participating in the project. You can select an organization from the drop-down menu or enter the organization information manually.
- Funding:
  - Funder organization: You can select a funding organization from the drop-down menu or enter the information manually.
  - Funder type: You can select the funder type of the dataset from the drop-down menu.
  - Project funding identifier: Unique identifier for the project that is being used by the project funder.
Note! If the dataset is going to be digitally preserved in the Fairdata Digital Preservation Service, there are additional mandatory fields:
- Name (the filename by default) and Use Category for each file
- For CSV files: the file format text/csv (if not specified, the files are processed as text/plain), the character encoding and the delimiter
- For text files it is also recommended to specify the text encoding (if not specified, the service will try to determine it, but the detection is not fully reliable)
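Before filling in the technical details of a CSV file, you may want to check them against the file itself. The sketch below uses Python's standard `csv` module to guess the delimiter and confirms that the file decodes with the encoding you intend to state. The function name and the dictionary keys are hypothetical and are not field names used by Qvain or the Digital Preservation Service.

```python
import csv


def describe_csv(raw: bytes, encoding: str = "utf-8") -> dict:
    """Collect the technical details of one CSV file (illustrative only)."""
    # Decoding raises UnicodeDecodeError if the stated encoding is wrong.
    text = raw.decode(encoding)
    # csv.Sniffer guesses the delimiter from a sample; the guess can be wrong,
    # so check the result by eye.
    dialect = csv.Sniffer().sniff(text[:4096], delimiters=",;\t|")
    return {
        "file_format": "text/csv",
        "encoding": encoding,
        "delimiter": dialect.delimiter,
    }


# A small made-up sample file.
sample = "id;name;value\n1;alpha;3.5\n2;beta;4.1\n".encode("utf-8")
print(describe_csv(sample))
```

For a real file, read its contents with `open(path, "rb").read()` and pass the bytes to the function.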
SAVE DATASET AS A DRAFT
You can save the information you have filled in on the form as a draft by clicking “Save” at the bottom of the page. When you save a dataset as a draft, you can preview it in Etsin when logged in to Fairdata services. Drafts are not visible to other users in Etsin.
PUBLISH DATASET
Click “Publish” at the bottom of the page to publish your dataset. After publication, your dataset will appear in Etsin (etsin.fairdata.fi) and you can edit it in Qvain if you wish. Note that files can be added after publication only to cumulative datasets. In other cases, the files attached to the dataset cannot be changed except by creating a new version of the dataset.
The metadata of datasets published through Qvain is automatically CC0 licensed. You can, however, define your own license for the data itself in the “License” field in Qvain.
EDIT DATASET
You can edit the metadata of a published or draft dataset. To edit the dataset, click “Edit” next to the dataset you wish to edit in the dataset list on Qvain’s front page.
You can edit the metadata of your datasets freely. If you have attached files to your published dataset from IDA, you will not be able to delete or add files to your dataset without first making a new version of it. (The exception to this are cumulative datasets to which files can be added.)
To make a new version of your dataset, select “Create new version” from the “More” drop-down menu in the dataset list.
You can publish the changes made or save them as a draft. Changes to a dataset saved as a draft will be visible only to you when logged in to Etsin, whereas published changes appear in Etsin publicly to all users.
EDITING RIGHTS
By default, if a dataset is an IDA dataset (the data is stored in Fairdata IDA), all members of the CSC project that uses the IDA service have equal rights to edit the dataset: all project members can edit, publish and remove the dataset. They can also add and remove editing rights for other users.
If the data is not stored in IDA but is a “remote resource”, only the user who created the dataset has editing rights to it.
In the dataset list you can see all datasets which:
- You have created yourself (Owner: Me)
- You have editing rights to, either via project membership or by someone giving you individual editing rights (Owner: )
You have equal rights to edit all these datasets’ metadata, publish the changes, make new versions, add new editing rights and even delete the dataset. If the dataset is an IDA dataset and the data belongs to a CSC project you are not a member of, you can only see the included data but you cannot make any changes to the dataset.
If a new version of the dataset is created, the editing rights that have been given are carried over to the new version.
Using a dataset as a template for a new dataset does NOT copy the editing rights.
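The default rights and the difference between a new version and a template can be sketched as follows. This is a simplified illustration: the function names and the "ida" / "remote" labels are hypothetical, and the sketch ignores the case where a template-based dataset later gains project members through its own IDA project.

```python
def default_editors(creator: str, data_origin: str, project_members=()) -> set:
    """Users who can edit a dataset before any individual rights are added."""
    editors = {creator}
    if data_origin == "ida":
        # IDA datasets: every member of the CSC project can edit.
        editors.update(project_members)
    # Remote-resource datasets: only the creator has editing rights.
    return editors


def editors_of_copy(source_editors: set, copier: str, as_template: bool) -> set:
    """Editors of a dataset created from an existing one."""
    if as_template:
        # "Use as template" does not copy editing rights.
        return {copier}
    # A new version inherits the editing rights given on the old version.
    return set(source_editors) | {copier}


# Hypothetical user names, made up for this example.
editors = default_editors("anna", "ida", ["ben", "carl"])
print(sorted(editors))                                         # ['anna', 'ben', 'carl']
print(sorted(editors_of_copy(editors, "ben", as_template=True)))   # ['ben']
```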
Adding and removing editing rights
In addition to the default editing rights, you can add editing rights for individual users. In the dataset list click the button “Editors” and a modal window will open.
In the modal window you can either add a new user (Invite tab) or review/remove existing users (Members tab).
- Add a user (Invite tab) by starting to type their name: Qvain will auto-fill the user’s name once it finds a match. Click “Invite” to add editing rights for that user.
- Remove a user (Members tab) by selecting “Remove” from the Editor dropdown.
- You cannot remove the original creator of the dataset nor the users that have been added as editors via CSC-project membership.
CREATING A NEW VERSION
You can create a new version of a published dataset. To do so, click “More” next to the dataset in the dataset list on Qvain’s front page and select “Create new version” from the drop-down menu.
The new version of the dataset is visibly linked to the old version of the dataset in Etsin and it gets a new persistent identifier. The old version of the dataset will still be accessible through its persistent identifier.
Note: If you have attached files to your published dataset from IDA, you will not be able to delete or add files to your dataset without first making a new version of it. (The exception to this are Cumulative datasets to which files can be added.)
DELETING DATASET
You can delete both draft and published datasets. To delete a dataset, click “More” next to the dataset in the dataset list on Qvain’s front page and select “Delete” from the drop-down menu.
If you have more than one version of a dataset, each version is deleted separately.
Deleting a published dataset will remove it from Qvain, and Etsin’s search will no longer find it. The files included in the dataset can no longer be downloaded. The landing page of a published dataset will NOT be removed and can still be accessed using the dataset’s persistent identifier, but the landing page will show a notice saying “The dataset has been removed”.
Note: After deleting a dataset you can’t make changes to its metadata anymore, because it is removed from Qvain.