FAQ | Fairdata

Content

Fairdata Services in General
IDA – Research Data Storage
Etsin – Research Data Finder
Qvain – Research Dataset Description Tool
Metax – Metadata Warehouse
Digital Preservation Service for Research Data

1. Fairdata Services in General

1.1 What are the Fairdata services and why should I use them to manage my research data?

1.2 What is the development status of the services?

1.3 Who are the Fairdata services for? Can I use them?

1.4 I'm from a Finnish higher education institution/state research institute and want to start using the services. How do I get started?

1.5 Is international collaboration possible when using the Fairdata services?

1.6 Why do I need a PID for my published dataset, what is it?

A persistent identifier (PID) is a unique and unambiguous machine-readable name for an object, in this case a specific research dataset. It is also a permanent link that will always take you to the landing page of the dataset, where the description and, for example, the license of the dataset can be found. A PID is usually either a DOI or a URN. These identifiers are provided by two different systems, and you can tell which one you have from its first letters.
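As a minimal sketch of recognizing the identifier type from its first letters, the shell function below checks the prefix of a PID string. It assumes the common written forms in which a DOI starts with "doi:" (or the "https://doi.org/" resolver address) and a URN starts with "urn:". The function name pid_type and the sample identifiers are made up for illustration and do not refer to real datasets.

```shell
#!/bin/sh
# Classify a persistent identifier by its leading characters.
# pid_type is a hypothetical helper; the identifiers below are invented.
pid_type() {
  # Lowercase the input so that "DOI:" and "doi:" are treated alike.
  id=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$id" in
    doi:*|https://doi.org/*) echo "DOI" ;;
    urn:*)                   echo "URN" ;;
    *)                       echo "unknown" ;;
  esac
}

pid_type "doi:10.1234/example-dataset"   # prints DOI
pid_type "urn:nbn:fi:example-1234"       # prints URN
```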

When your dataset has a PID, which is allocated by the Fairdata services, you can use it in data citation. All citations can be traced back to you, and the link will always take you to a landing page, even if the data is no longer available or the services have moved or changed over time. Using persistent identifiers is one of the cornerstones of the FAIR principles.

If the dataset is available from a source that is outside the Fairdata services, you can create a dataset using Qvain and get a URN for the landing page in Etsin. When the data is outside the Fairdata services, you are responsible for the integrity of the data yourself.

Editing the metadata of a dataset that has already been published with Qvain does not create a new PID for the dataset. However, changing the files or folders included in the dataset requires creating a new version of the dataset, with a new PID. An exception to this is cumulative datasets, to which files can be added. A new PID is allocated when the data included in the dataset changes, to ensure data integrity. A PID is a promise, and it should always give the user a possibility to access or at least find information about a specific dataset. Always consider reproducibility.

If you delete (or unfreeze) files linked to a published dataset in IDA, the published dataset is shown as deprecated in Etsin, because the files originally linked to the published dataset are no longer available. If you wish, you can create a new version of the dataset and link new files to it. The landing page of the new version and files linked to it will be shown in Etsin normally.

1.7 What kind of support and training is available? Can I test or demo the services?

1.8 I want to share my research data, what should I do?

2. IDA – Research Data Storage

2.1 General questions about the IDA service

2.1.1 How long can data be stored in the IDA service?

2.1.2 Who is entitled to use IDA?

2.1.3 How does one become an IDA user?

2.1.4 What is a CSC project and a Project Manager?

2.1.5 I wish to add/remove a project member, how can I do that?

2.1.6 I'm changing my home organisation, can I still use IDA?

2.1.7 Can a foreign research associate be given access to IDA?

2.1.8 Can students use IDA, for example when working on their thesis?

2.1.9 How can we change the IDA Project Manager?

2.1.10 Can the home organisation or its faculty claim access to the data in IDA service if needed?

2.2 Questions about using IDA and the data stored in IDA

2.2.1 What kind of data is IDA meant for? Why is IDA not suitable for all research data?

2.2.2 What are the different IDA user interfaces suitable for?

2.2.3 How can I share a file stored in IDA outside the service?

2.2.4 How fast are the file transfers to IDA?

2.2.5 How can I copy a file stored in IDA to CSC's supercomputer? I don't want to download the file to my own computer first.

The IDA CLI tools are available to you on Puhti and Mahti, and can be used for uploading, modifying, and downloading content in the staging area as well as downloading content in the frozen area of your project.

Example: To download a particular file from the staging area of project 2001234 with the relative pathname ‘/somefolder/somefile.txt’ to the local filename ‘file_on_puhti.txt’ in the current directory on Puhti, you would use the command:

ida download -p 2001234 /somefolder/somefile.txt file_on_puhti.txt

The IDA CLI tool will prompt you for your IDA credentials if you have not already defined them, for example in your .netrc file.

Example: To download a particular file from the frozen area of project 2001234 with the relative pathname ‘/somefolder/somefile.txt’ to the local filename ‘file_on_puhti.txt’ in the current directory on Puhti, add the parameter ‘-f’, which indicates that the relative pathname corresponds to the frozen area of the project:

ida download -p 2001234 -f /somefolder/somefile.txt file_on_puhti.txt

See the online IDA CLI guide for more details about what you can do with the CLI tools and for additional examples of the most common operations.
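If you need several files, the two download commands above can be repeated in a loop. The following is only a sketch under stated assumptions: it uses nothing beyond the ‘ida download’ command, the ‘-p’ parameter and the ‘-f’ parameter shown in the examples, and the function name fetch_all, the DRY_RUN switch and the file pathnames are hypothetical. With DRY_RUN set, the commands are printed instead of run, so you can check them first.

```shell
#!/bin/sh
# Download several files from one IDA project by repeating the
# "ida download" command from the examples above.
# The project number and pathnames are placeholders.
PROJECT=2001234

# fetch_all AREA PATHNAME...
#   AREA is "staging" or "frozen"; "frozen" adds the -f parameter.
#   Each file is saved in the current directory under its own name.
#   If DRY_RUN is set to a non-empty value, commands are only printed.
fetch_all() {
  area=$1
  shift
  for remote in "$@"; do
    local_name=$(basename "$remote")
    if [ "$area" = "frozen" ]; then
      ${DRY_RUN:+echo} ida download -p "$PROJECT" -f "$remote" "$local_name" || return 1
    else
      ${DRY_RUN:+echo} ida download -p "$PROJECT" "$remote" "$local_name" || return 1
    fi
  done
}

DRY_RUN=1   # remove this line to actually run the downloads
fetch_all staging /somefolder/somefile.txt /somefolder/otherfile.txt
fetch_all frozen /somefolder/somefile.txt
```

The loop stops at the first failed download, so a missing file or a mistyped pathname does not go unnoticed among the rest.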

2.2.6 Why do I need separate command line tools for IDA? Why can't I mount the storage space directly and use mkdir, cp, mv etc.?

2.2.7 How can I check if a file was successfully uploaded to IDA?

2.2.8 I froze a large amount of data in IDA, and the action is still pending. Can the data be included in a dataset description in Qvain already?

3. Etsin – Research Data Finder

3.1 Who can use Etsin?

3.2 Where does Etsin get its data from?

3.3 What does metadata harvesting mean, and how do I enlist for harvesting?

4. Qvain – Research Dataset Description Tool

4.1 General questions about the Qvain tool

4.1.1 Why use Qvain?

4.1.2 Can I use Qvain?

4.2 Questions about using Qvain and the metadata saved in Qvain

4.2.1 What happened to my datasets in old Etsin when Qvain was opened?

4.2.2 What do I need to consider when I want to publish data?

4.2.3 Do I have to use IDA for file storage if I want to publish a dataset with Qvain?

4.2.4 Why can't I see all my datasets in Qvain?

4.2.5 I made changes to my dataset. Why are the changes not visible in Etsin?

4.2.6 Why do I need to create multiple versions of the same dataset?

4.2.7 Can I get a persistent identifier (DOI or URN) and add data afterwards?

5. Metax – Metadata Warehouse

5.1 What is Metax, where can I see it?

6. Digital Preservation Service for Research Data

6.1 What is Digital Preservation Service for Research Data?

6.2 What's the difference between long-term storage and digital preservation?

6.3 How does one become a partner organization of the Digital Preservation Service for Research Data?

6.4 What kinds of datasets are eligible for digital preservation?

6.5 Who carries out the appraisal and selection of datasets for digital preservation?

6.6 How does one package datasets to the Digital Preservation Service?

6.7 Where can I find more information about the Digital Preservation Service for Research Data?

If you have any questions about particular features, or any other issue relating to the services, please contact CSC customer support at servicedesk@csc.fi