Text versions of the videos – Introduction to the Research Data Storage Service IDA

Introduction to the Research Data Storage Service IDA, part 1/2

Link to the video version of part 1.

General introduction to the service

  • This tutorial introduces you to the main features of the research data storage service IDA. IDA is a continuous service organised by the Ministry of Education and Culture. It offers safe storage for research data as part of the Fairdata services.
  • IDA enables publishing the data that has been stored in the service. Data stored in IDA can be described as a dataset, published and opened for others to download with the other Fairdata services. More information is available at www.fairdata.fi.
  • IDA can be used with a new web browser UI and with command line tools. This tutorial shows you the web browser UI. The web browser UI is user-friendly and includes help texts. The full user guide is available on the service website.
  • Every IDA user belongs to an IDA project group. Each IDA project has a manager, who adds and removes project members. It’s important to make agreements on the data stored in the service and on its use within the project group before starting to use the service.
  • All members of an IDA project group have equal rights to add, remove and perform other actions on all of the data stored in the project’s IDA space.
  • One IDA project can have users from several organisations.
  • IDA can also be used to store data during the active phase of the research, when data are still being collected and some of them may be overwritten or moved to another folder in the service. The main feature of the service is storing data that the user has marked as frozen in an immutable state, and enabling the publishing of the frozen data.
    • IDA does not enable editing the content of files that are stored in the service, so it’s not designed for the active processing of data.
    • IDA is also not meant for storing classified data. This includes for example the special categories of personal data that are defined in the GDPR.
    • Also note that IDA’s frozen area is not the same as Fairdata PAS / Digital preservation for Research Data (http://digitalpreservation.fi/). IDA offers bit level data preservation. Digital preservation aims to preserve the stored data basically forever and it utilizes for example file format conversions to ensure the usability of the data.
  • The second part of the tutorial shows you how to use the IDA service.

Introduction to the Research Data Storage Service IDA, part 2/2

Link to the video version of part 2.

This video shows you in more detail how to start using Fairdata service IDA and how to use IDA’s browser UI.

Becoming a user and logging in

  • After the research group has chosen a manager for the IDA project group, the project manager creates a CSC account and project in CSC’s customer portal.
  • IDA storage space is applied for with an e-form in CSC’s customer portal. The IDA contact person of the applicant’s home organisation processes and approves the application, or the application may be approved based on an Academy of Finland funding decision. The project manager can invite users to the IDA project. Invited users need to create CSC accounts to use the IDA service.
  • IDA’s web user interface is available at ida.fairdata.fi.
  • You can log in with your CSC account credentials or with Haka.

Storing data

  • After logging in, you will see your project folders. There are two folders for each project: one for adding and organising data, and one for storing the data in an immutable state.
  • Data can be added to the project folder which ends in a plus sign (+). This is the project’s Staging area.
  • After navigating to that folder, data can be added either through the plus icon at the top of the view or simply with drag and drop. This enables adding entire folders and their contents to the service.
  • Another way to add the contents of a folder to the service is to create a new folder in the service and then select, for example, all the files in a folder by pressing Ctrl+A.

Organising data

  • Files should be organised in the service in a logical folder structure. Use meaningful and unique file and folder names. Agree on a system for naming files and folders. Note that when the data files are linked to a dataset and published, the file names are always public.
  • If the project data consists of, for example, thousands of small files, it’s recommended to zip them into bigger file packages before transferring them to the service. However, if the size of a file package is tens or hundreds of gigabytes, transferring and using the data may be difficult. Consider what kind of file sizes are optimal for downloading and for using the data.
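
The packaging advice above can be sketched in Python. This is only an illustration and not part of the IDA service or its tools: the function name pack_in_batches, its parameters and the package naming scheme are hypothetical, and the size limit is counted from the uncompressed file sizes. A suitable limit depends on how the data will be downloaded and used.

```python
import zipfile
from pathlib import Path


def pack_in_batches(source_dir, output_dir, max_bytes):
    """Zip the files under source_dir into packages of roughly max_bytes each.

    Hypothetical helper for illustration. Sizes are uncompressed sizes, and
    output_dir should be outside source_dir. Returns the created zip paths.
    """
    source = Path(source_dir)
    output = Path(output_dir)
    output.mkdir(parents=True, exist_ok=True)

    # Sort the files so that the package layout is reproducible.
    files = sorted(p for p in source.rglob("*") if p.is_file())

    packages = []
    archive = None
    current_size = 0
    for path in files:
        size = path.stat().st_size
        # Start a new package when the next file would exceed the limit.
        if archive is None or (current_size > 0 and current_size + size > max_bytes):
            if archive is not None:
                archive.close()
            package_path = output / f"{source.name}_part{len(packages) + 1:03d}.zip"
            archive = zipfile.ZipFile(package_path, "w", compression=zipfile.ZIP_DEFLATED)
            packages.append(package_path)
            current_size = 0
        # Store paths relative to source_dir so the folder structure is kept.
        archive.write(path, arcname=path.relative_to(source).as_posix())
        current_size += size
    if archive is not None:
        archive.close()
    return packages
```

For example, pack_in_batches("measurements", "packages", max_bytes=2 * 1024**3) would produce packages of about two gigabytes of input data each, named measurements_part001.zip and so on. A single file larger than the limit still gets a package of its own.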

Freezing data

  • When your data files are ready to be stored in an immutable state, you can freeze them by clicking the snowflake icon next to a file or a folder and by clicking “Freeze”. The selected data is moved to your other project folder, which does not end in a plus sign. The frozen data can’t be overwritten or moved to another folder.
  • When you freeze your data, the service performs several operations to ensure that your data is stored persistently. These include generating a file replica on another physical medium and calculating file checksums to ensure the integrity of the stored files. This is done as a background operation, which may take hours depending on the amount of frozen data. You can view the progress of the initiated background process in the “Actions” tab.
  • The frozen data can be described with Qvain and published in Etsin. This gives the dataset a persistent identifier. The published data can be set as openly downloadable or downloadable with a use permission. It’s also possible to publish only the dataset metadata. Remember that publishing your dataset with a persistent identifier is a promise to others that your dataset will not change and that it’s safe to cite it.
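
To illustrate what a file checksum is, here is a minimal Python sketch that calculates one for a local file, for example to keep a record of it before transferring the file to the service. The tutorial does not say which checksum algorithm IDA uses, so SHA-256 is an assumption here, and the function name sha256_of_file is hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file as a hexadecimal string.

    SHA-256 is an assumed algorithm choice for this illustration.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read the file in chunks so that large files do not fill the memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If two copies of a file have the same checksum, their contents are in practice identical. A changed checksum shows that the file has been altered or corrupted.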

Using temporary share links

  • Data can also be shared outside the service with temporary share links. The share links are valid for a maximum of 30 days. The temporary share links are specific to each IDA user and they cannot be seen by other project members.
  • To share your research data with a permanent download link, you need to describe the data with Qvain and publish it in Etsin, so that the defined dataset gets a persistent identifier, which enables citations.
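
The 30 day limit can be checked with simple date arithmetic when planning how long a recipient has to download the data. The sketch below is only an illustration: the names MAX_SHARE_DAYS and latest_expiry are hypothetical, and it assumes that the validity period is counted in whole days from the creation date, which the tutorial does not specify.

```python
from datetime import date, timedelta

# Maximum validity of a temporary share link, as stated in this tutorial.
MAX_SHARE_DAYS = 30


def latest_expiry(created, days=MAX_SHARE_DAYS):
    """Return the expiry date of a share link created on the given date.

    Assumes whole days counted from the creation date (an assumption).
    """
    if not 1 <= days <= MAX_SHARE_DAYS:
        raise ValueError(f"days must be between 1 and {MAX_SHARE_DAYS}")
    return created + timedelta(days=days)
```

Under these assumptions, a link created on 1 January 2024 with the maximum validity would expire on 31 January 2024.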

More information about freezing and unfreezing

  • Note that files are not stored persistently in the IDA service until they are frozen. The IDA service performs preservation actions on the data that has been marked as frozen. The frozen files are visible to other Fairdata services, and for that reason they can be included in a dataset with Qvain.
  • If you decide to unfreeze your data, note that it cancels the preservation actions that were performed on the data when it was frozen. This means that if you have described and published your frozen data, the published dataset becomes deprecated when the data is unfrozen. Its persistent identifier becomes deprecated and the citations to the dataset are no longer complete, because the data are no longer available. If you have received, for example, a URN identifier for your dataset, don’t make changes to the data included in the dataset, because the deprecation of a dataset can’t be reversed.
  • Please note that deleted data will be permanently and irretrievably removed from the IDA service.