A Fair(y)data service user tale

You may have heard this story before… It was originally published on the MetaX blog in March 2018. Back then, the story presented the idea of a user’s journey through the Fairdata services. Now it’s reality. There are a few minor changes to the original story, as the services have matured.

Buckle up: Once upon a time…

1. There are three wise researchers from the Universities of Turku and Tampere and from the Jyväskylä University of Applied Sciences. They have gathered amazing data about one special flea species that lives on house sparrows. They (the researchers, not the sparrows) are now finalizing an article and want to include a data citation, to give their data the visibility it deserves. Therefore, they need a persistent identifier for their dataset.

The researchers have a shared storage space in the far-famed IDA service. To gather the data, they use their IDA project’s staging area, which is a folder with full editing rights for all project members. Each researcher, of course, uses sensible file names and well-organised folder structures to make it easy to keep track of data files.

However, when the researchers are ready to publish their final results, they feel that they could reorganize their data once more. No worries: all project members are free to rename and rearrange data in the staging area. After deciding to publish the sparrow-flea data, the project members carefully arrange it under one root folder in the staging area. Once they are happy with the new folder structure and file names, one of the researchers selects the root folder of the finished data and clicks the “Freeze” button.

Photo by Dollar Gill on Unsplash

2. The freezing feature moves all data under the chosen root folder to the project’s frozen area and makes it read-only. In a background operation, the file metadata are stored in the Metax metadata warehouse, which makes them available to other services in the Fairdata ecosystem. The other two researchers check the files in the project’s frozen area and download them to their own computers. They both see that it is the final version of the data and everything’s good to go. The researchers are now ready to publish the data. Hooray!
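For readers who like to see the mechanics, the freezing step can be sketched as a toy model. This is not IDA's or Metax's real code or API: the `Project` class, the `freeze` function and the plain dictionary standing in for Metax are all hypothetical, and the read-only rule is only noted in a comment rather than enforced.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Toy stand-in for an IDA project (hypothetical, not the real data model)."""
    staging: dict = field(default_factory=dict)  # path -> size in bytes, freely editable
    frozen: dict = field(default_factory=dict)   # path -> size in bytes, treated as read-only


def freeze(project, root, metadata_store):
    """Move every file under `root` from staging to frozen and record its metadata."""
    prefix = root.rstrip("/") + "/"
    moved = sorted(path for path in project.staging if path.startswith(prefix))
    for path in moved:
        size = project.staging.pop(path)
        project.frozen[path] = size
        # In the story this happens in a background operation and ends up in Metax;
        # here a plain dict plays that role.
        metadata_store[path] = {
            "path": path,
            "name": path.rsplit("/", 1)[-1],
            "size": size,
        }
    return moved


project = Project(staging={
    "/sparrow-flea/raw/counts.csv": 2048,
    "/sparrow-flea/README.txt": 512,
    "/drafts/notes.txt": 100,
})
metax = {}  # hypothetical stand-in for the metadata warehouse
frozen_paths = freeze(project, "/sparrow-flea", metax)
```

Only the files under the chosen root folder move; the draft notes stay in the staging area, where the project members can keep editing them.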

3. One of the researchers logs into Qvain. She is presented with a metadata editor where she can fill out metadata about the dataset she’s about to publish. She fills in the required fields and even adds geospatial data about the locations where the data was gathered. “Pretty neat”, she thinks and selects the “Files” tab in Qvain.

4. The researcher is now presented with a file system view similar to the one she has in IDA. One big difference is that she only sees the data that is in the project’s frozen area. The data that the researchers had stored in the staging area are not visible. The file picker is actually not showing IDA, but the file metadata (file path, name, size, etc.) that were stored in Metax when the files were frozen in IDA. The researcher selects the root folder of the frozen data, which automatically selects all files and subfolders under it. In the “Rights and licenses” tab, she sets the dataset’s “Access type” to “Open”. This means that once the dataset’s metadata are published, anyone browsing the dataset can download the files linked to it to their own computer.

5. The researcher is a bit unsure about what license they should use for the dataset. She hits “Save” (and not “Publish”), which saves a local copy of the dataset description in Qvain. She goes to talk to her colleague in the next room. The colleague tells her that Qvain’s default license, CC-BY-4.0, is a good and recommended option for research data.

6. The researcher is happy with the way the dataset description looks and clicks the “Publish” button. She is presented with a link to the Etsin research data finder to view the published data. What she does not see is that the dataset metadata and the links to IDA file metadata have now been stored in Metax. Etsin shows the dataset metadata, including links to the file metadata that Metax knows about. However, metadata about files in IDA’s frozen area that are not linked to any dataset are neither shown nor searchable in Etsin.
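The visibility rule in this step can be pictured with a small sketch. Everything in it is made up for illustration: the file identifiers, the dataset title and the `files_shown` function are hypothetical, and real Metax records hold far more than a path.

```python
# Hypothetical file metadata for everything in the project's frozen area.
frozen_files = {
    "file-1": "/sparrow-flea/raw/counts.csv",
    "file-2": "/sparrow-flea/README.txt",
    "file-3": "/old-pilot/measurements.csv",  # frozen, but linked to no dataset
}

# Hypothetical published dataset with links to two of the frozen files.
published_datasets = [
    {"title": "Fleas of the house sparrow", "file_ids": ["file-1", "file-2"]},
]


def files_shown(datasets, files):
    """Return the paths of files linked from at least one published dataset."""
    linked = {file_id for dataset in datasets for file_id in dataset["file_ids"]}
    return sorted(path for file_id, path in files.items() if file_id in linked)


shown = files_shown(published_datasets, frozen_files)
```

The pilot measurements are frozen and their metadata exist, yet they never appear in `shown`, because no published dataset links to them.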

7. The researcher clicks the link that takes her to Etsin and sees a page that is called a dataset landing page. The page shows the metadata and the file download links that she created using Qvain. Next to the information about the data files, there’s a button that says “Download all”. The researcher clicks the button and her browser starts to download the files. When she clicks the “Download all” button, the information about the dataset and the included files is sent to Metax to check the access rights and identifiers. If all checks pass, Metax tells the Fairdata download component that it is ok to proceed with the download.
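The check behind the “Download all” button might look roughly like the sketch below. It is a deliberate simplification under stated assumptions: the `authorize_download` function and the record layout are hypothetical, only the “Open” access type from this story is modelled, and the real services may check considerably more.

```python
def authorize_download(dataset, requested_ids):
    """Allow a download only for an open dataset and only for files linked to it."""
    # Access rights: this sketch knows a single access type, "open".
    if dataset.get("access_type") != "open":
        return False
    # Identifiers: every requested file must be linked to the dataset.
    linked = set(dataset.get("file_ids", []))
    return bool(requested_ids) and all(file_id in linked for file_id in requested_ids)


# Hypothetical dataset record with two linked files.
dataset = {"access_type": "open", "file_ids": ["file-1", "file-2"]}

ok = authorize_download(dataset, ["file-1", "file-2"])      # both files are linked
denied = authorize_download(dataset, ["file-1", "file-9"])  # "file-9" is not linked
```

Only when the function returns `True` would the download component in this toy world start sending files.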

8. The researcher now sees that anyone can download the data to their own computer and find out how to use and cite it. Great!