Adding datasets to Old Etsin

It is possible to add datasets to the old Etsin, but IDA files cannot be linked to them. Linking IDA files will become possible when the new metadata editor, Qvain, is ready. Datasets created in the old Etsin will be migrated to the new services by CSC when the new Fairdata.fi services officially go into production in 2019.

Datasets can be added to the new Etsin only through the Metax API. Please try the test environment first. Avoid creating redundant persistent identifiers, and note that datasets and identifiers will break and cannot be repaired if files are changed!

The description of your dataset can be published in Etsin in several ways:

A researcher can describe his or her data directly in the Etsin service. As part of the description, the metadata can be linked to the researcher’s home organisation.
The researcher can store the metadata in a discipline-specific or other data archive that is harvested by Etsin. The dataset is then preserved in the archive and is simultaneously discoverable in Etsin. This method is already used for the Finnish Social Science Data Archive and the Language Bank of Finland.
The researcher can store the descriptive metadata in his or her research organisation’s own service, for example a research data service, from which the description can be harvested to Etsin. Organisations are encouraged to propose harvesting of their metadata to Etsin to ensure better visibility and discoverability of the data.

To log in you need Haka credentials, which are given to persons affiliated with many higher education organisations in Finland. Once you are logged in, you can find the “Add dataset” button in the top right corner. This button takes you to the Add Dataset form. The form guides you in describing your dataset as thoroughly as possible, so that your metadata is of good quality. Next to the fields you will find a button marked with a question mark, “?”, which opens short instructions. We recommend reading these instructions and contacting us, or the information services or library at your own institution, if you have any further questions.

Describing a dataset

The description of the dataset, that is, its metadata fields, is divided into several tabs.
Some fields are compulsory and they are marked with an asterisk (*) following the field label. It is beneficial to describe your data as thoroughly as possible. You cannot save the record unless you have given some information to the fields marked with two asterisks (**).
Pressing the ‘?’ button next to any field will open instructions about it.
Suggestions for the fields describing the dataset’s languages, geographical coverage and keywords are retrieved from Finto, the Finnish thesaurus and ontology service.
Many features in the Add Dataset page require JavaScript to be enabled in your browser.
If you do not have a persistent identifier (for example a DOI) for your dataset, Etsin will allocate a URN identifier for you. This identifier should be used when citing your data, which means you should not make any changes to your dataset, at least not without documenting them in the data lifecycle events in Etsin.
On the Identifiers tab you can link datasets to each other. For instance, you can create a new dataset by making other datasets part of it.

1. Basic information

Dataset title*	Fill in the titles of the data in different languages. You can add more languages from the ‘+’ button and fill in the different titles by selecting the corresponding tab. You can also remove a language from an ‘x’ button in the tab. You must add at least one title.
Free description	Write a free description of the data. You can add more languages from the ‘+’ button and fill descriptions in different languages by clicking the corresponding tab. You can remove descriptions from the language tabs’ ‘x’ button. The description must be added in at least one language.
Languages*	Fill in the language or languages used in the data in the text field. When you start typing, you will receive automatic suggestions. Select the suggestion you want by clicking it, or by selecting it with the arrow keys and then pressing Enter. You can remove an inserted language with the ‘x’ button. Alternatively, you can type the languages yourself as ISO 639 codes separated by commas, for example “eng, fin, swe”. If the data does not contain linguistic data, you can select the checkbox ‘This dataset contains non-textual data’. The automatic suggestions are retrieved from the Finto ontology service (http://finto.fi/en/).
Keywords*	Select and insert keywords which describe the data. When you start typing, you will receive automatic suggestions. Select the desired suggestion by clicking it, or by selecting it with the arrow keys and then pressing Enter. The suggested keywords are retrieved from the Finto service, from the KOKO ontology.
Disciplines	Select one or more scientific disciplines whose data collection has produced the dataset. When you start typing, you will receive automatic suggestions. The suggested disciplines are retrieved from the Finto service’s okm-tieteenala ontology.
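
If you prepare the comma-separated language list for the Languages field outside the form, a small script can tidy it up first. The following is a minimal sketch, not part of Etsin: the function name parse_language_codes is hypothetical, and it assumes three-letter codes as in the example “eng, fin, swe”. The form itself may accept other ISO 639 variants, and the sketch does not check that a code is actually assigned to a language.

```python
import re


def parse_language_codes(raw: str) -> list[str]:
    """Split a string such as "eng, fin, swe" into a list of language codes."""
    codes: list[str] = []
    for part in raw.split(","):
        code = part.strip().lower()
        if not code:
            continue  # ignore empty pieces, e.g. after a trailing comma
        # Assumption: only three-letter codes are accepted, as in the example.
        if not re.fullmatch(r"[a-z]{3}", code):
            raise ValueError(f"not a three-letter language code: {part.strip()!r}")
        if code not in codes:  # drop duplicates but keep the original order
            codes.append(code)
    return codes


if __name__ == "__main__":
    print(parse_language_codes("eng, fin, swe"))  # prints ['eng', 'fin', 'swe']
```

The cleaned codes can then be joined with ", " and pasted into the field.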

2. Actors

Authors*	The person or organisation that originally created the dataset should be added as author. You can give an identifier (e.g. ORCID) for the author. You can add more authors by pressing the ‘+’ button and remove authors by clearing their fields. For each author you must give at least a name or an organisation.
Contributors	Persons who have contributed significantly in creating the dataset can be added as contributors with their organisation information. You can give an identifier (e.g. ORCID) for the contributor. You can add more contributors by pressing the ‘+’ button and remove contributors by clearing their fields.
Distributor*	A person or an organisation that has the right to distribute the dataset should be added as distributor. You can give an identifier (e.g. ORCID) for the distributor. You can add more distributors by pressing the ‘+’ button and remove distributors by clearing their fields. If you are using the default Resource Entitlement System (Reetta) the distributor needs to have Haka credentials and log in to Reetta to activate the service.
Project	If the dataset was produced in an externally funded project, choose “This dataset was produced in a project.” and fill in the project’s name, funder, funding identifier and project homepage. You can add more funders by pressing the ‘+’ button and remove funders by clearing their fields.
Owner information*	A person or an organisation who decides upon the use of the dataset should be added as owner. Type the owner’s name (firstname familyname) or identifier (e.g. ORCID). Also select an organisation to which this dataset will be linked. When you start typing, you will get suggestions for matching organisations. Select the correct organisation with a mouse click, or by highlighting it and pressing Enter. If you cannot find your organisation in the selection list, you can add a new organisation by writing its name in this field. Once the dataset is published, it is visible on the organisation’s page. The organisation’s editors can also modify this dataset.

3. Access information

Dataset is available for use*	Choose how the data can be accessed. For the two uppermost choices, type a URL address where the dataset can be found.
License*	Choose a license from the dropdown menu. A license defines how your dataset can be used and for what purposes.
Copyright notice	You may provide detailed information about copyright notice, use constraints, referencing etc. Note that the field is required if license is set to License Not Specified or any variant of Other. In that case the field can be used to input the actual rules describing the dataset’s reuse, i.e. the license text.

4. Additional information

Spatial coverage	You can specify the geographical areas or locations that your dataset pertains to. This could be for example the location where the data has been collected, or the region where the research subjects live. Once you start typing, you will automatically get suggestions for matching geographical names. You can then select the appropriate location from the list of suggestions with a mouse click, or you can enter one of your own. You can remove a location by clicking the ‘x’ next to it. Suggestions are retrieved from the Finto vocabulary service (see http://finto.fi/en/).
Temporal coverage	Temporal coverage gives the start and end of the time period during which your data was collected, or the time period that your data relates to. Click the calendar icon to select a date with a date picker tool. If you wish to enter only a year or a month, you can do so by typing it directly, e.g. 2015 or 2015-01. You can optionally enter a time with timezone information by choosing “use exact time”. Note that this tool requires JavaScript to be enabled in the browser.
Dataset lifecycle events	You can describe the lifecycle of your data by adding important events related to it. A single event consists of four fields: event, by whom, when and description. Fill in all four fields when adding an event. You can add more events by clicking the ‘+’ button. To remove an event, leave its contents empty.
Most recent data modification date	The modification date tells when the data has last been modified. Click the calendar icon to open a calendar tool to choose the modification date for this dataset. Time can be entered if “use exact time” is chosen.
Technical details of the data file	You can add technical details of the file that contains your data. MIME type: the type of the file’s contents; a full list can be found at http://www.iana.org/assignments/media-types. Format: the file format. Checksum and algorithm: the checksum and the algorithm used to compute it, which can be used to verify the integrity of the data file.
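
The values for the technical details can be computed on your own machine before you fill in the form. The following is a minimal sketch using only the Python standard library. The function name describe_file, the default algorithm (SHA-256) and the way the format is derived from the file extension are assumptions made for illustration; Etsin does not prescribe them, and the MIME type is only guessed from the file name.

```python
import hashlib
import mimetypes
import tempfile
from pathlib import Path


def describe_file(path: str, algorithm: str = "sha256") -> dict:
    """Collect MIME type, format, checksum and algorithm for one file."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so that large data files do not fill memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    # guess_type looks only at the file name, not at the file contents.
    mime_type, _encoding = mimetypes.guess_type(path)
    return {
        "mime_type": mime_type or "application/octet-stream",
        # Assumption: the upper-cased file extension stands in for the format.
        "format": Path(path).suffix.lstrip(".").upper(),
        "algorithm": algorithm,
        "checksum": digest.hexdigest(),
    }


if __name__ == "__main__":
    # Demonstration on a throwaway file; point describe_file at your own data.
    with tempfile.TemporaryDirectory() as folder:
        sample = Path(folder) / "sample.txt"
        sample.write_bytes(b"hello")
        for field, value in describe_file(str(sample)).items():
            print(f"{field}: {value}")
```

Anyone who later downloads the file can run the same function with the same algorithm and compare the checksum with the one stored in the metadata. A mismatch means the file has changed.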

5. Identifiers

Identifier*	The dataset should have a unique permanent identifier. If you leave the field empty, Etsin will generate a new permanent URN identifier for the data.
Related data (identifiers)	If you have several identifiers for the dataset, you can add identifiers to these fields.