You should think about how you are going to organise your files and folders from the outset. Files can very quickly become disorganised and unmanageable if file names and folder structures are not kept consistent and logical. Well organised files and folders make it easier to locate and retrieve your data, saving you time and frustration.
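For instance, you might agree a simple top-level hierarchy at the start of a project and reuse it everywhere. The sketch below uses Python's standard pathlib module to create one such layout; the folder names are purely illustrative, not a prescribed standard.

```python
from pathlib import Path

# Hypothetical top-level layout for a research project; the names are
# illustrative rather than a prescribed standard.
FOLDERS = [
    "data/raw",        # original data, ideally kept read-only
    "data/processed",  # cleaned or transformed data
    "docs",            # readme files, consent forms, protocols
    "scripts",         # analysis code
    "outputs",         # figures, tables, reports
]

project = Path("my_project")
for folder in FOLDERS:
    (project / folder).mkdir(parents=True, exist_ok=True)
```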
Naming files in a consistent and logical way will help you to distinguish between similar files and make finding your data easier. To ensure consistency and avoid confusion you should choose a system of naming conventions at the outset of your project and stick with it.
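A common convention is to encode a date, a short project code, a content description and a version number in every file name. As a sketch, assuming one such (hypothetical) convention, file names can be checked automatically with a regular expression:

```python
import re

# Hypothetical convention: YYYY-MM-DD_project_description_vNN.ext
# (ISO date, short project code, hyphenated description, two-digit version).
PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}_[a-z0-9]+_[a-z0-9-]+_v\d{2}\.[a-z0-9]+")

def follows_convention(filename: str) -> bool:
    """Return True if a file name matches the agreed convention."""
    return PATTERN.fullmatch(filename) is not None

print(follows_convention("2024-03-01_survey_interview-notes_v01.docx"))  # True
print(follows_convention("final FINAL notes (2).docx"))                  # False
```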
Your files will probably go through various drafts and versions. If you are engaged in collaborative research your files may be revised by more than one person. How will you keep track of who made which changes or identify which is the current or final version? Version control allows you to manage and record the changes your documents go through as they are redrafted and amended.
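For documents, a simple approach is a version suffix in the file name (v01, v02, ...) together with a version table recording who changed what; for code and other plain-text files, a version control system such as Git is the standard tool. As a small sketch, assuming the hypothetical _vNN convention above, the next version suffix can be generated automatically:

```python
import re

def next_version(filename: str) -> str:
    """Increment a _vNN suffix, e.g. 'chapter1_v02.docx' -> 'chapter1_v03.docx'."""
    match = re.search(r"_v(\d{2})(?=\.[^.]+$)", filename)
    if match is None:
        raise ValueError(f"no _vNN suffix found in {filename!r}")
    bumped = int(match.group(1)) + 1
    return f"{filename[:match.start()]}_v{bumped:02d}{filename[match.end():]}"

print(next_version("chapter1_v02.docx"))  # chapter1_v03.docx
```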
Data documentation provides information about how and why data files were created, their content and structure, and what processes and transformations the data have undergone during the lifetime of the project. Data documentation also provides information that enables the data to be accessed and interpreted by future users.
A crucial part of making data user friendly, shareable and with long lasting usability is to ensure they can be understood and interpreted by any user. This requires clear data description, annotation, contextual information and documentation.
– Document Your Data, UK Data Service
Metadata is sometimes defined as "data about data" or "information about data". Although the terms documentation and metadata are sometimes used interchangeably, metadata is also used in the more restricted sense of structured information that is both human and machine readable.
Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource.
– Understanding Metadata, NISO, 2004
Providing adequate documentation and metadata for your data is essential. Documentation and metadata add context to your data and provide information necessary for its discovery and reuse. Adding metadata makes it easier for you to find and understand your own data, as well as enabling your data to be accessed and shared, where appropriate, with others.
Exhaustive documentation and metadata compliant with your discipline's standards and schemas will make your data Findable, Accessible, Interoperable and Reusable – in a word, FAIR.
Without adequate documentation and metadata your data are potentially meaningless. Ideally, you should begin documenting your data as it is created or collected rather than leaving it until the end of the project.
The amount of documentation and metadata needed will vary according to the type of data being described and the level of description. For most datasets you will usually be required to provide at least some basic descriptive information. Ideally, you should provide enough contextual information to allow others to discover, understand, access and reuse your data.
... the metadata must be sufficient to allow others to understand what research data exists, why, when and how it was generated, and how to access it.
– EPSRC, Clarifications on Research Data Management
Source: Archaeology Data Service, Guide to Good Practice.
Source: Van den Eynden, V. et al. (2012) Managing and Sharing Data, UK Data Archive, p. 9.
If you deposit your data in a repository you will almost certainly be expected to provide a minimal amount of project level metadata, and some repositories might ask you to add file level descriptions as well.
For example, researchers who wish to deposit their data with the King's RDM System are asked to fill in a Data Deposit Request form. Information collected in the form is used to create a public metadata record that makes it easier for others to discover and make sense of the data.
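As a rough illustration of the kind of descriptive information such a record might contain, the sketch below builds a minimal project-level metadata record and saves it as JSON. All field names and values are hypothetical; the actual form and any required schema will depend on the repository.

```python
import json

# A minimal, illustrative project-level metadata record. The fields loosely
# follow common descriptive elements; your repository may require a
# specific schema.
record = {
    "title": "Interview transcripts on urban commuting habits",
    "creator": "Jane Researcher",
    "date_created": "2024-03-01",
    "description": "Anonymised transcripts of 20 semi-structured interviews.",
    "methodology": "Semi-structured interviews, audio-recorded and transcribed.",
    "keywords": ["transport", "commuting", "qualitative"],
    "licence": "CC BY 4.0",
}

with open("metadata.json", "w", encoding="utf-8") as f:
    json.dump(record, f, indent=2)
```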
Documentation and metadata can be added to data in a variety of ways, from project level descriptions to file level annotations.
If you are depositing your data in a domain or disciplinary specific repository you might also be asked to provide information about your data that is specific to your domain or discipline. It is a good idea to make sure that you are familiar with any metadata standards that are widely used within your area of research.
See the Digital Curation Centre's web site for a more comprehensive list of disciplinary metadata standards as well as information about disciplinary metadata tools.
Examples of published metadata records for datasets and data files can be found by browsing the catalogues of data repositories and data centres (e.g. UK Data Archive, Dryad, Archaeology Data Service). Re3data.org is a registry of research data repositories.
Quality assurance and quality control are the measures which researchers can adopt to prevent errors from entering or remaining in a dataset. Ensuring the quality and integrity of research data is an integral part of good research data management across the whole research life cycle, from collecting data to preparing data for analysis and publication. In addition, many funders expect researchers to include details of the measures they will adopt to safeguard data quality and integrity in their data management plan.
Here are some examples of best practices for assuring data quality and integrity adapted from guidance provided by the UK Data Archive.
A range of procedures can be used during data collection to make sure that the data recorded reflect the actual facts, events, responses and observations.
Methods such as data-entry forms with built-in validation rules, controlled vocabularies and double entry can help to ensure accurate, standardised and consistent data transcription, digitisation or entry in a database or spreadsheet.
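As a small sketch, assuming a hypothetical survey spreadsheet exported as survey.csv with age and response columns, entries could be checked against simple validation rules on the way into the dataset:

```python
import csv

# Hypothetical rules: age must be a plausible number, and responses must
# come from a controlled vocabulary.
ALLOWED_RESPONSES = {"yes", "no", "unsure"}

def validate_row(row: dict, line: int) -> list[str]:
    errors = []
    try:
        age = int(row["age"])
        if not 18 <= age <= 110:
            errors.append(f"line {line}: age {age} out of range")
    except ValueError:
        errors.append(f"line {line}: age {row['age']!r} is not a number")
    if row["response"] not in ALLOWED_RESPONSES:
        errors.append(f"line {line}: unexpected response {row['response']!r}")
    return errors

with open("survey.csv", newline="", encoding="utf-8") as f:
    # start=2 because line 1 of the file is the header row
    for line, row in enumerate(csv.DictReader(f), start=2):
        for error in validate_row(row, line):
            print(error)
```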
Guidance on interview transcription methods and quality control can be found on the UK Data Archive website.
Checking your data is a vital quality assurance stage during which data are edited, cleaned, verified, cross-checked and validated. Typical checks at this stage include screening for duplicate records, missing values and out-of-range entries, as illustrated in the sketch below.
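Continuing the hypothetical survey.csv example (and assuming a participant_id column), simple post-entry checks might screen for duplicate identifiers and missing values:

```python
import csv
from collections import Counter

with open("survey.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

if rows:
    # Flag participant IDs that appear more than once.
    id_counts = Counter(row["participant_id"] for row in rows)
    duplicates = [pid for pid, count in id_counts.items() if count > 1]
    print("Duplicate IDs:", duplicates or "none")

    # Count empty cells in each column.
    for column in rows[0]:
        missing = sum(1 for row in rows if not row[column].strip())
        print(f"{column}: {missing} missing value(s)")
```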
The acronym FAIR stands for Findable, Accessible, Interoperable, Reusable. It is a set of principles for research data and metadata that aims to improve how data are discovered and accessed, how they interact with other datasets and systems, and ultimately how they can be reused by others.
The term ‘FAIR’ was launched at a Lorentz workshop in 2014, and the resulting FAIR principles were published in 2016. They have been widely accepted and promoted by researchers, institutions, funders, publishers, and political leaders. There are a number of initiatives committed to developing, understanding and meeting them. The principles are summarised by the GO FAIR initiative as:
The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services, so this is an essential component of the FAIRification process (a machine-readable example is sketched after these summaries).
Once the user finds the required data, they need to know how the data can be accessed, possibly including authentication and authorisation.
The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.
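As an illustration of the machine-readable metadata mentioned under Findable, the sketch below builds a dataset description using the schema.org Dataset vocabulary, the kind of structured record that search engines and metadata harvesters can index. All of the values are placeholders.

```python
import json

# A machine-readable dataset description using schema.org's "Dataset" type.
# Every value here is a placeholder, including the DOI.
dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Urban commuting interview transcripts",
    "description": "Anonymised transcripts of 20 semi-structured interviews.",
    "identifier": "https://doi.org/10.0000/example",
    "creator": {"@type": "Person", "name": "Jane Researcher"},
    "datePublished": "2024-06-01",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "keywords": ["transport", "commuting", "qualitative"],
}

print(json.dumps(dataset, indent=2))
```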
Within each principle there are several steps to work towards – and work towards is a good way of looking at it, as achieving FAIR isn’t a binary state. Rather, it is a spectrum along which you can meet different aspects or degrees of making data FAIR. Realistically you might not meet every measure of FAIR, but each one that you do meet will help to enable data reuse. As nicely described in the Turing Way handbook for reproducible data science, FAIR applies not just to data files or datasets themselves, but to different entities in the storing and sharing infrastructure:
“The FAIR principles refer to three types of entities: data (as any digital object), metadata (information about that digital object), and infrastructure (i.e. software, repositories). For instance, the findability principle F4 defines that both metadata and data are registered or indexed in a searchable resource (e.g. a data repository).”
Furthermore, the responsibility and contribution towards making data FAIR are shared by researchers, institutions, technology providers, funders and publishers, with some examples being:
Individual researchers will strive to: Document data to agreed community standards that describe provenance and enable discovery, assessment of reliability, and reuse.
Funding agencies and organisations will strive to: Review data management plan requirements regularly to validate support of open and FAIR standards and promulgate leading practices.
Societies, communities, and institutions will strive to: Promote open and FAIR data activities as important criteria in promotion, awards, and honours.
Publishers will strive to: Adopt a shared set of author guidelines that support FAIR principles, providing a common set of expectations for authors.
Repositories will strive to: Ensure that research outputs curated by repositories are open and FAIR, have essential documentation, and include human-readable and machine-readable metadata (e.g. on landing pages) in standard formats that are exposed and publicly discoverable.
These are among the principles contained in the Commitment Statement in the Earth, Space, and Environmental Sciences, which is one of many initiatives by groups with a disciplinary or process driven approach.
The How FAIR are your data? checklist is a good place to start, to think in advance about what you might need to do to make your data FAIR, and to assess it before it is archived and shared.
If you’d like advice about making your data FAIR, please get in touch with us at email@example.com or call +44 (0)20 7848 1030.