There are a number of factors which can influence your decision whether to retain or dispose of your data once your research project has ended.
Many funders expect data generated during a research project to be preserved and shared (where appropriate) beyond the lifetime of the research project. This does not necessarily mean you have to keep all of the data you create or collect. Requirements will vary across funders, but as a general rule you will be expected to preserve:
Just what constitutes "long term value" can be difficult to quantify and will depend upon your own understanding of the data and its potential value to others within and outside your research community.
Some journal publishers also have data sharing policies which require researchers to retain and make available data that underpins published research (see the Deposit your data tab above and Publish your data webpage for additional information about publisher data policies).
The university also requires research data to be kept beyond the lifetime of the research project, though retention periods vary depending on the nature of the data and the research for which it was created. See the College Records and Data Retention Schedule.
Sometimes it is clear that your data will have value beyond the lifetime of your project, e.g. if it supports published research. But it is not always easy to determine in advance which data will have long term value or significance. Data may be used for purposes other than those for which they were created or have value beyond the original discipline or research community.
Again, there are a number of factors which will determine how long you should keep your data for:
Most research councils and many other funding bodies have published guidelines on how long data should be kept once a project has ended. These vary from funder to funder. For an overview of funder policies and links to relevant policy documents and guidance see our webpage Funder Policies on Managing and Sharing data.
University of Southampton: Funder retention requirements
Timescales for retention and disposal of research data are included in the King's Records and Data Retention Schedule (see pages 35-37 for guidance on research data). Periods of retention vary according to how the data are classified. The Corporate Records Management team can provide further assistance if you are unsure of which category your data belongs to: email@example.com
NOTE: Where the university's retention schedule differs from funder policy requirements, the latter takes precedence.
The General Data Protection Regulation requires that personal data should not be kept for longer than is necessary or for purposes other than those for which it was collected. However, Article 89 of the Regulation does provide an exemption for personal data "that is processed for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes". Data should not be disposed of prematurely where it might damage the interests of the data subjects.
If you are planning to hold on to data which you are otherwise not required to retain, keep in mind that datasets held by public authorities, including universities, are subject to requests for access under the Freedom of Information Act (2000). Access can only be withheld where one of the Act's exemptions applies. Exemptions include personal data, information intended for publication and information subject to a confidentiality agreement such as a signed consent form. Once an FOI request has been made the data must not be deleted: destroying or deleting the requested information is a criminal offence.
JISC - Freedom of Information
The Environmental Information Regulations (2004) give the public access rights to environmental information held by public authorities including universities. Again, there are exceptions, including for data which contains personal information.
ICO - Guide to the Environmental Information Regulations
The most reliable way to dispose of data is physical destruction. Shredders certified to an appropriate security level should be used for destroying paper and CD/DVD discs. Computer or external hard drives at the end of their life can be removed from their casings and disposed of securely through physical destruction. (Source: UK Data Archive - Data Disposal)
For guidance on records and data disposal contact the Corporate Records Management team: firstname.lastname@example.org. For help with disposing of hardware and media contact the IT help desk.
Choosing open file formats is important for data archiving and long-term preservation. It is a good idea to think about which formats you will use to store your data before you start collecting it, because your choice of file formats will have a significant impact on whether you, and other future users, will be able to access the data files in the long term.
Ideally you should use formats that allow for the long term preservation and accessibility of your data, but your choice will most likely be determined by a range of factors. These might include...
Where possible you should store your data in open or standard rather than closed or proprietary file formats.
With most proprietary formats the specification is privately owned and subject to restrictions due to copyright or other intellectual property rights.
With open formats, however, the specification is publicly available and free for anyone to use. This allows others to develop software that can access the files and reduces the risk of software or hardware obsolescence.
Open formats are also more likely to be backwards compatible with previous versions and well supported by a user community over a longer period of time.
Examples of open formats include CSV, XML, JPEG 2000, Open Office Document, Tar, ZIP.
Some proprietary formats have become standard and are also supported by open documentation (e.g. PDF, TIFF) or are widely used and likely to be around for a long time (e.g. SPSS, MS Office software applications) and are therefore considered acceptable for short term preservation.
It is generally best to avoid “lossy” compressed file formats. Lossy compression has the advantage of allowing smaller file sizes but some important information might be lost. For image files, TIFF is lossless, JPEG lossy; for audio files, WAV is lossless, MP3 lossy. Lossless compression produces larger files but every bit of the original data is restored when the file is uncompressed (e.g. PNG, GIF, and ZIP files). For important files, best practice is to keep a master copy in a lossless format.
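The lossless case can be seen in a short round trip. The following Python sketch uses the standard-library zlib module (DEFLATE, the lossless method commonly used inside ZIP and PNG files) on made-up sample data. The sample content and variable names are illustrative assumptions rather than part of this guidance.

```python
import zlib

# Hypothetical sample data standing in for the contents of a research file.
original = b"sample_id,reading\n" + b"A1,0.25\n" * 1000

# Compress and then restore the data using DEFLATE (lossless).
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

# Lossless means the round trip returns exactly the bytes that went in.
is_lossless = restored == original
saved_space = len(compressed) < len(original)

print(f"{len(original)} bytes -> {len(compressed)} bytes")
print(f"identical after restore: {is_lossless}")
```

A lossy format such as JPEG or MP3 would not pass the equality check, because some of the original information is discarded when the file is saved.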
Sometimes using proprietary formats is unavoidable, particularly if they are widely used within a discipline or research community (e.g. crystallographic information files (CIF)). It may well be the case that the formats you need to use to collect and analyse your data are not always the most suitable for preserving your data beyond the lifetime of your research project. Where possible you should consider converting your data to open formats to allow long term preservation and access. However, when converting data files from one format to another it is always advisable to check for any errors or loss of information, such as missing fonts, footers and headers, footnotes or hyperlinks, or reduced image resolution, sound quality and colour fidelity. Also, some open formats lack the functionality and formatting of proprietary formats, so there may be occasions when it is a good idea to retain copies of important data in their original format while also using open formats to enable future access and sharing.
The UK Data Archive provides useful guidance and information on managing quality control for data collection, migration and transcription.
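For tabular data, one way to check a conversion is to write the values to the open format, read them back and compare the result with the original. The Python sketch below does this for CSV using only the standard library. The function names and sample rows are hypothetical, and it is a minimal illustration rather than a complete quality-control procedure.

```python
import csv
import io


def rows_to_csv_text(header, rows):
    """Write a header and data rows to CSV text (an open, plain-text format)."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def csv_text_to_rows(text):
    """Read CSV text back into a header and a list of rows."""
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader)
    return header, [row for row in reader]


def conversion_differences(header, rows):
    """Return descriptions of anything that changed during the round trip."""
    new_header, new_rows = csv_text_to_rows(rows_to_csv_text(header, rows))
    problems = []
    if new_header != list(header):
        problems.append("header changed")
    if len(new_rows) != len(rows):
        problems.append(f"row count changed: {len(rows)} -> {len(new_rows)}")
    for index, (before, after) in enumerate(zip(rows, new_rows)):
        if list(before) != after:
            problems.append(f"row {index} changed: {before!r} -> {after!r}")
    return problems


# Text values survive the conversion unchanged.
text_rows = [["1", "contains, comma"], ["2", 'has "quotes"']]
print(conversion_differences(["id", "note"], text_rows))

# CSV stores text only, so a numeric value comes back as a string.
print(conversion_differences(["id", "value"], [["1", 0.5]]))
```

The second call reports a difference because the number is returned as text. This is the kind of loss of information (here, the data type) that is worth recording in the documentation kept with the converted file.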
Provides specialist experience and knowledge
Likely to be very selective, setting high standards regarding the quality of your data and metadata
If using a data centre that isn't supported by your funding agency, there could be costs involved*
Zenodo and Figshare offer free storage accounts
Accessible and easy to use
Dryad, Figshare and Zenodo issue DOIs for datasets
Restricted file size and storage space if using a free account (Figshare: file size limit 5GB, private space 1GB, unlimited public storage space; Zenodo: size limit is 50GB per dataset, but you can have multiple datasets).
Check that the terms and conditions are compatible with your funder or journal policies and that there are sufficient backup strategies in place to preserve the data should the service disappear.
Enables compliance with funder or journal data sharing policies if no other suitable repository can be found
Provides long-term storage, open access to datasets (where appropriate), DOIs for datasets and a published metadata record
Restrictions on file size and storage space - 1TB per dataset and maximum 10GB per data file
Not suitable for patient data. Acceptance of data containing sensitive or confidential information is subject to review on a case-by-case basis
Makes software management easy
Facilitates file sharing and collaboration
Most repositories support the release of software under open source licences
Some sites might not be suitable for code that is closed source or released under mixed licences
*Some funders support domain or disciplinary specific repositories or data centres and expect researchers to deposit their data in these where appropriate
**You can ensure the long-term preservation of code/software by linking your GitHub account with Zenodo. An integration between the services allows you to log in to Zenodo using your GitHub account. Guidance is available here.
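Before choosing a repository it can help to compare your files against its size limits. The Python sketch below uses the limits of 10GB per data file and 1TB per dataset quoted above as defaults. The function names are hypothetical, and decimal units (1GB = 1000^3 bytes) are an assumption, since the limits above do not state which units are meant.

```python
from pathlib import Path

GB = 1000 ** 3  # assumption: decimal gigabytes
TB = 1000 ** 4  # assumption: decimal terabytes

MAX_FILE_BYTES = 10 * GB    # per-file limit quoted above
MAX_DATASET_BYTES = 1 * TB  # per-dataset limit quoted above


def file_sizes(folder):
    """Map each file under `folder` (searched recursively) to its size in bytes."""
    root = Path(folder)
    return {
        str(path.relative_to(root)): path.stat().st_size
        for path in root.rglob("*")
        if path.is_file()
    }


def check_limits(sizes, max_file=MAX_FILE_BYTES, max_dataset=MAX_DATASET_BYTES):
    """Return a list of problems; an empty list means the dataset is within limits."""
    problems = [
        f"{name} exceeds the per-file limit"
        for name, size in sorted(sizes.items())
        if size > max_file
    ]
    if sum(sizes.values()) > max_dataset:
        problems.append("dataset exceeds the total size limit")
    return problems
```

For example, `check_limits(file_sizes("my_dataset"))` would list any files in a hypothetical `my_dataset` folder that are too large. Other limits can be passed in through `max_file` and `max_dataset` when checking against a different service.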
The UKRI (UK Research and Innovation), Wellcome Trust, and Cancer Research UK all allow for preservation and archiving costs to be included in grant applications as long as the costs are incurred or allocated before the award has ended. The UKRI's web page 'Supporting research data management costs through funding' includes a link to a PDF document 'Guidance on best practice in the management of research data' which contains information on how to include costs in a grant application (see pp. 10-12).
Creating a data management plan will help you budget for these costs in advance.
What is the King's Research Data Management System?
The King's RDM System is a research data repository service providing long term storage and public access for datasets that support published research and/or have long term value.
If you deposit your data with King's we will: publish a metadata record for your dataset to increase its public discoverability; allow others to access and reuse your data (where appropriate); and issue a DOI for your dataset to support data citation.
Depositing your data with the RDM System can also help you meet any funder's data policy requirements for data archiving and sharing.
How can I deposit my data with the King's RDM System?
Files that are less than 20MB can be emailed to us at email@example.com
Files up to 1GB in size can be sent using the file transfer service. Log on using your King's address and send your files to firstname.lastname@example.org
For files greater than 1GB but less than 15GB, we can use a portable, external storage device or a shared OneDrive for Business folder which we can set up and into which you can deposit your data.
You can use one or more of these options. When you are ready to transfer the files, please let us know and we will make the necessary arrangements.
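The size thresholds above can be summarised as a simple lookup. The Python sketch below is illustrative only. The function name is hypothetical, decimal units (1MB = 1000^2 bytes) are an assumption, and a file of exactly 15GB is treated as needing contact with the team, because the thresholds above do not say which route covers that case.

```python
MB = 1000 ** 2  # assumption: decimal megabytes
GB = 1000 ** 3  # assumption: decimal gigabytes


def deposit_route(size_bytes):
    """Suggest a transfer route for a single file of the given size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    if size_bytes < 20 * MB:
        return "email"
    if size_bytes <= 1 * GB:
        return "file transfer service"
    if size_bytes < 15 * GB:
        return "external storage device or shared OneDrive folder"
    # Assumption: 15GB and above is referred to the team before deposit.
    return "contact the Research Data Management team first"


for size in (5 * MB, 500 * MB, 2 * GB, 20 * GB):
    print(size, "->", deposit_route(size))
```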
If you require any further help or assistance please contact the Research Data Management team:
Email - email@example.com
Telephone - 020 7848 1030 or 1303
*Datasets containing sensitive or confidential data will be reviewed on a case by case basis.
**If you have data files larger than 15GB and/or the total volume of your data is greater than 25TB please contact the Research Data Management team before filling in the data deposit form.