Data Management Guide

A plant that is not watered, fed, or given sunshine will wither and die. So too, data that is not properly tended may be lost, forgotten, made unusable, and ultimately rendered valueless. Researchers take the utmost care in locating and collecting data for carefully planned and executed studies. However, less time and attention may be devoted to the care and stewardship of the collected data, which can make it difficult or impossible to reuse in future work or by other scientists. Furthermore, researchers who do not publish their data also deny themselves additional citation opportunities.

Even though implementing a data management strategy may seem daunting, it can be beneficial for a variety of reasons:

  • Research results may be easier to reproduce.
  • New questions can be answered through sharing and reuse of the data.
  • Opportunities may arise for meta-analysis.
  • Discovery and access may be improved, now and in the future.
  • Accuracy may be improved through open sharing and community review.
  • Researchers may be assured that the data is accurate, authentic, and complete.
  • Researchers may save time and energy when using the data.
  • The data may be protected from loss, unwanted alteration, or being made unusable.
  • It may be easier to comply with regulations and satisfy other ethical concerns.
  • Costs (in terms of effort and dollars) may be reduced.
  • The data may be more easily used for educational purposes.
  • It may be easier to comply with journal and sponsor guidelines.

Fortunately, researchers do not have to reinvent the wheel when it comes to data management. Many guides exist to walk researchers and curators through the process of constructing a data management plan and executing it for the life of the data. Below is a bibliography listing some of the many guides available, as well as scholarly articles that recommend certain best practices. Each guide or set of recommendations covers a variety of aspects of data management, from security to description to preservation and more. Guides and articles may offer recommendations for:

  • Security (e.g., passwords, encryption, firewalls, virus protection)
  • Description and documentation (e.g., metadata, code books, defining specialized terminology)
  • Storage and backup (e.g., backups, replication, synchronization, version control)
  • Intellectual property (e.g., copyright, licensing, data use agreements)
  • Data access, use, and reuse (e.g., file formats, data types, naming conventions, unique identifiers, quality assurance such as validation, machine readability, transmission/distribution, bulk/single file access)
  • Privacy and sensitive data (e.g., Institutional Review Board considerations, HIPAA, FISMA, informed consent, access control, anonymization)
  • Preservation (e.g., retention periods, version tracking, disposal, non-proprietary formats)
  • Publication and citation
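
As a small illustration of two of the practices listed above (naming conventions and validation against unwanted alteration), the following Python sketch checks a file name against a naming pattern and compares file content with a checksum recorded at archive time. The naming pattern and the function names are hypothetical, chosen only for this example; they are not taken from any of the guides listed below, which should be consulted for actual recommendations.

```python
import hashlib
import re

# Hypothetical naming convention (an assumption for illustration only):
#   <project>_<YYYYMMDD>_<description>_v<NN>.<ext>
# e.g. soilsurvey_20160412_site-a_v01.csv
NAME_PATTERN = re.compile(r"[a-z0-9]+_\d{8}_[a-z0-9-]+_v\d{2}\.[a-z0-9]+")


def follows_convention(filename: str) -> bool:
    """Return True if the whole file name matches the hypothetical pattern."""
    return NAME_PATTERN.fullmatch(filename) is not None


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_digest: str) -> bool:
    """Compare current content with a digest recorded when the file was archived."""
    return sha256_hex(data) == recorded_digest
```

In practice, the digest would typically be recorded alongside the data when it is deposited and rechecked after each copy or migration; a mismatch signals that the file was altered or corrupted.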

Guides

Australian National University Library. (2015). ANU Data Management Manual: Managing Digital Research Data at the Australian National University. http://anulib.anu.edu.au/_resources/training-and-resources/guides/DataManagement.pdf

Australian National University Library. (2015). Data Management. http://libguides.anu.edu.au/datamanagement

Borer, E. T., Seabloom, E. W., Jones, M. B., & Schildhauer, M. (2009). Some simple guidelines for effective data management. Bulletin of the Ecological Society of America, 90(2), pp. 205-214.

Corti, L., Van den Eynden, V., Bishop, L., & Woollard, M. (2011). Managing and sharing research data: A guide to good practice. UK Data Archive. https://us.sagepub.com/en-us/nam/managing-and-sharing-research-data/book240297

Greenberg, J., White, H. C., Carrier, S., & Scherle, R. (2009). A metadata best practice for a scientific data repository. Journal of Library Metadata, 9(3-4), pp. 194-212.

Hook, L. A., Vannan, S. K. S., Beaty, T. W., Cook, R. B., & Wilson, B. E. (2010). Best practices for preparing environmental data sets to share and archive. Oak Ridge National Laboratory Distributed Active Archive Center. https://daac.ornl.gov/PI/BestPractices-2010.pdf

Ingram, C. (2016). How and why you should manage your research data: A guide for researchers. Jisc. https://www.jisc.ac.uk/guides/how-and-why-you-should-manage-your-research-data

Inter-university Consortium for Political and Social Research (ICPSR). (2012). Guide to social science data preparation and archiving: Best practice throughout the data life cycle (5th ed.). https://www.icpsr.umich.edu/icpsrweb/content/deposit/guide/

Michener, W. K., Brunt, J. W., Helly, J. J., Kirchner, T. B., & Stafford, S. G. (1997). Nongeospatial metadata for the ecological sciences. Ecological Applications, 7, pp. 330-342. doi:10.1890/1051-0761(1997)007[0330:NMFTES]2.0.CO;2

Oak Ridge National Laboratory Distributed Active Archive Center for Biogeochemical Dynamics. (n.d.). Data management for data providers. https://daac.ornl.gov/PI/pi_info.shtml

Strasser, C., Cook, R., Michener, W., & Budden, A. (n.d.). DataONE Primer on data management: What you always wanted to know. https://www.dataone.org/sites/all/documents/DataONE_BP_Primer_020212.pdf

UK Data Archive. (n.d.). Create & manage data. http://www.data-archive.ac.uk/create-manage

Whitlock, M. C. (2011). Data archiving in ecology and evolution: Best practices. Trends in Ecology and Evolution, 26(2), pp. 61-65.

Related Articles

Borgman, C. L. (2012). The conundrum of sharing research data. Journal of the Association for Information Science and Technology, 63(6), pp. 1059-1078.

Jacobs, J. A., & Humphrey, C. (2004). Preserving research data. Communications of the ACM, 47(9), pp. 27-29.

Witt, M. (2008). Institutional repositories and research data curation in a distributed environment. Library Trends, 57(2), pp. 191-201.

Witt, M. (2012). Co-designing, co-developing, and co-implementing an institutional data repository service. Journal of Library Administration, 52(2), pp. 172-188.