Data Management

Main principles

 


  • The LipLab data management policy is guided by a set of main principles:


    1. Data often have a longer lifespan than a research project or employment.

      Researchers may continue to work on data after funding has ceased, follow-up projects may analyze or add to the data, and data may be re-used by other researchers.
      Therefore, data should be reusable, well organized, well documented, and preserved; results should be reproducible.

    2. Data management is conducted at multiple levels.
      Aside from each individual researcher’s responsibilities when it comes to handling data, data management is also overseen at the level of both the lab and the faculty.
    3. We uphold the highest standards regarding open science, transparency, and research integrity.
      Research is preregistered, and materials and data are made publicly available (GitHub and/or Open Science Framework, OSF) when possible.

Data and material storage


For archiving purposes, two secured university network drives are used to store data and other research-related materials: one is used actively by the lab (i.e., for ongoing projects), whereas the other serves as permanent storage (i.e., an archive) at the faculty level. Each lab member has their own folder on these shares. At the very least, all materials and data related to a manuscript accepted for publication, of which a LipLab member is the first author, should be stored on these shares. However, lab members are encouraged to install a program that automatically backs up any work (published or not) to these secure drives. One lab member is tasked with periodically checking whether all data are properly stored in line with the requirements below.


Here are the more specific guidelines we use:


Data and materials are stored in a separate folder for each project, named
“[1st Author’s Initials]_[year]_[meaningful project name]”, which includes the following:

  • A “ReadMe.txt” file providing information about the project: (1) its purpose, (2) a short description of different experiments, (3) a link to the project on OSF, (4) the current status of the manuscript and (5) any other relevant information.
  • The Data Storage Fact Sheet (DSFS) for the project as required by the faculty:
    • This document serves to unambiguously provide information about all the raw, processed and meta-data collected, stored or described in a manuscript. It also serves to identify who should be contacted with requests for information about said data, and as a data management checklist for the researchers involved. Each DSFS is connected to one study and is written after its completion.
  • The current draft/pre-print of the manuscript
  • A folder for each experiment, each of which includes the following sub-folders:
    • Materials: should include all the experiment files/scripts – all the files needed to run the study
    • Data: should include the raw data files, the processed data files, and the data processing script(s):
      • Raw data – should include all the raw data files
      • Processed data – should include any outputs of the data processing (prior to analysis) and any script(s) used to process the data
      • Note that any personal or identifying information of participants is removed from the raw data files before storage
    • Analysis: should include the R (or other) scripts that were used to analyse the data and, optionally, accompanying documents
    • Study information: should include any relevant documents like the preregistered plan for this study and the summary of the results
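The folder layout above can be set up and checked with a small script. The following is a minimal sketch in Python, not a tool the lab prescribes: the function names, the `Experiment_<n>` folder naming, and the ReadMe headings are assumptions made for illustration. The DSFS and the manuscript draft are not checked, because the guidelines do not fix their file names.

```python
from pathlib import Path

# Sub-folders expected inside each experiment folder, following the list above.
EXPERIMENT_SUBFOLDERS = (
    "Materials",
    "Data/Raw data",
    "Data/Processed data",
    "Analysis",
    "Study information",
)


def project_folder_name(initials: str, year: int, project_name: str) -> str:
    """Build '[1st Author's Initials]_[year]_[meaningful project name]'."""
    return f"{initials}_{year}_{project_name}"


def scaffold_project(root: Path, initials: str, year: int,
                     project_name: str, n_experiments: int = 1) -> Path:
    """Create the project folder, a ReadMe.txt stub, and the experiment folders."""
    project = root / project_folder_name(initials, year, project_name)
    project.mkdir(parents=True, exist_ok=True)

    readme = project / "ReadMe.txt"
    if not readme.exists():  # never overwrite an existing ReadMe
        readme.write_text(
            "Purpose:\n"
            "Experiments (short description):\n"
            "OSF link:\n"
            "Manuscript status:\n"
            "Other relevant information:\n",
            encoding="utf-8",
        )

    for i in range(1, n_experiments + 1):
        for sub in EXPERIMENT_SUBFOLDERS:
            # "Experiment_<n>" is a hypothetical naming choice, not a lab rule.
            (project / f"Experiment_{i}" / sub).mkdir(parents=True, exist_ok=True)
    return project


def missing_items(project: Path) -> list[str]:
    """List expected items that are absent from an existing project folder."""
    missing = []
    if not (project / "ReadMe.txt").is_file():
        missing.append("ReadMe.txt")
    # Every directory directly under the project is treated as an experiment.
    for exp in sorted(p for p in project.iterdir() if p.is_dir()):
        for sub in EXPERIMENT_SUBFOLDERS:
            if not (exp / sub).is_dir():
                missing.append(f"{exp.name}/{sub}")
    return missing
```

For example, `scaffold_project(Path("."), "AB", 2024, "attention_bias", n_experiments=2)` would create a folder named `AB_2024_attention_bias` in the current directory (the initials and project name here are made up), and `missing_items` could support the periodic storage check by listing whatever is absent from a given project folder.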

 

When a project involves the collection of non-digital data, we aim to digitize them if possible. If this is impossible, the data are stored in the faculty archive for research data.


More generally, lab members are encouraged to use R Markdown to improve the readability of analysis scripts and results, which makes them easier to share, and to use GitHub for version control.

Open science


As a lab, we highly value research integrity and transparency. We aim to be as open and transparent as possible when it comes to research, and we preregister all studies unless there is a clear reason not to (e.g., ethical constraints). We strongly recommend that all members use the Open Science Framework (OSF) for several reasons. Aside from supporting transparency through preregistration, storing project materials and data (cf. the file structure described above) in OSF repositories keeps all relevant materials in one centralized location with built-in version control, and it facilitates collaboration and the sharing of research with researchers outside of the lab.