Data Management
In the 21st century, data is everything. Your research data is crucial because it is the evidence base for your research findings. It is also a valuable resource that has taken a lot of time and money to create. With massive volumes of data generated daily, good data management is a key factor in a successful research project.
Data management is a group of activities related to the planning, development, implementation, and administration of systems for collecting, keeping, and using data in a cost-effective, secure, and efficient manner. These activities are an integral part of the research process itself, so you probably already perform them. Effective data management covers the entire lifecycle of the data, from the point of creation through to dissemination, publication, and archiving. But where do you start, and which components should be distinguished?
Prior to data collection
Before you start collecting or creating data, you should have a plan for what kind of data your research will generate and how this data will be managed and organized. A data management plan (DMP) helps you specify this and put those intentions on paper. The plan can describe in detail how aspects such as data storage and cloud storage, data provenance, information security, the distribution of data management roles, sharing data with colleagues and partners, and data ownership and rights will be organized for the project.
During data collection
Before data collection can start, the DMP needs to be finalized, reviewed by a data specialist (for help and support), and submitted to your funder, and the relevant privacy and ethical requirements should be met. At this stage, it is important to define a good data structure and to follow best practices for tabular data and for qualitative versus quantitative data. To make sure your data, and personal data in particular, is stored safely, consider data encryption.
During your research, while you are busy generating and organizing your data, make sure you also accompany it with proper data documentation and metadata. It is useful to decide in advance what to document and to have a plan or template ready, so that the documentation files can be created as part of data generation.
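As an illustration of creating documentation as part of data generation, the sketch below writes a small metadata file next to a data file. It is a minimal example, not a prescribed template: the function name write_metadata_sidecar, the ".metadata.json" suffix, and all field names are assumptions made for this example and do not follow any formal metadata standard.

```python
import json
from datetime import date
from pathlib import Path


def write_metadata_sidecar(data_file, title, creator, description, variables):
    """Write a JSON documentation file next to `data_file` and return its path.

    All field names here are illustrative assumptions, not a formal standard.
    """
    data_path = Path(data_file)
    metadata = {
        "file": data_path.name,
        "title": title,
        "creator": creator,
        "description": description,
        "documented_on": date.today().isoformat(),
        # Map each column or variable name to its meaning and unit.
        "variables": variables,
    }
    # Example: samples.csv -> samples.csv.metadata.json in the same folder.
    sidecar_path = data_path.with_name(data_path.name + ".metadata.json")
    sidecar_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return sidecar_path
```

Calling this function from the same script that produces a data file keeps the documentation in step with the data. If your research domain has a metadata standard, its fields should replace the illustrative ones used here.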
After data collection
When the data is finalized, decide which data should be kept and preserved and which data is no longer needed. Data that is no longer needed because it is not useful for reproduction and re-use should be removed. When data is not suitable to be made publicly available, deposit it in the Data Archive Geo (DAG). Data that is directly related to published scientific articles or papers, or that meets the conditions for public release, should be published in a repository with a suitable data license, preferably one that is commonly used in your research domain. When there are no best practices or you don’t have a preference, please use YoDa as your repository.
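The decisions in this paragraph can be read as a short sequence of checks. The sketch below is one possible reading, not an official procedure. The names Destination and choose_destination and the three yes/no parameters are assumptions made for this example, and "suitable for public release" is a simplification that covers both data related to published articles and data that meets the conditions for public release.

```python
from enum import Enum


class Destination(Enum):
    REMOVE = "remove"
    DATA_ARCHIVE_GEO = "deposit in the Data Archive Geo (DAG)"
    DOMAIN_REPOSITORY = "publish in a repository commonly used in your domain"
    YODA = "publish in YoDa"


def choose_destination(useful_for_reuse, suitable_for_public_release,
                       has_domain_repository):
    """Return where a finalized dataset should go (illustrative reading)."""
    # Data that is not useful for reproduction and re-use is removed.
    if not useful_for_reuse:
        return Destination.REMOVE
    # Data that cannot be made public is deposited in the DAG.
    if not suitable_for_public_release:
        return Destination.DATA_ARCHIVE_GEO
    # Public data goes to a domain repository when one is commonly used.
    if has_domain_repository:
        return Destination.DOMAIN_REPOSITORY
    # Without best practices or a preference, fall back to YoDa.
    return Destination.YODA
```

For example, choose_destination(True, True, False) returns Destination.YODA, which matches the fallback described above. Choosing a suitable data license is a separate step that this sketch does not cover.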