Data, in its simplest form, refers to information that can be collected, stored, and analyzed. It can take various forms, including numbers, text, images, and more.

Virtually anything can be data:

Measurements
Simulations
Books
Transactions
Diaries
Musical scores
Algorithms
X-Rays
Historical logs
Recipes
Geographies
etcetera

It is common to think of data as only hard numbers or categories, but data includes much more. So much can be considered data that it is hard to explain without looking at specific types of data. We found a couple of short videos that explain the types of data succinctly; you can watch them on YouTube:

What is Data? – Univ of Houston
What is Data? – Univ of Guelph

While “data” is a broad term encompassing any form of information, “research data” is a more specialized subset specifically collected and analyzed within the structured context of a research project to address specific research questions or objectives. The table below summarizes the key differences between data and research data.

	Data	Research Data
Purpose	Various purposes, from personal information to business metrics	Aim of answering research questions, testing hypotheses, or contributing to scholarly knowledge
Collection Process	Can be collected in various ways	Collected using systematic research methods tailored to the study’s objectives
Context	Exists in numerous contexts	Inherently linked to the research process, contributing to academic or scientific exploration
Application	Widely applicable in everyday life, business decision-making, technology, and more	Specifically used for academic purposes, contributing to the advancement of scientific knowledge

Qualitative vs Quantitative

Quantitative data involves numerical measurements and quantifiable variables and is typically expressed in terms of numbers and statistics. Qualitative data, on the other hand, comprises non-numerical information, such as text, images, or observations. Some data is purely quantitative or purely qualitative, but many kinds of data have both quantitative and qualitative aspects.

You can find more explanation of qualitative and quantitative data in this short video: Qualitative and Quantitative Data – Nucleus Biology

Examples of How Data Can be Qualitative or Quantitative

Data Type	Quantitative Aspects	Qualitative Aspects
Measurements	The numerical values measured	Methods used to operate the machine
Simulations	Numerical results of the simulation	Who wrote the simulation software?
Books	How many times each word is used; number of chapters	What kind of narration style is used? What are the motives and relationships between the characters?
Transactions	How much money was transacted? Dates of the transactions	What kinds of products were purchased? Payment method
Diaries	The date range in the diary	What was the author doing on a given date?
Musical Scores	How many key changes? What are the frequencies of the harmonies?	What is the cultural context of the musical score? What kind of instrumentation and orchestration has been used?
Algorithms	How many lines of code does it take to implement this algorithm? Performance metrics	What language was used to write the algorithm? What biases are captured in the algorithm?
X-Rays	The amount of X-Ray energy captured by the sensor or film	Medical diagnoses that can be determined from the X-Ray
Historical Logs	Measurements (temperature, number of sunspots, transactions); skew of the measurement device relative to modern standards	Who recorded the logs? What equipment was used for the logs?
Recipes	How much of each ingredient is used; cooking time and temperature	What ingredients are used? Units used to describe time and temperature
Geographies	Coordinates of features	Types of features studied
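
As a small illustration of the Books row in the table above, the quantitative aspect "how many times each word is used" can be computed directly from a text. The sketch below is only one possible approach; the sample sentence and the variable names are invented for this example.

```python
import re
from collections import Counter

# Hypothetical sample text, invented for this illustration.
text = "The cat sat on the mat. The dog sat too."

# Lowercase the text and extract words so "The" and "the" are counted together.
words = re.findall(r"[a-z']+", text.lower())

# Count how many times each word occurs (a quantitative aspect of the text).
word_counts = Counter(words)

print(word_counts.most_common(2))  # [('the', 3), ('sat', 2)]
```

The qualitative aspects of the same text, such as its narration style, cannot be read off from these counts and need interpretation by a person.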

Primary vs Secondary

The distinction between primary and secondary data lies in their origin and the method through which they are collected.

Primary data is collected by the researcher directly from the source. It can include data gathered through surveys, experiments, interviews, or observation. Researchers collect these data for the specific purpose of addressing the research question at hand. The focus on collecting primary data ensures that the data is current and highly relevant to the topic.

Secondary data is collected by someone other than the researcher. It can include data from sources such as government reports, academic journals, or industry publications. This data tends to be less specific, but it can also be more extensive, providing broader context to a research area. Secondary data is often used to supplement or support primary data or to provide context for a research project.

We also found a short video on YouTube which explains the differences between primary and secondary data:

Primary and Secondary Data – Prof. Essa

Data vs. Statistics

While the terms ‘data’ and ‘statistics’ are often used interchangeably, there is an important distinction between them.

Data are individual pieces of factual information recorded and used for the purpose of analysis. They are the raw information from which statistics are created. Statistics are the results of data analysis: its interpretation and presentation. In other words, some computation has taken place that provides some understanding of what the data mean. Statistics are often, though not always, presented in the form of a table, chart, or graph.
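
To make the distinction concrete, here is a minimal sketch in Python. The list of ages is hypothetical raw data invented for this example; the mean, median, and range computed from it are statistics.

```python
import statistics

# Hypothetical raw data: ages (in years) reported by five survey respondents.
ages = [23, 35, 31, 42, 29]

# Statistics: values computed from the raw data that summarize it.
mean_age = statistics.mean(ages)      # average age
median_age = statistics.median(ages)  # middle value when the ages are sorted
age_range = max(ages) - min(ages)     # spread between oldest and youngest

print(mean_age, median_age, age_range)  # 32 31 19
```

The five individual ages are the data; the three summary values are statistics derived from them.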

Both statistics and data are frequently used in research. Statistics are often reported by government agencies – for example, unemployment statistics or educational literacy statistics. Often these types of statistics are referred to as ‘statistical data’.

Difference between Data and Statistics – Univ of Guelph

What is Personal Data?

Data becomes personal data when it is collected from, linked to, or related to a living individual. What makes data personal depends on the context and the content of the data. Data such as a phone number, age, or height does not become personal until it is linked to an individual. A date like “12 December 1980” becomes personal data if the context indicates that it is an individual’s birthday.

The individual linked to the data does not need to be identified; they just need to be identifiable. This means that data is still personal even if the identity of the linked individual is not known. For example, the number assigned to a customer in a supermarket loyalty program is considered personal data even if it is not linked to the customer’s name, address, or any other identifying information. The individual using this number is still identifiable: because the unique number is used exclusively to track their shopping profile, they can be singled out from the pool of other customers in the loyalty program. Therefore, all the information captured, such as the customer number and the associated shopping data, is considered personal data.

Personal data can become anonymous when the content of the data refers to people as a group rather than individually, so that it is no longer possible to identify a single individual from the group. In the previous example, customers are identifiable because of the uniqueness of the customer number and of their shopping profile. If this uniqueness is removed, by removing the customer number and aggregating the shopping profiles of all customers, then the whole dataset can be considered anonymous.
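
The loyalty-program example can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the customer numbers, products, and variable names are all invented, and whether an aggregated dataset really counts as anonymous depends on context, for instance on how many people are in each group.

```python
from collections import Counter

# Hypothetical loyalty-program records: (customer_number, product).
# Each customer number singles out one shopper, so this is personal data.
purchases = [
    (1001, "bread"),
    (1001, "milk"),
    (1002, "milk"),
    (1002, "bread"),
    (1003, "bread"),
    (1003, "milk"),
]

# Remove the customer number and aggregate across all customers:
# only the total number of purchases per product is kept.
aggregated = Counter(product for _customer_number, product in purchases)

print(dict(aggregated))  # {'bread': 3, 'milk': 3}
```

The aggregated totals describe the customers as a group, and no individual shopping profile can be read from them.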

You can learn more about personal and sensitive data on our website:

Personal Data

Feedback? Please share with us using this form.