All About Data: The Basics

This is the first article in the series. It explains what data is, how to get it, and the types of data that exist.

Data is everywhere.

[GIF: Buzz Lightyear with his arm around Woody, with the caption 'Data, Data Everywhere']

It comes from the food we consume, the animals around us, and practically anything else you can think of. As you read this article, you're generating data. Even you are a source of data. Hear me out and keep reading.

What Is Data?

According to Techterms.com, data refers to a collection of individual values that, when processed, convey information. Let's use an example from earlier to explain this better. Think about your favourite meal. What data can we get from food? We can get the name of the food, the ingredients used in preparing it, the quantity of each ingredient, the nutritional value of the food, and more. That's data! Once it's processed, we get more information about the food: the total nutritional value, the nutrient that makes up the highest percentage of the meal, the essential ingredients, and so on.

There's only so much one can do with so little data. To get more, you can collect data from other meals. Now you have a dataset: a collection or set of data. A larger dataset generally gives you better and more accurate information.
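To make the meal example concrete, here is a minimal Python sketch of a small dataset and the kind of information that processing it can produce. The meal names, the nutrient amounts, and the `summarize` helper are all made up for illustration; they are not real nutritional figures.

```python
# A tiny dataset: each record is one meal.
# The nutrient amounts (in grams) are hypothetical, chosen only for illustration.
meals = [
    {"name": "Jollof rice", "carbs": 60, "protein": 10, "fat": 15},
    {"name": "Bean porridge", "carbs": 40, "protein": 20, "fat": 10},
    {"name": "Fried plantain", "carbs": 45, "protein": 2, "fat": 12},
]

NUTRIENTS = ("carbs", "protein", "fat")


def summarize(meal):
    """Return the total grams and the nutrient with the largest share."""
    total = sum(meal[n] for n in NUTRIENTS)
    top = max(NUTRIENTS, key=lambda n: meal[n])
    share = round(100 * meal[top] / total, 1)
    return {
        "name": meal["name"],
        "total_g": total,
        "top_nutrient": top,
        "top_share_pct": share,
    }


# Processing the raw values turns data into information.
summaries = [summarize(m) for m in meals]
for s in summaries:
    print(s)
```

Each dictionary is raw data about one meal. The summaries are the information we get after processing it.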

Forms of Data

Data can exist in four basic forms:

  • Text: This type of data is represented as a series of alphanumeric and special characters. This article, for example, is written in a text format.

  • Audio: This is sound transmitted as a wave of pressure through the air. A computer stores it in the binary format of 0s and 1s. Take your favourite song as an example.

  • Image: An image is broken down into pixels. In a typical 8-bit grayscale image, each pixel is assigned a number from 0 to 255. Colour images usually store one such number per colour channel. Each number is then stored as a series of binary digits, 0s and 1s. That quick selfie you took the other day? That's an image, and this is how your devices store it.

  • Video: A video is essentially a sequence of images (frames) shown one after another, so it is stored in a way similar to images.
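The forms above can be sketched in a few lines of Python. This is a simplified illustration, not how real file formats work: it assumes UTF-8 for text and an 8-bit grayscale image, and it models a video as a plain list of frames, while real image and video files usually add compression. The `to_bits` helper and the pixel values are made up for this example.

```python
def to_bits(values):
    """Render each integer (0-255) as an 8-digit binary string."""
    return [format(v, "08b") for v in values]


# Text: each character is mapped to one or more bytes (UTF-8 here), then to bits.
text_bytes = list("Hi".encode("utf-8"))
text_bits = to_bits(text_bytes)

# Image: a hypothetical 2x2 grayscale image, one number from 0 to 255 per pixel.
image = [
    [0, 255],
    [128, 64],
]
image_bits = [to_bits(row) for row in image]

# Video: a hypothetical two-frame clip, modelled as a list of images.
video = [image, [[255, 0], [64, 128]]]

print(text_bytes, text_bits)
print(image_bits)
print(len(video), "frames")
```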

Sources of Data

Depending on your use case, these are some of the ways you can collect data:

  • Surveys: Sometimes, the data you need might not exist yet. It could be responses to a question. For example, you may want to know how users of a product feel about that product. A way to find out is through surveys. You could create an online survey form for users to fill in, or go offline, asking the questions directly and recording the responses. Surveys are not limited to questionnaires.

  • Websites: These contain and store lots of data. This data can be collected via a process known as web scraping. You can scrape a website by using APIs, a programming language, and other means.

  • Experiments: An experiment is a scientific process that is carried out to make a discovery, prove a hypothesis, or certify a fact. In carrying this out, data is generated in different stages and can be used for the required purpose.

  • Personal Investigation: This method involves you personally collecting the data. Let's go back to our food example. To get the data you need, you would have to record the ingredients used, the nutritional value of each ingredient, the time spent cooking, and whatever data will help you get the information you need.

  • Third-Party Sources: This is when you get or collect data from a pre-existing source. Last month, January, I was assigned to a team in the ADA Software Engineering Internship Bootcamp. We were tasked with creating a product aimed at solving a problem, and we created a product that helps to verify medicine in Nigeria. To make this work, we used data that NAFDAC had already collated and made readily available on their website instead of manually collecting it ourselves.

  • Social Media: This is a mine of data. The use of social media is so prevalent these days that new content is uploaded to a platform every second. If this type of data ticks your boxes, social media is another source to collect from.
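As a rough illustration of the web scraping mentioned under Websites, here is a minimal sketch that uses only Python's standard library. The HTML in `PAGE` is a made-up stand-in for a page you would normally download, and the `MealParser` class and the `meal` CSS class are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this from a website.
PAGE = """
<html><body>
  <ul>
    <li class="meal">Jollof rice</li>
    <li class="meal">Bean porridge</li>
    <li class="other">Not a meal</li>
  </ul>
</body></html>
"""


class MealParser(HTMLParser):
    """Collect the text of every <li class="meal"> element."""

    def __init__(self):
        super().__init__()
        self.in_meal = False
        self.meals = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "li" and dict(attrs).get("class") == "meal":
            self.in_meal = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_meal = False

    def handle_data(self, data):
        # Keep only non-empty text found inside a matching <li>.
        if self.in_meal and data.strip():
            self.meals.append(data.strip())


parser = MealParser()
parser.feed(PAGE)
print(parser.meals)
```

The parser walks through the page and keeps only the text it was asked to look for, which is the core idea behind scraping.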

There are many other sources to get data from, but they all fall under one of the following two categories:

  1. Primary Data: Also known as raw data, this is data collected straight from its source, e.g. surveys, experiments, interviews and so on.

  2. Secondary Data: This is data obtained from already existing data, e.g. essays, textbooks, websites and other third-party sources.

Classes of Data Sources

Data can be obtained from numerous sources, but each source belongs to one of these two classes:

  1. Internal Sources: This is when data is collected from the materials within an organization. This includes reports and records from the organization.

  2. External Sources: This is when data is collected from outside the organization, for example from a different organization.

Usefulness of Data

We've talked about what data is and how to get it. Now, we'll talk about why data is so important.

  • Solve Problems: Data helps with figuring out the cause of a problem and employing the best method to solve that problem.

  • Make Well-Informed and Strategic Decisions: Data is knowledge. With data, you understand a subject better and can make strategic decisions based on that understanding. For an organization, this means making decisions that have a huge impact on its growth.

  • Monitor Progress: Data can be used to keep track of the results of an action or decision. For organizations, it helps with monitoring the results of a business decision, keeping tabs on the profits or losses a company makes, deciding if that choice is healthy for the business, and more.

  • Improve Lives: Data supports good decision making, which in turn helps to improve people's lives.

  • Make Informed Arguments: Having data helps you develop fact-based opinions.

Sources