Before building any model, every data scientist should explore the data as thoroughly as possible.
Exploratory Data Analysis (EDA) and Feature Engineering for Machine Learning models require a thorough understanding of various data types.
From a data science perspective, most data can be categorized into two basic types:
1. Numeric Data
Numeric data, or quantitative data, take values that are either measurements, such as a person’s height, weight, IQ, or blood pressure, or counts, such as the number of stock shares a person owns, the number of teeth a dog has, or the number of pages of your favorite book you can read before falling asleep.
Types of Numeric Data:
● Continuous – Continuous data are measurements that cannot be counted; they can take any value within an interval on the real number line.
Examples – A person’s height, the time taken to finish a race, the weight of a child, the temperature of a freezer.
● Discrete – Discrete data represent items that can be counted; they take on possible values that can be listed out.
Examples – The number of languages an individual speaks, the number of test questions you answered correctly, the number of kids in a class, the number of states in a country.
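The continuous/discrete distinction above can be sketched in a few lines of Python. This is only a rough heuristic, not a standard rule: the function name `numeric_kind` and the sample lists are made up for illustration, and it assumes that a column containing only whole numbers is a count. A continuous measurement that was rounded to whole numbers would be misclassified, so domain knowledge should have the final say.

```python
from numbers import Integral, Real


def numeric_kind(values):
    """Heuristic guess: 'discrete' if every value is a whole number, else 'continuous'."""
    # Reject non-numeric values (and booleans, which Python treats as integers).
    if not all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        raise TypeError("expected numeric values only")
    # Whole numbers only -> looks like a count.
    if all(isinstance(v, Integral) or float(v).is_integer() for v in values):
        return "discrete"
    return "continuous"


# Hypothetical sample data.
heights_cm = [172.4, 165.0, 180.9]   # measurements
languages_spoken = [1, 2, 3, 1]      # counts

print(numeric_kind(heights_cm))
print(numeric_kind(languages_spoken))
```
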
2. Categorical Data
Categorical data, or qualitative data, take categories as their values. The categories can be written as numbers, but those numbers carry no mathematical meaning, so it makes no sense to add them.
They are simply categories in number form. For example, a binary variable coded as 0 and 1 has two categories.
Types of Categorical Data:
● Ordinal – Ordinal data is a type of categorical data that has a predetermined order or scale. The values are still categories, but the presence of an order makes some mathematical operations possible.
Examples – Ratings given by people to a restaurant (0-5). In this example we can order the ratings from highest to lowest and find the median. An average rating is also commonly reported, although it assumes that the steps between ratings are equally spaced.
● Nominal – Nominal data is a type of categorical data that has no inherent order. The values are just labels for the categories of a variable.
Examples – Gender, race, eye color, political party.
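To make the ordinal/nominal difference concrete, here is a minimal sketch using only the Python standard library. The variable names, the `SIZE_ORDER` scale, and the sample values are hypothetical. The ordinal column can be sorted and summarized with a median because its order is known; the nominal column supports only counting, and is typically one-hot encoded before modeling so that no artificial order is introduced.

```python
from collections import Counter
from statistics import median_low

# Ordinal: categories with an explicit, meaningful order (hypothetical scale).
SIZE_ORDER = ["small", "medium", "large"]
rank = {label: i for i, label in enumerate(SIZE_ORDER)}

sizes = ["medium", "small", "large", "medium", "small"]
sorted_sizes = sorted(sizes, key=lambda s: rank[s])       # ordering is meaningful
median_size = SIZE_ORDER[median_low(rank[s] for s in sizes)]  # so is the median

# Nominal: labels with no order; counting is the only sensible summary.
eye_colors = ["brown", "blue", "brown", "green"]
most_common_color, count = Counter(eye_colors).most_common(1)[0]


def one_hot(values):
    """Encode each label as a 0/1 vector, one position per distinct category."""
    categories = sorted(set(values))  # alphabetical, for reproducibility only
    return [[int(v == c) for c in categories] for v in values]


print(sorted_sizes, median_size)
print(most_common_color, count)
print(one_hot(eye_colors))
```

In pandas, the same distinction can be expressed with `pandas.Categorical`, passing `ordered=True` for ordinal columns.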