- What is EDA?
EDA stands for Exploratory Data Analysis.
It is the process of examining a dataset to understand it and to extract its main characteristics and insights.
It has four types:
- Univariate graphical
- Multivariate graphical
- Univariate Non-graphical
- Multivariate Non-graphical
- Univariate graphical:
Univariate plots, such as histograms and box plots, help us understand the distribution and descriptive summary of a single variable.
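As a rough illustration of a univariate graphical summary, the sketch below draws a text histogram of a single variable using only the Python standard library. The `ages` sample, the `text_histogram` helper, and the bin width of 10 are all assumptions made for this example; in practice a plotting library would usually be used instead.

```python
from collections import Counter

def text_histogram(values, bin_width):
    """Return (bin_start, count) pairs sorted by bin_start."""
    # Map each value to the lower edge of its bin, then count per bin.
    counts = Counter((v // bin_width) * bin_width for v in values)
    return sorted(counts.items())

# Hypothetical sample: ages of 12 survey respondents.
ages = [23, 25, 31, 35, 36, 38, 41, 42, 44, 47, 52, 68]
bin_width = 10

histogram = text_histogram(ages, bin_width)
for bin_start, count in histogram:
    label = f"{bin_start}-{bin_start + bin_width - 1}"
    print(f"{label:<6} {'#' * count}")
```

Each row shows one bin and one `#` per observation, which is enough to see where the values cluster and whether the distribution has a long tail.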
- Multivariate graphical:
Multivariate graphical EDA uses graphics to show the relationships between two or more variables.
- Univariate Non-graphical:
This is the most straightforward type of data analysis because we consider only one variable at a time when examining the data.
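A minimal sketch of univariate non-graphical EDA might compute a handful of summary statistics for one variable. The `incomes` sample below is hypothetical, and the choice of statistics is only one reasonable selection.

```python
import statistics

# Hypothetical sample: monthly incomes (in thousands) for 9 customers.
incomes = [32, 35, 35, 38, 40, 41, 45, 52, 96]

summary = {
    "count": len(incomes),
    "mean": statistics.mean(incomes),
    "median": statistics.median(incomes),
    "mode": statistics.mode(incomes),
    "stdev": statistics.stdev(incomes),  # sample standard deviation
    "min": min(incomes),
    "max": max(incomes),
}

for name, value in summary.items():
    print(f"{name:<7} {value}")
```

In this sample the mean is noticeably higher than the median, which hints that a single large value is pulling the average upward.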
- Multivariate Non-graphical:
Multivariate non-graphical EDA techniques are generally used to show the relationship between two or more variables in the form of either cross-tabulation or statistics.
EDA is crucial because, before getting your hands dirty, it is a good idea to understand the problem statement and the relationships among the variables in the data.
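The two forms mentioned here, cross-tabulation and statistics, can be sketched with the standard library alone. All data below is hypothetical, and `pearson` is a helper written for this example rather than a library function.

```python
import math
from collections import Counter

# Hypothetical records: (customer segment, whether the customer churned).
records = [
    ("basic", "yes"), ("basic", "yes"), ("basic", "no"),
    ("premium", "no"), ("premium", "no"), ("premium", "yes"),
    ("premium", "no"),
]

# Cross-tabulation: count every (segment, churned) combination.
crosstab = Counter(records)
segments = sorted({seg for seg, _ in records})
outcomes = sorted({out for _, out in records})

print(f"{'segment':<10}" + "".join(f"{o:>6}" for o in outcomes))
for seg in segments:
    print(f"{seg:<10}" + "".join(f"{crosstab[(seg, o)]:>6}" for o in outcomes))

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical numeric pair: hours studied and exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 79]
r = pearson(hours, scores)
print(f"correlation: {r:.3f}")
```

The table summarizes how two categorical variables co-occur, while the correlation coefficient summarizes the linear relationship between two numeric ones.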
Basically, the primary motives of EDA are to:
- Examine the data distribution.
- Handle missing values in the dataset.
- Handle outliers.
- Remove duplicate data.
- Encode the categorical variables.
- Normalize and scale the data.
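The cleaning steps in this list can be sketched on a tiny, made-up dataset. Everything below is an assumption chosen for illustration: the `rows` records, median imputation for missing values, the 1.5 × IQR rule for outliers, one-hot encoding, and min-max scaling. Other choices are equally valid depending on the data.

```python
import statistics

# Hypothetical raw records; None marks a missing value.
rows = [
    {"age": 25, "city": "Paris"},
    {"age": 31, "city": "Lyon"},
    {"age": None, "city": "Paris"},
    {"age": 31, "city": "Lyon"},    # exact duplicate of the second row
    {"age": 29, "city": "Nice"},
    {"age": 27, "city": "Paris"},
    {"age": 120, "city": "Lyon"},   # suspiciously large value
]

# 1. Remove exact duplicates, keeping the first occurrence.
seen, deduped = set(), []
for row in rows:
    key = tuple(sorted(row.items()))
    if key not in seen:
        seen.add(key)
        deduped.append(dict(row))

# 2. Fill missing ages with the median of the observed ages.
observed = [r["age"] for r in deduped if r["age"] is not None]
median_age = statistics.median(observed)
for r in deduped:
    if r["age"] is None:
        r["age"] = median_age

# 3. Drop outliers using the 1.5 * IQR rule (one common convention).
q1, _, q3 = statistics.quantiles([r["age"] for r in deduped], n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
cleaned = [r for r in deduped if low <= r["age"] <= high]

# 4. One-hot encode the categorical "city" column.
cities = sorted({r["city"] for r in cleaned})

# 5. Min-max scale age into the range [0, 1].
ages = [r["age"] for r in cleaned]
age_min, age_max = min(ages), max(ages)

prepared = []
for r in cleaned:
    out = {"age_scaled": (r["age"] - age_min) / (age_max - age_min)}
    for city in cities:
        out[f"city_{city}"] = int(r["city"] == city)
    prepared.append(out)

for p in prepared:
    print(p)
```

The order of the steps matters in this sketch: duplicates are removed before imputation so that a repeated row does not influence the median, and outliers are dropped before scaling so that one extreme value does not compress the rest of the range.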
- Real-life examples of EDA
- Healthcare:
EDA is useful for identifying patterns hidden within large collections of medical data. Hospitals, health agencies, and healthcare networks keep a great deal of information in electronic medical records (EMRs). Although stringent compliance guidelines are in place to protect patients’ privacy, people in the healthcare industry constantly look for new ways to use this data without associating it with specific individuals. EMRs can be analyzed using data mining tools, which can provide crucial insights into how chronic diseases such as renal disease progress.
- Marketing:
EDA provides insights into a range of purchasing situations, such as why people stop purchasing a product or why a specific marketing campaign is effective. EDA gives analysts a plan of action for the future by helping them understand the context around those variables.
- Professional sports:
Sports analysts use EDA to identify the most successful players and teams, as well as the factors that affect a team’s success or failure. Sports data insights can also help those who bet on sports online at sites like DraftKings place more informed wagers. EDA is also a useful tool for choosing which athletes or groups a business should support.
- History:
EDA can be used to generate fresh information about the past. By using data collected from sources such as archaeological digs, digitized photos, and text, data analysts can gain a more robust understanding of past events that have remained a mystery for millennia.
- Fraud detection:
When EDA and data mining techniques are applied to Medicare datasets, it is possible to assess the risk that a given individual is involved in fraudulent activity.