Statistics Essentials

Understanding Outliers in Data Analysis: Insights from R

Perfectly normal data is rare in data analytics. You will almost never find a variable in a dataset that follows a perfect mound-shaped, Gaussian distribution. Often, the departure from normality is due to outliers: data points that deviate markedly from the rest of the data. In this article, we will discuss why outliers occur and how to deal with them, so that you can apply methods and techniques that assume a normal distribution.