Statistics for AIML - Regression Metrics - Outlier Tutorial

What is an outlier?

An outlier is an observation that lies an abnormal distance from other values in a random sample from a population. In a sense, this definition leaves it up to the analyst to decide what will be considered abnormal.

 

Common Causes of Outliers

  1. Data entry errors (human errors)
  2. Measurement errors (instrument errors)
  3. Experimental errors (data extraction or experiment planning/executing errors)
  4. Intentional (dummy outliers made to test detection methods)
  5. Data processing errors (data manipulation or unintended data set mutations)
  6. Sampling errors (extracting or mixing data from wrong or various sources)
  7. Natural (not an error, novelties in data)

Common methods of detecting an outlier

1.   Sort the data and look for extreme values

2.   Plot the data: boxplot, scatterplot

3.   IQR method

4.   Z-score method
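
The Z-score method is only named in the list above, so the following is a minimal sketch of one common form of it rather than a definitive implementation. It assumes the sample mean and the sample standard deviation; the function name `zscore_outliers` and its `threshold` parameter are hypothetical. The readings are the temperatures from the exercise later in this tutorial.

```python
from statistics import mean, stdev


def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds the threshold.

    Assumption: z = (value - sample mean) / sample standard deviation.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (assumption)
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


readings = [26.0, 15.0, 20.5, 31.0, -350.0, 31.0, 30.5]

print(zscore_outliers(readings, threshold=3.0))  # []
print(zscore_outliers(readings, threshold=2.0))  # [-350.0]
```

With only seven values, an absolute z-score computed this way cannot exceed (n - 1) / sqrt(n), which is about 2.27, so the often-quoted cutoff of 3 flags nothing here. A lower cutoff or the IQR method is likely a better fit for very small samples.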
 
Why do we need to treat outliers?

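Outliers can drastically distort the results of analysis and statistical modeling. A single extreme value can shift the mean, inflate the standard deviation, and pull a fitted regression line away from the bulk of the data.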
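
To illustrate the effect, here is a small sketch using the temperature readings from the exercise later in this tutorial. Dropping -350.0 by hand is only for illustration; the variable names are hypothetical.

```python
from statistics import mean, median

readings = [26.0, 15.0, 20.5, 31.0, -350.0, 31.0, 30.5]

# Illustration only: remove the suspicious reading by hand.
cleaned = [v for v in readings if v != -350.0]

print(mean(readings), median(readings))           # -28.0 26.0
print(round(mean(cleaned), 2), median(cleaned))   # 25.67 28.25
```

One bad reading drags the mean from about 25.67 down to -28.0, while the median only moves from 28.25 to 26.0.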

IQR Method

A data value is considered an outlier if

Data Value < Q1 - 1.5(IQR)

OR

Data Value > Q3 + 1.5(IQR)

where Q1 and Q3 are the first and third quartiles, and IQR = Q3 - Q1 is the interquartile range.

 

Q. Can you identify the outliers in the dataset below, using the IQR method?

26.0 ℃, 15.0 ℃, 20.5 ℃, 31.0 ℃, -350.0 ℃, 31.0 ℃, 30.5 ℃

Arranged in ascending order: -350, 15, 20.5, 26, 30.5, 31, 31

minimum = -350, maximum = 31

median = 26 (Q2), Q1 = 15 (median of the lower half: -350, 15, 20.5), Q3 = 31 (median of the upper half: 30.5, 31, 31)

Q1 - 1.5(IQR) = Q1 - 1.5(Q3 - Q1) = 15 - 1.5(31 - 15) = 15 - 1.5(16) = 15 - 24 = -9

Q3 + 1.5(IQR) = Q3 + 1.5(Q3 - Q1) = 31 + 1.5(31 - 15) = 31 + 1.5(16) = 31 + 24 = 55

 

-350 is an outlier because it lies outside the range (-9, 55). Every other value falls inside that range.
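
The worked example above can be sketched in Python as follows. This is a minimal sketch, assuming the same quartile convention as the example (Q1 and Q3 are the medians of the lower and upper halves, with the overall median left out when the count is odd). The function names `iqr_fences` and `iqr_outliers` and the parameter `k` are hypothetical.

```python
from statistics import median


def iqr_fences(values, k=1.5):
    """Return (lower, upper) limits: Q1 - k*IQR and Q3 + k*IQR."""
    s = sorted(values)
    half = len(s) // 2
    lower_half = s[:half]
    # With an odd count, the median itself is left out of both halves.
    upper_half = s[half + 1:] if len(s) % 2 else s[half:]
    q1, q3 = median(lower_half), median(upper_half)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def iqr_outliers(values, k=1.5):
    """Return the values that fall outside the IQR limits."""
    low, high = iqr_fences(values, k)
    return [v for v in values if v < low or v > high]


readings = [26.0, 15.0, 20.5, 31.0, -350.0, 31.0, 30.5]

print(iqr_fences(readings))    # (-9.0, 55.0)
print(iqr_outliers(readings))  # [-350.0]
```

Other quartile conventions, such as interpolation-based ones, can give slightly different values for Q1 and Q3, so the limits may not match this hand calculation exactly when a different method is used.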

 
