Statistics for AIML - Inferential Statistics - Statistical Tests and their Types Tutorial
Statistical tests are used in hypothesis testing. They can be used to:
• determine whether a predictor variable has a statistically significant relationship with an outcome variable.
• estimate the difference between two or more groups.
Statistical tests assume a null hypothesis of no relationship or no difference between groups. They then determine whether the observed data fall outside the range of values predicted by the null hypothesis.
Common Tests in Statistics: https://www.youtube.com/playlist?list=PL1-AEK7gI02MMwOtMG9x6gPrxZE125m1-
a. T-Test/Z-Test
b. ANOVA
c. Chi-Square Test
d. MANOVA
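To make the hypothesis-testing idea above concrete, here is a minimal sketch of a two-sample t-test using SciPy. The data values are made up for illustration; the null hypothesis is that the two group means are equal.

```python
# Two-sample t-test: is the difference between two group means significant?
# The data below are invented for illustration only.
from scipy import stats

group_a = [5.1, 4.9, 5.4, 5.0, 5.2, 4.8, 5.3]
group_b = [5.6, 5.8, 5.5, 5.9, 5.7, 6.0, 5.4]

# Null hypothesis: the two groups have the same mean.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

# A small p-value means the observed data fall outside the range
# predicted by the null hypothesis, so we reject it.
if p_value < 0.05:
    print("Reject the null hypothesis: the group means differ.")
else:
    print("Fail to reject the null hypothesis.")
```

With these clearly separated groups, the p-value comes out far below 0.05, so the null hypothesis of equal means is rejected.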
ANOVA-
ANOVA stands for analysis of variance. It is a statistical technique used to study the significance of the difference in the means of two or more samples, by comparing the amount of variation within the samples with the amount of variation between the samples. It bifurcates the total variation in the dataset into two parts: the amount ascribed to chance and the amount ascribed to specific causes.
ANOVA F(N, D) = (Variation between samples) / (Variation within samples)
where N and D are the degrees of freedom of the numerator and the denominator, respectively.
ANOVA should be used when:
- The populations from which the samples are drawn are normally distributed.
- The samples are independent and random.
- Each population has the same variance.
It is of two types:
One-way ANOVA: when a single factor, with several possible levels, is used to investigate the difference amongst different categories.
Two-way ANOVA: when two factors are investigated simultaneously, to measure how each factor, and the interaction between them, influences the values of a variable.
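The one-way case can be sketched with SciPy's `f_oneway`, which computes exactly the F ratio given above: variation between samples divided by variation within samples. The three groups below are made-up data for illustration.

```python
# One-way ANOVA: do the means of three groups differ significantly?
# The data below are invented for illustration only.
from scipy import stats

group1 = [85, 86, 88, 75, 78, 94, 98]
group2 = [91, 92, 93, 85, 87, 84, 82]
group3 = [79, 78, 88, 94, 92, 85, 83]

# F = (variation between samples) / (variation within samples)
f_stat, p_value = stats.f_oneway(group1, group2, group3)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
```

A large F (and correspondingly small p-value) would indicate that the between-group variation is too large to be ascribed to chance alone.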
ANCOVA-
ANCOVA stands for Analysis of Covariance. It is an extended form of ANOVA that removes the effect of one or more interval-scaled extraneous variables from the dependent variable before the analysis is carried out. It is the midpoint between ANOVA and regression analysis, wherein one variable in two or more populations can be compared while accounting for the variability of other variables.
When the set of independent variables consists of both a factor (a categorical independent variable) and a covariate (a metric independent variable), the technique used is known as ANCOVA. The difference in the dependent variable due to the covariate is removed by adjusting the dependent variable's mean value within each treatment condition.
This technique is appropriate when the metric independent variable is linearly associated with the dependent variable and not with the other factors. It is based on certain assumptions:
- There is some relationship between the dependent variable and the uncontrolled variable.
- The relationship is linear and is identical from one group to another.
- The various treatment groups are picked at random from the population.
- The groups are homogeneous in variability.
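A full ANCOVA is usually run with a dedicated library, but the core idea, adjusting the dependent variable for the covariate before comparing groups, can be sketched in two steps: regress the dependent variable on the covariate, then run a one-way ANOVA on the residuals. This is a simplified illustration, not an exact ANCOVA, and all of the data below are made up.

```python
# Simplified ANCOVA-style adjustment (an illustration, not a full ANCOVA):
# remove the covariate's linear effect, then compare groups on what remains.
import numpy as np
from scipy import stats

# Dependent variable (e.g. test score), covariate (e.g. hours studied),
# and group labels -- all invented for illustration.
scores = np.array([70, 74, 78, 80, 85, 88, 72, 76, 81, 90, 93, 95])
hours  = np.array([ 2,  3,  4,  4,  6,  7,  2,  3,  5,  7,  8,  9])
group  = np.array([ 0,  0,  0,  0,  0,  0,  1,  1,  1,  1,  1,  1])

# Step 1: regress the dependent variable on the covariate and take residuals,
# i.e. the variation left after the covariate's linear effect is removed.
slope, intercept = np.polyfit(hours, scores, 1)
residuals = scores - (slope * hours + intercept)

# Step 2: one-way ANOVA on the adjusted (residual) values across groups.
f_stat, p_value = stats.f_oneway(residuals[group == 0], residuals[group == 1])
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
```

This matches the assumptions listed above: the covariate's relationship with the dependent variable is assumed linear and the same in every group.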
Difference between ANOVA and ANCOVA?
The points given below are substantial so far as the difference between ANOVA and ANCOVA is concerned:
• The technique of analysing the variance among the means of multiple groups is known as Analysis of Variance, or ANOVA. A statistical process used to remove the impact of one or more metric-scaled undesirable variables from the dependent variable before undertaking research is known as ANCOVA.
• ANOVA uses both linear and non-linear models. On the contrary, ANCOVA uses only a linear model.
• ANOVA entails only a categorical independent variable, i.e. a factor. As against this, ANCOVA encompasses both a categorical and a metric independent variable.
• A covariate is not taken into account in ANOVA, but it is considered in ANCOVA.
• ANOVA attributes between-group variation exclusively to treatment. In contrast, ANCOVA divides between-group variation into treatment and covariate components.
• ANOVA attributes within-group variation to individual differences. Unlike ANOVA, ANCOVA bifurcates within-group variance into individual differences and the covariate.
Chi – Square Test
Chi-square is a statistical test that examines differences between categorical variables from a random sample, in order to determine how well the observed results fit the expected results.
Here are some of the uses of the Chi-Squared test:
- The Chi-squared test can be used to see if your data follows a well-known theoretical probability distribution like the Normal or Poisson distribution.
- The Chi-squared test allows you to assess your trained regression model's goodness of fit on the training, validation, and test data sets.
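The first use above, checking whether observed counts match a theoretical distribution, can be sketched with SciPy's `chisquare`, which by default compares the observed counts against a uniform expectation. The die-roll counts below are made up for illustration.

```python
# Chi-square goodness-of-fit: do observed die rolls match a fair die?
# The counts below (100 rolls in total) are invented for illustration.
from scipy import stats

observed = [18, 22, 16, 14, 12, 18]

# Under the null hypothesis of a fair die, each face is expected
# 100/6 times; stats.chisquare assumes uniform expected counts by default.
chi2, p_value = stats.chisquare(observed)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

A large chi-square statistic (small p-value) would indicate that the observed counts deviate from the expected counts by more than chance alone would explain.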