Analysis of Variance (1/2)

In this lesson, you’re expected to:
– understand what the ANOVA test is used for
– learn about the One-Way ANOVA test
– discover how to conduct an F-test and analyze p-values

There is a wide range of scenarios where we are interested in comparing the means of two or more populations. 

The manager of Inditex (https://en.wikipedia.org/wiki/Inditex) may want to know whether T-shirts with graphics are in higher demand than those without graphics. She might also be interested in whether pants displayed in the first row of their online store generate more sales than those displayed at the bottom.

At university, students often think that it is easier to get a higher grade in Professor A’s class than in Professor B’s class, but is this really true?

In all these scenarios, we want to assess whether there is a significant difference between the means of different groups.

What is ANOVA?

The ANOVA, or analysis of variance, is a methodology used to directly compare the means of different groups.

Hence, when observations are grouped by a categorical variable, ANOVA is used to test whether the mean of a numerical variable differs among the categories.

What is ANOVA used for?

The ANOVA methodology allows us to study how a random numerical variable varies with the values taken by other variables, called factors.

Factors must take discrete values*. However, the original variable need not be categorical.

* Discrete data can only take particular values. It can be numeric (like numbers of apples), but it can also be categorical (like red or blue, male or female, good or bad, etc.).

For example, imagine an insurance company wants to analyze how the accident rate of their policies varies with the age group of the policyholder. Originally, the age variable is continuous and can take any positive value. However, since we are interested in the age range, we can turn the variable into a factor by assigning each age to an age range.

Thus, we will transform the exact numerical age into the corresponding age ranges: 0-10, 10-20, 20-35, 35-45, 45-65, 65+.
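The transformation above can be sketched in a few lines of Python. This is only an illustration: the function name `age_group` is hypothetical, and because the ranges in the text share their boundaries (10 appears in both 0-10 and 10-20), the sketch assumes that a boundary age belongs to the higher range.

```python
from bisect import bisect_right

# Lower bounds and labels of the age ranges listed in the text.
# Assumption: a boundary age (e.g. exactly 10) falls into the higher range.
LOWER_BOUNDS = [0, 10, 20, 35, 45, 65]
LABELS = ["0-10", "10-20", "20-35", "35-45", "45-65", "65+"]


def age_group(age):
    """Map an exact numerical age to the label of its age range."""
    if age < 0:
        raise ValueError("age must be non-negative")
    # bisect_right counts how many lower bounds are <= age.
    return LABELS[bisect_right(LOWER_BOUNDS, age) - 1]


# Hypothetical policyholder ages, turned into levels of the factor.
ages = [7, 10, 34.5, 52, 70]
groups = [age_group(a) for a in ages]
print(groups)
```

After this step, each policyholder carries a discrete label that can serve as the factor in an ANOVA.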

ANOVA studies how the expected value of a numerical variable is conditioned by the groups of the factors.

Note that ANOVA tests for equality of means, and equality in means does not imply equality in medians.

[Optional] Analysis of Variance (ANOVA)
Watch this 5-minute video to learn more:
https://www.youtube.com/watch?v=ITf4vHhyGpc

ANOVA is based on the means model, which expresses the value Y of an observation belonging to group i as the sum of:

• the mean for all groups
• the deviation from the mean for group i
• the random noise
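
Written as an equation, the three components listed above might look as follows. The symbols are chosen here for illustration and are not named in the lesson: Y_ij is observation j in group i, μ is the mean for all groups, τ_i is the deviation of group i from that mean, and ε_ij is the random noise, assumed to have mean zero.

```latex
Y_{ij} = \mu + \tau_i + \varepsilon_{ij},
\qquad \mathbb{E}[\varepsilon_{ij}] = 0
\quad\Longrightarrow\quad
\mathbb{E}[Y_{ij}] = \mu + \tau_i
```

Under this notation, saying that all group means are equal amounts to saying that every τ_i is zero.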

Enlarged version: http://bit.ly/2nGBUhW

One-Way ANOVA

In one-way ANOVA, the analysis is limited to evaluating how the expected value of the dependent variable is conditioned by a single factor.

For example, the sales of pants might be affected not only by the color but also by the type (cargo, chinos, jeans, etc.). With one-way ANOVA, however, we are limited to analyzing each factor separately.

We could use one-way ANOVA to evaluate how the color affects the number of sales. In this case, the factor would be the color and the groups: yellow, black, white, and beige.

[Optional] One-Way ANOVA

Comparing with the Null Hypothesis

The null hypothesis, usually denoted by H0, represents the hypothesis that sample observations result purely from chance.

By contrast, the alternative hypothesis, denoted by H1, is the hypothesis that sample observations are influenced by some non-random cause.

Thus, in our case, the null hypothesis is that there is no difference in means among the groups, that is, the population means for the groups are the same.

For example, in the case of the relation between the sales of pants and the color of the pants:

The null hypothesis is that the means of the number of sales for the different colors are the same.

The alternative hypothesis, H1, would be that at least one of the means of the four colors is different. Note that three of the means could be the same: if just one differs significantly, we reject the null hypothesis in favor of the alternative.

Null Hypothesis (H0): μ_1 = μ_2 = ... = μ_k, where μ_i is the population mean of group i and k is the number of groups.

Type of Errors and Significance Levels

When we are testing a hypothesis, we can make two types of errors:

Type I Error: Reject the null hypothesis when it is true.
This involves asserting a difference that does not exist and is called a False Positive.

Type II Error: Fail to reject the null hypothesis when it is false.
In this case, we are failing to assert a difference that is really present in the data. This is called a False Negative.

When we test a hypothesis, we need to choose a level of significance. The level of significance, denoted as α, represents the probability of rejecting the null hypothesis when it is actually true.

The F-Test & P-Value
Rejecting the Null Hypothesis

ANOVA uses the F statistic to evaluate the hypotheses in what is called an F test.

In addition, a p-value is computed from the F statistic using an F distribution.

The F-test compares the components of the total variation. For example, in one-way (single-factor) ANOVA, statistical significance is tested by comparing the F statistic with a critical value of the F distribution.

The F statistic is calculated as:

F = MSG / MSE, with MSG = SSG / (k - 1) and MSE = SSE / (n - k)

The greater the value of F, the less plausible the null hypothesis: the numerator grows with the between-group variability, while the denominator measures the within-group variability.

Thus, a sufficiently large value of this test statistic leads us to reject the null hypothesis and assert a difference among the groups.

• MSG (mean square between groups) measures the between-group variability
• MSE (mean square error) measures the variability within each of the groups
• SSG = the sum of squares between groups
• SSE = the sum of squared errors
• n = the total number of observations
• k = the number of groups

We fail to reject the null hypothesis at significance level α if F ≤ F(α; k - 1, n - k), the critical value that an F distribution with k - 1 and n - k degrees of freedom exceeds with probability α.

Otherwise, we reject it in favor of the alternative hypothesis. The larger the between-group variability (MSG) relative to the within-group variability (MSE), the larger F will be and the stronger the evidence against the null hypothesis.
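
As a minimal sketch, the quantities defined above (SSG, SSE, MSG, MSE, n, k) can be computed directly with the Python standard library. The function name `one_way_f` and the sales figures are hypothetical, made up purely for illustration, and each group is assumed to contain at least one observation.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, k - 1, n - k) for a list of groups of numerical observations."""
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = mean([x for g in groups for x in g])

    # SSG: squared distance of each group mean from the grand mean,
    # weighted by the size of the group.
    ssg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SSE: squared distance of each observation from its own group mean.
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)

    msg = ssg / (k - 1)   # between-group variability
    mse = sse / (n - k)   # within-group variability
    return msg / mse, k - 1, n - k


# Hypothetical daily sales of pants for four colors (invented data).
sales_by_color = {
    "yellow": [10, 12, 11],
    "black": [15, 14, 16],
    "white": [11, 13, 12],
    "beige": [10, 9, 11],
}
f_stat, df_between, df_within = one_way_f(list(sales_by_color.values()))
print(f_stat, df_between, df_within)
```

With these invented numbers, SSG = 42 and SSE = 8, so MSG = 14, MSE = 1, and F = 14 with 3 and 8 degrees of freedom.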

Because larger values of F represent stronger evidence against the null hypothesis, we use the upper tail of the distribution to compute a p-value.

Disadvantages of using the F-test

The main problem with the F statistic is that it cannot be interpreted immediately. Once it is computed, we need to consult the F-tables to know whether we can reject the null hypothesis or not.

Whether we reject the hypothesis depends not only on the value of F but also on the sample size and the number of groups.

Let’s go back to the example of the number of sales and the color of the pants.

We had four groups (the categorical variable color had 4 different values). Imagine that we get a result of F=5 for the F-test.

That result would mean very different things for a sample of 10 and for a sample of 100. In the case of 100 observations, we will probably reject the null hypothesis and state a difference in means.

In contrast, for a sample of 10, we would fail to reject the null hypothesis. Note that for k=4, the critical value of F for a significance level of 0.01 is approximately 10 for a sample of size 10 and approximately 4 for a sample of size 100.

The p-value

The p-value is the probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed. It is not the probability that the null hypothesis is true. We reject H0 when the p-value is smaller than the chosen significance level α, and fail to reject it otherwise.

So what does a very small p-value mean? 
It indicates that differences in means between groups are significant and that we can reject the null hypothesis.
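
To see how the same F value can lead to different p-values, the sketch below approximates the upper-tail probability of the F distribution for F = 5 with k = 4 groups and samples of 10 and 100 observations. It is an illustration under stated assumptions, not the lesson's own method: the function name `f_upper_tail` is hypothetical, the tail probability is obtained from the regularized incomplete beta function integrated with Simpson's rule, and in practice a statistics library (for example `scipy.stats.f.sf`) would normally be used instead.

```python
import math


def f_upper_tail(f_value, df_between, df_within, steps=2000):
    """Approximate P(F >= f_value) for an F(df_between, df_within) distribution.

    Uses P(F >= f) = I_x(df_within / 2, df_between / 2) with
    x = df_within / (df_within + df_between * f), where I_x is the
    regularized incomplete beta function.
    Assumes f_value > 0, df_within >= 2, and an even number of steps.
    """
    a = df_within / 2
    b = df_between / 2
    x = df_within / (df_within + df_between * f_value)

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    # Composite Simpson's rule on the interval [0, x].
    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3

    # Normalize by the complete beta function B(a, b).
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)


k = 4  # four colors
p_small_sample = f_upper_tail(5.0, k - 1, 10 - k)    # n = 10
p_large_sample = f_upper_tail(5.0, k - 1, 100 - k)   # n = 100
print(p_small_sample, p_large_sample)
```

For the sample of 10, the p-value is roughly 0.045, while for the sample of 100 it falls below 0.01. At an assumed significance level of 0.01, only the larger sample leads to rejecting the null hypothesis.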

The significance level α, against which the p-value is compared, is the probability of making a Type I error. For research and many business applications, the level of significance is chosen as 0.05 (5%). However, other frequent choices include 0.001 (0.1%), 0.01 (1%), and 0.10 (10%).
