Mastering hypothesis testing: A beginner’s guide — theoretical knowledge

This blog post is part of the “Statistical Hypothesis Essentials” series, a set of stories about the basics of hypothesis testing and its vocabulary. To receive each post as soon as it is out, subscribe to my Substack.

📮 Make sure you don’t miss out! Follow this blog and subscribe to an e-mail list to ensure you are among the first to get the article! Please check the rest of the website for detailed articles, cheat sheets, glossaries, and case studies.


The field of statistics is usually divided into two branches: descriptive and inferential statistics. With the methods of descriptive statistics, you get to know the dataset, its variables, and their distributions. With inferential statistics, you dig deeper into the relationships between variables and make inferences, or decisions, about the population. One of the most frequently used parts of inferential statistics is hypothesis testing.

In this article, I will write about the basics of hypothesis testing, its key parts, and its underlying assumptions. This article will include theoretical knowledge only. If you are interested in practical knowledge (using R programming), please visit my blog to see the continuation of this article.

HYPOTHESIS TESTING – WHAT DOES IT STAND FOR?

Hypothesis testing is one of the most widely used statistical methods. It is used to make decisions (inferences) about a population parameter based on a sample of data. Researching a whole population is rarely practical (mainly because of cost and time, but often also because the whole population is simply not available), so we study a small portion of it (in other words, we sample it). Based on that sample and its statistics, we try to learn something about the population the sample comes from.

Hypothesis testing is a method that helps us determine whether the evidence in a sample supports a specific hypothesis about a population.
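
To make the idea of sampling concrete, here is a minimal sketch in Python. The population is synthetic (randomly generated weights with a mean of about 70 kg), and the sample size of 200 is an arbitrary choice for illustration, not a recommendation.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 body weights in kilograms (synthetic data).
population = [random.gauss(70, 10) for _ in range(100_000)]

# Researching everyone is rarely feasible, so we draw a small random sample.
sample = random.sample(population, 200)

population_mean = statistics.fmean(population)  # the parameter (usually unknown)
sample_mean = statistics.fmean(sample)          # the statistic we can compute

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The two numbers will be close but not identical. Hypothesis testing is a way of deciding whether such a gap is small enough to be explained by sampling alone.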

KEY COMPONENTS OF HYPOTHESIS TESTING

Hypothesis testing has several key components:

null hypothesis (H0) – a statement of no effect, or the status quo. It is the hypothesis that researchers typically test against.
For example – the average weight of the population is 70 kilograms.
alternative hypothesis (H1, Ha) – a statement of a change or effect that contradicts the null hypothesis. It is the statement that the researcher expects or wants to find evidence for.
For example – the average weight of the population is not 70 kilograms.
test statistic – a value calculated from the sample data that measures how far the sample deviates from what the null hypothesis predicts. Usually, it is a z-score or a t-score (from a z-test or a t-test). The choice of test depends on the type of data, the sample size, and the assumptions of each test. Common tests are the t-test, z-test, chi-square test, and ANOVA.
significance level (alpha) – the threshold for deciding whether to reject the null hypothesis. The most common alpha is 0.05, but 0.01 and 0.10 are also used, depending on how strict the threshold has to be.
For example – an alpha of 0.05 means that there is a 5% risk of rejecting the null hypothesis when it is true.
p-value – the probability of obtaining a test statistic at least as extreme as the one observed, assuming that the null hypothesis is true.
For example – a p-value of 0.03 means that, if the null hypothesis is true, there is a 3% chance of observing results at least as extreme as those in the sample.
decision rule – this rule combines the p-value and the significance level. Based on those two, we decide whether to reject or fail to reject the null hypothesis.
(CAUTION – you can never say that you accept the null hypothesis, you can only FAIL to reject it)
For example – if the p-value is lower than alpha, you can reject the null hypothesis. If the p-value is higher than or equal to alpha, you fail to reject the null hypothesis.
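
The components above can be tied together in a small worked example. This is only a sketch in Python, using the 70-kilogram hypothesis from the list. The sample mean, the sample size, and the known population standard deviation are made-up values, and a two-sided z-test is assumed because it needs nothing beyond the standard library.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical summary statistics (assumptions for illustration only).
mu_0 = 70.0         # value claimed by H0: the population mean is 70 kg
sample_mean = 72.0  # mean weight observed in the sample
sigma = 10.0        # population standard deviation, assumed known
n = 100             # sample size
alpha = 0.05        # significance level

# Test statistic: how many standard errors the sample mean lies from mu_0.
z = (sample_mean - mu_0) / (sigma / sqrt(n))

# Two-sided p-value: probability of a |z| at least this large if H0 is true.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Decision rule: reject H0 only when the p-value is below alpha.
decision = "reject H0" if p_value < alpha else "fail to reject H0"

print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
# prints: z = 2.00, p-value = 0.0455, decision: reject H0
```

With a stricter alpha of 0.01, the same p-value would lead to failing to reject the null hypothesis, which shows why alpha has to be chosen before looking at the data.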

ASSUMPTIONS OF HYPOTHESIS TESTING

Before you run a hypothesis test, you have to check the assumptions (prerequisites) of the test you plan to use. The two most commonly used tests are the z-test and the t-test, so we will mainly look at the prerequisites for the t-test:

independence – observations have to be independent of each other. To check that, you need to know the study design.
normality – the data should be approximately normally distributed. To check that, you can draw a boxplot and a histogram to visualize the variables. You can also use a Q-Q plot or a formal test such as the Shapiro-Wilk test.
homogeneity of variances – for the standard two-sample t-test, the variances in the two groups should be equal. To check this, use Levene’s test or Bartlett’s test.
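
As a rough illustration of the normality and variance checks, the sketch below computes the numbers behind a Q-Q plot by hand. It is an informal check under stated assumptions, not a replacement for the Shapiro-Wilk, Levene's, or Bartlett's tests named above. The helper `qq_correlation` and all three datasets are hypothetical and exist only for this example.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, variance

def qq_correlation(data):
    """Correlation between sorted data and theoretical normal quantiles.

    Values close to 1 mean the Q-Q plot points fall near a straight line.
    """
    n = len(data)
    observed = sorted(data)
    # Plotting positions (i - 0.5) / n, a common convention for Q-Q plots.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mean_o, mean_t = fmean(observed), fmean(theoretical)
    cov = sum((o - mean_o) * (t - mean_t) for o, t in zip(observed, theoretical))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_t = sum((t - mean_t) ** 2 for t in theoretical)
    return cov / sqrt(ss_o * ss_t)

random.seed(1)
group_a = [random.gauss(70, 10) for _ in range(200)]       # roughly normal
group_b = [random.gauss(72, 10) for _ in range(200)]       # roughly normal
skewed = [random.expovariate(1 / 10) for _ in range(200)]  # clearly skewed

print(f"Q-Q correlation, normal data: {qq_correlation(group_a):.3f}")
print(f"Q-Q correlation, skewed data: {qq_correlation(skewed):.3f}")

# Informal look at homogeneity of variances: ratio of the two sample variances.
ratio = variance(group_a) / variance(group_b)
print(f"variance ratio: {ratio:.2f}")
```

A Q-Q correlation noticeably below 1, or a variance ratio far from 1, is a hint to look closer with a plot and a formal test before trusting a t-test.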

STEPS IN HYPOTHESIS TESTING

If you have checked your data, and it fits the prerequisites mentioned above, you can perform hypothesis testing. Here are the steps:

state the hypotheses – formulate your null and alternative hypotheses according to your research goals and your plan for collecting data.
choose your significance level – choose your alpha, which is the probability of rejecting the null hypothesis when it is true (a type I error). Most of the time, alpha is 0.05.
compute the test statistic – calculate it from the sample data you have collected, using the chosen test.
determine the p-value – calculate the p-value corresponding to the test statistic.
make a decision – compare the p-value to the significance level and decide whether to reject or fail to reject the null hypothesis.
write a conclusion – put the decision in the context of the original question or goal of the research. If the p-value is lower than the significance level, we reject the null hypothesis; otherwise, we fail to reject it.
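
The steps above can be sketched as one small Python function. This is an illustration under assumptions, not a definitive implementation. The name `one_sample_z_test` is hypothetical, the data are synthetic, and a two-sided z-test with a known standard deviation stands in for whichever test fits your data. The second half repeats the procedure on samples for which the null hypothesis is true, to show what alpha means in practice.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def one_sample_z_test(sample, mu_0, sigma, alpha=0.05):
    """Walk through the steps for H0: mean == mu_0 vs H1: mean != mu_0."""
    # Steps 1 and 2 (hypotheses and alpha) are fixed by the arguments.
    # Step 3: compute the test statistic from the sample data.
    z = (fmean(sample) - mu_0) / (sigma / sqrt(len(sample)))
    # Step 4: determine the two-sided p-value for that statistic.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 5: make a decision by comparing the p-value with alpha.
    reject = p_value < alpha
    # Step 6: phrase the conclusion in terms of the research question.
    if reject:
        conclusion = f"p = {p_value:.3f} < {alpha}: evidence that the mean differs from {mu_0}"
    else:
        conclusion = f"p = {p_value:.3f} >= {alpha}: not enough evidence that the mean differs from {mu_0}"
    return z, p_value, reject, conclusion

random.seed(7)

# One synthetic sample whose true mean (75) differs from the hypothesised 70.
sample = [random.gauss(75, 10) for _ in range(50)]
print(one_sample_z_test(sample, mu_0=70, sigma=10)[3])

# Repeat the procedure on samples for which H0 is actually true.
# The share of (wrong) rejections should land near alpha.
trials = 4000
false_rejections = sum(
    one_sample_z_test([random.gauss(70, 10) for _ in range(30)], 70, 10)[2]
    for _ in range(trials)
)
type_one_rate = false_rejections / trials
print(f"share of true null hypotheses rejected: {type_one_rate:.3f}")
```

The share printed at the end should come out close to 0.05. Those rejections are type I errors, which is exactly the risk that alpha is meant to control.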

CONCLUSION

Hypothesis testing is one of the fundamental parts of inferential statistics. It lets you make decisions about a population based on sample data. It is extremely important that you know your data, the study design behind it, and how the data was collected, so that you can avoid bias and reduce the risk of type I and type II errors when performing this kind of testing. Additionally, ensuring that the assumptions of your chosen test are met is crucial for accurate results and good decision-making based on those results.
