Hypothesis testing is a statistical procedure for making inferences (logical conclusions) about unknown population parameters from sample observations, by testing a hypothesis to decide whether to reject or accept it.
In other words, it is a statistical test used to make decisions about estimates of population parameters by checking the validity of those estimates. The main objective of hypothesis testing is to decide, on the basis of sample data, whether to reject or accept the hypothesis being tested.
What is hypothesis testing?
In its elementary stage, a hypothesis may be any hunch, guess, or imaginative idea that becomes the basis for investigation. A hypothesis is a tentative generalization, the validity of which remains to be tested. In other words, a statistical hypothesis is a statement about the parameter(s) of the population from which the samples are drawn. A hypothesis may or may not be true.
According to Webster, a hypothesis is “a tentative theory or supposition provisionally adopted to explain certain facts and to guide in the investigation of others”.
Hypotheses are commonly classified in two ways:
- Parametric and non-parametric hypotheses (according to the nature of the distribution).
- Null and alternative hypotheses (statistical, or according to the action decision procedure).
Definition and example of a parametric hypothesis
A statistical hypothesis that is a tentative statement about the value of a population parameter is called a parametric hypothesis. Examples: the average marriage age of Nepalese women is 20 years; the proportions of rice eaters and wheat eaters in Nepal are equal.
What is a non-parametric hypothesis?
A statistical hypothesis that is a tentative statement about attributes is called a non-parametric hypothesis. Examples: the mother’s eye color is associated with the daughter’s eye color; there is a significant association between smoking habit and cancer.
What is the null hypothesis (H0)?
- Null hypothesis is the hypothesis of no difference between sample statistics and population parameter.
- Null means no difference or zero difference. Thus, it is also the hypothesis of no difference which is under test or verification.
- More precisely, if the difference between the true (hypothesized) value and the expected value of the parameter is equal to zero, the hypothesis is called the null hypothesis.
- Generally, the null hypothesis is expected to be rejected, because its rejection supports the researcher’s interest.
- If the sample data do not provide sufficient evidence against H0, we accept the null hypothesis (strictly speaking, we fail to reject it); if they do provide sufficient evidence against H0, we reject H0 and accept the alternative hypothesis.
- According to Prof. R. A. Fisher, “the null hypothesis is the hypothesis which is tested for possible rejection under the assumption that it is true”.
What is the alternative hypothesis (H1) in statistics?
- Alternative hypothesis is the hypothesis of difference between sample statistics and population parameter.
- If the difference between true and expected value of the population parameter is not equal to zero then the hypothesis is called alternative hypothesis.
- In other words, any hypothesis which is mutually exclusive of and complementary to the null hypothesis is called an alternative hypothesis.
- It is usually denoted by H1.
- In the hypothesis testing procedure, we always test the null hypothesis against the alternative hypothesis. If the null hypothesis (H0) is rejected, then the alternative hypothesis (H1) is accepted.
- The alternative hypothesis may be one-tailed (left-tailed or right-tailed) or two-tailed.
One-tailed and two-tailed tests in hypothesis testing
- A test of a statistical hypothesis depends upon the alternative hypothesis used in the test.
- A test is said to be one-tailed if the alternative hypothesis used is one-tailed. Similarly, a test is said to be two-tailed if a two-tailed alternative hypothesis is used.
- More precisely, let θ be the parameter being tested and θ0 be some specified value of θ. Then:
- A test of the null hypothesis H0: θ = θ0 against the alternative hypothesis H1: θ < θ0 is called a left-tailed test.
- A test of the null hypothesis H0: θ = θ0 against the alternative hypothesis H1: θ > θ0 is called a right-tailed test.
- A test of the null hypothesis H0: θ = θ0 against the alternative hypothesis H1: θ ≠ θ0 is called a two-tailed test.
How to choose between a one-tailed and a two-tailed test
- If the direction of the difference is not given or fixed in the statement of the hypothesis, we should use a two-tailed test.
- If a direction of difference (such as increase, decrease, superior, inferior, majority, minority, more than, less than, high, low, more effective or less effective) is included in the statement of the hypothesis, we should use a one-tailed test.
- In a one-tailed test, if the value of the sample statistic (such as the sample mean) is greater than the population parameter (such as the population mean), we use a right-tailed test; if it is less than the population parameter, we use a left-tailed test.
Level of significance in hypothesis testing
- The probability of rejecting null-hypothesis H0 when it is true is called level of significance of the test.
- Level of significance is generally expressed in percentage.
- It is denoted by α.
- The frequently used levels of significance in hypothesis testing are 5% and 1%. If α is fixed at 5% (α = 0.05), we are prepared to take a 5% risk of rejecting a true null hypothesis H0, and the corresponding confidence level is (1 − α) = 1 − 0.05 = 0.95, i.e. 95%.
- The higher the level of significance, the higher the probability of type-I error and the lower the confidence level.
- Therefore, we fix the value of α in advance at a small level before applying any test, so that the probability of type-I error is controlled; the test is then chosen so that the probability of type-II error is as small as possible.
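The meaning of α as the probability of rejecting a true H0 can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: it uses a two-tailed Z test of H0: μ = 0 with known σ = 1, and the function name `type_one_error_rate`, the sample size, the number of trials and the seed are all hypothetical choices, not part of the procedure described above.

```python
import random
from statistics import NormalDist, fmean

def type_one_error_rate(alpha=0.05, n=20, trials=10000, seed=0):
    """Estimate how often a two-tailed Z test rejects a TRUE H0: mu = 0."""
    rng = random.Random(seed)
    # Two-tailed critical value of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Samples are drawn from N(0, 1), so H0 really is true here.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Z statistic with known sigma = 1: (mean - 0) / (1 / sqrt(n)).
        z = fmean(sample) / (1.0 / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # a type-I error: true H0 rejected
    return rejections / trials

print("alpha = 5%:", type_one_error_rate(alpha=0.05))
print("alpha = 1%:", type_one_error_rate(alpha=0.01))
```

The two printed proportions should come out close to 0.05 and 0.01 respectively, since every rejection in this simulation is a rejection of a true null hypothesis.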
Type-I and Type-II Errors in hypothesis testing
When the test procedure is applied to test the null hypothesis H0 against the alternative hypothesis H1, we may commit one of the following two types of errors.
- Type-I Error.
- Type-II Error.
What is Type-I Error?
- The error committed in rejecting null-hypothesis, H0 when it is true, is called type-I error.
- The probability of committing type-I error is called level of significance or the size of the test or area of the critical region.
- It is usually denoted by α, i.e., α = P(type-I error) = P(reject H0 when H0 is true).
Type-II error:
- The error committed in accepting null-hypothesis, H0 when it is false, is called type-II error.
- The probability of committing type-II error is denoted by β, i.e., β = P(type-II error) = P(accept H0 when H0 is false).
- It should be noted that a type-II error is often regarded as more dangerous or harmful than a type-I error, although this depends on the situation.
- For example, accepting a rotten egg is more harmful than rejecting a good egg.
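To make β concrete, the following sketch computes it for one particular case. It assumes a right-tailed Z test of H0: μ = μ0 with known σ, and a specific true mean μ1 under which H0 is false; the function name `type_two_error` and all numbers in the usage lines are hypothetical.

```python
from statistics import NormalDist

def type_two_error(mu0, mu1, sigma, n, alpha=0.05):
    """beta = P(accept H0 | true mean is mu1) for a right-tailed Z test."""
    std_normal = NormalDist()
    # Right-tailed critical value: H0 is rejected when Z > z_alpha.
    z_alpha = std_normal.inv_cdf(1 - alpha)
    # Distance between mu1 and mu0 measured in standard errors.
    shift = (mu1 - mu0) / (sigma / n ** 0.5)
    # When the true mean is mu1, Z is normal with mean `shift` and variance 1,
    # so P(Z <= z_alpha) = Phi(z_alpha - shift).
    return std_normal.cdf(z_alpha - shift)

# Hypothetical setting: H0: mu = 20, true mean 21, sigma = 2, n = 25.
print("beta at alpha = 5%:", type_two_error(20, 21, 2, 25, alpha=0.05))
print("beta at alpha = 1%:", type_two_error(20, 21, 2, 25, alpha=0.01))
```

In this setting, lowering α from 5% to 1% raises β, which is why the two error probabilities cannot both be made small for a fixed sample size.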
Critical region and acceptance region in hypothesis testing
- All the possible values of the statistic given by all the possible samples form a sample space. In other words, the sample space is the area covered by the sample statistic. This sample space is divided into two regions, namely the critical region and the acceptance region.
- The region which leads to the rejection of null hypothesis is called rejection region (RR) or critical region (CR).
- In other words, it is the region in which the null hypothesis H0 is rejected even when it is true; its area equals the level of significance α.
- If the computed value of the test statistic falls in the critical region, we reject H0 and accept H1.
- The region of the sample space that is mutually exclusive of and complementary to the critical region is called the acceptance region (AR).
- In other words, the region of sample space which leads to the acceptance of null hypothesis is called acceptance region.
- Thus, if the computed value of the test statistic falls in the acceptance region, we accept H0 and reject H1.
If the test statistic follows a normal distribution, the critical region and acceptance region can be illustrated with standard normal curves.
What is the critical value in hypothesis testing?
The value of the test statistic which separates the acceptance region from the rejection region is called the critical value (tabulated value). It is also known as the significant value. The critical value depends upon the following points:
i. Nature of the sampling distribution of the test statistic (t-, F-, Z- or χ² distribution).
ii. Level of significance (α).
iii. Type of alternative hypothesis used (right-tailed, left-tailed or two-tailed).
iv. Degrees of freedom.
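For the Z distribution, the tabulated critical values can also be computed directly. The sketch below is a minimal illustration using Python's standard library; the helper name `z_critical` is hypothetical, and it covers only the standard normal case (t, F and χ² critical values also depend on degrees of freedom and are not handled here).

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Cut-off point of the standard normal distribution for a given alpha and tail."""
    std_normal = NormalDist()
    if tail == "two":
        # alpha is split equally between the two tails; the region is |Z| > value.
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        # The whole of alpha lies in the right tail; the region is Z > value.
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        # The whole of alpha lies in the left tail; the value is negative, region is Z < value.
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right' or 'left'")

for alpha in (0.05, 0.01):
    for tail in ("two", "right", "left"):
        print(f"alpha = {alpha}, {tail}-tailed: {z_critical(alpha, tail):.3f}")
```

The same level of significance gives different critical values for one-tailed and two-tailed tests, because a two-tailed test places α/2 in each tail.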
Degrees of freedom:
- Degrees of freedom refers to the number of free variables in a system. It is defined as the total number of observations minus the number of independent constraints (restrictions) imposed on the set of observations.
- In other words, the number of independent variates which make up a statistic is called its degrees of freedom. It is denoted by d.f.
- Generally, for n observations there is one restriction, so the degrees of freedom are n − 1.
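A familiar case of the n − 1 rule is the sample variance. The sketch below uses a small set of hypothetical observations to show the single restriction (deviations from the sample mean sum to zero) and the resulting divisor; the variable names are illustrative only.

```python
from statistics import fmean, pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations
n = len(data)
mean = fmean(data)
deviations = [x - mean for x in data]

# The one restriction: deviations from the sample mean always sum to zero,
# so once n - 1 of them are known the last one is fixed.
constraint = sum(deviations)

sum_of_squares = sum(d * d for d in deviations)
sample_var = sum_of_squares / (n - 1)   # divisor is the degrees of freedom, n - 1

print("sum of deviations:", constraint)
print("degrees of freedom:", n - 1)
print("sample variance (divisor n - 1):", sample_var)
print("statistics.variance gives:", variance(data))
print("population variance (divisor n):", pvariance(data))
```

Here `statistics.variance` divides by n − 1, while `statistics.pvariance` divides by n, so the two results differ for the same data.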
Steps of hypothesis testing
To test the null hypothesis against the alternative hypothesis, the following steps are followed.
1. Setting of hypothesis:
At the first step, we set up the null hypothesis and the alternative hypothesis.
2. Level of significance:
We fix the level of significance (α) at 5% or 1%, unless otherwise stated.
3. Test statistic:
Under the null hypothesis H0, the test statistic (Tcal.) is computed from the sample data.
4. Critical value:
The critical value of the test statistic (Ttab.), at the given level of significance and for the type of alternative hypothesis used, is obtained from the table of the respective distribution.
5. Decision:
- For a two-tailed test: if |Tcal.| > Ttab., then H0 is rejected (H1 is accepted); if |Tcal.| ≤ Ttab., then H0 is accepted.
- For a right-tailed test: if Tcal. > Ttab., then H0 is rejected (H1 is accepted); if Tcal. ≤ Ttab., then H0 is accepted.
- For a left-tailed test: if Tcal. < −Ttab., then H0 is rejected (H1 is accepted); if Tcal. ≥ −Ttab., then H0 is accepted.
Overall, if the magnitude of the calculated value in the tail being tested is less than or equal to the tabulated value, the null hypothesis is accepted; if it is greater than the tabulated value, the null hypothesis is rejected.
6. Conclusion:
- If H0 is rejected, the statement of H1 is accepted and written as the conclusion (the test is significant).
- If H0 is accepted, the statement of H0 is written as the conclusion (the test is not significant).
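The steps above can be sketched in code for one specific case. The example assumes a one-sample Z test for a population mean with known standard deviation, so the test statistic is taken to be Z = (sample mean − μ0) / (σ / √n); this choice of statistic, the function name `z_test` and the sample figures are assumptions made for illustration, not part of the general procedure.

```python
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """One-sample Z test (sigma assumed known); returns (Tcal, Ttab, decision)."""
    std_normal = NormalDist()
    # Test statistic: computed under H0: mu = mu0.
    t_cal = (sample_mean - mu0) / (sigma / n ** 0.5)
    # Critical value and decision, by type of alternative hypothesis.
    if tail == "two":
        t_tab = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(t_cal) > t_tab
    elif tail == "right":
        t_tab = std_normal.inv_cdf(1 - alpha)
        reject = t_cal > t_tab
    elif tail == "left":
        t_tab = std_normal.inv_cdf(1 - alpha)
        reject = t_cal < -t_tab
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return t_cal, t_tab, ("reject H0" if reject else "accept H0")

# Hypothetical data: H0: mean marriage age = 20 years, H1: mean != 20 years.
# Assumed sample: n = 100 women, sample mean 20.5 years, known sigma = 2 years.
for alpha in (0.05, 0.01):
    t_cal, t_tab, decision = z_test(20.5, 20, 2, 100, alpha=alpha, tail="two")
    print(f"alpha = {alpha}: Tcal = {t_cal:.3f}, Ttab = {t_tab:.3f}, {decision}")
```

With these assumed figures the same sample leads to rejection of H0 at the 5% level but not at the 1% level, which shows why the level of significance has to be fixed before the test is carried out.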