Heteroskedasticity

From CEOpedia | Management online

Revision as of 13:12, 26 October 2022

Heteroskedasticity (also spelled heteroscedasticity) is the case in which homoscedasticity, one of the most important assumptions of ordinary least squares (OLS) regression, is not fulfilled.

One of the assumptions of the OLS regression is that the errors are normally and independently distributed, with a variance that stays constant across observations. When in doubt, you should assume that your regression suffers from heteroscedasticity and then test whether this is actually the case.


Definition:

Heteroskedasticity means that the residuals of the model do not all have the same variance: the spread of the true errors differs across observations (or periods). The variance of the errors then depends on the independent variables, which introduces an error in the variance of the OLS estimators and therefore in their standard errors:

Var(ε_i) = σ_i² ≠ σ²
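The definition can be made concrete with a small simulation (hypothetical data, not from the article): errors are drawn with a standard deviation proportional to x, so Var(ε_i) = x_i²σ² instead of a constant σ².

```python
# Sketch (simulated data): errors whose variance grows with x.
import random
import statistics

random.seed(0)

n = 1000
x = [random.uniform(1.0, 10.0) for _ in range(n)]
# Heteroscedastic errors: standard deviation proportional to x,
# so Var(e_i) = x_i^2 * sigma^2 rather than a constant sigma^2.
e = [xi * random.gauss(0.0, 1.0) for xi in x]

# Compare the error spread for small x against the spread for large x.
pairs = sorted(zip(x, e))
low = [ei for _, ei in pairs[: n // 2]]    # errors at small x
high = [ei for _, ei in pairs[n // 2:]]    # errors at large x
print(statistics.stdev(low), statistics.stdev(high))
```

The spread in the upper half of x is clearly larger than in the lower half, which is exactly what Var(ε_i) ≠ σ² means.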


Consequences:

If you run your regression while heteroscedasticity is present, you still get unbiased values for your beta coefficients, because there is still no correlation between the explanatory variables and the residual. So consistency and unbiasedness hold even if only the homoscedasticity assumption is violated, and there is no impact on the model fit.

But you get an impact on other parts:

  1. The estimates of your coefficients are no longer efficient
  2. The standard errors are biased, and so are the test statistics

Due to wrong standard errors, the t-statistics are wrong, and we cannot make any valid statement about significance. For example, if the standard errors are too small, the t-statistics are inflated and the null hypothesis is rejected too often. Thus inference as well as efficiency are affected: the estimates are no longer efficient because they no longer have the minimum variance. It is therefore important to correct for heteroskedasticity to obtain a useful interpretation of your model and correct statistical test decisions.
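A Monte Carlo sketch (simulated data, my own illustration) makes the standard-error problem visible: the textbook OLS standard error assumes constant variance, so under heteroscedasticity it can differ systematically from the true sampling variability of the slope.

```python
# Sketch (simulated data): compare the textbook OLS slope standard error
# with the actual sampling variability of the slope under heteroscedasticity.
import math
import random
import statistics

random.seed(1)

def ols_slope_and_se(x, y):
    """Simple-regression slope and its textbook (constant-variance) SE."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in resid) / (n - 2)  # assumes constant variance
    return b1, math.sqrt(s2 / sxx)

slopes, ses = [], []
for _ in range(500):
    x = [random.uniform(1.0, 10.0) for _ in range(100)]
    # Error sd equals x^2, strongly violating the constant-variance assumption.
    y = [2.0 + 0.5 * xi + (xi ** 2) * random.gauss(0.0, 1.0) for xi in x]
    b1, se = ols_slope_and_se(x, y)
    slopes.append(b1)
    ses.append(se)

emp_sd = statistics.stdev(slopes)  # true sampling variability of the slope
avg_se = statistics.mean(ses)      # what the textbook formula reports
print(emp_sd, avg_se)
```

In this setup, where the error variance rises with x, the textbook formula understates the true variability, so t-statistics built from it are too large.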

Reasons for Heteroscedasticity:

Heteroscedasticity is often found in time series or cross-sectional data. Possible reasons are omitted variables, outliers in the data, or an incorrectly specified model equation.


How to find out if there is heteroskedasticity?

  1. Plot in R

To find out whether there is heteroskedasticity, several approaches exist. The fastest is to use a statistical program such as RStudio and plot the residuals against the fitted values.

If there is any kind of trend or pattern in this plot, it is very likely that the OLS assumption is violated and heteroscedasticity is present. If the values are randomly scattered, the assumption is not violated.

[Two graphics missing: residual plots illustrating homoscedasticity and heteroscedasticity]

Those two graphics illustrate the difference between homoscedasticity and heteroscedasticity. In the first illustration the residuals are randomly scattered, show no trend, and the errors are independent; that is the case of homoscedasticity. The second illustration shows a trend, which indicates heteroscedasticity.
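The visual check above can be approximated numerically (a sketch on simulated data, my own illustration): fit the regression, sort the residuals by fitted value, and compare the spread in the lower and upper halves. A clear difference mimics the "fanning out" pattern seen in the plot.

```python
# Sketch (simulated data): a numeric stand-in for the residuals-vs-fitted plot.
import random
import statistics

random.seed(2)

n = 400
x = [random.uniform(1.0, 10.0) for _ in range(n)]
y = [1.0 + 2.0 * xi + xi * random.gauss(0.0, 1.0) for xi in x]  # error sd grows with x

# Fit the simple OLS regression of y on x.
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
b0 = my - b1 * mx

fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Sort residuals by fitted value and compare the spread in each half.
pairs = sorted(zip(fitted, resid))
lower = [r for _, r in pairs[: n // 2]]
upper = [r for _, r in pairs[n // 2:]]
print(statistics.stdev(lower), statistics.stdev(upper))
```

A markedly larger spread among the residuals with large fitted values corresponds to the trend a plot would show.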


  2. Breusch-Pagan Test

Another possibility is to run the Breusch-Pagan test (in R, the lmtest package) to identify heteroscedasticity. This test checks whether the independent variables affect the error terms by regressing the squared residuals (a simple approximation of the variance of u) on the regressors and checking their significance.
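For the single-regressor case, the test can be written out by hand (a sketch on simulated data; in practice you would use the packaged version): regress the squared residuals on x and form the LM statistic n·R², which is asymptotically chi-square with 1 degree of freedom here.

```python
# Sketch (simulated data): a hand-rolled Breusch-Pagan test, one regressor.
import random

random.seed(3)

def simple_ols(x, y):
    """Intercept and slope of a simple OLS regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b1 * mx, b1

n = 500
x = [random.uniform(1.0, 10.0) for _ in range(n)]
y = [2.0 + 0.5 * xi + xi * random.gauss(0.0, 1.0) for xi in x]

# Main regression, then squared residuals.
b0, b1 = simple_ols(x, y)
u2 = [(yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)]

# Auxiliary regression of u^2 on x, and its R^2.
g0, g1 = simple_ols(x, u2)
mu2 = sum(u2) / n
ss_tot = sum((v - mu2) ** 2 for v in u2)
ss_res = sum((v - (g0 + g1 * xi)) ** 2 for xi, v in zip(x, u2))
r2 = 1.0 - ss_res / ss_tot

lm = n * r2  # Breusch-Pagan LM statistic
print(lm)
# 3.841 is the 5% critical value of chi-square with 1 df,
# so lm > 3.841 means rejecting H0 (homoscedasticity) at the 5% level.
```

Since the simulated errors have variance growing with x, the statistic comes out far above the critical value and H0 is rejected.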

  3. White Test

This test is more general and does not restrict the form of the heteroscedasticity. It adds squares and interaction terms of the regressors to capture any dependence between the variance of the residuals and the independent variables. A simpler variant with fewer degrees of freedom uses the fitted values and their squared form (het.test in R).
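The simplified variant can also be sketched by hand (simulated data, my own illustration): regress the squared residuals on the fitted values and their squares; the LM statistic n·R² is then asymptotically chi-square with 2 degrees of freedom.

```python
# Sketch (simulated data): simplified White test using fitted values.
import random

random.seed(4)

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    out = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        out[r] = (m[r][3] - sum(m[r][c] * out[c] for c in range(r + 1, 3))) / m[r][r]
    return out

n = 500
x = [random.uniform(1.0, 10.0) for _ in range(n)]
y = [2.0 + 0.5 * xi + xi * random.gauss(0.0, 1.0) for xi in x]

# First-stage OLS fit (simple regression), then squared residuals.
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
b0 = my - b1 * mx
fit = [b0 + b1 * xi for xi in x]
u2 = [(yi - fi) ** 2 for yi, fi in zip(y, fit)]

# Auxiliary regression: u^2 on [1, yhat, yhat^2] via normal equations.
cols = [[1.0] * n, fit, [fi * fi for fi in fit]]
xtx = [[sum(ci * cj for ci, cj in zip(ca, cb)) for cb in cols] for ca in cols]
xty = [sum(ci * v for ci, v in zip(ca, u2)) for ca in cols]
g = solve3(xtx, xty)

pred = [g[0] + g[1] * fi + g[2] * fi * fi for fi in fit]
mu2 = sum(u2) / n
r2 = 1.0 - sum((v - p) ** 2 for v, p in zip(u2, pred)) / sum((v - mu2) ** 2 for v in u2)

lm = n * r2  # simplified White LM statistic
print(lm)
# 5.991 is the 5% critical value of chi-square with 2 df.
```

With the error variance growing in x, the statistic again lands far above the 5% critical value, so homoscedasticity is rejected.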



In both cases, the test's null hypothesis is that your residuals are homoscedastic, and the alternative hypothesis states the opposite. If your p-value is lower than your significance level, then there is heteroskedasticity.

H0: Residuals are homoscedastic
Ha: Residuals are not homoscedastic (heteroscedasticity is present)

If the null hypothesis is rejected, then your residuals are heteroscedastic. If you fail to reject the null hypothesis the residuals are homoscedastic.