Multiple regression analysis
Revision as of 22:58, 17 November 2023
Multiple regression analysis is a method of assessing the relationship between two or more independent variables and one dependent variable[1].
The most common model used in multiple regression analysis is the linear regression model. In mathematical terms, performing regression analysis with this model means finding the "coefficient of multiple correlation (R) that defines the amount of linear correlation in between the dependent variable y and the independent variables $x_1, x_2, \ldots, x_n$"[2].
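As a rough illustration of the coefficient R mentioned above, it can be computed as the Pearson correlation between the observed values of y and the values predicted by a fitted model. The sketch below assumes the fitted values are already available; the function names and the sample numbers are made up for illustration and do not come from the cited source.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def multiple_correlation(y_observed, y_fitted):
    """Coefficient of multiple correlation R: correlation of y with its fit."""
    return pearson(y_observed, y_fitted)


# Hypothetical observed values and fitted values from some linear model.
y_observed = [3.1, 4.9, 7.2, 8.8, 11.1]
y_fitted = [3.0, 5.0, 7.0, 9.0, 11.0]
R = multiple_correlation(y_observed, y_fitted)
```

A value of R close to 1 indicates that the fitted values track the observed values closely; for a least squares fit with an intercept, the square of R equals the coefficient of determination.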
Linear multiple regression model
The relation between the dependent variable y and n independent variables can be expressed as a linear regression model:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_n x_n + \varepsilon$$
where $\varepsilon$ is the residual (error) term and $\beta_k$ ($k = 0, 1, 2, \ldots, n$) are the regression coefficients. $\beta_0$ is a constant called the regression intercept, while $\beta_1, \beta_2, \ldots, \beta_n$ are the regression slope parameters[3].
The goal of multiple regression analysis is to estimate all coefficients $\beta_k$ in the above equation.
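When the model is written out for several observations at once, it is often convenient to stack them in matrix form. The notation below is an assumption introduced here for illustration: m denotes the number of observations and $x_{ij}$ the value of the j-th independent variable in the i-th observation.

```latex
\mathbf{y} = X \boldsymbol{\beta} + \boldsymbol{\varepsilon},
\qquad
X =
\begin{pmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1n} \\
1 & x_{21} & x_{22} & \cdots & x_{2n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_{m1} & x_{m2} & \cdots & x_{mn}
\end{pmatrix},
\qquad
\boldsymbol{\beta} =
\begin{pmatrix}
\beta_0 \\ \beta_1 \\ \vdots \\ \beta_n
\end{pmatrix}
```

The leading column of ones in X carries the intercept $\beta_0$, so each row of the matrix equation reproduces the linear model for one observation.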
Calculating regression coefficients
In linear models, calculating the regression coefficients means finding the linear function (with one independent and one dependent variable, a straight line) that best fits the given data set. Depending on how "best fit" is defined (in statistics, the quality of a fit is called goodness of fit), different methods can be used to perform the regression.
The most popular method of finding the regression coefficients of linear models is the ordinary least squares method. It minimizes the sum of squared residuals, that is, the squared vertical distances from each point of the data set to the regression surface. Its popularity derives from its effectiveness and ease of calculation[4].
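A minimal sketch of ordinary least squares follows, assuming the coefficients are obtained by solving the normal equations with plain Gaussian elimination. The helper names and the noise-free sample data are hypothetical; in practice a numerical library would usually be used instead of hand-written elimination.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so row operations act on both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols_fit(rows, y):
    """Return [b0, b1, ..., bn] minimizing the sum of squared residuals."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the intercept
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up data generated from y = 1 + 2*x1 + 3*x2 without noise.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coefficients = ols_fit(rows, y)  # approximately [1.0, 2.0, 3.0]
```

The sketch fails with a division by zero when the independent variables are exactly linearly dependent, because the normal equations then have no unique solution.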
Note that the independent variables of the model may be correlated with each other (and such correlations are sometimes hard to detect before the analysis). To identify them, the pairwise correlations between the independent variables must be analyzed. If the correlation between any two variables is high, one of them must be eliminated from the model[5].
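The pairwise check described above might be sketched as follows. The threshold of 0.8 for a "high" correlation is an assumption chosen for illustration (the source does not give a number), and the variable names and values are made up.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def highly_correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    `columns` maps a variable name to its list of observed values.
    The default threshold is an illustrative assumption.
    """
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical independent variables; x2 is nearly twice x1.
columns = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [2.1, 3.9, 6.2, 7.8, 10.1],
    "x3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
flagged = highly_correlated_pairs(columns)
```

With this sample data only the pair x1 and x2 is flagged, so one of those two variables would be a candidate for removal from the model.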
Nonlinear regression
When the model is nonlinear, regression must be performed by an iterative procedure. Nonlinear regression analysis aims to find the nonlinear function that best fits the given data set. With one independent and one dependent variable, this function is a curve. The coefficients of such a model are usually found with numerical optimization algorithms. When the dependent variable has constant variance, the ordinary least squares method may be used to minimize the sum of squared residuals. Otherwise, the weighted least squares method is used, which minimizes the sum of weighted squared residuals.
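One common iterative procedure of this kind is the Gauss-Newton method. The sketch below applies it to an exponential model y = a * exp(b * x); the choice of model, the starting guess, the fixed iteration count, and the noise-free sample data are all assumptions made for illustration, not something prescribed by the source.

```python
import math


def gauss_newton_exponential(xs, ys, a, b, iterations=20):
    """Fit y = a * exp(b * x) by Gauss-Newton, starting from the guess (a, b)."""
    for _ in range(iterations):
        # Accumulate J^T J and J^T r for the current parameter guess.
        s_aa = s_ab = s_bb = g_a = g_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            residual = y - a * e
            d_a = e          # partial derivative of the model w.r.t. a
            d_b = a * x * e  # partial derivative of the model w.r.t. b
            s_aa += d_a * d_a
            s_ab += d_a * d_b
            s_bb += d_b * d_b
            g_a += d_a * residual
            g_b += d_b * residual
        # Solve the 2x2 system (J^T J) step = J^T r with Cramer's rule.
        det = s_aa * s_bb - s_ab * s_ab
        step_a = (g_a * s_bb - g_b * s_ab) / det
        step_b = (s_aa * g_b - s_ab * g_a) / det
        a += step_a
        b += step_b
    return a, b


# Made-up, noise-free data generated from y = 2 * exp(0.5 * x).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a_hat, b_hat = gauss_newton_exponential(xs, ys, a=1.8, b=0.45)
```

Each iteration solves a small linear least squares problem built from the model's derivatives, which is why a reasonable starting guess matters: from a poor start the method may diverge, and practical implementations usually add step control and a convergence test.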
Sometimes nonlinear models are transformed to a linear domain, which makes the analysis linear and thus much easier to perform (as it does not require iterative optimization). This transformation changes the influence of individual data values and the distribution of errors in the model, so it must be used with caution and preceded by careful examination of the data[6].
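As an example of such a transformation, an exponential model y = a * exp(b * x) becomes the straight line ln(y) = ln(a) + b * x after taking logarithms. The sketch below fits that line with simple least squares; the model, function name, and sample data are assumptions chosen for illustration.

```python
import math


def fit_exponential_by_log_transform(xs, ys):
    """Fit y = a * exp(b * x) via a straight-line fit of ln(y) on x."""
    log_ys = [math.log(y) for y in ys]  # requires every y to be positive
    n = len(xs)
    mean_x = sum(xs) / n
    mean_ly = sum(log_ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(xs, log_ys))
    b = sxy / sxx                        # slope of the line in log space
    a = math.exp(mean_ly - b * mean_x)   # intercept mapped back from ln(a)
    return a, b


# Made-up, noise-free data generated from y = 2 * exp(0.5 * x).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a_hat, b_hat = fit_exponential_by_log_transform(xs, ys)
```

Note that this fit minimizes squared errors in ln(y) rather than in y, so with noisy data it weights small values of y more heavily than a direct nonlinear fit would. This is one concrete form of the change in error distribution that the caution above refers to.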
References
- Alma Ö.G. (2011), Comparison of Robust Regression Methods in Linear Regression, "International Journal of Contemporary Mathematical Sciences", vol. 6
- Anghelache C., et al. (2013), Multiple Regression Used in Macro-economic Analysis, "Revista Română de Statistică", Supplement Trim II/2013
- Kulcsár E. (2009), Multiple Regression Analysis of Main Economic Indicators in Tourism, "Revista de turism-studii si cercetari in turism"
- Lefter C. (2004), Marketing Researches, Infomarket, Brasov
- Oosterbaan R.J. (2002), Drainage research in farmers' fields: analysis of data. Part of project “Liquid Gold” of the International Institute for Land Reclamation and Improvement (ILRI), International Institute for Land Reclamation and Improvement, Wageningen
- Shyti B., Isa I., Paralloi S. (2017), Multiple Regressions for the Financial Analysis of Alabanian Economy, "Academic Journal of Interdisciplinary Studies", vol. 5
Footnotes
Author: Karolina Próchniak