Multiple regression analysis

From CEOpedia | Management online

Multiple regression analysis is a method of estimating the relationship between one dependent variable and two or more independent variables[1].

The most common model used in multiple regression analysis is the linear regression model. In mathematical terms, regression analysis of this model involves finding the "coefficient of multiple correlation (R) that defines the amount of linear correlation in between the dependent variable y and the independent variables"[2].

Linear multiple regression model

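The relation between the dependent variable $y$ and $n$ independent variables $x_1, x_2, \dots, x_n$ can be expressed as a linear regression model:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \varepsilon$$

where $\varepsilon$ is the residual factor and $\beta_0, \beta_1, \dots, \beta_n$ are the regression factors (coefficients). $\beta_0$ is a constant called the regression intercept, while $\beta_1, \dots, \beta_n$ are the regression slope parameters[3].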

The goal of multiple regression analysis is to estimate all the coefficients in the above equation.
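To make the roles of the intercept, the slopes and the residual concrete, the following minimal sketch evaluates a linear model for one observation. The coefficient values, the observation and the function name `predict` are hypothetical and chosen only for illustration.

```python
import numpy as np

def predict(intercept, slopes, x):
    """Value of the linear model: intercept + slope_1*x_1 + ... + slope_n*x_n."""
    return intercept + float(np.dot(slopes, x))

# Hypothetical coefficients for a model with two independent variables.
intercept = 1.0
slopes = np.array([2.0, -0.5])

# One hypothetical observation: values of x_1, x_2 and the observed y.
x_obs = np.array([3.0, 4.0])
y_obs = 5.2

y_hat = predict(intercept, slopes, x_obs)  # 1.0 + 2.0*3.0 - 0.5*4.0 = 5.0
residual = y_obs - y_hat                   # part of y the model does not explain
```

In practice the coefficients are not known in advance; they are estimated from many observations so that the residuals are, in some chosen sense, as small as possible.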

Calculating regression coefficients

In linear models, calculating the regression coefficients means finding the linear function that best fits the given data set (with a single independent variable, this function is represented by a straight line). Depending on how "best fit" is defined (in statistics this problem is called goodness of fit), different methods can be used to perform the regression.

The most popular method of finding the regression coefficients of linear models is the ordinary least squares method. It minimizes the sum of squared residuals, that is, the squared differences between the observed values of the dependent variable and the values given by the regression surface. Its popularity derives from its effectiveness and ease of calculation[4].
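As an illustration of ordinary least squares, the sketch below fits a model with two independent variables using NumPy's `np.linalg.lstsq`. The data set, the generating coefficients and the small fixed error values are invented for the example; it also computes the coefficient of multiple correlation R as the correlation between observed and fitted values, which holds for a model that includes an intercept.

```python
import numpy as np

# Hypothetical data: six observations of two independent variables.
X = np.array([
    [1.0, 3.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 1.0],
    [5.0, 5.0],
    [6.0, 2.0],
])

# Dependent variable built from assumed coefficients plus small fixed errors.
noise = np.array([0.1, -0.2, 0.05, 0.1, -0.1, 0.05])
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + noise

# Design matrix: a column of ones (for the intercept) followed by the variables.
A = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: minimise the sum of squared residuals ||y - A b||^2.
coefficients, *_ = np.linalg.lstsq(A, y, rcond=None)

fitted = A @ coefficients
residuals = y - fitted
sum_squared_residuals = float(residuals @ residuals)

# Coefficient of multiple correlation: correlation between y and fitted values.
R = float(np.corrcoef(y, fitted)[0, 1])

print("intercept and slopes:", coefficients)
print("sum of squared residuals:", sum_squared_residuals)
print("R:", R)
```

At the least squares solution the residuals are orthogonal to every column of the design matrix, which is a convenient check that the fit is correct.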

It must be noted that the independent variables of the model may be correlated with one another (and such correlations are sometimes hard to determine before the analysis). To detect them, a correlation analysis of the independent variables must be performed. If the correlation between any two independent variables is high, one of them must be eliminated from the model[5].
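One simple way to screen for such correlations is to compute the pairwise correlation matrix of the independent variables and list the pairs that exceed a cut-off. The sketch below does this with `np.corrcoef`; the data, the variable names and the threshold of 0.8 are assumptions made for the example, not values taken from the cited source.

```python
import numpy as np

def highly_correlated_pairs(X, names, threshold=0.8):
    """Return (name_i, name_j, r) for variable pairs with |r| above threshold."""
    corr = np.corrcoef(X, rowvar=False)  # columns of X are the variables
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Hypothetical data: x2 is roughly twice x1, x3 is only weakly related to both.
X = np.array([
    [1.0, 2.1, 5.0],
    [2.0, 3.9, 3.0],
    [3.0, 6.2, 6.0],
    [4.0, 8.1, 2.0],
    [5.0, 9.8, 4.0],
])
pairs = highly_correlated_pairs(X, ["x1", "x2", "x3"])

for first, second, r in pairs:
    print(f"{first} and {second} are highly correlated (r = {r:.3f})")
```

Here only the pair x1 and x2 is reported, so one of these two variables would be a candidate for removal from the model.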

Nonlinear regression

When the model is nonlinear in its coefficients, regression is generally performed by an iterative procedure. Nonlinear regression analysis aims to find the nonlinear function that best fits the given data set. With a single independent variable, this function is a curve. To find the coefficients of such a model, numerical optimization algorithms are usually used. When the dependent variable has constant variance, the ordinary least squares method may be used to minimize the sum of squared residuals. Otherwise, the weighted least squares method is used, which minimizes the sum of weighted squared residuals.
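The iterative character of nonlinear regression can be sketched with a damped Gauss-Newton procedure, one of several possible numerical optimization algorithms. The example assumes an exponential model y = a * exp(b * x), error-free synthetic data and the starting values a = 1, b = 1; the function name `fit_exponential` and the optional `weights` argument (for the weighted case) are hypothetical.

```python
import numpy as np

def fit_exponential(x, y, a0=1.0, b0=1.0, weights=None, max_iter=100, tol=1e-12):
    """Fit y ~ a * exp(b * x) by (weighted) least squares, damped Gauss-Newton."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Scaling by sqrt(w_i) turns weighted least squares into ordinary least squares.
    sqrt_w = np.ones_like(x) if weights is None else np.sqrt(weights)
    params = np.array([a0, b0], dtype=float)

    def residuals(p):
        return sqrt_w * (y - p[0] * np.exp(p[1] * x))

    ssr = float(np.sum(residuals(params) ** 2))
    for _ in range(max_iter):
        a, b = params
        r = residuals(params)
        # Jacobian of the (scaled) model with respect to a and b.
        J = sqrt_w[:, None] * np.column_stack([np.exp(b * x), a * x * np.exp(b * x)])
        # Gauss-Newton step: linear least squares on the linearised model.
        step, *_ = np.linalg.lstsq(J, r, rcond=None)
        # Halve the step until the sum of squared residuals decreases.
        t = 1.0
        while t > 1e-10:
            trial = params + t * step
            trial_ssr = float(np.sum(residuals(trial) ** 2))
            if trial_ssr < ssr:
                break
            t *= 0.5
        else:
            break  # no improving step found, stop iterating
        params, ssr = trial, trial_ssr
        if np.linalg.norm(t * step) < tol:
            break
    return params

# Synthetic data generated from assumed values a = 2, b = 1.5.
x = np.linspace(0.0, 2.0, 20)
y = 2.0 * np.exp(1.5 * x)
a_hat, b_hat = fit_exponential(x, y)
print("a:", a_hat, "b:", b_hat)
```

Each iteration solves a linear least squares problem, so the nonlinear fit is in effect a sequence of linear regressions. The result can depend on the starting values, which is one reason nonlinear regression needs more care than the linear case.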

Sometimes a nonlinear model is transformed to a linear domain, which makes the analysis linear and thus much easier to perform (as it does not require iterative optimization). This transformation changes the influence of individual data values and the distribution of errors in the model, so it must be used with caution and preceded by careful data examination[6].
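A common instance of such a transformation is taking logarithms of an exponential model: y = a * exp(b * x) becomes ln y = ln a + b * x, which is linear in ln a and b. The sketch below applies it with `np.polyfit`, assuming error-free synthetic data and strictly positive y values; the model and the numbers are illustrative only.

```python
import numpy as np

# Synthetic data generated from assumed values a = 2, b = 1.5 (y must be > 0).
x = np.linspace(0.0, 2.0, 10)
y = 2.0 * np.exp(1.5 * x)

# Straight-line fit in the transformed domain: ln y = ln a + b * x.
slope, intercept = np.polyfit(x, np.log(y), 1)

# Back-transform the intercept to recover a; the slope is b directly.
a_hat = float(np.exp(intercept))
b_hat = float(slope)
print("a:", a_hat, "b:", b_hat)
```

With noisy data the two approaches generally give different estimates, because least squares on ln y minimizes relative rather than absolute errors in y. This is the change in the influence of data values and in the error distribution that calls for caution.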


References

Footnotes

  1. Lefter C. 2004, p. 364
  2. Shyti B., Isa I., Paralloi S. 2017, p. 301
  3. Anghelache C., et al. 2013, p. 134
  4. Alma Ö.G. 2011, pp. 409-411
  5. Kulcsár E. 2009, p. 63
  6. Oosterbaan R.J. 2002, p. 33

Author: Karolina Próchniak