Correlation-regression analysis comprises a series of operations: determining the closeness of a relationship and its direction, and establishing an equation that describes the form of the relationship. This type of analysis has two separate components: correlation analysis and regression analysis.
The meaning and major steps in the process of correlation and regression analysis of economic phenomena
Correlation and regression analysis is one of the ways to solve problems and obtain information. It makes it possible to determine the joint effect of many interrelated and simultaneously operating characteristics, and to separate the influence of each characteristic on the economic phenomenon (process). With this type of analysis you can assess the degree of interaction between several characteristics and between the characteristics and the results obtained, and build a regression equation describing the form of the relationship.
Stages of analysis
Correlation and regression analysis of economic processes is divided into several stages:
- Selection of the arguments (factors) and preliminary processing of the initial information.
- Determination of the closeness and form of the relationship between the variables.
- Modeling of the economic process under study and analysis of the resulting model.
- Application of the final results to improve planning and management of the modeled process.
The statistical homogeneity of the information is checked in two steps. First, identify and discard the factor values that differ sharply from all the others (outliers). Then test homogeneity statistically, checking that the observations are independent and that they belong to a single population with a normal distribution.
The regression model is fitted by the least squares method, which makes the values of the result predicted by the regression equation approximate the observed values as closely as possible.
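As an illustration of the least squares method, the following minimal sketch fits a straight line y = a + b*x to one factor and one result. The data and the function name `fit_line` are hypothetical and chosen only for the example; the formulas are the standard closed-form solution for a single factor.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares of deviations from the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residual_sum_of_squares(x, y, intercept, slope):
    """Sum of squared differences between observed and predicted results."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical observations: x is the factor, y is the result
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(x, y)
rss = residual_sum_of_squares(x, y, a, b)
print(f"y = {a:.2f} + {b:.2f} * x, residual sum of squares = {rss:.3f}")
```

Any other line through the same data gives a larger residual sum of squares, which is what "best approximation" means here.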
Correlation and regression analysis: the parameters of the created model
The most important characteristics of the model are:
- Pair correlation coefficients (show the strength of the relationship between two variables).
- The multiple correlation coefficient (measures the closeness of the relationship between the result and all the factors taken together).
- Partial coefficients of determination (show how much the variation of an individual argument contributes to the variation of the result).
- The coefficient of multiple determination (shows the share of the variation of the result that is explained by all the arguments together).
- Partial elasticity coefficients (characterize the influence of each factor on the result on a common scale, in percent).
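The characteristics listed above can be sketched for a model with two factors. The data below are hypothetical, and two definitions are assumptions not stated in the text: the partial coefficient of determination is taken as the product of the pair correlation coefficient and the standardized regression coefficient, and the partial elasticity is evaluated at the mean values.

```python
import math


def mean(v):
    return sum(v) / len(v)


def std(v):
    """Population standard deviation."""
    m = mean(v)
    return math.sqrt(sum((a - m) ** 2 for a in v) / len(v))


def pearson(u, v):
    """Pair (linear) correlation coefficient of two variables."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


# Hypothetical data: two factors x1, x2 and the result y
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]

# Pair correlation coefficients
r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)

# Standardized regression coefficients of the two-factor model
beta1 = (r_y1 - r_y2 * r_12) / (1 - r_12 ** 2)
beta2 = (r_y2 - r_y1 * r_12) / (1 - r_12 ** 2)

# Partial coefficients of determination (assumed definition: r * beta)
d1, d2 = r_y1 * beta1, r_y2 * beta2

# Coefficient of multiple determination and multiple correlation
r_squared = d1 + d2
multiple_r = math.sqrt(r_squared)

# Regression coefficients in natural units and elasticities at the means
b1 = beta1 * std(y) / std(x1)
b2 = beta2 * std(y) / std(x2)
e1 = b1 * mean(x1) / mean(y)
e2 = b2 * mean(x2) / mean(y)

print(f"R = {multiple_r:.3f}, R^2 = {r_squared:.3f}")
print(f"d1 = {d1:.3f}, d2 = {d2:.3f}, E1 = {e1:.3f}, E2 = {e2:.3f}")
```

Under these definitions the partial coefficients of determination add up to the coefficient of multiple determination, so they show how the explained variation is split between the factors.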
Purpose of the analysis
The main objectives of correlation-regression analysis are to identify the factors that significantly affect the economic outcome of a phenomenon or process, and to use the information obtained to improve the planning of that process or phenomenon.
Parametric analysis methods
All production processes are closely interrelated. The relationship between them may be functional (each value of the factor corresponds to exactly one value of the result) or stochastic (the result depends on many factors, some of them random). Stochastic dependence is often correlational in nature, that is, a single value of the factor corresponds to several values of the result, which may deviate in different directions.
The correlation relationship can involve one or several factor variables, be positive or negative in direction, and be linear or curvilinear (depending on the form of the equation). The type of relationship can be identified with a correlation table (correlation grid), which is built on rectangular coordinate axes.
Frequencies concentrated near a diagonal indicate a strong correlation between the variables. Frequencies lying along the diagonal from the lower left to the upper right corner indicate a positive (direct) relationship, while those along the diagonal from the upper left to the lower right corner indicate a negative (inverse) one. Frequencies arranged in an arc indicate a curvilinear relationship, and randomly scattered frequencies indicate that there is no relationship at all.
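A minimal sketch of how such a frequency table could be built is shown below. The function name `correlation_table`, the number of intervals, and the data are assumptions made for illustration. Rows are ordered so that the largest values of the result are at the top, as on a chart.

```python
def correlation_table(x, y, bins=3):
    """Count observations in each cell of a bins-by-bins grid.

    Columns are intervals of the factor x (left to right, ascending);
    rows are intervals of the result y (top row = largest values).
    """
    def bin_index(value, low, high):
        if high == low:
            return 0
        index = int((value - low) / (high - low) * bins)
        return min(index, bins - 1)  # the maximum goes into the last interval

    x_low, x_high = min(x), max(x)
    y_low, y_high = min(y), max(y)
    table = [[0] * bins for _ in range(bins)]
    for xi, yi in zip(x, y):
        row = bin_index(yi, y_low, y_high)
        col = bin_index(xi, x_low, x_high)
        table[row][col] += 1
    return table[::-1]  # flip so that y increases from bottom to top


# Hypothetical data with a direct and an inverse relationship
x = list(range(1, 10))
direct = correlation_table(x, [v for v in x])
inverse = correlation_table(x, [10 - v for v in x])

for row in direct:
    print(row)
```

For the direct relationship the counts fall on the diagonal from the lower left to the upper right corner; for the inverse one they fall on the other diagonal.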
The main measure used in correlation analysis is the linear correlation coefficient. It takes values from -1 to 1. The closer its absolute value is to 1, the stronger the association between the factor and the result. Positive values indicate a direct relationship and negative values an inverse one. The coefficient equals zero when there is no linear relationship between the variables.
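The behavior of the linear correlation coefficient can be checked with a short sketch. The function name `linear_correlation` and the data are hypothetical; the formula is the usual ratio of the sum of cross-products of deviations to the product of the root sums of squares.

```python
import math


def linear_correlation(x, y):
    """Linear (Pearson) correlation coefficient of two variables."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [-2, -1, 0, 1, 2]

r_direct = linear_correlation(x, [2 * v + 1 for v in x])    # y rises with x
r_inverse = linear_correlation(x, [5 - 3 * v for v in x])   # y falls as x rises
r_parabola = linear_correlation(x, [v ** 2 for v in x])     # curvilinear link

print(r_direct, r_inverse, r_parabola)
```

The last case is worth noting: y is fully determined by x, yet the coefficient is zero, because the coefficient measures only the linear part of a relationship.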
Non-parametric methods of analysis
A number of methods make it possible to evaluate the relationship between phenomena without a quantitative expression of the characteristic and, accordingly, without distribution parameters. They are called non-parametric. Among them are:
- The Kendall rank correlation coefficient (measures the relationship between quantitative or qualitative indicators, provided that they can be ranked).
- The Spearman rank correlation coefficient (assigns ranks to the values of each argument and of the result, and is calculated from the differences between the ranks).
- The Fechner sign correlation coefficient (counts the matches and mismatches between the signs of the deviations of the arguments and of the result from their mean values).
Another important method of correlation-regression analysis is the least squares method, which is used to determine the analytical expression of the relationship between the resultant variable and its factors. It consists of constructing a system of equations and determining the parameters of these equations.
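The three rank and sign coefficients named above can be sketched as follows. The data are hypothetical, and the sketch makes simplifying assumptions: there are no tied values, and observations that coincide exactly with a mean are left out of the Fechner count.

```python
def ranks(values):
    """Ranks starting from 1 for the smallest value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(x, y):
    """Spearman coefficient from the squared rank differences (no ties)."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def kendall(x, y):
    """Kendall coefficient: (concordant - discordant) pairs / all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return 2 * score / (n * (n - 1))


def fechner(x, y):
    """Fechner coefficient: (matches - mismatches) of deviation signs."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    products = [(a - mean_x) * (b - mean_y) for a, b in zip(x, y)]
    matches = sum(1 for p in products if p > 0)
    mismatches = sum(1 for p in products if p < 0)
    return (matches - mismatches) / (matches + mismatches)


# Hypothetical observations of a factor and a result
x = [10, 20, 30, 40, 50]
y = [12, 25, 21, 48, 45]

print(spearman(x, y), kendall(x, y), fechner(x, y))
```

All three coefficients lie between -1 and 1, like the linear correlation coefficient, but they use only ranks or signs and therefore do not depend on the distribution of the data.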
Correlation and regression analysis: an example
In statistics and economics, a variety of types and objects of analysis are used. Statistical methods of analysis are aimed at studying repetitive processes in order to make long-term predictions of the behavior of economic phenomena.
For example, to analyze the socio-economic development of a territory, it is necessary to study indicators of the population's standard of living. Correlation and regression analysis makes it possible to build a regression equation and determine the correlation coefficients that show the relationship between living standards and the development of the territory. The standard of living is determined by income, and the main source of income is wages. In this case, the factor is the level of wages, and the result is the size of the low-income population.
To simplify the calculations, you can carry out correlation analysis in Excel, which has a number of tools for this purpose. Among them is the "Correlation" tool, which builds a matrix of coefficients for the different parameters. The matrix is displayed as a table whose rows and columns correspond to the variables and whose cells contain the correlation coefficients. The correlation analysis is then based on the data in this table. An example of the sequence of steps:
- In the "Tools" menu (the "Data" tab in Excel 2007 and later), select the "Data Analysis" item.
- As the analysis tool, select the "Correlation" item.
- In the window that appears, specify the range of data to be analyzed in the "Input Range" field, choose how the data are grouped (by columns or by rows) under "Grouped By", enter the output range for the results under "Output options", and click OK.
The result is a correlation matrix placed in the output range. It contains the linear correlation coefficients, which measure the closeness and direction of the relationship between the indicators.
Analysis in Excel
In MS Excel, the "Correlation" tool is used for correlation analysis. An example of calculating the coefficients is considered later. The tool builds a matrix of coefficients that measure the closeness of the relationship between the different parameters. The result is a square table with a correlation coefficient at the intersection of each row and column.
The analysis requires the following steps:
- Open the "Tools" menu (the "Data" tab in Excel 2007 and later) and select the "Data Analysis" item.
- In the window that appears, select "Correlation" in the "Analysis Tools" list.
- In the "Correlation" window, specify in "Input Range" the cells containing the data to be analyzed (at least two columns), select the appropriate "Grouped By" option, and under "Output options" choose the top-left cell where the correlation matrix should begin.
- Click the OK button.
As a result of the calculations, a square table with correlation coefficients will appear.
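For readers who want to check the spreadsheet output independently, a square table of the same kind can be computed with a few lines of code. This is only a sketch: the variable names and the data are hypothetical, and the function `correlation_matrix` is not part of Excel or of any library.

```python
import math


def pearson(u, v):
    """Linear correlation coefficient of two equally long columns."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    sv = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (su * sv)


def correlation_matrix(columns):
    """Square table of coefficients for every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical indicators, one column per variable
data = {
    "wage": [20, 22, 25, 27, 30, 34],
    "low_income": [18, 17, 15, 14, 12, 10],
    "unemployment": [6, 7, 5, 6, 4, 5],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name.ljust(14), "  ".join(f"{value:6.3f}" for value in row.values()))
```

The diagonal always contains ones, and the table is symmetric, which is why Excel's tool fills in only the lower triangle.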
Regression analysis in MS Excel
To calculate the linear regression equation describing the relationship between the factors and the result in MS Excel, use the "LINEST" function. To apply it:
- Select an empty area in which the analysis results will be displayed.
- Open the Function Wizard ("Insert Function"), choose the "Statistical" category, find the "LINEST" function in it, and click OK.
- In the "Known_y's" field, enter the range of the analyzed results; in the "Known_x's" field, enter the range of the analyzed factors.
- In the "Const" field, indicate whether the equation has a free term (1 for yes, 0 for no), and in the "Stats" field, whether to display additional information (1 for additional regression statistics, 0 for the parameter estimates only). Both fields can normally be set to 1.
- Click OK.
The first element of the table appears in the top-left cell of the previously selected area. To display all the data, press F2 and then the key combination Ctrl + Shift + Enter.
As a result, the regression information will be displayed as a table of two columns and five rows: