What’s the Meaning of Regression?
Regression is a statistical modeling approach that analysts use to determine relationships between multiple variables. Regression analysis starts with a single variable you're trying to explain (the dependent variable) and one or more independent variables you're testing to see if they affect it. The analysis looks at changes in the independent variables and attempts to correlate those changes with resulting changes in the dependent variable. This may sound like advanced statistics, but Excel makes this complex analysis available to anyone.
Performing Linear Regression in Excel
The simplest form of regression analysis is linear regression. Simple linear regression looks at the relationship between only two variables. For example, consider a spreadsheet containing the number of calories a person ate each day and their weight on that day. Since this spreadsheet contains two columns of data, and one variable could potentially have an effect on the other, you can run a regression analysis on this data using Excel.
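To make the idea concrete outside Excel, here is a minimal Python sketch of the least-squares line that a simple linear regression fits. The calorie and weight numbers are hypothetical values invented for illustration, not the article's spreadsheet, and the function name `fit_line` is an assumption of this sketch.

```python
# Hypothetical data, invented for illustration only.
calories = [1800, 2000, 2200, 2400, 2600]      # independent variable (x)
weight = [150.0, 152.0, 153.0, 155.0, 158.0]   # dependent variable (y)


def fit_line(x, y):
    """Return the least-squares (slope, intercept) for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(calories, weight)
print(f"weight = {intercept:.2f} + {slope:.4f} * calories")
```

The slope estimates how much the dependent variable changes for each one-unit change in the independent variable, which is the same quantity Excel reports as the coefficient for the X variable.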
Enabling Analysis ToolPak Add-On
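Before you can use Excel's regression analysis feature, you need to enable the Analysis ToolPak add-on in the Excel Options screen. Select File > Options > Add-ins, choose Excel Add-ins in the Manage box, and select Go. Then select the Analysis ToolPak check box and select OK. Once Analysis ToolPak is enabled, you're ready to start doing regression analysis in Excel.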
How to Perform Simple Linear Regression in Excel
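Using the weight and calories spreadsheet as an example, you can perform a linear regression analysis in Excel as follows. Select the Data menu, choose Data Analysis, select Regression, and select OK. Set Input Y Range to the weight column (the dependent variable) and Input X Range to the calories column (the independent variable), choose where the output should go, and select OK. Excel then produces a summary output. In this example, calories have a strong correlation to total weight. The values in the Regression Statistics section of the output have the following meanings: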
Multiple R: The correlation coefficient. In Excel's output this value ranges from 0 to 1, where 1 indicates a perfect linear relationship and 0 means there's no linear correlation. Because Multiple R is never negative, check the sign of the independent variable's coefficient to see whether the relationship is positive or negative.
R Square: The coefficient of determination, which shows what proportion of the variation in the dependent variable the regression line explains. Statistically, this is the regression sum of squares divided by the total sum of squares.
Adjusted R Square: R Square adjusted for the number of independent variables you've chosen.
Standard Error: How precise the regression analysis results are. It measures the typical distance between the observed values and the regression line, so a smaller value means the line fits the data more closely.
Observations: The number of observations (rows of data) in your regression model.
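The definitions above can be sketched in a few lines of Python. This is an illustration of the standard formulas for a single independent variable, not Excel's internal implementation. The data is hypothetical, and the function name `regression_statistics` is an assumption of this sketch.

```python
import math

# Hypothetical data, invented for illustration only.
calories = [1800, 2000, 2200, 2400, 2600]
weight = [150.0, 152.0, 153.0, 155.0, 158.0]


def regression_statistics(x, y):
    """Compute summary statistics for a simple (one-variable) linear regression."""
    n = len(x)
    k = 1  # number of independent variables
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Total variation in y, and the part the line fails to explain.
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_residual = sum(
        (yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)
    )
    r_square = 1 - ss_residual / ss_total

    return {
        "Multiple R": math.sqrt(r_square),
        "R Square": r_square,
        "Adjusted R Square": 1 - (1 - r_square) * (n - 1) / (n - k - 1),
        "Standard Error": math.sqrt(ss_residual / (n - k - 1)),
        "Observations": n,
    }


stats = regression_statistics(calories, weight)
for name, value in stats.items():
    print(f"{name}: {value:.4f}" if isinstance(value, float) else f"{name}: {value}")
```

Note that Multiple R is computed as the square root of R Square, which is why it cannot be negative.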
The remaining values in the regression output, listed in the ANOVA section, break down the variation in the data into the part the regression explains and the part it doesn't.
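df: The degrees of freedom associated with each source of variance (regression, residual, and total).
SS: Sum of squares. The ratio of the residual sum of squares to the total SS should be small if most of your data fits the regression line.
MS: Mean square, which is each sum of squares divided by its degrees of freedom.
F: The F statistic (F-test) for the null hypothesis that the model explains none of the variation. It's the regression MS divided by the residual MS, and it provides the significance of the regression model.
Significance F: The P-value of F. A small value (commonly below 0.05) suggests the regression model is statistically significant.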
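As a rough sketch of how these values relate to each other, the following Python code builds the same kind of table for a single independent variable. The data is hypothetical and the function name `anova_table` is an assumption of this sketch. Significance F is left out because it requires the F distribution, which the Python standard library does not provide.

```python
# Hypothetical data, invented for illustration only.
calories = [1800, 2000, 2200, 2400, 2600]
weight = [150.0, 152.0, 153.0, 155.0, 158.0]


def anova_table(x, y):
    """Return df, SS, MS, and F for a simple (one-variable) linear regression."""
    n = len(x)
    k = 1  # number of independent variables
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_residual = sum(
        (yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)
    )
    ss_regression = ss_total - ss_residual

    df_regression = k
    df_residual = n - k - 1
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual

    return {
        "Regression": {
            "df": df_regression,
            "SS": ss_regression,
            "MS": ms_regression,
            "F": ms_regression / ms_residual,
        },
        "Residual": {"df": df_residual, "SS": ss_residual, "MS": ms_residual},
        "Total": {"df": n - 1, "SS": ss_total},
    }


table = anova_table(calories, weight)
for source, values in table.items():
    print(source, values)
```

The regression and residual rows always add up to the total row, for both df and SS, which is a quick way to check the output.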
Unless you understand statistics and how regression models are calculated, the values at the bottom of the summary won't have a lot of meaning. However, Multiple R and R Square are the two most important values to check.
Multiple Linear Regression Analysis in Excel
To perform the same linear regression but with multiple independent variables, select the entire range (multiple columns and rows) for the Input X Range. The columns must be adjacent to each other. When you add independent variables, R Square can only stay the same or increase, even if the new variables have no real effect, so use Adjusted R Square to compare models with different numbers of variables. A regression analysis in Excel can still help you find correlations with one or more of those variables that you may not realize exist just by reviewing the data manually.
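For readers curious about what happens when there is more than one independent variable, here is a minimal Python sketch that fits a multiple linear regression by solving the normal equations. This is one textbook approach and is not necessarily how Excel computes its results. The data is hypothetical: the `exercise_minutes` column is invented, and the weights are constructed so that weight = 130 + 0.01 * calories - 0.05 * exercise_minutes exactly, which makes the result easy to check. The function names are assumptions of this sketch.

```python
# Hypothetical data, constructed for illustration only.
calories = [1800, 2000, 2200, 2400, 2600, 2000, 2400]
exercise_minutes = [30, 0, 45, 20, 60, 50, 10]
weight = [146.5, 150.0, 149.75, 153.0, 153.0, 147.5, 153.5]


def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_multiple(x_columns, y):
    """Return [intercept, coef_1, coef_2, ...] from the normal equations."""
    # Each row is one observation, with a leading 1 for the intercept.
    rows = [[1.0] + list(values) for values in zip(*x_columns)]
    k = len(rows[0])
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


intercept, calorie_coef, exercise_coef = fit_multiple(
    [calories, exercise_minutes], weight
)
print(f"intercept={intercept:.3f}, calories={calorie_coef:.4f}, "
      f"exercise={exercise_coef:.4f}")
```

Each coefficient estimates the effect of one independent variable while the others are held constant, which is how to read the coefficient rows Excel lists for each column in the Input X Range.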