File:Effect of multicollinearity on coefficients of linear model.png

Effect_of_multicollinearity_on_coefficients_of_linear_model.png (688 × 509 pixels, file size: 61 KB, MIME type: image/png)

Summary

Description
English: The true parameters are a_1 = 2 and a_2 = 4. They are estimated reliably when X_1 and X_2 are uncorrelated (black points) but unreliably when X_1 and X_2 are strongly correlated (red points, correlation 0.95). Each point is one of 1000 linear fits, each performed on a separately simulated training set of 50 observations.
Date
Source https://stats.stackexchange.com/a/435988
Author Demetri Pananos
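
Why the correlated case spreads out can be sketched with the standard covariance formula for ordinary least squares. The block below is an illustration and not part of the original answer. It assumes centred predictors whose sample variances are exactly 1 and whose sample correlation is exactly r; in the simulation these quantities only fluctuate around 1 and rho, so the result is approximate. The symbols N (observations per data set), sigma (noise standard deviation, 1 in the listing) and \tilde{X} (centred predictor matrix) are notation introduced here.

```latex
% Covariance of the slope estimates (intercept included in the model)
\operatorname{Cov}(\hat a_1, \hat a_2)
  = \sigma^2 \left(\tilde{X}^\top \tilde{X}\right)^{-1},
\qquad
\tilde{X}^\top \tilde{X} = (N-1)\begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix}

% Inverting the 2 x 2 matrix
\left(\tilde{X}^\top \tilde{X}\right)^{-1}
  = \frac{1}{(N-1)(1-r^2)} \begin{pmatrix} 1 & -r \\ -r & 1 \end{pmatrix}

% Hence, for j = 1, 2
\operatorname{Var}(\hat a_j) = \frac{\sigma^2}{(N-1)(1-r^2)},
\qquad
\operatorname{Corr}(\hat a_1, \hat a_2) = -r

% With r = 0.95: variance inflation and standard-deviation inflation
\frac{1}{1-0.95^2} = \frac{1}{0.0975} \approx 10.3,
\qquad
\sqrt{10.3} \approx 3.2
```

Under these assumptions each slope estimate has roughly 3.2 times the standard deviation in the correlated case, and the two estimates are strongly negatively correlated, which matches the elongated, downward-sloping red cloud.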

   library(tidyverse)
   # Simulate one data set with predictor correlation rho
   # and return the fitted slopes
   sim <- function(rho) {
     # Number of samples to draw
     N <- 50
     # Covariance matrix of the predictors (unit variances, correlation rho)
     covar <- matrix(c(1, rho, rho, 1), byrow = TRUE, nrow = 2)
     # Prepend a column of 1s (intercept) to N draws from a
     # 2-dimensional Gaussian with covariance matrix covar
     X <- cbind(rep(1, N), MASS::mvrnorm(N, mu = c(0, 0), Sigma = covar))
     # True betas for our regression: intercept 1, a1 = 2, a2 = 4
     betas <- c(1, 2, 4)
     # Make the outcome
     y <- X %*% betas + rnorm(N, 0, 1)
     # Fit a linear model
     model <- lm(y ~ X[, 2] + X[, 3])
     # Return a data frame of the two slope estimates
     tibble(a1 = coef(model)[[2]], a2 = coef(model)[[3]])
   }
   # Run the function 1000 times and stack the results
   # (rerun() is deprecated as of purrr 1.0.0, so map() is used here)
   zero_covar <- map(1:1000, ~ sim(0)) %>% bind_rows()
   # Same as above, but the predictors now have correlation 0.95
   high_covar <- map(1:1000, ~ sim(0.95)) %>% bind_rows()
   # Plot: correlated case in red, uncorrelated case in black on top
   zero_covar %>%
     ggplot(aes(a1, a2)) +
     geom_point(data = high_covar, color = 'red') +
     geom_point()
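
The visual impression from the plot can also be checked numerically. The following is a minimal sketch in base R plus MASS, not part of the original answer: the helper name slopes, the seed, and the summary statistics are assumptions introduced here for illustration. It compares the empirical standard deviations of the slope estimates in the two cases with the square root of the usual variance inflation factor, 1 / (1 - rho^2).

```r
# Hypothetical helper (not in the original listing): returns an
# n_rep x 2 matrix of slope estimates for predictor correlation rho.
set.seed(1)  # arbitrary seed, used only for reproducibility
slopes <- function(rho, n_rep = 1000, N = 50) {
  covar <- matrix(c(1, rho, rho, 1), nrow = 2)
  t(replicate(n_rep, {
    X <- MASS::mvrnorm(N, mu = c(0, 0), Sigma = covar)
    # Same true parameters as the listing: intercept 1, a1 = 2, a2 = 4
    y <- 1 + 2 * X[, 1] + 4 * X[, 2] + rnorm(N)
    coef(lm(y ~ X))[2:3]
  }))
}

est_zero <- slopes(0)
est_high <- slopes(0.95)

# Empirical standard deviation of each slope estimate
sd_zero <- apply(est_zero, 2, sd)
sd_high <- apply(est_high, 2, sd)

# Ratio of spreads, to compare with the theoretical inflation factor
print(sd_high / sd_zero)
print(sqrt(1 / (1 - 0.95^2)))

# Correlation between the two slope estimates in the correlated case
print(cor(est_high[, 1], est_high[, 2]))
```

The printed ratio should be close to the theoretical factor, and the correlation between the two estimates should be strongly negative, which is why the red cloud is stretched along a downward-sloping diagonal.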

Licensing

This file is licensed under the Creative Commons Attribution-Share Alike 4.0 International license.
You are free:
  • to share – to copy, distribute and transmit the work
  • to remix – to adapt the work
Under the following conditions:
  • attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
  • share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.

Captions

The true parameters are a_1 = 2 and a_2 = 4. They are estimated reliably when X_1 and X_2 are uncorrelated (black points) but unreliably when X_1 and X_2 are correlated (red points).

Items portrayed in this file

depicts

14 November 2019

File history

Date/Time 09:15, 30 July 2022 (current)
Dimensions 688 × 509 (61 KB)
User Biggerj1
Comment Uploaded a work by Demetri Pananos from https://stats.stackexchange.com/a/435988 with UploadWizard