 # Solutions to Algoritmo Lab’s Data Science Challenge – November 2021 on Linear Regression

Avilash Ghosh, December 25, 2021

Q1. For a good model, R-squared takes a value close to 1.0

Ans. True

Q2. The n-1 dummy encoding needs to be performed in all-numeric predictor variables

Ans. False

Q3. If the p-value of the F-Statistic is less than the significance level,

Ans. Data provide evidence that the regression model fits the data well

Q4. When can R-Squared be negative?

Ans. When the regression model fits worse than the average line (a horizontal line at the mean of the target)
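
A minimal sketch of why this happens, using the usual definition R² = 1 - SS_res / SS_tot. The data and the deliberately bad predictions are made up for illustration; the helper name `r_squared` is an assumption, not something from the challenge.

```python
def r_squared(y_true, y_pred):
    """R-squared as 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot

y = [1.0, 2.0, 3.0, 4.0, 5.0]

# Predicting the mean for every row is the "average line" baseline.
mean_pred = [3.0] * len(y)

# Illustrative bad model: predictions run in the opposite direction.
bad_pred = [5.0, 4.0, 3.0, 2.0, 1.0]

r2_mean = r_squared(y, mean_pred)  # 0.0: no better than the average line
r2_bad = r_squared(y, bad_pred)    # -3.0: SS_res (40) exceeds SS_tot (10)
print(r2_mean, r2_bad)
```

Whenever the residual sum of squares exceeds the total sum of squares, the ratio is above 1 and R² drops below zero.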

Q5. Build an intercept model with 7 numeric predictors & 1 numeric target variable. Build 2nd intercept model after z-score standardizing the predictors.

Ans. Both multiple and adjusted R-squared will be the same for models 1 and 2
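
The invariance can be checked on a small scale. The sketch below simplifies the question to a single predictor (the question uses 7) and uses made-up data; the function names are illustrative. Z-scoring is an invertible linear change of the predictor, so the fitted values, and therefore R-squared, should not move.

```python
from statistics import mean, pstdev

def r_squared_simple(x, y):
    """R-squared of the least-squares line y = a + b * x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for n rows and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Hypothetical data, one predictor instead of seven.
x = [2.0, 4.0, 5.0, 7.0, 9.0, 12.0]
y = [3.1, 4.9, 6.2, 8.1, 9.8, 13.5]

# Z-score standardization of the predictor.
x_std = [(a - mean(x)) / pstdev(x) for a in x]

r2_raw = r_squared_simple(x, y)
r2_std = r_squared_simple(x_std, y)
adj_raw = adjusted_r_squared(r2_raw, len(x), 1)
adj_std = adjusted_r_squared(r2_std, len(x), 1)
print(r2_raw, r2_std, adj_raw, adj_std)
```

Adjusted R-squared depends only on R-squared, the number of rows, and the number of predictors, so it is unchanged as well. Only the coefficients change scale.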

Q6. An intercept model is built with X as a predictor. Change X to Z where Z is 2021-X. Build the 2nd intercept model.

Ans. If the coefficient of X in model 1 is 121, the coefficient of Z in model 2 is -121
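
The reasoning is a substitution: y = a + bX = a + b(2021 - Z) = (a + 2021b) - bZ, so the slope flips sign and the intercept absorbs the shift. A small numerical check, with made-up data and an illustrative helper `fit_line`:

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: X could be a year, so Z = 2021 - X is an age.
x = [1960.0, 1975.0, 1988.0, 1994.0, 2003.0, 2015.0]
y = [10.0, 14.0, 15.0, 19.0, 20.0, 25.0]
z = [2021 - a for a in x]

intercept_x, slope_x = fit_line(x, y)
intercept_z, slope_z = fit_line(z, y)

print(slope_x, slope_z)          # equal magnitude, opposite sign
print(intercept_x, intercept_z)  # intercept_z = intercept_x + 2021 * slope_x
```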

Q7. Is it necessary to standardize variables before using Lasso and Ridge Regression?

Ans. Yes

Q8. Errors from a linear regression model should be normally distributed with zero mean. If error terms are not normally distributed, it implies

Ans. Confidence Intervals will be too wide or narrow

Q9. The parameters of a linear regression model can be estimated using

Ans. Both least squares and MLE procedure

Q10. In linear regression, we can calculate the importance of variables by ranking predictors based on the

Ans. Descending order of absolute value of the standardized coefficient.

Q11. In the linear regression model, when an interaction is created from two variables that are not centered on 0,

Ans. Some amount of collinearity will be induced
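
A rough illustration of the induced collinearity, assuming a small made-up design in which the two variables are unrelated to each other. The `pearson` helper is written out by hand to keep the sketch free of dependencies.

```python
def pearson(u, v):
    """Pearson correlation between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

# Hypothetical 3 x 3 grid: x1 and x2 are uncorrelated, neither is centered on 0.
x1 = [a for a in (10.0, 11.0, 12.0) for _ in range(3)]
x2 = [b for _ in range(3) for b in (20.0, 21.0, 22.0)]

# Interaction built from the raw variables.
raw_interaction = [a * b for a, b in zip(x1, x2)]
corr_raw = pearson(x1, raw_interaction)

# Interaction built after centering both variables on 0.
m1, m2 = sum(x1) / len(x1), sum(x2) / len(x2)
x1c = [a - m1 for a in x1]
x2c = [b - m2 for b in x2]
centered_interaction = [a * b for a, b in zip(x1c, x2c)]
corr_centered = pearson(x1c, centered_interaction)

print(corr_raw, corr_centered)
```

In this balanced grid the centered interaction is uncorrelated with `x1`, while the raw interaction is strongly correlated with it. With real data, centering typically reduces the correlation rather than removing it entirely.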

Q12. In the linear regression model, is it helpful to standardize a variable when you include polynomial terms like $X^2$ or $X^3$?

Ans. Yes, standardization helps reduce the collinearity between the variable and its polynomial terms
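
A small check of the effect for the squared term, using made-up values 1 to 10. Only the centering step of standardization is applied here, since dividing by the standard deviation does not change a correlation. The `pearson` helper is an illustrative hand-written function.

```python
def pearson(u, v):
    """Pearson correlation between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

x = [float(i) for i in range(1, 11)]  # all positive, far from 0
x_sq = [a ** 2 for a in x]

mean_x = sum(x) / len(x)
xc = [a - mean_x for a in x]          # centered copy of x
xc_sq = [a ** 2 for a in xc]

corr_raw = pearson(x, x_sq)           # close to 1: X and X^2 move together
corr_centered = pearson(xc, xc_sq)    # about 0 for these symmetric values
print(corr_raw, corr_centered)
```

The correlation drops to zero here only because the values are symmetric around their mean. For skewed data the reduction is partial.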

Q13. Which of the following enforces sparsity in models?

Ans. L1 Norm

Q14. The more able a model is to ignore extreme values in the data, the more robust it is. Which of the following is correct?

Ans. L1-norm is more robust than L2-norm
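
One way to see this is the simplest possible model, a single constant: the constant that minimizes the sum of squared errors (L2) is the mean, and the constant that minimizes the sum of absolute errors (L1) is the median. The numbers below are made up for illustration.

```python
from statistics import mean, median

clean = [10.0, 11.0, 12.0, 13.0, 14.0]
with_outlier = clean[:-1] + [140.0]  # replace the last value with an extreme one

# L2 fit of a constant: the mean, which is dragged toward the outlier.
l2_clean, l2_outlier = mean(clean), mean(with_outlier)

# L1 fit of a constant: the median, which does not move here.
l1_clean, l1_outlier = median(clean), median(with_outlier)

print(l2_clean, l2_outlier)  # the L2 estimate shifts substantially
print(l1_clean, l1_outlier)  # the L1 estimate stays the same
```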

Q15. A closed-form solution for a linear regression model is given by $\beta = (X^T X)^{-1} X^T Y$. In case perfect multicollinearity exists, $(X^T X)^{-1}$ may lead to

Ans. Singular matrix error
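
A minimal sketch of the failure, assuming two made-up predictors where the second is exactly twice the first and, for brevity, no intercept column. The 2 x 2 matrix of inner products has determinant zero, so it has no inverse.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0 * a for a in x1]  # perfectly collinear with x1

# Entries of X^T X for the design matrix X = [x1, x2].
gram = [[dot(x1, x1), dot(x1, x2)],
        [dot(x2, x1), dot(x2, x2)]]

det = gram[0][0] * gram[1][1] - gram[0][1] * gram[1][0]
print(gram, det)  # det is 0.0, so X^T X cannot be inverted
```

A linear algebra routine asked to invert such a matrix will typically raise a singular matrix error or return numerically meaningless values.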

Q16. One-Hot encoding can lead to multi-collinearity and should be avoided in linear regression analysis

Ans. True
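
The collinearity comes from the fact that, in a model with an intercept, the full set of one-hot columns always sums to the intercept column of ones. A small sketch with a made-up categorical variable:

```python
colors = ["red", "green", "blue", "green", "red"]  # hypothetical category
levels = sorted(set(colors))                       # ['blue', 'green', 'red']

# Full one-hot encoding: one column per level.
one_hot = [[1 if c == lv else 0 for lv in levels] for c in colors]

# Every row sums to 1, which duplicates the intercept column.
row_sums = [sum(row) for row in one_hot]

# Dropping one level gives the n-1 dummy encoding and breaks the dependency.
dummies = [row[1:] for row in one_hot]

print(row_sums)
print(dummies)
```

With n-1 dummies, the dropped level becomes the baseline captured by the intercept.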

Q17. The Durbin Watson (DW) statistic is a test for autocorrelation in the residuals of a regression model. DW can take values between 0 and 4.

Ans. A value of 2 indicates there is zero autocorrelation
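
The statistic is the sum of squared successive differences of the residuals divided by the sum of squared residuals. The sketch below computes it directly on made-up residual sequences; the function name is illustrative.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Long runs of the same sign: strong positive autocorrelation.
positive = [1.0] * 10 + [-1.0] * 10

# Sign flips every step: strong negative autocorrelation.
negative = [1.0, -1.0] * 10

# Sign flips every second step: lag-1 autocorrelation close to zero.
near_zero = [1.0, 1.0, -1.0, -1.0] * 5

dw_positive = durbin_watson(positive)    # 0.2, near 0
dw_negative = durbin_watson(negative)    # 3.8, near 4
dw_near_zero = durbin_watson(near_zero)  # 1.8, close to 2
print(dw_positive, dw_negative, dw_near_zero)
```

Values well below 2 point to positive autocorrelation and values well above 2 to negative autocorrelation.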

Q18. Ridge regression can reduce the coefficients to zero values

Ans. False 
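
The contrast between the L2 (ridge) and L1 (lasso) penalties is easiest to see in a simplified setting. The sketch assumes an orthonormal design ($X^T X = I$) and penalties scaled so that ridge divides each least-squares coefficient by (1 + λ) while lasso soft-thresholds it by λ. The coefficient values and λ are made up.

```python
def ridge_shrink(b, lam):
    """Ridge estimate under the orthonormal-design assumption."""
    return b / (1.0 + lam)

def soft_threshold(b, lam):
    """Lasso estimate under the orthonormal-design assumption."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0

ols = [3.0, -0.4, 0.2, -2.5]  # hypothetical least-squares coefficients
lam = 0.5                     # hypothetical penalty strength

ridge = [ridge_shrink(b, lam) for b in ols]
lasso = [soft_threshold(b, lam) for b in ols]

print(ridge)  # every coefficient is smaller, none is exactly zero
print(lasso)  # [2.5, 0.0, 0.0, -2.0]: small coefficients are set to zero
```

Ridge scales coefficients toward zero but a nonzero coefficient stays nonzero for any finite penalty, whereas the L1 penalty can set coefficients exactly to zero.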