**GATE 2024 Regression Analysis MCQs**

1. **What is the main objective of Regression Analysis?**

A) Classification

B) Prediction

C) Grouping

D) Aggregation

**Answer: B) Prediction**

2. **In a simple linear regression, how many variables are involved?**

A) One

B) Two

C) Three

D) Four

**Answer: B) Two**

3. **What is the term for the variable that is being predicted in regression analysis?**

A) Independent variable

B) Dependent variable

C) Explanatory variable

D) Covariate

**Answer: B) Dependent variable**

4. **Which of the following is an assumption of linear regression?**

A) Homoscedasticity

B) Multicollinearity

C) Heteroscedasticity

D) Outliers

**Answer: A) Homoscedasticity**

5. **What does the coefficient of determination (R-squared) represent in regression analysis?**

A) The proportion of variance in the dependent variable explained by the independent variable(s)

B) The p-value of the regression model

C) The slope of the regression line

D) The standard error of the regression

**Answer: A) The proportion of variance in the dependent variable explained by the independent variable(s)**
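
As a rough illustration of the definition above, R-squared can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the sample values are made up for this sketch.

```python
def r_squared(observed, predicted):
    """Return 1 - SS_res / SS_tot for paired observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    # Variation left unexplained by the predictions
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    # Total variation of the observed values around their mean
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


observed = [2.0, 4.0, 6.0, 8.0]
perfect = [2.0, 4.0, 6.0, 8.0]    # predictions that match every observation
mean_only = [5.0, 5.0, 5.0, 5.0]  # predictions that always return the mean

r2_perfect = r_squared(observed, perfect)   # 1.0: all variance explained
r2_mean = r_squared(observed, mean_only)    # 0.0: no variance explained
```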

6. **What is the range of possible values for R-squared in a least squares model fitted with an intercept?**

A) -1 to 1

B) 0 to 1

C) 0 to ∞

D) -∞ to ∞

**Answer: B) 0 to 1**

7. **Which of the following is used to assess the statistical significance of the coefficients in a regression model?**

A) p-value

B) R-squared

C) Standard error

D) Variance

**Answer: A) p-value**

8. **What does a p-value less than 0.05 typically indicate in regression analysis?**

A) The coefficient is statistically significant at the 5% level

B) The coefficient is not statistically significant

C) The model is a good fit

D) The model is overfit

**Answer: A) The coefficient is statistically significant at the 5% level**

9. **What is multicollinearity in regression analysis?**

A) When there is a linear relationship between the dependent variable and one of the independent variables

B) When two or more independent variables are highly correlated

C) When there are outliers in the data

D) When the regression line does not pass through the origin

**Answer: B) When two or more independent variables are highly correlated**
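
One simple way to see this, sketched below with invented data, is to compute the Pearson correlation between two predictors. With exactly two predictors, the variance inflation factor (VIF) reduces to 1 / (1 - r²); the names `pearson`, `x1`, and `x2` are illustrative only.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]  # roughly 2 * x1, so nearly redundant

r = pearson(x1, x2)
# VIF for the two-predictor case; large values signal multicollinearity
vif = 1.0 / (1.0 - r ** 2)
```

A common rule of thumb treats a VIF above about 10 as a sign of problematic multicollinearity, though the threshold is a convention rather than a strict test.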

10. **What is the purpose of residual analysis in regression?**

A) To check for normality of the residuals

B) To identify influential observations

C) To assess the linearity assumption

D) To validate the regression model

**Answer: D) To validate the regression model**

11. **Which of the following is NOT a type of regression analysis?**

A) Linear Regression

B) Logistic Regression

C) Exponential Regression

D) Categorical Regression

**Answer: D) Categorical Regression**

12. **What is the difference between simple linear regression and multiple linear regression?**

A) The number of independent variables used

B) The type of data used (continuous vs. categorical)

C) The type of dependent variable (continuous vs. categorical)

D) The presence of outliers in the data

**Answer: A) The number of independent variables used**

13. **What is the purpose of the Durbin-Watson statistic in regression analysis?**

A) To test for autocorrelation in the residuals

B) To test for heteroscedasticity in the residuals

C) To test for multicollinearity in the independent variables

D) To test for normality in the dependent variable

**Answer: A) To test for autocorrelation in the residuals**
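
The statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. A minimal sketch with made-up residual sequences follows; values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of residuals in time order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Residuals that flip sign every step: negative autocorrelation
alternating = [1.0, -1.0, 1.0, -1.0, 1.0]
dw_alternating = durbin_watson(alternating)  # 16 / 5 = 3.2

# Residuals that stay on one side for a while: positive autocorrelation
persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
dw_persistent = durbin_watson(persistent)    # 4 / 6, well below 2
```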

14. **Which of the following statements about outliers in regression analysis is true?**

A) Outliers have no effect on regression results

B) Outliers can significantly influence the regression model

C) Outliers only affect the intercept term

D) Outliers are always due to data entry errors

**Answer: B) Outliers can significantly influence the regression model**

15. **What is the purpose of the F-test in regression analysis?**

A) To test the overall significance of the regression model

B) To test for multicollinearity

C) To test for normality of the residuals

D) To test for homoscedasticity

**Answer: A) To test the overall significance of the regression model**

16. **What does the slope coefficient represent in a simple linear regression model?**

A) The change in the dependent variable for a one-unit change in the independent variable

B) The intercept of the regression line

C) The standard error of the regression

D) The p-value of the regression model

**Answer: A) The change in the dependent variable for a one-unit change in the independent variable**
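
To make the interpretation concrete, the sketch below fits a line by ordinary least squares to invented data generated from y = 2x + 1. The helper name `fit_simple_linear` is hypothetical.

```python
def fit_simple_linear(x, y):
    """Return (slope, intercept) of the least squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

slope, intercept = fit_simple_linear(x, y)
# slope is 2.0: each one-unit increase in x raises the prediction by 2
```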

17. **Which of the following is NOT a method to deal with multicollinearity in regression analysis?**

A) Removing one of the correlated variables

B) Combining the correlated variables into a single variable

C) Using a different regression technique

D) Ignoring multicollinearity as it has no impact on the model

**Answer: D) Ignoring multicollinearity as it has no impact on the model**

18. **What is the formula for calculating the residuals in regression analysis?**

A) Observed value - Predicted value

B) Predicted value - Observed value

C) (Observed value)^2 - (Predicted value)^2

D) (Predicted value)^2 - (Observed value)^2

**Answer: A) Observed value - Predicted value**
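
A short sketch of this formula, using made-up observed and predicted values:

```python
observed = [3.0, 5.0, 8.0]
predicted = [2.5, 5.5, 7.0]

# Residual = observed value - predicted value, computed pointwise
residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]
# residuals == [0.5, -0.5, 1.0]
```

A positive residual means the model under-predicted that observation; a negative residual means it over-predicted.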

19. **In binary logistic regression, what type of variable is the dependent variable?**

A) Continuous

B) Categorical

C) Dichotomous

D) Ordinal

**Answer: C) Dichotomous**

20. **What is the purpose of the Akaike Information Criterion (AIC) in regression analysis?**

A) To compare the goodness of fit of different models

B) To test for normality of the residuals

C) To assess multicollinearity

D) To calculate the p-value of the regression model

**Answer: A) To compare the goodness of fit of different models**
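
As a hedged illustration, for least squares models with Gaussian errors the AIC is often written, up to an additive constant, as n·ln(RSS/n) + 2k, where k is the number of fitted parameters. The RSS values below are invented, and conventions differ on whether k counts the error variance.

```python
import math


def aic_least_squares(rss, n, k):
    """AIC for a least squares fit, dropping constants shared by all models."""
    return n * math.log(rss / n) + 2 * k


# Two hypothetical models fitted to the same 100 observations
aic_small = aic_least_squares(rss=50.0, n=100, k=2)  # simpler model
aic_large = aic_least_squares(rss=49.5, n=100, k=6)  # slightly better fit

# Lower AIC is preferred: the tiny gain in fit does not pay for 4 extra parameters
preferred = "small" if aic_small < aic_large else "large"
```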

21. **Which of the following statements is true about Ridge Regression?**

A) It adds a penalty term to the loss function to prevent overfitting

B) It is used for classification problems

C) It is a non-parametric regression technique

D) It is only applicable to simple linear regression

**Answer: A) It adds a penalty term to the loss function to prevent overfitting**
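
The shrinkage effect of the penalty can be sketched for the single-predictor case. Assuming the intercept is left unpenalized, minimizing the squared error plus `lam` times the squared slope gives slope = Sxy / (Sxx + lam). The function name and data are illustrative.

```python
def ridge_slope(x, y, lam):
    """Ridge slope for one predictor with an unpenalized intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    # The penalty lam inflates the denominator and shrinks the slope toward 0
    return sxy / (sxx + lam)


x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

ols_slope = ridge_slope(x, y, lam=0.0)     # 2.0, ordinary least squares
shrunk_slope = ridge_slope(x, y, lam=5.0)  # 1.0, pulled toward zero
```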

22. **What does the term "heteroscedasticity" refer to in regression analysis?**

A) Unequal variances of the residuals across different levels of the independent variable

B) Equal variances of the residuals across different levels of the independent variable

C) A linear relationship between the dependent and independent variables

D) The presence of outliers in the data

**Answer: A) Unequal variances of the residuals across different levels of the independent variable**

23. **Which of the following is an example of a non-parametric regression technique?**

A) Polynomial Regression

B) Support Vector Regression (SVR)

C) K-Nearest Neighbors (KNN) Regression

D) Ridge Regression

**Answer: C) K-Nearest Neighbors (KNN) Regression**
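
KNN regression is non-parametric because it fits no fixed set of coefficients; it predicts by averaging the targets of the closest training points. A minimal one-dimensional sketch with invented data:

```python
def knn_predict(train_x, train_y, query, k):
    """Predict by averaging the targets of the k training points nearest to query."""
    # Sort training pairs by absolute distance from the query point
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: abs(pair[0] - query),
    )[:k]
    return sum(target for _, target in nearest) / k


train_x = [1.0, 2.0, 3.0, 10.0]
train_y = [1.0, 2.0, 3.0, 20.0]

# The two nearest neighbours of 2.2 are x=2.0 and x=3.0
prediction = knn_predict(train_x, train_y, query=2.2, k=2)  # 2.5
```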

24. **What is the purpose of cross-validation in regression analysis?**

A) To assess the performance of the regression model on unseen data

B) To calculate the p-value of the regression model

C) To test for multicollinearity in the independent variables

D) To identify influential observations in the data

**Answer: A) To assess the performance of the regression model on unseen data**
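
The idea can be sketched as k-fold cross-validation: hold out each fold in turn, fit on the rest, and average the held-out error. The helpers `fit_line` and `k_fold_mse` and the data are assumptions made for this example, and folds are assigned by index modulo k rather than by random shuffling.

```python
def fit_line(x, y):
    """Least squares slope and intercept for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_mse(x, y, k):
    """Average mean squared error over k held-out folds."""
    n = len(x)
    fold_errors = []
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        # Fit only on the training part of the data
        slope, intercept = fit_line(
            [x[i] for i in train_idx], [y[i] for i in train_idx]
        )
        # Score only on observations the model has not seen
        errors = [(y[i] - (intercept + slope * x[i])) ** 2 for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0]  # exactly linear, so held-out error is ~0

cv_mse = k_fold_mse(x, y, k=3)
```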

25. **What is the primary goal of feature selection in regression analysis?**

A) To identify the most important independent variables for the model

B) To increase the complexity of the regression model

C) To introduce multicollinearity

D) To reduce the sample size

**Answer: A) To identify the most important independent variables for the model**

26. **Which of the following regression techniques is specifically designed for dealing with time series data?**

A) Time Series Regression

B) ARIMA (AutoRegressive Integrated Moving Average)

C) Lasso Regression

D) Principal Component Regression (PCR)

**Answer: B) ARIMA (AutoRegressive Integrated Moving Average)**

27. **What is the purpose of the Cook's Distance statistic in regression analysis?**

A) To identify influential observations

B) To test for normality of the residuals

C) To assess multicollinearity

D) To test for heteroscedasticity

**Answer: A) To identify influential observations**
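
For simple linear regression, Cook's distance can be sketched from the residual and leverage of each point: D_i = e_i² / (p·s²) × h_i / (1 - h_i)², with p = 2 fitted parameters and s² the residual mean square. The data below is invented so that the last point has both high leverage and a poor fit.

```python
def cooks_distances(x, y):
    """Cook's distance for each observation in a simple linear regression."""
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual mean square with n - p degrees of freedom
    mse = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage of each point: how far its x value sits from the mean
    leverages = [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    return [
        (e ** 2 / (p * mse)) * (h / (1.0 - h) ** 2)
        for e, h in zip(residuals, leverages)
    ]


x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 2.0]  # last point breaks the upward trend

distances = cooks_distances(x, y)
most_influential = max(range(len(x)), key=lambda i: distances[i])  # index 4
```

A frequently quoted rule of thumb flags observations with a Cook's distance above 1 for closer inspection.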

28. **Which of the following statements is true about polynomial regression?**

A) It models a linear relationship between the variables

B) It can capture nonlinear relationships between the variables

C) It can only be applied to simple linear regression

D) It is not affected by outliers in the data

**Answer: B) It can capture nonlinear relationships between the variables**

29. **In regression analysis, what is the purpose of transforming variables (e.g., taking the logarithm)?**

A) To make the data more interpretable

B) To improve the visual representation of the data

C) To meet the assumptions of the regression model

D) To increase the complexity of the model

**Answer: C) To meet the assumptions of the regression model**

30. **What is the primary drawback of using stepwise regression for variable selection?**

A) It may select variables that are not actually relevant

B) It is computationally intensive

C) It cannot handle categorical variables

D) It is prone to overfitting

**Answer: A) It may select variables that are not actually relevant**