GATE 2024 Regression Analysis MCQs

1. What is the main objective of Regression Analysis?

A) Classification

B) Prediction

C) Grouping

D) Aggregation

Answer: B) Prediction


2. In a simple linear regression, how many variables are involved?

A) One

B) Two

C) Three

D) Four

Answer: B) Two


3. What is the term for the variable that is being predicted in regression analysis?

A) Independent variable

B) Dependent variable

C) Explanatory variable

D) Covariate

Answer: B) Dependent variable


4. Which of the following is an assumption of linear regression?

A) Homoscedasticity

B) Multicollinearity

C) Heteroscedasticity

D) Outliers

Answer: A) Homoscedasticity


5. What does the coefficient of determination (R-squared) represent in regression analysis?

A) The proportion of variance in the dependent variable explained by the independent variable(s)

B) The p-value of the regression model

C) The slope of the regression line

D) The standard error of the regression

Answer: A) The proportion of variance in the dependent variable explained by the independent variable(s)
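
Illustration: one common way to compute R-squared is 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the total sum of squares around the mean. The sketch below assumes that definition; the function name and the data are made up for illustration.

```python
def r_squared(observed, predicted):
    """Return 1 - SS_res / SS_tot for paired observed and predicted values."""
    mean_observed = sum(observed) / len(observed)
    # Variation left unexplained by the model.
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    # Total variation of the dependent variable around its mean.
    ss_tot = sum((y - mean_observed) ** 2 for y in observed)
    return 1 - ss_res / ss_tot


# Hypothetical data, made up for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
r2 = r_squared(observed, predicted)
print(r2)  # about 0.95: roughly 95% of the variance is explained
```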


6. What is the range of possible values for R-squared in an ordinary least squares model with an intercept?

A) -1 to 1

B) 0 to 1

C) 0 to ∞

D) -∞ to ∞

Answer: B) 0 to 1


7. Which of the following is used to assess the statistical significance of the coefficients in a regression model?

A) p-value

B) R-squared

C) Standard error

D) Variance

Answer: A) p-value


8. What does a p-value less than 0.05 typically indicate in regression analysis?

A) The coefficient is statistically significant at the 5% level

B) The coefficient is not statistically significant

C) The model is a good fit

D) The model is overfit

Answer: A) The coefficient is statistically significant at the 5% level


9. What is multicollinearity in regression analysis?

A) When there is a linear relationship between the dependent variable and one of the independent variables

B) When two or more independent variables are highly correlated

C) When there are outliers in the data

D) When the regression line does not pass through the origin

Answer: B) When two or more independent variables are highly correlated
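
Illustration: with exactly two independent variables, a quick check for multicollinearity is their Pearson correlation r, and the variance inflation factor reduces to 1 / (1 - r^2). The sketch below assumes that two-predictor case; the data are hypothetical, and the usual cutoffs (VIF above roughly 5 or 10) are rules of thumb rather than fixed limits.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical predictors: x2 is almost exactly twice x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x1, x2)
# Variance inflation factor for the two-predictor case only.
vif = 1 / (1 - r ** 2)
print(r, vif)
```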


10. What is the purpose of residual analysis in regression?

A) To check for normality of the residuals

B) To identify influential observations

C) To assess the linearity assumption

D) To validate the regression model

Answer: D) To validate the regression model


11. Which of the following is NOT a type of regression analysis?

A) Linear Regression

B) Logistic Regression

C) Exponential Regression

D) Categorical Regression

Answer: D) Categorical Regression


12. What is the difference between simple linear regression and multiple linear regression?

A) The number of independent variables used

B) The type of data used (continuous vs. categorical)

C) The type of dependent variable (continuous vs. categorical)

D) The presence of outliers in the data

Answer: A) The number of independent variables used


13. What is the purpose of the Durbin-Watson statistic in regression analysis?

A) To test for autocorrelation in the residuals

B) To test for heteroscedasticity in the residuals

C) To test for multicollinearity in the independent variables

D) To test for normality in the dependent variable

Answer: A) To test for autocorrelation in the residuals
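
Illustration: the Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. A minimal sketch with made-up residuals:

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residual sequences, ordered in time.
alternating = [1.0, -1.0, 1.0, -1.0]           # sign flips every step
persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]  # long runs of the same sign

dw_alternating = durbin_watson(alternating)  # above 2: negative autocorrelation
dw_persistent = durbin_watson(persistent)    # below 2: positive autocorrelation
print(dw_alternating, dw_persistent)
```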


14. Which of the following statements about outliers in regression analysis is true?

A) Outliers have no effect on regression results

B) Outliers can significantly influence the regression model

C) Outliers only affect the intercept term

D) Outliers are always due to data entry errors

Answer: B) Outliers can significantly influence the regression model


15. What is the purpose of the F-test in regression analysis?

A) To test the overall significance of the regression model

B) To test for multicollinearity

C) To test for normality of the residuals

D) To test for homoscedasticity

Answer: A) To test the overall significance of the regression model
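
Illustration: for a linear model with an intercept, k independent variables, and n observations, the overall F-statistic can be written in terms of R-squared as F = (R^2 / k) / ((1 - R^2) / (n - k - 1)). The sketch below assumes that form; the input values are hypothetical, and turning F into a p-value would additionally require the F distribution, which is not shown.

```python
def overall_f_statistic(r_squared, n_observations, n_predictors):
    """Overall F-statistic of a linear model with an intercept."""
    explained = r_squared / n_predictors
    unexplained = (1 - r_squared) / (n_observations - n_predictors - 1)
    return explained / unexplained


# Hypothetical model summary: R-squared 0.5, 23 observations, 2 predictors.
f_stat = overall_f_statistic(0.5, n_observations=23, n_predictors=2)
print(f_stat)
```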


16. What does the slope coefficient represent in a simple linear regression model?

A) The change in the dependent variable for a one-unit change in the independent variable

B) The intercept of the regression line

C) The standard error of the regression

D) The p-value of the regression model

Answer: A) The change in the dependent variable for a one-unit change in the independent variable
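
Illustration: in simple linear regression fitted by least squares, the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. A minimal sketch with made-up data that lie exactly on the line y = 2x + 1:

```python
def fit_line(x, y):
    """Least-squares intercept and slope for one independent variable."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data on the line y = 2x + 1.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(x, y)
# A one-unit increase in x changes the prediction by exactly `slope`.
change = (intercept + slope * 3.0) - (intercept + slope * 2.0)
print(intercept, slope, change)
```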


17. Which of the following is NOT a method to deal with multicollinearity in regression analysis?

A) Removing one of the correlated variables

B) Combining the correlated variables into a single variable

C) Using a different regression technique

D) Ignoring multicollinearity as it has no impact on the model

Answer: D) Ignoring multicollinearity as it has no impact on the model


18. What is the formula for calculating the residuals in regression analysis?

A) Observed value - Predicted value

B) Predicted value - Observed value

C) (Observed value)^2 - (Predicted value)^2

D) (Predicted value)^2 - (Observed value)^2

Answer: A) Observed value - Predicted value
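
Illustration: each residual is the observed value minus the value the model predicts for the same observation. The numbers below are hypothetical.

```python
def residuals(observed, predicted):
    """Residual for each observation: observed minus predicted."""
    return [y - y_hat for y, y_hat in zip(observed, predicted)]


# Hypothetical observed values and model predictions.
observed = [3.0, 5.0, 8.0]
predicted = [2.5, 5.5, 7.0]

res = residuals(observed, predicted)
print(res)  # a positive residual means the model under-predicted
```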


19. In binary logistic regression, what type of variable is the dependent variable?

A) Continuous

B) Categorical

C) Dichotomous

D) Ordinal

Answer: C) Dichotomous
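
Illustration: a binary logistic model passes a linear combination of the inputs through the sigmoid function to obtain a probability between 0 and 1, which is then thresholded into one of two classes. The coefficients and the 0.5 threshold below are assumptions chosen for illustration, not fitted values.

```python
import math


def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1 / (1 + math.exp(-z))


def predict_probability(x, intercept, slope):
    """Probability of class 1 under a one-variable logistic model."""
    return sigmoid(intercept + slope * x)


# Hypothetical coefficients, not estimated from any data.
intercept, slope = -4.0, 2.0

probabilities = [predict_probability(x, intercept, slope) for x in (1.0, 2.0, 3.0)]
# Threshold at 0.5 to obtain the dichotomous outcome.
labels = [1 if p >= 0.5 else 0 for p in probabilities]
print(probabilities, labels)
```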


20. What is the purpose of the Akaike Information Criterion (AIC) in regression analysis?

A) To compare the goodness of fit of different models

B) To test for normality of the residuals

C) To assess multicollinearity

D) To calculate the p-value of the regression model

Answer: A) To compare the goodness of fit of different models
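
Illustration: AIC is defined as 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood, so it rewards fit but penalizes complexity; lower is better. For least-squares models with Gaussian errors it can be written, up to an additive constant, as n ln(SSE / n) + 2k. The sketch below assumes that simplified form; the two model summaries are hypothetical.

```python
import math


def aic_least_squares(sse, n_observations, n_parameters):
    """AIC for a Gaussian least-squares fit, dropping additive constants."""
    return n_observations * math.log(sse / n_observations) + 2 * n_parameters


# Hypothetical fits to the same 50 observations.
aic_small = aic_least_squares(sse=120.0, n_observations=50, n_parameters=2)
aic_large = aic_least_squares(sse=118.0, n_observations=50, n_parameters=5)

# The larger model fits slightly better but pays a bigger complexity penalty.
preferred = "small" if aic_small < aic_large else "large"
print(aic_small, aic_large, preferred)
```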


21. Which of the following statements is true about Ridge Regression?

A) It adds a penalty term to the loss function to prevent overfitting

B) It is used for classification problems

C) It is a non-parametric regression technique

D) It is only applicable to simple linear regression

Answer: A) It adds a penalty term to the loss function to prevent overfitting
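
Illustration: ridge regression minimizes the sum of squared residuals plus lambda times the sum of squared coefficients. With a single centered independent variable and an unpenalized intercept, the minimizer has the closed form slope = Sxy / (Sxx + lambda). The sketch below assumes that one-variable case; the data and the penalty value are hypothetical.

```python
def ridge_slope(x, y, penalty):
    """Ridge slope for one predictor, with an unpenalized intercept."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # The penalty inflates the denominator and shrinks the slope toward zero.
    return sxy / (sxx + penalty)


# Hypothetical data on the line y = 2x + 1.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

ols_slope = ridge_slope(x, y, penalty=0.0)     # no penalty: ordinary least squares
shrunk_slope = ridge_slope(x, y, penalty=5.0)  # penalty pulls the slope toward 0
print(ols_slope, shrunk_slope)
```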


22. What does the term "heteroscedasticity" refer to in regression analysis?

A) Unequal variances of the residuals across different levels of the independent variable

B) Equal variances of the residuals across different levels of the independent variable

C) A linear relationship between the dependent and independent variables

D) The presence of outliers in the data

Answer: A) Unequal variances of the residuals across different levels of the independent variable
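
Illustration: a rough, informal way to spot heteroscedasticity is to order the residuals by the independent variable and compare their spread in the lower and upper halves. This is only a sketch of the idea, not a formal test, and the residuals below are made up.

```python
def mean_squared(values):
    """Average of the squared values."""
    return sum(v ** 2 for v in values) / len(values)


# Hypothetical residuals, already sorted by increasing x.
residuals_sorted = [0.1, -0.1, 0.2, -0.2, 1.0, -1.2, 1.5, -1.4]

half = len(residuals_sorted) // 2
spread_low = mean_squared(residuals_sorted[:half])
spread_high = mean_squared(residuals_sorted[half:])

# A ratio far from 1 hints that the residual variance changes with x.
spread_ratio = spread_high / spread_low
print(spread_ratio)
```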


23. Which of the following is an example of a non-parametric regression technique?

A) Polynomial Regression

B) Support Vector Regression (SVR)

C) K-Nearest Neighbors (KNN) Regression

D) Ridge Regression

Answer: C) K-Nearest Neighbors (KNN) Regression
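
Illustration: KNN regression assumes no fixed functional form; it predicts by averaging the target values of the k training points closest to the query. A minimal one-dimensional sketch with hypothetical data:

```python
def knn_predict(x_train, y_train, query, k):
    """Average the targets of the k training points nearest to `query`."""
    # Indices of training points, ordered by distance to the query.
    order = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - query))
    nearest = order[:k]
    return sum(y_train[i] for i in nearest) / k


# Hypothetical training data.
x_train = [1.0, 2.0, 3.0, 10.0]
y_train = [1.0, 2.0, 3.0, 10.0]

# The two nearest neighbours of 2.2 are x = 2.0 and x = 3.0.
prediction = knn_predict(x_train, y_train, query=2.2, k=2)
print(prediction)
```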


24. What is the purpose of cross-validation in regression analysis?

A) To assess the performance of the regression model on unseen data

B) To calculate the p-value of the regression model

C) To test for multicollinearity in the independent variables

D) To identify influential observations in the data

Answer: A) To assess the performance of the regression model on unseen data
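
Illustration: in k-fold cross-validation the data are split into k folds; each fold is held out once while the model is fitted on the rest, and the held-out errors are averaged. The sketch below assumes a simple linear model, a plain round-robin fold assignment, and made-up data with alternating noise.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for one independent variable."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def k_fold_mse(x, y, k):
    """Mean squared error on each held-out fold (observation i goes to fold i % k)."""
    fold_errors = []
    for fold in range(k):
        train_idx = [i for i in range(len(x)) if i % k != fold]
        test_idx = [i for i in range(len(x)) if i % k == fold]
        intercept, slope = fit_line(
            [x[i] for i in train_idx], [y[i] for i in train_idx]
        )
        squared = [(y[i] - (intercept + slope * x[i])) ** 2 for i in test_idx]
        fold_errors.append(sum(squared) / len(squared))
    return fold_errors


# Hypothetical data: y = 2x + 1 with alternating noise of +0.5 and -0.5.
x = [float(i) for i in range(10)]
y = [2.0 * xi + 1.0 + (0.5 if i % 2 == 0 else -0.5) for i, xi in enumerate(x)]

fold_errors = k_fold_mse(x, y, k=5)
mean_mse = sum(fold_errors) / len(fold_errors)  # estimate of error on unseen data
print(fold_errors, mean_mse)
```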


25. What is the primary goal of feature selection in regression analysis?

A) To identify the most important independent variables for the model

B) To increase the complexity of the regression model

C) To introduce multicollinearity

D) To reduce the sample size

Answer: A) To identify the most important independent variables for the model


26. Which of the following regression techniques is specifically designed for dealing with time series data?

A) Time Series Regression

B) ARIMA (AutoRegressive Integrated Moving Average)

C) Lasso Regression

D) Principal Component Regression (PCR)

Answer: B) ARIMA (AutoRegressive Integrated Moving Average)


27. What is the purpose of the Cook's Distance statistic in regression analysis?

A) To identify influential observations

B) To test for normality of the residuals

C) To assess multicollinearity

D) To test for heteroscedasticity

Answer: A) To identify influential observations
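
Illustration: Cook's distance combines the size of an observation's residual with its leverage. For simple linear regression with p = 2 estimated parameters, a standard form is D_i = (e_i^2 / (p s^2)) * h_i / (1 - h_i)^2, where h_i = 1/n + (x_i - mean(x))^2 / Sxx and s^2 = SSE / (n - p). The sketch below assumes that form; the data are hypothetical, with the last point placed far from the others and off the trend.

```python
def cooks_distances(x, y):
    """Cook's distance of each observation in a simple linear regression."""
    n = len(x)
    p = 2  # estimated parameters: intercept and slope
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in resid) / (n - p)  # residual variance estimate
    leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    return [
        (e ** 2 / (p * s2)) * h / (1 - h) ** 2 for e, h in zip(resid, leverage)
    ]


# Hypothetical data: the last point has an extreme x and breaks the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 5.0]

distances = cooks_distances(x, y)
most_influential = distances.index(max(distances))
print(distances, most_influential)
```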


28. Which of the following statements is true about polynomial regression?

A) It models a linear relationship between the variables

B) It can capture nonlinear relationships between the variables

C) It can only be applied to simple linear regression

D) It is not affected by outliers in the data

Answer: B) It can capture nonlinear relationships between the variables
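
Illustration: polynomial regression stays linear in its coefficients but adds powers of x as extra inputs, which lets it follow a curve. The sketch below uses a deliberately reduced case, regressing y on x squared alone, and compares it with a straight-line fit; the data are made up to follow y = x^2 + 1 exactly.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for one independent variable."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data following y = x^2 + 1.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [xi ** 2 + 1.0 for xi in x]

# A straight line in x misses the symmetric curve entirely.
line_intercept, line_slope = fit_line(x, y)

# Using x^2 as the input recovers the curve with the same least-squares fit.
x_squared = [xi ** 2 for xi in x]
quad_intercept, quad_coefficient = fit_line(x_squared, y)
print(line_slope, quad_intercept, quad_coefficient)
```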


29. In regression analysis, what is the purpose of transforming variables (e.g., taking the logarithm)?

A) To make the data more interpretable

B) To improve the visual representation of the data

C) To meet the assumptions of the regression model

D) To increase the complexity of the model

Answer: C) To meet the assumptions of the regression model
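
Illustration: if y grows exponentially with x, as in y = a * exp(b * x), then ln(y) is linear in x, so taking the logarithm restores the linearity assumption. The sketch below assumes that relationship with hypothetical values a = 2 and b = 0.5 and no noise.

```python
import math


def fit_line(x, y):
    """Least-squares intercept and slope for one independent variable."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical exponential data: y = 2 * exp(0.5 * x).
x = [0.0, 1.0, 2.0, 3.0]
y = [2.0 * math.exp(0.5 * xi) for xi in x]

# Regress ln(y) on x: the slope estimates b and the intercept estimates ln(a).
log_y = [math.log(v) for v in y]
log_intercept, b_estimate = fit_line(x, log_y)
a_estimate = math.exp(log_intercept)
print(a_estimate, b_estimate)
```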


30. What is the primary drawback of using stepwise regression for variable selection?

A) It may select variables that are not actually relevant

B) It is computationally intensive

C) It cannot handle categorical variables

D) It is prone to overfitting

Answer: A) It may select variables that are not actually relevant