การเปรียบเทียบการประมาณค่าสัมประสิทธิ์ในการถดถอยเชิงเส้นพหุคูณ เมื่อเกิดพหุสัมพันธ์ / อัชฌา อระวีพร = A comparison of coefficient estimation in multiple linear regression with multicollinearity / Autcha Araveeporn
The objective of this research is to compare methods for estimating multiple linear regression coefficients under multicollinearity: the Ordinary Least Squares method (OLS), ridge regression by Breiman's method (RID), and the Garrote Linear Regression method (GAR). The criterion of comparison is the ratio of the average values of the mean square errors. The study considers residuals drawn from a normal distribution with mean 1.0 and standard deviations of 0.05 and 0.15; a contaminated normal distribution with scale factors of 3 and 10, each with 5 and 10 percent contamination; a Weibull distribution with scale parameter 1 and shape parameters 1, 2 and 5; and a lognormal distribution with mean 0 and standard deviations of 0.22, 0.55 and 0.84. The sample sizes are 10, 30, 50 and 100. The levels of correlation among the independent variables are 0.1 and 0.3 (low), 0.5 (medium), 0.7 and 0.9 (high), and 0.99 (very high), with 3 and 5 independent variables. The data are obtained by Monte Carlo simulation with 500 repetitions for each case. The results of comparing the average mean square errors are as follows. At every level of correlation, the RID method generally gives the best results, except in the cases where the level of correlation is very high. When the level of correlation is very high and the residuals have a normal or contaminated normal distribution, the GAR method gives the best results with 3 independent variables (sample sizes = 30, 50) and with 5 independent variables and a standard deviation of 0.05 (sample sizes = 30, 50). The OLS method gives the best result with 5 independent variables and a standard deviation of 0.05 (sample size = 10).
When the level of correlation is very high and the residuals have a Weibull distribution, the GAR method gives the best result with 3 independent variables and a shape parameter of 5 (sample size = 50). The average mean square error varies with (in descending order): level of correlation, standard deviation, number of independent variables, scale factor, and percent contamination. The average mean square error varies inversely with sample size.
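The simulation design described in the abstract can be illustrated with a minimal Python sketch. It is not the author's code and covers only part of the study: it compares OLS with a plain ridge estimator under normally distributed residuals, and it omits the Garrote method and the other residual distributions. Several details are assumptions made for illustration only: the true coefficients are all set to 1, the predictors are equicorrelated standard normal variables, the ridge penalty `lam` is a fixed hypothetical value rather than one chosen by Breiman's procedure, and the mean square error is taken over the estimated slope coefficients.

```python
import numpy as np


def equicorrelated_design(n, p, rho, rng):
    """Draw n rows of p standard normal predictors with pairwise correlation rho."""
    cov = np.full((p, p), rho)
    np.fill_diagonal(cov, 1.0)
    return rng.multivariate_normal(np.zeros(p), cov, size=n)


def ols(X, y):
    """Least squares slopes for (centered) X and y."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def ridge(X, y, lam):
    """Ridge slopes for (centered) X and y with a fixed penalty lam."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def simulate(n=30, p=3, rho=0.99, sigma=0.15, lam=0.1, reps=500, seed=0):
    """Average coefficient MSE of OLS and ridge over `reps` Monte Carlo runs."""
    rng = np.random.default_rng(seed)
    beta = np.ones(p)  # assumption: true slopes, not given in the abstract
    total = {"ols": 0.0, "ridge": 0.0}
    for _ in range(reps):
        X = equicorrelated_design(n, p, rho, rng)
        # Residuals are normal with mean 1.0, as stated in the abstract.
        y = X @ beta + rng.normal(loc=1.0, scale=sigma, size=n)
        # Centering removes the intercept, which absorbs the residual mean.
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        total["ols"] += np.mean((ols(Xc, yc) - beta) ** 2)
        total["ridge"] += np.mean((ridge(Xc, yc, lam) - beta) ** 2)
    result = {name: value / reps for name, value in total.items()}
    # Assumed form of the comparison criterion: ridge AMSE relative to OLS AMSE.
    result["ratio"] = result["ridge"] / result["ols"]
    return result


if __name__ == "__main__":
    print(simulate())
```

Under these assumptions, a `ratio` below 1 would mean the ridge estimator had the smaller average mean square error in that configuration. Varying `n`, `p`, `rho` and `sigma` over the values listed in the abstract reproduces the layout of the study's design, though not necessarily its numerical results.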