Please use this identifier to cite or link to this item:
http://hdl.handle.net/10884/1443
Title: | On the Choice of Linear Regression Algorithms for Biological and Ecological Applications |
Authors: | Vieira, Vasco; Creed, Joel; Scrosati, Ricardo A.; Santos, Anabela; Dutschke, Georg; Leitão, Francisco; Engelen, Aschwin H.; Huanel, Oscar R.; Guillemin, Marie-Laure; Mateus, Marcus; Neves, Ramiro |
Keywords: | Model II regression; Principal Components Analysis; Reduced Major Axis |
Issue Date: | 2016 |
Publisher: | Annual Research & Review in Biology |
Citation: | Vieira, V.; Creed, J.; Scrosati, R. A.; Santos, A.; Dutschke, G. et al. (2016). On the Choice of Linear Regression Algorithms for Biological and Ecological Applications. Annual Research & Review in Biology, Vol. 10(3), 1-9. |
Abstract: | Model II regression (i.e., minimizing residuals obliquely) is the appropriate alternative to Model I regression by Ordinary Least Squares (i.e., minimizing residuals vertically) when dependence relationships are not well established or when x is measured with error. Yet it has no perfect solution. Determining the true slope from errors-in-variables models requires the errors in x and y to be estimated from higher-order moments. However, estimating these accurately requires enormous data sets, so such models are not applicable to most ecological problems. The alternative Reduced Major Axis (RMA) depends on a strict set of assumptions that real data hardly ever meet, making it prone to bias, whereas Principal Components Analysis (PCA) becomes less reliable when correlations decrease while x and y have similar variances. We used artificial data (allowing for the determination of the true slope) to demonstrate when RMA or PCA should be preferred. Consequently, we propose using PCA whenever $r^2 + s_x^2/s_y^2$ is higher than 1.5. Otherwise, we suggest generating artificial data manipulated to match the structure of the original, and testing which method provides estimates closer to the input true slope. We provide a user-friendly script to perform this task. We tested the use of RMA and PCA with real data on intraspecific and interspecific biomass-density relations in algae and seagrass, algae frond growth, crustacean and bird morphometry, sardine fisheries, and social sciences, commonly finding widely divergent slope estimates that lead to severely biased parameter estimates and model applications. These analyses support the approach to method selection summarized above. |
URI: | http://hdl.handle.net/10884/1443 |
Appears in Collections: | A CE/MKT - Artigos |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
document.pdf | | 327.6 kB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
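
The selection rule stated in the abstract can be illustrated with a minimal Python sketch. This is not the authors' script. The function names `regression_slopes` and `suggest_method` are hypothetical, the slope formulas are the standard textbook expressions for OLS, RMA and the major axis (first principal component), and the criterion is read as r^2 + s_x^2/s_y^2, which is an assumption about how the abstract's expression should be parsed.

```python
import numpy as np


def regression_slopes(x, y):
    """Return OLS, RMA and PCA (major axis) slopes plus the selection criterion.

    Assumes x and y are 1-D samples of equal length with non-zero covariance.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    cov = np.cov(x, y)          # sample covariance matrix (ddof=1 by default)
    sxx, syy, sxy = cov[0, 0], cov[1, 1], cov[0, 1]
    r = sxy / np.sqrt(sxx * syy)

    ols = sxy / sxx                         # Model I: vertical residuals
    rma = np.sign(r) * np.sqrt(syy / sxx)   # Reduced Major Axis
    # Major axis: slope of the first principal component of the covariance matrix
    pca = (syy - sxx + np.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)

    # Assumed reading of the abstract's criterion: r^2 + s_x^2 / s_y^2
    criterion = r ** 2 + sxx / syy
    return {"ols": ols, "rma": rma, "pca": pca, "criterion": criterion}


def suggest_method(x, y, threshold=1.5):
    """Suggest 'PCA' when the criterion exceeds the threshold, else 'simulate'.

    'simulate' stands for the abstract's fallback: generate artificial data
    matching the structure of the original and compare RMA against PCA.
    """
    return "PCA" if regression_slopes(x, y)["criterion"] > threshold else "simulate"
```

For example, `suggest_method(x, y)` returns `"PCA"` when the variance of x is large relative to that of y; otherwise it points to the simulation-based comparison, which this sketch does not implement.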