What’s in a Coefficient? The “Not so Simple” Interpretation of R², for Relatively Small Sample Sizes
Abstract
There are several misconceptions about how to interpret the values of the coefficient of determination, R², in simple linear regression. R² depends heavily on the sample size n and on the type of data being analyzed, but becomes insignificant when working with very large sample sizes. In this paper, we comment on these observations and develop a relationship between R², n, and the level of significance α for relatively small sample sizes. In addition, this paper provides a simplified version of the relationship between R² and n, obtained by comparing the standard deviation of the dependent variable, S_y, to the standard error of the estimate, S_e. This relationship serves as a safe lower bound on the values of R². Computational experiments are performed to confirm the results from both models. Even though the focus of the paper is on simple linear regression, we present the groundwork for expanding our two models to the multiple regression case.
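The abstract does not state the two models explicitly, so the following is only a minimal sketch built on a standard textbook identity, not necessarily the authors' derivation. In simple linear regression, with S_y² = SST/(n − 1) and S_e² = SSE/(n − 2), it follows that R² = 1 − ((n − 2)/(n − 1))·(S_e/S_y)², so S_e < S_y exactly when R² > 1/(n − 1). The function names and the sample data below are hypothetical and chosen purely for illustration.

```python
import math


def simple_regression_summary(x, y):
    """Fit y = a + b*x by least squares and return n, R^2, S_y and S_e."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples with n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)  # total sum of squares
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return {
        "n": n,
        "r2": 1 - sse / sst,
        "s_y": math.sqrt(sst / (n - 1)),  # standard deviation of y
        "s_e": math.sqrt(sse / (n - 2)),  # standard error of the estimate
    }


def r2_from_spread(n, s_y, s_e):
    """Recover R^2 from n, S_y and S_e using the textbook identity."""
    return 1 - (n - 2) / (n - 1) * (s_e / s_y) ** 2


def r2_threshold(n):
    """R^2 value at which S_e equals S_y; S_e < S_y requires R^2 above it."""
    return 1 / (n - 1)


# Made-up illustrative data (an assumption, not taken from the paper).
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

summary = simple_regression_summary(x, y)
print("R^2 from the fit:       ", summary["r2"])
print("R^2 from S_y and S_e:   ", r2_from_spread(summary["n"], summary["s_y"], summary["s_e"]))
print("Threshold 1/(n - 1):    ", r2_threshold(summary["n"]))
```

The threshold 1/(n − 1) shrinks as n grows, which is one way to see why a fixed R² value carries different weight at different sample sizes. Whether this threshold coincides with the "safe lower bound" developed in the paper cannot be confirmed from the abstract alone.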
DOI: https://doi.org/10.11114/jets.v7i12.4492
This work is licensed under a Creative Commons Attribution 4.0 International License.
Journal of Education and Training Studies ISSN 2324-805X (Print) ISSN 2324-8068 (Online)
Copyright © Redfame Publishing Inc.