
Redefining r-squared



Posted by Chris Lloyd (Member), 24th June 2010, 01:27 PM

Few statistics are more oft-quoted by empirical researchers than r-squared. While I applaud the value of an intuitive interpretation in principle, it is pretty clear that the usual interpretation ("the proportion of variation explained") is wrong. Apart from honesty, the main reason I care about this is that it gets me into trouble with (the more discerning) students.



Not for the first time, a student recently came back to me with a query. I had given him some data, and the task was to draw some kind of causality diagram using correlations, partial correlations and common sense (for the causality). The class had just covered r-squared, so the idea was to put these values on the arrows.

The student was interested in checking the interpretation of r-squared. So he broke the y-variable down into groups of equal x-value (x was discrete). He looked at the standard deviation of Y within each group (using PivotTables), compared these with the overall standard deviation, and found that the within-group standard deviation was, on average, about 40% of the overall. So, on that scale, 60% is explained by X. Yet the correlation was about 0.9, and we say 81% is explained.
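The student's check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy data are made up (they are not the data from the class), the helper names `pearson_r` and `within_group_sd_ratio` are hypothetical, and the within-group spread is summarised as a pooled standard deviation rather than a plain average of the group standard deviations, which is a simplification of what he did in the pivot table.

```python
import math
from collections import defaultdict
from statistics import fmean, pstdev, pvariance


def pearson_r(xs, ys):
    """Ordinary Pearson correlation between two equal-length sequences."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def within_group_sd_ratio(xs, ys):
    """Pooled within-group SD of y (grouped by x) divided by the overall SD of y."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    n = len(ys)
    # Size-weighted average of the within-group (population) variances.
    pooled_var = sum(len(g) * pvariance(g) for g in groups.values()) / n
    return math.sqrt(pooled_var) / pstdev(ys)


# Made-up toy data: discrete x in {0, 1, 2, 3}, y = 2x plus or minus 1.
xs = [x for x in range(4) for _ in range(2)]
ys = [2 * x + e for x in range(4) for e in (-1, 1)]

r = pearson_r(xs, ys)
ratio = within_group_sd_ratio(xs, ys)

print(f"r = {r:.3f}, r-squared = {r * r:.3f}")
print(f"within-group SD / overall SD = {ratio:.3f}")
print(f"share of SD 'explained' = {1 - ratio:.3f}")
```

On this toy data r-squared comes out at about 0.83, while the within-group standard deviation is still about 41% of the overall, which is the same kind of gap the student found.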

I had to tell him that the common interpretation of r-squared is wrong but ubiquitous and that I had hoped he wouldn’t notice!

The problem, of course, is that we can explain 81% of the variance. But variance does not measure variability (or anything sensible?). Standard deviation does. This being the case, it seems that we should redefine variation explained as

$$1 - \sqrt{1 - r^2}$$

which is always smaller than r-squared itself, often by a lot. Not that I am game to try! Maybe if we collectively came up with a better name we could get away with it. One possibility would be to incorporate this adjustment into the adjusted r-squared. In other words, substitute the adjusted r-squared into the above formula and call this variation explained. The downside of this is that the incorrectness of the interpretation of ordinary r-squared would then stand out like the proverbials.
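To get a feel for how much smaller, here is a rough Python sketch of the proposed measure, one minus the square root of one minus r-squared, alongside the adjusted version. The function names `sd_explained` and `adjusted_r2` are hypothetical, the adjusted r-squared is the usual textbook form with n observations and p predictors, and n = 30, p = 1 are arbitrary values chosen only for illustration.

```python
import math


def sd_explained(r2):
    """Share of the standard deviation explained: 1 - sqrt(1 - r2)."""
    return 1 - math.sqrt(1 - r2)


def adjusted_r2(r2, n, p):
    """Textbook adjusted r-squared for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Compare the two scales for a range of correlations.
for r in (0.3, 0.5, 0.7, 0.9, 0.99):
    r2 = r * r
    print(f"r = {r:.2f}   r-squared = {r2:.3f}   SD explained = {sd_explained(r2):.3f}")

# Substituting the adjusted r-squared into the same formula
# (n = 30 and p = 1 are made-up values for illustration).
adj = adjusted_r2(0.81, n=30, p=1)
print(f"adjusted r-squared = {adj:.3f}   SD explained = {sd_explained(adj):.3f}")
```

With r = 0.9 the proposed measure is about 0.56 rather than 0.81, which lines up with the roughly 60% the student got from his pivot table.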




© The Business Intelligence Group
