Why I am in favour of logging

Posted by Chris Lloyd, 25th March 2010, 06:21 PM

A colleague recently brought me some alternative fits he had done for a paper he was writing. The alternative fits looked very strange but had been strongly suggested by a referee. My colleague was fitting a regression model to inter-country trade data and trying to explain patterns in terms of various measures of cultural fit. The referee was pointing to some papers in econometrics that had argued about the relative merits of multiplicative regression models fitted on the direct scale, rather than on the log-scale. The referee wanted a direct fit on the basis that the random errors may be more normal and additive on the direct scale.

One of the papers he was pointing to is HERE, which contains the unequivocal recommendation:
"Overall, except under very special circumstances, estimation based on the log-linear model cannot be recommended."

Sounds like complete bollocks to me. I do not recall ever having a real econometric data set where regression on the log-scale was worse than regression on the direct scale. I have had some data sets where it did not seem to matter, typically when the mean response was large and the variation of the noise was small.

Why is the log-transform better? Let me count the ways.

Leverage effects can be huge on the direct scale. This was a case in point with my colleague’s data. He actually had data over about 35 years and was getting crazy results from around 1990. It was the China effect dragging all the other estimates all over the place.
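To make the leverage point concrete, here is a minimal sketch using made-up numbers (not the trade data described above). It assumes a straight-line fit with an intercept, for which the leverage of observation i is 1/n + (x_i - mean)^2 / Sxx, and compares the leverage of one giant value on the direct and log scales. The `size` list and the `leverages` helper are purely illustrative.

```python
import math

def leverages(x):
    """Hat values for a straight-line fit with intercept:
    h_i = 1/n + (x_i - mean)^2 / Sxx."""
    n = len(x)
    mean = sum(x) / n
    sxx = sum((xi - mean) ** 2 for xi in x)
    return [1 / n + (xi - mean) ** 2 / sxx for xi in x]

# Hypothetical "economic size" values: nine small economies and one giant.
size = [1, 2, 3, 5, 8, 13, 21, 34, 55, 5000]

h_direct = leverages(size)
h_log = leverages([math.log(s) for s in size])

print(f"leverage of the giant, direct scale: {h_direct[-1]:.3f}")
print(f"leverage of the giant, log scale:    {h_log[-1]:.3f}")
```

Leverage cannot exceed 1, and on the direct scale the giant sits essentially at that ceiling, so the fitted line is forced through it. On the log scale the same point is still influential but no longer dictates the fit by itself.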

Collinearity, which is pretty closely related to the leverage effect, is also much higher on the direct scale. The model had several surrogates for economic scale and these were almost 100% correlated on the direct scale – but only about 90% on the log-scale. Again, because of China and the US.
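The same effect can be sketched for collinearity. The two series below are invented stand-ins for a pair of size surrogates (called `gdp` and `pop` here purely for illustration), with one very large country at the end. The small countries are only loosely related, yet the direct-scale correlation is driven almost entirely by the giant.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical surrogates for economic scale; the last entry is the giant.
gdp = [1, 2, 4, 8, 16, 1000]
pop = [3, 1, 6, 2, 9, 1300]

r_direct = pearson(gdp, pop)
r_log = pearson([math.log(v) for v in gdp], [math.log(v) for v in pop])

print(f"correlation on the direct scale: {r_direct:.4f}")
print(f"correlation on the log scale:    {r_log:.4f}")
```

With these numbers the direct-scale correlation is within a whisker of 1, while the log-scale correlation is noticeably lower, which leaves the regression some room to separate the two effects.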

Fitting is so much easier on the log-scale, where the model is linear. The direct model is non-linear, and the leverage and collinearity problems kick in at the same time. Anyone with experience of non-linear models with near collinearity will know what a bad thing this is.
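As a sketch of why the log-scale fit is easy: a multiplicative model y = a * x^b becomes log y = log a + b log x, which ordinary least squares solves in closed form with no iteration and no starting values. The model, the numbers and the `ols_line` helper below are assumptions chosen for illustration, and the data are noise-free so the answer is known in advance.

```python
import math

def ols_line(x, y):
    """Closed-form least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical noise-free multiplicative model: y = 3 * x^2.
x = [1, 2, 4, 8, 16]
y = [3 * xi ** 2 for xi in x]

# Taking logs turns the model into a straight line in log x.
log_a, b = ols_line([math.log(v) for v in x], [math.log(v) for v in y])
a = math.exp(log_a)

print(f"recovered a = {a:.6f}, b = {b:.6f}")
```

Fitting the same model on the direct scale would need an iterative non-linear least squares routine, with all the usual worries about starting values and convergence.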

The main reason the referee put forward for the direct model was that the errors would be more normal on that scale. This is an empirical matter and, in my experience, residuals usually look more normal on the log-scale. More importantly, we all know that normality of residuals actually does not matter much at all (for moderately sized data sets), yet the misconception persists even within highly quantitative disciplines like econometrics. Especially for regression (or two-sample tests) there is a cancelling out of the first order skewness term that makes even highly skewed errors have a limited effect on the fit.
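The robustness claim can be checked with a small simulation. This is only a sketch under stated assumptions: a straight-line model, 50 evenly spaced x values, errors drawn from a centred exponential distribution (strongly skewed), and a nominal 95% t-interval for the slope. The seed, sample size and number of replications are arbitrary choices.

```python
import math
import random

def slope_ci_covers(x, true_slope, rng, t_crit):
    """Simulate one data set with skewed errors and report whether the
    nominal 95% interval for the slope contains the true slope."""
    # Exponential errors with mean 1, shifted to have mean 0.
    y = [1.0 + true_slope * xi + (rng.expovariate(1.0) - 1.0) for xi in x]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return abs(slope - true_slope) <= t_crit * se

rng = random.Random(2010)          # fixed seed so the run is repeatable
x = [i / 10 for i in range(50)]    # evenly spaced design, n = 50
T_CRIT = 2.01                      # approx. 97.5% point of Student's t, 48 df
reps = 2000

coverage = sum(slope_ci_covers(x, 0.5, rng, T_CRIT) for _ in range(reps)) / reps
print(f"empirical coverage of the nominal 95% interval: {coverage:.3f}")
```

If normality mattered a great deal, the empirical coverage would drift well away from 95%. In a set-up like this one it should stay close to the nominal level, although a very small sample or a badly unbalanced design could behave differently.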

The scale of random errors is surely proportional to the mean in practice for real economic data sets. Does anyone really believe that the intrinsic variability of the GDP of Kiribati is the same as that of China? Therefore, to do the direct non-linear model properly you would have to weight the fit with respect to variance. These variances would have to be estimated from the non-linear, almost collinear model. So you are iterating, and I can see this procedure never converging at all.
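A tiny sketch of the variance argument, assuming purely multiplicative errors (the `factors` list is a made-up set of proportional shocks, and the two means are arbitrary stand-ins for a small and a large economy). On the direct scale the spread grows in proportion to the mean; on the log scale it is the same for both.

```python
import math
import statistics

# Hypothetical multiplicative shocks applied to the mean.
factors = [0.8, 0.9, 1.0, 1.1, 1.25]

def spread(mean):
    """Standard deviation of mean * factor on the direct and log scales."""
    values = [mean * f for f in factors]
    sd_direct = statistics.pstdev(values)
    sd_log = statistics.pstdev([math.log(v) for v in values])
    return sd_direct, sd_log

small = spread(10.0)      # a small economy
large = spread(10000.0)   # an economy 1000 times bigger

print(f"direct-scale SD: small = {small[0]:.3f}, large = {large[0]:.3f}")
print(f"log-scale SD:    small = {small[1]:.6f}, large = {large[1]:.6f}")
```

Under this assumption the log transform stabilises the variance automatically, so an unweighted fit is appropriate, whereas the direct-scale fit would need weights that depend on the very means being estimated.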

So there you have it. When I get data in an Excel spreadsheet with $-signs anywhere I just log the hell out of everything and only then start snooping around. I would be interested if anyone can tell of experiences where direct modeling was actually successful.

All times are GMT +11.

© The Business Intelligence Group
