Why I am in favour of logging
A colleague recently brought me some alternative fits he had done for a paper he was writing. The alternative fits looked very strange but had been strongly suggested by a referee. He was fitting a regression model to inter-country trade data and trying to explain patterns in terms of various measures of cultural fit. The referee was pointing to some papers in econometrics that had argued about the relative merits of multiplicative regression models fitted on the direct scale, rather than on the log-scale. The referee wanted a direct fit on the basis that the random errors may be more normal and additive on the direct scale. One of the papers he was pointing to is HERE, which contains the unequivocal recommendation:

"Overall, except under very special circumstances, estimation based on the log-linear model cannot be recommended."

Sounds like complete bollocks to me. I do not recall ever having a real econometric data set where regression on the log-scale was worse than the direct. I have had some data sets where it did not seem to matter, typically when the mean response was large and the variation of the noise was small. Why is the log-transform better? Let me count the ways.

Leverage effects can be huge on the direct scale. This was a case in point with my colleague's data. He actually had data over about 35 years and was getting crazy results from around 1990. It was the China effect dragging all the other estimates all over the place.

Collinearity, which is pretty closely related to the leverage effect, is also much higher on the direct scale. The model had several surrogates for economic scale and these were almost 100% correlated on the direct scale, but only about 90% on the log-scale. Again, because of China and the US.

Fitting algorithms are so much easier on the log-scale. It is a linear model. The direct model is non-linear. The leverage and collinearity also kick in at the same time.
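The leverage and collinearity points above can be checked on a toy data set. The sketch below is only an illustration: the figures are invented (ten small economies and two giants standing in for China and the US), `surrogate` is a made-up second measure of scale, and the helper names are my own. It computes the hat values for a straight-line fit and the correlation between the two surrogates, on both the direct and the log scale.

```python
import math

# Hypothetical economic-size figures in arbitrary units: ten small
# economies plus two giants. These numbers are invented for illustration.
size = [1, 2, 3, 5, 8, 12, 20, 30, 50, 80, 5000, 15000]

# A second, made-up surrogate for scale: the same figures multiplied by a
# factor that alternates between 0.2 and 5.
factor = [0.2, 5.0] * 6
surrogate = [s * f for s, f in zip(size, factor)]


def leverages(x):
    """Hat values h_i = 1/n + (x_i - mean)^2 / Sxx for a straight-line fit."""
    n = len(x)
    mean = sum(x) / n
    sxx = sum((v - mean) ** 2 for v in x)
    return [1 / n + (v - mean) ** 2 / sxx for v in x]


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


log_size = [math.log(v) for v in size]
log_surrogate = [math.log(v) for v in surrogate]

print("max leverage, direct scale:", round(max(leverages(size)), 3))
print("max leverage, log scale:   ", round(max(leverages(log_size)), 3))
print("correlation, direct scale: ", round(correlation(size, surrogate), 3))
print("correlation, log scale:    ", round(correlation(log_size, log_surrogate), 3))
```

For these made-up numbers both the largest hat value and the correlation between the two surrogates come out lower on the log scale, because the two giants no longer dominate every sum of squares. Real trade data will of course differ in the details.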
Anyone with experience of non-linear models with near-collinearity will know what a bad thing this is.

The main reason the referee put forward for the direct model was that the errors would be more normal on that scale. This is an empirical matter and, in my experience, residuals usually look more normal on the log-scale. More importantly, we all know that normality of residuals actually does not matter much at all (for moderately sized data sets), yet the misconception persists even within highly quantitative disciplines like econometrics. Especially for regression (or two-sample tests) there is a cancelling out of the first-order skewness term that makes even highly skewed errors have a limited effect on the fit.

The scale of random errors is surely proportional to the mean in practice for real economic data sets. Does anyone really believe that the intrinsic variability of the GDP of Kiribati is the same as that of China? Therefore, to do the direct non-linear model properly you would have to weight the fit with respect to variance. These variances would have to be estimated from the non-linear, almost collinear model. So you are iterating, and I can see this procedure never converging at all.

So there you have it. When I get data in an Excel spreadsheet with $-signs anywhere I just log the hell out of everything and only then start snooping around. I would be interested if anyone can tell of experiences where direct modeling was actually successful.
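The point about error scale can be illustrated with simulated data. This is a minimal sketch under an assumed multiplicative model, y = A * x^B * exp(noise), with constants, seed and helper names all invented for the example. It fits a straight line on the log scale by ordinary least squares, then compares the spread of the residuals for the smaller and larger halves of the data on each scale.

```python
import math
import random

# Assumed model: y = A * x**B * exp(noise), noise ~ Normal(0, SIGMA), so the
# spread of y is proportional to its mean. All constants are invented.
A, B, SIGMA, N = 2.0, 0.8, 0.3, 200
rng = random.Random(42)
x = [10 ** (4 * i / (N - 1)) for i in range(N)]  # 1 to 10,000, even in log
y = [A * xi ** B * math.exp(rng.gauss(0, SIGMA)) for xi in x]


def ols(u, v):
    """Intercept and slope of the least-squares line v = a + b*u."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    sxy = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    sxx = sum((p - mu) ** 2 for p in u)
    b = sxy / sxx
    return mv - b * mu, b


def sd(values):
    """Sample standard deviation."""
    m = sum(values) / len(values)
    return math.sqrt(sum((t - m) ** 2 for t in values) / (len(values) - 1))


log_x = [math.log(t) for t in x]
log_y = [math.log(t) for t in y]
a_hat, b_hat = ols(log_x, log_y)  # the log-scale fit is a plain linear model

# Residuals from the same fit, viewed on each scale.
log_resid = [ly - (a_hat + b_hat * lx) for lx, ly in zip(log_x, log_y)]
raw_resid = [yi - math.exp(a_hat + b_hat * lx) for lx, yi in zip(log_x, y)]

half = N // 2  # x is sorted, so this separates small from large values
log_ratio = sd(log_resid[half:]) / sd(log_resid[:half])
raw_ratio = sd(raw_resid[half:]) / sd(raw_resid[:half])

print("estimated exponent:", round(b_hat, 3), "(true value", B, ")")
print("residual sd ratio, large/small, log scale:   ", round(log_ratio, 2))
print("residual sd ratio, large/small, direct scale:", round(raw_ratio, 2))
```

Under this assumed model the residual spread is roughly constant on the log scale, while on the direct scale it is many times larger for the big values than for the small ones. That is the situation in which an unweighted direct fit would need variance weights, which in turn have to be estimated from the fit itself.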