[Home] [Contact] [Science] [Main MBH page]

This is a work in progress. If it survives comment over a week or whatever, I'll consider it reasonably robust. It was started on 2004/11/03.
2004/11/08: still WiP.
2005/03/24: now more of a work-not-in-progress :-( I would slightly emphasise the importance of autocorrelation in the random noise.

As this is a work in progress, you may like to see earlier versions. It's under RCS, so you can download index.phtml,v.

Current status:

  1. 2004/11/08: Add the random trends stuff. Hmmm. Also the 1400 stuff. Hmmm #2.
  2. 2004/11/04: no more tonight, only cosmetic changes. Someone sensible says I may have made a mistake, but I don't see what yet. So I downgrade the reliability of this slightly.

Open for comment: this page doesn't allow blogging and stuff, but I welcome comments (use the "contact" link). If you want to mail me something intended to be a public comment, then make that clear please: unless you say otherwise I will treat your comment as private. Public comments will be put up somewhere. I haven't had any yet.

William's page about MBH and M&M

If you don't know about the history of all this or who MBH are, then I suggest you read the history page, which should also contain some useful links.

But let's get on with it... oops, hold on, first: a disclaimer. Although I've now actually read all of MBH98, I wouldn't say I've fully digested it. I don't think I need to for the purposes of what's below, though.

M&M say that MBH is fatally flawed for any number of reasons - see Tim Lambert's handy guide for some of them. But the one I am looking at here is the one they lay out here (pdf) and perhaps here (warning: lots of extraneous stuff) and which was popularised by Muller.

My summary of their idea is that by doing standardisation over the 1902-1980 period (this is referred to as the "training period"), the MBH procedure has a built-in bias towards producing curves which have a trend over the last century. The idea is plausible (and M&M believe that they have proved it) but:

  1. Looking at the code, it doesn't seem to be true. You can find my post about this here and some stuff from Tim Lambert here. But, looking at code can be tricky (well, try it yourself and see...)
  2. I believe that I can demonstrate by running the code that M&M are wrong.
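
To make the standardisation idea concrete, here is a minimal sketch in Python (not the MBH Fortran, and not a transcription of it). It assumes that "standardising over the training period" means taking the mean and standard deviation from 1902-1980 only and then applying them to the whole series. The function name, the use of the population standard deviation and the toy series are all made up for illustration.

```python
import statistics

def standardise(values, years, train_start=1902, train_end=1980):
    """Centre and scale a whole series using training-period statistics only.

    Assumption: this mirrors the idea of training-period standardisation;
    it is not taken from the MBH code.
    """
    train = [v for v, y in zip(values, years) if train_start <= y <= train_end]
    mean = statistics.mean(train)
    sd = statistics.pstdev(train)  # population standard deviation (an assumption)
    return [(v - mean) / sd for v in values]

# Toy series (invented): flat until 1900, then rising by 0.01 per year.
years = list(range(1400, 1981))
series = [0.0 if y < 1900 else 0.01 * (y - 1900) for y in years]

short = standardise(series, years)                    # 1902-1980 training period
full = standardise(series, years, train_start=1400)   # whole series as training period
```

With the short training period the flat pre-1900 part of the toy series ends up well below zero, whereas with the full period the series as a whole is centred on zero. That off-centring is, roughly, the effect M&M's argument relies on.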

Updates: 2004/11/08

OK, I've corresponded with some people, but my opinions aren't strongly altered. But this is by no means nailed down yet.
  1. No one has pointed out any technical flaws in what I originally put up.
  2. McI says that the stuff below doesn't really count because I go back to 1000, rather than to 1400. The further back you go, the fewer series you retain, and McI thinks that the series retained at 1000 are so few (27) and all of the same type (bristlecones?) that the graph shape is predetermined. I'm not quite convinced of that, but I put it here so you know.
  3. Actually, I've done a bit more on that. If I start in 1400, and train from 1902-on (black) or 1702-on (red) I get the first PC shown here and the second here. So... what do these show?
    1. Well (looking at PC1) they do look different after 1900.
    2. But... there is no hint of the red starting to trend up after 1700, as the M&M theory says it should.
    3. Also, if you look at PC2, it rather looks like the red is taking up the bit missing from PC1... well, these are just handwavings, of course.
  4. I've now done some stuff with random series rather than the MBH proxy series. This has the advantage of allowing you to create as many proxies as you like. I'll hive that off to a separate page: here. What that appears to demonstrate is that M&M are right about one thing: the procedure often does lead to a "hockey stick" shape in random data. But the problem is that the variance explained by the PC1 done this way is tiny: the first eigenvalue is about 0.03. Whereas when you run it on real data the first eigenvalue is about 0.55 (back to 1000) or 0.38 (back to 1400). Which means the two problems are very different.
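
For anyone wanting to play with the random-series idea, here is a rough Python sketch under several assumptions. It is not the code behind the separate page. It assumes the random proxies are AR(1) "red noise" (the autocorrelation parameter `phi`, the number of proxies and the series length are invented), that each proxy is centred on the last 79 points only, and that the eigenvalues quoted above are fractions of total variance. The leading eigenvalue is found by plain power iteration rather than an SVD.

```python
import random

def ar1_series(n, phi, rng):
    """Autocorrelated noise: x[t] = phi * x[t-1] + white noise."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def first_eigen_fraction(columns, iterations=200):
    """Share of the total sum of squares carried by the leading eigenvector.

    `columns` is a list of equal-length series (one per proxy). The leading
    eigenvalue of the cross-product matrix is found by power iteration.
    """
    p = len(columns)
    # Cross-product matrix: c[i][j] = sum over time of x_i[t] * x_j[t]
    c = [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(p)]
         for i in range(p)]
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(c[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    cv = [sum(c[i][j] * v[j] for j in range(p)) for i in range(p)]
    leading = sum(a * b for a, b in zip(v, cv))   # Rayleigh quotient
    trace = sum(c[i][i] for i in range(p))        # sum of all eigenvalues
    return leading / trace

# Hypothetical set-up: 20 red-noise proxies, 581 "years", centred on the last 79.
rng = random.Random(0)
n_years, n_proxies, n_train = 581, 20, 79
proxies = []
for _ in range(n_proxies):
    s = ar1_series(n_years, 0.9, rng)
    m = sum(s[-n_train:]) / n_train   # mean over the training window only
    proxies.append([x - m for x in s])
frac = first_eigen_fraction(proxies)
```

The value of `frac` depends heavily on the invented parameters, so it should not be expected to match the 0.03 quoted above. The point is only to show how the "fraction explained by PC1" can be compared between random and real data.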

What I did (original stuff from here on)

M&M say that the results depend crucially on the "training period" used - 1902-1980. Fine. Let's just run the code with different periods: 1702-1980, 1302-1980 and 1002-1980 (tech note: this is done by changing the values of itrainmin0 and irawmin). And compare the results. We'll see that there isn't a lot of variation.

Now another caveat. As I understand it, MBH use the PCA/EOF (actually SVD) procedure to reduce a large number of obs down to their principal components. Then, from (some of?) these PC's, a series for that region is constructed. So what I'm going to do is look at the PC's that come out of the program. Since PC1 looks like a 20C warming, and PC2 (and on) look like "noise" (or at least, trendless) I'm going to assume that we are mainly interested in PC1 and that the shape of PC1 is most of the issue.

So. Let's look at the shape of PC1 (ie, contents of pc01.out), with different training periods. That's the graph there on the left. The colours are:

  1. Black: training period 1902-1980
  2. Red: training period 1702-1980
  3. Blue: training period 1302-1980 (hard to see, because the green mostly overlies it)
  4. Green: training period 1002-1980
The vertical bars show when the training periods start. The curves are plotted with a crude 21-width smoothing (and it is crude, because within 10 points of the edges the smoothing reduces, which is why it's spiky at the ends...).
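
The plots themselves were made in IDL (mbh3.pro), but the smoothing can be sketched in Python. This is one reading of "the smoothing reduces": a centred running mean of width 21 whose window shrinks symmetrically within 10 points of either end, so that the end points are not smoothed at all. That reading is an assumption, as is the function name.

```python
def crude_smooth(values, width=21):
    """Centred running mean; the window shrinks symmetrically near the ends."""
    half_max = width // 2
    n = len(values)
    out = []
    for i in range(n):
        # Use as many points either side as are available, up to half_max.
        half = min(half_max, i, n - 1 - i)
        window = values[i - half:i + half + 1]
        out.append(sum(window) / len(window))
    return out
```

Because the window is down to a single point at each end, the first and last values come through unsmoothed, which would account for the spikiness at the ends.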

The Green period effectively uses all the data for the training period, except the first 2 points (I did that just in case the code fails with null points outside the training period, but I didn't test that). The black is the MBH standard. Red and Blue are intermediate. I sound a slight note of caution here: I find the curves almost too similar... I expected more differences... but you can check the code in the "files" section at the bottom if you want to.

So, assuming that I've done the coding right, we see that *all* training periods produce the familiar temperature curve (err, of course, it's upside down: I think that is because the PC's are indeterminate wrt sign; and of course this is just one region (North America?) so is allowed to be different from the full curve).
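
On the sign point: if v is an eigenvector then so is -v, so a PC can be flipped without changing anything about the decomposition. When comparing PCs from different runs it can help to flip them all to the same orientation. The Python sketch below is purely illustrative and is not something the plotting code here does; `align_sign` and the choice of `reference` (for example, the PC1 from the standard run) are hypothetical.

```python
def align_sign(pc, reference):
    """Return `pc`, negated if it is negatively correlated with `reference`."""
    n = len(pc)
    mean_pc = sum(pc) / n
    mean_ref = sum(reference) / n
    # Only the sign of the covariance matters here.
    cov = sum((a - mean_pc) * (b - mean_ref) for a, b in zip(pc, reference))
    return [-x for x in pc] if cov < 0 else list(pc)
```

For example, `align_sign([3, 2, 1], [1, 2, 3])` gives `[-3, -2, -1]`, while a series that already rises with the reference is returned unchanged.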

But the important point is that, although the curves differ, there is no sign at all of the red curve starting to show trends at 1702; or the blue at 1302; or the green from the beginning. And trends starting at those points are what you would expect from M&M's argument.

Now, look at the pic on the right, which is PC2. This is mostly trendless and looks more like "noise". Again, the shapes of all the curves are very similar. Actually there are hints of a 20C trend, especially in the green/blue and rather less in the red version. Possibly some leakage from PC1?

You'll also notice that the PC2's are of different magnitudes. Well, yes. This exposes a gap in my knowledge. The PC1's above have been plotted scaled by the first eigenvalue, which is 0.5522 for std; 0.2867 for 1702-; 0.2317 for 1302-; 0.2357 for 1002- (oh, and they have been shifted to zero mean for plotting too). The second have been plotted scaled by the second eigenvalue, naturally enough. These are: 0.0796; 0.1228; 0.1208 and 0.0753. I expected them to overlie, but they don't. They overlie better if they *aren't* scaled. If anyone can explain this, please let me know. It's possible that we care nothing for the absolute values but only the shape; in which case it doesn't matter.

Err... well there you have it. If I'm right, M&M are wrong, at least for this part of their argument.

So what is M&M's mistake?

M&M think that they are right, which is unsurprising. To be fully convincing, one would have to find the error in their work. They claim to have done extensive Monte Carlo simulations of blah wibble (thanks to Ian R). But they haven't put up their code, so we can't go through it (and remember folks, McK confused degrees and radians before! (Thanks Tim L)). And anyway who has the patience?

Update: they *have* put up their code (thanks McI). See www.climate2003.com/data/MM04c/script2.txt and also some new additions to climate2003.


Files

These are the files I'm using. You'll need more to run the code yourself, and you can find them at Mann's ftp site.
  1. mann-pca-noamer.f (should be pretty well equal to MBH's)
  2. mann-pca-noamer-300.f (~300 year training period, from 1702-)
  3. mann-pca-noamer-700.f (~700)
  4. mann-pca-noamer-1000.f (~1000, ie all)
  5. mbh-files.tar.gz - the MBH proxy records, all in one handy tar file (well, it's handy if you have tar... and gzip...)
  6. mbh3.pro - IDL program to read in and plot output.

[Page last modified: 24/3/2005]