Data in Bondage



Guest Post by Willis Eschenbach

In a recent post here on WUWT, someone yclept “Javier” has written about the Bond Rafted Ice data. In a comment, he said:

This is a frequency spectrum from Bond data. It shows the 980-year Eddy cycle, and the 2400-year Bray cycle I have written so much about.

Bond Debret graph.png

Well, that looks like it establishes the existence of the 960-year cycle in the Bond data beyond any doubt …

Intrigued by that, and knowing next to nothing about the Bond data or the “980-year Eddy cycle” referred to by Javier, I of course had to go take a look at it. Javier provided a link to the paper by Debret et al. he was discussing, entitled “The origin of the 1500-year climate cycles in Holocene North-Atlantic records”.

The abstract of the Debret et al. paper says:

Since the first suggestion of 1500-year cycles in the advance and retreat of glaciers (Denton and Karlen, 1973), many studies have uncovered evidence of repeated climate oscillations of 2500, 1500, and 1000 years. During last glacial period, natural climate cycles of 1500 years appear to be persistent (Bond and Lotti, 1995) and remarkably regular (Mayewski et al., 1997; Rahmstorf, 2003).

Hmmm … however, I was interested in the data itself, so I dug around and found out that it’s available here.

I found out that what the paper analyzed was called the “stacked ocean record”. What does that mean? Well, they combined a record of hematite grains with two records of Icelandic glass from different drill cores and one record of detrital carbonate … add them together, divide by four, and presto! A “stacked” record.
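To make the stacking step concrete, here is a minimal Python sketch. It is not the code used for this post (that analysis was done in R), and it does not use the Bond data. It assumes two made-up records, each carrying a strong 2,500-year cycle and a weaker 1,000-year cycle; the names `make_record`, `stack`, and `amplitude_at`, along with the sampling step and record length, are hypothetical choices for illustration. The 2,500-year cycles are deliberately put in opposite phase to show how much the result of a point-by-point average depends on phase.

```python
import math

STEP = 10.0   # years between samples (assumed for illustration)
N = 1000      # number of samples, i.e. a 10,000-year synthetic record


def make_record(phase_2500):
    """Made-up record: strong 2,500-year cycle plus a weaker 1,000-year cycle."""
    out = []
    for i in range(N):
        t = i * STEP
        strong = math.sin(2 * math.pi * t / 2500.0 + phase_2500)
        weak = 0.3 * math.sin(2 * math.pi * t / 1000.0)
        out.append(strong + weak)
    return out


def stack(records):
    """Stack records as described in the text: add them up, divide by the count."""
    return [sum(values) / len(values) for values in zip(*records)]


def amplitude_at(series, period, step=STEP):
    """Amplitude of the sinusoid with the given period (single-frequency DFT)."""
    re = im = 0.0
    for i, x in enumerate(series):
        angle = 2 * math.pi * i * step / period
        re += x * math.cos(angle)
        im += x * math.sin(angle)
    return 2 * math.hypot(re, im) / len(series)


record_a = make_record(0.0)
record_b = make_record(math.pi)   # same cycles, 2,500-year one in antiphase
stacked = stack([record_a, record_b])

for name, series in [("record_a", record_a),
                     ("record_b", record_b),
                     ("stacked", stacked)]:
    print(name,
          "2500-yr:", round(amplitude_at(series, 2500.0), 3),
          "1000-yr:", round(amplitude_at(series, 1000.0), 3))
```

In this toy case each input record is dominated by the 2,500-year cycle, yet in the stack that cycle drops to essentially zero while the in-phase 1,000-year cycle comes through at full strength. So the spectrum of a stack reflects the phase relations among the records, not just which periods each record contains.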

So I did a Complete Ensemble Empirical Mode Decomposition (CEEMD) analysis of the stacked record. I used the “CEEMD” function in the R package “hht” for the analyses. To begin with, I found what the authors found about the “Eddy cycle”. Their comment was:

… the application of a 1000-year filter to the composite series of IRD does not provide very conclusive correlation during 0–5000 years, …

Now, here is the periodogram of the CEEMD analysis:

CEEMD Bond Stacked Data Periodogram

This shows the strengths of the cycles in each of the empirical modes. It sure looks like there is a 960-year cycle in there in empirical mode C3, along with strong cycles at about 2,300 and 7,000 years, and a weaker cycle at around 1,300 years in length. And this is what we saw in the periodogram provided by Javier above.

But a look at the individual empirical modes gives a deeper understanding of the situation with the 960-year cycle. These empirical modes are the actual signals that when added together recreate the original raw data signal shown in the top panel below.

CEEMD Bond Stacked Data

As the authors found, the putative 960-year cycle in empirical mode C3 only has significant strength in the earliest 6,000 years of data. On the other hand, it weakens and nearly disappears in the most recent 5,000 years. As I’ve said many times, this kind of appearance and subsequent disappearance of “cycles” is quite common in natural datasets.

More to the point, however, upon learning that it was a “stacked” record, my further thought was “Wait a minute, whenever you add different records, you can get all kinds of artifacts from constructive and destructive interference”. So I did a CEEMD analysis of the four individual underlying Bond datasets.

Here are the pairs of CEEMD graphs for each of the four individual datasets that make up the “stacked” dataset. In each case, as in the “stacked” data, the putative 960-year Eddy cycle is in empirical mode C3. First, the hematite-stained grains dataset.

 

Curious. In this one, there is no clear peak in C3. There’s still a peak at ~2,300 years, in empirical mode C5, but the 7,000-year cycle is much weaker. And there’s a wide peak around 1,400 years in empirical mode C4. This is very different from the stacked data.

Next, here’s the first of the two Icelandic Glass datasets.

 

In this one, the 7,000-year cycle is back … but there’s even less sign of the putative 960-year cycle in empirical mode C3. However, the previous cycle in C4 has moved down to a weak peak near 1,000 years.

Then we have the second Icelandic Glass dataset, from a different drill core.

 

Once again, there is no 960-year cycle. It’s smeared out from 500 to 1,500 years, basically non-existent, but the 7,000-year cycle is strong. Overall, the two Icelandic Glass datasets are nearly identical, which increases the confidence in these results.

Finally, here’s the detrital carbonate data.

 

Most interesting. Almost no sign of the 7,000-year cycle, and once again there’s only a weak 960-year cycle. However, in empirical mode C4, there is a small peak around 1,500 years in length.

So … what have we learned?

Well, the first thing I learned is that the putative 960-year “Eddy Cycle” only exists in the “stacked” dataset. There is little sign of it in the four underlying datasets. It is an artifact of the averaging process that was used to stack the four datasets.

Next, Debret et al. say that there is a 1,500-year cycle in the Bond data … however, although it appears to show up in the “stacked” data, it is only found in one of the four individual datasets. Again, this appears to be an artifact.

Next, the 7,000-year cycle is strong in the Icelandic Glass datasets, weak in the hematite data, and basically non-existent in the detrital carbonate data. Go figure.

Next, there is a cycle in all four of the datasets at somewhere between 2,200 and 2,600 years. This is the “Bray Cycle” referred to by Javier. Since it appears in all four datasets, can we believe that it is real?

Well, not so fast. Here’s empirical mode C3 for all four of the datasets:

CEEMD Bond Empirical Mode C3

At about 5,500 years BP, they all line up. And moving towards the present, the adjacent cycle is almost exactly 2,500 years long.

But the next cycle nearer to the present is quite different. For the first three datasets, it’s only about 2,000 years long … and for the detrital carbonate, a bit longer than that. Also, all of the cycles are disappearing as we get nearer to the present.

Going back in time from the 5,500-year peak, however, things get worse rapidly. The correlation of the data falls apart, and by the time we’re back to 10,500 years before present, it’s all over the map. The datasets are completely different. However, instead of decreasing in size as they did near to the present, they maintain their amplitude back to the earliest part of the record.

Note, however, that the two Icelandic Glass datasets (blue and red) stay in very tight lockstep over the whole dataset. This is not good since they are both given equal weight in the “stacked” dataset … meaning that the Icelandic Glass data gets counted twice and thus is given twice the weight of the other two datasets. This is bad practice because it distorts the end result.
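The double-counting is simple arithmetic, and a small Python sketch makes it explicit. The numbers below are made up for illustration and are not the Bond records; the second glass record is set to an exact copy of the first, which exaggerates the “very tight lockstep” seen in the real data. The variable names are hypothetical.

```python
# Made-up numbers for illustration only; these are NOT the Bond records.
hematite = [1.0, 2.0, 3.0, 4.0]
glass_a = [4.0, 0.0, 4.0, 0.0]
glass_b = list(glass_a)          # exaggerated case: an exact duplicate
carbonate = [0.0, 1.0, 0.0, 1.0]


def stack(records):
    """Equal-weight stack: add the records point by point, divide by the count."""
    return [sum(values) / len(values) for values in zip(*records)]


equal_weight_stack = stack([hematite, glass_a, glass_b, carbonate])

# The same result written out per proxy type:
# the glass signal gets weight 0.5, the other two get 0.25 each.
per_proxy = [0.25 * h + 0.5 * g + 0.25 * c
             for h, g, c in zip(hematite, glass_a, carbonate)]

print(equal_weight_stack)
print(per_proxy)
```

The two printed lists are identical, which is just another way of saying that an equal-weight stack of four records, two of which are copies, is a 50/25/25 blend of three signals.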

So … is there actually a 2,500-year cycle as claimed by Debret et al.?

Well, it’s possible, but the evidence is far from clear. There’s only one 2,500-year complete cycle in the data, from ~ 3,000 to 5,500 years BP … but before and after that, the cycles vary in both length and amplitude in the four datasets.

I see this as yet another cautionary tale of expert analysis gone wrong. Here are the cautions, in no particular order:

As Richard Feynman observed, “Science is the belief in the ignorance of experts”. I can’t tell you how many times I’ve looked at some paper like this one, written by experts, only to have it fall apart under closer examination.

Periodograms and Fourier Analyses are limited in that they can be fooled by a signal that only appears in part of the record, and then disappears.

What I call “pseudocycles” are quite common in nature. These appear to be real cycles, but over time they get larger, or get smaller, or disappear altogether, only to be replaced by some other pseudocycle.

The fact that four datasets all show, say, a ~2,500-year cycle does not mean that the cycles are in phase.

Using “stacked” datasets can easily create artifacts through both constructive and destructive interference.

Using two basically identical datasets in a “stack” of four datasets will overweight that data, leading to incorrect conclusions.

Humans are very good at detecting patterns and cycles, even where none exist. For example, almost all cultures see “constellations” in random groupings of stars. I hold that this is a result of using our eyes to find predators—there is no penalty for seeing a pattern of stripes that is not there, but there is a huge penalty for not noticing the pattern of stripes that is a tiger. And as a result, we tend to see patterns everywhere, including cycles in natural climate datasets, even though they may not exist at all.

This implies that all such claims of cycles in natural climate datasets need to be investigated very, very carefully. For example, if you are using a periodogram or doing a Fourier analysis, at a minimum it is imperative to divide the dataset in two and see if the claimed cycle exists in both halves. Or, as I have done, you can use a CEEMD analysis to investigate the nature and changes in the cycle visually. However, I see experts all the time do one Fourier analysis on a full dataset and declare victory …
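As a rough illustration of the split-in-two check, here is a minimal Python sketch on synthetic data. It is not the analysis from this post and does not use the Bond data. It assumes a made-up series with a persistent 2,500-year cycle and a 1,000-year cycle that is present only in the first half of the record; the names `amplitude_at` and `split_half_amplitudes`, and the sampling step, are hypothetical.

```python
import math

STEP = 10.0   # years between samples (assumed for illustration)
N = 1000      # number of samples, i.e. a 10,000-year synthetic record


def amplitude_at(series, period, step=STEP):
    """Amplitude of the sinusoid with the given period (single-frequency DFT)."""
    re = im = 0.0
    for i, x in enumerate(series):
        angle = 2 * math.pi * i * step / period
        re += x * math.cos(angle)
        im += x * math.sin(angle)
    return 2 * math.hypot(re, im) / len(series)


def split_half_amplitudes(series, period, step=STEP):
    """Return (full, first_half, second_half) amplitudes at the given period."""
    mid = len(series) // 2
    return (amplitude_at(series, period, step),
            amplitude_at(series[:mid], period, step),
            amplitude_at(series[mid:], period, step))


# Synthetic series: a 2,500-year cycle throughout, plus a 1,000-year cycle
# that exists only in the first half of the record and then vanishes.
series = []
for i in range(N):
    t = i * STEP
    persistent = math.sin(2 * math.pi * t / 2500.0)
    transient = math.sin(2 * math.pi * t / 1000.0) if i < N // 2 else 0.0
    series.append(persistent + transient)

for period in (2500.0, 1000.0):
    full, first, second = split_half_amplitudes(series, period)
    print(f"{period:.0f}-yr  full={full:.2f}  "
          f"first half={first:.2f}  second half={second:.2f}")
```

On the full record, the 1,000-year cycle still shows up at half strength, which a single periodogram would report as a real peak. Only the per-half numbers reveal that it is at full strength in one half and absent in the other, while the 2,500-year cycle is the same in both.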

My best wishes to you for all the good things—laughing with your family, walking in the rain, sunlight far-reaching on the sea, gentle breezes …

w.

MY USUAL REQUEST: Please, when you comment, QUOTE THE EXACT WORDS YOU ARE DISCUSSING so that we can all understand who and what you are talking about. In addition, it’s not possible to refute someone’s claim unless you quote it first. Note that while this request is polite, I am likely to get grumpy and say inappropriate things about your ancestry and personal habits if you repeatedly refuse to identify what you’re talking about.

In support of that, I’ve posted this graphic before, and I’ll post it again. Please structure your comments to keep them up near the top of the pyramid.

grahams hierarchy of disagreement

The graphic is based on How to Disagree by Paul Graham, which is well worth reading.

One thing I’d like to highlight is that in the linked article the author says (emphasis mine):

DH5. Refutation.

The most convincing form of disagreement is refutation. It’s also the rarest, because it’s the most work. Indeed, the disagreement hierarchy forms a kind of pyramid, in the sense that the higher you go the fewer instances you find.

To refute someone you probably have to quote them. You have to find a “smoking gun,” a passage in whatever you disagree with that you feel is mistaken, and then explain why it’s mistaken. If you can’t find an actual quote to disagree with, you may be arguing with a straw man.

Words to live by …


via Watts Up With That? http://ift.tt/1Viafi3
