How Bad is HadCRUT4 Data?
Guest Essay by Renee Hannon
This post is a coarse screening assessment of HadCRUT4 global temperature anomalies to determine the impact, if any, of data quality and data coverage. There has been much discussion on WUWT about the quality of the Hadley temperature anomaly dataset since McLean’s Audit of the HadCRUT4 Global Temperature publication, which is paywalled. I purchased a copy to see what all the hubbub was about, and it is well worth the $8 in my view. Anthony Watts’ review of McLean’s findings and executive summary can be found here.
A key chart for critical study is Figure 4.11 in McLean’s report. McLean suggests that HadCRUT4 data prior to 1950 is unreliable due to inadequate global coverage and high month-to-month temperature variability. For this post, I subdivided McLean’s findings into three groups, shown with added shading. Good data covers the years post-1950, when global data coverage is excellent at greater than 75% and month-to-month temperature variation is low. Questionable data spans 1880 to 1950, when global data coverage ranged from 40% to 70% with higher monthly temperature variations. Poor data is pre-1880, when global coverage ranged from 14% to 25% with extreme monthly temperature variations.
An obvious question is how data coverage and data quality affect the technical evaluation and interpretation of HadCRUT4 global temperature anomalies.
The monthly HadCRUT4 global temperature anomaly dataset, referred to as temperature hereafter, was detrended to compare temperature trends, deviations and the impact of noise. The focus was on interannual temperature anomalies. Therefore, it is important to remove underlying longer-term trends from the data.
A common method to detrend temperature data uses a slope of linear regression over a period of time. The linear trend from 1850 to 2017 is used to detrend the HadCRUT4 temperature data shown in Figure 1a. A simple linear regression through the entire dataset leaves a secondary trend in the data. As seen in Figure 1a, the underlying longer-term signal is not completely removed, and the remaining data is not flattened. Several linear regression slopes are required to completely remove secondary trends.
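The single-slope approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for Figure 1a: the function name `linear_detrend` and the synthetic curved series are assumptions made for the example, and no HadCRUT4 values are used.

```python
def linear_detrend(times, values):
    """Subtract the ordinary least-squares line from values."""
    n = len(values)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    residuals = [v - (slope * t + intercept) for t, v in zip(times, values)]
    return residuals, slope, intercept


# Synthetic stand-in for an anomaly series (NOT HadCRUT4 data):
# a slowly accelerating trend sampled once per year.
years = list(range(1850, 2018))
series = [0.00003 * (y - 1850) ** 2 for y in years]

residuals, slope, intercept = linear_detrend(years, series)

# The residuals average to zero, but they are positive at both ends and
# negative in the middle: the curvature survives the single-slope fit.
print(residuals[0], residuals[len(residuals) // 2], residuals[-1])
```

On a curved series like this one, a single line removes the average slope but leaves a bowl-shaped residual, which is the kind of secondary trend noted in Figure 1a.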
Figure 1: Comparison of commonly used detrending methods on temperature datasets. Gray boxes show statistics associated with each method. a) HadCRUT4 annual monthly temperature anomalies detrended using a single linear regression slope. b) HadCRUT4 annual monthly temperature anomalies detrended using a 21-year running average. Several temperature anomalies greater than 2σ are noted on the graph.
Another method used here is a centered running average of 21 years to detrend the temperature dataset. Since averaging degrades the tail end of the data, the last 10 years used a simple linear regression to extend past the running average. The running average with linear tail is subtracted from the HadCRUT4 temperature anomaly data. This removes the influence of underlying longer-term trends and the remaining data appears flattened. As shown in Figure 1b, temperature data detrended using this method produces an average temperature close to zero indicating a flattened trend. It is now easier to compare temperature spikes, amplitude of key events, or changes in the baseline noise.
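A rough sketch of this running-average detrend follows. Several details are assumptions, since the post does not spell them out: the tail is filled by extrapolating a least-squares line fitted to the last few valid running-average points, the first half-window is left undefined, and the series is treated as annual for brevity. All function names are hypothetical.

```python
def centered_running_mean(values, window):
    """Centered moving average; None where the full window does not fit."""
    if window % 2 == 0:
        raise ValueError("window must be odd so it can be centered")
    half = window // 2
    smooth = [None] * len(values)
    for i in range(half, len(values) - half):
        smooth[i] = sum(values[i - half:i + half + 1]) / window
    return smooth


def extend_tail_linearly(smooth, fit_len):
    """Fill trailing None entries by extrapolating a least-squares line
    fitted to the last fit_len valid smoothed points (an assumption)."""
    valid = [i for i, s in enumerate(smooth) if s is not None]
    xs = valid[-fit_len:]
    ys = [smooth[i] for i in xs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    filled = list(smooth)
    for i in range(valid[-1] + 1, len(smooth)):
        filled[i] = mean_y + slope * (i - mean_x)
    return filled


def detrend_by_running_mean(values, window=21, fit_len=10):
    """Subtract the running average (with linear tail) from the series."""
    baseline = extend_tail_linearly(
        centered_running_mean(values, window), fit_len)
    return [None if b is None else v - b for v, b in zip(values, baseline)]


# Synthetic annual series (NOT HadCRUT4 data): steady warming plus bumps.
series = [0.01 * i + (0.1 if i % 7 == 0 else 0.0) for i in range(80)]
detrended = detrend_by_running_mean(series)
```

For monthly data, a 21-year window would be roughly 253 months rather than 21 points. The leading entries stay `None` in this sketch because a centered window does not fit there either.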
Figure 1 enables comparison of the two detrending methods. Temperature standard deviations and ranges between the two methods are different. The detrended data using the running average has a smaller standard deviation and narrower temperature range of 0.4 degrees C versus 0.7 degrees C for the linear detrended data. This narrower range is more indicative of short-term climate variability compared to the linear detrended data which is a combination of both short-term and underlying longer-term trends.
Temperature Spike Frequency
At irregular intervals on the detrended data, a temperature spike lasting approximately 1-2 years emerges through the background noise, such as in 1878, 1976, 1998, and 2016 to name a few. These warm and cold temperature spikes are attributed to El Niño and La Niña conditions of the El Niño-Southern Oscillation (ENSO), which affect temperature, wind and precipitation (Trenberth, et al.). Figure 2 highlights the temperature spikes which exceed two standard deviations from the zero baseline.
Figure 2: HadCRUT4 detrended by running average. Red and blue dots show temperature anomaly spikes greater than 2σ from the zero baseline.
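Flagging spikes beyond two standard deviations from the zero baseline can be sketched as below. The function name `flag_spikes`, the use of the population standard deviation, and the sample values are all assumptions for illustration. The post does not say which standard deviation estimator was used.

```python
import statistics


def flag_spikes(detrended, n_sigma=2.0):
    """Return indices of warm and cold spikes beyond n_sigma standard
    deviations, measured from a zero baseline (not from the mean)."""
    sigma = statistics.pstdev(detrended)
    threshold = n_sigma * sigma
    warm = [i for i, v in enumerate(detrended) if v > threshold]
    cold = [i for i, v in enumerate(detrended) if v < -threshold]
    return warm, cold, threshold


# Synthetic detrended anomalies in degrees C (NOT HadCRUT4 data).
sample = [0.05, -0.04, 0.02, 0.40, -0.03, 0.01,
          -0.35, 0.03, -0.02, 0.04, 0.00, -0.01]

warm, cold, threshold = flag_spikes(sample)
print(warm, cold)
```

Any undefined entries left by a detrending step would need to be filtered out before calling a function like this.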
Post-1950 there are only two cold and two warm spikes that exceed 2σ. The 1998 and 2016 warm spikes are associated with well-documented El Niño conditions (Trenberth). The period from 1950 to 1998 is unusually devoid of warm temperature spikes. During this interval, there were several large volcanic eruptions at Mount Pinatubo in 1991, El Chichon in 1982 and Mount Agung in 1963 with continued weaker eruptions. Volcanoes produce sulfate aerosols that slow warming by reflecting incoming solar radiation and can cause global surface temperatures to decrease. Studies suggest that volcanic eruptions impact global temperatures about 1 year after the eruption and for approximately 2-3 years depending on the size of the eruptions.
The 1970s were years of record cold temperatures in parts of North America. These years coincide with the ice age scare and the Cold Wave of 1977 when, in the U.S., the Ohio River froze solid for the first time since 1918 and snow fell in Miami with record low temperatures of less than 28 degrees F. La Niña conditions around 1956 resulted in a European cold wave. The years 1976/77 are commonly recognized as an abrupt climate shift from cooler to warmer global temperatures (Giese, et al.). Post-1976 there is a lack of cold temperature spikes.
In contrast, pre-1950 there are twice as many warm and cold temperature spikes as post-1950, especially between 1880 and 1920. One hypothesis is that some warm and cold temperature spikes are the result of increased noise in the data due to sparse data coverage, as described by McLean.
Other interpretations suggest the temperature spikes are related to increased frequency of El Niño and La Niña events. For example, the period from 1900 to 1940 has been described as being dominated by the Era of El Niños based on the Nino3.4 index summarized by Donald Rapp. However, during this period there were also three cold temperature spikes exceeding 2σ and several warm temperature spikes immediately preceding the referenced Era of El Niños. Additional study is warranted to determine if some of the warm and cold temperature spikes are caused by noisy data or are the result of true El Niño and La Niña episodes. It is difficult to distinguish background noise from anomalous spikes during this period.
Temperature Spike Amplitudes
The amplitudes of warm and cold temperature spikes on detrended data were compared from 1860 to 2017 and are shown in Figure 3. The 1878 warm temperature spike is the most outstanding anomaly in the past 160 years. This warm spike is almost double the magnitude of any other temperature excursion from a zero baseline and occurs in McLean’s sparse data coverage zone with the highest amount of noisy data. The 1878 warm temperature spike may indeed be related to an El Niño episode. Several authors have reported large-scale drought in northern China and famine in southern India during this time. However, the 1878 temperature could be overestimated because of poor data coverage and quality. This outlier appears to be largely driven by the Northern Hemisphere temperature record.
Figure 3: Comparison of warm and cold global surface temperature spikes from 1860 to 2017. Data is the detrended data shown in figure 1b using a running average. Spikes were aligned close to maximum departure. Note temperature scale on warm spikes is zoomed out due to the 1878 anomaly. Warm average temperature maximum does not include 1878.
Except for 1878, the other warm temperature spikes appear very similar and all peak just slightly above 0.2 degrees C. This casts additional suspicion upon the accuracy of the 1878 warm spike. Further, the cold spikes do not show any unusual trends. The 1976 cold spike is the coldest over the past 160 years, but not significantly so. Also, note both warm and cold spikes show nearly equal temperature departures of +0.23 and -0.22 degrees C from a zero baseline, respectively. Again, 1878 is an exception.
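The spike comparison in Figure 3 aligns each spike near its maximum departure before averaging. One way to do that is sketched below, under assumptions: each spike is re-centered on the largest departure within a small search range, then the windows are averaged point by point. The function names, window sizes and synthetic values are hypothetical and are not taken from the post's workflow.

```python
def window_around_peak(series, approx_index, search=2, half_width=3,
                       warm=True):
    """Locate the extreme value near approx_index and return the window
    centered on it, so each spike is aligned on its own peak."""
    lo = max(approx_index - search, 0)
    hi = min(approx_index + search, len(series) - 1)
    pick = max if warm else min
    peak = pick(range(lo, hi + 1), key=lambda i: series[i])
    if peak - half_width < 0 or peak + half_width >= len(series):
        raise ValueError("window does not fit inside the series")
    return series[peak - half_width:peak + half_width + 1]


def composite(windows):
    """Point-by-point average of equally sized, peak-aligned windows."""
    return [sum(column) / len(column) for column in zip(*windows)]


# Synthetic detrended series (NOT HadCRUT4 data) with two warm spikes.
series = [0.0] * 30
series[9:12] = [0.10, 0.20, 0.10]
series[20:23] = [0.12, 0.30, 0.12]

# Approximate spike positions are deliberately off by one step.
windows = [window_around_peak(series, 9), window_around_peak(series, 22)]
mean_spike = composite(windows)
print(mean_spike)
```

Leaving an outlier such as 1878 out of the list of windows keeps it from dominating the average, which is how the warm average in Figure 3 is described.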
HadCRUT4 global temperature anomalies were assessed to determine the impact of varying data coverage and noise as described by McLean. It is recognized that sparser data coverage and increased noise have influenced interannual HadCRUT4 temperature anomalies pre-1950.
Interannual temperature spikes, both warm and cold, increase in frequency during the Questionable Data timeframe pre-1950. While several of these spikes may be associated with El Niño and La Niña events, others may be the result of increased noise within the data. This period of questionable data, especially the cluster of warm and cold spikes from 1880 to 1920, warrants further study.
The 1878 warm temperature spike is the most significant interannual temperature anomaly over the past 160 years. Its amplitude is almost double that of any other temperature spike. This recorded temperature outlier is most likely erroneously high due to sparse global coverage and increased data noise.
Warm and cold spike maximums do not demonstrate any obvious increasing or decreasing trends from 1880 to present day, except for 1878. The recent 1998 and 2016 warm spikes are within one standard deviation of past warm spikes.
The HadCRUT4 global data pre-1950 may contain useful information but should be used with caution with an understanding that increases in noise due to poor data sampling may create false or enhanced temperature anomalies.
A future post will evaluate the data quality impact on the underlying HadCRUT4 decadal and multi-decadal trends.
Acknowledgements: Special thanks to Andy May and Donald Ince for reviewing and editing this article.
via Watts Up With That? https://ift.tt/1Viafi3