Perhaps it’s not surprising that a political party unable to come to grips with the scientific evidence on global warming would extend its claim that the evidence is politically manipulated to other inconvenient truths. But the attack on the Bureau of Labor Statistics by some Republican supporters last Friday over its report of an improvement in the unemployment rate was still a bit of a shock.
The charge that employees at the BLS manipulated the employment numbers to favor Obama is nonsense, as anyone familiar with the calculation of these numbers can attest, but it does bring up a good question. What factors should be considered when assessing the reliability of economic data?
The first thing to consider is how well a particular piece of data accords with what we are actually trying to measure. For example, the total output of goods and services in the economy is relatively easy to define in theory, but does our actual measure give us this information? In developing countries where there is substantial home production that does not get counted, GDP may be a highly misleading measure of total output. Similarly, going back to last Friday, does the unemployment rate adequately reflect the actual unemployment problem in the presence of factors such as discouraged workers and involuntary part-time employment?
Even if a particular piece of data is imperfect, e.g., a measure of total output that excludes significant amounts of home production, it can still provide useful information. A thermometer that is off by a constant but unknown amount won’t be very helpful if the question is, "Precisely how hot or cold is it today?"
But if the question is about how the temperature is changing over time, whether it’s getting hotter or colder, and if the reading rises three degrees from one day to the next, that’s an accurate measure of the change. Similarly, GDP and unemployment may not be completely accurate reflections of the actual condition of the economy at any given point in time, but if the measurement errors are constant, they can still give an accurate picture of whether the economy is doing better or worse relative to the past.
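The thermometer argument can be checked with a few lines of arithmetic. The sketch below is only an illustration: the temperatures and the size of the error (`BIAS`) are made-up values, and in practice the error would be unknown. It shows that a constant error distorts every level but drops out of the day-to-day changes.

```python
# Hypothetical true temperatures over five days (illustrative values only).
true_temps = [60, 63, 61, 66, 70]

# Assumed constant thermometer error; in reality we would not know this number.
BIAS = 4

# What the faulty thermometer actually reports.
readings = [t + BIAS for t in true_temps]


def day_to_day_changes(series):
    """Return the change from each observation to the next one."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


true_changes = day_to_day_changes(true_temps)
measured_changes = day_to_day_changes(readings)

print("readings:        ", readings)          # every level is off by BIAS
print("true changes:    ", true_changes)
print("measured changes:", measured_changes)  # identical to the true changes
```

The same logic only carries over to GDP or unemployment if the measurement error really is constant; an error that drifts over time would contaminate the changes as well.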
This brings up a second consideration. The usefulness of a particular piece of data depends upon the question being asked. For example, if the question is about the cost of living for a household, the personal consumption expenditures index (PCE) is the best measure we have. But if the question is about the underlying trend rate of inflation that households face, then core PCE – PCE minus food and energy – is a better choice.
It’s also important to realize that when a researcher builds a data set to answer a particular research question, choices are made that, if anything, bias the data against the hypothesis being tested. This avoids accusations that the data were manipulated to produce a result, and it increases confidence to know the hypothesis survived despite the deck being stacked against it. But when the same data are used to answer other questions, these biases can work the wrong way. Thus, care must be taken when data are employed beyond their original research purpose.
What other inaccuracies might there be? One of the more important is the uncertainty associated with sampling. For example, when the Labor Department reports that employment increased by 114,000, as it did in the last report, the figure is based upon a survey of a subset of the population, and there is quite a bit of uncertainty about the estimate. In fact, when the uncertainty is taken into account, there’s a 90 percent chance that the true number lies somewhere between 14,000 and 214,000.
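To see what that range implies, here is a rough back-of-the-envelope sketch. It assumes the sampling error is approximately normal, so that the 90 percent interval is the estimate plus or minus about 1.645 standard errors; that is a simplifying assumption for illustration, not a description of how the agency computes its intervals. The names `implied_se` and `interval` are made up for this example.

```python
from statistics import NormalDist

estimate = 114_000               # reported employment change
lower, upper = 14_000, 214_000   # 90 percent range quoted above
confidence = 0.90

# Assumption: normal sampling error, so the interval is estimate +/- z * SE.
z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.645 for 90 percent
margin = (upper - lower) / 2                     # half-width of the range
implied_se = margin / z                          # standard error implied by the range


def interval(est, se, conf):
    """Return the (low, high) interval for a given confidence level."""
    z_conf = NormalDist().inv_cdf(0.5 + conf / 2)
    return est - z_conf * se, est + z_conf * se


print(f"margin of error:        {margin:,.0f}")
print(f"implied standard error: {implied_se:,.0f}")
print("90 percent interval:   ", interval(estimate, implied_se, 0.90))
print("95 percent interval:   ", interval(estimate, implied_se, 0.95))
```

Under these assumptions, the margin of error of 100,000 is nearly as large as the estimate itself, and the implied standard error is roughly 61,000 jobs.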
In general, estimates averaged over longer periods of time are better – annual data are usually pretty good while weekly data are, in many cases, so noisy they’re essentially useless – and estimates with larger samples are better than those based upon smaller samples. It’s also good to remember that preliminary estimates are usually based upon incomplete samples, and subsequent revisions can be quite large, sometimes large enough to substantially change the initial picture.
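A small simulation gives a feel for why averaging helps. Everything in it is hypothetical: the "true" change, the size of the noise (`NOISE_SD`), and above all the assumption that the errors are independent from one period to the next, which real survey errors need not be. Under those assumptions, the average of 12 noisy estimates is far less scattered than any single estimate.

```python
import random
from statistics import stdev

TRUE_CHANGE = 150_000   # hypothetical true value being estimated each period
NOISE_SD = 60_000       # hypothetical size of the sampling noise per period
PERIODS = 12            # number of periods averaged together
TRIALS = 20_000         # number of simulated repetitions

rng = random.Random(0)  # fixed seed so the run is repeatable


def noisy_estimate():
    """One simulated estimate: the true value plus independent normal noise."""
    return rng.gauss(TRUE_CHANGE, NOISE_SD)


# Scatter of a single-period estimate across many simulated samples.
single = [noisy_estimate() for _ in range(TRIALS)]

# Scatter of the average of PERIODS estimates across many simulated samples.
averaged = [
    sum(noisy_estimate() for _ in range(PERIODS)) / PERIODS
    for _ in range(TRIALS)
]

print(f"spread of single estimates:   {stdev(single):,.0f}")
print(f"spread of {PERIODS}-period averages: {stdev(averaged):,.0f}")
print(f"ratio: {stdev(single) / stdev(averaged):.2f}")
```

With independent errors, the spread shrinks with the square root of the number of periods averaged, so the ratio printed at the end should come out close to the square root of 12, or about 3.5.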
Lastly, there is the problem of seasonal adjustment. The statistical agencies do a good job of adjusting for seasonal fluctuations, but seasonal patterns change over time and seasonal adjustments aren’t always perfect. These errors get smoothed out eventually, but problems with seasonal adjustment can distort the initial picture provided by the data.
There are some countries where economic data that come from the government should be viewed with considerable suspicion. But in the U.S., there are too many people involved in the data collection and processing effort, and too many checks and balances, for this to be a worry. Sure, transcription and other errors can slip through, though most are caught at some point in the process; the data, particularly the initial releases, may be noisy for the reasons discussed above; and politicians often use economic data inappropriately to distort the "facts" in their favor. But there is no reason at all to believe the data have been manipulated to favor a particular point of view.