The necessity of uncertainty: Part 2
OK, so error bars are good things, if you believe the last post. So what else is there?
Simply put, in many cases uncertainty is not a scalar, it is a matrix, and often a really big matrix. To know the absolute uncertainty on one point, you have to know how it covaries with other points. And to manipulate results downstream, you really need the full set of covariances.
In tomography, for instance, the uncertainty is a matrix the size of the model space; how the uncertainty at one point covaries with other points matters. Consider for starters a trivial case: a block of rock with two parts, where we measure the travel time of a seismic wave through the block:
The time for the seismic wave to travel through the blocks, t, is (w/v1) + (w/v2). Seismologists often work in slownesses, which are 1/velocity, so an equivalent expression is t = w*s1 + w*s2. Obviously with one equation and two unknowns, we cannot say much of anything about s1 or s2. But let’s say we have an estimate of s1 and its uncertainty; then s2 = t/w – s1, and propagating the uncertainty in s1 gives an uncertainty for s2. But here’s the catch: if s1 is actually higher than the value we estimated, then the value for s2 must be lower: they are not independent.
This might be slightly more apparent with numbers. If w=12 and t is 7, let s1 be 0.33 +/- 0.08. From this we get s2=0.25 +/- 0.08. But here’s the thing: if s1 is actually 0.25, then s2 must be 0.33 if t is perfectly known. The errors are correlated–they covary. If s1 was estimated too high, the s2 we derive comes out too low by the same amount.
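A few lines of code make the trade-off concrete (this just restates the arithmetic above; nothing here is new physics):

```python
# Numeric check of the two-block example: w = 12, t = 7, t = w*s1 + w*s2.
w, t = 12.0, 7.0

def s2_from_s1(s1):
    # Rearranged from t = w*s1 + w*s2
    return t / w - s1

print(round(s2_from_s1(0.33), 3))  # 0.253, i.e. ~0.25
print(round(s2_from_s1(0.25), 3))  # 0.333: a lower s1 forces a higher s2
```

Any error in s1 maps one-for-one (with opposite sign) into s2, which is exactly what a covariance records.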
OK, well how is that helpful? Imagine that somebody does the inversion and reports s1=0.33 +/- 0.08 and s2= 0.25 +/- 0.08 and some other worker just needs the average velocity. Well, they take the published numbers and get 0.29 +/- 0.06, where the uncertainty simply reflects the assumption that the individual uncertainties are independent–but they are not. The thing is, we know very precisely what the average slowness is: it is 7/24 or a bit above 0.29–but the uncertainty is near zero, not ~20%. Because of the high correlation (or covariance) of the uncertainties, the uncertainty of the total is far smaller than that of the individual pieces.
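The same arithmetic as a covariance propagation, as a sketch: since t is exact, the errors in s1 and s2 are perfectly anticorrelated, and the variance of the average a·s follows the standard formula a C aᵀ. The 2×2 matrices below are just the two assumptions (independence vs. full anticorrelation) written out.

```python
import numpy as np

sigma = 0.08
# What a downstream user implicitly assumes: independent errors.
C_indep = np.array([[sigma**2, 0.0],
                    [0.0, sigma**2]])
# What the inversion actually implies: t exact => perfectly anticorrelated.
C_true = np.array([[sigma**2, -sigma**2],
                   [-sigma**2, sigma**2]])
a = np.array([0.5, 0.5])  # averaging operator
sig_indep = float(np.sqrt(a @ C_indep @ a))  # ~0.057: the naive +/- 0.06
sig_true = float(np.sqrt(a @ C_true @ a))    # 0.0: average perfectly known
print(sig_indep, sig_true)
```

Same published numbers, wildly different uncertainty on the average, depending entirely on the off-diagonal terms nobody reported.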
This exact logic can be applied to receiver functions (which, for those who haven’t seen them, are kind of a spike train representing P-to-S converted energy in a seismogram); long ago GG generated receiver functions with uncertainties. These were quite large and indicated there were no signals outside of noise in the receiver function. But when you applied a moving average and kept track of these covariances, the uncertainties rapidly dwindled, revealing a number of significant signals that were actually present in the receiver functions.
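A hedged sketch of why that happens (not GG’s actual code, and the adjacent-sample correlation rho = -0.45 is an assumed illustrative value): smoothing a trace whose neighboring samples are anticorrelated cancels much of the error, which the propagated covariance A C Aᵀ shows directly.

```python
import numpy as np

n, sigma, rho = 8, 1.0, -0.45
# Tridiagonal covariance: each sample anticorrelated with its neighbors.
C = sigma**2 * np.eye(n)
for i in range(n - 1):
    C[i, i + 1] = C[i + 1, i] = rho * sigma**2
# 3-point moving-average operator.
A = np.zeros((n - 2, n))
for i in range(n - 2):
    A[i, i:i + 3] = 1.0 / 3.0
C_smooth = A @ C @ A.T  # covariance of the smoothed trace
sig_smooth = float(np.sqrt(np.diag(C_smooth)).max())
print(sig_smooth)  # ~0.37, down from the raw sigma of 1.0
```

Treating the samples as independent would only shrink sigma to 1/sqrt(3) ≈ 0.58; the anticorrelation does the rest of the work.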
Not surprisingly, similar issues exist in tomography. Say we have a tomographic model based on local earthquake travel times with nodes every 100m. Odds are pretty good that in parts of the model, you can fit all the data just as well if you increase the wavespeed at one node and decrease it at an adjacent node. Your uncertainty at that point might be awful. But if you want the average wavespeed over a volume of points, that might prove to be pretty robust as the covariances collapse.
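A toy version of that node trade-off, as a sketch (the damping eps and data error sigma_d are assumed values, and real tomography has far more rays): one ray crossing two adjacent nodes constrains only the sum of their slownesses, so the damped least-squares posterior covariance gives each node an enormous uncertainty while the average stays tight.

```python
import numpy as np

G = np.array([[1.0, 1.0]])   # one travel time samples s1 + s2 (unit paths)
sigma_d = 0.01               # assumed travel-time standard error
eps = 1e-4                   # small damping so the inverse exists
# Posterior model covariance for damped least squares.
Cm = np.linalg.inv(G.T @ G / sigma_d**2 + eps * np.eye(2))
a = np.array([0.5, 0.5])     # average over the two nodes
sig_node = float(np.sqrt(Cm[0, 0]))      # single node: barely constrained
sig_avg = float(np.sqrt(a @ Cm @ a))     # average: ~sigma_d/2
print(sig_node, sig_avg)
```

The off-diagonal term of Cm is large and negative; drop it and you would badly overstate the uncertainty of the average, just as in the two-block example.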
Unfortunately the creation of these kinds of covariance matrices is not common (they fell out of favor in tomography, for instance, once fast and compact iterative solvers made producing them awkward and time-consuming), but even if they were around, they are not trivially presented in a paper. The replacement in tomography has been “checkerboard tests” and spike tests, where a simple anomaly is introduced, observations from such an anomaly are calculated, and then these fake observations are inverted to see how well the inversion recovers the anomaly. But the degree to which the anomalies are recovered depends on the geometry, and it is impossible to test all geometries, so in some cases the tests make it seem that the inversion is far worse than it is, and in some cases far better. (A test GG has never seen or done himself would be the opposite of the spike test: introducing a smooth gradient across a model and seeing if the inversion could catch it. GG suspects there is a minimum gradient necessary to produce a signal. Maybe something to try one day when there is spare time…)
This isn’t limited to inversions of observations. Complex models often have internal dependencies that effectively produce covarying signals from a common error. Of course, these can in theory be estimated from the underlying mathematics, but many modern models are so complex that it isn’t remotely obvious in many cases just how things are tangled up. For instance, dynamic topography models essentially depend upon a load and a rheology, but if both are being estimated from a seismic model and the seismic model depends upon an a priori crustal model, then errors in the crustal model can bleed all the way through to topography in a fairly non-intuitive manner.
So GG suggests that a frontier associated with Big Data and Big Models and Big Computing is the understanding of, ahem, Big Uncertainty. Because without that understanding we are left with…yeah, you can see it coming, big uncertainty.