Showing posts with label error propagation. Show all posts

Sunday, February 18, 2007

Can't we all just propagate our errors?

This weekend I continued work on a software upgrade for one of the labs I work in. It is all part of an effort to streamline sample analysis, automate as much as possible, and make the data reduction smoother and more versatile. For those of you who care, I am doing everything in LabVIEW, my programming platform of choice. My first step is to go through the existing code, figure out what does what, and decide whether I want to copy the process or change things around.

This got me thinking, again, about error propagation. Most of the significant errors in the existing code are dealt with appropriately, but some very easy ones are simply ignored. Truth is, most calculations are fairly easy to propagate errors through, and I am a firm believer that uncertainty must be dealt with honestly, even if in the end it is insignificant. The most popular geochronologic data reduction tool (I am guessing) is Ken Ludwig's Isoplot. It does an excellent job with uncertainties, assuming of course that users report them all. But Isoplot doesn't interface with the machines; it can only work with the data it is given.

For this reason I think the book An Introduction to Error Analysis by John R. Taylor (a physicist at the University of Colorado) should be required reading for anyone who works in a lab or with data from one. All of the basic methods for propagating uncertainties are covered and explained very well. The book doesn't discuss some of the more complicated statistics common in geochronology (e.g. the MSWD), but as a starter and reference text it is well worth it. It also isn't directed toward the earth sciences, but again, the basics are the basics.

For example, to calculate the size of a spike or standard shot of gas, and to correct for its depletion, you need to know the reservoir tank and pipette volumes and the pressure of the gas in the tank (the partial pressure if your gas isn't pure). Associating uncertainties with each of these measurements is fairly straightforward, so any number derived from them, say the size of a spike, or a correction for ionization efficiency, should include those propagated uncertainties. Numbers without uncertainties really don't exist in geology, but they show up in data reduction all the time.
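The basic rules really are easy to put into practice. Here is a minimal sketch of the spike-size example: relative uncertainties combine in quadrature for products and quotients. The numbers are invented for illustration, and the ideal gas law n = PV/RT stands in for whatever calibration relation a given lab actually uses.

```python
import math

# Hypothetical values for illustration only -- not real lab numbers.
R = 8.314  # gas constant, J/(mol*K)

# Measured values and their 1-sigma uncertainties
P, sP = 1.2e-2, 1e-4      # tank pressure, Pa (partial pressure if impure)
V, sV = 0.20e-6, 0.5e-9   # pipette volume, m^3
T, sT = 293.0, 0.5        # temperature, K

# Moles of gas delivered in one pipette shot (ideal gas law)
n = P * V / (R * T)

# For a product/quotient, relative uncertainties add in quadrature:
# (sn/n)^2 = (sP/P)^2 + (sV/V)^2 + (sT/T)^2
rel = math.sqrt((sP / P)**2 + (sV / V)**2 + (sT / T)**2)
sn = n * rel

print(f"spike shot: {n:.3e} +/- {sn:.1e} mol ({100*rel:.2f}% relative)")
```

Every downstream number, a depletion correction, an ionization-efficiency factor, just repeats this step, so there is no excuse for dropping the uncertainties partway through.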

Dealing with errors in a realistic and representative way is the first step. Getting people to report them, even when they are crappy, is the second.

Wednesday, January 10, 2007

Fitting a line through data #2

So here is the published version of the graph I posted previously; this time you can see the y-axis values and the line drawn through the data points. As I mentioned in my first post on this topic, my biggest beef is that the uncertainties associated with the salinity measurements, or even the time measurements, are not included in the graph or mentioned in the article or figure caption. One can imagine that if the uncertainties associated with these measurements are large, maybe 1 or 2 ppt, then drawing any kind of curve, let alone one with so many precise dips and peaks, is very sketchy.
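To make that concrete, here is a minimal sketch, with invented salinity numbers rather than the published data, of fitting a straight line by weighted least squares and computing the MSWD (reduced chi-squared). With an assumed uncertainty of 1.5 ppt on each point, the MSWD comes out well below 1: the scatter is fully explained by a nearly flat trend, and there is no statistical justification for fitting dips and peaks through the points.

```python
import math

# Invented data purely for illustration -- not the published values.
x = [0, 1, 2, 3, 4, 5, 6, 7]                              # time
y = [30.1, 30.9, 29.8, 31.2, 30.4, 31.0, 29.9, 30.6]      # salinity, ppt
sigma = 1.5                                               # assumed 1-sigma, ppt

# Weighted least-squares straight line y = a + b*x, weights w = 1/sigma^2
w = [1.0 / sigma**2] * len(x)
S   = sum(w)
Sx  = sum(wi * xi for wi, xi in zip(w, x))
Sy  = sum(wi * yi for wi, yi in zip(w, y))
Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

delta = S * Sxx - Sx**2
a = (Sxx * Sy - Sx * Sxy) / delta   # intercept
b = (S * Sxy - Sx * Sy) / delta     # slope

# MSWD: ~1 means the line explains the scatter given the uncertainties;
# >> 1 means real structure beyond the line; << 1 means the assumed
# uncertainties are too large to support any extra wiggles.
mswd = sum(wi * (yi - a - b * xi)**2
           for wi, xi, yi in zip(w, x, y)) / (len(x) - 2)

print(f"slope = {b:.4f} ppt per unit time, MSWD = {mswd:.2f}")
```

Shrink sigma to 0.1 ppt in the sketch and the MSWD jumps above 1, and suddenly a wigglier curve is defensible. That is exactly why the figure needs to show its uncertainties.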

This will hopefully be a regular feature of this blog. As I mentioned, I left graduate school with a very heightened awareness of error propagation and a very critical eye for how data are presented. Just because something is published does not mean it is flawless.

It is entirely possible that the shape of this curve is based on other factors, perhaps other data sets, or more likely some known and predictable relationship between time and salinity that these data happen to agree with. If that is so, it was not mentioned in the text of the article.

Show your uncertainties!

Saturday, January 06, 2007

Drawing lines through data, predetermined patterns, and good imaginations


One of the topics my PhD advisor was most concerned about was user bias in presenting data. I spent a lot of time learning how to propagate errors through calculations, and how to decide what kinds of curves are appropriate to fit to various 2D data sets. I came across an article in a recent publication (publication and author withheld for now) that got me thinking. Above I have posted a modified figure from that publication: the x-axis is time, and the y-axis is a value measured in sediment cores. Take a few minutes and decide how you would draw a line through these data if you needed to; I will come back with the authors' version in a few days. I have erased everything but the data points from the figure.

Correlating data with lines or curves can of course be informed by expected patterns, where one value has been shown to vary regularly with another. I am not an expert in the branch of earth science this article covers, so perhaps I am missing something. That being said, enjoy.

One more note: my first beef is the lack of uncertainties on the data points. They should either appear on the figure, or the figure caption should state that the uncertainties are smaller than the symbols. You will have to take my word that it does neither.