ASA Connect


Significant figures

Michael Mout 11-08-2017 11:13

  • 1.  Significant figures

    Posted 10-30-2017 10:55

    All,

     

    I'm involved in an in-house discussion on significant figures. I've searched this community for related posts, but did not come up with anything useful. I know the rules for determining how many digits in a number are significant and am familiar with rounding rules. My question has to do with reporting means and standard deviations.

     

    Somewhere along the line, I heard/read/made up the notion that if your raw data had, say, 3 significant digits, report the mean with one more digit than the raw data and the standard deviation with one more digit than the mean. Does anyone know of any official/reputable publication that verifies or refutes this?
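The rule described here can be sketched in a few lines of Python (the readings are made-up example data, used only to illustrate the rule, not anything from an actual study):

```python
# Sketch of the reporting rule described above (example data are invented):
# raw data recorded to 1 decimal place -> report the mean with one more
# decimal place, and the standard deviation with one more than the mean.
import statistics

readings = [12.3, 12.5, 12.4, 12.6, 12.2]  # raw data, 1 decimal place

mean = statistics.mean(readings)
sd = statistics.stdev(readings)

print(f"mean = {mean:.2f}")  # mean = 12.40
print(f"sd   = {sd:.3f}")    # sd   = 0.158
```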

     

    Thank you in advance!

     

    Bruce White

     

    Statistician

    Computare in aeternum

    CONFIDENTIALITY NOTICE: This e-mail communication and any attachments may contain proprietary and privileged information for the use of the designated recipients named above. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.


  • 2.  RE: Significant figures

    Posted 10-31-2017 04:42

    Hi Bruce,

     

    For preclinical tox type data, we have an SOP in place that uses exactly the same reporting rules that you mention.  I don't have a reference for it, as it was in place before I started.

     

    Dr. Steven C. Denham
    Senior Director, Bioinformatics Sciences

    MPI Research
    54943 North Main Street Mattawan, MI 49071
    Tel: +1.269.668.3336 x1455

    steven.denham@mpiresearch.com

     






  • 3.  RE: Significant figures

    Posted 11-01-2017 11:08
    I like Ehrenberg's ideas about tables, and his "The Problem of Numeracy" seems related to your question. See http://www.stat.washington.edu/pds/stat423/Documents/Ehrenberg.numeracy.pdf. His fourth guideline seems brutal compared to stuff I learned in school, which is nearer to your guidelines, but it does seem to communicate well (or, perhaps, it sure seems to do what we humans likely do anyway unless we really need to get into the detail, and that can save us mental energy).

    In his "Tables and Graphs" presentation (http://users.stat.umn.edu/~gary/classes/5303/lectures/TabularGraphical.pdf), Gary Oehlert draws on Ehrenberg and Wainer. Slide 3 gives Wainer's four types of tables, which is probably where some of the disagreement arises: communication may demand Ehrenberg's rule, while storage may suggest following Aubrey Magoun's suggestion here.

    And, of course, there's apparently a standard for this (https://shop.bsigroup.com/ProductDetail/?pid=000000000000285987). For the reference, see "When to use numeric tables and why: Guidelines for the brave" by Sally Bigwood and Melissa Spore (http://www.plainfigures.com/downloads/when_to_use_tables_and_why.pdf).

    Thanks for the Mosteller reference (http://www.medicine.mcgill.ca/epidemiology/hanley/tmp/DescriptiveStatistics/06_mosteller_writing_about_numbers.pdf), by the way!

    Bill
    --
    Bill Harris
    Data & Analytics Consultant
    Snohomish County PUD
    Everett WA
    (425) 783-1790




  • 4.  RE: Significant figures

    Posted 11-01-2017 11:18
    See, too, Andrew Gelman's "Tables as graphs: The Ramanujan principle".  I think the idea is consistent with Ehrenberg and Wainer.  I also like the idea in the linked "Effective communication of standard errors and confidence intervals."

    ------------------------------
    Bill Harris
    Data & Analytics Consultant
    Snohomish County PUD
    ------------------------------



  • 5.  RE: Significant figures

    Posted 10-31-2017 09:43

    Bruce,

     

I believe this is just a general rule of thumb.  However, to me there is nothing wrong with reporting all of the digits and letting the reader decide.  Rounding may hide differences which are significant.

     


     






  • 6.  RE: Significant figures

    Posted 11-01-2017 10:14
An important reason to carry all the digits is for checking or reproducing calculation or analysis procedures. This has nothing to do with the accuracy with which the original data were measured.


    ------------------------------
    Alan Feiveson
    Statistician
    NASA
    ------------------------------



  • 7.  RE: Significant figures

    Posted 11-02-2017 11:13
Points taken, about not discarding data internal to the calculations (e.g., for checking and replication of calculations, and of course to minimize rounding errors along the way). I'd interpreted the original question as being about reporting of findings at the end--where presumably there is an intended application for the result.


    With respect to differentiating overlapping points on a graph, which was mentioned, the technique of 'jittering' would have much the same effect. As a reason for keeping extra digits, isn't this use of them kind of acknowledging some randomness to the very small apparent differences that would differentiate points on the graph?


    Bill




  • 8.  RE: Significant figures

    Posted 11-01-2017 12:23
I think Andrew is saying that even if you stick with Bruce's rule of thumb (based on the input numbers used), not all the digits displayed in an answer may be practically significant. It ultimately comes down to the precision of the data collection tool/process at the outset, and avoiding the implication that the precision was greater.


I had a great teacher--who had worked in industry--way back when I took some electronics courses, who definitely made his point about this: he gave most of the class zeros (me, too!) on a quiz, nominally based on calculations from observations from our analog oscilloscopes. We all showed 10 or so digits, reading off our calculators; yet one could hardly read data off those old oscilloscopes (even assuming perfect accuracy) beyond a couple of decimal places. His point was that the numbers in all those extra places on the calculator had no reliability, so he felt it was misleading to display them as true results of the measurement.


It's a lesson I've found very useful, and generalizable. For instance, students may fret about exactly when one should switch from a "type 1" to a "type 2" assumption for conducting a 2 independent sample t test. But try simulating borderline cases where it makes a difference to, say, the p-value that would be noticeable within a practically significant range of decimal places for that value; it's almost impossible (I've tried). (So in that context, I've told students to make a reasoned selection of type--and not worry further about it.) If nothing else, an awareness of limits of practical significance can help focus attention, I've found, on what's the real "punch line" for the finding one's obtained.




  • 9.  RE: Significant figures

    Posted 10-31-2017 09:57
Significant figures need to die. They're based on an assumption that is wrong. They are supposed to denote a certain amount of accuracy/certainty, without bias, of the instrument you used; i.e., all instruments report exact values +/- some value, like 10.00ml +/- 0.01ml. If this were true, there would be no need for gage R&R studies.

Even if the instruments were that good, if you have a value, to proper sig figs, of 10.0000, the variability would need to be +/- 0.0001. If this is ever not true, the number of sig figs would be wrong. Should we have 10.0000 +/- 0.01, the device is only certain to 1-2 decimal places, not 4. Thus the idea of sig figs is wrong..... because you're not as certain as the glassware suggests.

    All of that assumes chaotic systems don't exist too... SSSSSSSSOOOOOOOOOOO.......

    ------------------------------
    Andrew Ekstrom

    Statistician, Chemist, HPC Abuser;-)
    ------------------------------



  • 10.  RE: Significant figures

    Posted 11-01-2017 17:47

    Hear! Hear!

     

     






  • 11.  RE: Significant figures

    Posted 11-02-2017 23:07
Significant figures are definitely useful, especially in chemistry. I'll give an example.

I worked in a lab at Harvard for 4 years. We had research assistants using chemistry/biochemistry methods to collect data for epidemiology studies. Every day the same chemical solutions had to be prepared. For one, the recipe was 15 microliters of an antibody solution and 12.505 milliliters of buffer solution. One research assistant measured out the latter by first measuring 12.5 milliliters using a 15 milliliter pipette and then an additional 5 microliters using a 10 microliter pipette. She didn't know about sig figs; if she did, she wouldn't have wasted her time measuring out 5 microliters after using a 15 mL pipette. The uncertainty from the 15 mL pipette is much larger than 5 microliters.






  • 12.  RE: Significant figures

    Posted 10-31-2017 12:08
    Check out Fred Mosteller's chapter, "Writing About Numbers" in his book (with John Bailar), Medical Uses of Statistics.

    Here's a scanned copy that Google found for me:

    http://www.medicine.mcgill.ca/epidemiology/hanley/tmp/DescriptiveStatistics/06_mosteller_writing_about_numbers.pdf


    Ralph O'Brien, PhD
    Retired Professor of Biostatistics
    (but still keenly professionally active: http://rfuncs.weebly.com/about-ralph-obrien.html)
    Dept. of Population and Quantitative Health Sciences
    Case Western Reserve University
    910.553.4224; Cell: 216.312.3203

    "The best classroom in the world is at the feet of an elderly person."
    "We don't ask to get old. We just get old. And if you're lucky, you may get old, too."
            ― Andy Rooney, 1919-2011, American Radio and TV writer, CBS "60 Minutes" weekly commentator








  • 13.  RE: Significant figures

    Posted 11-01-2017 01:43
    Bruce,
    This rule has bothered me for a long time and I too have asked others for a reference. I am occasionally given references regarding significant figures. I think that the rule is impractical. I am switching to one additional DP for means and medians, with the same for measures of variability.
    David

    ------------------------------
    David Bristol
    Statistical Consulting Services, Inc.
    ------------------------------



  • 14.  RE: Significant figures

    Posted 11-01-2017 11:00
    The rule that I think I remember from Shoemaker and Garland's textbook for physical chemistry experiments is to choose the number of significant digits in the reported results based on the breadth of the confidence interval.

    But your question reminds me of the USP rounding rules which I think make little sense.  Those rules say to report the individual numbers to the same number of decimal places as there are in the specification.  That leads to loss of information and to graphs wherein data points hide each other.

    ------------------------------
    Emil M Friedman, PhD
    emilfriedman@gmail.com
    http://www.statisticalconsulting.org
    ------------------------------



  • 15.  RE: Significant figures

    Posted 11-01-2017 13:32
    Bruce,

Both in preclinical and clinical research, we always did (and my friends who are not yet retired still do) what you "... heard/read/made up the notion" of: use an extra digit in the mantissa for a mean or median, and one further digit for the SD/SE.  If one investigates properties of sampling distributions of observed data, this procedure may have some impact on the tail probabilities of some findings.

Many, many years ago, I attended a seminar that Sir Ronald Fisher gave at the Indian Statistical Institute in Calcutta, India (before he left for Australia for good).  The same question was raised by someone in the audience, and his answer was very similar.  I would check Fisher's classical books to see whether he addresses this question.  Fred Mosteller's chapter that Ralph O'Brien provided is an excellent way of addressing various related issues, if not necessarily this exact one.

Let us play with a simple case of hypothetical observations: 1, 2, 3, 4.  Both the mean and median of this set are 2.5.  Obviously we have to use the number "2.5" for either the mean or the median of this set, thereby including ".5" in the mantissa.  Often, if one does not include a second digit in the mantissa beyond that for the SD/SE of the mean or median in hypothesis testing, one might disrupt the tail probability of significance, depending on the magnitudes of the numbers being tested.  If the numbers being tested are large, it may not make any difference if one does not include an extra digit or two in the mantissa beyond the mean/median.  Hope this helps.

    Ajit K. Thakur, Ph.D.

    ------------------------------
    Ajit K. Thakur, Ph.D.
    Retired Statistician
    ------------------------------



  • 16.  RE: Significant figures

    Posted 11-02-2017 12:08
    Suppose I have a digital thermometer that reads to 1 decimal point. If I take 100 readings and get a confidence interval of (10.25555, 10.25558). According to Sig Figs, I have to round this up to 10.3 because I am "only" certain of the first decimal. So, do I believe in statistics or non-sense?

Suppose I have a digital thermometer that reads to 1 decimal point, I take 100 readings, and I get a confidence interval of (9.0, 11.0). According to Sig Figs, I have 3 sig figs because I am "certain" to the first decimal. Stats says I can't confidently say whether the temperature is 9.x or 10.x. So, do I believe in statistics or non-sense?

If I go out and buy ten 10.00ml volumetric pipettes, the manufacturer will send me a "certificate of analysis" claiming the pipettes will deliver 10.00ml +/- 0.02ml. Suppose I conduct a Gage R&R study on the pipettes and find that some deliver 9.94mL +/- 0.05ml, some deliver 10.03mL +/- 0.01ml, and some deliver 10.00ml +/- 0.0001ml. How many sig figs do I have now? So, do I believe in statistics or non-sense?

If I design an experiment, get my regression model, and optimize it, will I get different results for my optimal solution if I use sig figs vs. the 15 digits the software uses? (Sometimes.) Can those different optimal solutions be very different? (Yes.) So, do I believe in non-sense?

    For those that want to defend non-sense, well....

    ------------------------------
    Andrew Ekstrom

    Statistician, Chemist, HPC Abuser;-)
    ------------------------------



  • 17.  RE: Significant figures

    Posted 11-03-2017 08:18

    I've also struggled with significant digits reporting.  Part of the problem, as Andrew's post illustrates, is that almost any rule can be defeated by artificial examples.  In many years of calculating CI, I've never seen one where a reasonable degree of rounding produced two endpoints both of which were outside the interval and on the same side!  


    By the way, with 100 observations on a thermometer producing a mean of around 10.25, I estimate that the CI would probably be around 10.24 to 10.26, and I would use those endpoints without rounding more than that.


I think the critical issue is how much precision is needed. For the pipettes example, if we want pipettes that deliver very close to 10, an error of 0.01 might not matter, but an error of 0.06 might. So we need sufficient precision to pick that up. If the thermometer data (with CI 10.25555 to 10.25558) were testing a hypothesis that this object would be warmer than the standard object (10.25 degrees) by at least 0.001, then maybe we really do need many significant digits!


    I often use (and somebody mentioned) the range of the variable as a guide. For example, with percentages that vary from 0 to 100%, it is often sufficient to round to a whole number, or maybe one decimal place. On the other hand, for percentages of success that only range from 80 to 90 across groups, I would probably keep at least 1 decimal place.  Some judgment required!


    Ed


    Ed J. Gracely, PhD
    Associate Professor
    Family, Community, & Preventive Medicine

    College of Medicine

    Associate Professor

    Epidemiology and Biostatistics

    Dornsife School of Public Health

    Drexel University
    2900 W. Queen Lane,
    Philadelphia PA, 19129

    Tel: 215.991.8466 
    | Fax: 215.843.6028
    Cell: 609.707.6965

    eg26@drexel.edu (egracely@drexelmed.edu forwards)
    drexelmed.edu  |  drexel.edu/publichealth






  • 18.  RE: Significant figures

    Posted 11-06-2017 09:50
If we think about what sig figs are supposed to do--give an idea of the accuracy, precision, and variability of a reading--sig figs don't do that. A confidence interval does.

What is the point of having 4 decimal places of sig figs if the CI is ?.???? +/- 0.0XXX? The CI says it's only accurate to 2 decimal places. Why do I need 4 decimal places? Why should I report 4 decimal places? 2-3 is enough.

    My whole argument is this: Use confidence intervals and common sense. Sig figs are stupid.


Though, I will admit that artificially truncating decimal places in a calculation led to one of my favorite areas of mathematics, chaotic systems. If it weren't for a computer carrying 6 decimal places but printing out only 3, we wouldn't know about chaotic systems, fractals, etc. (at least according to the story in my textbook).

    ------------------------------
    Andrew Ekstrom

    Statistician, Chemist, HPC Abuser;-)
    ------------------------------



  • 19.  RE: Significant figures

    Posted 11-03-2017 08:22
    "Suppose I have a digital thermometer that reads to 1 decimal point. If I take 100 readings and get a confidence interval of (10.25555, 10.25558). According to Sig Figs, I have to round this up to 10.3 because I am "only" certain of the first decimal. So, do I believe in statistics or non-sense?"

    Non-sense.

    The # of sig figs is the # of digits known with certainty + 1. If you had 3 sig figs and the value is 10.3, this indicates there is uncertainty in the first decimal place. But you have claimed you know the first decimal place with certainty. But you cannot both have 3 sig figs and know the first decimal place with certainty.

    Also, CI is based on SE, whereas precision reported in chemistry is usually +/- 1 SD.

    I say your claim is non-sense. Sorry.







  • 20.  RE: Significant figures

    Posted 11-03-2017 10:37
Some thoughts:

The average number of kids in a US household is 2.4, and yet the individual values are obviously whole numbers (0, 1, 2, 3, ...). Is this sense or non-sense?

The stats camp typically wants more digits, based on its concern about the impact on calculations, summary values, etc. That concern is around the mean (typically), not the measurement system that produced the individual values.

The measurement camp does not want to "over-sell" the precision of the method, so it (often) limits the digits.

So, the two camps have competing goals.


    ------------------------------
    Brad Evans
    Associate Director
    Pfizer, Inc
    ------------------------------



  • 21.  RE: Significant figures

    Posted 11-06-2017 08:59
For the last 30 years I have been a statistician, but my original training was as a physicist. As part of my physics training I had to take a course in error measurement and reporting. The primary perspective was measuring and reporting the maximum possible error, not the average error. I believe that is the crux of the debate here. A scientist tries to avoid being overconfident in their experimental results and what is reported. They ask the question: what if they did every measurement wrong in the same direction? On the other hand, a statistician is prepared to validate an erroneous theory or result every once in a while. I think a lack of a scientific perspective is part of the problem we are having with reproducibility.

    Sent from Jack's iPad




  • 22.  RE: Significant figures

    Posted 11-06-2017 12:31
The primary perspective was measuring and reporting the maximum possible error, not the average error. I believe that is the crux of the debate here. A scientist tries to avoid being overconfident in their experimental results and what is reported. They ask the question: what if they did every measurement wrong in the same direction?

    This is a crucial point. Significant figures are designed to deal with systematic in addition to random error. If every observation is systematically inflated (because the measurement device is miscalibrated--though within the bounds of the reported significant figures) then Andrew's confidence interval will be way off.


    ------------------------------
    Jonathan Christensen
    ------------------------------



  • 23.  RE: Significant figures

    Posted 11-07-2017 03:05
I fully agree that "lack of a scientific perspective is part of the problem we are having with reproducibility". More generally, I think that reporting of numbers should depend on the context and objectives. This is, I think, consistent with http://www.medicine.mcgill.ca/epidemiology/hanley/tmp/DescriptiveStatistics/06_mosteller_writing_about_numbers.pdf supplied by Ralph O'Brien (page 305: "there are no absolute rules, and good practice depends on what sort of document is being prepared"). Most of the views presented there still apply, although space is less expensive now in most journals.

In some contexts, including error measurement and reporting in physics, the rules based on significant digits are appropriate. In other contexts, including empirical work in most health sciences, other considerations may be more important. Most important among these is, I think, interpretability. Numbers should not be reported with more digits than is important for interpretation (perhaps with one more digit in some contexts). In particular, this usually implies that means and standard deviations should be reported with the same number of digits, because they have comparable interpretations. Concerning confidence intervals, there are cases with large data sets where the confidence limits are equal when reported with an interpretable number of digits. While I generally prefer reporting confidence intervals rather than standard errors, I think that in these cases it may be better to report the sample size, mean and standard deviation, and perhaps state that the confidence interval is very narrow due to the large sample size.

All this said, there are cases where much more precision is needed; see e.g. the same chapter, last paragraph of page 311/first paragraph of page 312 ("Tables can have two main purposes, ... report is not always necessary"), the paragraph "The discussion above ... for example" on page 312, and "Although fewer digits promote understanding, it is also true that many digits may be required for internal comparisons. [Calculation is a different matter, and is not treated here]" on page 314.

Increasingly, systematic reviews including meta-analyses are performed based on published work. In these analyses published numbers are used as a basis for further calculations, and if possible intermediate results in calculations should not be rounded. This may be solved by including electronic supplements in the articles where numbers are available without rounding. Even better, in accordance with the principles of transparency and reproducible research, data should be made publicly available when possible, but in the health sciences this is seldom possible due to patient privacy.

    ------------------------------
    Tore Wentzel-Larsen
    Norwegian Centre for Violence and Traumatic Stress Studies
    Regional Center for Child and Adolescent Mental Health, Eastern and Southern Norway






  • 24.  RE: Significant figures

    Posted 11-03-2017 18:10
    Andrew,



    I'm confused about your argument, because your first example is not actually possible. But I think it's useful to explain why I say that, because it illustrates why the precision of input measurements really matters.



    Your example was: .... a digital thermometer that reads to 1 decimal point. If I take 100 readings and get a confidence interval of (10.25555, 10.25558). According to Sig Figs, I have to round this up to 10.3 ....



    But how could this case ever occur? The suggested confidence interval is equivalent to:

    mean +/- (2 x standard error), where mean = 10.255565, and the standard error equals 0.0000075.

    Since n = 100, the full population's standard deviation sigma must = 0.0000075 x sqrt(100) = 0.000075



    So if the population's roughly normal, virtually every population value is likely to fall in the range from mean +/- (5 x sigma), i.e. from 10.25519 to 10.25594. If a digital thermometer reads this data to one decimal point, then every single reading would be 10.3; the proposed confidence interval of (10.25555, 10.25558) could never be observed.
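The arithmetic above is easy to verify directly; a quick check, using only the numbers quoted in the example:

```python
# Back out the implied SE and population SD from the quoted interval.
n = 100
lo, hi = 10.25555, 10.25558

mean = (lo + hi) / 2     # midpoint of the CI: 10.255565
se = (hi - lo) / 4       # CI ~ mean +/- 2*SE, so SE = width/4 = 0.0000075
sigma = se * n ** 0.5    # SD = SE * sqrt(n) = 0.000075

# Nearly all of a roughly normal population falls within mean +/- 5*SD.
print(round(mean - 5 * sigma, 5), round(mean + 5 * sigma, 5))
# 10.25519 10.25594
```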



    I hadn't thought of it quite this way before, but now I see why my very pragmatic-oriented electronics teacher that I wrote about earlier was so adamant against displaying multiple extra digits in final answers. He wasn't the sort to debate about the theoretically "best" numbers of digits. But the above case highlights that if the measurement device could never obtain or confirm a claim that depends on all those extra digits, something's gone awry.









  • 25.  RE: Significant figures

    Posted 11-08-2017 10:38
Observing a (discrete) random element in 0.1*{non-negative integers} and modeling it as a (continuous) random variable does sound problematic. BTW, what is the statistical methodology for fixed-precision random elements? Is it "sig figs" or something else?






  • 27.  RE: Significant figures

Posted 11-09-2017 08:44
    I would like to thank everyone for their contributions to this discussion. I especially appreciate all the suggestions about reporting data in tables. Allow me to narrow the focus a bit.

    The reason the notion of significant figures is being discussed internally deals with how data should be recorded in a database. Some data, like viscosity, can be in the thousands, whereas other measurements might be on a scale of 0 to 1. Let's take the example of pH. If a pH meter is good to one decimal place, but the SOP calls for 3 measurements to be taken and the average recorded, then according to my original post, my suggestion is to record it to 2 decimal places. If it is also desired to record the standard deviation of those readings (not very likely, but some people seem attached to reporting standard deviations), my suggestion would be to record one more decimal place than for the average.
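A minimal sketch of that recording rule (the pH readings here are invented, purely for illustration):

```python
# Hypothetical pH example: the meter is good to 1 decimal place and the
# SOP averages 3 readings; record the average to 2 decimal places and
# the standard deviation to 3.
import statistics

ph_readings = [7.2, 7.4, 7.3]  # three meter readings, 1 decimal place each

avg = statistics.mean(ph_readings)
sd = statistics.stdev(ph_readings)

print(f"recorded average: {avg:.2f}")  # recorded average: 7.30
print(f"recorded SD:      {sd:.3f}")   # recorded SD:      0.100
```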

What I was wondering is whether there is any advice or rule of thumb in the literature similar to what I have just outlined.

    Once again--thank you all!

    Bruce

    ------------------------------
    Bruce White
    Statistician
    Ecolab
    ------------------------------



  • 28.  RE: Significant figures

    Posted 11-10-2017 09:57

    Hmmm, do you think you might need one rule for pH and a different rule for viscosity?







  • 29.  RE: Significant figures

    Posted 11-10-2017 11:32
    If we are just talking about databases, extra digits can't hurt you.  And, with averages of three, you might want two or more extra digits to reduce round-off error.

    ------------------------------
    Emil M Friedman, PhD
    emilfriedman@gmail.com
    http://www.statisticalconsulting.org
    ------------------------------



  • 30.  RE: Significant figures

    Posted 11-10-2017 11:34
    On second thought, it might be better to record all three readings.  That will reduce the likelihood of computational mistakes and will also help detect odd-ball readings.

    ------------------------------
    Emil M Friedman, PhD
    emilfriedman@gmail.com
    http://www.statisticalconsulting.org
    ------------------------------



  • 31.  RE: Significant figures

    Posted 11-11-2017 07:48
    Bruce,

If you are taking averages of 3 measurements, I would record all measurements to the full precision reported by the instrument and then use the average, not rounded at all. You would round once you are done with all calculations involving those measurements, not for intermediate calculations; by rounding in intermediate calculations you introduce error. The average of x, y, z is just that; it is not that average rounded to sig figs.

    The SD could be useful, but most analyses won't incorporate it. Instead, just use the single average, unrounded, as a single measurement.






  • 32.  RE: Significant figures

    Posted 11-12-2017 09:27
I am not sure one can find any definitive rule about reporting means and SDs/SEs.  In my fifty years of research, teaching, and consulting, we always espoused what you are doing; i.e., report one more decimal place than the raw data for the mean, and one more than that for the variance/SD/SE.  Some people seem to forget or ignore the fact that when we do hypothesis testing (like it or not, it will be around), we are actually testing the equality of distributions.  The mean and SD are the two parameters used in most cases, assuming approximate normality.  Bruce, I would not change any part of your practice regarding these significant figures.

    Ajit K. Thakur, Ph.D.
    Retired Statistician 





  • 33.  RE: Significant figures

    Posted 11-13-2017 11:21
    I think the answer goes to the definition of "significant" (compared to importance, interpretability, clinical significance, statistical significance,...).
Suppose weight (kg) is measured to 1 DP, with mean (SD) of 78.23 (3.872).
    It is a stretch to think that any human will interpret this as anything but 78.2 (3.9).
    Only the first DP is "significant" for interpretation.
    Thus 78.23 (3.87) should satisfy all of the "sig fig" people.
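The rounding described above, checked directly (numbers taken from the example):

```python
# Report the weight mean (SD) to the one "interpretable" decimal place.
mean, sd = 78.23, 3.872

print(f"{mean:.1f} ({sd:.1f})")  # 78.2 (3.9)
```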

    David

    --
    David R. Bristol, PhD
    President, Statistical Consulting Services, Inc.
    1-336-293-7771





  • 34.  RE: Significant figures

    Posted 11-14-2017 08:24
    I'm surprised this thread has gone on so long.  Some participants have been quick to ridicule the practice, which is a shame, because it's often unwise to be too dismissive of others.

Some have pointed out that it has to do with physical sciences that rely on measuring devices of finite precision.  The physical scientist wants to share meaningful results, even though the results are based on math applied to such imprecise measurements.

I remember that in Chemistry classes, I learned that it's misleading to use a device that measures to the precision of a tenth of a millimeter, divide 12.34 millimeters by 3 (for example), and casually report a result of 4.113333 millimeters.  A reader who cares about the difference between 4.113333 and 4.113335 millimeters shouldn't take your result seriously, because your device is only good at telling the difference between 4.1 and 4.2, or 4.0 and 4.1.  You are misleading the reader about the precision of your results when using so many digits without comment.
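That chemistry-class practice can be sketched as a small helper; `round_sig` is a made-up illustration for this thread, not a standard library function:

```python
import math

def round_sig(x: float, sig: int) -> float:
    """Round x to `sig` significant figures."""
    if x == 0:
        return 0.0
    # Shift the rounding position by the magnitude (order) of x.
    return round(x, sig - 1 - math.floor(math.log10(abs(x))))

# 12.34 mm carries 4 significant figures, so report the quotient to
# 4 sig figs rather than every digit the calculator shows.
print(round_sig(12.34 / 3, 4))  # 4.113
```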

    Personally, I think it would be fine to show all the digits if you document your workflow.  These days that's becoming more acceptable as reproducible research gains favor.  I'd go so far as to say that the research results should state measuring device specifications and the software versions used for calculation on collected data.

    Googling for "chemistry significant figures" finds several nice references.  The one below says that the significant figures are the ones that are "meaningful."

    If the use of significant figures is an attempt to avoid overstating the precision of results, I'm not sure how chaos theory can be used to demonstrate that it's misguided.  (A couple of posts sounded like, "Significant figures?  Hello?  Chaos theory exists!  QED.")  I'd be happy to learn, though, if there's a logical argument to be made there.

    Significant Figure Rules (PSU): "There are three rules on determining how many significant figures are in a number: Non-zero digits are always significant. Any zeros between two significant digits are significant. A final zero or trailing zeros in the decimal portion ONLY are significant. Focus on these rules and learn them well."


    ------------------------------
    Edward Cashin
    Research Scientist II
    ------------------------------



  • 35.  RE: Significant figures

    Posted 11-15-2017 08:28

    Edward,

     

    I had read those rules from PSU before.  To be more accurate, I had visited that page before and skimmed those rules.  This time I read the addition/subtraction and multiplication/division rules more closely.  My original question in this thread was about means and standard deviations.  I have operated under the rule of thumb of reporting the mean to one more decimal place than the raw data and the standard deviation to one more place than the mean.  Looking at the PSU rules, however, I don't think I should do that; I should report both to the same precision as the raw data.

     

    That being said, since my follow-up question had to do with recording data in a database, the suggestion to store all the raw numbers and not the average of X number of readings was a good one that I will promote.

     

    Thank you again to everyone who has participated in this discussion! I found it very useful and enlightening!

     

    Best Regards,

    Bruce White

     

    Statistician

    (651) 795-6534

    Computare in aeternum

    Bruce.white@ecolab.com

     

    CONFIDENTIALITY NOTICE: This e-mail communication and any attachments may contain proprietary and privileged information for the use of the designated recipients named above. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.





  • 36.  RE: Significant figures

    Posted 11-16-2017 16:53

    When I was working with a land surveyor many years ago, there was another wrinkle on the significant figures rule.  If you were dividing by a pure number, such as when averaging a large number of lower-significant-figure measurements, they felt you could actually increase your available significant figures.  This has been a time-honored technique in that field, and some of the early surveys improved their accuracy by "duplicating" their measurements a large number of times.  One would have to understand the theodolite to understand how this could be done and be justified, but I think the logic is reasonable.  The combined error was assumed to increase by the square root of the sum of the squares of the round-off errors.  If you added two numbers, each with a possible round-off error of 0.5, then the sum would have an error of SQRT(2)*0.5 = 0.7071...  If you divided that sum by the pure number 2 to get the average, you would have reduced the error to 0.3536.  If one averaged 100 numbers this way, conceivably, one could gain another whole digit of significance.  This is land-surveyor logic, not statistician logic, but this part works like the standard error of the average of a set of measurements.  While we might want a 95% confidence interval, they were content with something more like a 68% confidence interval.
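
    The square-root logic above can be sketched numerically in Python (a rough illustration, assuming independent round-off errors of plus or minus 0.5 in the last recorded digit):

```python
# Sketch of the surveyor's argument: summing n readings grows the combined
# round-off error like sqrt(n) * 0.5, and dividing by the pure number n
# shrinks the average's error to 0.5 / sqrt(n).
from math import sqrt

def roundoff_error_of_mean(n, unit_error=0.5):
    """Combined round-off error of the average of n equally precise readings."""
    return sqrt(n) * unit_error / n  # equals unit_error / sqrt(n)

print(round(roundoff_error_of_mean(2), 4))    # 0.3536, as in the worked example
print(round(roundoff_error_of_mean(100), 4))  # 0.05: roughly one digit gained
```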


    Raoul Burchette

    Biostatistician III

    Research and Evaluation, SCPMG

    100 S. Los Robles Avenue

    Pasadena, CA 91101

    message phone:  626-564-3471 (8.338.3471)

    email:  Raoul.J.Burchette@kp.org


    NOTICE TO RECIPIENT:  If you are not the intended recipient of this e-mail, you are prohibited from sharing, copying, or otherwise using or disclosing its contents.  If you have received this e-mail in error, please notify the sender immediately by reply e-mail and permanently delete this e-mail and any attachments without reading, forwarding or saving them.  Thank you.






  • 37.  RE: Significant figures

    Posted 11-17-2017 04:04
    Given two identical measurements of 5.67, each with 3 sig figs, if I average them, I first sum them to 11.34.  The sum has 4 sig figs according to the addition/subtraction sig figs rule.  I then divide this by 2, an exact number, so the average also has 4 sig figs: 5.670, which is more sig figs than the original measurements.  So the average can have more sig figs than the measurements according to the standard sig figs rules.






  • 38.  RE: Significant figures

    Posted 11-15-2017 09:32
    I (and lots of others) have written about this topic - four refs below


    Chapter 10 in Visual Revelations: Graphical Tales of Fate and Deception from Napoleon Bonaparte to Ross Perot (2nd edition). Hillsdale, NJ: Lawrence Erlbaum Associates, 2000.

    Rounding Tables. Chance, 11(1), 46-50, 1998.

    Extracting Sunbeams from Cucumbers. (with R. Feinberg)  Journal of Computational and Graphical Statistics Dec 2011, Vol. 20, No. 4: 1-18.

    https://www.significancemagazine.com/science/297-extracting-sunbeams-from-cucumbers


    But the short version is that it is rarely useful to have more than three digits.

    This is for three reasons:

    1. Humans can't understand long strings of numbers. Walmart's corporate revenue in 2010 was $256,317,134,212, but if that number appeared on a screen in a corporate report, you'd just say "$256 billion."
    2. We almost never care about accuracy of more than three digits: if you round to $256 billion, the percent error introduced by rounding is about 0.1%.
    3. We can only rarely justify more than three digits of accuracy statistically:

    Suppose we calculate a correlation to be 0.7654. What sample size would justify reporting this number to this level of precision? We would want the standard error to be less than 0.00005, for it is only then that we have a reasonable chance for the last digit to be a 4 and not a 3 or a 5. The standard error of a correlation is proportional to 1/SQRT(n), and so doing the algebra yields 0.00005 = 1/SQRT(n), or SQRT(n) = 1/0.00005 = 20,000, so
    n = (20,000)² = 400 million.

    I leave as an exercise for the reader to calculate the sample size necessary to justify reproducing a correlation to more than one decimal place (hint: n > 400). 
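
    The arithmetic above can be sketched in Python, under the stated approximation that the standard error of a correlation is about 1/sqrt(n) (the function name is mine, for illustration):

```python
# Sample size needed so that SE(r) ~ 1/sqrt(n) is half a unit in the k-th
# decimal place, i.e. 1/sqrt(n) = 0.5 * 10**(-k), hence n = (2 * 10**k)**2.
def n_for_decimal_places(k):
    return (2 * 10 ** k) ** 2

print(n_for_decimal_places(4))  # 400000000 -- the 400 million in the text
print(n_for_decimal_places(1))  # 400 -- the hint for the reader's exercise
```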


    The Nobel Laureate economist George Stigler once said that economists produce the GDP to the nearest dollar to show they have a sense of humor.








  • 39.  RE: Significant figures

    Posted 11-16-2017 11:33
    John Hartigan once said to me, in reference to a well-known statistics package, "It's magic. You put in a few single-digit numbers and you get out a huge pile of 15-digit numbers." 





  • 40.  RE: Significant figures

    Posted 11-17-2017 09:40
    It may be of interest to think about why statistics packages do this.
    My own theory is that getting answers like this is an anachronism left over
    from FORTRAN FORMAT statements that used a general F12.5 output for everything, so as not to miss anything.

    For younger readers who never heard of FORTRAN or its FORMAT statements, ask your parents (grandparents?).






  • 41.  RE: Significant figures

    Posted 11-16-2017 13:00
    Howard, I quite like the cucumbers article: clear in process and in the message of the content. 

    I'm curious: in hindsight, might it have been more insightful to make the rightmost column in table 6 a rate, too?  Perhaps not; perhaps there are too many additional factors (density and distribution of population centers, etc.) that would make office rates no more insightful than a count of offices, and the count is certainly a number people can relate to.

    Bill

    ------------------------------
    Bill Harris
    Data & Analytics Consultant
    Snohomish County PUD
    ------------------------------



  • 42.  RE: Significant figures

    Posted 11-18-2017 15:26
    Hi Bill,
    I used just the number of Planned Parenthood offices first because the result is so dramatic: the relationship between that number and the incidence of STDs is interocular (it hits you between the eyes)!  Any further refinement would be painting the flower.  But such refinements are easy to think about (e.g., how long a drive it is to get reproductive help in Texas).  I'm glad you liked the example.  It makes clear how well a tabular presentation can work if you work at it, and how opaque it can be if you don't.
    HW

    ------------------------------
    Howard Wainer
    Extinguished Research Scientist
    ------------------------------



  • 43.  RE: Significant figures

    Posted 11-15-2017 12:07
    It was a lot easier back when we used slide rules for everything.

    ------------------------------
    Emil M Friedman, PhD
    emilfriedman@gmail.com
    http://www.statisticalconsulting.org
    ------------------------------