ASA Connect

  • 1.  500-year flood

    Posted 09-05-2017 12:20
    In the Wednesday, September 30 edition of the NY Times, there appeared an article about the Houston flooding in which the author refers to the term "500-year event." It was defined as follows: "A 500-year event has a 1 in 500 chance of occurring in a single year." Needless to say, I was a bit puzzled. I could not envision what probability model was used. One colleague referred me to the Wiki site (among others), which defined the probability from an empirically defined flood flow rate in Germany. The model presented included the assumption of a constant parameter which was later described as not actually constant. No one else in my department seems to know what model was employed.
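
    To make the quoted definition concrete, here is a minimal sketch of the arithmetic it implies. It assumes (and these are assumptions, not something the article states) that years are independent and that the annual probability stays constant at 1/500; the function name is made up for illustration.

```python
def prob_at_least_one(annual_prob: float, years: int) -> float:
    """Chance of one or more exceedances in `years` years.

    Assumes independent years and a constant annual probability.
    """
    return 1.0 - (1.0 - annual_prob) ** years


p = 1 / 500  # the "1 in 500 chance in a single year" from the quoted definition
for horizon in (1, 30, 100, 500):
    print(f"{horizon:>3} years: {prob_at_least_one(p, horizon):.3f}")
```

    Under those assumptions, a "500-year event" has roughly a 6% chance of happening at least once in 30 years, and only about a 63% chance of happening at least once in 500 years.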

    So I am asking the statistical community: What probability model is - or can be - used to describe such events?

    Collegially,
    Gene Fisch, Adj. Prof.
    CUNY/Baruch College


  • 2.  RE: 500-year flood

    Posted 09-06-2017 02:25
    Hi Gene,

    Five-Thirty-Eight posted last week a detailed discussion of the concept and some of the assumptions used for the model. I've been pondering whether to bring it up in class when covering probability models.
    It's Time To Ditch The Concept Of '100-Year Floods'
    Photos of water-covered neighborhoods and families riding floating refrigerators to safety have made clear the scale of Hurricane Harvey's wrath. But the risks that coastal Texans faced before the storm hit - and the probability that others will be dealt a similar fate - are still a confusing mess.


    ------------------------------
    Brigitte Baldi
    Lecturer
    University of California, Irvine
    ------------------------------



  • 3.  RE: 500-year flood

    Posted 09-06-2017 12:03
    I don't know how the NYTimes got to a 500-year flood; however, a very good description of the distributions of extremes can be found in Gumbel's book. The basic mathematical idea is that if F(t) is the distribution function of time to event, then the distribution function of the maximum time to event for n independent events is [F(t)]^n. Tippett's three asymptotes of extremes describe the three solutions to this when we know E(t) and some measure of spread. One solution requires that there be an essential supremum, and so that is out. Under the asymptote usually used, the distribution of the extreme is a three-parameter model whose parameters can be estimated from observed samples. Then, it is a simple matter to find a value of t associated with an appropriately low probability.
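
    The [F(t)]^n identity can be checked numerically. The sketch below is only an illustration: it assumes a unit-rate exponential for F (chosen for convenience, not taken from Gumbel's book), and the function names are hypothetical.

```python
import math
import random


def exponential_cdf(t, rate=1.0):
    """F(t) for an exponential distribution (an illustrative choice of F)."""
    return 1.0 - math.exp(-rate * t) if t > 0 else 0.0


def max_cdf(base_cdf, n):
    """CDF of the maximum of n independent draws from F: t -> F(t)**n."""
    return lambda t: base_cdf(t) ** n


def simulated_max_cdf(t, n, trials=20000, seed=1):
    """Fraction of simulated maxima (n unit-rate exponential draws each) that are <= t."""
    rng = random.Random(seed)
    hits = sum(
        max(rng.expovariate(1.0) for _ in range(n)) <= t
        for _ in range(trials)
    )
    return hits / trials


n, t = 50, 4.0
theory = max_cdf(exponential_cdf, n)(t)
empirical = simulated_max_cdf(t, n)
print(f"theory={theory:.4f}  simulated={empirical:.4f}")
```

    The two numbers should agree up to simulation noise; the extreme value asymptotes describe what [F(t)]^n looks like, after rescaling, as n grows.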

    David Salsburg





  • 4.  RE: 500-year flood

    Posted 09-06-2017 15:10
    The statistical foundation of 100 and 500 year floods is based on
    extreme value theory. There is an alternative approach based on fractal
    dimensions. A good starting point for this topic is Wikipedia:

    https://en.wikipedia.org/wiki/100-year_flood

    and a nice introduction to fractals and floods is at

    nvlpubs.nist.gov/nistpubs/jres/099/jresv99n4p377_A1b.pdf

    A good recent article that talks about both the statistics and the
    physical measurements that are associated with these calculations is

    https://fivethirtyeight.com/features/its-time-to-ditch-the-concept-of-100-year-floods/

    I'm not a big fan of Nassim Nicholas Taleb and his black swan arguments,
    but his perspective would be valuable in studying recent events like
    Hurricane Harvey.

    --
    Steve Simon, mail@pmean.com
    I'm blogging now! blog.pmean.com




  • 5.  RE: 500-year flood

    Posted 09-07-2017 03:18
    What you are getting at is a reason we, as ASA members, statisticians, data scientists et al, need to get out of our offices and into the offices of scientists and engineers. 

    If you look through some of the ideas, assumptions and math models used in hydrogeology, you find that they probably make sense. When you look at how they perform a sensitivity analysis, they will change 1 parameter at a time by adding and subtracting some nominal value, say X1 +/- 0.05X1. There might be 5 parameters they can change. In such a case, they will make 11 models. One where everything is at the measured value, then change one value at a time as I described above. I had my hydrogeology prof tell me my Monte Carlo simulation was wrong because I changed many things at the same time. After all, what does an operations researcher know about simulations and stats?
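
    For anyone who wants to see the difference between the two designs, here is a rough sketch. The parameter names and values are hypothetical, and drawing each parameter uniformly within +/- 5% is just one assumed way to run the Monte Carlo version.

```python
import random


def one_at_a_time(baseline, rel_step=0.05):
    """Baseline run plus a +/- rel_step change to each parameter on its own."""
    runs = [dict(baseline)]
    for name, value in baseline.items():
        for sign in (+1, -1):
            run = dict(baseline)
            run[name] = value * (1 + sign * rel_step)
            runs.append(run)
    return runs


def monte_carlo(baseline, rel_step=0.05, n_runs=1000, seed=0):
    """Vary every parameter at once, each drawn uniformly within +/- rel_step."""
    rng = random.Random(seed)
    return [
        {name: value * (1 + rng.uniform(-rel_step, rel_step))
         for name, value in baseline.items()}
        for _ in range(n_runs)
    ]


def n_changed(run, baseline):
    """How many parameters differ from their baseline value in this run."""
    return sum(run[name] != baseline[name] for name in baseline)


# Hypothetical parameters, for illustration only.
baseline = {"x1": 10.0, "x2": 2.5, "x3": 0.3, "x4": 120.0, "x5": 7.0}

oat_runs = one_at_a_time(baseline)
mc_runs = monte_carlo(baseline)
print(len(oat_runs))                                  # 1 baseline + 2 * 5 runs
print(max(n_changed(r, baseline) for r in oat_runs))  # never more than one change
print(max(n_changed(r, baseline) for r in mc_runs))   # several parameters move together
```

    The one-at-a-time design never shows what happens when two parameters move together, so it cannot reveal interactions; the Monte Carlo runs can.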

    It wouldn't surprise me to find out that someone, somewhere, with little knowledge of stats, made a model based upon a normal distribution, fit it to water levels, rainfall levels, etc., from say 70 years of data collection (often less than that), and misused some stats test to determine "Rainfall greater than Y.yy inches will occur about once every 100 seasons. Rainfall greater than Z.zz will occur once every 500 years..."

    If you want to know what model they are using, you might want to look through an intro to stats book. Probably algebra based. And most likely, they resented having to take such a class too. Are all geologists going to be that bad? No. But will most of them use "statistical" arguments to tell you, the statistician, that everything you learned in your academic career is wrong? Probably. Using textbooks written by statisticians as "proof" won't work. Cuz, "It's all theoretical nonsense".

    Keep in mind, there is a lot of debate in academic science about the mere existence of multiple linear/logistic/Poisson regressions. Most don't believe they exist. For those that do, there is debate about why you would ever use one.

    ------------------------------
    Andrew Ekstrom

    Statistician, Chemist, HPC Abuser;-)
    ------------------------------



  • 6.  RE: 500-year flood

    Posted 09-07-2017 10:47
    Here's some background and links to current methods for calculating annual exceedance probabilities for floods:

    The U.S. Water Resources Council Hydrology Committee published a report in 1967 titled A Uniform Technique for Determining Flood Flow Frequencies, commonly known as Bulletin 15 (Water Resources Council, 1967). Since then the Bulletin has been revised several times (Bulletin 17 - 1976, Bulletin 17A – 1977, Bulletin 17B - 1982, and Bulletin 17C - in draft 2017) and software has been developed to assist with analysis (England Jr. and others, [in editorial]; Flynn and others, 2006). In the most recent Bulletin, as with the previous bulletins, many of the flood-frequency methods rely on the assumption that the data are stationary, independent and identically distributed, and lack any long-term persistence or autocorrelation. For some watersheds, these assumptions are acceptable and the methods suggested in Bulletin 17C are sufficient; for others with nonstationarities (caused by natural climate variability, land-use change, regulation, or anthropogenic climate change), the methods suggested in Bulletin 17C may result in incorrect conclusions. As more information becomes available through historical and paleo records, there have been questions about whether the hydrologic system ever was stationary and whether some apparent nonstationarities are simply the result of long-term persistence or autocorrelation (Cohn and Lins, 2005). Methods for flood-frequency analysis under nonstationary conditions are a big topic in hydrologic research right now.

    The recommended draft of Bulletin 17C is available here: Subcommittee on Hydrology - Bulletin 17C. It contains some information on the history of flood-frequency analysis methods in the US and includes the mathematics of current methods. This and the references within it would be a good place to start in understanding flood probability calculations. To go back further in history see Rumsey (2015).

    Bulletin 15 recommended use of the Pearson Type III distribution with log transformation of the data (log-Pearson Type III distribution, often referred to as LPIII) as a base method for flood-frequency studies. The LPIII distribution is widely used in hydrology, particularly in the United States, where it is used in Federal guidance (England Jr. and others, [in editorial]; Interagency Advisory Committee on Water Data, 1982; Water Resources Council, 1967). Applications of the LPIII distribution use systematically collected and historical peak-streamflow values to define a frequency distribution based on the sample mean (location), standard deviation (scale), and skew (shape). Bulletin 17C methods estimate those distribution parameters using the expected moments algorithm (EMA; Cohn and others, 1997) and can incorporate additional information, such as flood interval estimates (acknowledging the uncertainty of estimates in flood magnitudes) and threshold estimates based on paleo data.
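
    As a rough illustration of the mean/standard deviation/skew idea, the sketch below fits LPIII by plain sample moments of the base-10 logs and uses the Wilson-Hilferty approximation for the Pearson Type III frequency factor. It is not the expected moments algorithm and does not follow the Bulletin procedures (no historical information, no skew weighting, no low-outlier handling). The peak flows are invented numbers and the function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def sample_skew(values):
    """Bias-adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - m) / s)**3)."""
    n = len(values)
    m, s = mean(values), stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in values)


def lp3_quantile(peaks, annual_exceedance_prob):
    """Flow with the given annual exceedance probability under a moment-fitted LPIII."""
    logs = [math.log10(q) for q in peaks]
    m, s, g = mean(logs), stdev(logs), sample_skew(logs)
    z = NormalDist().inv_cdf(1.0 - annual_exceedance_prob)
    if abs(g) < 1e-6:
        k = z  # zero skew reduces to the lognormal case
    else:
        # Wilson-Hilferty approximation to the Pearson Type III frequency factor
        k = (2.0 / g) * ((1.0 + g * z / 6.0 - g * g / 36.0) ** 3 - 1.0)
    return 10 ** (m + k * s)


# Hypothetical annual peak flows (arbitrary units), NOT real gauge data.
peaks = [12000, 8500, 15300, 22100, 9800, 30500, 11200, 17600, 14100, 45200,
         7600, 19900, 13300, 26800, 10400, 16200, 21000, 9100, 12800, 35700]

for aep in (0.5, 0.01, 0.002):
    print(f"AEP {aep}: {lp3_quantile(peaks, aep):,.0f}")
```

    With only 20 values, the 0.002 quantile is an extrapolation far beyond the data, which is one reason the real methods bring in historical and paleo information.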

    Other statistical distributions are applicable to flood-frequency analysis, such as the generalized extreme value (GEV) distribution. See Asquith and others (2017) for more about probability distributions and parameter estimation methods.
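
    For the GEV, the flood magnitude for a given annual exceedance probability follows directly from inverting the distribution function. The sketch below assumes the sign convention in which a positive shape gives a heavy upper tail (some hydrology texts use the opposite sign), and the parameter values are made up for illustration.

```python
import math


def gev_cdf(x, loc, scale, shape):
    """GEV distribution function, F(x) = exp(-(1 + shape*z)**(-1/shape)), z = (x - loc)/scale."""
    z = (x - loc) / scale
    if abs(shape) < 1e-12:
        return math.exp(-math.exp(-z))  # Gumbel limit
    t = 1.0 + shape * z
    if t <= 0.0:
        return 0.0 if shape > 0 else 1.0  # outside the support
    return math.exp(-t ** (-1.0 / shape))


def gev_quantile(annual_exceedance_prob, loc, scale, shape):
    """Value exceeded with the given probability in any one year."""
    y = -math.log(1.0 - annual_exceedance_prob)
    if abs(shape) < 1e-12:
        return loc - scale * math.log(y)  # Gumbel limit
    return loc + scale * (y ** (-shape) - 1.0) / shape


# Hypothetical parameters, for illustration only.
loc, scale, shape = 100.0, 30.0, 0.1
for aep in (0.01, 0.002):
    print(f"AEP {aep}: {gev_quantile(aep, loc, scale, shape):.1f}")
```

    The 0.01 and 0.002 values are what would be called the "100-year" and "500-year" magnitudes under this fitted model.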

    Asquith, W.H., Kiang, J.E., and Cohn, T.A., 2017, Application of at-site peak-streamflow frequency analyses for very low annual exceedance probabilities: U.S. Geological Survey Scientific Investigations Report 2017–5038, 93 p., accessed at https://doi.org/10.3133/sir20175038.


    Cohn, T.A., Lane, W.L., and Baier, W.G., 1997, An algorithm for computing moments-based flood quantile estimates when historical flood information is available: Water Resources Research, v. 33, no. 9, p. 2089–2096.

    Cohn, T.A., and Lins, H.F., 2005, Nature's style: Naturally trendy: Geophysical Research Letters, v. 32, no. 23, p. L23402.

    England Jr., J.F., Cohn, T.A., Faber, B.A., Stedinger, J.R., Thomas Jr., W.O., Veilleux, A.G., Kiang, J.E., and Mason, R.R., [in editorial], Guidelines for determining flood flow frequency --- Bulletin 17C --- April 10, 2017 Draft: U. S. Geological Survey 4–BXX, accessed August 8, 2017, at https://acwi.gov/hydrology/Frequency/b17c/Bulletin17C-draft-for-USGS-EditorialReview-10Apr2017.pdf.

    Flynn, K.M., Kirby, W.H., and Hummel, P.R., 2006, User's Manual for Program PeakFQ Annual Flood-Frequency Analysis Using Bulletin 17B Guidelines: U.S. Geological Survey Book 4, 42 p., accessed January 23, 2017, at https://pubs.usgs.gov/tm/2006/tm4b4/.

    Interagency Advisory Committee on Water Data, 1982, Bulletin 17B: Guidelines for determining flood flow frequency, https://water.usgs.gov/osw/bulletin17b/dl_flow.pdf.

    Rumsey, B., 2015, From flood flows to flood maps: the understanding of flood probabilities in the United States: Historical Social Research, v. 40, no. 2, p. 134–150.

    Water Resources Council, 1967, A Uniform Technique for Determining Flood Flow Frequencies: U.S. Water Resources Council Hydrology Committee Bulletin No. 15, accessed January 23, 2017, at https://water.usgs.gov/osw/bulletin17b/Bulletin_15_1967.pdf.



    ------------------------------
    Karen Ryberg
    Research Statistician
    U.S. Geological Survey
    ------------------------------



  • 7.  RE: 500-year flood

    Posted 09-08-2017 10:58
    I have had only one experience discussing statistics with a geologist. It was quite a few years ago, but to the best of my recollection, the geologist was asking for help with a paper he was reading that used a lot of statistics. He was very naive statistically, but very willing to try to learn (although somewhat slow at catching on). On the basis of that, I agree with Andrew that statisticians need to engage in a lot of outreach with geologists (as well as other scientists and engineers), but also believe that we need to be prepared to (at least in some cases) start from square one and exercise patience and tact in doing so.

    And, while I'm at it: we need to look at high school curricula to be sure that three-dimensional analytical geometry is included in courses for students headed toward science and engineering, since the ability to visualize z changing as x and y simultaneously change is important in getting past the mind-set of "only change one thing at a time."

    ------------------------------
    Martha Smith
    University of Texas
    ------------------------------