ASA Connect

Building Size-Stratified Business Survival Tables from BLS and BDS Data: Methodology and Open Questions

    Posted yesterday

    The Problem

    Building actuarial survival tables for privately held business valuation requires two things that no single public data source provides together: longitudinal cohort tracking (to measure true survival from birth) and size stratification (to distinguish survival rates for a five-employee firm from a fifty-employee firm).

    The BLS Business Employment Dynamics (BED) Table 7 provides excellent cohort tracking - it follows employer establishments from birth through exit via unemployment insurance records, producing observed survival rates at years 1 through 20 across up to thirty-one birth cohorts. But it is published only at the two-digit NAICS sector level. No size stratification is available.

    The Census Bureau Business Dynamics Statistics (BDS) provides size stratification - annual entry and exit counts by sector, employment size band, and firm age. But it is cross-sectional, not longitudinal. It does not track cohorts from birth. Using BDS exit rates for older firm-age categories to construct survival curves biases the resulting survival estimates upward: the age 11–15 and 16–20 firm-age groups in the BDS are already survivor-filtered populations. The most fragile firms exited long ago, leaving a progressively more durable pool. Exit rates computed from these groups understate true long-run mortality because the denominator is a survivor cohort, not a birth cohort.
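    The survivor-filtering effect can be illustrated with a deliberately simple sketch. It assumes a birth cohort made of two firm types with constant annual exit probabilities; the 50/50 mix and the 30% and 5% rates are hypothetical numbers chosen for illustration, not BDS or BLS figures.

```python
def pooled_exit_rate_by_age(shares, hazards, max_age):
    """Annual exit rate of the surviving pool at each age for a mixed birth cohort.

    shares  : share of each firm type at birth (should sum to 1)
    hazards : constant annual exit probability of each firm type
    """
    alive = list(shares)
    rates = []
    for _ in range(max_age):
        exits = sum(a * h for a, h in zip(alive, hazards))
        rates.append(exits / sum(alive))  # exit rate among firms still alive
        alive = [a * (1.0 - h) for a, h in zip(alive, hazards)]
    return rates


# Hypothetical mix: half fragile firms (30% annual exit), half durable (5%).
rates = pooled_exit_rate_by_age([0.5, 0.5], [0.30, 0.05], max_age=20)
print(f"exit rate at age 0:  {rates[0]:.3f}")
print(f"exit rate at age 15: {rates[15]:.3f}")
```

    No individual firm becomes safer in this toy model, yet the pooled exit rate falls with age because the fragile type drains out of the denominator. That is the composition effect described above.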


    The Approach We Used

    The methodology treats the two data sources as complementary rather than substitutable:

    BLS provides the absolute survival level and hazard shape. Sector-average survival curves were constructed by averaging BLS cohort data across all available birth cohorts at each year (up to 31 cohorts for year 1, 12 cohorts for year 20). A two-parameter Weibull was fitted to the observed averages; the fitted shape parameters of 0.63–0.71 across sectors are all below 1, which implies a declining hazard - survivors become progressively more durable. These curves are the foundation.
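    The post does not say how the Weibull was fitted, so the following is only one common option: least squares on the linearized form ln(-ln S(t)) = k ln t - k ln(scale). The function name `fit_weibull` is hypothetical, and the input curve is synthetic (generated from a known Weibull), not BLS data.

```python
import math


def fit_weibull(years, survival):
    """Fit S(t) = exp(-(t / scale) ** shape) by ordinary least squares on
    ln(-ln S) = shape * ln t - shape * ln(scale)."""
    xs = [math.log(t) for t in years]
    ys = [math.log(-math.log(s)) for s in survival]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    shape = sxy / sxx                      # slope of the fitted line
    intercept = y_bar - shape * x_bar      # equals -shape * ln(scale)
    scale = math.exp(-intercept / shape)
    return shape, scale


# Synthetic averaged survival curve for years 1..20 (assumed values).
years = list(range(1, 21))
true_shape, true_scale = 0.65, 6.0
survival = [math.exp(-((t / true_scale) ** true_shape)) for t in years]

shape, scale = fit_weibull(years, survival)
print(f"shape={shape:.3f}, scale={scale:.3f}")
```

    This fit is unweighted. Because the year-1 average rests on up to 31 cohorts and the year-20 average on 12, weighting each year by its cohort count may be worth considering.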

    BDS provides relative size-band differentials. Rather than using BDS to construct absolute survival levels, we used it only to measure how much faster or slower each size band exits relative to the sector average - and only for young firms (age 0–5), where survivor filtering has not yet had time to distort the signal materially. Annual exit rates were averaged over 2013–2023, and a ratio was formed: the size-band exit rate divided by the simple (unweighted) average of the exit rates of the four size bands within that sector.
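    A minimal sketch of the ratio step, assuming the BDS exit rates for one sector have already been filtered to age 0–5 firms and grouped by size band. The band labels and the rates below are hypothetical placeholders, and `size_band_ratios` is an illustrative name rather than anything from the original scripts.

```python
from statistics import mean


def size_band_ratios(exit_rates_by_band):
    """Return {band: ratio} for one sector.

    exit_rates_by_band maps each size band to its list of annual exit rates.
    Each band's rates are averaged over the years, then divided by the
    simple (unweighted) average of the band means.
    """
    band_means = {band: mean(rates) for band, rates in exit_rates_by_band.items()}
    sector_mean = mean(list(band_means.values()))
    return {band: m / sector_mean for band, m in band_means.items()}


# Hypothetical exit rates for four size bands over two years.
example = {
    "band_1": [0.12, 0.14],
    "band_2": [0.10, 0.10],
    "band_3": [0.08, 0.10],
    "band_4": [0.07, 0.09],
}
ratios = size_band_ratios(example)
for band, ratio in ratios.items():
    print(band, round(ratio, 3))
```

    Because the denominator is an unweighted average of the band means, the ratios average to 1.0 across bands by construction, regardless of how many firms sit in each band.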

    Power scaling combines them. The BDS ratio is applied as a power scalar to the BLS survival curve:

    S_size(t) = S_BLS(t) ^ ratio

    A ratio above 1.0 (higher exit rate than sector average) compresses the survival curve downward; below 1.0 it lifts it. Ratios are clipped to [0.6, 1.5] to prevent extreme values from producing implausible results. The BLS curve anchors the absolute level; the BDS ratio scales it proportionally.

    Three fallback cells. For Manufacturing 100–499 employees, Health Care 5–9 employees, and Food Services 20–99 employees, the BDS size-band signal was statistically unreliable - small cell sizes or high year-to-year volatility. These three combinations default to the BLS sector average (ratio = 1.0) and are flagged in the table.
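    The scaling, clipping, and fallback rules above can be sketched in a few lines. This is an illustration, not the scripts used for the tables: the function name `scale_survival`, the `reliable` flag, and the sample curve values are hypothetical.

```python
def scale_survival(s_bls, ratio, lo=0.6, hi=1.5, reliable=True):
    """Apply S_size(t) = S_BLS(t) ** ratio to a list of survival values.

    The ratio is clipped to [lo, hi]. If the size-band signal is marked
    unreliable, the ratio falls back to 1.0 (the BLS sector average).
    Returns (scaled curve, ratio actually applied, fallback flag).
    """
    applied = min(max(ratio, lo), hi) if reliable else 1.0
    scaled = [s ** applied for s in s_bls]
    return scaled, applied, not reliable


# Hypothetical sector-average survival at three horizons.
s_bls = [0.80, 0.50, 0.35]

# A raw ratio of 1.8 is clipped to 1.5 before being applied.
scaled, applied, flagged = scale_survival(s_bls, ratio=1.8)

# An unreliable cell ignores its raw ratio and returns the BLS curve, flagged.
fb_curve, fb_ratio, fb_flag = scale_survival(s_bls, ratio=0.4, reliable=False)

print(applied, [round(s, 3) for s in scaled], flagged)
print(fb_ratio, fb_curve, fb_flag)
```

    Since every survival value lies between 0 and 1, any positive exponent keeps the scaled curve between 0 and 1 and preserves its ordering over time.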


    What We Are Uncertain About

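    1. The power-scaling assumption. Applying the BDS ratio as a power to the BLS survival curve is a reasonable but not uniquely justified choice. Raising survival to a constant power is equivalent to multiplying the BLS hazard by that constant at every age, so the size effect is assumed to be proportional and constant over the firm's life. An additive hazard adjustment or a time-scaling (accelerated-life) adjustment would produce different results. We chose the power form because it preserves the shape of the BLS hazard function and keeps the survival curves monotonically declining, but we have not seen this specific hybrid approach validated in the literature.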
    2. Young-firm exit rates as a proxy for lifetime differentials. We use age 0–5 exit rates as the size-band signal on the assumption that relative size-mortality differentials established early persist throughout the firm's life. This may not hold - the size advantage of larger firms may grow or shrink over time in ways the early exit rates don't capture.
    3. The clipping bounds. The [0.6, 1.5] clip range was set judgmentally to prevent implausible survival curves. Different bounds would produce different results at the extremes of the size distribution.
    4. Non-monotonic patterns. In several sectors the 100–499 employee band shows survival at or slightly below the 20–99 band - the opposite of the expected size-survival relationship. We attribute this partly to acquisition miscounting in the BDS (exits that are actually mergers) and partly to genuine economic disruption in Retail Trade. But we cannot fully rule out methodology artifacts.
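    A note on point 1, offered as an observation rather than a validation. Writing H(t) = -ln S(t) for the cumulative hazard, h(t) for its derivative, and r for the clipped BDS ratio (assumed constant in t), the power form is algebraically a proportional adjustment of the hazard. If the BLS curve is the fitted Weibull with shape k and scale λ (symbols introduced here for illustration), the scaled curve is again a Weibull with the same shape and a rescaled λ:

```latex
S_{\text{size}}(t) = S_{\text{BLS}}(t)^{r}
\;\Longleftrightarrow\;
H_{\text{size}}(t) = r\,H_{\text{BLS}}(t)
\;\Longleftrightarrow\;
h_{\text{size}}(t) = r\,h_{\text{BLS}}(t)

S_{\text{BLS}}(t) = \exp\!\left[-\left(\tfrac{t}{\lambda}\right)^{k}\right]
\;\Longrightarrow\;
S_{\text{size}}(t) = \exp\!\left[-r\left(\tfrac{t}{\lambda}\right)^{k}\right]
= \exp\!\left[-\left(\tfrac{t}{\lambda\, r^{-1/k}}\right)^{k}\right]
```

    This is why the power form leaves the hazard shape untouched. It also makes the assumption in point 2 explicit: the same r is applied at every age.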

    The Question for the Forum

    Is there a more defensible way to integrate these two data sources? Specifically: is the power-scaling approach used in actuarial or demographic literature for similar hybrid problems, and if so, what are the known limitations? Are there alternative approaches to extracting size-band differentials from cross-sectional data that avoid the survivor-filtering bias in the older BDS age groups?


    Happy to share the Python scripts and underlying data with anyone who wants to examine the methodology in detail.



    ------------------------------
    Michael Sack Elmaleh
    Principal
    Michael Sack Elmaleh CPA, CVA
    ------------------------------