ASA Connect

  • 1.  new Census Bureau privacy policy

    Posted 12-06-2018 23:02
    The US Census Bureau has announced that it's planning to use differential privacy methods to achieve data privacy. I've long had concerns about DP, and have worried that its effect could be devastating to research in everything from education to poverty to public health. Coincidentally, this morning I received a message from IPUMS, an organization aimed at making it easier for researchers to access Census data. IPUMS is also concerned that DP could be a blow to research.

    DP is a fancier, more abstracted version of Random Data Perturbation methods. The basic problem with them and DP is that, if noise is added only in univariate forms, correlation structures between variables become attenuated. Earlier forms of DP, to my knowledge, did not address this crucial issue. Though I've seen some recent papers that at least refer to multivariate forms of DP, again to my limited knowledge, there is no good, developed, practically useful form for this.
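    To make the attenuation point concrete, here is a minimal simulation sketch. It is only an illustration under stated assumptions, not the Census Bureau's actual mechanism: it adds independent Laplace noise to each of two record-level variables, in the style of univariate random data perturbation. The function names, the correlation of 0.8, and the noise scale of 1.0 are all hypothetical choices made for the example.

```python
import numpy as np


def attenuated_corr(rho, var_signal, var_noise):
    """Correlation expected after independent noise of equal variance
    is added to both variables (each with variance var_signal)."""
    return rho * var_signal / (var_signal + var_noise)


def simulate(rho=0.8, laplace_scale=1.0, n=200_000, seed=0):
    """Return (correlation before noise, correlation after noise)."""
    rng = np.random.default_rng(seed)
    # Two standard-normal variables with true correlation rho.
    cov = [[1.0, rho], [rho, 1.0]]
    xy = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    # Assumption: noise is drawn independently for each variable,
    # i.e. it is univariate and ignores the correlation structure.
    noise = rng.laplace(loc=0.0, scale=laplace_scale, size=xy.shape)
    noisy = xy + noise
    r_true = np.corrcoef(xy[:, 0], xy[:, 1])[0, 1]
    r_noisy = np.corrcoef(noisy[:, 0], noisy[:, 1])[0, 1]
    return r_true, r_noisy


r_true, r_noisy = simulate()
# A Laplace distribution with scale b has variance 2 * b**2.
expected = attenuated_corr(0.8, 1.0, 2 * 1.0**2)
print(f"before noise: {r_true:.3f}")
print(f"after noise:  {r_noisy:.3f}")
print(f"theory:       {expected:.3f}")
```

    With these settings the formula gives 0.8 / 3, or roughly 0.27, so a strong relationship would look weak in the perturbed data. A regression slope estimated from a noise-perturbed predictor shrinks in the same way, which is why the multivariate question matters.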

    I know that some CB people read this forum, so I would like to ask what the CB has done to ensure that, for instance, a researcher will still be able to obtain useful estimates of regression coefficients under the new scheme. I'm well aware that any method has its drawbacks, but what can we expect from the new policy?

    This is extremely important.

    ------------------------------
    Norman Matloff
    University of California, Davis
    ------------------------------


  • 2.  RE: new Census Bureau privacy policy

    Posted 12-07-2018 07:33
    Here is a link to an excellent analysis of this policy by those around IPUMS (full disclosure: I have collaborated with them, and with John Abowd, who is the main advocate). The policy could make the 2020 Census and the American Community Survey virtually useless for many analyses.

     https://assets.ipums.org/_files/mpc/MPC-Working-Paper-2018-6.pdf

    Implications of Differential Privacy for Census Bureau Data and Research

    Task Force on Differential Privacy for Census Data

    Institute for Social Research and Data Innovation (ISRDI) University of Minnesota

    The Census Bureau has announced a new set of standards and methods for disclosure control in public use data products. The new approach, known as differential privacy, represents a radical departure from current practice. In its pure form, differential privacy techniques may make the release of useful microdata impossible and severely limit the utility of tabular small-area data. Adoption of differential privacy will have far-reaching consequences for research. It is possible-even likely-that scientists, planners, and the public will lose the free access we have enjoyed for six decades to reliable public Census Bureau data describing American social and economic change. We believe that the differential privacy approach is inconsistent with the statutory obligations, history, and core mission of the Census Bureau.

     



    ------------------------------
    Andrew Beveridge
    Professor of Sociology
    Queens and Grad Center CUNY
    ------------------------------



  • 3.  RE: new Census Bureau privacy policy

    Posted 03-12-2019 12:58

    A counterexample to the logic used to protect 2020 Census data with the differential privacy mechanism

    Before I begin, let me ask you one question. Which is more important: a) preserving individual privacy or b) preserving individual human life? Of course, preserving individual human life is far more important than preserving individual privacy.

    Now let us apply the "zero risk" reasoning used to protect individual privacy in the 2020 Census data to the decision to protect individual life from events such as a) tornados, b) flooding, and c) a potential nuclear attack.

    To protect individual human life from events such as these, the ultimate (almost) zero-risk solution is to force individuals to live in hardened concrete bunkers (to protect against extreme tornados and a potential nuclear attack) constructed 100 feet above ground level (to protect against extreme flooding).

    I am sure none of you will agree with me on the solution proposed above to the perceived threats to individual human life. The reason is that I have failed to take into consideration aspects such as a) the probabilities of occurrence of the events and b) the total cost of implementing the proposed solution, along with the degradation in quality of human life.

    The method currently proposed to protect individual privacy in the 2020 Census fails to consider the probability of occurrence of extreme attacks and omits a cost-benefit analysis of the extreme solution typical of the differential privacy mechanism. Instead, new laws and/or regulations should be used to protect against the perceived threat to individual privacy.

    Ramesh A Dandekar

    Retired



    ------------------------------
    Ramesh Dandekar
    Math Stat - Retired
    ------------------------------



  • 4.  RE: new Census Bureau privacy policy

    Posted 03-13-2019 11:36
    I've had experience using census block data (for marketing) and it is down to pretty small populations AND the data's availability is widespread. 

    If ICE (or any other agency) had access to that data they could easily target small neighborhoods for immigration raids. So, there are some serious privacy issues here.

    ------------------------------
    Michael Mout
    MIKS
    ------------------------------



  • 5.  RE: new Census Bureau privacy policy

    Posted 03-14-2019 08:24

    Echoing Michael Mout's concerns (I seem to do that a lot...), I too have used block-level census data for marketing over many years. In fact, this very experience with Census block data has proved tremendously useful in my Data For Good work as well. One example is found in research on human trafficking, where a model identifying regressors at the state level was applied to neighborhoods within a specific metropolitan area (Cincinnati). This method enables targeting of specific areas for action by law enforcement to fight human trafficking.

    The technology Michael Mout describes already exists, used in a different legal context for community service. It could easily be misapplied in the way Michael describes. The privacy issues here are real. As a question of ethics, the statistical community needs to consider the impact of taking technology developed for community service and re-applying it to harm communities instead.



    ------------------------------
    David J Corliss, PhD
    Director, Peace-Work www.peace-work.org
    davidjcorliss@peace-work.org
    ------------------------------



  • 6.  RE: new Census Bureau privacy policy

    Posted 03-17-2019 10:44

    Protecting individual privacy in public use summary tables created by statistical agencies is a critical task. This is obvious from the amount of resources the U.S. Census Bureau is currently spending on it. Back in the 1990s, I developed the concept of synthetic tabular data to achieve that objective. In the initial 2006 Microsoft paper on differential privacy for data summarized in contingency tables, the authors cite my work and state, "Our approach can be viewed as a special case of a more general approach for producing synthetic data".

     

    When I first saw that paper, I contacted its Microsoft authors and asked them to demonstrate the merits of their work using the real-life data from my 2004 paper, "Maximum Utility-Minimum Information Loss Table Server Design for Statistical Disclosure Control of Tabular Data", Ramesh A. Dandekar, June 9-11, 2004, Barcelona, Spain. In that paper I used the Current Population Survey (CPS) file from UC Irvine, available in the public domain, as a practical example of a real-life application of the method.

     

     

    After 13 years of research on differential privacy, I have not seen a practical example that demonstrates the merits of using differential privacy on tabular data. I would like to encourage DP researchers to use public domain microdata, such as the CPS file, to evaluate the relative merits of their research papers.

     

    To benefit future research on this topic, I have uploaded to the ResearchGate website a PDF file containing 55 pages of information pertaining to my work in the 2004 paper, along with a copy of my June 2007 email to the Microsoft authors (and a copy of the 2006 Microsoft paper for easy access). The file also contains 2D and 3D summary statistics of the CTA-protected CPS data, relative to the original CPS data, on pages 39 to 55.

     

    I hope you will all find this information useful in your future research activities on the topic.

     

    Ramesh A. Dandekar

    Retired



    ------------------------------
    Ramesh Dandekar
    Math Stat, Retired
    ------------------------------



  • 7.  RE: new Census Bureau privacy policy

    Posted 03-18-2019 10:34
    Hello, all!

    We've had a number of moderation reports for this thread, and it will take some time to respond to all of them. Thank you for your patience as we work through this process.

    As with all ASA Community discussions, consider the questions posted by the original poster when responding to the thread. This person will receive *all* of your responses in their inbox, and will most likely get the most value from posts that respond directly to the original questions they pose.

    - Lara

    ------------------------------
    Lara Harmon
    Marketing and Online Community Coordinator
    American Statistical Association
    ------------------------------



  • 8.  RE: new Census Bureau privacy policy

    Posted 03-19-2019 12:43
    Here is the supporting document (PDF file) mentioned in my posting two days ago: "To benefit future research on this topic, I have uploaded to the ResearchGate website a PDF file containing 55 pages of information pertaining to my work in the 2004 paper, along with a copy of my June 2007 email to the Microsoft authors (and a copy of the 2006 Microsoft paper for easy access). The file also contains 2D and 3D summary statistics of the CTA-protected CPS data, relative to the original CPS data, on pages 39 to 55."

    Many thanks to the ASA staff member who showed me how to attach a file to the posting.

    - Ramesh A Dandekar

    ------------------------------
    Ramesh Dandekar
    Retired --- for private use
    ------------------------------