ASA Connect

  • 1.  Data Forensics/Psychometrics

    Posted 03-08-2017 09:52
    Hello all,

    I am wondering if anyone has done work on data forensics (erasure analysis) in psychometrics. Instead of using the p-value to flag observations, most studies use an outlier score, which is the absolute value of 1.086*log(p/(1-p)), p being the p-value. I am wondering if anyone can refer me to resources explaining the reasons behind that.

    Best regards,




  • 2.  RE: Data Forensics/Psychometrics

    Posted 03-09-2017 07:48
    A company called Caveon has done a lot of work on this. You may want to contact them to learn more.

    ------------------------------
    Ji Zeng
    Psychometrician
    Michigan Department of Education
    ------------------------------



  • 3.  RE: Data Forensics/Psychometrics

    Posted 03-10-2017 08:37
    The transformation you referenced is advocated by DRC (Data Recognition Corporation). The justification appears to be a desire to linearize the p-value after it has been calculated using the normal distribution (i.e., the average number of wrong-to-right answer changes in a school or classroom is compared with the population average using the central limit theorem and the normal distribution).
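The comparison described above can be sketched as a one-sided z-test. This is only an illustration of the general idea, not DRC's documented procedure: the function name, the inputs, and the example numbers are all hypothetical, and it assumes the population mean and standard deviation of wrong-to-right (WTR) changes per student are known.

```python
import math

def upper_tail_p_value(group_mean, pop_mean, pop_sd, n):
    """One-sided p-value that a group's mean WTR count exceeds the population mean.

    Assumes (hypothetically) that pop_mean and pop_sd are known and that the
    group mean of n students is approximately normal by the central limit theorem.
    """
    standard_error = pop_sd / math.sqrt(n)
    z = (group_mean - pop_mean) / standard_error
    # Upper-tail normal probability: P(Z >= z) = 0.5 * erfc(z / sqrt(2))
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical classroom: 25 students averaging 3.0 WTR changes,
# against a population mean of 1.5 and standard deviation of 2.0.
p = upper_tail_p_value(group_mean=3.0, pop_mean=1.5, pop_sd=2.0, n=25)
print(f"upper-tail p-value: {p:.3g}")
```

Using `erfc` rather than `1 - cdf` keeps precision for the very small tail probabilities that come up in this kind of work.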

    DRC calls this a threat scale. The constant 1.086 was selected somewhat arbitrarily so certain values of the resulting scale conform to select p-values. I don't remember which scale values they selected for alignment, but you should be able to figure this out.

    The main justification for the transformation is that in test security work p-values need to be very small for flagging. There is a desire to be extremely conservative because these data were not collected under an experimental design and the underlying distributions have not been rigorously documented. Indeed, the distributions depend upon student age (i.e., grade), the subject being taught, the amount of time allowed to take the test, and many other factors, including item-by-item variability. It's not unusual to calculate p-values that are less than one chance in one billion.

    Personally, I prefer a simple base-10 logarithmic transformation. I tell people it's easier to count zeros than to print them. Thus, a value of 8 represents a probability of one chance in 100 million. The other advantage of the logarithmic transformation is the ability to convert the transformed p-value into a chi-square statistic for use in inference and for combining with other statistics. I have several clients who use very small p-values as triggers for deciding whether to investigate anomalies (i.e., one chance in one trillion or smaller).
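A minimal sketch of the base-10 transformation and one way a log-transformed p-value can become a chi-square statistic. The post does not say which conversion is meant; the sketch assumes Fisher's method, where -2*ln(p) follows a chi-square distribution with 2 degrees of freedom under the null hypothesis and k independent p-values combine into a chi-square with 2k degrees of freedom. The function names are illustrative.

```python
import math

def neg_log10(p):
    """Number of 'zeros to count': 8 corresponds to one chance in 100 million."""
    return -math.log10(p)

def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method (an assumed choice).

    Returns the chi-square statistic and its upper-tail probability.
    For even degrees of freedom 2k the tail has a closed form:
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total

print(neg_log10(1e-8))
print(fisher_combined_p([0.01, 0.01]))
```

Since ln(p) = ln(10) * log10(p), a base-10 score converts to the Fisher statistic by multiplying by 2*ln(10), roughly 4.605.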

    The other reason that small p-values must be used is to limit the number of false positives that get investigated. You can use an alpha inflation correction (a multiple-comparisons adjustment) to do this, or you can just choose a critical value that keeps the flagging rate at a manageable and practical level. The important thing is that school systems monitor and take some action so that educators are aware the data are being analyzed and modify their behavior. In other words, deterrence is more important than enforcement.
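As one illustration of an alpha inflation correction, here is a sketch using the Bonferroni adjustment, which divides the overall alpha by the number of units tested. The post does not name a specific correction, so Bonferroni is an assumed choice, and the school names and p-values below are made up.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test critical p-value that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def flag(p_values, alpha=0.05):
    """Return the names whose p-value falls below the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [name for name, p in p_values.items() if p < threshold]

# Hypothetical p-values for four schools (illustrative only).
p_values = {"school_A": 0.03, "school_B": 1e-9, "school_C": 0.0004, "school_D": 0.2}
flagged = flag(p_values)
print(flagged)
```

With thousands of schools or classrooms the threshold shrinks quickly, which is one reason flagging criteria in this area end up so small.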

    Dennis Maynes, Caveon Test Security





  • 4.  RE: Data Forensics/Psychometrics

    Posted 03-09-2017 08:02

    Hi Allassane,

     

    I have not heard of this, but then again I don't spend much time in this area.  I wonder if there is something more to this, since the function appears to be non-negative for all p in (0,1) with a non-differentiable turn at p=0.5.  I'm guessing that it is also symmetric around this point, so naively the function would return the same value for p=0.05 as it would for p=0.95.  I look forward to more on this.
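The symmetry guess can be checked directly. The short derivation below assumes the score is T(p) = c |ln(p/(1-p))| with c about 1.086 and a natural logarithm; the symbols T and c are introduced here only for illustration.

```latex
\begin{align*}
T(p)   &= c \left| \ln\frac{p}{1-p} \right|, \qquad 0 < p < 1, \\
T(1-p) &= c \left| \ln\frac{1-p}{p} \right|
        = c \left| -\ln\frac{p}{1-p} \right|
        = T(p), \\
T(0.5) &= c \left| \ln 1 \right| = 0, \qquad
T(p) \to \infty \ \text{as } p \to 0^{+} \text{ or } p \to 1^{-}.
\end{align*}
```

So under these assumptions the score is symmetric about p = 0.5, and T(0.05) = T(0.95), roughly 3.20. It is undefined at the endpoints p = 0 and p = 1.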

     

    Cheers,
    PDM

     

     

    Pat Mitchell, MA

    Statistical Science Director

    Early Clinical Biometrics IMED Biotech Unit AstraZeneca







  • 5.  RE: Data Forensics/Psychometrics

    Posted 03-10-2017 02:28
    where this expression is called the threat score, and about 548 other hits. The constant 1.086 is defined as 10/abs(log(.0001/.9999)), according to the explanation after formula (1) on page 6.
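The definition of the constant can be checked numerically. The sketch below assumes `log` is the natural logarithm (which is what reproduces 1.086) and uses illustrative names; it is not taken from the cited document.

```python
import math

# Anchoring implied by the definition: p = 0.0001 maps to a score of 10.
SCALE = 10 / abs(math.log(0.0001 / 0.9999))

def threat_score(p):
    """Scaled absolute log-odds of a p-value; defined only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return SCALE * abs(math.log(p / (1.0 - p)))

print(f"scale constant: {SCALE:.4f}")
for p in (0.5, 0.05, 0.0001, 1e-9):
    print(f"p = {p:g} -> threat score {threat_score(p):.2f}")
```

For small p the odds p/(1-p) are close to p itself, so the score grows roughly linearly in the number of leading zeros of the p-value, about 2.5 points per factor of ten.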