
NIJ Recidivism Challenge Dataset

By George Mohler


Risk assessment instruments (RAIs) are used during parole decisions to forecast the probability of recidivism. Advocates of RAIs point to the higher accuracy of algorithms compared to human decision makers [1] and to reductions in the number of re-arrests after parole when algorithms inform parole decisions [2]. RAIs are not without criticism; for example, a 2016 ProPublica study of the COMPAS algorithm and dataset highlighted racial disparities in the accuracy of RAIs used to forecast recidivism, showing that the false positive rate for Black individuals in the data was twice that for white individuals [3].
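To make that disparity concrete, a group-conditional false positive rate can be computed in a few lines. The sketch below (Python) uses synthetic data, and the column names group, recidivated, and high_risk are hypothetical placeholders, not the actual COMPAS schema.

import numpy as np
import pandas as pd

def false_positive_rate(y_true, y_pred):
    # FPR = FP / (FP + TN): the share of non-recidivists flagged as high risk.
    negatives = y_true == 0
    return (y_pred[negatives] == 1).mean()

# Synthetic stand-in data; a real audit would use observed outcomes and scores.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": rng.choice(["A", "B"], size=1000),
    "recidivated": rng.integers(0, 2, size=1000),
    "high_risk": rng.integers(0, 2, size=1000),
})

for g, sub in df.groupby("group"):
    print(g, false_positive_rate(sub["recidivated"], sub["high_risk"]))

In the ProPublica analysis it is this quantity, computed separately for Black and white individuals, that differed by roughly a factor of two.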

Since the ProPublica study was released, the statistics and machine learning research communities have investigated properties of RAIs used in forecasting recidivism (e.g., bias and interpretability) and developed a variety of novel algorithms. Many of these studies utilize the COMPAS dataset, which according to one estimate is the second most used dataset in algorithmic fairness research papers [4]. While a consistently used dataset has advantages for benchmarking algorithms across studies, alternative recidivism datasets for assessing RAIs may provide better insight into the generalizability of results. Furthermore, the COMPAS dataset has several well-known limitations (for example, non-stationary time censoring [5]) that must be handled carefully when the data are used.

In 2021, the National Institute of Justice (NIJ) held a recidivism forecasting challenge to encourage further research in this area. As part of the challenge, NIJ released a recidivism dataset from the state of Georgia covering 25,835 individuals released from prison to parole supervision between 2013 and 2015. The dataset includes 54 columns with both demographic and event-level features, and Public Use Microdata Area (PUMA) IDs allow additional spatial covariates to be joined. Recidivism outcome variables consist of indicators for arrest after parole within 1, 2, and 3 years. The train-test splits used in the NIJ competition to assess model performance on held-out samples are also provided, which allows benchmarking comparisons of algorithms across studies (see the sketch following the dataset URL below). While a handful of papers have been published on the NIJ competition [6-8], the dataset has been used sparingly in comparison to COMPAS and may be useful for researchers looking for an alternative or complement.

The dataset is made publicly available at the following URL:

https://data.ojp.usdoj.gov/Courts/NIJ-s-Recidivism-Challenge-Full-Dataset/ynf5-u8nk/about_data
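To get started, the sketch below (Python) loads the full dataset and scores a trivial base-rate baseline on the held-out split with the Brier score, which formed the accuracy component of the challenge's scoring (see [6]). Two caveats: the CSV export URL follows the standard Socrata portal convention, and the column names Training_Sample and Recidivism_Within_3years reflect my reading of the data dictionary; verify both against the page above before relying on them.

import pandas as pd

# Socrata portals conventionally expose a CSV export at this endpoint
# (an assumption here; confirm via the Export button on the dataset page).
URL = ("https://data.ojp.usdoj.gov/api/views/ynf5-u8nk/"
       "rows.csv?accessType=DOWNLOAD")

df = pd.read_csv(URL, low_memory=False)

def to01(s):
    # Normalize a column that may arrive as bool, 0/1, or "true"/"false" strings.
    return s.replace({"true": 1, "false": 0, True: 1, False: 0}).astype(int)

# Column names below are assumptions based on the published data dictionary.
is_train = to01(df["Training_Sample"]) == 1
y_train = to01(df.loc[is_train, "Recidivism_Within_3years"])
y_test = to01(df.loc[~is_train, "Recidivism_Within_3years"])

# Trivial baseline: forecast the training base rate for every individual,
# then score with the Brier score (mean squared error of the forecasts).
p = y_train.mean()
brier = ((p - y_test) ** 2).mean()
print(f"Base-rate baseline Brier score: {brier:.4f}")

Any published model can then be compared against this baseline on the same held-out rows, which is the benchmarking use case the provided splits were designed for.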


References

[1] Lin, Z. J., Jung, J., Goel, S., & Skeem, J. (2020). The limits of human predictions of recidivism. Science Advances, 6(7).

[2] Berk, R. (2017). An impact assessment of machine learning risk forecasts on parole board decisions and recidivism. Journal of Experimental Criminology, 13.

[3] Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016). Machine bias. ProPublica. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

[4] Fabris, A., Messina, S., Silvello, G., & Susto, G. A. (2022). Algorithmic fairness datasets: the story so far. Data Mining and Knowledge Discovery, 36(6).

[5] Barenstein, M. (2019). ProPublica's COMPAS data revisited. arXiv preprint arXiv:1906.04711.

[6] Mohler, G., & Porter, M. D. (2021). A note on the multiplicative fairness score in the NIJ recidivism forecasting challenge. Crime Science, 10.

[7] Lee, Y., O, S., & Eck, J. E. (2023). Improving Recidivism Forecasting With a Relaxed Naïve Bayes Classifier. Crime & Delinquency. 

[8] Circo, G. M., & Wheeler, A. P. (2022). An Open Source Replication of a Winning Recidivism Prediction Model. International Journal of Offender Therapy and Comparative Criminology.
