The lesson here is that if the ultimate outcome you care about is hard to measure, or involves a hard-
to-define combination of outcomes, then the problem is probably not a good fit for machine
learning. Consider a problem that looks like bail: sentencing. Like bail, sentencing of people who
have been found guilty depends partly on recidivism risk. But sentencing also depends on things
like society’s sense of retribution, mercy, and redemption, which cannot be directly measured. We
intentionally focused our work on bail rather than sentencing because it represents a point in the
criminal justice system where the law explicitly asks narrowly for a prediction. Even if there is a
measurable single outcome, you’ll want to think about the other important factors that aren’t
encapsulated in that outcome – like we did with race in the case of bail – and work with your data
scientists to create a plan to test your algorithm for potential bias along those dimensions.
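One piece of such a plan might be to compare error rates across groups. The sketch below is only an illustration, not the test we ran: the function name, the record layout, and the choice of false positive rate as the metric are all assumptions, and your data scientists may prefer other measures.

```python
from collections import defaultdict

def false_positive_rate_by_group(records):
    """Hypothetical helper: records are (group, predicted_high_risk, reoffended) tuples.

    For each group, returns the share of people who did NOT reoffend
    but were still flagged as high risk by the algorithm.
    """
    flagged = defaultdict(int)    # non-reoffenders flagged as high risk
    negatives = defaultdict(int)  # all non-reoffenders
    for group, predicted_high_risk, reoffended in records:
        if not reoffended:
            negatives[group] += 1
            if predicted_high_risk:
                flagged[group] += 1
    # Groups with no non-reoffenders are omitted, since the rate is undefined.
    return {group: flagged[group] / negatives[group] for group in negatives}
```

A large gap between groups in a table like this would be a signal to investigate further before the tool is used.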

Verify your algorithm in an experiment on data it hasn’t seen

Once we have selected the right outcome, a final potential pitfall stems from how we measure
success. For machine learning to be useful for policy, it must accurately predict “out-of-sample.”
That means it should be trained on one set of data, then tested on a dataset it hasn’t seen before. So
when you give data to a vendor to build a tool, withhold a subset of it. Then when the vendor comes
back with a finished algorithm, you can perform an independent test using your “hold out” sample.
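The hold-out step can be sketched in a few lines. This is a minimal illustration under stated assumptions: `split_holdout`, `accuracy`, the 20 percent hold-out share, and the `(features, outcome)` record layout are hypothetical choices, not part of any particular vendor's process.

```python
import random

def split_holdout(records, holdout_fraction=0.2, seed=0):
    """Return (shared, held_out): what the vendor gets and what you keep back."""
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be between 0 and 1")
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split reproducible
    n_holdout = max(1, round(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]

def accuracy(predict, records):
    """Share of (features, outcome) records where predict(features) matches outcome."""
    if not records:
        raise ValueError("no records to evaluate")
    correct = sum(1 for features, outcome in records if predict(features) == outcome)
    return correct / len(records)
```

In use, you would pass only `shared` to the vendor, then call `accuracy(vendor_model, held_out)` yourself once the finished algorithm comes back. The key design choice is that the vendor never sees `held_out`.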
An even more fundamental problem is that current approaches in the field typically focus on
performance measures that, for many applications, are inherently flawed. Current practice is to
report how well one’s algorithm predicts only among those cases where we can observe the
outcome. In the bail application, this means our algorithm can use data only on those defendants
whom judges released, because only for them do we have a label recording whether the
defendant went on to commit a crime. What about defendants whom judges chose not to release?
The available data cannot tell us whether they would have reoffended.
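To make the gap concrete, here is a small sketch of how a standard accuracy figure silently drops the unlabeled cases. The field names (`released`, `reoffended`) and the function `observed_accuracy` are hypothetical, and the example is meant to show the limitation, not to fix it.

```python
def observed_accuracy(cases, predict):
    """Score predictions only where an outcome exists (released defendants).

    Returns (accuracy_on_released, number_of_cases_with_no_label).
    Each case is a dict with a boolean "released" and, if released,
    a boolean "reoffended"; detained defendants have reoffended=None.
    """
    labeled = [case for case in cases if case["released"]]
    if not labeled:
        raise ValueError("no released defendants, so nothing can be scored")
    n_unlabeled = len(cases) - len(labeled)
    correct = sum(1 for case in labeled if predict(case) == case["reoffended"])
    return correct / len(labeled), n_unlabeled
```

Whatever accuracy this returns says nothing about the detained defendants counted in the second value, since their outcomes were never observed.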