Criminal justice algorithms still discriminate

This story was originally published by Futurity.

Algorithms were supposed to transform America’s justice system, but data can discriminate, says data and law expert Ngozi Okidegbe.

Touted as dispassionate, computerized calculations of risk, crime, and recidivism, algorithms were meant to smooth out the often unequal decisions made by fallible, biased people in everything from policing to bail and sentencing to parole.

But so far this has not been the case.

“Theoretically, if the prediction algorithm is less biased than the decision maker, it should result in fewer incarcerations of Black and Indigenous people and other politically marginalized people. But algorithms can discriminate,” says Okidegbe, associate professor of law and assistant professor of computer and data science at Boston University. Her scholarship examines how the use of predictive technologies in criminal justice affects racially marginalized communities.

As it stands, these groups are incarcerated at nearly four times the rate of their white peers. According to the Bureau of Justice Statistics, a branch of the U.S. Department of Justice, in 2021 (the most recent year for which data is available) there were 1,186 Black adults incarcerated in state or federal facilities for every 100,000 adults, and 1,004 Native Americans and Alaska Natives incarcerated for every 100,000 adults. Compare this to the rate at which white people were incarcerated in the same year: 222 per 100,000.

In recent work, Okidegbe has examined the role of algorithms in these injustices and the intertwined implications of technology and law, including the data behind bail decisions.


In their simplest form, algorithms are shortcuts to problem solving. Engineers can train computers to process large amounts of data and then come up with a simple solution to a complex problem. Spotify, for example, uses algorithms to suggest songs that the company thinks its listeners might like based on what they have listened to before. The more data a computer model has to process, the more nuanced and accurate its results should be.

But a growing body of academic research, including work by Okidegbe, and news reports show that algorithms built on incomplete or biased data can replicate or even amplify that bias when they churn out results. This isn’t a big deal if, say, your toddler’s Peppa Pig obsession seeps into your suggested Spotify playlists, but it can wreak havoc in other contexts.

Imagine a judge, says Okidegbe, receiving an algorithmically generated recidivism risk score as part of a report on a convicted person. This score tells the judge how likely it is that the person will commit another offense in the near future: the higher the score, the more likely someone is to be a repeat offender. The judge takes that score into account and assigns more jail time to someone with a high recidivism score. Case closed.

A major report by nonprofit news organization ProPublica found that these scores can carry a lot of weight with the judges who use them because they feel impartial. In reality, these scores are neither impartial nor airtight. ProPublica found that one particular system used by courts across the country guessed wrong about twice as often for Black people as for white people: it labeled twice as many Black people who did not go on to reoffend as being at high risk of doing so.
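
The disparity ProPublica describes is a gap in false positive rates: among people who did not go on to reoffend, the share who were nonetheless labeled high risk. A minimal Python sketch of that calculation follows. The counts and group names are invented purely for illustration and are not ProPublica's data; `GroupOutcomes` and `false_positive_rate` are hypothetical names, not part of any real risk-assessment tool.

```python
from dataclasses import dataclass


@dataclass
class GroupOutcomes:
    """Counts for people in one group who did NOT reoffend (made-up numbers)."""
    labeled_high_risk: int  # false positives: flagged, but did not reoffend
    labeled_low_risk: int   # true negatives: not flagged, did not reoffend


def false_positive_rate(group: GroupOutcomes) -> float:
    """Share of non-reoffenders who were wrongly labeled high risk."""
    total = group.labeled_high_risk + group.labeled_low_risk
    if total == 0:
        raise ValueError("group contains no non-reoffenders")
    return group.labeled_high_risk / total


# Hypothetical counts, chosen only to illustrate a two-to-one gap.
group_a = GroupOutcomes(labeled_high_risk=40, labeled_low_risk=60)
group_b = GroupOutcomes(labeled_high_risk=20, labeled_low_risk=80)

fpr_a = false_positive_rate(group_a)
fpr_b = false_positive_rate(group_b)
disparity = fpr_a / fpr_b  # how many times more often group A is wrongly flagged

print(f"Group A false positive rate: {fpr_a:.0%}")
print(f"Group B false positive rate: {fpr_b:.0%}")
print(f"Ratio (A / B): {disparity:.1f}")
```

With these invented counts, non-reoffenders in group A are wrongly flagged twice as often as those in group B, which is the kind of gap the report describes.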


In a recent article for the Connecticut Law Review, Okidegbe traces this inconsistency to its source and identifies a three-pronged “input problem”.

First, she writes, jurisdictions are opaque about whether and how they use pre-trial algorithms, often adopting them without consulting marginalized communities “even though those communities are disproportionately impacted by their use.” Second, these same communities are generally excluded from the process of creating such algorithms. Third, even in jurisdictions where members of the public can express opinions on the use of such tools, their input rarely changes anything.

“From a racial justice perspective, there are other harms that result from the use of these algorithmic systems. The paradigm that governs whether and how we use these algorithms is fairly technocratic and not very diverse. Kate Crawford identified AI’s ‘white guy problem,’” says Okidegbe, referring to a senior researcher at Microsoft and co-chair of a White House symposium on AI and society who coined the term to describe the overrepresentation of white men in the creation of artificially intelligent products and companies.

From the start, Okidegbe says, algorithmic systems exclude racially marginalized and other politically oppressed groups.

“I have looked at the decision-making power over whether and how algorithms are used and what data they are used to produce. It greatly excludes the marginalized communities that are most likely to be affected, because those communities are not centered and often aren’t even at the table when these decisions are made,” she says. “That is one way I suggest that turning to algorithms is inconsistent with a racial justice project, because they perpetuate the marginalization of those same communities.”


Beyond the biased results that disproportionately harm marginalized communities, the data used to train algorithms can itself be messy, subjective, and discriminatory, says Okidegbe.

“In my work, I grappled with what I believe to be a misconception: that algorithms are only built with quantitative data. They’re not; they’re also created with qualitative data,” she says. Computer engineers and data designers will meet with policymakers to figure out what problem their algorithm should solve and which data sets they should draw on to build it, Okidegbe says.

In the criminal and legal context, this could mean working with judges to determine, for example, what would help them in imposing prison sentences. But again, data engineers are far less likely to meet with incarcerated people as part of their early information-gathering process. Instead, as Okidegbe writes in an article for a recent issue of the Cornell Law Review, most of the large data sets used in pre-trial algorithms are built and trained on data from “carceral knowledge sources” such as police records and court documents.

“That creates this narrative that these communities have no knowledge to add to the broader question,” says Okidegbe.

Realizing the promise of algorithms in the criminal justice system (the promise that they make the process more consistent and less biased than humans otherwise would) requires a radical rethink of the entire structure, says Okidegbe. She encourages her students to think about this as they shape the future of law and criminal justice.

“It means actually taking the knowledge of marginalized and politically oppressed communities into account and letting it inform how the algorithm is constructed. It also means continuous monitoring of algorithmic technologies by these communities. What I am proposing requires building new institutional structures; it requires rethinking who is credible and who should be in power when it comes to using these algorithms. And if that is too much, then we cannot call this a racial justice project in the same breath.”