Can Algorithms Be Fair?

Courts in many jurisdictions now use algorithmic risk assessment tools such as COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) and the Public Safety Assessment (PSA) to inform decisions about pretrial release and sentencing.

These tools generate a statistical risk score predicting the likelihood a defendant will fail to appear in court or be arrested for a new crime. Proponents argue they bring data-driven objectivity, reducing reliance on a judge’s potentially biased gut instinct.

The reality is more troubling. These algorithms are not oracles; they are prediction machines trained on historical data. Their “ground truth” is the record of past arrests, convictions, and failures to appear. This data is not a neutral record of criminality, but a fossil of historical policing and prosecutorial practices. If a police department historically over-patrolled low-income Black neighborhoods, those neighborhoods will generate more arrest data. An algorithm trained on that data learns that zip code, and its correlated demographics, are predictive of risk, thus automating and legitimizing past discrimination under a cloak of mathematical neutrality.
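The mechanism can be made concrete with a small synthetic sketch (invented numbers and scikit-learn, not actual COMPAS data or features): offending behavior is identical in two neighborhoods, but one is patrolled far more heavily, so it generates more arrests, and a model trained on those arrest labels assigns it a much higher “risk” score.

```python
# Synthetic illustration only: a model trained on arrest records from
# unevenly policed neighborhoods learns neighborhood as a proxy for risk,
# even though race never appears as a feature.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000

# Underlying offending behavior is identical everywhere...
offends = rng.random(n) < 0.30
# ...but neighborhood A is patrolled far more heavily, so offenses there
# are much more likely to show up as arrests in the training data.
neighborhood_a = rng.random(n) < 0.5
arrest_prob = np.where(neighborhood_a, 0.7, 0.2)
arrested = offends & (rng.random(n) < arrest_prob)

# Train on the biased label ("arrested"), using only neighborhood.
X = neighborhood_a.reshape(-1, 1).astype(float)
model = LogisticRegression().fit(X, arrested)

risk_a = model.predict_proba([[1.0]])[0, 1]
risk_b = model.predict_proba([[0.0]])[0, 1]
print(f"Predicted 'risk' in neighborhood A: {risk_a:.2f}")
print(f"Predicted 'risk' in neighborhood B: {risk_b:.2f}")
# Identical behavior, very different scores: the model has learned the
# policing pattern, not the underlying rate of offending.
```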

The technical manifestation of this is proxy discrimination. Even if the algorithm is barred from using race as a direct input, it relies on proxies strongly correlated with race, such as neighborhood, employment history, family criminal history, or education level. A 2016 ProPublica investigation into COMPAS found that Black defendants who did not reoffend were nearly twice as likely as white defendants to be falsely labeled high risk, while white defendants who did reoffend were more often mislabeled low risk. The algorithm was equally accurate across races in one technical sense (predictive parity), but its errors were not distributed fairly. This highlights a critical insight: fairness in machine learning has no single, agreed-upon mathematical definition. You can optimize for predictive parity, demographic parity, or equal error rates, but whenever groups have different underlying base rates, an imperfect classifier cannot satisfy them all at once. Choosing which fairness metric to prioritize is therefore an ethical and political choice, not a purely technical one.
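The arithmetic of that trade-off is easy to see with hypothetical numbers (a sketch, not ProPublica’s actual figures): give two groups different base rates of rearrest, hold predictive parity fixed, and the error rates come apart.

```python
# Hypothetical counts, invented for illustration: two groups of 1,000
# defendants with different base rates of rearrest, scored by a classifier
# that achieves the same PPV (predictive parity) in both groups.
from dataclasses import dataclass

@dataclass
class GroupOutcomes:
    tp: int  # labeled high risk, rearrested
    fp: int  # labeled high risk, not rearrested
    fn: int  # labeled low risk, rearrested
    tn: int  # labeled low risk, not rearrested

    @property
    def ppv(self):  # of those labeled high risk, the share rearrested
        return self.tp / (self.tp + self.fp)

    @property
    def fpr(self):  # non-reoffenders wrongly labeled high risk
        return self.fp / (self.fp + self.tn)

    @property
    def fnr(self):  # reoffenders wrongly labeled low risk
        return self.fn / (self.fn + self.tp)

groups = {
    "group 1 (base rate 60%)": GroupOutcomes(tp=450, fp=150, fn=150, tn=250),
    "group 2 (base rate 30%)": GroupOutcomes(tp=180, fp=60, fn=120, tn=640),
}

for name, g in groups.items():
    print(f"{name}: PPV={g.ppv:.2f}  FPR={g.fpr:.2f}  FNR={g.fnr:.2f}")
# Both groups have identical PPV, yet the higher-base-rate group has a far
# higher false positive rate. When base rates differ, an imperfect classifier
# that satisfies predictive parity cannot also equalize both error rates.
```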

The legal and procedural consequences are severe. Judges presented with a “high risk” score may give it undue weight due to automation bias, deferring to the algorithm’s perceived scientific authority and undermining the principle of individualized judgment. Furthermore, the proprietary nature of most commercial algorithms means defendants cannot cross-examine the “code” or fully understand the factors driving their score, raising serious due process concerns about the right to confront the evidence against them. The central question becomes: can a system designed to predict outcomes within an unjust social landscape ever produce a just result? At best, these tools may reduce some individual judicial caprice while cementing systemic bias into code. At worst, they create a veneer of scientific objectivity that legitimizes and scales discrimination.

Fairness, then, may be less a solvable equation than a continuous political struggle for transparency, accountability, and the fundamental reform of the biased data-generating systems themselves.