Most of the time, we don’t question the systems around us until they fail. When planes crash, treatments go wrong, or loans are denied, we ask, “What happened, why, and who’s responsible?”
As AI development progresses rapidly and systems take on ever more power over what we see, what we get, and what we do, a real understanding of how they reach those decisions is crucial if we are not to lose control. Today’s systems present their outputs with an almost suspicious confidence. Yet when asked why, they often can’t or won’t tell us, revealing a complex, data-driven “black box” that is incomprehensible even to its creators (Maclure, 2021; Guidotti et al., 2018; Kosinski, 2024).
In a world where AI is increasingly involved in high-risk sectors, including decisions that directly affect human lives, explainability becomes a necessity.
What is Explainability?
AI explainability, or Explainable AI (XAI), broadly refers to the challenge of making AI systems, including their decision-making processes and outcomes, understandable to humans.
With the move from rule-based programming to machine learning, emerging systems have dramatically improved in performance and accuracy, particularly in fields such as Natural Language Processing (NLP) and computer vision (Maclure, 2021; LeCun et al., 2015). With this increased accuracy, however, algorithms have also grown more opaque, built on high-dimensional, nonlinear architectures with seemingly endless layers and parameters. It has become virtually impossible to backtrack, understand, and explain the reasoning and processes that lead to a given (even correct) output across these complex architectures (Maclure, 2021). This phenomenon is often referred to as the “black box” problem: AI systems take in inputs and produce outputs with remarkable accuracy, but the steps in between remain opaque and hidden, even to their developers.
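To make that contrast concrete, here is a minimal, purely illustrative sketch; the loan-approval rule, the synthetic data, and the model choice are all hypothetical and not taken from the cited works. A hand-written rule can be read and explained line by line, while a trained neural network reaches its answer through thousands of learned weights that no one can simply read off.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Rule-based era: the decision logic is the code itself and can be explained step by step.
def approve_loan(income: float, debt: float) -> bool:
    return income > 50_000 and debt / income < 0.4  # every condition is inspectable

# Machine-learning era: the decision logic lives in learned parameters.
X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)  # synthetic stand-in data
model = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
model.fit(X, y)

prediction = model.predict(X[:1])  # an answer...
n_params = sum(w.size for w in model.coefs_) + sum(b.size for b in model.intercepts_)
print(prediction, n_params)        # ...produced by thousands of weights no one wrote by hand
```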
The research field of Explainable AI (XAI) specifically focuses on finding ways to open up that black box. Thus, unlike performance metrics or accuracy scores, explainability is not about how well a model works but about whether we as humans can make sense of its internal processes and outcomes.
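As a small taste of what such methods can look like in practice, the sketch below uses permutation feature importance, one common post-hoc technique among many (the data and feature names are invented for illustration): it shuffles each input feature in turn and measures how much the model’s accuracy drops, yielding a rough, human-readable ranking of which inputs the model relies on.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Hypothetical data standing in for, say, loan applications.
X, y = make_classification(n_samples=1_000, n_features=5, random_state=0)
feature_names = ["income", "debt_ratio", "age", "employment_years", "credit_history"]

model = RandomForestClassifier(random_state=0).fit(X, y)

# Shuffle each feature in turn and measure how much the model's accuracy drops without it.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# A human-readable ranking of how strongly the model leans on each input.
for name, importance in sorted(zip(feature_names, result.importances_mean),
                               key=lambda pair: pair[1], reverse=True):
    print(f"{name:>18}: {importance:.3f}")
```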
Why Explainability Matters and What Happens When It’s Missing
Explainability isn’t just a technical feature; it’s a safeguard. And the risks aren’t merely theoretical. If AI systems are used to determine who gets a loan, who is policed, or which medical treatment is applied, not knowing how and why a decision was made is not only frustrating but outright dangerous. Explainability is therefore not a matter of user experience; it is a matter of justice, autonomy, and legal accountability.
Explainability is also foundational to human rights and democratic values. When decisions are made or supported by “black box” AI systems, and we lack the human capacity to understand how those decisions came about, we lose control, oversight, scrutiny, and avenues for remedy (Maclure, 2021; McDermid et al., 2021). Biased patterns in training data, for instance, can go undetected and unchallenged if there are no mechanisms for explainability. And when there is no way to understand how an output was arrived at, people affected by unfair, discriminatory, or simply wrong decisions are denied an explanation, and with it any meaningful way to contest or appeal the outcome.
Moreover, regulatory bodies may find it difficult to enforce existing laws on discrimination, liability, or malpractice when they cannot trace how decisions were made, undermining the rule of law itself. In education, AI-driven assessments without proper accountability might misclassify student performance without offering clear justification, obstructing the learning process and harming students’ career opportunities. Even on everyday platforms such as search engines and social media, opaque recommendation systems can shape public discourse and opinion in invisible ways, with users never knowing why they are seeing what they are seeing. In all these cases, the absence of explainability means a loss of agency, transparency, trust, and, ultimately, justice.
On the other hand, when people can understand how a system works, they are more likely to trust its decisions (Miller, 2018). Explainability empowers us to ask questions, contest decisions, hold the systems and their developers accountable, and even improve them. But how do we get there? In the second part of this series, we’ll take a closer look at the approaches, methods, and processes that define today’s push for explainable AI.
References:
Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F. & Pedreschi, D. (2018). A survey of methods for explaining black box models. ACM Computing Surveys, 51(5), Article 93. https://doi.org/10.1145/3236009
Kosinski, M. (2024, October 29). What is black box artificial intelligence (AI)? IBM. https://www.ibm.com/think/topics/black-box-ai
LeCun, Y., Bengio, Y. & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
Maclure, J. (2021). AI, Explainability and Public Reason: The Argument from the Limitations of the Human Mind. Minds and Machines, 31(3), 421–438. https://doi.org/10.1007/s11023-021-09570-x
McDermid, J. A., Jia, Y., Porter, Z. & Habli, I. (2021). Artificial intelligence explainability: The technical and ethical dimensions. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 379(2207), 20200363. https://doi.org/10.1098/rsta.2020.0363
Miller, T. (2018). Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267, 1–38. https://doi.org/10.1016/j.artint.2018.07.007
Further reading/watching/listening:
Books & Articles:
Molnar, C. (2019). Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. Leanpub. https://originalstatic.aminer.cn/misc/pdf/Molnar-interpretable-machine-learning_compressed.pdf
Ribeiro, M. T., Singh, S. & Guestrin, C. (2016, February 16). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. arXiv. https://arxiv.org/abs/1602.04938
Videos & Podcasts:
“Human-Centered Explainable AI: From Algorithms to User Experiences”, a talk by Q. Vera Liao at Stanford.