Blog
Our latest blog posts on topics related to natural language processing & alignment.

Navigating Truth and Accountability in the Age of AI Information
Journalism, as one of the main driving forces behind information flows in modern societies, has traditionally promoted itself as the medium of truth. The credibility of news institutions and the legitimacy of journalism as a profession have long rested on their ability to produce, verify and disseminate information grounded in factual accuracy and editorial integrity. Yet, in the era of artificial intelligence, these epistemic foundations are being profoundly challenged: generative AI not only replicates or automates journalistic processes, but may also transform them. The generative potential of AI introduces a new layer of uncertainty to news production, as tools that are neither human nor conscious are now producing texts with the marks of human authorship, originality and even moral voice.

Tying the Knots of Trust: Understanding the Evolving Sociotechnical Ecosystem of Trust in LLMs
When we interact with a chatbot, ask a digital assistant for advice or rely on LLMs to summarise a long document, we are doing something profoundly human: we are trusting. Trust is part of what makes cooperation possible between people, but increasingly, also between people and machines. In the age of artificial intelligence (AI), and particularly with the rapid rise of large language models (LLMs), trust has become a central issue.

Beyond Accuracy: Why “Being Right” Isn’t Enough for Human-Centred AI
Imagine the following two scenarios. A teacher asks an AI to review a student’s essay. Its feedback is accurate, the grammar is fixed and the facts are straight, yet the student still feels stuck. The student has no clue what to try next. A software team asks an AI to flag bugs. The model points to real issues, but the way it explains them leaves new engineers more confused than confident. In both cases, the tool passes the test and fails a person.
Accuracy matters, but it’s not the whole story. If we chase only the right answer, we ship systems that look strong in demos and lose people in real use.

Beyond the Hype: What Actually Makes AI Design Different
Just a few years ago, conversation flow design was at the heart of chatbot research (Cho et al., 2025). Designers developed detailed guidelines to structure dialogues, crafted messaging frameworks for seamless interactions, and carefully designed output messages to align with chatbot personas. Then, as transformer-based large language models (LLMs) arrived, the rigid, predefined conversation structures that worked for rule-based systems couldn’t accommodate LLMs’ dynamic, context-aware responses. Research priorities shifted from designing fixed dialogue trees to exploring prompt engineering and interaction patterns (Cho et al., 2025). The expertise built over years around conversation flow design was fundamental, but it needed rapid reframing: designers had to rethink how to guide conversations without prescribing every turn, how to maintain coherence without rigid structures, and how to evaluate interactions that varied with each user (Subramonyam et al., 2024). This narrative about change doesn’t apply only to chatbots. It also applies to Generative AI’s (GenAI) unique temporal challenge, and it raises a critical question for anyone designing with or for AI: are our design methods keeping up, or do we need new ones?

The Moral Panic Around AI Mental Health
Trigger Warning/Disclaimer: This blog post mentions suicide.
Governments, startup founders, academics, mental health professionals and others wrestle over who gets to define the future of AI mental health care.
Amidst a lack of regulatory oversight regarding AI-based mental health chatbots, some states in the US have taken steps to ban these systems in order to protect the public. Full bans are in place in Illinois and Nevada, and although Utah has not banned them outright, it still imposes strong restrictions and requirements around transparency, advertising, data use and human professional involvement. As a political strategy and as policy, bans risk unintended consequences on a population-wide scale (Oliver et al., 2019).

Safety Guardrails for AI: How LLMs Learn to Stay Safe
Large language models (LLMs) are trained on large amounts of text from the internet, books, forums and other sources in a process called pre-training. This gives them great versatility, but also comes with a hidden challenge: human language data contains biases, misinformation and unsafe patterns, such as hate speech and toxic or discriminatory content. When models learn from such data, they not only gain useful knowledge but also inherit these problems. On top of this, LLMs tend to be statistically overconfident (Guo et al., 2017; Minderer et al., 2021), meaning they assign higher probabilities to their predictions than their actual accuracy warrants, due to the way that they interpret data (Xu et al., 2024). They often present information with certainty, even when the output is false. This combination of biased training data and overconfidence can lead to hallucinations, biased answers or unsafe outputs, such as toxic content or instructions for harmful behavior.
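To make the idea of statistical overconfidence concrete, here is a minimal Python sketch of one common way to measure it: a simplified expected calibration error, which compares how confident a model says it is with how often it is actually right. The function name, the equal-width binning and the toy numbers are assumptions chosen for illustration; they are not taken from the papers cited above.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Simplified expected calibration error (illustrative sketch).

    confidences: predicted probabilities in [0, 1], one per prediction.
    correct:     booleans, True where the prediction was right.
    Returns the size-weighted average gap between mean confidence
    and accuracy across equal-width confidence bins.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")

    # Group predictions into equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's gap by the share of predictions it holds.
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Hypothetical toy data: the model claims 90% confidence on every answer
# but is right only half of the time, so the gap is large.
toy_confidences = [0.9, 0.9, 0.9, 0.9]
toy_correct = [True, False, False, True]
print(expected_calibration_error(toy_confidences, toy_correct))
```

A well-calibrated model would score close to zero here. A large value, combined with mean confidence above accuracy, is the pattern described as overconfidence.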