AI has become an everyday helper for many people. However, you can’t always trust its answers.
Computer science and philosophy
Six Criteria for the Reliability of AI
From individuals and institutions to conventional machines and AI systems, these criteria help assess trustworthiness.
Language models based on artificial intelligence (AI) can answer any question, but not always correctly. It would be helpful for users to know how reliable an AI system is. A team at Ruhr University Bochum and TU Dortmund University suggests six dimensions that determine the trustworthiness of a system, regardless of whether the system is made up of individuals, institutions, conventional machines, or AI. Dr. Carina Newen and Professor Emmanuel Müller from TU Dortmund University, alongside the philosopher Professor Albert Newen from Ruhr University Bochum, describe the concept in the international philosophical journal Topoi, published online on November 14, 2025.
Six dimensions of reliability
Whether a specific AI system is reliable is not a yes-or-no question. The authors suggest assessing how distinctly six criteria apply to each system in order to create a profile of reliability. These dimensions are:
- Objective functionality: How well does the system perform its core task, and is its quality assessed and guaranteed?
- Transparency: How transparent are the system’s processes?
- Uncertainty quantification of underlying data and models: How reliable are the data and models, and how secure are they against misuse?
- Embodiment: To what extent is the system physical or virtual?
- Immediacy behaviors: To what extent does the user interact with the system directly?
- Commitment: To what extent can the system have an obligation to the user?
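The idea of a graded profile rather than a yes-or-no verdict can be sketched in code. The following is a minimal illustration of our own devising, not an implementation from the paper: dimension names follow the list above, while the 0-to-1 scale, the scores, and the deficit threshold are arbitrary assumptions.

```python
from dataclasses import dataclass, fields

@dataclass
class ReliabilityProfile:
    """Illustrative 0.0-1.0 scores for the six dimensions.
    The numeric scale is a simplification for demonstration only."""
    objective_functionality: float
    transparency: float
    uncertainty_quantification: float
    embodiment: float
    immediacy_behaviors: float
    commitment: float

    def deficits(self, threshold: float = 0.5) -> list[str]:
        """Return the dimensions scoring below an (arbitrary) threshold."""
        return [f.name for f in fields(self)
                if getattr(self, f.name) < threshold]

# Hypothetical scores for a chatbot-style system.
chatbot = ReliabilityProfile(
    objective_functionality=0.7,
    transparency=0.2,
    uncertainty_quantification=0.3,
    embodiment=0.1,
    immediacy_behaviors=0.8,
    commitment=0.1,
)
print(chatbot.deficits())
# → ['transparency', 'uncertainty_quantification', 'embodiment', 'commitment']
```

Such a profile makes the authors' point concrete: a system can score well on one dimension (here, immediacy) while showing severe deficits on several others at once.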
“These criteria can illustrate that current AI systems, such as ChatGPT or self-driving cars, usually exhibit severe deficits in most dimensions,” says the team from Bochum and Dortmund. “At the same time, the profile shows where there is need for improvement if AI systems are to achieve a sufficient level of reliability.”
Central dimensions from a technical perspective
From a technical standpoint, the dimensions of transparency and uncertainty quantification of underlying data and models are crucial. These concern principal deficits of AI systems. “Deep learning achieves incredible things with large quantities of data. In chess, for example, AI systems are superior to any human,” explains Müller. “But the underlying processes are a black box to us, which has been a key source of distrust up to this point.”
The situation is similar for the uncertainty of data and models. “Companies are already using AI systems to pre-sort applications,” says Carina Newen. “The data used to train the AI contain biases that the AI system then perpetuates.”
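The mechanism behind this kind of bias perpetuation can be shown with a toy sketch. The data, groups, and frequency-based "model" below are entirely synthetic assumptions of ours, chosen only to illustrate the point that a model fitted to skewed data reproduces the skew.

```python
from collections import defaultdict

# Toy training data: (group, outcome) pairs with a built-in bias.
# Group "A" applicants were accepted 9 times out of 10,
# group "B" applicants only 2 times out of 10. Purely synthetic.
training = [("A", 1)] * 9 + [("A", 0)] + [("B", 1)] * 2 + [("B", 0)] * 8

def train_rate_model(data):
    """A 'model' that just memorizes per-group acceptance rates --
    a stand-in for what a classifier fitted to biased data
    effectively learns."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for group, outcome in data:
        counts[group][0] += outcome
        counts[group][1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}

rates = train_rate_model(training)
print(rates)  # → {'A': 0.9, 'B': 0.2}
```

The bias in the historical data reappears unchanged as the model's behavior: nothing in the fitting step questions whether the 0.9 versus 0.2 gap was justified in the first place.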
Central dimensions from a philosophical perspective
Discussing the philosophical perspective, the team uses ChatGPT as an example: it generates an intelligent-sounding answer to every question and prompt, but can still hallucinate. “The AI system invents information without making that clear,” emphasizes Albert Newen. “AI systems can and will be helpful as information systems, but we have to learn to always use them with a critical eye and not trust them blindly.”
However, Albert Newen considers the development of chatbots as a replacement for human communication to be questionable. “Forming interpersonal trust with a chatbot is dangerous, because the system has no obligation to the user who trusts it,” he says. “It doesn’t make sense to expect the chatbot to keep promises.”
Examining the reliability profile across the various dimensions can help us understand the extent to which humans can trust AI systems as information experts, say the authors. It also clarifies why a critical, routine understanding of these systems will be increasingly required.
Collaboration in the Ruhr Innovation Lab
Ruhr University Bochum and TU Dortmund University, which are currently applying together as the Ruhr Innovation Lab in the Excellence Strategy, work closely together on issues that help to develop a sustainable and resilient society in the digital age. The current publication stems from a partnership of the Institute of Philosophy II in Bochum and the Research Center Trustworthy Data Science and Security. The Center was founded by the two universities together with the University of Duisburg-Essen within the University Alliance Ruhr. The author Carina Newen was the first doctoral student to receive a doctorate from the Research Center.