The Category Error of AI Trust: Why We Need Trustability Before Trustworthiness

In recent years, the question of whether we can trust artificial intelligence (AI) has moved from philosophy seminars to legislative chambers. Trust is often described as the foundation of interpersonal interactions and a fundamental component of a functional society. As we integrate AI into our lives — from chatbots to autonomous decision-making systems — we find ourselves asking:

Is this system trustworthy?

In a new paper published in AI and Ethics, Jonathan Tallant and I argue that we are asking the wrong question.

By focusing on “trustworthiness” — whether an AI system is worthy of trust — we skip a grounding step: “trustability.” Absent from the philosophical literature, this concept concerns whether an entity is even the kind of thing that can be trusted. We argue that treating AI as a candidate for trust is not just risky, but a category error with profound ethical consequences.

The Trustability Gap

The distinction between “trustability” and “trustworthiness” is the paper’s central conceptual contribution.

  • Trustworthiness is a normative evaluation: Is this agent honest, competent, and reliable? It assumes the agent is capable of entering a trust relationship.
  • Trustability is a categorical threshold: Is this entity capable of holding the moral and structural qualities required for trust to exist at all?

Drawing on Paul Faulkner’s “grammar of trust” and recent work by Massaguer Gómez on human-robot interaction, we argue that current AI systems fail this first test. They elicit feelings of trust — through natural language, polite interfaces, and confident answers — without possessing the internal capacity to be trustable. They are mimics of trustworthiness, operating in a vacuum of accountability.

From Misplaced Trust to Structural Incoherence

When we trust an untrustable entity, we are not merely making a bad bet. We are adopting a “structurally incoherent” attitude. Our paper posits that trust is not just a prediction of behavior (which would be mere “reliance”) but a normatively charged stance. We feel “betrayed” when a trusted person fails us. When a machine fails, we may be disappointed or harmed, but we cannot be “betrayed” in the moral sense because the machine was never a moral agent to begin with.

This confusion leads to problematic design and governance outcomes. If we treat AI as a “trust partner,” we might try to solve failure modes by “building trust” (e.g., making the AI sound more empathetic) rather than “ensuring reliability.” In our paper, we warn that many AI systems today are designed precisely to obscure this distinction, encouraging users to project human-like accountability onto software that cannot reciprocate it.

Reliance with Accountability

If we cannot “trust” AI, what should we do? We propose shifting our framework from “trust” to “reliance with accountability.”

  • Reliance is instrumental: We rely on a car to start or a bridge to hold weight. If it fails, we fix the engineering. No moral relationship exists between us and the tool.
  • Trust is relational and requires agency: It involves a relationship between moral agents — beings capable of understanding obligations, making choices, and being held accountable for their actions. When we trust someone, we implicitly assume they can comprehend what we’re trusting them with and can choose to honor or betray that trust.

By categorizing AI strictly under “reliance,” we clarify the ethical landscape. The key point is not about our vulnerability to machines — we are vulnerable to bridges and cars too — but about the absence of moral agency in AI. A machine cannot understand betrayal because it cannot make a moral choice. It simply fails or succeeds according to its programming.

The burden of responsibility therefore shifts from the machine (which cannot bear it) to the institutions, developers, and policymakers who deployed it. They are the moral agents. Our paper examines how this aligns with debates on trust in governments and institutions. While AI itself cannot be a trustee, the systems surrounding it can be held to standards that make reliance rational.

Conclusion: The Conditions for Future Trust

The paper concludes with an exploration of whether AI could ever become trustable. We outline the technical, moral, and political prerequisites necessary to reach this threshold. Simply making models more accurate is not enough.

For an AI to be genuinely trustable, and possibly trustworthy, it would need a status that integrates it into our moral expectations in a way that does not currently exist: one grounded in a recognition of dependence and accountability.

Moreover, trustworthy AI would depend on the quality of AI integration into social, political, and organizational landscapes — on whether the systems are embedded in structures of genuine oversight, accountability, and redress.

Until then, trust in AI is a misnomer. Developers and policymakers must diagnose where trust is structurally impossible and replace it with rigorous, verifiable reliability rather than trying to humanize our machines. Instead of trying to forge a relationship with our tools, we must take responsibility for how we use them.