In this article, first published by the Chartered Institute of Arbitrators Australia on 4 December 2025, Laina Chan and Dr Brydon Wang discuss Beyond the Black Box: From Hallucination to Proof in Legal AI.

By Laina Chan, CEO of MiAI Law
May 26, 2026

Snapshot

Most so-called “Legal AI” tools predict language rather than prove law. Trustworthy systems must be built from first principles to meet the law’s demand for verifiable reasoning.
Governance must move beyond risk and compliance to embed rule-of-law standards — transparency, procedural fairness, accountability and auditability — in both architecture and oversight.
Lawyers and developers share duties of verification, disclosure and competence. The profession’s trust will not rest on perfection but on proof: requiring systems that set out their reasoning and align technology with legal method and ethics.

Introduction

The rise of artificial intelligence in legal practice has compelled the profession to ask a different question: not what AI can do, but what it should be built to do. From our distinct yet converging perspectives, one as a barrister developing AI systems for legal reasoning, and the other as a researcher and governance expert in trustworthy AI and law, we share a single concern: that most ‘Legal AI’ still operates on probability, not proof.

The Problem Beneath the Promise

Artificial intelligence in law has been described in breathless terms: both revolutionary and destabilising, transformative and corrosive. Yet across the profession, we continue to see one recurring misconception: that all Legal AI is simply a large language model (LLM) dressed in legal branding, as good (or as bad) as its training data. As a barrister and an academic working at the intersection of law, technology and governance, we see how that assumption distorts both regulation and practice. This framing reduces all AI platforms to linguistic probability and leaves no room for systems built from first principles, in which structured reasoning forms the scaffolding on which linguistic probability is then applied. Without this first-principles approach, we argue, legal AI is relegated to a mere tool that produces text, not proof.

When Plausibility Fails: The Deloitte Moment

In October 2025, the Australian Financial Review reported that Deloitte had refunded part of a $440,000 government contract after discovering that a report it produced was riddled with AI-generated citation errors. These included fabricated titles, misattributed quotes and paragraph numbers that did not exist.[i] As The Mandarin subsequently observed, the report even contained invented footnotes and a garbled quotation attributed to Amato.[ii] For lawyers and policy designers alike, this was not a story about proofreading but one of structural failure. The news provided a clear example of what happens when systems built to predict language are asked to demonstrate law. In law, plausibility is never enough. Our discipline demands verifiability, auditability and proof.

Courts Know the Limits… But Only Half the Story

Across Australia, judicial policy now converges on a baseline understanding of generative AI: that LLMs are probabilistic, that they hallucinate, that their processes are opaque, and that human verification is essential.[iii] From New South Wales’ Practice Note SC Gen 23 to Queensland’s Guidelines for Judicial Officers on the Use of Generative AI, every jurisdiction echoes the same warning: AI should only be used with a human-in-the-loop to check and disclose how it has been used.[iv] That baseline is correct for public chatbots such as ChatGPT, but it is incomplete if assumed to describe all possible forms of AI in law. The next phase of development — and, importantly, regulation — must ask a deeper question: Can AI be designed to reflect law’s own discipline?

AI in Arbitration

Artificial intelligence is also beginning to reshape arbitration and alternative dispute resolution, where cross-border disputes, multilingual parties and compressed timelines create strong incentives to adopt new technologies. In Dubai, practitioners working within the DIFC have increasingly turned to generative-AI tools not merely for efficiency, but to bridge persistent language barriers in drafting, research and communication. The DIFC Courts have directly acknowledged the risks inherent in AI. In Practical Guidance Note No. 2 of 2023 – Use of Large Language Models and Generative AI in the DIFC Courts, the court warned of the potential for erroneous output, setting out the requirement for parties to verify and, in some cases, disclose the use of AI-generated material in submissions.[v] The Note reflects a judicial recognition that unless AI is subject to disciplined verification and transparent governance, it risks undermining procedural fairness in a forum that depends on speed, autonomy and trust.

At its heart, arbitration depends on credibility, expert evidence and the efficient management of complex cross-border proceedings. When language-prediction systems serve as a weak substitute for the legal method, they threaten not only accuracy but also the integrity of the process. The Chartered Institute of Arbitrators (CIArb) added its voice to the global conversation on trustworthy AI with the release of its Guidelines on the Use of AI in Arbitration on 13 March 2025.[vi] The Guidelines make clear that arbitrators cannot remain passive recipients of AI-generated material. Instead, they must actively shape the conditions under which AI is used in their proceedings. CIArb urges tribunals to raise the issue at the outset, to ensure that parties understand the parameters of permissible AI use and the corresponding duties of verification and disclosure.

The Guidelines also contemplate the appointment of AI experts where specialised technical understanding is required, recognising that procedural fairness demands transparency not only in outcome but in method. Importantly, CIArb encourages tribunals to address AI directly in their awards and to factor non-compliance with AI-related directions into costs. In doing so, it reframes AI not as an unregulated convenience but as a component of arbitral procedure that must be governed with the same rigour and accountability that law demands.

Trustworthiness by Design: Aligning Governance with the Legal Method

The governance of AI in law must move beyond compliance to embody trustworthiness. Most existing frameworks, from the EU AI Act to Australia’s AI Ethics Principles and the 2023 Interim Guidance on Government Use of Generative AI Tools, to the OECD’s AI Guidelines, focus on managing risks, increasing transparency of data flows[vii] and decision-making processes[viii] and providing human oversight. These are essential but incomplete as they govern outcomes but not reasoning. In contrast, legal practice holds that reasoning is the very process that demands governance.

A trustworthy system for legal decision-support must therefore be governed not only by ethical principles but by the law’s standards of reasoned decision-making, transparency and procedural fairness that underpin the rule of law. It is insufficient to discuss these merely as ‘transparency’[ix] because what is required is a fundamental rethink of design principles: first, data lineage should function as the equivalent of an evidentiary chain; second, explainable reasoning paths must take form and draw from our extensive history of written judgments; and third, audit logs should serve as records for procedural fairness. In this manner, governance and architecture do not sit as separate domains but as interdependent layers of trust.

Where the law’s discipline demands evidence, governance demands accountability. In trustworthy AI research and professional practice alike, the two must converge. Systems designed for legal use must not only retrieve the right sources; they must demonstrate why those sources matter. Trustworthy AI requires verifiable processes, traceable data flows and clear lines of responsibility.[x] From our combined perspectives (practitioner and governance specialist), we see that the future of legal AI will depend on embedding trustworthiness-by-design into both the technical architecture and the regulatory frameworks that oversee it. Trustworthiness in this context encompasses ability, integrity and, most importantly, benevolence: the law’s orientation towards the individual requires the system to perform reliably, make its reasoning transparent and be accountable for the conclusions presented.

Architecture is necessary but not sufficient. Without governance, which we see articulated as professional standards, carefully calibrated verification protocols and disclosure norms, even the best-designed systems risk misuse. Our shared position is that the discipline of law and the discipline of engineering must meet in design. AI tools used in legal settings should embed:

Source primacy – retrieval limited to cases and statutes.
Structured reasoning – logic steps that are visible and explainable.
Guardrails – capacity to refuse an answer when data is insufficient.
Audit trails – persistent records for oversight and accountability.
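
As an illustration only, the four features listed above can be sketched in a few lines of Python. This is a toy under stated assumptions, not a description of any real product: the names Source, Answer and LegalAssistant are hypothetical, retrieval is naive keyword matching, the citations are placeholders, and the audit log is an in-memory list where a real system would need durable, append-only storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List

# Source primacy: only these kinds of material may be retrieved (assumed labels).
PRIMARY_KINDS = {"case", "statute"}


@dataclass(frozen=True)
class Source:
    citation: str  # placeholder citation, not a real authority
    kind: str      # e.g. "case", "statute", "commentary"
    text: str


@dataclass
class Answer:
    refused: bool
    steps: List[str]      # structured reasoning: each step is visible
    citations: List[str]


@dataclass
class LegalAssistant:
    corpus: List[Source]
    audit_log: List[Dict] = field(default_factory=list)

    def retrieve(self, query: str) -> List[Source]:
        # Naive whole-word matching, restricted to primary sources.
        terms = set(query.lower().split())
        return [
            s for s in self.corpus
            if s.kind in PRIMARY_KINDS and terms & set(s.text.lower().split())
        ]

    def answer(self, query: str) -> Answer:
        sources = self.retrieve(query)
        if not sources:
            # Guardrail: refuse rather than guess when no primary source matches.
            result = Answer(True, ["No primary source matched; declining to answer."], [])
        else:
            steps = [f"{s.citation}: {s.text}" for s in sources]
            result = Answer(False, steps, [s.citation for s in sources])
        # Audit trail: record every query and outcome, including refusals.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "query": query,
            "refused": result.refused,
            "citations": result.citations,
        })
        return result
```

In this sketch a query that matches only secondary commentary is refused, and the refusal itself is logged, so the record shows what the system declined to answer as well as what it answered.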

When those features are codified not only in software but also in professional regulation, we move from trusting AI to verifying it.

Ethical Accountability: Beyond Technical Compliance

Ethical accountability in legal AI must extend beyond technical safeguards to moral responsibility. Lawyers cannot outsource judgment to machines. Likewise, developers cannot defer accountability to users. Both lawyer and developer must share responsibility for ensuring that automated reasoning remains anchored to law’s ethical foundations: fairness, integrity[xi] and commitment to transparency.[xii]

Professional ethics clearly articulate these obligations within the existing framework of legal duties. The duty of competence requires lawyers to understand the tools they use. The duty to the court demands that submissions are verified and truthful. The duty of confidentiality extends to data fed into machine systems. These duties do not vanish in the digital age. Instead, they expand. When AI is used in the practice of law, ethical accountability shifts from private virtue to shared governance, expanding from the individual lawyer’s intentions to the design of the system in every legal context in which it is used.

At the same time, human oversight must be substantive, going beyond the symbolic. To preserve trust, practitioners must understand and explain how a conclusion is reached. This capacity to contest and verify outputs is central to trustworthy AI. Ethical accountability therefore rests on three pillars: transparency in design, responsibility in deployment, and explainability in use. Together, these sustain what law demands most: reasoned justification.

From Prediction to Proof: Designing for Law’s Method

This premise underpinned the work at MiAI Law. If the legal method is defined by transparent reasoning and traceable authority, then any AI built for law should replicate those features. MiAI Law therefore retrieves only from primary sources: legislation and judgments. Each report is set out in IRAC form (Issue, Rule, Application, Conclusion), with every proposition footnoted to a pinpoint citation and the reasoning path exposed.
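
To make the shape of such a report concrete, the following Python sketch shows one way an IRAC structure with pinpoint-footnoted propositions could be modelled. It is a hypothetical illustration and does not describe MiAI Law's actual implementation: the names Proposition and IracReport, the choice to treat the issue as an uncited question, and the placeholder authorities are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Proposition:
    text: str
    authority: Optional[str] = None  # placeholder name of a case or Act
    pinpoint: Optional[str] = None   # paragraph or section, e.g. "[12]"

    @property
    def supported(self) -> bool:
        # Counts as supported only with both an authority and a pinpoint.
        return bool(self.authority and self.pinpoint)


@dataclass
class IracReport:
    issue: str  # assumption: the issue is a question and carries no citation
    rule: List[Proposition] = field(default_factory=list)
    application: List[Proposition] = field(default_factory=list)
    conclusion: List[Proposition] = field(default_factory=list)

    def propositions(self) -> List[Proposition]:
        return self.rule + self.application + self.conclusion

    def unsupported(self) -> List[Proposition]:
        return [p for p in self.propositions() if not p.supported]

    def render(self) -> str:
        # Refuse to produce a report containing any unfootnoted proposition.
        if self.unsupported():
            raise ValueError("every proposition needs a pinpoint citation")
        lines = ["Issue", f"  {self.issue}"]
        notes: List[str] = []
        sections = (
            ("Rule", self.rule),
            ("Application", self.application),
            ("Conclusion", self.conclusion),
        )
        for heading, props in sections:
            lines.append(heading)
            for p in props:
                notes.append(f"{p.authority} at {p.pinpoint}")
                lines.append(f"  {p.text}[{len(notes)}]")
        lines.append("")
        lines.extend(f"[{i}] {note}" for i, note in enumerate(notes, start=1))
        return "\n".join(lines)
```

The design choice worth noting is that rendering fails outright when a proposition lacks a citation, so an unverifiable statement cannot reach the reader in the first place.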

From a practitioner’s standpoint, that design is transformative. It enables verification at every step and ensures that legal propositions are proven, not predicted. From a governance perspective, this architecture introduces the very transparency and accountability that courts and regulators seek. When every output can be traced from issue to authority, the system itself becomes auditable — a precondition for public trust.

The Future: Accountability by Design

The future of legal AI will not turn on faster models or larger datasets but on whether architecture and governance can reinforce each other. For practitioners, that means using tools that are verifiable and auditable. For technologists and regulators, it means recognising that legal AI must embody the principles of trustworthy automation — fairness through transparency, accountability and explainability.[xiii]

As AI becomes embedded in legal work, the profession itself must set the standard for its responsible use. Judicial and professional bodies across Australia and beyond have begun to issue guidance, but these remain general. What is now required is a clear and enforceable framework for professional accountability in the face of ‘the dynamism of AI technologies’[xiv] and AI use.

At a minimum, lawyers should be subject to three obligations. First, a verification duty: every AI-augmented output must be checked against the source materials it purports to summarise. Second, a disclosure duty: when AI has materially contributed to legal analysis or drafting but has not been the subject of thorough human review and verification, this must be declared to clients or the court. It is insufficient to rely on blanket acknowledgements in contractual provisions or general disclaimers that AI ‘may have been used’. Disclosure must be specific, providing sufficient context and granularity, and needs to be proportionate to the task performed and the review undertaken. Where the necessary reviews have been completed to the satisfaction of the legal practitioner, the duty of candour that binds legal professionals will suffice.

Third, a competence duty: practitioners must understand the limitations of AI systems they use, including their data sources, reasoning processes and potential biases. A mature understanding of these limitations also requires acknowledging a parallel truth within the profession: perfection is a fiction. Lawyers frequently demand from technology a level of accuracy and consistency that exceeds what is attainable by human practitioners themselves. Recognising this asymmetry allows for more robust and transparent conversations with clients, our peers and the bench, about what can reasonably be achieved. Competence, in this sense, is not the pursuit of infallibility but the disciplined capacity to understand and manage imperfection.

Trustworthiness as the Foundation of Law and AI

The measure of progress in legal AI will not be perfection but proof. The AI systems most aligned with the discipline of law will not be those that promise flawless performance, but those that make their limitations visible and their reasoning accountable.[xv] In both architecture and governance, transparency must replace aspiration as the foundation of trust. When AI reveals its process, when each conclusion can be traced, explained and tested by the human legal user, it becomes a participant in the legal method rather than a distortion of it.

This reframing matters because the pursuit of perfection in technology has long obscured the humanity of legal practice itself. Lawyers err, evidence is contested, and reasoning evolves. The rule of law endures not because it is perfect, but because it is accountable. The same must hold true for the tools that assist it. Trustworthy legal AI must therefore be designed not to conceal uncertainty, but to articulate it.

From our respective vantage points (one co-author from the Bar, the other from the trust and governance frontier), we share a conviction that the future of legal AI depends on designing for accountability, not infallibility. Law’s enduring task is to make reasons visible. The same must hold for the machines we build to assist it, with progress in legal AI measured not in speed but in proof and trust. In this manner, the trustworthy legal AI of the future will not ask to be believed but will show its workings, looking beyond the quest for perfect code to the signals it sends to the recipient of AI-augmented output. Only then can we say that AI in law serves justice rather than undermines it.
