Legal AI Needs a Duty of Care
I was listening to the Risky Business podcast, and during an interview with the CEO of Thinkst Canary they mentioned a simple idea that felt surprisingly relevant to the state of legal AI. Thinkst have published a /security page that explains, without fuss, what their product can safely do, what it cannot do and the features they refuse to ship because they cannot make them safe. It is straightforward, oddly rare in legal tech and probably one of the most practical ideas we could adopt.
We spend a lot of time talking about responsible AI, but very few legal tech products describe their design choices in a way that actually helps buyers understand risk. Instead we default to the usual assurances: Encrypted. Private. Compliant (yawn). All of which matters, but none of which tells you whether an AI tool behaves safely inside a matter-driven environment where privilege and client separation are the foundations of trust.
It feels like the right moment for something more grounded. Something that matches how legal work actually operates rather than how a product brochure presents it.
The real risks sit underneath the features
When you look at the issues that genuinely worry lawyers, they rarely come from surface-level capabilities. They tend to appear underneath. A retrieval step that drifts across matters. A connector that reaches further than intended. An agent granted write access that nobody fully controls. A model that pulls context from a folder it should not even be able to see.
These risks do not come from the presence of AI. They come from design decisions that were never made explicit. A guardrail or a policy document cannot compensate for an architecture that allows behaviour the vendor never intended. This is why most procurement checklists, however well written, are mismatched to the actual failure modes of AI.
Legal AI is moving quickly, but the way we evaluate these products still belongs to an earlier era. The tools have changed. The questions have not.
Vendors should be willing to explain their boundaries
There is real value in a vendor explaining where the edges of their system sit. Boundaries say more about the maturity of a product than any feature list. They show that the vendor has challenged their own design and chosen not to include behaviours that cannot yet be delivered responsibly.
A legal AI vendor operating in today’s environment should be able to make clear statements about how the product behaves. For example:
- it does not write into live documents
- it does not ingest entire repositories
- it does not pool client material for training or optimisation
- it does not perform retrieval across matters
- it avoids capabilities that depend on unpredictable model behaviour
These boundaries are a sign of judgement, not a lack of ambition. Some ideas can technically be built today, but that does not mean they can be delivered safely within the expectations of legal practice.
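To show how concrete these statements can be, here is a minimal sketch, in Python and with entirely hypothetical names, of how a vendor might encode boundaries like these as an explicit, testable policy rather than a line in a brochure. It is an illustration of the idea, not a description of any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductBoundaries:
    """An explicit, reviewable record of what the product will not do."""
    writes_to_live_documents: bool = False
    ingests_entire_repositories: bool = False
    pools_client_material_for_training: bool = False
    retrieves_across_matters: bool = False


BOUNDARIES = ProductBoundaries()


def check_boundary(action: str) -> None:
    """Refuse any action the product has declared out of scope,
    regardless of what a model, prompt or integration asks for."""
    permitted = {
        "write_live_document": BOUNDARIES.writes_to_live_documents,
        "bulk_ingest_repository": BOUNDARIES.ingests_entire_repositories,
        "train_on_client_material": BOUNDARIES.pools_client_material_for_training,
        "cross_matter_retrieval": BOUNDARIES.retrieves_across_matters,
    }
    if not permitted.get(action, False):
        raise PermissionError(f"'{action}' is outside the product's declared boundaries")
```

The deny-by-default shape is the point: anything the vendor has not explicitly chosen to allow is refused, which is exactly the kind of design decision a duty of care should make visible.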
A shared duty of care would raise the standard
For legal AI to become something firms can genuinely rely on, we need a common way for vendors to describe their design decisions and the risks they have chosen to accept. A duty of care should not be a glossy marketing document. It should be a practical explanation of how the product behaves and how its boundaries are enforced.
A useful structure could look like this:
1. Scope of access
A clear account of what the system can read, what it can write and which areas it never touches.
2. Boundary enforcement
A description of how matter separation is maintained in practice, including the mechanisms that prevent cross-contamination rather than assuming the model will behave.
3. Behaviour under failure
An explanation of how the system responds when the model produces something unusual and how this is contained so it never reaches privileged content.
4. Breach containment
A realistic assessment of the consequences of a vendor-side compromise, with attention to the parts of the design that limit the blast radius.
5. Intentional omissions
A list of the capabilities the vendor has chosen not to build because the risks cannot yet be controlled. This is often the most revealing part of the document.
6. Evidence of validation
A description of how boundaries are tested, how releases are verified and how clients can gain confidence that the assurances are consistent over time.
This structure is simple enough for busy legal leaders and technical enough to matter. More importantly, it creates a baseline for comparing vendors on the aspects that determine operational safety rather than surface-level polish.
What this standard looks like in practice
Below is a short example for a fictional vendor, LexLexLex AI, that shows how a duty of care can be expressed clearly without drowning the reader in detail.
LexLexLex AI: Duty of Care Overview (Example)
Scope of access
LexLexLex processes only the documents a user selects inside the workspace. It does not crawl folders, index surrounding material or retain copies after a session. Each matter sits in a physically separate storage space.
Boundary enforcement
Every request generates a short-lived workspace that is limited to the specific matter. All retrieval paths are checked at the storage layer. If a document does not belong to that matter, the system cannot see it.
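As an illustration only, a storage-layer check of this kind might look something like the sketch below. The class and field names are hypothetical, not a description of how LexLexLex (or any real product) is built.

```python
class MatterScopedStore:
    """A storage handle bound to a single matter for the life of a session workspace."""

    def __init__(self, matter_id: str, documents: dict[str, tuple[str, str]]):
        # documents maps document_id -> (owning_matter_id, text)
        self.matter_id = matter_id
        self._documents = documents

    def fetch(self, document_id: str) -> str:
        owner, text = self._documents.get(document_id, (None, None))
        if owner != self.matter_id:
            # The check lives at the storage layer, not in the prompt or the
            # application code, so an out-of-matter document is simply
            # unreachable from this workspace.
            raise PermissionError(f"{document_id} is not part of matter {self.matter_id}")
        return text
```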
Behaviour under failure
AI outputs are reviewed inside the application before a user can export them. If the model references content outside the selected documents or produces unexpected patterns, LexLexLex pauses the process and raises it for review.
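A simple version of that review step might look like the sketch below, which assumes, purely for illustration, that the application tags citations in the form [doc:<id>].

```python
import re


def review_output(draft: str, selected_document_ids: set[str]) -> str:
    """Block export of any draft that cites a document outside the user's selection."""
    cited = set(re.findall(r"\[doc:([\w-]+)\]", draft))
    out_of_scope = cited - selected_document_ids
    if out_of_scope:
        # Pause rather than export: the draft stays inside the application
        # until a human has reviewed why out-of-scope material appeared.
        raise RuntimeError(f"Draft references out-of-scope documents: {sorted(out_of_scope)}")
    return draft
```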
Breach containment
If the LexLexLex environment were compromised, only the temporary workspaces created during active user sessions would be exposed. These contain only the documents explicitly selected by users and are destroyed when the session ends.
Intentional omissions
LexLexLex does not support autonomous contract rewriting, bulk ingestion of repositories or cross-matter embedding. These capabilities are considered unsafe because their boundaries cannot be guaranteed.
Evidence of validation
Each release is tested using a fixed set of boundary cases designed to expose unintended retrieval or leakage. Clients can request results or run the same tests inside their own environment.
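One of those boundary cases, written against the hypothetical MatterScopedStore sketched earlier, could be as small as this:

```python
import pytest


def test_workspace_cannot_see_other_matters():
    """A workspace scoped to matter A must never retrieve matter B's documents."""
    store = MatterScopedStore(
        matter_id="matter-a",
        documents={
            "doc-1": ("matter-a", "engagement letter"),
            "doc-2": ("matter-b", "privileged advice for another client"),
        },
    )
    assert store.fetch("doc-1") == "engagement letter"
    with pytest.raises(PermissionError):
        store.fetch("doc-2")
```

The value lies less in the test itself than in the rule around it: a failing case like this blocks the release rather than becoming a footnote.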
This level of clarity helps firms understand whether a product aligns with their risk appetite. It also creates a culture in which vendors are encouraged to show their reasoning rather than hiding behind broad assurances.
Buyers need better questions, not longer checklists
One of the reasons legal AI conversations stall is that people rely on familiar compliance questions. They are tidy, predictable and easy to check off, but they do not reveal how an AI system actually behaves. The next generation of procurement needs to probe the underlying design rather than surface guarantees whose main purpose is to cover the person asking while revealing very little.
The more useful questions are closer to these:
- Which risks remain inherent in your current design, and how are they contained?
- How do you enforce matter boundaries at the storage layer?
- Which capabilities have you intentionally avoided because they cannot yet be delivered safely?
- How do you test for context leakage or boundary drift, and what blocks a release?
- If your environment were compromised, which client assets would be accessible and why?
- What prevents your system from modifying or overwriting live documents?
- How do you ensure new features or updated prompts do not widen the blast radius?
- What evidence do you provide to show these assurances hold over time?
These questions shift the evaluation from sentiment to structure. They encourage vendors to explain their reasoning instead of offering recycled phrases.
Legal AI does not always need deeper autonomy
Another helpful reminder from the podcast was that some of the safest and most durable controls are those that deliberately avoid unnecessary complexity. Legal tech has equivalent examples. Retrieval-only tools, structured extraction, local inference for sensitive content and clear matter segmentation often provide more predictable value than an agent trying to interpret context across entire repositories.
There is always a temptation to build the next clever layer. The responsible choice is to expand capability only when you can guarantee the boundary that protects it.
Legal work depends on trust grounded in reasoning. AI adds new possibilities but also new failure modes. A duty of care helps vendors articulate their thinking and gives firms something more substantial than a slogan. It is a step towards tools that are both powerful and defensible.
The idea came from a small moment in a security conversation, but it translates neatly into the legal world. If a product wants to handle privileged work, it should be prepared to explain its limits and its judgement. When a vendor cannot do that, the issue is not the buyer’s scrutiny. It is the product.