Why we wrote this down
A typical AI vendor sales call runs forty-five minutes. Ten are a logo slide and a roadmap. Twenty are a feature demo. Fifteen are pricing and a soft close. By the end, the vendor has answered every question they wanted to answer and almost none of the ones that decide whether the tool is safe to plug into your firm.
The buyer-side gap is structural. A general counsel can read a DPA. A CIO can read a SOC 2 Type II report. Neither of those documents describes what actually happens to a prompt after the user clicks Send. That is where this diligence operates: not on the contracts the vendor will sign, but on the operational reality the contracts are supposed to govern.
The five questions below are the ones we put to a vendor before recommending the tool to a client. They test data flow, insider risk, change management, supply chain, and incident response. They are designed to be answerable, by a vendor that knows what it is doing, inside the time a procurement cycle actually allows. The contractual layer (DPA, ZDR, BAA) sits underneath them. Contracts describe what is supposed to happen. These questions test what does.
If your firm wants help running this kind of diligence, that is what our Secure AI Deployment practice was built for.
An AI vendor pitch lasts forty-five minutes. We have five questions. If the vendor cannot answer them clearly, we do not recommend the vendor.
"Where exactly do our prompts go after the response is rendered?"
Every prompt is a small data export. The text leaves the user's machine, lands in a vendor's tenant, gets logged in some form, and may be retained, indexed, or sampled. The vendor either knows the path precisely or they do not. If they do not, no contractual language can save you. This question tests data flow, retention, log scopes, and geographic egress in one breath.
"Prompts are TLS-terminated at our edge in us-east-1, the inference call runs in the same region against an Anthropic Claude 3.5 Sonnet endpoint we hold under a Zero Data Retention agreement, and the prompt and completion write to a short-lived audit log retained thirty days for abuse monitoring. That log is encrypted at rest with a customer-managed key, accessible only to two named trust-and-safety staff via an audit-logged review console, and the access pattern is in our SOC 2 Type II scope. We can pin you to a single region by contract."
That is a vendor who has thought about the question.
"We are SOC 2 Type II certified and use enterprise-grade security. Your data is encrypted in transit and at rest." That is marketing copy. It names no region, no retention period, no logging scope, no human role. It disqualifies the vendor for any data class above commodity sensitivity.
A second red flag: the answer changes by who you ask. Sales says one thing, security another, engineering a third. If the company has no single source of truth on its own data flow, neither do you.
A common pattern: a vendor's marketing claims end-to-end encryption and a no-training default. The buyer assumes a clean flow. The actual flow routes prompts through a third-party embedding service in a different region, retains content ninety days under a "quality assurance" provision, and exposes the content to a customer-success team that does not appear on any trust-and-safety org chart. Nothing about that contradicts the public marketing. Most of it would violate a typical outside-counsel guideline. The marketing copy is true. The diligence question reveals what the marketing copy did not have to mention.
"Who at your company can read our prompts in plaintext, and under what circumstances?"
Technical controls only matter if you trust the people inside the vendor's perimeter. Insider risk is the most uncomfortable part of vendor evaluation, and the part that gets the least attention. For the data we work with, an engineer browsing prompts on a Friday afternoon is not a hypothetical. A good answer enumerates roles, conditions, and oversight.
"Three roles can see plaintext prompts under defined conditions. On-call engineering can access live inference logs for fifteen minutes after a paged incident, with written justification reviewed by a security lead within twenty-four hours. Trust-and-safety can review abuse-flagged samples, audited. Customer support can request access only with your written approval through a documented ticket. No model training team has access. Subprocessors do not. We will share the policy and audit log in writing on request."
"Only authorized personnel have access, and we follow industry best practices." Meaningless. No roles, conditions, durations, or oversight. It satisfies a checkbox, not a question.
A more dangerous red flag: the vendor admits, often casually, that any engineer with production access can pull prompts for debugging, or that customer success has standing access to all accounts. Both happen more often than the industry will admit. Both disqualify the vendor for privileged matter, MNPI, or principal-grade material.
A vertical AI tool aimed at investment teams. Product is good, demos land, the assistant feels useful from the first prompt. Question 2 then surfaces the support architecture: any engineer on the customer-success team can open any customer's recent prompt history to "help users get unstuck," with no audit trail and no notice to the customer. A deal team using the tool would be pasting target-company financials into an environment where a stranger in a support seat could browse them. None of this is in the marketing. None of it has to be. It lives entirely in the support model, which is exactly what this question is built to reach.
"What model versions do we run against, and what is your release schedule?"
Most AI vendors are reseller layers on top of a foundation model they do not control. When OpenAI ships a new GPT version or Anthropic deprecates a Claude endpoint, the vendor either pins your inference or silently swaps the model. Silent swaps are the default. This question tests reproducibility and change management. "We used the AI to draft this" is a different statement when the AI is a fixed snapshot than when it is "whatever the vendor was routing that week."
"Today you are on Claude 3.5 Sonnet, snapshot 2024-10-22, on the Anthropic API under a Zero Data Retention agreement. You can pin to a specific snapshot by contract. We give thirty days written notice before any model upgrade and ninety before deprecating a snapshot, and we keep two prior snapshots in production for customers who have not migrated. If you require a fixed version for an auditable workflow, we will write it into the order form."
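A pin like the one in that answer can also be enforced on the buyer side. The sketch below is a minimal CI-style guard, assuming an Anthropic-style naming convention in which dated snapshots end in a YYYYMMDD suffix (for example claude-3-5-sonnet-20241022) while floating aliases do not; the helper names are ours, and the suffix convention should be checked against the vendor's actual id scheme before relying on it.

```python
import re

# Assumption: dated snapshots end in a YYYYMMDD suffix
# (e.g. "claude-3-5-sonnet-20241022"); floating aliases
# (e.g. "claude-3-5-sonnet-latest") do not. Verify against
# your vendor's real id scheme.
SNAPSHOT_RE = re.compile(r"-20\d{6}$")

def is_pinned(model_id: str) -> bool:
    """True only if the model id names a dated snapshot."""
    return bool(SNAPSHOT_RE.search(model_id))

def require_pinned(model_id: str) -> str:
    """Fail loudly (e.g. in CI) if config drifts to a floating alias."""
    if not is_pinned(model_id):
        raise ValueError(f"model id {model_id!r} is not pinned to a snapshot")
    return model_id
```

Run against the model id in your deployment config on every release; the point is that a silent drift from snapshot to alias fails a build instead of surfacing in a deposition.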
"We always use the latest and greatest models so you get the best results." That sounds like a feature. It is the opposite for any firm with a record-keeping obligation. It means the model that produced an output last quarter is not the model that would produce it today, and there is no way to show a third party what actually generated the artifact in your file.
A subtler red flag: the vendor cannot tell you, in real time, which model is routing your traffic. Ask the question on a sales call. The honest answer takes seconds; the operations team has the configuration in front of them. The pause, the message to engineering, the ten-minute lookup, mean the same thing every time: the vendor does not know, at any given moment, which model is carrying your traffic.
A law firm using an AI document-review tool across a litigation matter. The vendor silently upgrades its model six weeks into review. Output quality shifts in ways the associates notice and the partners do not. When opposing counsel later asks about the review process, the firm cannot say which model reviewed which document. The version pin is the difference between a defensible audit trail and a deposition exhibit. For matters of any duration, a law-firm AI program should pin versions for the life of the matter.
"What is your subprocessor list, and how do we get notified when it changes?"
No AI vendor runs alone. Every one depends on hosting, monitoring, error tracking, support tooling, analytics, and one or more foundation-model providers. Each subprocessor may see prompt content, completion content, or sensitive metadata. The trail matters when a regulator asks. This question often surfaces exposure paths the vendor did not realize were exposure paths.
Rob Fuller frames the broader shift cleanly: "agents are now privileged users." A subprocessor's automated pipeline reading your prompts has the same effective access scope as a junior employee with read access to a sensitive folder. We treat them with the same scrutiny.
"Our subprocessor list is published at this URL: hosting (AWS us-east-1 and us-west-2), foundation-model provider (Anthropic, enterprise agreement), error monitoring (Sentry, configured to scrub prompt content before transmission), support tool (Zendesk, never receives prompt content), analytics (PostHog self-hosted, no external transmission). We give your designated contact thirty days written notice before adding or changing a subprocessor, with a right of objection in your DPA. We last updated the list on March 14."
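The "configured to scrub prompt content before transmission" detail in that answer is checkable. Sentry's Python SDK supports a before_send hook, passed as sentry_sdk.init(before_send=...), that can redact an event before it leaves the process. The sketch below shows such a hook as a plain function so the scrubbing logic is visible and testable; the field names (prompt, completion, messages) are illustrative assumptions, not Sentry's schema.

```python
# Illustrative field names -- adapt to where your app actually
# attaches prompt text on error events.
SENSITIVE_KEYS = {"prompt", "completion", "messages"}

def scrub_prompt_content(event, hint=None):
    """before_send-style hook: redact prompt content from an error
    event before transmission. In a real setup, register it via
    sentry_sdk.init(before_send=scrub_prompt_content)."""
    extra = event.get("extra", {})
    for key in list(extra):
        if key in SENSITIVE_KEYS:
            extra[key] = "[redacted]"
    # Drop request bodies wholesale -- they may embed prompt text.
    if "request" in event:
        event["request"].pop("data", None)
    return event
```

A buyer-side reviewer can ask the vendor for exactly this hook, or its equivalent in their monitoring SDK, and for a test showing a prompt-bearing event after scrubbing.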
"We use industry-standard infrastructure and our subprocessor list is available on request." Opacity dressed up as discretion. A vendor who does not publish a subprocessor list does not have a clean one.
A separate red flag: the published list does not match what is visible on the wire. The reconciliation is straightforward. Run a test session, capture outbound connections, compare them to the published list. Discrepancies are common. Sometimes a forgotten free-tier monitoring SDK. Sometimes a serious oversight. Either way the gap must close before signature; the published list is a representation, the wire is the audit.
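The capture-and-compare step is scriptable. Assuming you have exported the hostnames observed during a test session (from a proxy log or packet capture) and typed up the published subprocessor domains, the reconciliation reduces to a set comparison; the function name and input shapes below are ours.

```python
def reconcile(observed_hosts, published_suffixes):
    """Return observed hostnames not covered by any published
    subprocessor domain suffix -- the gap that must close
    before signature."""
    def covered(host):
        return any(host == s or host.endswith("." + s)
                   for s in published_suffixes)
    return sorted({h for h in observed_hosts if not covered(h)})
```

For example, with observed hosts {"api.anthropic.com", "o12345.ingest.sentry.io", "free-tier-metrics.example.net"} and a published list of {"anthropic.com", "sentry.io"}, the function returns only the metrics host: the forgotten SDK the paragraph above describes.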
A family office deploying an AI assistant for principal correspondence. The published subprocessor list reads clean. The wire tells a different story: error logs shipping to a monitoring vendor based outside the United States, with the client SDK never configured to redact prompt content before transmission. Correspondence about a principal's medical letters would be visible to a foreign-domiciled subprocessor that never appeared on the buyer-facing list. Subprocessor reconciliation belongs on every family-office AI program calendar; the wire is the audit, not the document.
"In a worst-case incident, what is your notification timing and what do we get in writing?"
Every vendor will eventually have an incident. The shape of the relationship is set by what they are obligated to do in that moment, and how fast. The day of the incident is not the day to learn the SLA. This question tests contractual specificity, authority, and whether the people on the phone in a crisis can commit to anything in writing.
"Forty-eight-hour written notification of any incident affecting your data, with a preliminary report covering affected data categories, time window, suspected vector, and containment. Thirty-day final report covering root cause, forensic timeline, remediation, and control changes. The response is led by our VP of Security, named in the DPA, with authority to commit the company. We will name the region, the audit-log range, and the specific prompts and completions involved, to the extent we can identify them. We will not require an NDA before sharing this with your counsel."
"We will notify you of any material security incident in accordance with applicable law." That is the floor, not a commitment. State breach-notification laws do not protect a confidential matter file or trust documents. A vendor who commits only to the legal floor is committing to tell you what they have to, when they have to. That is not a partner. That is a counterparty.
A sharper red flag: the vendor refuses to put a notification SLA in the order form. "We cannot commit to a specific timing because every incident is different" is the sentence to listen for. Every incident is different. A forty-eight-hour "we have an incident, here is what we know so far" is still a reasonable commitment for any vendor that takes incident response seriously, and any vendor that pushes back on it is telling you something about their crisis posture.
A private equity firm running a vendor whose default DPA contains no notification SLA. The vendor has a real incident. The firm learns about it forty-one days later, in a generic email that does not say whether the firm's data was involved. Two weeks of partner time and outside-counsel hours follow, trying to determine if MNPI on a target was exposed. By the time the firm has an answer, the deal has moved. The forty-eight-hour SLA, named human, and no-NDA-gate posture is what a PE AI program should require before signature, not after the call from the vendor's general counsel.
Want this run on the AI vendors your firm is evaluating?
Two starting points; both lead to the same place. The questionnaire is yours to take in sixty to ninety minutes. The briefing is thirty.
What this looks like in our Discovery phase
In a Trifident engagement, these five questions sit inside a broader vendor-risk worksheet alongside the Discovery Questionnaire and the architecture phase of our Secure AI Deployment practice. The questionnaire covers the firm's side: data classes, workflows, regulators. The vendor diligence covers the platform the firm is about to plug in.
The deliverable is a written per-vendor verdict: pass, conditional pass with required contractual or configuration changes, or no. Each verdict is signed by a Trifident partner. It is the kind of artifact a malpractice carrier, an SEC examiner, an LP DDQ team, or an opposing counsel can ask to see. The verdict stays current as the vendor's terms change.
If a firm would rather run the diligence in-house, the five questions are useful as a standalone checklist. Print them. Hand them to the next vendor on the calendar. If the firm would rather we run it, the briefing is thirty minutes and confidential.
Further reading
Fuller, Rob. The Day-Zero Normal: A Practical Reprioritization Guide for CISOs Entering the AI Vulnerability Era. Vulnerability Management Research Group, April 2026. (init6.com/papers/Day-Zero-Normal-CISO-Brief.pdf). The reprioritization table on page 3 is worth the read on its own; "TPRM questionnaires" moves from HIGH to LOW priority, with "agents are now privileged users" elevating non-human identity governance to CRITICAL. The questions in this post are how a buyer-side CISO operationalizes that shift on each new AI vendor.