Why we wrote this down
A typical AI vendor sales call runs forty-five minutes. Ten are a logo slide and a roadmap. Twenty are a feature demo. Fifteen are pricing and a soft close. By the end, the vendor has answered every question they wanted to answer and almost none of the ones that decide whether the tool is safe to plug into your firm.
The buyer-side gap is structural. A general counsel can read a DPA. A CIO can read a SOC 2 Type II report. Neither of those documents describes what actually happens to a prompt after the user clicks Send. That is where this diligence operates: not on the contracts the vendor will sign, but on the operational reality the contracts are supposed to govern.
The five questions below are the ones we put to a vendor before recommending the tool to a client. They test data flow, insider risk, change management, supply chain, and incident response. They are designed to be answerable, by a vendor that knows what it is doing, inside the time a procurement cycle actually allows. The contractual layer (DPA, ZDR, BAA) sits underneath them. Contracts describe what is supposed to happen. These questions test what does.
If your firm wants help running this kind of diligence, that is what our Secure AI Deployment practice was built for.
An AI vendor pitch lasts forty-five minutes. We have five questions. If the vendor cannot answer them clearly, we do not recommend the vendor.
"Where exactly do our prompts go after the response is rendered?"
Every prompt is a small data export. The text leaves the user's machine, lands in a vendor's tenant, gets logged in some form, and may be retained, indexed, or sampled. The vendor either knows the path precisely or they do not. If they do not, no contractual language can save you. This question tests data flow, retention, log scopes, and geographic egress in one breath.
"Prompts are TLS-terminated at our edge in us-east-1, the inference call runs in the same region against an Anthropic Claude 3.5 Sonnet endpoint we hold under a Zero Data Retention agreement, and the prompt and completion write to a short-lived audit log retained thirty days for abuse monitoring. That log is encrypted at rest with a customer-managed key, accessible only to two named trust-and-safety staff via an audit-logged review console, and the access pattern is in our SOC 2 Type II scope. We can pin you to a single region by contract."
That is a vendor who has thought about the question.
"We are SOC 2 Type II certified and use enterprise-grade security. Your data is encrypted in transit and at rest." That is marketing copy. It names no region, no retention period, no logging scope, no human role. It disqualifies the vendor for any data class above commodity sensitivity.
A second red flag: the answer changes by who you ask. Sales says one thing, security another, engineering a third. If the company has no single source of truth on its own data flow, neither do you.
A common pattern: a vendor's marketing claims end-to-end encryption and a no-training default. The buyer assumes a clean flow. The actual flow routes prompts through a third-party embedding service in a different region, retains content ninety days under a "quality assurance" provision, and exposes the content to a customer-success team that does not appear on any trust-and-safety org chart. Nothing about that contradicts the public marketing. Most of it would violate a typical outside-counsel guideline. The marketing copy is true. The diligence question reveals what the marketing copy did not have to mention.
"Who at your company can read our prompts in plaintext, and under what circumstances?"
Technical controls only matter if you trust the people inside the vendor's perimeter. Insider risk is the most uncomfortable part of vendor evaluation, and the part that gets the least attention. For the data we work with, an engineer browsing prompts on a Friday afternoon is not a hypothetical. A good answer enumerates roles, conditions, and oversight.
"Three roles can see plaintext prompts under defined conditions. On-call engineering can access live inference logs for fifteen minutes after a paged incident, with written justification reviewed by a security lead within twenty-four hours. Trust-and-safety can review abuse-flagged samples, audited. Customer support can request access only with your written approval through a documented ticket. No model training team has access. Subprocessors do not. We will share the policy and audit log in writing on request."
"Only authorized personnel have access, and we follow industry best practices." Meaningless. No roles, conditions, durations, or oversight. It satisfies a checkbox, not a question.
A more dangerous red flag: the vendor admits, often casually, that any engineer with production access can pull prompts for debugging, or that customer success has standing access to all accounts. Both happen more often than the industry will admit. Both disqualify the vendor for privileged matter, MNPI, or principal-grade material.
A vertical AI tool aimed at investment teams. Product is good, demos land, the assistant feels useful from the first prompt. Question 2 then surfaces the support architecture: any engineer on the customer-success team can open any customer's recent prompt history to "help users get unstuck," with no audit trail and no notice to the customer. A deal team using the tool would be pasting target-company financials into an environment where a stranger in a support seat could browse them. None of this is in the marketing. None of it has to be. It lives entirely in the support model, which is exactly what this question is built to reach.
"What model versions do we run against, and what is your release schedule?"
Most AI vendors are reseller layers on top of a foundation model they do not control. When OpenAI ships a new GPT version or Anthropic deprecates a Claude endpoint, the vendor either pins your inference or silently swaps the model. Silent swaps are the default. This question tests reproducibility and change management. "We used the AI to draft this" is a different statement when the AI is a fixed snapshot than when it is "whatever the vendor was routing that week."
"Today you are on Claude 3.5 Sonnet, snapshot 2024-10-22, on the Anthropic API under a Zero Data Retention agreement. You can pin to a specific snapshot by contract. We give thirty days written notice before any model upgrade and ninety before deprecating a snapshot, and we keep two prior snapshots in production for customers who have not migrated. If you require a fixed version for an auditable workflow, we will write it into the order form."
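A pin like the one in that answer can also be enforced on the buyer side. The sketch below is a minimal CI-style guard, assuming an Anthropic-style naming convention in which dated snapshots end in a YYYYMMDD suffix (for example claude-3-5-sonnet-20241022) while floating aliases do not; the helper names are ours, and the suffix convention should be checked against the vendor's actual id scheme before relying on it.

```python
import re

# Assumption: dated snapshots end in a YYYYMMDD suffix
# (e.g. "claude-3-5-sonnet-20241022"); floating aliases
# (e.g. "claude-3-5-sonnet-latest") do not. Verify against
# your vendor's real id scheme.
SNAPSHOT_RE = re.compile(r"-20\d{6}$")

def is_pinned(model_id: str) -> bool:
    """True only if the model id names a dated snapshot."""
    return bool(SNAPSHOT_RE.search(model_id))

def require_pinned(model_id: str) -> str:
    """Fail loudly (e.g. in CI) if config drifts to a floating alias."""
    if not is_pinned(model_id):
        raise ValueError(f"model id {model_id!r} is not pinned to a snapshot")
    return model_id
```

Run against the model id in your deployment config on every release; the point is that a silent drift from snapshot to alias fails a build instead of surfacing in a deposition.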
"We always use the latest and greatest models so you get the best results." That sounds like a feature. It is the opposite for any firm with a record-keeping obligation. It means the model that produced an output last quarter is not the model that would produce it today, and there is no way to show a third party what actually generated the artifact in your file.
A subtler red flag: the vendor cannot tell you, in real time, which model is routing your traffic. Ask the question on a sales call. The honest answer takes seconds; the operations team has the configuration in front of them. The pause, the message to engineering, the ten-minute lookup, mean the same thing every time: the vendor does not know, at any given moment, which model is carrying your traffic.
A law firm using an AI document-review tool across a litigation matter. The vendor silently upgrades its model six weeks into review. Output quality shifts in ways the associates notice and the partners do not. When opposing counsel later asks about the review process, the firm cannot say which model reviewed which document. The version pin is the difference between a defensible audit trail and a deposition exhibit. For matters of any duration, a law-firm AI program should pin versions for the life of the matter.
"What is your subprocessor list, and how do we get notified when it changes?"
No AI vendor runs alone. Every one depends on hosting, monitoring, error tracking, support tooling, analytics, and one or more foundation-model providers. Each subprocessor may see prompt content, completion content, or sensitive metadata. The trail matters when a regulator asks. This question often surfaces exposure paths the vendor did not realize were exposure paths.
Rob Fuller frames the broader shift cleanly: "agents are now privileged users." A subprocessor's automated pipeline reading your prompts has the same effective access scope as a junior employee with read access to a sensitive folder. We treat them with the same scrutiny.
"Our subprocessor list is published at this URL: hosting (AWS us-east-1 and us-west-2), foundation-model provider (Anthropic, enterprise agreement), error monitoring (Sentry, configured to scrub prompt content before transmission), support tool (Zendesk, never receives prompt content), analytics (PostHog self-hosted, no external transmission). We give your designated contact thirty days written notice before adding or changing a subprocessor, with a right of objection in your DPA. We last updated the list on March 14."
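The "configured to scrub prompt content before transmission" detail in that answer is checkable. Sentry's Python SDK supports a before_send hook, passed as sentry_sdk.init(before_send=...), that can redact an event before it leaves the process. The sketch below shows such a hook as a plain function so the scrubbing logic is visible and testable; the field names (prompt, completion, messages) are illustrative assumptions, not Sentry's schema.

```python
# Illustrative field names -- adapt to where your app actually
# attaches prompt text on error events.
SENSITIVE_KEYS = {"prompt", "completion", "messages"}

def scrub_prompt_content(event, hint=None):
    """before_send-style hook: redact prompt content from an error
    event before transmission. In a real setup, register it via
    sentry_sdk.init(before_send=scrub_prompt_content)."""
    extra = event.get("extra", {})
    for key in list(extra):
        if key in SENSITIVE_KEYS:
            extra[key] = "[redacted]"
    # Drop request bodies wholesale -- they may embed prompt text.
    if "request" in event:
        event["request"].pop("data", None)
    return event
```

A buyer-side reviewer can ask the vendor for exactly this hook, or its equivalent in their monitoring SDK, and for a test showing a prompt-bearing event after scrubbing.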
"We use industry-standard infrastructure and our subprocessor list is available on request." Opacity dressed up as discretion. A vendor who does not publish a subprocessor list does not have a clean one.
A separate red flag: the published list does not match what is visible on the wire. The reconciliation is straightforward. Run a test session, capture outbound connections, compare them to the published list. Discrepancies are common. Sometimes a forgotten free-tier monitoring SDK. Sometimes a serious oversight. Either way the gap must close before signature; the published list is a representation, the wire is the audit.
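The capture-and-compare step is scriptable. Assuming you have exported the hostnames observed during a test session (from a proxy log or packet capture) and typed up the published subprocessor domains, the reconciliation reduces to a set comparison; the function name and input shapes below are ours.

```python
def reconcile(observed_hosts, published_suffixes):
    """Return observed hostnames not covered by any published
    subprocessor domain suffix -- the gap that must close
    before signature."""
    def covered(host):
        return any(host == s or host.endswith("." + s)
                   for s in published_suffixes)
    return sorted({h for h in observed_hosts if not covered(h)})
```

For example, with observed hosts {"api.anthropic.com", "o12345.ingest.sentry.io", "free-tier-metrics.example.net"} and a published list of {"anthropic.com", "sentry.io"}, the function returns only the metrics host: the forgotten SDK the paragraph above describes.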
A family office deploying an AI assistant for principal correspondence. The published subprocessor list reads clean. The wire tells a different story: error logs shipping to a monitoring vendor based outside the United States, with the client SDK never configured to redact prompt content before transmission. Correspondence about a principal's medical letters would be visible to a foreign-domiciled subprocessor that never appeared on the buyer-facing list. Subprocessor reconciliation belongs on every family-office AI program calendar; the wire is the audit, not the document.
"In a worst-case incident, what is your notification timing and what do we get in writing?"
Every vendor will eventually have an incident. The shape of the relationship is set by what they are obligated to do in that moment, and how fast. The day of the incident is not the day to learn the SLA. This question tests contractual specificity, authority, and whether the people on the phone in a crisis can commit to anything in writing.
"Forty-eight-hour written notification of any incident affecting your data, with a preliminary report covering affected data categories, time window, suspected vector, and containment. Thirty-day final report covering root cause, forensic timeline, remediation, and control changes. The response is led by our VP of Security, named in the DPA, with authority to commit the company. We will name the region, the audit-log range, and the specific prompts and completions involved, to the extent we can identify them. We will not require an NDA before sharing this with your counsel."
"We will notify you of any material security incident in accordance with applicable law." That is the floor, not a commitment. State breach-notification laws do not protect a confidential matter file or trust documents. A vendor who commits only to the legal floor is committing to tell you what they have to, when they have to. That is not a partner. That is a counterparty.
A sharper red flag: the vendor refuses to put a notification SLA in the order form. "We cannot commit to a specific timing because every incident is different" is the sentence to listen for. Every incident is different. A forty-eight-hour "we have an incident, here is what we know so far" is still a reasonable commitment for any vendor that takes incident response seriously, and any vendor that pushes back on it is telling you something about their crisis posture.
A private equity firm running a vendor whose default DPA contains no notification SLA. The vendor has a real incident. The firm learns about it forty-one days later, in a generic email that does not say whether the firm's data was involved. Two weeks of partner time and outside-counsel hours follow, trying to determine if MNPI on a target was exposed. By the time the firm has an answer, the deal has moved. The forty-eight-hour SLA, named human, and no-NDA-gate posture is what a PE AI program should require before signature, not after the call from the vendor's general counsel.
Want this run on the AI vendors your firm is evaluating?
Two starting points; both lead to the same place. The questionnaire is yours to take in sixty to ninety minutes. The briefing is thirty.
What this looks like in our Discovery phase
In a Trifident engagement, these five questions sit inside a broader vendor-risk worksheet alongside the Discovery Questionnaire and the architecture phase of our Secure AI Deployment practice. The questionnaire covers the firm's side: data classes, workflows, regulators. The vendor diligence covers the platform the firm is about to plug in.
The deliverable is a written per-vendor verdict: pass, conditional pass with required contractual or configuration changes, or no. Each verdict is signed by a Trifident partner. It is the kind of artifact a malpractice carrier, an SEC examiner, an LP DDQ team, or an opposing counsel can ask to see. The verdict stays current as the vendor's terms change.
If a firm would rather run the diligence in-house, the five questions are useful as a standalone checklist. Print them. Hand them to the next vendor on the calendar. If the firm would rather we run it, the briefing is thirty minutes and confidential.
Further reading
Fuller, Rob. The Day-Zero Normal: A Practical Reprioritization Guide for CISOs Entering the AI Vulnerability Era. Vulnerability Management Research Group, April 2026. (init6.com/papers/Day-Zero-Normal-CISO-Brief.pdf). The reprioritization table on page 3 is worth the read on its own; "TPRM questionnaires" moves from HIGH to LOW priority, with "agents are now privileged users" elevating non-human identity governance to CRITICAL. The questions in this post are how a buyer-side CISO operationalizes that shift on each new AI vendor.