AI Contract Review

I’m working with a vendor that uses AI tools, and I’m struggling to understand some of the contract language around data usage, ownership, and liability. I don’t want to agree to terms that could expose my business or my clients to unnecessary risks. Can anyone explain what key clauses I should look for in an AI-related contract and what red flags I should watch out for?

First thing, assume the vendor contract favors the vendor. Your job is to narrow risk.

Here are the key areas to mark up or push back on.

  1. Data ownership
    Look for language like “Vendor owns all output” or “Vendor owns all derivatives.”
    You want:
    • You own your input data.
    • You own output related to your business data.
    • Vendor gets a limited license to use your data only to deliver the service.
    Red flag: vendor claims rights to reuse your data for “any purpose” or “product improvement” without strict limits.

  2. Training and product improvement
    Common clause: “Customer data may be used for training models and improving services.”
    Ask for:
    • Option to opt out, in writing.
    • Clear scope: anonymized, aggregated, no reidentification.
    • No use of your data to train models for your competitors.
    If they say they “need” it for model quality, ask for a specific technical reason. If they stay vague, treat as risk.

  3. Confidential and personal data
    If you send customer data, employee data, or trade secrets, push for:
    • Data classified as “Confidential” by default.
    • No sharing with third parties except approved subprocessors.
    • Data at rest and in transit encrypted.
    • Clear data retention and deletion timelines.
    Ask for a DPA (data processing agreement) with GDPR-style terms even if you are not in the EU. It sets a higher bar.

  4. Liability and caps
    Typical pattern: vendor caps liability at 12 months of fees or even less, excludes “indirect damages”, and has no IP indemnity for AI output.
    Try to negotiate:
    • Higher cap for data breach and IP infringement, like 2x to 3x annual fees or a fixed amount (a standard 12-month cap is just 1x annual fees, so this is a real increase).
    • No exclusion for data breach, confidentiality breach, and IP infringement.
    • Vendor responsible if their AI output infringes third party IP when you use it as instructed.
    If they refuse everything, treat the tool as high risk and use it only on non-critical data.

  5. Indemnity
    Ask for specific indemnity for:
    • Data breach caused by vendor or its tools.
    • IP infringement based on vendor models, training data, or output.
    • Regulatory fines where their failure causes the issue.
    They will try to push all responsibility to you for “your prompts and outputs.” Try to split it.
    You handle what you choose to input. They handle what their system generates.

  6. Security and access
    Look for:
    • Where data is stored and processed.
    • Use of subcontractors and hosting providers.
    • Access controls, logging, monitoring.
    If they say “industry standard” with no detail, ask for their security summary or SOC 2 report. If nothing exists, treat as low maturity.

  7. Data usage wording to edit
    Problem phrases:
    • “Any purpose”
    • “Perpetual, irrevocable, worldwide license to use Customer Data”
    • “Vendor owns all feedback and derivatives of feedback and related data”
    Safer wording:
    • “Vendor receives a non-exclusive, revocable license to use Customer Data solely to provide and support the services under this Agreement.”
    • “Customer retains all ownership rights in Customer Data and output that is based on Customer Data.”
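
If the contract is long, a quick text scan can help you find these phrases before (not instead of) legal review. Here's a minimal Python sketch; the phrase list is just the examples from this post, so extend it for your own deal, and the `contract.txt` file name is made up:

```python
import re

# Red-flag phrases from this thread -- illustrative, not a complete legal checklist.
RED_FLAGS = [
    r"any purpose",
    r"perpetual,?\s+irrevocable",
    r"worldwide license",
    r"owns? all (output|feedback|derivatives)",
    r"product improvement",
    r"derived data",
]

def flag_clauses(contract_text: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs that contain a red-flag phrase."""
    hits = []
    for lineno, line in enumerate(contract_text.splitlines(), start=1):
        if any(re.search(p, line, re.IGNORECASE) for p in RED_FLAGS):
            hits.append((lineno, line.strip()))
    return hits

if __name__ == "__main__":
    with open("contract.txt", encoding="utf-8") as f:  # hypothetical file name
        for lineno, line in flag_clauses(f.read()):
            print(f"line {lineno}: {line}")
```

A hit doesn't mean the clause is bad, and a clean scan doesn't mean the contract is safe; it just tells you where to read first.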

  8. Termination and deletion
    Ask for:
    • Clear process for data export in a usable format.
    • Timeframe for deletion of your data from systems and backups.
    • Written confirmation of deletion on request.
    If they say they keep data “for analytics” after termination, you need strong anonymization language.

  9. Internal guardrails
    Even with a good contract, set your own rules:
    • Do not input trade secrets unless needed.
    • Do not input PII unless contract and DPA cover it.
    • Keep a data classification list and share it with staff (rough sketch below).

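On the classification list: it doesn't have to be fancy. Here's a sketch in Python, assuming three tiers; the categories and examples are made up for illustration, so adapt them to your own data:

```python
# Sketch of an internal data classification list, assuming three tiers.
# Categories and examples are made up for illustration.
DATA_CLASSES = {
    "restricted": {  # never goes into a vendor AI tool
        "examples": ["trade secrets", "source code", "customer PII"],
        "ai_tools": "never",
    },
    "internal": {  # only with a signed contract and DPA covering it
        "examples": ["meeting notes", "draft policies", "internal docs"],
        "ai_tools": "only if contract and DPA cover it",
    },
    "public": {  # fine by default
        "examples": ["marketing copy", "published blog posts"],
        "ai_tools": "allowed",
    },
}

def policy_for(data_class: str) -> str:
    """Look up the AI-tool rule for a given class; unknown data defaults to 'never'."""
    return DATA_CLASSES.get(data_class, {}).get("ai_tools", "never")
```

Defaulting unknown data to "never" matches the spirit of the guardrails above: staff should have to consciously downgrade data, not accidentally leak it.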
If you can, run the contract by a lawyer with tech or SaaS experience. Tell them to focus on: data ownership, training rights, liability cap, and indemnity for AI output. That will keep the bill tighter and focused on the real risks.

Couple of things I’d layer on top of what @jeff already laid out.

  1. Watch for “Output as a Service” tricks
    Some AI vendors try to say: “We own all models and all output, you just get a license.”
    That can quietly kill your ability to:
  • Resell what you create
  • Use outputs after termination
  • Claim any IP in what you designed

Ask for a clause that says something like: you own business logic, workflows, prompts, and configurations that are specific to your use case. If they refuse and say “everything is ours,” assume they want to box you in and upsell forever.

  2. Prompt + context = your secret sauce
    A lot of contracts ignore prompts entirely. That’s risky.
    You want:
  • Prompts and system configurations that you create are your confidential information
  • They cannot reuse those for other customers
    Otherwise they can effectively productize the know‑how you spent years building, just because you typed it into their tool.
  3. Shadow training risk
    Even if they say “we don’t train on your raw data,” look for vague wording like “usage patterns,” “metadata,” or “derived data.”
    Ask:
  • Can “derived data” ever be used to reconstruct or approximate our data or strategies?
  • Are you using embeddings, vectors, or logs from our environment to benefit other customers?

If they won’t define “derived data,” assume it’s a back door to train on your stuff.

  4. Regulator & audit angle
    Most contracts totally skip this:
  • Do you have a right to audit or get a third‑party security summary at least annually?
  • If a regulator, auditor, or big customer asks you where and how data is processed, will the vendor cooperate in writing, within a time limit?
    Add a clause that they must reasonably assist with compliance inquiries and incident investigations, at their cost if the issue is on their side.
  5. GenAI hallucination & “you’re on your own” language
    Vendors love to say:
  • “Customer is solely responsible for verifying all outputs”
  • “Outputs are provided as is and may be inaccurate”

Fine in principle, but try to:

  • Limit this where you follow their documented “best practices”
  • Require them to fix systemic issues if the model repeatedly generates harmful or infringing content
    Also push for a commitment to content filters or safety controls if your use case is sensitive (legal, medical, HR, finance). Not bulletproof, but it gives you leverage when something goes sideways.
  6. Open source & third‑party model landmines
    Ask specifically:
  • Are you using open source models or datasets with copyleft or weird licensing?
  • Do any third‑party terms “flow down” to me?

If they use third‑party AI, the contract should say they are responsible for managing those licenses and indemnifying you from claims tied to that stack. You should not have to chase six different model providers because your vendor stitched them together.

  7. Exit strategy for model dependence
    Jeff covered data export; I’d go further:
  • Can you export prompts, config, and fine‑tuning artifacts in some usable form?
  • If they shut down or are acquired, do you have a fallback or escrow plan for your critical automations?

Might sound dramatic, but AI vendors die or pivot all the time. If your operations depend on them, you want some form of “continuity of use” plan, even if light.

  8. Practical way to triage risk
    If you’re not a lawyer, here’s a quick triage:
  • Red use: core IP, trade secrets, heavy PII
  • Yellow use: internal docs, nonpublic but not catastrophic
  • Green use: public marketing, generic content

If the contract is weak:
  • Only use them for green

If the contract is decent but not great:
  • Green + some yellow

If the contract is strong and they show real security maturity:
  • Then consider red, with internal rules
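
If it helps to make this concrete for staff, the triage above can live as a tiny lookup table. A minimal sketch; "weak" / "decent" / "strong" are my own informal labels for contract quality, not contract terms:

```python
# The red/yellow/green triage above as a lookup table.
# "weak" / "decent" / "strong" are informal labels for contract quality.
ALLOWED_TIERS = {
    "weak":   {"green"},
    "decent": {"green", "yellow"},
    "strong": {"green", "yellow", "red"},  # red still needs internal rules
}

def can_use(contract_strength: str, data_tier: str) -> bool:
    """True if this data tier may go into the vendor's tool under this contract."""
    return data_tier in ALLOWED_TIERS.get(contract_strength, set())

# Example: a decent contract covers yellow data but not red.
assert can_use("decent", "yellow")
assert not can_use("decent", "red")
```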
  9. When to actually walk away
    For me, hard “no” signs:
  • “Perpetual, irrevocable license to everything including your confidential data”
  • No opt-out from training at all, plus vague “derived data” rights
  • Zero extra liability for data breaches and no willingness to tweak
  • Refusal to say where data is stored or who subprocessors are

If they won’t move even a little on any of those, that’s not just “vendor‑friendly,” that’s “we own your future” friendly.

If you want, you can paste a few specific clauses (scrub any sensitive info) and people here can help translate the worst offenders into normal human language.