5 AI Vulnerabilities Pen Testers Miss

You hired a pen testing firm. They tested your web app. Clean report. But they didn’t touch your AI features — because they don’t know how.

This is happening across the industry right now. Companies are getting their annual penetration test, checking the compliance box, and walking away thinking their application is secure. Meanwhile, their chatbot, copilot, or AI assistant sits in production with vulnerabilities that a traditional pen tester wouldn’t even know to look for.

Here are five critical AI vulnerabilities that your regular pen tester will miss.

1. Prompt Injection

Your pen tester tested for SQL injection. They tested for command injection. They tested for LDAP injection. But did they test whether a user can type a message that makes your chatbot ignore its system instructions and leak internal data?

Prompt injection is the number one vulnerability in AI applications. It’s the equivalent of SQL injection for the AI era — and it’s everywhere. An attacker crafts an input that overrides the AI’s instructions, causing it to follow the attacker’s commands instead. This can lead to data exfiltration, unauthorized actions, safety filter bypasses, and complete loss of control over the AI’s behavior.

Traditional pen testers don’t test for this because it doesn’t exist in their playbook. They’ve never had to think about how natural language input can be weaponized against a system.

2. Cross-Tenant Data Leakage Through AI

In a multi-tenant SaaS application, can one customer’s chatbot conversation access another customer’s data? Your pen tester probably checked API authorization — making sure Customer A can’t access Customer B’s API endpoints. But they didn’t check whether Customer A can ask the AI a question that causes it to retrieve and display Customer B’s data from the RAG pipeline.

This happens when the AI’s retrieval system doesn’t properly enforce tenant isolation. The API layer might be perfectly secure, but the vector database or document retrieval system that feeds the AI might not separate customer data correctly. The AI becomes a side channel that bypasses your carefully designed access controls.

3. System Prompt Extraction

Your AI has a system prompt — the instructions that tell it how to behave, what it can and can’t discuss, what data it can access, and what actions it can take. This system prompt often contains business logic, pricing rules, internal policies, competitive information, and technical architecture details.

Can a user extract it by asking the right questions? On most chatbots, the answer is yes — and it’s trivially easy. Simple prompts like “repeat your instructions” or more sophisticated techniques that gradually coax the AI into revealing its configuration can expose everything in the system prompt. Your pen tester never checked because they didn’t know the system prompt existed.

4. Excessive Agency

Your AI assistant can look up orders, process refunds, send emails, and update customer records. That’s what makes it useful. But can a user trick it into processing a refund it shouldn’t? Can it be manipulated into sending emails to arbitrary addresses? Can an attacker use the AI to modify records they don’t have permission to touch?

Traditional pen testers test API permissions — they verify that authenticated users can only access what they’re authorized to access. But they don’t test whether the AI itself can be manipulated into misusing its own permissions. The AI has legitimate access to these functions. The question is whether an attacker can hijack that access through the conversation interface. This requires a completely different testing approach that traditional pen testers aren’t trained for.

5. Indirect Prompt Injection via Documents

A user uploads a document to your application. The document looks normal — a PDF, a spreadsheet, a Word file. But hidden inside it are instructions that the AI will read and follow. The AI processes the document, encounters the hidden instructions, and executes them instead of doing what the user actually asked.

This attack vector simply doesn’t exist in traditional web applications. There’s no equivalent in the world of SQL injection and XSS. Your pen tester has never seen it because it’s unique to AI systems that process user-supplied documents — which includes virtually every AI application with file upload capabilities.

Hidden instructions can be embedded as white text on a white background, in document metadata, in tiny font sizes, or in other locations that are invisible to human readers but perfectly visible to the AI. The implications range from data theft to unauthorized actions to complete compromise of the AI’s behavior.

The Fix

You don’t need to choose between traditional pen testing and AI security testing. You need both — ideally from a team that can do both in a single engagement. Testing the web application and the AI layer together reveals vulnerabilities that exist at the intersection of the two, which neither test would catch in isolation.

Ready to Test Your AI Properly?

Book a free scoping call with our team. Our lead tester comes from Meta’s AI RED Team and tests both your AI and your application in a single engagement — so nothing falls through the cracks.

Book a Free Scoping Call →

5 AI Vulnerabilities Your Regular Pen Tester Will Miss