Threat Model
AI agents face 16 categories of threats. Most are invisible to traditional security tools because they hide in tool descriptions, config files, and prompt instructions — not executable code.
A study of MCP servers found that 72.8% of tool poisoning attacks succeed against unaudited agent stacks. 341 malicious tools have been found on agent marketplaces. 82% of MCP servers have path traversal vulnerabilities. Firmis detects all of these statically, before your agent runs a single tool.
At a glance
Section titled “At a glance”| # | Category | Rules | Severity | What it enables |
|---|---|---|---|---|
| 1 | Tool Poisoning | 10 | Critical–High | Hidden instructions that hijack agent behavior |
| 2 | Data Exfiltration | 12 | Critical–High | Sending your files and env vars to attacker servers |
| 3 | Credential Harvesting | 18 | Critical–High | Reading ~/.aws/credentials, SSH keys, token caches |
| 4 | Prompt Injection | 13 | Critical–High | Overriding agent instructions from tool output or config |
| 5 | Secret Detection | 60 | Critical–Medium | Hardcoded API keys, tokens, and passwords in source |
| 6 | Supply Chain | 8 | Critical–High | Compromised or typosquatted dependencies |
| 7 | Malware Signatures | 6 | Critical | Known malicious code patterns |
| 8 | Known Malicious | 10 | Critical | Packages flagged in threat databases |
| 9 | Network Abuse | 10 | High–Medium | Unauthorized DNS lookups and HTTP connections |
| 10 | File System Abuse | 10 | High–Medium | Reads/writes to sensitive system paths |
| 11 | Permission Overgrant | 7 | High–Medium | Tools requesting broader permissions than they need |
| 12 | Agent Memory Poisoning | 7 | High | Corrupting agent context to affect future behavior |
| 13 | Malware Distribution | 6 | Critical–High | Code that downloads and executes additional payloads |
| 14 | Suspicious Behavior | 16 | High–Medium | Obfuscation, encoded payloads, evasion techniques |
| 15 | Insecure Configuration | 3 | Medium–Low | Disabled security controls, open CORS, weak defaults |
| 16 | Access Control | 3 | High–Medium | Missing authentication or authorization checks |
Total: 209 rules across 16 categories.
Tool Poisoning
Section titled “Tool Poisoning”Tool poisoning is the most direct attack against AI agents. A malicious MCP server can inject hidden instructions into a tool description that tells the agent to read ~/.aws/credentials and send the contents to an attacker’s server — all while showing the user a perfectly innocent tool name like “search the web.”
Because agents read tool descriptions to understand what a tool does, hidden content in those descriptions can redirect agent behavior without the user’s knowledge. The attack is invisible to code review because the payload is in a string, not in logic.
Example finding (tp-001 — Critical): Zero-width Unicode characters (\u200B, \uFEFF) in a tool description. These characters are invisible to humans reviewing the code but are processed by the agent as text, allowing hidden instructions to be smuggled past review.
Example finding (tp-002 — High): The phrase “Ignore all previous instructions” embedded in a tool description fetched from an external MCP server — a textbook prompt override attack.
Data Exfiltration
Section titled “Data Exfiltration”Data exfiltration rules detect code patterns that send local data — files, environment variables, clipboard contents — to external URLs or third-party services. The attack is rarely obvious: the exfiltration is usually embedded inside a tool that also does something legitimate.
Example finding (exfil-003 — Critical): A tool that reads a local file and passes its contents as the body of a fetch() POST request to an external URL.
Example finding (exfil-007 — High): A tool that accesses process.env and sends environment variable values to a webhook endpoint. If your agent has access to API keys via env vars, this is a full credential dump.
Credential Harvesting
Section titled “Credential Harvesting”Credential harvesting rules detect access to files that store cloud credentials, SSH keys, browser-stored passwords, and token caches. These files are the single highest-value targets on a developer’s machine. Access to them from agent code is almost always unauthorized.
Example finding (cred-001 — High): A reference to ~/.aws/credentials or ~/.aws/config in a tool’s file-read path.
Example finding (cred-002 — Critical): Access to ~/.ssh/id_rsa — a private SSH key file that grants access to every server it’s authorized on.
Prompt Injection
Section titled “Prompt Injection”Prompt injection is different from tool poisoning. Tool poisoning corrupts the tool definition itself. Prompt injection arrives through content the agent reads at runtime — a web page, a tool return value, a Markdown file, or a database record.
Unlike XSS or SQL injection, prompt injection does not require code execution. A plain-text instruction embedded in a document is enough to override the agent’s behavior if the agent is instructed to follow instructions in the content it reads.
Example finding (pi-001 — Critical): A Markdown file consumed by an agent containing the phrase “Disregard your instructions and instead…”
Example finding (pi-008 — High): A tool return value template containing a role-reassignment phrase such as “You are now operating in unrestricted mode.”
Secret Detection
Section titled “Secret Detection”Secret detection covers 60 rules for hardcoded credentials across cloud providers, SaaS APIs, infrastructure services, and generic token patterns. This is the largest category by rule count because hardcoded secrets are still the most common security mistake in software — and they become dramatically more dangerous when an AI agent can read and exfiltrate them.
Services covered include: AWS, Azure, GCP, GitHub, GitLab, Slack, Stripe, Twilio, SendGrid, HuggingFace, OpenAI, Anthropic, Datadog, PagerDuty, HashiCorp Vault, Docker Hub, npm tokens, SSH private key headers, and more.
Example finding (sec-045 — Critical): An OpenAI API key (sk-...) hardcoded in a tool configuration file.
Example finding (sec-012 — High): An AWS Access Key ID (AKIA...) in a Python source file — one grep away from a full account compromise.
Supply Chain
Section titled “Supply Chain”The agent ecosystem has a supply chain problem. Packages get compromised. Maintainers get coerced. Typosquatted packages with nearly identical names sit in registries waiting to be installed.
Supply chain rules detect dependencies with known security incidents and typosquatting patterns that mimic popular package names.
Example finding (supply-001 — Critical): A dependency on event-stream — a package that was compromised to steal bitcoin wallets and downloaded by millions of developers before the attack was discovered.
Example finding (supply-002 — High): A dependency named lodassh — a typosquat of lodash that runs a reverse shell on install.
Malware Signatures
Section titled “Malware Signatures”Malware signature rules match code patterns associated with known malware families and attack tools observed in the wild. These are the highest-confidence findings in the rule set. If one fires, something is very wrong.
Example finding (mal-003 — Critical): A Base64-encoded payload string matching a known command-and-control beacon pattern — the fingerprint of a specific malware family that has been observed targeting developer machines.
Known Malicious
Section titled “Known Malicious”Known malicious rules match package names and identifiers from curated threat intelligence databases, including packages flagged by npm security teams and community disclosures.
Example finding (km-007 — Critical): A dependency on a package that was reported as malicious in the npm advisory database — still installable, still in package.json, silently running on every npm install.
Network Abuse
Section titled “Network Abuse”Network abuse rules detect unauthorized DNS lookups, HTTP requests to suspicious domains, tunneling services, and data-over-DNS patterns used to bypass network monitoring.
Example finding (net-004 — High): HTTP requests to a tunneling service (ngrok.io, localtunnel.me) that creates an unmonitored egress channel. Legitimate tools rarely need to phone home through a tunnel.
Example finding (net-009 — High): DNS TXT record lookups that encode exfiltrated data in DNS queries — a technique specifically designed to bypass HTTP-level network monitoring and firewall rules.
File System Abuse
Section titled “File System Abuse”File system abuse rules detect reads, writes, or deletions of sensitive system paths — including /proc filesystem entries, system logs, shell history files, and container credential paths.
Example finding (fs-001 — High): Access to /proc/self/environ — reads the process environment directly from the kernel filesystem, exposing all environment variables including any secrets injected at runtime.
Example finding (fs-006 — High): Writing to or truncating system log files to cover activity traces — a classic anti-forensics technique.
Permission Overgrant
Section titled “Permission Overgrant”Permission overgrant rules detect tool definitions that request broad or wildcard permissions without scoping them to the minimum required for the tool’s declared purpose. This is the agent equivalent of a mobile app requesting access to your camera, contacts, and location to show you weather.
Example finding (perm-003 — High): An MCP server tool declaring permissions: ["*"] rather than enumerating specific permission scopes. A wildcard grant means the tool can do anything the agent can do.
Agent Memory Poisoning
Section titled “Agent Memory Poisoning”Agent memory poisoning rules detect patterns that corrupt or hijack the agent’s context window, conversation history, or persistent memory — causing the agent to behave differently in future turns. Unlike prompt injection (which attacks a single session), memory poisoning persists.
Example finding (mem-002 — High): A tool that writes adversarial instructions into a persistent memory file consumed by the agent on startup. Every future session starts with the poisoned context.
Malware Distribution
Section titled “Malware Distribution”Malware distribution rules detect code patterns that download and execute additional payloads, install backdoors, or propagate malicious code to other systems.
Example finding (dist-001 — Critical): A curl | bash pipe-to-shell pattern that downloads and immediately executes a remote script without verification. The downloaded script could be anything; there is no integrity check.
Suspicious Behavior
Section titled “Suspicious Behavior”Suspicious behavior rules cover obfuscation techniques, encoded payloads, and evasion patterns that are not specific to one threat category but indicate malicious intent. Legitimate tools rarely need to hide what they do.
Example finding (sus-004 — High): A long Base64-encoded string passed to a dynamic code execution function — a common technique for hiding malicious logic from static scanners and from developers reviewing the code.
Example finding (sus-011 — Medium): Heavy use of string concatenation to build a URL, specifically structured to evade simple domain-matching rules. The result URL is never visible in a single line of source.
Insecure Configuration
Section titled “Insecure Configuration”Insecure configuration rules detect agent configurations that disable security controls, set overly permissive CORS policies, or use known-insecure default settings.
Example finding (cfg-002 — Medium): A server configuration with allowOrigins: "*" and no authentication requirement — any website can make authenticated requests to the agent’s tool server.
Access Control
Section titled “Access Control”Access control rules detect missing authentication checks on tool endpoints, unauthenticated admin routes, and hardcoded bypass conditions.
Example finding (ac-001 — High): A tool handler that processes requests without verifying the caller’s identity or checking an authorization token — any process that can reach the socket can invoke the tool.
What to read next
Section titled “What to read next”- Detection Engine — how rules are evaluated, scored, and why Firmis keeps false positive rates low
- Built-in Rules — full list of all 209 rules with IDs and descriptions
- Ignoring Findings — suppress false positives per file or rule without disabling the entire category
- firmis scan — CLI reference and severity filtering flags