AI systems now connect to external tools and data sources through Model Context Protocol (MCP) servers, and security has become a critical concern. MCP is still new and operates in what experts describe as a “wild west” environment. The OWASP Gen AI Security Project addresses this challenge with a complete framework designed to help developers and organizations deploy third-party MCP servers safely. MCP servers introduce unique vulnerabilities that traditional APIs don’t have. These include tool poisoning, prompt injection and memory poisoning. We’ll get into the critical security threats facing MCP implementations and OWASP server security best practices. You’ll also find useful guidance to protect your AI infrastructure against emerging risks.
Understanding MCP Architecture and Security Challenges

MCP operates through a client-host-server architecture built on JSON-RPC 2.0, where host applications manage multiple clients with each maintaining a dedicated one-to-one connection to specific servers. The protocol defines three distinct layers: hosts coordinate AI systems and user interactions, clients handle protocol communication and server discovery, and servers expose capabilities through tools and resources. Communication occurs via two transport methods: STDIO for local integrations running in the same environment, and HTTP with Server-Sent Events for remote connections.
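To make the JSON-RPC 2.0 framing concrete, here is a minimal sketch of how a client might serialize a `tools/call` request and check the matching response. The helper names `make_request` and `parse_response` and the `read_file` tool are illustrative assumptions, not part of any MCP SDK.

```python
import json


def make_request(request_id: int, method: str, params: dict) -> str:
    """Serialize a JSON-RPC 2.0 request (hypothetical helper)."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )


def parse_response(raw: str, expected_id: int) -> dict:
    """Return the result of a JSON-RPC 2.0 response, or raise on any mismatch."""
    msg = json.loads(raw)
    if msg.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if msg.get("id") != expected_id:
        raise ValueError("response id does not match request id")
    if "error" in msg:
        err = msg["error"]
        raise RuntimeError(f"server error {err.get('code')}: {err.get('message')}")
    return msg["result"]


# "read_file" and its arguments are made-up values for illustration.
request = make_request(
    1, "tools/call", {"name": "read_file", "arguments": {"path": "notes.txt"}}
)
```

Checking the `id` on every response is a small step, but it keeps a client from attributing one server's reply to another request.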
The architecture follows a critical isolation principle: servers receive only necessary contextual information without access to full conversation history or visibility into other connected servers. This design promotes modularity but creates complex trust boundaries. Research from 2025 revealed severe implementation gaps. Knostic’s analysis of nearly 2,000 internet-exposed MCP servers found zero instances of authentication. Similarly, Backslash Security identified the same vulnerabilities in another 2,000 servers and documented patterns of over-permissioning and complete local network exposure.
The confused deputy problem emerges when MCP servers execute actions with broader privileges than the requesting user possesses, which violates least-privilege principles. Moreover, tool descriptions function as executable code loaded into model reasoning rather than as static documentation. This turns descriptions from passive metadata into active injection vectors, and it is what fundamentally distinguishes MCP security from traditional API protection.
Critical Security Threats in Third-Party MCP Servers

Invariant Labs uncovered a novel vulnerability class called Tool Poisoning Attacks (TPA), in which malicious instructions are embedded in tool descriptions, invisible to users but processed by LLMs. The attack surface extends well beyond description fields. Every schema element, including function names, parameter defaults and required fields, is a potential injection point. Attackers exploit the gap between MCP’s optimistic trust model and the inferential capabilities of LLMs.
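Because any schema element can carry a payload, a review step has to look at every string in a tool definition, not just the description. The following is a minimal heuristic sketch under that assumption. The pattern list and the function names `iter_strings` and `scan_tool` are illustrative only, and a pattern match is a signal for human review rather than proof of an attack.

```python
import re
from typing import Any, Iterator, List, Tuple

# Illustrative patterns only; a real deployment would maintain its own list.
SUSPICIOUS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"ignore (all |any )?(previous|prior) instructions",
        r"do not (tell|mention|inform)",
        r"<\s*important\s*>",
        r"\.ssh|id_rsa",
    )
]


def iter_strings(node: Any, path: str = "$") -> Iterator[Tuple[str, str]]:
    """Yield (path, text) for every string in a tool definition, keys included."""
    if isinstance(node, str):
        yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            yield child, str(key)  # parameter names are injection points too
            yield from iter_strings(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from iter_strings(item, f"{path}[{index}]")


def scan_tool(tool: dict) -> List[Tuple[str, str]]:
    """Return (path, pattern) pairs for each suspicious string found."""
    findings = []
    for path, text in iter_strings(tool):
        for pattern in SUSPICIOUS:
            if pattern.search(text):
                findings.append((path, pattern.pattern))
    return findings
```

A scan like this catches only known phrasings, so it complements, rather than replaces, manual review of new tools.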
Advanced Tool Poisoning Attacks (ATPA) manipulate tool outputs, such as error messages or follow-up prompts generated during execution. Similarly, MCP Rug Pull attacks occur when servers swap tool descriptions after approval: they serve benign versions during onboarding and deliver malicious ones later. Without version locking or signatures, these updates happen silently.
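One way to approximate version locking on the client side is to fingerprint each tool definition at approval time and compare fingerprints on every later listing. This is a minimal sketch, assuming tool definitions arrive as JSON-like dictionaries with a `name` field. The function names are hypothetical.

```python
import hashlib
import json
from typing import Dict, List


def fingerprint(tool: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the full tool definition."""
    canonical = json.dumps(
        tool, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def pin_tools(tools: List[dict]) -> Dict[str, str]:
    """Record a fingerprint per tool name at approval time."""
    return {tool["name"]: fingerprint(tool) for tool in tools}


def changed_tools(pins: Dict[str, str], tools: List[dict]) -> List[str]:
    """Names of tools that were modified, added, or removed since pinning."""
    current = pin_tools(tools)
    changed = [name for name, digest in current.items() if pins.get(name) != digest]
    removed = [name for name in pins if name not in current]
    return sorted(changed + removed)
```

If `changed_tools` returns anything, the client can withhold the affected tools until a human re-approves them.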
Memory poisoning targets persistent knowledge bases, with reported success rates of 80% or more when agents consult memory before responding. OWASP lists this threat as ASI06 and notes that its persistence makes it hard to detect. The financial exposure can be significant: GDPR breach penalties go up to €20 million or 4% of global annual revenue, whichever is higher.
Authentication vulnerabilities compound these risks. Tests revealed that only 4% (27 of 660) of authorization server endpoints support Dynamic Client Registration. Cross-server shadowing lets one malicious server hijack legitimate tools from trusted servers through hidden metadata instructions. Excessive permissions expand the attack surface when tools receive broader access than they need.
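Cross-server shadowing can be screened for, at least roughly, by comparing the tool lists of all connected servers. The sketch below flags tool names exposed by more than one server and descriptions that mention a tool owned by a different server. The input layout (server name mapped to a list of tool dictionaries) and the function name `find_shadowing` are assumptions, and substring matching will produce false positives for short tool names.

```python
from collections import defaultdict
from typing import Dict, List


def find_shadowing(servers: Dict[str, List[dict]]) -> Dict[str, List[str]]:
    """Flag tool-name collisions and descriptions that reference foreign tools."""
    owners = defaultdict(set)
    for server, tools in servers.items():
        for tool in tools:
            owners[tool["name"]].add(server)

    report: Dict[str, List[str]] = {"name_collisions": [], "cross_references": []}
    for name, who in sorted(owners.items()):
        if len(who) > 1:
            report["name_collisions"].append(f"{name}: {', '.join(sorted(who))}")

    for server, tools in sorted(servers.items()):
        for tool in tools:
            text = tool.get("description", "").lower()
            for other_name, who in sorted(owners.items()):
                # A description that talks about another server's tool is suspect.
                if server not in who and other_name.lower() in text:
                    report["cross_references"].append(
                        f"{server}/{tool['name']} mentions {other_name}"
                    )
    return report
```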
OWASP Server Security Best Practices for MCP Implementation

MCP implementations need a fundamental shift from network-centric defenses to identity-aware control planes. The OWASP Gen AI Security Project mandates OAuth 2.1 with Proof Key for Code Exchange (PKCE) for all authorization flows, which prevents authorization code interception attacks through cryptographic challenges. MCP servers must also validate that tokens were issued for their own use and implement RFC 8707 Resource Indicators to bind tokens to their intended audience. This audience binding prevents confused deputy attacks.
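The cryptographic challenge behind PKCE is small enough to show directly. This sketch follows the `S256` method from RFC 7636: the client keeps a random verifier secret, sends only its SHA-256 hash as the challenge, and the authorization server later checks that the two match. The function names are illustrative, and a production system would normally rely on a maintained OAuth library rather than hand-rolled helpers.

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes: int = 32) -> str:
    """Random base64url verifier; 32 bytes gives the 43-character minimum."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def code_challenge_s256(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_pkce(verifier: str, challenge: str) -> bool:
    """Server-side check that the presented verifier matches the stored challenge."""
    return secrets.compare_digest(code_challenge_s256(verifier), challenge)
```

An attacker who intercepts the authorization code still lacks the verifier, so the token request fails the `verify_pkce` check.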
Token management follows strict least-privilege principles. Authorization servers should issue short-lived, minimally scoped tokens so that each server receives only the permissions its intended functionality requires. Brief expiration windows limit the impact of compromised credentials. Session-based authentication is forbidden: every request that requires authorization must be verified against a valid, user-linked token.
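These token rules translate into a handful of checks on the claims of each access token. The sketch below assumes the token's signature has already been verified by a JWT library and that the decoded claims are available as a dictionary using the standard `aud`, `exp`, `iat`, `scope` and `sub` fields. The 15-minute lifetime ceiling, the scope names and the server URL in the usage example are assumptions chosen for illustration.

```python
import time
from typing import Optional


class TokenRejected(Exception):
    """Raised when a token fails an authorization check."""


def check_claims(
    claims: dict,
    *,
    expected_audience: str,
    required_scope: str,
    max_lifetime_s: int = 900,  # assumed ceiling, not a value from any spec
    now: Optional[float] = None,
) -> None:
    """Validate already-signature-verified claims; raise TokenRejected on failure."""
    now = time.time() if now is None else now

    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if expected_audience not in audiences:
        raise TokenRejected("token was not issued for this server")

    exp, iat = claims.get("exp"), claims.get("iat")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise TokenRejected("token is expired or has no exp claim")
    if not isinstance(iat, (int, float)) or exp - iat > max_lifetime_s:
        raise TokenRejected("token lifetime exceeds the allowed window")

    if required_scope not in str(claims.get("scope", "")).split():
        raise TokenRejected("token lacks the required scope")

    if not claims.get("sub"):
        raise TokenRejected("token is not linked to a user")
```

A server would call `check_claims(claims, expected_audience="https://mcp.example.com", required_scope="files:read")` on every request instead of trusting a session established earlier.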
Input validation prevents manipulation attacks. All parameters should be validated against schemas for allowed characters, length and format. Secure APIs or parameterized commands should be used instead of shell string concatenation to prevent command injection. Tool descriptions and outputs must be sanitized before they are fed back into LLM context, stripping markup that agents might interpret as instructions.
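The three controls in this paragraph can be sketched in a few lines each. In the example below, the filename pattern, the length limits and the function names are assumptions, and the child process is a stand-in: it simply echoes its argument through the Python interpreter so the sketch runs anywhere. The point is that the value travels as a single argument vector entry and is never interpolated into a shell string.

```python
import re
import subprocess
import sys

# Assumed policy: 1 to 64 characters, no path separators, no leading dot.
FILENAME_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,63}")
TAG_RE = re.compile(r"<[^<>]{1,200}>")


def validate_filename(value: object) -> str:
    """Reject anything outside the allowed characters, length and format."""
    if not isinstance(value, str) or not FILENAME_RE.fullmatch(value):
        raise ValueError("filename has disallowed characters, length, or format")
    return value


def echo_argument(filename: str) -> str:
    """Pass the validated value as one argv entry; no shell is involved."""
    argv = [
        sys.executable,
        "-c",
        "import sys; print(sys.argv[1])",
        validate_filename(filename),
    ]
    done = subprocess.run(argv, capture_output=True, text=True, check=True, timeout=10)
    return done.stdout.strip()


def sanitize_for_context(text: str, limit: int = 2000) -> str:
    """Strip tag-like markup and cap length before text reaches the model."""
    return TAG_RE.sub("", text)[:limit]
```

Stripping markup removes the emphasis an attacker adds to smuggled instructions, but the remaining text should still be treated as untrusted data.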
Sandboxing confines components so that a successful exploit has limited effect. Servers can be deployed in containers with filesystem isolation, network restrictions and minimal privileges. Detailed logging of tool invocations, parameters and results should be enabled and integrated with SIEM systems to identify suspicious patterns.
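For the logging half of this advice, one structured record per tool invocation is usually the easiest shape for a SIEM to ingest. This is a minimal sketch using Python's standard `logging` module. The logger name `mcp.audit`, the record fields and the `audited` wrapper are assumptions, and forwarding the records to a SIEM is left to whatever handler the deployment attaches.

```python
import json
import logging
import time
from typing import Any, Callable

audit = logging.getLogger("mcp.audit")  # hypothetical logger name
audit.setLevel(logging.INFO)


def audited(tool_name: str, func: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a tool so each call emits one JSON audit record."""

    def wrapper(**params: Any) -> Any:
        record = {"ts": time.time(), "tool": tool_name, "params": params}
        try:
            result = func(**params)
            record.update(status="ok", result=repr(result)[:200])
            return result
        except Exception as exc:
            record.update(status="error", error=f"{type(exc).__name__}: {exc}")
            raise
        finally:
            # Runs on success and failure, so no invocation goes unrecorded.
            audit.info(json.dumps(record, default=str, sort_keys=True))

    return wrapper
```

Parameters may contain secrets or personal data, so a real deployment would likely redact sensitive fields before writing the record.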
Conclusion
Securing third-party MCP servers demands immediate action. Tool poisoning, memory attacks, and authentication gaps pose real threats, and vulnerabilities left unaddressed can escalate into costly breaches. We’ve explored OWASP’s framework covering OAuth 2.1 implementation, input validation, and sandboxing practices critical for protecting AI infrastructure. Organizations must move from traditional API security models to identity-aware controls.
Key Takeaways
The OWASP Gen AI Security Project provides essential guidance for organizations deploying third-party MCP servers in an increasingly vulnerable AI landscape.
• MCP servers face unique vulnerabilities: Tool poisoning attacks embed malicious instructions in descriptions, while memory poisoning targets knowledge bases with 80%+ success rates.
• Authentication is critically lacking: Research shows zero authentication across nearly 4,000 internet-exposed MCP servers, creating massive security gaps.
• Implement OAuth 2.1 with PKCE: Use short-lived, minimally scoped tokens and validate that tokens were issued specifically for your server to prevent confused deputy attacks.
• Deploy comprehensive sandboxing: Isolate servers in containers with filesystem restrictions, network limitations, and minimal privileges to contain potential breaches.
• Validate all inputs rigorously: Sanitize tool descriptions, parameters, and outputs before feeding back into LLM context to prevent injection attacks.
The shift from traditional API security to identity-aware control planes is essential for protecting AI infrastructure against these emerging threats, which could result in GDPR penalties of up to €20 million or 4% of global annual revenue.
FAQs
Q1. What makes MCP servers more vulnerable than traditional APIs? MCP servers operate with delegated user permissions and dynamic tool-based architectures that allow chained tool calls. Unlike traditional APIs, tool descriptions function as executable code loaded directly into model reasoning rather than static documentation, creating unique injection vectors. This architecture, combined with the lack of authentication found in nearly 4,000 internet-exposed servers, makes MCP implementations significantly more vulnerable to attacks like tool poisoning and prompt injection.
Q2. What are Tool Poisoning Attacks and how do they work? Tool Poisoning Attacks (TPA) involve embedding malicious instructions within tool descriptions that are invisible to users but processed by large language models. These attacks exploit every schema element including function names, parameter defaults, required fields, and types as potential injection points. Advanced versions can manipulate tool outputs like error messages or follow-up prompts generated during execution, allowing attackers to control AI behavior without user awareness.
Q3. Why is OAuth 2.1 with PKCE recommended for MCP security? OAuth 2.1 with Proof Key for Code Exchange (PKCE) prevents authorization code interception attacks through cryptographic challenges. It ensures tokens are issued specifically for intended servers and validates them against the correct audience, preventing confused deputy attacks where servers execute actions with broader privileges than users possess. This approach implements strict least-privilege principles with short-lived, minimally scoped tokens.
Q4. What is a “Rug Pull attack” in the context of MCP servers? MCP Rug Pull attacks occur when servers swap tool descriptions after initial approval. Attackers serve benign versions during the onboarding process to gain trust, then deliver malicious ones later during actual operation. Without version locking or cryptographic signatures, these updates happen silently, allowing previously vetted servers to become compromised without detection.
Q5. Can malicious content be hidden in files processed by MCP servers? Yes, malicious instructions can be embedded in various file types processed by MCP servers. This includes README files with injection instructions, PDFs with malicious metadata, or Markdown files containing invisible unicode characters. Since servers pass along file contents to the agent’s context without inspection, these hidden payloads can influence AI behavior even when the server itself is functioning correctly.
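As a rough illustration of inspecting file contents before they reach the agent, the sketch below flags invisible Unicode format characters (general category `Cf`, which covers zero-width spaces, bidirectional overrides and tag characters). The function names are hypothetical. Note that stripping every `Cf` character also breaks legitimate uses such as emoji joiner sequences, so flagging for review may be preferable to silent removal.

```python
import unicodedata
from typing import List, Tuple


def find_invisible(text: str) -> List[Tuple[int, str]]:
    """Return (index, code point) for each invisible format character."""
    hits = []
    for index, ch in enumerate(text):
        if unicodedata.category(ch) == "Cf":
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits


def strip_invisible(text: str) -> str:
    """Remove all format characters; may alter legitimate text."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
```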