# We Rated 5,928 MCP Servers. Zero Scored an A.
> TL;DR: We rated 5,928 MCP servers on a 10-point security scale. Zero scored an A. Average: 4.81/10. 1,287 servers (22%) have Grade D or F, meaning known security issues. These numbers are calibrated — we manually audited our F-grades and corrected a 14% false positive rate.
---
## Grade Distribution
| Grade | Score Range | Count | % |
|---|---|---|---|
| A | 9.0 – 10.0 | 0 | 0% |
| B | 7.0 – 8.9 | 268 | 4.5% |
| C | 5.0 – 6.9 | 4,373 | 73.8% |
| D | 3.0 – 4.9 | 1,189 | 20.1% |
| F | 0.0 – 2.9 | 98 | 1.7% |
Average score: 4.81/10, just below the 5.0 cutoff for Grade C.
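The distribution figures above can be reproduced from the raw counts; a quick sanity check:

```python
# Grade counts from the table above (5,928 servers total)
counts = {"A": 0, "B": 268, "C": 4373, "D": 1189, "F": 98}
total = sum(counts.values())
assert total == 5928

# Per-grade share, and the D+F "known security issues" slice
shares = {g: round(100 * n / total, 1) for g, n in counts.items()}
d_or_f = counts["D"] + counts["F"]
print(shares)                               # matches the table's % column
print(d_or_f, f"{100 * d_or_f / total:.0f}%")  # 1287 servers, 22%
```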
## Key Findings

### 1. Zero A-Grade Servers

Not one of 5,928 servers scores 9.0 or above. The best reach high B (8.5 – 8.9). Even the most security-conscious MCP servers have room for improvement.
### 2. 74% Cluster in Grade C
The vast majority of servers are "works but nobody secured it" — functional code with minimal security attention. This is the long tail of the MCP ecosystem.
### 3. 22% Have Known Issues
1,287 servers (D + F grade) have below-average security or critical vulnerabilities. That's more than 1 in 5 MCP servers your AI agent might connect to.
### 4. Description Quality Is the Bottleneck
The three scoring dimensions reveal where the ecosystem struggles:
| Dimension | Weight | Average | Problem |
|---|---|---|---|
| Description Quality | 38% | 3.13/10 | 98% of tools lack "when to use" guidance |
| Security Analysis | 34% | 6.21/10 | Injection, path traversal, hardcoded secrets |
| Metadata Health | 28% | 5.89/10 | Missing licenses, no tests, abandoned repos |
Description quality is the biggest drag on scores. 98% of tools don't tell the AI agent *when* to use them. This isn't just a quality issue — it's a security surface. When an agent can't distinguish between tools, a malicious tool can insert itself with a better-sounding description.
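A minimal sketch of how three weighted dimensions like these could combine into a composite score. This is illustrative only: SpiderShield's exact formula, including hard constraints such as the critical-vulnerability F cap, may differ.

```python
# Dimension weights from the table above (they sum to 1.0)
WEIGHTS = {"description": 0.38, "security": 0.34, "metadata": 0.28}

def composite_score(description: float, security: float, metadata: float) -> float:
    """Weighted average of the three dimension scores (each on a 0-10 scale)."""
    dims = {"description": description, "security": security, "metadata": metadata}
    return round(sum(WEIGHTS[k] * v for k, v in dims.items()), 2)

# A server with ecosystem-average dimension scores:
print(composite_score(3.13, 6.21, 5.89))
```

Because the weights sum to 1.0, a server's composite is always bounded by its best and worst dimension, which is why a 3.13 average on the heaviest-weighted dimension drags the whole ecosystem down.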
### 5. Descriptions Are Fixable
The `spidershield rewrite` command auto-fixes tool descriptions using structured templates. In our tests, it lifts description scores from 3.1 to 7.5+ (a 2.4x improvement) with zero code changes:

`npx spidershield rewrite ./your-server`

This alone can move a server from D to C grade.
## How We Ensure Accuracy
These numbers are calibrated. We manually audited every F-grade server by reading its source code and found a 14% false positive rate: 16 of the 114 servers originally graded F were upgraded to B, C, or D after review. The remaining 98 F-grade ratings were confirmed accurate.
Full audit details: spiderrating.com/blog/we-scanned-5928-mcp-servers-then-audited-the-worst
## What You Can Do
If you maintain an MCP server:

1. Scan it: `npx spidershield scan ./your-server`
2. Fix descriptions: `npx spidershield rewrite ./your-server`
3. Check your rating: spiderrating.com/servers/{owner}/{repo}
If you use MCP servers:

1. Check ratings before connecting: spiderrating.com
2. Add runtime protection: the SpiderShield PreToolUse hook blocks F-grade servers
3. Star the scanner if useful: github.com/teehooai/spidershield
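The blocking logic of such a hook is easy to sketch. The version below is hypothetical, not SpiderShield's actual hook: it assumes the agent runtime names MCP tools `mcp__<server>__<tool>` and that letter grades are available in a local dict (a real hook would load them from a ratings file or service).

```python
BLOCKED_GRADES = {"F"}  # assumption: only F-grade servers are blocked

def should_block(tool_name: str, ratings: dict) -> bool:
    """Block calls to tools exposed by MCP servers with a blocked grade.

    Assumes the runtime names MCP tools "mcp__<server>__<tool>"; non-MCP
    tools and unrated servers pass through.
    """
    parts = tool_name.split("__")
    if len(parts) >= 3 and parts[0] == "mcp":
        return ratings.get(parts[1]) in BLOCKED_GRADES
    return False

# Illustrative ratings; server names here are made up.
ratings = {"sketchy-server": "F", "solid-server": "B"}
print(should_block("mcp__sketchy-server__run_query", ratings))  # True
print(should_block("mcp__solid-server__run_query", ratings))    # False
print(should_block("Bash", ratings))                            # False (not an MCP tool)
```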
## Methodology
- Scanner: SpiderShield v0.3.2, open source (MIT)
- 46+ security rules (OWASP-aligned)
- 3-layer scoring: Description (38%) + Security (34%) + Metadata (28%)
- Hard constraints: critical vulnerability → F cap
- Calibrated: 14% FP rate, 16 corrections applied
- Deterministic: same input always produces same output
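Putting the score bands and the hard constraint together, the grade mapping can be sketched as follows (thresholds taken from the grade-distribution table; illustrative, not SpiderShield's actual implementation):

```python
def letter_grade(score: float, has_critical_vuln: bool = False) -> str:
    """Map a 0-10 composite score to a letter grade.

    Hard constraint: any critical vulnerability caps the grade at F,
    regardless of the weighted score.
    """
    if has_critical_vuln:
        return "F"
    if score >= 9.0:
        return "A"
    if score >= 7.0:
        return "B"
    if score >= 5.0:
        return "C"
    if score >= 3.0:
        return "D"
    return "F"

print(letter_grade(8.7))                           # "B"
print(letter_grade(8.7, has_critical_vuln=True))   # "F": the cap overrides the score
```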
---
*Browse all 5,928 ratings at spiderrating.com. Scan your server: `npx spidershield scan`. Source: github.com/teehooai/spidershield (MIT).*