# Most MCP servers don't need to exist. Your case might be an exception.

> An MCP server is a product interface for the age of agents, not an AI feature itself. This post presents a framework for deciding between direct API calls, CLI, Skills, and MCP, detailing the architectural signals, the hidden costs, and the failure modes you should know about before embarking on an MCP quest.

- Date: 2026-06-30T00:00:00.000Z
- Authors: Evgeniy Valyaev, Travis Turner
- Categories: AI, DX
- URL: https://evilmartians.com/chronicles/most-mcp-servers-dont-need-to-exist-your-case-might-be-an-exception

---

Most MCP servers are not agent architecture, but premature product interfaces. The first hype wave produced too many half-baked implementations because a lot of products simply don't need an MCP server. Instead, sometimes direct API calls are enough. For some, a CLI does the job. For others, a skill is a cheap experiment. This post presents a framework for deciding if an MCP is really worth it, and, if so, some considerations to keep in mind. 

Right out of the gate, consider this: [Bloomberry's analysis of 1,412 MCP servers](https://bloomberry.com/blog/we-analyzed-1400-mcp-servers-heres-what-we-learned/) found roughly half the companies shipping one _have no public-facing API at all_. Shipping an MCP without an API means you haven't yet defined stable, intent-level operations for external consumers.

TL;DR [checklist here](#or-just-answer-these-eight-questions).

The [MCP (Model Context Protocol)](https://modelcontextprotocol.io/) is an open standard introduced by Anthropic in late 2024. It gives AI clients like Claude, ChatGPT, Cursor, and VS Code a uniform protocol to discover tools, fetch resources, call operations, and connect those operations to product-owned authorization and permission checks.

**The shorthand worth remembering**: An API is your software contract. A CLI is your execution contract. The MCP is your agent access contract. Skills are how you teach an agent to use any of them well. So, an MCP server is a third interface to your product. Without a coherent first and second, the third has nothing to stand on.

The teams that successfully ship an MCP server treat it like any other piece of the product. They keep it narrow and opinionated, and they do the work to keep it running. These teams decide it's worth building only when enough people, in enough different places, actually need it.

The rest of this post is how to figure out if that's you, or if you're better off going another direction.

---

*Evil Martians help devtools startups design for agent-first adoption: LLM-ready docs and MCP servers, agent auth and security. Let's talk about what we can build together.* [Contact Evil Martians](https://evilmartians.com/contact-us)

---

## Fix your CLI and API before you build an MCP

When teams think about getting their product agent-ready, the CLI question usually comes up. "_Should we ship a CLI?_" **Wrong question**. Coding agents already have one. Instead: "_is the CLI we're already shipping disciplined enough that an agent can drive it?_" 

Here, "disciplined" means stable, non-interactive commands shaped around product intent, structured output, and no human-only patterns (like progress bars mixed into stdout). These properties help humans, CI, and agents at the same time.
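As a minimal sketch of what that discipline can look like, the snippet below keeps machine-readable output on stdout and sends progress chatter to stderr. The `acme` program name, the `deploy-status` command, and the returned fields are all hypothetical, chosen only for illustration.

```python
import argparse
import json
import sys


def build_parser() -> argparse.ArgumentParser:
    # "acme" and "deploy-status" are hypothetical names, not a real CLI.
    parser = argparse.ArgumentParser(prog="acme")
    sub = parser.add_subparsers(dest="command", required=True)
    status = sub.add_parser("deploy-status", help="Report the state of a deployment")
    status.add_argument("--service", required=True)
    status.add_argument("--format", choices=["json", "text"], default="text")
    return parser


def deploy_status(service: str) -> dict:
    # Stand-in for a real lookup; the fields are illustrative only.
    return {"service": service, "state": "healthy", "version": "1.4.2"}


def main(argv: list[str], stdout=None, stderr=None) -> int:
    stdout = stdout or sys.stdout
    stderr = stderr or sys.stderr
    args = build_parser().parse_args(argv)
    # Progress and diagnostics go to stderr, so stdout stays parseable.
    print(f"checking {args.service}...", file=stderr)
    result = deploy_status(args.service)
    if args.format == "json":
        stdout.write(json.dumps(result) + "\n")
    else:
        stdout.write(f"{result['service']}: {result['state']} ({result['version']})\n")
    return 0
```

Calling `main(["deploy-status", "--service", "api", "--format", "json"])` never prompts for input, and whatever reads stdout (a human, CI, or an agent) gets a single JSON object.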

The Playwright team made the trade-off explicit:

- [`playwright-cli`](https://playwright.dev/docs/next/getting-started-cli) for token efficiency: community benchmarks report roughly 4× lower token use, and the official docs frame the CLI as the token-efficient option for coding agents.
- Playwright MCP for specialized loops with persistent browser state.

Same engine. Different transports for different agent shapes. The boundary itself is sharp: when the user is no longer in a shell, the CLI is no longer the answer.

When teams want to make their product agent-ready, the next instinct is "_we already have an API, can we wire the agent to that?_" 

**Also the wrong question**. But not for the reason you'd guess! The problem isn't too many clients wired to too many tools; it's that your API is shaped around how data is *stored*, not around what someone is trying to *do*.

Internal APIs are full of bookkeeping: IDs, pagination cursors, status flags, lifecycle states, nested objects describing nested objects. That shape exists because databases need it. Hand it to an agent and the model burns its limited attention figuring out your data model before it can do anything useful.

> So, you aren't exposing a capability. You're actually just exposing your database with better branding.

Switching to MCP doesn't fix any underlying API problem. If the tools are still shaped like database rows, MCP just wraps the same problem in a new protocol. Instead, the fix has to happen one step earlier: **design operations around what people are trying to do**, then decide whether an MCP is the right move.
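To make the difference concrete, here is a rough sketch of one intent-level operation built on top of a storage-shaped list endpoint. The invoice fields, the `fetch_page` callback, and the cursor format are hypothetical; the point is only that cursors, status flags, and row filtering stay inside the operation instead of landing in the model's context.

```python
from typing import Callable, Optional


def summarize_open_invoices(
    fetch_page: Callable[[Optional[str]], dict],
    customer_id: str,
) -> dict:
    """Answer "what does this customer still owe?" in a single call.

    `fetch_page(cursor)` is a hypothetical storage-shaped endpoint that
    returns {"rows": [...], "next_cursor": str | None}.
    """
    cursor: Optional[str] = None
    total_cents = 0
    overdue: list[str] = []
    while True:
        page = fetch_page(cursor)
        for row in page["rows"]:
            # Bookkeeping (tenant filter, lifecycle state) is handled here,
            # so the caller never has to reason about it.
            if row["customer_id"] != customer_id or row["status"] != "open":
                continue
            total_cents += row["amount_cents"]
            if row["overdue"]:
                overdue.append(row["number"])
        cursor = page.get("next_cursor")
        if cursor is None:
            break
    return {"open_total_cents": total_cents, "overdue_invoices": sorted(overdue)}
```

The same function works as a direct function-calling tool, a CLI subcommand, or an MCP tool. The transport decision comes after the operation is shaped.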

## Try a skill before an MCP

An MCP handles _access_. Skills handle _operating discipline_: that means instructions, examples, reference files, templates, or sometimes helper scripts that teach an agent how to use capabilities exposed elsewhere. Reach for skills *before* MCP when:

- **Capability already exists, but discipline does not.** Say the agent has the API/CLI it needs but uses it badly: wrong call order, missed edge cases, lost context. A skill encodes a playbook without a server.
- **You want to test demand cheaply.** If usage is real, the case for MCP gets sharper.

Go for skills *alongside* MCP when you ship the server and want to teach the model how to use it well. Anthropic's skills directory features partner-built skills from companies like Notion and Canva. Sentry pairs its MCP with a Claude Code plugin that ships shared skills. 

A skill can teach an agent how to use APIs, CLIs, local files, or MCP servers, and it can include scripts for deterministic helper work. But it does not provide tenant scopes, remote access, authorization, audit, or cross-process governance. If those primitives belong in the product interface, you need an external boundary (such as MCP), not just a skill.

We'll see what that boundary looks like in the rest of this post.

## The big thing that justifies building an MCP server

There is one big sign that you should look for when deciding to build an MCP server: **AI clients you do not control consume the same operations.** 

For example, say your users live in clients like Claude, Cursor, ChatGPT, Copilot, or customer-built agents, and they want your product available there.

If that isn't true, an MCP is mostly extra protocol on top of what function calls already give you: a single in-app agent reaching a single backend can use the OpenAI Responses API or Anthropic's tool use in the Messages API to call the same operations, and with less ceremony.

With cross-client demand, the cost of operating one governed contract can beat N drifting client-specific integrations. This is why Linear, Sentry, Resend, Cloudflare, and Stripe shipped official MCP surfaces: some remote and centrally hosted, others distributed as local or self-run servers, but all designed to make product capabilities available to AI clients.

[Linear's MCP](https://linear.app/docs/mcp) demonstrates a supported product interface: centrally hosted, authenticated, with a small set of intent-level tools for issues, projects, and comments. This is a third product interface, alongside the API and UI. Linear has an API, but users live in Claude, Cursor, and ChatGPT, and thus, it warrants an MCP server.

## MCP amplifiers

Each of the items below can be solved at the application layer with function calls. None of them justifies MCP on its own. But once the load-bearing signal above (cross-client demand) is real, the following points sharpen the case and shape what the server should contain.

1. **Your service runs in a different process than the agent**: third-party SaaS, another team's service, a remote system the agent reaches over the network. For AI-client consumers, an MCP is increasingly the shared protocol they speak alongside native connectors, function calling, and IDE-specific extensions.

2. **Deterministic processing that turns raw data or low-level calls into intent-level operations**: aggregation, filtering, normalization, composition. `find_flaky_tests` instead of five CI calls and manual stitching. `get_customer_context` instead of asking the model to join users, workspaces, billing state, and incidents itself. This is good tool design at any layer; MCP is just where you ship it when the layer is cross-client.

3. **A codebase, documentation set, or dataset that will not fit in context**. The agent searches, fetches, and filters instead of being handed everything eagerly. The pull pattern itself is generic to function calling; the MCP makes the same pull surface discoverable across clients.

4. **Write-capable automation that needs governance.** Once the agent can change state, the operation needs scopes, approvals, and audit. Backend still owns enforcement. MCP can help surface user input or confirmation through clients that support [elicitation](https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation), but support is uneven, so design those flows for the weakest target client.
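As an illustration of the second amplifier, here is a small sketch of what a `find_flaky_tests` operation might do behind the tool boundary. The run format and the definition of "flaky" (a test that both passed and failed on the same commit) are assumptions made for this example, not a description of any particular CI product.

```python
from collections import defaultdict


def find_flaky_tests(runs: list[dict], min_commits: int = 1) -> list[dict]:
    """Deterministically reduce raw CI runs to a short list of suspects.

    Each run is assumed to look like:
    {"commit": "a1", "results": {"test_name": "pass" | "fail"}}
    """
    # test name -> commit -> set of outcomes seen on that commit
    outcomes: dict[str, dict[str, set]] = defaultdict(lambda: defaultdict(set))
    for run in runs:
        for test, status in run["results"].items():
            outcomes[test][run["commit"]].add(status)

    flaky = []
    for test, by_commit in outcomes.items():
        # Same code, different result: that is the flakiness signal used here.
        mixed = sorted(c for c, seen in by_commit.items() if {"pass", "fail"} <= seen)
        if len(mixed) >= min_commits:
            flaky.append({"test": test, "commits": mixed})
    return sorted(flaky, key=lambda item: item["test"])
```

The agent receives a handful of test names instead of every run log, and the stitching logic is testable without a model in the loop.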

## Know what should (and should not) live in the MCP server

Gathering context belongs in MCP; deciding user goals does not. Preparing a rollback preview belongs in MCP; choosing whether the rollback is the right business move does not.

> If you can't articulate what belongs inside the server, you're not ready for one. The rule is short: **encapsulate operations, not workflows**. 

- **Allowed:** deterministic composition, validation, projection over data, permission checks, previews.
- **Dangerous:** autonomous goal selection, irreversible business decisions, hidden multi-step planning without an inspectable preview. 
- **Grey zone:** summarization, ranking, issue creation (useful only with explicit constraints, deterministic inputs, and a preview the model has to surface before it commits).

That rule has one clean example and an exception worth knowing.

**The winning example**: Sentry's API spans issues, events, releases, performance data, alerts, billing, and org management. Its MCP exposes around a dozen tools, all centered on one workflow: a coding agent debugging a production issue. Administration, team management, and billing stay in the API. The server scaled past 50M monthly requests by solving one workflow well. The target is the fewest operations that preserve the constraints, not the most endpoints wrapped.

**The exception is surfaces too big for intent tools.** Some products can't be reduced to a handful of intent tools. For example, Cloudflare's API spans ~2,500 endpoints; one tool per endpoint would burn over a million tokens of schema before the first message. Its MCP ships two tools instead (`search` and `execute`) and lets the agent write code against a typed spec in a sandbox (the *codemode* pattern, ~1,000 tokens regardless of surface size). It assumes a programmable platform and a model that writes code well, so it isn't for everyone—but for thousand-endpoint surfaces it beats curating intent tools that can't cover the ground.

MCP does not replace your backend, evals, observability, or permission design. The agent still decides what to do next. Hide complexity, not responsibility.

## Hidden costs worth considering

**Context budget.** [Anthropic reported](https://www.anthropic.com/engineering/advanced-tool-use) a five-server setup with 58 tools consuming roughly 55K tokens before the first message, and internal cases reaching 134K tokens before optimization. Tool catalogs are part of the prompt: every name, description, and schema field eats context. Tool Search and deferred loading mitigate this where the host or model platform supports them, but they do not fix poor tool design upstream. MCP server design is prompt design. Bad interface design becomes latency, cost, and lower tool selection accuracy.
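The figures above work out to just under a thousand tokens per tool on average. A back-of-the-envelope estimator like the sketch below can flag catalog growth before it ships. The four-characters-per-token ratio is a rough rule of thumb assumed for this example; a real measurement should use the target model's tokenizer.

```python
import json

# Assumption: roughly 4 characters per token. Real tokenizers differ.
CHARS_PER_TOKEN = 4


def estimate_catalog_tokens(tools: list[dict]) -> int:
    """Rough token cost of a tool catalog, based on its serialized size."""
    serialized = (json.dumps(tool, separators=(",", ":")) for tool in tools)
    return sum(len(text) for text in serialized) // CHARS_PER_TOKEN


def average_tokens_per_tool(total_tokens: int, tool_count: int) -> float:
    """Average schema cost per tool, e.g. for the 55K / 58-tool case above."""
    return total_tokens / tool_count
```

`average_tokens_per_tool(55_000, 58)` comes to about 948, which is a useful reminder that every extra description sentence and schema field is multiplied across the whole catalog.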

**Operational cost.** An MCP server is software you operate: deployment, uptime, versioning, backward compatibility, observability, incident response. The fact that [Portkey](https://portkey.ai/docs/product/mcp-gateway) exists as an MCP gateway product (auth, access control, audit, rate limiting between clients and servers) is the market saying the same thing as the architecture: production MCP needs governance infrastructure that the protocol itself does not bundle.

**Security debt.** [Astrix's audit of 5,205 MCP repositories](https://astrix.security/learn/blog/state-of-mcp-server-security-2025/) found 88% require credentials, 53% rely on static API keys or PATs, and only 8.5% implement OAuth. The threats are [not just generic API hygiene](https://cheatsheetseries.owasp.org/cheatsheets/MCP_Security_Cheat_Sheet.html); MCP combines API security with agent-specific risks: [tool poisoning](https://invariantlabs.ai/blog/mcp-security-notification-tool-poisoning-attacks) in shared registries, prompt injection through tool output, confused deputy across tenants, supply-chain risk for stdio servers shipped through npm or pip. Treating MCP as "just another adapter" is what turns a clean boundary into an incident.

**Approval laundering.** A class of failure worth naming: a harmless preview is followed by a destructive call the user already nodded through. Designing the destructive step to require its own consent (not inherited from the preview) cannot be left to the client alone; the server has to enforce the boundary. The breadth of [Resend's MCP](https://resend.com/docs/mcp-server#what-can-resend’s-mcp-server-do) (sending emails, managing contacts, verifying domains, reading inbound mail) is exactly why scopes, approvals, audit logs, and revocation cannot be optional. Once an agent can change customer-facing communication infrastructure, the bar rises and stays risen.
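One way to sketch server-side enforcement is a single-use consent token bound to the exact destructive call. Everything here is hypothetical (the `ConsentGate` class, its method names, and the idea that `approve` is only reachable from a trusted confirmation channel); it illustrates the boundary and is not a complete security design.

```python
import hashlib
import json
import secrets


class ConsentGate:
    """Hypothetical gate: previews never grant permission to execute."""

    def __init__(self) -> None:
        # token -> fingerprint of the one call it approves
        self._pending: dict[str, str] = {}

    @staticmethod
    def _fingerprint(operation: str, params: dict) -> str:
        payload = json.dumps({"op": operation, "params": params}, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def preview(self, operation: str, params: dict) -> dict:
        # Read-only: returns a description, and deliberately no token.
        return {"operation": operation, "params": params, "requires_confirmation": True}

    def approve(self, operation: str, params: dict) -> str:
        # Assumed to be called only after explicit user consent to this exact call.
        token = secrets.token_urlsafe(16)
        self._pending[token] = self._fingerprint(operation, params)
        return token

    def execute(self, operation: str, params: dict, token: str) -> str:
        approved = self._pending.pop(token, None)  # single use, even on mismatch
        if approved is None or approved != self._fingerprint(operation, params):
            raise PermissionError("no valid consent for this exact operation")
        return f"executed {operation}"
```

Because the token is tied to the operation name and parameters, approving "delete contact 42" cannot be replayed as "delete all contacts", and a preview on its own authorizes nothing.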

**Client fragmentation.** "One server, many clients" is true with a caveat that surprises teams in production. The MCP spec includes tools, resources, and prompts; clients implement different subsets. [GitHub Copilot's cloud agent](https://docs.github.com/en/copilot/concepts/agents/cloud-agent/mcp-and-cloud-agent) currently exposes MCP tools but not resources or prompts, does not yet accept OAuth flows for remote servers, and uses available tools autonomously without asking for approval before each call. [Cursor](https://cursor.com/docs/mcp#protocol-and-extension-support) supports tools, prompts, resources, roots, elicitation, and apps. [Gemini CLI](https://google-gemini.github.io/gemini-cli/docs/tools/mcp-server.html#mcp-prompts-as-slash-commands) treats prompts as slash commands. The protocol is one. The clients are not. Design for the weakest target, or the server stops working in production for half its users.
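"Design for the weakest target" can be checked mechanically. The sketch below encodes only the two feature sets spelled out in the paragraph above, as a point-in-time snapshot; the client keys and function names are made up for the example, and the table would need to be re-verified against each client's current docs.

```python
# Snapshot of the support described above; keys are hypothetical labels.
CLIENT_FEATURES: dict[str, set[str]] = {
    "copilot-cloud-agent": {"tools"},
    "cursor": {"tools", "prompts", "resources", "roots", "elicitation", "apps"},
}


def weakest_common_subset(targets: list[str]) -> set[str]:
    """Features every target client supports."""
    if not targets:
        raise ValueError("at least one target client is required")
    return set.intersection(*(CLIENT_FEATURES[name] for name in targets))


def unsupported(required: set[str], targets: list[str]) -> dict[str, list[str]]:
    """Per client, the required features it is missing."""
    gaps = {}
    for name in targets:
        missing = required - CLIENT_FEATURES[name]
        if missing:
            gaps[name] = sorted(missing)
    return gaps
```

With both clients as targets, the common subset is just `{"tools"}`, so a server whose write flow depends on elicitation would need a tools-only fallback for one of them.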

## Or, just answer these eight questions

The questions below filter to a yes or a no for a quick evaluation.

1. **Will only one in-app agent use this, under your control?** → Direct API tools. No MCP.
2. **Does the work fit a shell pipeline with typed output?** → Make the CLI agent-operable first.
3. **Is the missing piece capability, or discipline of use?** → If discipline, ship a skill against the CLI you already have. If capability, continue.
4. **Does the agent need structured discovery, runtime context, or reuse across clients?** → If none, MCP is premature.
5. **Do your target clients consume the MCP features you need — tools, resources, prompts, OAuth, elicitation?** → If your minimum-viable subset does not fit the weakest target client, stop.
6. **Can you name 3–5 stable, intent-level operations — or, for huge programmable surfaces, a search/execute pattern?** → If neither, you are not ready.
7. **Is the operation write-capable, cross-tenant, or otherwise high-risk?** → Governance, approvals, and audit must exist from the prototype, not after.
8. **Can you ship versioned changes without breaking installed clients?** → If versioning and deprecation are unsolved, you are not product-ready.

A yes on 1 or 2 sends you back to a smaller boundary. On 3, if the missing piece is discipline of use, start with a skill; if it is capability or access, continue. A no on 4 means too early. A no on any of 5–8 means not yet.
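For teams that prefer to see the filter as code, the eight questions can be sketched as a short decision function. The field names are invented for this example, and question 7 is modeled as "governance is in place if the operation is high-risk", which is one reading of the checklist rather than a rule stated in it.

```python
from dataclasses import dataclass


@dataclass
class Answers:
    single_in_app_agent: bool              # Q1
    fits_shell_pipeline: bool              # Q2
    missing_piece: str                     # Q3: "capability" or "discipline"
    needs_discovery_context_or_reuse: bool  # Q4
    clients_support_needed_features: bool  # Q5
    has_intent_level_operations: bool      # Q6
    governance_ready_if_high_risk: bool    # Q7 (interpretation, see lead-in)
    versioning_solved: bool                # Q8


def decide(a: Answers) -> str:
    # Q1-Q2: a yes sends you back to a smaller boundary.
    if a.single_in_app_agent:
        return "direct API tools"
    if a.fits_shell_pipeline:
        return "make the CLI agent-operable first"
    # Q3: discipline problems are skill-shaped.
    if a.missing_piece == "discipline":
        return "start with a skill"
    # Q4: without one of these needs, it is too early.
    if not a.needs_discovery_context_or_reuse:
        return "MCP is premature"
    # Q5-Q8: any no means not yet.
    if not all([
        a.clients_support_needed_features,
        a.has_intent_level_operations,
        a.governance_ready_if_high_risk,
        a.versioning_solved,
    ]):
        return "not yet"
    return "build the MCP server"
```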

## Even if you get to "yes", watch for these

The common MCP failure modes are easy to miss if you don't know what to look for:

- **API dumps.** Every endpoint wrapped as a tool, every internal noun exposed, every schema shaped for engineers. This is just the same backend-leakage fail that breaks direct API calls.
- **Tool overload.** A growing tool catalog with overlapping names and schemas, a problem that often becomes visible once a server's tool count gets into the teens. Names like `notification-send-user` versus `notification-send-channel` stop disambiguating; the model picks wrong.
- **No security model.** Auth without least privilege, no per-tool scopes, no approval gates, no audit logs, no prompt-injection defense. A networked interface with default-allow is a security incident waiting for the right prompt.
- **No evals.** Nobody knows whether the tools work, because nobody wrote a test that proves they do.
- **A server with no understanding of its clients.** The team built for the richest possible MCP feature set; the actual users are on a client that supports only a subset.
- **Endpoint-shaped tools.** `getUser`, `updateUser`, `deleteUser` instead of intent-level operations the agent can compose into work. The names give this one away.
- **No versioning story.** Shipping breaking signature changes without namespacing, deprecation windows, or additive-only schema evolution turns the protocol's "compounding layer" promise into a compounding maintenance burden.

All of these end the same way: too many tools, little usage, and the integration burden moved from API to MCP without disappearing.
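Of the failures above, the versioning one is among the easiest to guard mechanically. The sketch below compares two versions of a tool's input schema and reports changes that are not additive. It assumes flat JSON Schema objects with top-level `properties` and `required` keys only; nested schemas, enums, and renamed tools are out of scope for this illustration.

```python
def breaking_changes(old: dict, new: dict) -> list[str]:
    """List non-additive differences between two flat input schemas."""
    problems: list[str] = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})

    for name, spec in old_props.items():
        if name not in new_props:
            # Installed clients may still send this field.
            problems.append(f"removed property: {name}")
        elif spec.get("type") != new_props[name].get("type"):
            problems.append(f"changed type: {name}")

    # A new required field breaks every caller that does not know about it.
    newly_required = set(new.get("required", [])) - set(old.get("required", []))
    for name in sorted(newly_required):
        problems.append(f"new required property: {name}")

    return problems
```

Run in CI against the previously released schema, an empty result means the change is additive under these assumptions. Anything else is a prompt to add a new tool version or open a deprecation window.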

## Use, wrap, or build

Default to the official server when the system is not your product domain.

- **Use** an official server when the abstraction is good enough: Stripe MCP for payments, Playwright MCP for browser automation, or another supported server.
- **Wrap** a third-party system only when you are adding your own product semantics, policy, or workflow on top.
- **Build** only when the abstraction is proprietary: your permissions, your customer context, your deployment model, your incident semantics, your billing state.

A thin wrapper around Stripe is infrastructure. A `get_customer_risk_context` tool that combines Stripe, product usage, support tickets, incidents, and contract terms is product surface.

## The final decision

Mature integrations eventually ship more than one layer: an API as the foundation, a CLI for local environments, skills for usage discipline, and an MCP server for cross-client agent access. Out of these, MCP costs the most to operate, so build it when the architectural signal, the production scenario, and the cost are all on the table, and the answer is _still_ yes.

