Whitepaper · AI-System Connectivity

Connecting AI to the Enterprise: Notes from the Messy Middle

Why the current pain in AI-system integration is familiar, tractable, and worth sitting through.

Audience: Practitioners, architects, executives
Reading time: 22 minutes
Cinchy · PeriMind

The integration question, again

Enterprise technology keeps asking the same question, generation after generation: how does a new layer connect to the systems that already run the business?

Mainframes faced it, then client-server, then the web, then mobile, then cloud. Each generation produced its own protocols, its own pain, and its own discourse of failure. New approaches proliferated, early deployments broke in instructive ways, critics declared the whole paradigm unworkable, vendors entered and consolidated. Eventually the operational discipline caught up to the protocols, and the layer became boring — which is usually how you know it worked.

AI is now in the same conversation. Models need to read from and act on the systems that run the business, and the early answers — the Model Context Protocol most visibly, but also native function calling, code generation against APIs, and an emerging set of gateway and broker patterns — are taking heavy criticism. Security writeups surface vulnerabilities. Practitioners document cost overruns and quality degradation as tool catalogs sprawl. Vendors that began as community projects are entering the space and reshaping it. Some organizations are abandoning the protocol entirely and writing code against APIs directly, which is worse in ways we'll come back to.

The discourse has the recognizable texture of an integration paradigm in its messy middle: a lot of heat, a lot of partial truths, and very little agreement on what the actual problem is. This paper is an attempt to bring the temperature down and offer a map.

Disclosure

We build infrastructure in this space. We have a commercial interest in the field maturing. We do not have a commercial interest in the reader reaching any particular conclusion about which pattern wins, and the analysis that follows is designed to stand on its own.

The scope of the paper is the connectivity question itself — how AI systems connect to enterprise systems — with MCP as the running example because it is the most visible current case, not because it is the subject. The arguments here would survive whatever protocol succeeds it. That generality is deliberate: protocols are less durable than the question they exist to answer, and the question of how non-deterministic systems talk to deterministic ones is going to outlast every implementation currently being argued about.

Three audiences should find this useful. Practitioners building or buying connectivity infrastructure will find vocabulary for what they are actually dealing with. Security architects will find a framework for locating risk in the right layer rather than the loudest one. Executives will find an investment-timing argument grounded in historical pattern rather than vendor narrative.

What this paper is not: a vendor comparison, a security analysis, or a formal maturity model. It is orientation. The terrain is real and currently lacks a map. We are trying to draw one.

What the layer is actually being asked to do

Before diagnosing the pain, it is worth being precise about the job. Some of the difficulty in AI connectivity is genuinely new, and some of it is what enterprise integration has always been. Sorting which is which is where most of the current discourse breaks down.

Every prior integration layer was built for deterministic callers. The thing on the other end of an API was code — written by humans, reviewed, tested against a contract, constrained at runtime to behave the way it was designed to behave. When something went wrong, the cause was a bug, a misconfiguration, or deliberate misuse, and each had a known remediation path. The operational machinery built up around those layers — authentication, rate limiting, schema validation, audit — quietly rested on the assumption that the caller was an artifact of intentional engineering.

AI connectivity inherits all of that and adds five things prior layers never had to handle.

The caller is non-deterministic. A model deciding which tool to call, with which arguments, in which order, based on its interpretation of a natural-language request, does not behave like code. It behaves like a system producing plausible outputs by drawing on patterns from a vast training corpus. Plausible is not the same as correct, safe, or intended. Twenty years of integration tooling assumes the caller can be reasoned about as a static artifact, and that assumption no longer holds.

Intent is expressed in language, not code. Earlier integration layers received structured calls and returned structured responses. The translation from human intent to API call happened upstream, inside an application, written by a developer who could be held accountable for it. AI connectivity moves that translation step into the runtime and hands it to the model. The system is now making consequential decisions about what the user meant in the same layer where it is deciding which API to call, and the two decisions are entangled in ways that are difficult to audit after the fact.

The trust boundary has moved, and most organizations have not yet adjusted. This is where zero trust becomes a useful lens. Network security spent the better part of a decade learning that "inside the perimeter" had stopped being a meaningful trust zone — cloud, mobile, and remote work had scattered the assets the perimeter was meant to protect. The response was zero trust: verify per-request, per-identity, per-resource, continuously, regardless of origin.

AI connectivity is doing the same thing to the application trust boundary that cloud did to the network one. The old assumption was that code inside the application was trusted because the organization had written and reviewed it. A model sitting inside that application, processing untrusted input, making authenticated calls on behalf of users, breaks that assumption. Prompt injection, exfiltration through tool descriptions, confused-deputy patterns, the various framings of the "lethal trifecta" — they all reduce to the same underlying point. The model is a trusted component handling untrusted input, and the architecture around it was not designed for that posture.

The response is structurally familiar. Stop treating "inside the application" as a trust zone. Establish trust per-action, per-identity, per-context. Assume the model can be manipulated, and design for that. The principles transfer almost directly from network zero trust; what is new is the unit being verified, which is no longer a packet or a request but an agent's decision to take an action.
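
As a rough illustration of what "trust per-action, per-identity, per-context" could look like, here is a minimal Python sketch. Everything in it is hypothetical: `ActionRequest`, `GRANTS`, `WRITE_TOOLS`, and `authorize` are illustrative names, not part of MCP or of any product, and a real deployment would draw grants from an identity provider rather than a hard-coded set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionRequest:
    identity: str        # verified principal the model is acting for
    tool: str            # tool the model has decided to call
    resource: str        # target system or record set
    input_trusted: bool  # False once untrusted content has entered the context


# Hypothetical grant table: (identity, tool, resource) triples that are allowed.
GRANTS = {
    ("alice", "crm.read", "accounts"),
    ("alice", "crm.update", "accounts"),
}

# Hypothetical set of tools treated as too risky when the context is tainted.
WRITE_TOOLS = {"crm.update"}


def authorize(req: ActionRequest) -> bool:
    """Decide each action on its own; nothing is trusted for being 'inside'."""
    if (req.identity, req.tool, req.resource) not in GRANTS:
        return False
    # Assume the model can be manipulated: deny writes from a tainted context.
    if req.tool in WRITE_TOOLS and not req.input_trusted:
        return False
    return True
```

The unit being checked is the agent's decision to act, so the same identity can be allowed a read and refused a write within the same session, depending on what the model has been exposed to.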

Actions are auditable; decisions are not. When a tool call happens, the call itself can be logged, replayed, and reasoned about. The record of the decision behind it (what the model considered, what context it had at the time, why it chose this action rather than the alternatives) is partial, probabilistic, and increasingly treated as proprietary by the model providers. Organizations are accustomed to audit trails that allow them to reconstruct what happened and why. AI connectivity gives them the what in high fidelity and the why in a form that is fundamentally less complete than what they are used to.

Composition is dynamic and unbounded. Microservices were composed at design time by engineers who decided which services would call which. Agents compose tools at runtime in ways no human designed and no test suite covered. The risk surface is not the set of individual tool calls — it is the combinatorial space of tool sequences, most of which were never explicitly considered. A read tool and a write tool that are each safe in isolation can compose into an exfiltration pattern neither tool's author anticipated. An agent with access to a CRM read tool and a Slack post tool, each individually scoped to legitimate uses, can read customer records and post them to a public channel without either tool behaving outside its specification.
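
To make the combinatorial point concrete, the sketch below does two things: it counts how many ordered tool sequences exist for a given catalog size, and it scans a catalog for ordered pairs in which a sensitive read could feed an external write. The tool names and capability labels (`reads_sensitive`, `writes_external`) are assumptions made for illustration; no protocol defines them.

```python
from itertools import permutations


def sequence_count(n_tools: int, max_len: int) -> int:
    """Number of ordered tool sequences of length 1..max_len (repeats allowed)."""
    return sum(n_tools ** k for k in range(1, max_len + 1))


# Hypothetical capability labels attached to each tool in a catalog.
TOOLS = {
    "crm.read_customers": {"reads_sensitive"},
    "slack.post_public": {"writes_external"},
    "calendar.read": set(),
}


def risky_pairs(tools):
    """Ordered (source, sink) pairs where a sensitive read can feed an external write."""
    return [
        (a, b)
        for a, b in permutations(tools, 2)
        if "reads_sensitive" in tools[a] and "writes_external" in tools[b]
    ]
```

With 20 tools and sequences of up to five calls, `sequence_count(20, 5)` gives 3,368,420 possible sequences, which is why per-tool review alone does not cover the risk surface. On the three-tool catalog above, `risky_pairs(TOOLS)` returns the single CRM-to-Slack pair described in the text.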

None of these requirements are unsolvable. Several already have partial answers in the field. But they are genuinely new, and they explain why the current integration moment feels harder than the ones that came before it, even to engineers who have lived through several. The protocols are not worse than their predecessors. They are being asked to support a different kind of caller, against different threats, with different audit needs, and the operational layer around them has not yet caught up.

The critique that "MCP is broken" usually rests on an implicit comparison to integration layers that had decades of accumulated operational maturity behind them. The fair comparison is to the equivalent moment in those earlier layers — before service meshes, before mature API gateways, before federated identity.

In every prior case, the answer was not that the protocol was wrong. The answer was that the discipline around it had not yet been built. That is the position AI-system connectivity is in now. The protocols are early, the operational discipline is earlier, and most of the visible pain lives at the seam between them.

The ownership problem underneath the layers

The criticism aimed at MCP usually collapses three things that need to stay separate: the protocol, the implementation, and the deployment. The more interesting story is who controls each of them.

The protocol is the spec. It defines how clients and servers exchange messages and describe capabilities. It is small and deliberately minimal. MCP was originated by Anthropic and donated in December 2025 to the Agentic AI Foundation, a directed fund under the Linux Foundation, co-founded with Block and OpenAI and supported by most of the major model providers and cloud vendors. Governance now sits with a neutral foundation, with day-to-day technical direction continuing under the existing maintainers. This matters in two ways for the enterprise. First, the protocol is no longer subject to a single vendor's roadmap, which removes one category of risk that was reasonably keeping some buyers cautious. Second, it does not change the underlying ownership reality for the enterprise consuming it — neutral governance is still governance by someone else, on someone else's timeline, optimized for the standard rather than for any specific deployment.

The implementation is a specific server or client built against the protocol. Implementations make hundreds of decisions the protocol does not: what to expose, how to describe it, what to authenticate, what to log, what credentials to require, what to do when arguments are malformed. Early in any integration paradigm, implementations tend to come from the community. Later, they get displaced by official versions built and operated by the SaaS vendors whose systems they front. This is already happening to MCP. The long tail of community servers for Salesforce, ServiceNow, Workday, Jira, and the rest will be replaced — is being replaced — by vendor-owned implementations. Which means the enterprise's control over the implementation layer is decreasing, not increasing, over time.

The deployment is how implementations are wired into the organization. The topology. Direct connections or mediated. Broad credentials or narrow. Centrally audited or scattered. This is the layer the enterprise actually controls. Everything else, it accepts or refuses.

This asymmetry is the part the clean three-layer story misses, and it is the part that determines what an enterprise can actually do about the current pain.

In theory, most of the visible problems in AI connectivity could be addressed at any of the three layers. Prompt injection could be mitigated at the protocol (richer trust signaling), at the implementation (servers that refuse to surface untrusted content as instructions), or at the deployment (mediation that enforces policy on actions regardless of how the model arrived at them). Tool sprawl could be addressed at the protocol (mandatory capability negotiation), at the implementation (smaller, better-scoped servers), or at the deployment (mediated discovery and filtering). Cost overruns and quality degradation have similar branching answers.

In practice, the enterprise rarely gets to pick the layer. The protocol is set by someone else. The implementations are increasingly set by vendors whose incentives only partially align with the enterprise consuming them — a SaaS vendor building an MCP server will optimize for breadth of capability and ease of adoption, not for the specific governance posture of any one customer. Which leaves the deployment layer as the place where the enterprise can act, because it is the only place where its authority is complete.

The strategic question stops being "which layer is the right place to solve this." The question becomes: can the shortcomings of the layers we do not control be addressed from within the layer we do, without the deployment layer becoming so complex that the cure is worse than the disease?

That reframe changes the texture of the current discourse considerably. It also explains the architectural pattern that keeps emerging — mediation tiers, gateways, brokers, registries, policy planes — and why it keeps emerging. These are not elegant solutions chosen because they are theoretically clean. They are the rational response to an ownership problem. The enterprise builds mediation because mediation is what you build when you need to impose your opinions on systems whose implementation you do not control and whose protocol you cannot dictate. The same logic produced API gateways when enterprises did not control the APIs they consumed, and service meshes when platform teams did not control individual service implementations.

There are real protocol-level concerns that deserve protocol-level attention — identity propagation, scope expressiveness, capability negotiation — and they are being worked on. There are implementation-level concerns that vendors will eventually internalize, because their customers will demand it. But the enterprise that waits for these layers to mature is making a bet that the upstream owners will solve the problem on a timeline that matches its own. That is rarely how it works. The enterprise that invests in the deployment layer is making a different bet: that whatever the upstream layers eventually look like, the operational capability to govern AI-driven actions will remain valuable, because the underlying need is durable across protocol generations and vendor cycles.

This is the part that holds regardless of which protocol wins. If MCP gets displaced tomorrow by something better, the ownership problem does not change. The new protocol gets defined by someone else. The new implementations get built by SaaS vendors with their own priorities. The enterprise still controls the deployment, and only the deployment. So the deployment-layer investment compounds across protocol generations in a way that protocol-specific bets do not.

The pattern is not new

Every integration paradigm in enterprise computing has eventually produced a mediation layer, and the reason is consistent across cases. It is not that mediation is the architecturally cleanest place for cross-cutting concerns. It is that mediation is where the enterprise can act when it does not control the upstream layers.

The clearest precedent is the API economy. Early REST adoption was decentralized, inconsistent, and operationally painful. Authentication was ad hoc. Endpoints sprawled. Audit was an afterthought. The response was the API gateway, which centralized auth, rate limiting, routing, transformation, and analytics into a mediation tier. The reason gateways won was not theoretical elegance — it was that enterprises did not control the APIs they were consuming, did not control how those APIs authenticated, and did not control how those APIs evolved. The gateway was the place the enterprise could impose its own opinions on systems it did not own. Today, no serious API program operates without one.

The closer structural precedent is microservices. Teams owned their services, services talked to each other directly, the network was the integration layer. The pain followed the script: observability gaps, identity sprawl, inconsistent retry behavior, emergent failures across services that no one had explicitly designed. The response was the service mesh, which moved cross-cutting concerns into a mediation layer underneath the application code. The driver, again, was an ownership problem. Platform teams did not control how individual service teams implemented their services. They could not mandate consistent retry logic, or consistent observability, or consistent mTLS, by asking nicely. The mesh let them impose those properties from a layer they did control. The service mesh arc is worth studying closely if you want to predict where AI connectivity is heading. The structural similarity is not a coincidence.

A related arc, in a parallel domain, is zero trust. Network security spent a decade working through the realization that perimeter-based trust had stopped matching enterprise reality. The response was continuous verification, identity-aware access, and policy enforcement at every hop. Zero trust matters here for two reasons. First, it is itself a maturation story — the most recent example of a security discipline working through its own messy middle, which it largely has. Second, its principles transfer almost directly to AI connectivity, because the model has done to the application trust boundary what cloud did to the network one. Inside the application is no longer a meaningful trust zone when a non-deterministic component is processing untrusted input and holding live credentials. The fix is structurally the same — stop treating "inside" as trusted, verify per-action, design for compromise — and the place that fix gets implemented is, predictably, a mediation layer between the model and the systems it acts on. For the same ownership reason: the security team does not control the model, does not control the SaaS implementations the model is calling, and can only act in the layer between them.

Across these arcs the shape is consistent. A new integration need emerges. Protocols and patterns proliferate. Deployments are decentralized because decentralization is fast and the operational implications have not yet bitten. Visible failures accumulate. Critics declare the paradigm broken. Mediation layers emerge in the one place the enterprise has full authority. The ecosystem consolidates around the mediation pattern, vendors fight, and eventually the layer becomes infrastructure no one argues about because the discipline around it has matured.

AI connectivity is on this curve. The protocols are early, the implementations are increasingly owned by SaaS vendors with their own priorities, the deployments are decentralized, the failures are accumulating, the critics are loud, and the mediation infrastructure is starting to emerge in response. This is not a sign that something has gone uniquely wrong. It is a sign that the field is in a phase every prior integration layer has passed through, and the reason it always passes through this phase is the same reason every time. Enterprises do not own the upstream layers, and they eventually build the layer they do own into something that can carry the weight.

Three postures, and one that matches the ownership reality

Organizations are responding to the current state of AI connectivity in three distinct ways. Each posture has real defenders. Only one of them aligns with the layer the enterprise actually controls.

Abandonment. Treat the current pain as evidence the paradigm is unworkable, pull back to chat-only AI, wait for something better. The reasoning is usually some combination of unacceptable security risk, unsustainable cost, and a wager that someone else will figure it out first.

The abandonment posture is rational at the level of individual risk decisions. It also refuses to engage with the ownership problem, which does not make the problem go away; it only delays the moment of engagement. The connectivity need is durable. The systems that run the business still have to be reachable by AI, and the competitive pressure to make that reach work efficiently does not pause while a given organization waits. Whichever protocol eventually wins, the implementations will still be controlled by SaaS vendors, and the enterprise will still control only the deployment layer. Sitting out the messy middle does not change that. It just means arriving at the consolidated phase with less institutional knowledge, less internal capability, and less leverage over patterns that settled without its input.

There are real exceptions. Heavily regulated industries with extreme audit requirements, organizations with genuinely modest AI ambitions, and organizations that have honestly assessed their internal capacity as insufficient for the operational discipline this phase demands — for these, abandonment can be the right call. But it should be a deliberate choice, made with awareness that the ownership problem is being deferred rather than avoided.

Premature commitment. Adopt aggressively, often at scale, without commensurate investment in the deployment layer. The reasoning is usually some combination of strategic urgency, competitive positioning, and the not-entirely-wrong instinct that being early to a maturing technology offers compounding advantage.

The failure mode here is more specific than it first appears. Premature commitment without deployment-layer investment is, functionally, a decision to trust the upstream layers to get it right. It is a bet that the protocol owners will close the protocol-level gaps quickly, and that the SaaS vendors building implementations will optimize for the customer's governance posture rather than their own product priorities. Both bets sometimes pay off, but both are structurally weak. Vendors building MCP servers will optimize for breadth of adoption, not for the audit requirements of any single customer. Protocol owners will close gaps on their own timeline, not the enterprise's. Organizations in this posture often look like leaders for twelve to eighteen months and then quietly re-platform when the gaps surface in production. The cost of the rebuild is rarely visible from outside, which is part of why this posture remains attractive.

Invested maturation. Accept that the paradigm is necessary, accept that the upstream layers will mature on their own schedule, and put deliberate effort into building authority in the layer the enterprise actually owns. That means building mediation infrastructure that handles identity propagation, audit, policy, and observability for AI-driven actions on enterprise systems, regardless of which protocol is fashionable this quarter. It means adopting connectivity incrementally, with clear boundaries on what is trusted upstream and what is mediated locally. And it means developing internal expertise in the operational concerns every maturation arc eventually surfaces: credential scoping for non-deterministic callers, audit for probabilistic decisions, and policy at the level of intent rather than action.

This posture is harder and slower in the short term. It is also the only one that responds to the actual ownership reality, which is that the enterprise does not control the protocol, will increasingly not control the implementations, and has to do its work in the layer where its authority is complete. The organizations that built API gateways before they were standard, that adopted service meshes during the messy phase, that took zero trust seriously while the perimeter model was still defended — these organizations entered the consolidated phase with capability and leverage that latecomers spent years assembling. The pattern is consistent enough across prior waves to be treated as predictive.

The argument is not that every organization should pick the third posture. It is that the choice should be made consciously, with the ownership reality in view. Abandonment is a deferral. Premature commitment is a bet on upstream layers the enterprise does not control. Invested maturation is the one posture that builds capability in the layer the enterprise actually owns, and the durability of that capability is the strategic point. Whatever the protocol landscape looks like in three years, whatever the SaaS-vendor implementations look like in five, the operational authority to govern AI-driven actions inside the enterprise will still be valuable, because the underlying need is durable across cycles.

What mediation does and doesn't solve

If the deployment layer is where the enterprise's authority is complete, and mediation is the architectural pattern that lets the deployment layer impose opinions on systems it does not control, then it is worth being precise about what mediation actually does for an organization and — more importantly — what it does not.

Mediation is doing real work on six fronts.

Identity propagation. A mediated architecture can attach a verified identity to every tool call an AI system makes, regardless of how the upstream model or server handles authentication. The model is not the actor; the identity on whose behalf the model is acting is — whether that identity is a human user, a service account, or an agent operating with its own delegated authority. This sounds like a small thing and is, in practice, the difference between an auditable system and an unauditable one. Most of the "we cannot trace agent actions back to a sponsor" problem dissolves once identity is enforced at a layer the enterprise controls.
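
One way to picture identity propagation is an envelope that the mediation tier wraps around each model-issued tool call and signs, so that downstream systems can check who the call is for. The sketch below uses an HMAC from the Python standard library purely for illustration. The envelope fields (`principal`, `on_behalf_of`), the function names, and the hard-coded key are all assumptions; a production system would more likely rely on its existing identity provider and token format.

```python
import hashlib
import hmac
import json

# Assumption: a secret held only by the mediation tier (demo value, not for real use).
SIGNING_KEY = b"demo-key-not-for-production"


def attach_identity(tool_call: dict, principal: str, on_behalf_of: str) -> dict:
    """Wrap a model-issued tool call with the identity it is acting for."""
    envelope = {"call": tool_call, "principal": principal, "on_behalf_of": on_behalf_of}
    payload = json.dumps(envelope, sort_keys=True).encode()
    envelope["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return envelope


def verify_identity(envelope: dict) -> bool:
    """Recompute the signature; reject envelopes altered after signing."""
    unsigned = {k: v for k, v in envelope.items() if k != "signature"}
    payload = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope.get("signature", ""))
```

The design point is that the model never supplies the identity itself. The mediation tier resolves it and binds it to the call, so changing `on_behalf_of` after signing causes verification to fail.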

Audit. A single, consistent log of every AI-driven action against enterprise systems — the request, the resolved identity, the policy that was applied, the result — is something no individual server or model can produce on its own. Mediation produces it by being the chokepoint every call has to pass through. The audit trail is not just for compliance; it is the substrate that makes every other operational improvement possible. You cannot reason about behavior you cannot observe.
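
The four elements named above (request, resolved identity, applied policy, result) map naturally onto a single structured record per call. The following is a sketch under that assumption. The `AuditRecord` type and its field names are hypothetical, and a real log would also need retention, access control, and integrity protection, none of which is shown here.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditRecord:
    identity: str    # resolved identity the action ran under
    tool: str        # tool that was called
    arguments: dict  # the request as received by the mediation tier
    policy: str      # name of the policy that was applied
    result: str      # for example "allowed", "denied", or "error"
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(record: AuditRecord) -> str:
    """Serialize one record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)
```

Because every call passes through the same chokepoint, every line has the same shape regardless of which server or model produced the call, which is what makes the log usable for later analysis.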

Policy enforcement. Mediation lets the enterprise apply opinions about what should happen — at the level of individual tool calls (this tool requires extra approval), at the level of sessions (this user has exceeded their daily quota for sensitive operations), and at the level of combinations (this sequence of reads followed by writes looks like exfiltration). Policy at the deployment layer is the only place these rules can live coherently, because no individual server has the context to enforce them and no model can be trusted to enforce them on itself.
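
The three levels of policy described above can be sketched as one evaluation function that sees both the next call and the session so far. All tool names, the quota value, and the rule sets below are invented for illustration, and the "daily quota" is simplified to a per-session counter.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    user: str
    sensitive_calls: int = 0
    history: list = field(default_factory=list)  # tool names called so far


# Hypothetical policy data.
NEEDS_APPROVAL = {"payments.create"}
SENSITIVE = {"crm.read_customers", "payments.create"}
SENSITIVE_QUOTA = 3
SENSITIVE_READS = {"crm.read_customers"}
EXTERNAL_WRITES = {"slack.post_public"}


def evaluate(session: Session, tool: str) -> str:
    """Return 'allow', 'deny', or 'needs_approval' for the next tool call."""
    # Session level: quota on sensitive operations.
    if tool in SENSITIVE and session.sensitive_calls >= SENSITIVE_QUOTA:
        return "deny"
    # Combination level: a sensitive read followed by an external write.
    if tool in EXTERNAL_WRITES and any(t in SENSITIVE_READS for t in session.history):
        return "deny"
    # Tool level: some tools always require extra approval.
    if tool in NEEDS_APPROVAL:
        return "needs_approval"
    return "allow"


def record(session: Session, tool: str) -> None:
    """Update session state after a call has been allowed and executed."""
    session.history.append(tool)
    if tool in SENSITIVE:
        session.sensitive_calls += 1
```

The second and third checks depend on state that no individual server holds, which is the reason these rules have to live in the deployment layer.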

Human-in-the-loop control. Because every tool call passes through a single layer, the mediation tier can pause execution on calls that meet defined criteria — high-impact actions, novel patterns, anything crossing a risk threshold — and route them to a human for approval before they proceed. This converts a unilateral model decision into a deliberate human one at the moments where unilateral autonomy is the wrong posture. Without a chokepoint, there is nowhere to insert the pause; the model either acts or it doesn't, and the human has no surface to intervene on. With a chokepoint, the spectrum between "fully autonomous" and "fully manual" becomes configurable on a per-action basis.
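
A minimal sketch of the pause-and-approve pattern follows. The risk scores, the `mediate` function, and the callback signatures are assumptions made for the example; in practice the approval step would be asynchronous and routed through a ticketing or chat workflow rather than a direct function call.

```python
from typing import Callable

# Hypothetical risk scores per tool; unknown tools are treated as highest risk.
RISK = {"calendar.read": 0.1, "crm.update": 0.5, "payments.create": 0.9}


def mediate(
    tool: str,
    args: dict,
    execute: Callable[[str, dict], str],
    approve: Callable[[str, dict], bool],
    threshold: float = 0.7,
) -> str:
    """Run low-risk calls directly; pause risky ones for a human decision."""
    risk = RISK.get(tool, 1.0)
    if risk >= threshold and not approve(tool, args):
        return "rejected by reviewer"
    return execute(tool, args)
```

Moving `threshold` toward 0 makes every call manual and moving it above 1 makes every call autonomous, which is the configurable spectrum the paragraph above describes.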

Data protection. Inputs and outputs flowing through a mediated layer can be inspected, redacted, or blocked. Sensitive content can be stripped before it reaches the model. Output can be checked against policies about what can leave specific systems. This is where the DLP-style controls organizations already understand from their existing security stack get applied to a new attack surface.
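
As a small example of inspection and redaction at the mediation layer, the sketch below strips two kinds of sensitive token from text before it is passed on. The patterns are deliberately simple and purely illustrative; real DLP rule sets are broader, tuned per organization, and usually not regex-only.

```python
import re

# Illustrative patterns only, not a complete or production-grade rule set.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each match with a labeled placeholder before the text moves on."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text
```

The same function can sit on both directions of traffic: on tool responses before they reach the model, and on model outputs before they leave a system.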

Session-level reasoning. This is the most distinctive thing mediation enables, and the one that maps least cleanly to prior integration patterns. AI risk is rarely about individual tool calls. It is about sequences: a procurement agent that reads vendor records and then writes a payment instruction is taking two actions that are individually routine and consequential in combination, and an innocuous query can become meaningful only when it is followed by a specific action. A mediation layer that can see the whole session can apply reasoning that no single tool author or server can. This is the part that requires the mediation layer itself to be more than a router; it has to understand the calls flowing through it.
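
One simple form of session-level reasoning is to carry labels forward from earlier calls and check later calls against them. The sketch below applies this to the procurement example. The tool names, the label `vendor_records`, and the `SessionMonitor` class are hypothetical, and label tracking is only one of several possible techniques for reasoning over a session.

```python
# Hypothetical metadata: labels a tool's output carries into the session,
# and labels that should trigger review if present when a tool is called.
PRODUCES = {"vendors.read": {"vendor_records"}}
REVIEW_AFTER = {"payments.write_instruction": {"vendor_records"}}


class SessionMonitor:
    """Tracks labels picked up by earlier calls in a single session."""

    def __init__(self) -> None:
        self.labels = set()

    def observe(self, tool: str) -> str:
        """Return 'flag' if this call is consequential given what came before."""
        flagged = REVIEW_AFTER.get(tool, set()) & self.labels
        self.labels |= PRODUCES.get(tool, set())
        return "flag" if flagged else "ok"
```

The same `payments.write_instruction` call returns `ok` at the start of a session and `flag` once `vendors.read` has been observed, which no per-call check could distinguish.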


These six capabilities are substantial, and they explain why mediation keeps emerging as the architectural answer regardless of the specific connectivity protocol. They are also bounded. There are five categories of problem mediation does not solve, and being honest about them is what separates the argument from a sales pitch.

Prompt injection at the model boundary. Mediation can apply policy to actions, but it cannot prevent the model from being manipulated by content it processes. There is a meaningful subset of this problem mediation does help with: when malicious content reaches the model through a tool call response — a document fetched from a file share, an email body, a Confluence page with embedded instructions — that content flows through the mediation layer on its way back to the model, and can be inspected, scanned for injection patterns, or redacted before delivery. What mediation cannot see is content that reaches the model through paths it does not sit on: direct user prompts, system prompts, files uploaded straight to the model, content from third-party context sources the mediation layer is not party to. For those paths, if the model is induced to call a tool it should not have called, the mediation layer can stop the call — but the model has already been compromised, its reasoning has been hijacked, and downstream defenses are the only line of protection. This is a model-layer problem at root, not a connectivity-layer problem.
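
For the subset of the problem that mediation can help with, inspection of tool responses might look like the sketch below. The patterns are crude illustrations chosen for this example; pattern matching of this kind catches only unsophisticated injections and should be read as a placeholder for whatever scanner or classifier an organization actually uses.

```python
import re

# Illustrative heuristics only; they will miss many real injection attempts.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"you are now\b", re.IGNORECASE),
    re.compile(r"do not (tell|inform) the user", re.IGNORECASE),
]


def scan_tool_response(content: str) -> list:
    """Return the patterns that matched in content returned by a tool."""
    return [p.pattern for p in INJECTION_PATTERNS if p.search(content)]


def deliver(content: str) -> str:
    """Pass clean content through; replace suspicious content with a notice."""
    if scan_tool_response(content):
        return "[content withheld: possible embedded instructions]"
    return content
```

This only covers content that flows back through the mediation layer. Direct prompts, uploads, and other paths the layer does not sit on are untouched, as the paragraph above notes.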

Policy at the level of intent. Mediation enforces policy on actions and combinations of actions. It does not, today, enforce policy on intent — the question of whether what the user is trying to do is something the organization wants to allow, independent of which specific tools get called. Intent-level policy requires the mediation layer to understand semantics, not just syntax, and the field is early on this. Some mediation platforms apply LLM-based classifiers to inputs and outputs; the technique works for many cases and fails in interesting ways for others. This is one of the frontiers.

Composition risk across agents. Single-agent composition is hard enough. Multi-agent systems, where one agent's output becomes another's input and decisions propagate across boundaries no single mediation layer can see, are harder still. The current architectural assumption is that mediation sits between a model and its tools. Multi-agent topologies break that assumption, because the "tools" of one agent may be other agents, and the policy questions become recursive. The field does not have a clean answer here.

The mediation layer as a target. Centralizing every AI action through a single layer makes that layer a high-value target. Compromise of the mediation layer is, in principle, catastrophic in a way that compromise of any individual server is not. The mitigation is the same as it has always been for centralized control planes — defense in depth, segmentation, careful operational hygiene, auditing the auditor — but the risk is real and worth naming. Organizations adopting mediation should plan for it the same way they planned for the API gateway becoming a critical-path component.

The audit gap between decision and action. Mediation produces high-fidelity records of what happened. It does not, on its own, produce records of why the model decided to do it. The decision lives inside the model's reasoning, which is partial, probabilistic, and increasingly proprietary. Some platforms attempt to capture reasoning artifacts — chain-of-thought, intermediate context, retrieval snippets — and these help, but they are not the same as the kind of complete decision trail organizations are used to having for code-driven systems.

The honest assessment is that connectivity-layer mediation handles the cross-cutting operational concerns the enterprise has always needed to handle — identity, audit, policy, human approval, data protection — and adds a new capability — session-level reasoning — that maps to the new risk surface. The model-layer problems, the intent problems, and the multi-agent problems sit outside its scope. Other layers of defense have their own contributions to make on those — model-side guardrails, content classifiers, agent identity frameworks, and the broader set of controls that surround the model itself. The argument here is not that connectivity-layer mediation is sufficient on its own; it is that for the cross-cutting operational concerns within its scope, it is the layer where the enterprise has the authority to act.

The discipline, not the protocol

The takeaway is not that any specific protocol wins, or that the critics are wrong, or that the pain is fake. It is that AI-system connectivity is an emerging discipline, it is following a recognizable maturation arc, and the patterns of its resolution are visible to anyone willing to look at the shape rather than the noise.

The organizations that navigate it well will be the ones that stop arguing about the protocol and start thinking carefully about the deployment topology. They will treat the layer they actually own, the mediation layer between AI components and the systems those components reach into, as the strategic investment, not as plumbing. They will accept that the protocol will mature on its own schedule, that the implementations will mature on their own schedule, and that neither schedule will match their own. And they will treat AI-system connectivity the way the security discipline treated zero trust five years ago: not as a product, but as a posture. The question is not whether to participate in AI-system connectivity. The question is whether the participation builds durable capability or temporary dependency.

The messy middle is uncomfortable. It is also where the durable patterns are formed. Every integration paradigm has passed through this phase, and every time, the layer eventually became boring. The organizations that did the work early were the ones already past the messy middle when everyone else realized it had ended.

Written by the team at Cinchy, makers of PeriMind.