Your AI Stack Has No Security Boundary (And That Should Terrify You)

February 19, 2026
Alex Peay, COO

"Why are you solving old infrastructure problems?"

We get this question a lot. From investors, from developers, from people who look at the AI infrastructure gold rush and wonder why ContextOS is building a unified platform instead of chasing the latest model serving framework.

It's a fair question. Here's why the answer should concern everyone building AI applications today.

There's No Such Thing as a "Pure AI Application"

Let me walk you through what an AI-powered application actually looks like in production. Not the demo. Not the hackathon prototype. The real thing.

Take a RAG-powered support system — the kind of application thousands of companies are building right now. An agent working a ticket asks "how did we resolve this type of issue before?" and the system retrieves relevant context from past tickets, documentation, and runbooks, then generates an answer using an LLM.

Sounds like an AI application. But look at what's actually running underneath.

On the traditional side, you have PostgreSQL for tickets, user accounts, organizations, SLA routing rules, and audit logs. You have API servers and a frontend. Redis for session management and rate limiting. Background workers handling email notifications and SLA breach alerts. Object storage for attachments and uploaded documents. This is a standard web application — the kind the industry has been building for two decades.

On the AI side, you have a vector database storing embeddings of your knowledge base. An embedding service converting text to vectors at both ingestion and query time. An ingestion pipeline that chunks, processes, and indexes source documents. An LLM integration layer that constructs prompts from retrieved context and handles responses. An orchestration layer tying retrieval and generation together.

Here's the point: these aren't two separate systems. They're one application. The AI capability is woven into the traditional application at every level. When that support agent asks a question, the request flows from the frontend through the API tier, into vector search, through prompt construction, to the LLM, and back — touching both "traditional" and "AI" infrastructure at every step.
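That request path can be sketched end to end in a few dozen lines. This is an illustrative skeleton only, not ContextOS or any vendor's API: the embedding function is a toy hash, the "vector store" is an in-memory list, and the LLM call is stubbed out — the point is just to show one request touching retrieval, prompt construction, and generation.

```python
import math

# Stub embedding: hash words into a small fixed-size vector.
# A real system would call an embedding service here (assumption).
def embed(text: str, dim: int = 8) -> list[float]:
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))  # inputs are unit-normalized

# Toy "vector store": past tickets embedded at ingestion time.
TICKETS = [
    "reset the SAML certificate to fix login loop",
    "increase worker memory to stop SLA breach alerts",
    "reindex attachments after the S3 bucket migration",
]
INDEX = [(t, embed(t)) for t in TICKETS]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Vector search step: rank past tickets by similarity to the query."""
    qv = embed(query)
    ranked = sorted(INDEX, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def answer(query: str) -> str:
    """Prompt construction step: stitch retrieved context into an LLM prompt.
    A real system would send `prompt` to an LLM; we return it to show the flow."""
    context = retrieve(query)
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"
    return prompt
```

Even this toy version has two distinct halves — an ingestion/indexing path and a query path — which is exactly the doubling described above.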

AI applications don't simplify infrastructure. They roughly double it.

The Multi-Vendor Reality

So what does building this actually look like today? A developer stitches together seven separate services:

Vercel for the frontend. Railway for the API backend and background workers. Pinecone for the vector database. Upstash for Redis. Modal for the AI pipeline compute (ingestion, batch embedding). AWS S3 for object storage. OpenAI for embeddings and inference.

That's seven vendor accounts. Roughly fifteen separate configuration surfaces across those vendors — vercel.json, railway.toml, Pinecone index configs, IAM policies, API keys for each service. Six or more monitoring dashboards with no correlation between them.
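From the application code's point of view, that sprawl looks like a pile of bearer credentials in environment variables. The variable names below are hypothetical (each vendor documents its own convention); the point is the sheer count of secrets a single application has to carry.

```python
import os

# Hypothetical env var names -- one or more bearer credentials per vendor.
REQUIRED_SECRETS = [
    "PINECONE_API_KEY",          # vector database
    "OPENAI_API_KEY",            # embeddings + inference
    "UPSTASH_REDIS_REST_TOKEN",  # cache / rate limiting
    "MODAL_TOKEN_ID",            # AI pipeline compute
    "MODAL_TOKEN_SECRET",
    "AWS_ACCESS_KEY_ID",         # object storage
    "AWS_SECRET_ACCESS_KEY",
    "DATABASE_URL",              # PostgreSQL, usually with an inline password
]

def secret_inventory(env: dict[str, str]) -> dict[str, bool]:
    """Report which bearer credentials are present in a process environment."""
    return {name: name in env for name in REQUIRED_SECRETS}

inventory = secret_inventory(dict(os.environ))
missing = [name for name, present in inventory.items() if not present]
```

Every entry in that list is a long-lived secret that has to be provisioned, rotated, and kept out of commits — in at least three different vendors' configuration systems.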

And zero unified security boundaries.

The Security Gap Nobody's Talking About

This isn't just an operational inconvenience. It's a security crisis in slow motion.

GitGuardian's 2025 State of Secrets Sprawl report found 23.8 million new hardcoded secrets in public GitHub commits in 2024 alone — a 25% increase over the prior year. Seventy percent of valid secrets detected in public repositories in 2022 are still active today. And according to Verizon's 2024 Data Breach Investigations Report, stolen credentials have been used in 31% of all breaches over the past decade.

These aren't abstract statistics. In December 2024, Chinese state-sponsored attackers breached the U.S. Treasury Department by exploiting a single compromised API key for BeyondTrust's remote support software, which Treasury used for technical support. One API key. That's all it took to access Treasury workstations and unclassified documents.

Now look at that multi-vendor RAG stack and count the API keys.

Pinecone authenticates via API keys — each project gets one or more keys, and every API call requires one. Modal uses token-based authentication against its control plane running in AWS us-east-1. Upstash uses REST API keys. AWS S3 uses IAM credentials. Each of these keys lives in environment variables scattered across Vercel, Railway, and Modal configuration files.

There is no mutual TLS between any of these services. No certificate-based identity. Traffic between Vercel, Railway, Pinecone, and OpenAI travels over the public internet with API key headers as the sole authentication. A compromised key in one service doesn't trigger an alert in any other. There is no coordinated audit trail — if you need to answer "who accessed what data and when" for a compliance audit, you're pulling logs from six dashboards and correlating timestamps by hand.
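"Correlating timestamps by hand" is literal. Without a shared audit trail, an investigation means exporting per-vendor logs — each in its own shape and timestamp format — normalizing them, and merge-sorting them into one timeline. The log entries below are invented for illustration:

```python
from datetime import datetime, timezone

# Invented per-vendor log entries -- each vendor exports a different shape.
vercel_logs = [{"ts": "2026-02-19T10:00:03Z", "msg": "edge request /api/search"}]
pinecone_logs = [{"time": 1771495205.0, "op": "query", "index": "support-kb"}]
s3_logs = [{"eventTime": "2026-02-19T10:00:07Z", "eventName": "GetObject"}]

def to_utc(entry: dict) -> datetime:
    """Normalize each vendor's timestamp field to an aware UTC datetime."""
    if "ts" in entry:
        return datetime.fromisoformat(entry["ts"].replace("Z", "+00:00"))
    if "time" in entry:
        return datetime.fromtimestamp(entry["time"], tz=timezone.utc)
    return datetime.fromisoformat(entry["eventTime"].replace("Z", "+00:00"))

def unified_timeline(*sources: list) -> list:
    """Merge all vendors' logs into one chronologically ordered timeline."""
    entries = [e for src in sources for e in src]
    return sorted(entries, key=to_utc)

timeline = unified_timeline(vercel_logs, pinecone_logs, s3_logs)
```

On a unified platform this script shouldn't need to exist — the audit trail is produced in one place, in one format, to begin with.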

And it gets worse when you think about what's actually stored in that vector database. Embeddings aren't just abstract mathematical representations. Researchers at Cornell demonstrated in 2023 that text embeddings from common models can be inverted to recover significant portions of the original text. Your customer support tickets, internal documentation, and proprietary knowledge stored as vectors in Pinecone aren't "safe" because they're in a different format. They are your data — and they're sitting on a completely separate platform, under a completely separate security model, from the source data they represent.
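A toy example shows why vectors are not anonymized data. Even with a crude bag-of-words embedding, a leaked vector alone is enough to identify which confidential document produced it — and the real inversion attacks cited above go much further, reconstructing text from the embeddings of production models. Everything below is invented for illustration:

```python
import math

def bow_embed(text: str, dim: int = 16) -> tuple:
    """Toy bag-of-words embedding, normalized to unit length."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return tuple(x / norm for x in vec)

# "Confidential" documents -- and the vectors an attacker might exfiltrate.
DOCS = [
    "customer 4412 reported an outage on the billing cluster",
    "internal runbook: rotate the production signing key monthly",
    "acquisition diligence notes for project bluebird",
]
STOLEN_VECTORS = [bow_embed(d) for d in DOCS]

def identify(vector: tuple, candidates: list = DOCS) -> str:
    """Given only a leaked vector, recover which document produced it."""
    def similarity(doc: str) -> float:
        dv = bow_embed(doc)
        return sum(a * b for a, b in zip(vector, dv))
    return max(candidates, key=similarity)
```

If a stolen vector can be matched back to its source this easily in a toy, the security model protecting the vectors has to be as strong as the one protecting the source data.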

For any enterprise pursuing SOC 2, HIPAA, or FedRAMP compliance, this is a nightmare. You can't demonstrate a coherent security posture when your application's data flows through seven different vendors' security models connected by API keys stored in environment variables.

The Industry Is Repeating the Same Mistake

If you've been following this blog series, you'll recognize the pattern. In my first post, I wrote about the difference between building spikes and wedges — how companies with sharp initial products but no underlying platform hit architectural ceilings they can't break through. In my second, I examined three specific patterns where infrastructure startups made rational short-term decisions that created long-term constraints.

The AI infrastructure gold rush is doing both simultaneously.

Companies like Modal, Replicate, and Baseten are building excellent products that solve specific pieces of the AI pipeline — serverless GPU compute, model serving, inference APIs. I respect the engineering. But they're building spikes. A developer using Modal for inference still needs Railway for their API, Vercel for their frontend, Pinecone for their vectors, and Upstash for their cache. Nobody's building the unified platform where traditional and AI components are provisioned, secured, and monitored together.

Now, someone will point out that AWS offers all of these capabilities under one roof. That's true. But 200+ services, IAM policy complexity that requires dedicated engineers, and months of configuration don't add up to simplicity — that's a different kind of fragmentation, one where the complexity lives inside a single invoice instead of across multiple vendors. The gap in the market isn't "unified." The gap is unified and simple.

Why We Built the Platform First

This is why my co-founder Tom Hatch and I made the decision we made. It wasn't because we're solving "old" infrastructure problems. It's because we knew that AI applications would need the same foundation that every production application needs — databases, compute, caching, storage, networking, security — and that foundation needs to be unified, not assembled from vendor parts held together by API keys.

ContextOS runs PostgreSQL with pgvector, which means application data and vector embeddings live in the same database under the same security model. There's no separate Pinecone account, no separate auth model, no separate billing, no embeddings sitting on a different platform than the data they represent. You get SQL joins between your application data and vector similarity search. When a support article updates, the embedding updates in the same transaction.
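The "same transaction" claim is the crux, so here is the pattern in miniature. With pgvector, the embedding sits in a `vector` column next to the row it describes; since pgvector can't run in a stdlib sketch, this example uses sqlite3 with a serialized vector and a stubbed embedding function to show the same atomic write — the document and its embedding change together or not at all.

```python
import json
import sqlite3

def embed(text: str) -> list:
    # Stub: a real system would call an embedding model here (assumption).
    return [float(len(w)) for w in text.lower().split()[:4]]

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT, embedding TEXT)"
)

def upsert_article(article_id: int, body: str) -> None:
    """Write the article body and its embedding in one transaction,
    so a reader can never observe a body paired with a stale embedding."""
    with db:  # sqlite3 connection as context manager: BEGIN ... COMMIT/ROLLBACK
        db.execute(
            "INSERT INTO articles (id, body, embedding) VALUES (?, ?, ?) "
            "ON CONFLICT(id) DO UPDATE SET body = excluded.body, "
            "embedding = excluded.embedding",
            (article_id, body, json.dumps(embed(body))),
        )

upsert_article(1, "restart the ingest worker")
upsert_article(1, "restart the ingest worker after rotating credentials")
```

In the split-vendor version, that single call becomes a database write plus a separate network round trip to the vector service — two systems, two failure modes, and a window where they disagree.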

That single consolidation eliminates an entire vendor category for most RAG use cases. Not all — if you're running billion-scale vector search, a purpose-built solution may still make sense. But for the vast majority of AI-powered applications being built today, pgvector within a well-tuned PostgreSQL instance is more than sufficient and dramatically simpler.

And it's not just the database. The Zero Trust Bridge covers every connection on the platform — automatic TLS, RBAC, every service authenticated by default. No API keys in environment variables. No manual CORS configuration between services. No hoping that seven different vendors' security models don't have gaps between them.

That's what it means to build a wedge instead of a spike. The platform architecture we invested in from day one — the same architecture that took longer than shipping a quick developer experience product on AWS — is exactly what makes this possible. Every new capability we add composes with what already exists instead of requiring another vendor, another API key, another security boundary.

The Real Question

The question isn't "why are you solving old infrastructure problems." The question is: why is everyone else pretending that AI applications don't need the same infrastructure every production application has always needed?

If you're starting to add an AI layer to your applications, we want to talk. Sign up for our beta and let us show you what a unified platform can do — for your infrastructure and your security posture.

[Sign up for the ContextOS Beta →]

Sources: GitGuardian, "The State of Secrets Sprawl 2025"; Morris et al., "Text Embeddings Reveal (Almost) As Much As Text" (2023), arxiv.org/abs/2310.06816; Verizon 2024 Data Breach Investigations Report; IBM Cost of a Data Breach Report 2024.