Whitepaper 2 of 2 · Perfai Security · 2026

The Runtime Security Crisis Hidden Inside Every AI App

AI coding agents did not create the access control problem. They made it move too fast to manage by hand. Authentication, authorization, multi-tenancy, SSO, collaborators: the permission model of a modern AI app is so complex that misconfiguration is inevitable. And no scanner catches it before a breach does.

The problem was never the code. It is what happens at runtime.

There is a persistent myth in application security: fix the code, fix the security. SAST, DAST, and code scanning tools exist because of this belief — and they have real value against injection, dependency, and misconfiguration issues.

But they miss the most dangerous class of vulnerability in modern apps entirely: access control. OWASP's own documentation states it plainly: "A tool cannot easily determine if User A should have access to Record B. Only the business logic and the specific domain requirements dictate that entitlement." Access control is a runtime problem. It can only be verified by testing what the application actually does — in a real session, with a real user type, hitting a real endpoint.
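To make "testing what the application actually does" concrete, here is a minimal sketch of a runtime check. It assumes a toy in-memory record store and two hypothetical handlers (`get_record_vulnerable`, `get_record_checked`); none of these names come from a real framework. The harness calls a handler as every user against every record and reports any case where another tenant's data comes back.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str
    tenant_id: str


# Hypothetical data store: record id -> owning tenant and payload.
RECORDS = {
    "rec-1": {"tenant_id": "tenant-a", "body": "A's invoice"},
    "rec-2": {"tenant_id": "tenant-b", "body": "B's invoice"},
}


def get_record_vulnerable(user: User, record_id: str) -> dict:
    """Returns any record that exists: the caller is never authorized."""
    return RECORDS[record_id]


def get_record_checked(user: User, record_id: str) -> dict:
    """Same lookup, plus a server-side ownership check."""
    record = RECORDS[record_id]
    if record["tenant_id"] != user.tenant_id:
        raise PermissionError("record belongs to another tenant")
    return record


def cross_tenant_leaks(handler) -> list[tuple[str, str]]:
    """Call the handler as every user against every record and return the
    (user_id, record_id) pairs where foreign data was handed back."""
    users = [User("alice", "tenant-a"), User("bob", "tenant-b")]
    leaks = []
    for user in users:
        for record_id, record in RECORDS.items():
            try:
                handler(user, record_id)
            except PermissionError:
                continue  # correctly refused
            if record["tenant_id"] != user.tenant_id:
                leaks.append((user.user_id, record_id))
    return leaks
```

Both handlers look plausible in isolation. Only exercising them with a second identity shows that the first one leaks.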

The user type explosion — who is using your app today?

A decade ago, a typical web app had users and admins. A modern AI app must correctly handle a matrix of eight or more user types — each with different authentication methods, session lifecycles, permission inheritance rules, and data visibility requirements. Every new user type multiplies the permission surface. Every misconfiguration in one type is a potential breach across all.

👤

Regular users

Email/password, social login, scope-limited data access

🏢

Enterprise SSO

SAML, OIDC, Okta, Entra ID — each with different claim mappings

🏗

Multi-tenant

Each tenant's data invisible to all others at every layer

🔗

Collaborators

Invited users with scoped, time-limited, per-resource permissions

🎧

Support staff

Impersonation rights, read-only views, audit trail requirements

🤖

AI agents

Service accounts acting for users — frequently over-privileged by default

🔌

API / M2M

Machine tokens, webhooks, third-party integrations with broad scopes

🛡

Super admins

Platform-level access that must never bleed into tenant data

The cognitive load of reasoning about every combination of user type, endpoint, workflow, and tenant at every release is immense. Misconfigurations are not failures of skill. They are an inevitable result of complexity at scale.
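One way to keep an AI agent from being over-privileged by default is to cap it at what the delegating user can do. The sketch below assumes permissions are plain strings such as `data:read`; the function names and permission labels are illustrative, not taken from any specific product.

```python
def effective_agent_permissions(user_permissions: set[str],
                                agent_scopes: set[str]) -> set[str]:
    """An agent acting for a user gets at most what BOTH the user holds
    and the agent was granted (set intersection)."""
    return user_permissions & agent_scopes


def agent_may(action: str, user_permissions: set[str],
              agent_scopes: set[str]) -> bool:
    """Deny unless the action survives the intersection."""
    return action in effective_agent_permissions(user_permissions, agent_scopes)


# Hypothetical example values.
user_permissions = {"data:read", "export:create"}
agent_scopes = {"data:read", "record:delete"}  # broad service-account grant
```

With these values the agent can read data but cannot export (it was never granted that scope) and cannot delete (the user it acts for cannot either).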

Seven layers where authentication and authorization fail — independently

Most apps implement security across multiple layers. The problem: each layer introduces its own misconfiguration surface, and the layers do not protect each other. A failure at any layer is a potential breach.

L1 · Authentication · High

Identity must be correctly established before anything else matters. Compromised credentials were the #1 initial access vector in Verizon DBIR 2025 (22,000+ incidents analyzed).

L2 · SSO / SAML / OAuth · Critical

Five critical SAML CVEs in 4 months (2025–2026). SAML signature wrapping, XML parser differentials, nonce bypass: one misconfiguration grants admin access to anyone.

L3 · Session management · High

Token theft, session fixation, improper expiry. In 2024 SaaS attacks, Obsidian Security documented attackers moving from access to exfiltration in as little as 9 minutes.

L4 · Authorization · Critical

RBAC, ABAC, policy engines: each must be enforced independently at the UI, API, and database layers. AppSecure: 63% of SOC 2 audit failures are access control. UI checks are routinely not mirrored at the API layer.

L5 · Tenant isolation · Critical

Row-level security, connection pool contamination, async context leaks. CVE-2024-10976 (PostgreSQL) and CVE-2025-8713 showed RLS can leak data via query plan side-channels. RLS alone is not sufficient.

L6 · API-level enforcement · High

99% of organizations reported at least one API security issue in the past year (SQ Magazine 2026). Direct API calls bypass application-level controls. Only 21% of organizations can effectively detect API attacks.

L7 · Collaborator / support scope · Medium

Time-limited, resource-scoped, cross-tenant impersonation: the most hand-coded and least tested layer in most apps. Where "support staff" can see one tenant's data is rarely explicitly validated.
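The collaborator layer is easier to reason about when every dimension of a grant is checked in one place. The following is a minimal sketch, assuming a hypothetical `Grant` record with a tenant, a single resource, a set of actions, and an expiry. A real system would load grants from storage and log each decision.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    collaborator_id: str
    tenant_id: str
    resource_id: str
    actions: frozenset[str]
    expires_at: datetime


def is_allowed(grant: Grant, collaborator_id: str, tenant_id: str,
               resource_id: str, action: str, now: datetime) -> bool:
    """Allow only if identity, tenant, resource, action, and time all match.
    Any mismatch denies, so a forgotten dimension fails closed."""
    return (
        grant.collaborator_id == collaborator_id
        and grant.tenant_id == tenant_id
        and grant.resource_id == resource_id
        and action in grant.actions
        and now < grant.expires_at
    )
```

Passing `now` as an argument rather than reading the clock inside the function keeps the expiry path testable, which matters for a layer that is otherwise rarely exercised.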

The SSO crisis — enterprise authentication is a minefield of misconfigurations

SSO was supposed to simplify authentication. For users, it does. For developers, it introduced a new class of subtle, high-impact misconfigurations that are nearly impossible to detect without runtime testing — and nearly impossible to fix with a patch cycle that keeps failing.

CVE-2024-45409 · CVSS 9.8

Ruby-SAML auth bypass (2024)

Attacker logs in as any user via XML parser differential attack. Impacted GitLab and hundreds of Ruby-based SAML deployments. No authentication required to exploit.

CVE-2024-4985 · CVSS 10.0

GitHub Enterprise SAML bypass (2024)

SAML SSO bypass granting admin provisioning. A misconfigured encrypted assertions feature — not a code bug, a configuration gap. Maximum severity rating.

CVE-2025-59718/19 · Active exploitation

Fortinet FortiOS SSO bypass (2026)

Active exploitation from January 2026. SSO bypass via forged SAML tokens. Succeeded against fully patched devices — new attack paths in old protocols never end.

2025 · OAuth supply chain

Salesloft-Drift breach (2025)

Biggest SaaS breach of 2025. Compromised OAuth tokens from one app gave attackers access to hundreds of downstream environments — 10× blast radius vs. direct compromise (Obsidian Security).

"SAML had five critical vulnerabilities in four months (late 2025 – early 2026). The underlying XML architecture of SAML is inherently fragile — multiple XML parsers with subtly different behavior means patch cycles keep failing. Incremental fixes to SAML's XML signature validation keep failing because the underlying architecture is the problem."

— PortSwigger researcher Zak Fedotkin, Black Hat Europe, December 2025 (WorkOS blog, April 2026)
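Signature validation belongs in a maintained SAML or OIDC library, but claim mapping is usually hand-written, and it is one place where a single mistake can hand out admin access. The sketch below assumes the assertion or token has already been verified elsewhere. The claim names (`iss`, `groups`), the group names, and the mapping table are hypothetical and vary by identity provider.

```python
# Hypothetical mapping from IdP group names to application roles.
GROUP_TO_ROLE = {
    "app-admins": "admin",
    "app-editors": "editor",
    "app-viewers": "viewer",
}


def roles_from_claims(claims: dict, expected_issuer: str) -> set[str]:
    """Map asserted groups to roles. An unexpected issuer, a missing
    groups claim, or an unknown group all yield no role (deny by default)."""
    if claims.get("iss") != expected_issuer:
        return set()
    groups = claims.get("groups", [])
    if isinstance(groups, str):
        # A single group may arrive as a bare string; without this branch
        # the loop below would iterate over its characters.
        groups = [groups]
    return {GROUP_TO_ROLE[g] for g in groups if g in GROUP_TO_ROLE}
```

The design choice is that nothing in the mapping ever falls back to a default role: a claim the application does not recognize grants nothing.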

Multi-tenancy — the architecture that multiplies every permission mistake

Multi-tenancy is the economic foundation of SaaS. One app instance, many customers, shared infrastructure. It is also the architectural pattern that turns a single misconfiguration into a breach affecting every customer simultaneously — and the pattern that makes runtime testing non-negotiable.

The most common multi-tenancy failure is deceptively simple: a database query correctly filters by tenant ID in 99% of cases. The 1% — a new collaborator workflow, a support impersonation scenario, a background job without tenant context — returns data across tenant boundaries. The query looks correct in code review. It fails at runtime, in a specific session context that no static tool can reproduce.
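A common mitigation for the forgotten-filter case is to make an unscoped query impossible to write. This is a minimal sketch using an in-memory SQLite database; the `records` table and the `TenantScopedRepo` class are illustrative assumptions, not a prescribed schema.

```python
import sqlite3


class TenantScopedRepo:
    """All reads go through methods that always bind the tenant id."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str):
        if not tenant_id:
            # A background job with no tenant context fails here,
            # instead of silently querying across tenants.
            raise ValueError("tenant_id is required")
        self._conn = conn
        self._tenant_id = tenant_id

    def list_records(self) -> list[str]:
        rows = self._conn.execute(
            "SELECT body FROM records WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        ).fetchall()
        return [body for (body,) in rows]


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway database with rows for two tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, body TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO records (tenant_id, body) VALUES (?, ?)",
        [("tenant-a", "a1"), ("tenant-b", "b1"), ("tenant-a", "a2")],
    )
    return conn
```

This narrows the problem rather than solving it: any code path that bypasses the repository and talks to the connection directly is still unscoped, which is why the behavior needs to be checked at runtime as well.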

① Connection pool contamination · Critical

Shared DB connections carry tenant context from prior requests. When pooling serves the same connection to Tenant B with Tenant A's session context still active, cross-tenant data leaks silently.

② Async context leaks · Critical

In Node.js, Go, and FastAPI, misuse of AsyncLocalStorage or global singletons storing tenant_id causes data to bleed across concurrent requests from different tenants.

③ Row-level security gaps · High

CVE-2024-10976 (PostgreSQL) and CVE-2025-8713 showed RLS can leak sampled data from other tenants via query plan side-channels. RLS is necessary but not sufficient.

④ URL / object parameter manipulation · Critical

AppSecure found a healthcare SaaS where changing one URL parameter exposed any customer's patient records. The flaw existed in production for 11 months before discovery, and it was found through a pentest, not automated testing.
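The async context leak is easy to reproduce in a few lines. The sketch below uses Python's `asyncio` and `contextvars` (the standard-library counterpart to Node's AsyncLocalStorage) to contrast a module-level global with a per-task context variable. The handler names are hypothetical, and `asyncio.sleep(0)` stands in for any real I/O wait.

```python
import asyncio
from contextvars import ContextVar

# Anti-pattern: one module-level slot shared by every in-flight request.
global_tenant = None

# Per-task slot: each asyncio task works on its own copy of the context.
current_tenant: ContextVar[str] = ContextVar("current_tenant")


async def handle_with_global(tenant_id: str) -> str:
    global global_tenant
    global_tenant = tenant_id
    await asyncio.sleep(0)   # yield to other requests, as real I/O would
    return global_tenant     # may now hold another request's tenant


async def handle_with_contextvar(tenant_id: str) -> str:
    current_tenant.set(tenant_id)
    await asyncio.sleep(0)
    return current_tenant.get()  # still this task's own value


async def run_both() -> tuple[list[str], list[str]]:
    """Run two concurrent 'requests' through each handler."""
    tenants = ["tenant-a", "tenant-b"]
    leaked = await asyncio.gather(*(handle_with_global(t) for t in tenants))
    isolated = await asyncio.gather(*(handle_with_contextvar(t) for t in tenants))
    return list(leaked), list(isolated)
```

Running `asyncio.run(run_both())` shows the first request in the global version coming back with the second request's tenant, while the context-variable version keeps the two apart. Neither version raises an error, which is what makes this class of leak hard to spot outside of a concurrent runtime test.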

The Snowflake breach pattern (2024): Attackers exploited the fact that MFA was not mandatory. They used compromised credentials to breach approximately 165 customer accounts. Roughly 80% of those accounts had prior credential exposure in infostealer logs. A single authentication misconfiguration, MFA not being enforced, was multiplied across 165 customers of one multi-tenant platform (Verizon DBIR 2025, 22,000+ incidents analyzed).

The permission matrix — why 20,000 combinations cannot be audited by any human process

Sample permission matrix: five endpoint rows across six user types. The "?" cells are where breaches happen:

Endpoint	Regular user	Enterprise SSO	Collaborator	Support staff	AI agent	Super admin
GET /data (own tenant)	✓	✓	Scoped only	✓ Read	?	✓
GET /data (other tenant)	✗	✗	✗	Audit only	✗	Admin only
POST /export	If plan allows	If plan allows	✗	✗	?	✓
DELETE /record	✗	✗	✗	✗	✗	✓
GET /admin/users	✗	✗	✗	Own tenant	✗	✓

This is only five rows. A typical AI app has 100 endpoints, and each adds rows of its own. The "?" cells, especially AI agent permissions, are frequently undefined, defaulting to over-privileged service account access that nobody tested.

Why every static tool misses what matters most at runtime

SAST scans the code

Finds hardcoded secrets, syntax issues, known patterns. Cannot determine if User A should access Object B — that requires runtime business logic. OWASP: "A tool cannot easily determine if User A should have access to Record B."

DAST probes from outside

Finds publicly accessible flaws. Cannot test cross-tenant access — it has no session context representing a real tenant boundary crossing, so the most dangerous misconfigurations are invisible to it.

Pentest — point-in-time, $10K–$50K, 89-day gap

A skilled tester can find access-control logic flaws. But only in the app as it existed during the test window. IBM 2025: organizations take 241 days on average to identify and contain a breach — 181 to detect, 60 to contain.

Bug bounty — reactive, after real user data is live

90% of payouts now go to access-control flaws (2024–2025 bounty analysis). 38% of organizations discovered API breaches only after external reporting, not internal detection (SQ Magazine 2026).

Autonomous runtime testing — the only approach that scales

Tests every permission combination at runtime, across every user type, tenant boundary, and deployment. Finds the "?" cells before researchers do — not after your data is in their report.
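The first step of that kind of testing, finding the cells nobody defined, can be sketched in a few lines. The example below transcribes three rows of the sample permission matrix into a dictionary and lists every (endpoint, user type) pair with no documented decision. The labels (`scoped`, `read_only`, `if_plan_allows`) and function names are illustrative assumptions, and this is not a description of any product's internals.

```python
from itertools import product

USER_TYPES = ["regular", "sso", "collaborator", "support", "ai_agent", "super_admin"]
ENDPOINTS = ["GET /data", "POST /export", "DELETE /record"]

# Documented decisions only. The "?" cells are deliberately left out.
POLICY = {("DELETE /record", user_type): "deny" for user_type in USER_TYPES}
POLICY[("DELETE /record", "super_admin")] = "allow"
POLICY.update({
    ("GET /data", "regular"): "allow",
    ("GET /data", "sso"): "allow",
    ("GET /data", "collaborator"): "scoped",
    ("GET /data", "support"): "read_only",
    ("GET /data", "super_admin"): "allow",
    ("POST /export", "regular"): "if_plan_allows",
    ("POST /export", "sso"): "if_plan_allows",
    ("POST /export", "collaborator"): "deny",
    ("POST /export", "support"): "deny",
    ("POST /export", "super_admin"): "allow",
})


def undefined_cells(endpoints, user_types, policy):
    """Every (endpoint, user_type) pair with no documented decision."""
    return [cell for cell in product(endpoints, user_types) if cell not in policy]


def decide(policy, endpoint: str, user_type: str) -> str:
    """Deny by default, so an undefined cell never becomes an implicit allow."""
    return policy.get((endpoint, user_type), "deny")
```

Here `undefined_cells(ENDPOINTS, USER_TYPES, POLICY)` returns the two AI agent cells marked "?" in the matrix. Each one is then a candidate for a live request to confirm that the running application denies it, rather than trusting the default.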

Perfai Security — purpose-built for runtime complexity

Perfai Security was built around one insight: the access-control problem in modern apps is a runtime problem, not a code problem. Three agents form a continuous security loop that runs with every deployment.

Vision agent

Maps every workflow, endpoint, role, user type, and tenant config. Builds a live permission model — including the "?" cells nobody documented — that updates with every release.

Security agent

Simulates every user type — regular users, SSO sessions, collaborators, support staff, AI agents, cross-tenant scenarios — against the live application on every deployment.

Fix agent

Translates runtime findings into context-aware remediation. A specific fix scoped to your codebase — not a CVSS score and a recommendation that sits in a backlog.

1. OWASP Top 10:2025 (January 2026): 100% of apps have broken access control; 175K+ CVEs; 248 CWEs; SSRF absorbed into A01. owasp.org/Top10/2025

2. IBM Cost of a Data Breach Report 2025 (July 2025): $4.44M global average; 241-day avg breach lifecycle; 97% of AI-breached orgs lacked access controls; 63% lack AI governance policy. ibm.com

3. Verizon DBIR 2025: 22,000+ incidents, 12,000+ confirmed breaches; credentials #1 initial access vector (22%); third-party breaches doubled to 30%; ransomware in 44% of breaches

4. Cloud Security Alliance 2025 State of SaaS Security: 43% cite config complexity as top challenge; 50% distributed management; 65% struggle to fix SaaS misconfigs; only 23% have full SaaS visibility

5. Obsidian Security 2024: 300% YoY increase in SaaS breaches; attackers moved from access to exfiltration in 9 minutes; Salesloft-Drift OAuth breach had 10× blast radius

6. AppSecure 2025: Access control = 63% of SOC 2/ISO 27001 audit observations; UI controls routinely not enforced at API layer

7. CVE-2024-45409 (Ruby-SAML, CVSS 9.8, 2024): auth bypass via XML parser differential; impacted GitLab and hundreds of Ruby SAML deployments

8. CVE-2024-4985 / CVE-2024-9487 (GitHub Enterprise SAML, CVSS 10.0, 2024): SSO bypass allowing admin provisioning via misconfigured encrypted assertions

9. CVE-2025-59718/19 (Fortinet FortiOS, active exploitation Jan 2026): SSO bypass via SAML forgery on fully patched devices. Arctic Wolf 2026

10. PortSwigger / Black Hat Europe Dec 2025: 5 critical SAML CVEs in 4 months; SAML "inherently fragile" due to multiple XML parsers. workos.com/blog/saml-vulnerabilities-2026

11. SQ Magazine 2026: 99% of organizations had at least one API security issue; 38% discovered breaches only after external reporting; only 21% can effectively detect API attacks

12. CVE-2024-10976 (PostgreSQL RLS bypass) and CVE-2025-8713 (RLS data leak via query plan side-channels)

13. AppSecure 2025: Healthcare SaaS tenant isolation failure: URL parameter exposed any customer's patient records; flaw in production 11 months before discovery

14. Verizon DBIR 2025: Snowflake breach: 165 customer accounts compromised via missing MFA enforcement; 80% had prior credential exposure in infostealer logs