The rise of artificial intelligence (AI) has brought transformative innovation across sectors, but it has also unleashed a complex array of cybersecurity threats. Of particular concern is the inference phase of AI workloads—where models are actually used in production to generate predictions—which has become a fertile ground for new security vulnerabilities. In May 2024, Databricks and the security firm Noma jointly announced a partnership to address this growing issue. Targeting what many chief information security officers (CISOs) call an “AI inference nightmare,” the collaboration proposes a unified real-time security model embedded directly into AI data workflows. As inference becomes more decentralized and reliant on large language models (LLMs), the need for end-to-end runtime observability and security enforcement around AI processing interfaces has never been more urgent.
The Inference Security Gap
Inference refers to the phase where an AI model is deployed to make real-time predictions or decisions based on live input data. While training security has received significant attention, inference introduces new and distinct threats. These include prompt injection attacks, adversarial inputs that manipulate outcomes without detection, model extraction through repeated queries, and sensitive data leakage from outputs—a phenomenon now well-documented in multiple research papers from OpenAI and DeepMind.
In practical terms, the AI inference process often includes dynamic interaction between LLM APIs, data lakes, and downstream applications that act on model outputs in automated ways. This composability—while powerful—is also perilous. For example, an innocuous employee query sent to a chatbot linked to an enterprise data lake might trigger unauthorized data access, or worse, reveal sensitive responses manipulated via a malicious data payload. According to Noma co-founder Apurva Kumar, many companies are essentially “shipping untrusted code over the network” during AI interactions, often without runtime monitoring (VentureBeat, 2024).
Databricks and Noma’s Collaborative Framework
Databricks, a prominent AI and data analytics company, has already established itself as a go-to platform for artificial intelligence pipelines. With solutions like MosaicML for optimized model commercialization and Delta Lake for unified data storage, its ecosystem plays a key role in many enterprise deployments. However, until now, live inference security wasn’t deeply native to these pipelines. Working with Noma—a startup specializing in runtime AI security—Databricks aims to bring control over the often-blackbox interaction between models, data sources, and user queries.
This collaboration builds upon immediate observability and response mechanisms, integrating them at the data inference layer. Noma’s core innovation lies in hooking into the ever-evolving prompt-response loop, mapping metadata from user requests, and flagging anomalous behavior using causal inference, token evaluation, and context inspection. These techniques are capable of identifying patterns such as data leakage hints or prompt injections—including subtle forms similar to those described in recent work by Anthropic and DeepMind (DeepMind Blog, 2024).
Together, the stack aims to detect exploitable events proactively and adjust behavior accordingly, much like a firewall for inference interactions. Importantly, this approach aligns with shifts in AI development strategies seen in OpenAI’s GPT-4 and other transformer-based models, where hallucinations and output determinism are unpredictable and context-dependent.
Why Security at Inference Matters More Than Ever
Global organizations are moving increasingly from AI prototyping to full-fledged production deployments. According to a recent report by McKinsey, 55% of large enterprises now use generative AI in at least one business function, up from 33% in 2023 (McKinsey Global Institute). More significantly, inference workloads are becoming continuous and mission-critical. Applications include legal contract parsing, financial forecasting, technical customer support, and even autonomous decision-making in insurance claims or logistics.
When inference flaws occur, consequences are immediate. Both users and regulators are beginning to scrutinize LLM decisions the way they might review audit trails in traditional software. For example, any hallucinated legal advice or misjudged insurance claims—triggered by model misbehavior—could result in lawsuits, regulatory fines, and reputational damage. The U.S. Federal Trade Commission has placed increasing scrutiny on AI deployment compliance as early as Q1 of 2024, reinforcing the need for consistent oversight and transparent risk management (FTC News, 2024).
Implications for CISOs and the Future of AI Governance
For CISOs already burdened with maintaining zero-trust frameworks, endpoint protection, and application firewalls, AI inference presents a new layer of complexity. Traditional tools—designed for static systems—are often ill-equipped to monitor or interpret the generative and context-rich behaviors of language models. As a result, CISOs risk flying blind. In fact, a survey conducted by AI Trends and Gallup in March 2024 found that 63% of security executives do not have visibility into how LLMs are generating or processing data within their companies (AI Trends; Gallup Workplace Insights).
Databricks and Noma’s initiative directly addresses this gap by making inference logs traceable, inspectable, and policy-enforceable. This allows organizations to embed custom rules—ranging from specific entity redactions to broader behavior throttling—directly within AI workflows. This mirrors trends in API security, where behavioral firewalls reshape payload evaluation dynamically. And from a regulatory standpoint, this shift aligns with proposed AI governance frameworks by the EU’s AI Act and the U.S. National Institute of Standards and Technology (NIST).
Comparative Landscape of AI Inference Security
Databricks and Noma are not alone in identifying inference as a threat vector. Several companies, such as Robust Intelligence, Lakera, and HiddenLayer, are also building protective solutions for AI systems. However, most competitors focus on testing pre-deployment models or adding wrappers at isolated API endpoints. Databricks’ deep integration into data pipelines—and Noma’s feedback loop architecture—offers a differentiated advantage: native security embedded inside the AI production layer.
Company | AI Security Focus | Differentiator |
---|---|---|
Databricks + Noma | Real-time inference observability and policy enforcement | Embedded in native AI workflows and pipelines |
Robust Intelligence | Model validation pre-deployment | Mitigation before production environment |
Lakera | Safety layer for LLMs and prompt monitoring | Customizable prompt filtering APIs |
HiddenLayer | Adversarial model protection | Intelligent threat detection for AI inference layers |
This comparative landscape highlights the rapid evolution of the AI security sector itself. Even NVIDIA, a hardware and AI model powerhouse, released updates in 2024 aimed at improving inference-side threat mitigation within its DGX AI systems (NVIDIA Blog).
Economic and Strategic Implications
As AI systems transition from developer tools to business-critical infrastructure, spending priorities are shifting too. MarketWatch and CNBC report that enterprise-level security budgets for AI are expanding at a compound annual growth rate (CAGR) of 28%, forecast to reach $10.3 billion by 2026 (MarketWatch, CNBC Markets). In this climate, vendors who bring integrated, scalable, and measurable security experiences—like Databricks and Noma—enjoy strategic technical moats and go-to-market advantages.
Moreover, the rise of AI-native cybersecurity also presents acquisition and fundraising opportunities. Tech analysts from Deloitte suggest that inference security offerings may soon become essential features bundled into broader MLOps subscriptions (Deloitte Insights), moving away from fragmented tooling toward platform-native experience—much like cloud-native security evolved over the past decade.
Conclusion and Outlook
Databricks and Noma’s joint response to the AI inference security puzzle marks a critical step toward making generative AI safer, governable, and enterprise-friendly. By embedding security directly into the transformation stack—before AI decisions are made—the solution fundamentally shifts inference from an invisible blackbox into a controllable, auditable framework.
As models become more multi-modal, autonomous agents expand in use, and real-time AI grows in both consumer and enterprise sectors, the need for inference protection will shift from “best practice” to “mandatory gatekeeper.” Innovations like this not only benefit data and cybersecurity teams but also play a pivotal role in increasing public trust in AI systems—a core focus not only for practitioners but also for regulators, researchers, and civil society at large.