This whitepaper explains how PumaMesh helps teams understand sensitive AI data, track where it moves, keep policy attached, and produce evidence after it reaches models, tools, and people.
It is a deeper technical resource, but the buyer promise stays simple: one platform keeps movement, lineage, and audit connected across AI workflows.
Traditional posture tools help find sensitive data at rest. AI creates a harder problem: data moves into indexes, fine-tunes, prompts, tools, and outputs faster than periodic scans can explain.
Scan Cadence
Scheduled scans miss fast AI handoffs
AI pipelines create embeddings, fine-tunes, responses, and intermediate artifacts faster than periodic scanning can explain.
Platform Boundaries
Platform guardrails stop at platform boundaries
Cloud and AI platform controls help locally, but they do not show the full upstream and downstream data path.
Lineage Gap
Lineage disappears between systems
Warehouses end at export and AI platforms start at prompt or training input. The gap is where teams lose the story.
Regulatory Pressure
AI reviews now ask for evidence
Regulators, auditors, model-risk teams, and insurers increasingly ask which data was used, where it moved, and what controls applied.
A credible DSPM for AI story covers source posture, movement, training, retrieval, tool-calls, and evidence. Miss one and the chain breaks.
1. Source Data Posture
Classification and sensitivity of records before they enter AI pipelines
Traditional DSPM territory — extended so classifications are machine-readable downstream, not just visible in a dashboard.
2. Transfer Lineage
What moved, from where, to where, under which policy
The gap between warehouse and AI platform. Transfers must carry classification as first-class metadata, not opaque payloads.
3. Training and Fine-Tune Provenance
Which sensitive records entered which model artifact
Fine-tune sets, embedding indexes, and LoRA adapters inherit the sensitivity of their source data. Provenance has to follow.
4. Retrieval and Prompt Lineage
Which records were retrieved, embedded in context, or returned in a response
RAG systems pull thousands of rows per prompt. Lineage has to tie each retrieval back to source-row sensitivity — not just a vector ID.
5. Agent Tool-Call Policy
Which tools agents are allowed to invoke with which data
Agent frameworks hand out tool access broadly. DSPM for AI has to enforce ABAC on tool-calls the same way it enforces it on files.
6. Evidence and Audit
Exportable artifacts aligned to regulatory frameworks
EU AI Act Article 12, NIST AI RMF Measure/Manage, ISO/IEC 42001, and internal model-risk review all require artifacts the platforms don't produce on their own.
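The retrieval-lineage pillar above can be sketched in a few lines. This is a minimal illustration, not product code: the record shapes, field names, and the three-level classification scale are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SourceRecord:
    # Hypothetical source row carrying a machine-readable classification (pillar 1)
    record_id: str
    classification: str  # "public" | "internal" | "restricted" (illustrative scale)

@dataclass(frozen=True)
class RetrievalEvent:
    # One retrieval tied back to source-row sensitivity, not just a vector ID (pillar 4)
    prompt_id: str
    vector_id: str
    source: SourceRecord

def prompt_sensitivity(events):
    """Roll retrieved-row classifications up to one sensitivity label per prompt."""
    order = {"public": 0, "internal": 1, "restricted": 2}
    return max((e.source.classification for e in events),
               key=order.get, default="public")
```

Because each `RetrievalEvent` keeps the source record rather than only its vector ID, a prompt that pulled thousands of rows can still be labeled with the highest classification it touched.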
The architecture keeps data where it already lives, enforces policy at movement and AI boundaries, and produces one lineage view across the platforms involved.
Data Plane
Keep data on the platforms where it belongs
Warehouses, feature stores, object storage, AI platforms, and agent runtimes remain in place. The architecture federates evidence instead of centralizing data.
Control Plane
Enforce policy at the boundaries that matter
Movement, retrieval, and tool-call boundaries use data attributes so governance can follow the record across workflows.
Federated Analytics Plane
Produce one lineage and evidence view
Posture, transfer, training, retrieval, and tool-call events feed a neutral view of what data was used and which policy applied.
Non-Goals
Federation, not consolidation
No forced data centralization, no replacement for platform-native guardrails, and no requirement to standardize on one AI platform.
The control plane writes policy against data attributes such as classification, marking, owner, sensitivity, and jurisdiction. That keeps rules portable across file transfer, retrieval, tool-call, and training boundaries.
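What "policy against data attributes" means can be shown with a small sketch. The rule below reads only record attributes plus a boundary-supplied context, so the same function could in principle gate a transfer, a retrieval, or a training-set add; every name here is illustrative, not a product API.

```python
# Minimal attribute-based (ABAC-style) rule, assuming each record carries
# machine-readable attributes and each boundary supplies a context dict.
# Field names ("classification", "approved_jurisdictions", ...) are assumptions.

def policy_allows(record: dict, context: dict) -> bool:
    """Restricted records may only move into approved jurisdictions."""
    if record.get("classification") != "restricted":
        return True
    return context.get("destination_jurisdiction") in record.get("approved_jurisdictions", ())
```

Because the rule never mentions a file path, a vector index, or a training job, it stays portable across the four boundaries listed below.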
Transfer Boundary
Restricted records stay inside approved jurisdictions
The movement path checks the data context before a transfer begins.
Retrieval Boundary
Sensitive records are filtered before model context
Retrieval workflows can use attributes before context is assembled.
Tool-Call Boundary
AI agents inherit data-aware limits
Tool access can account for the agent role and the target resource context.
Training Boundary
Training sets respect source sensitivity
Fine-tune inputs can be checked against record attributes before they are added to a training artifact.
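The retrieval, tool-call, and training boundaries above all reduce to attribute checks over the same classification scale. The sketch below shows that shape under stated assumptions: the level names, the "privileged" role, and all function signatures are invented for illustration.

```python
# Hedged sketch of attribute checks at three boundaries. The three-level
# scale and the role model are assumptions, not product behavior.

LEVELS = ["public", "internal", "restricted"]

def filter_retrieval(rows, max_level="internal"):
    """Retrieval boundary: drop rows above the allowed level before context assembly."""
    limit = LEVELS.index(max_level)
    return [r for r in rows if LEVELS.index(r["classification"]) <= limit]

def allow_tool_call(agent_role: str, resource: dict) -> bool:
    """Tool-call boundary: only a privileged agent role may touch restricted resources."""
    if resource["classification"] == "restricted":
        return agent_role == "privileged"
    return True

def vet_training_rows(rows, max_level="internal") -> bool:
    """Training boundary: refuse a fine-tune set containing rows above the limit."""
    limit = LEVELS.index(max_level)
    return all(LEVELS.index(r["classification"]) <= limit for r in rows)
```

The point of the sketch is that none of the three checks needs boundary-specific policy language: each one reads the same record attributes the source-posture pillar produced.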
The federated analytics plane can export evidence for auditors, cyber insurers, and model-risk teams from events already created by movement and policy workflows.
EU AI Act Article 12: automatic activity logs over the lifetime of each high-risk AI system — inputs, events, and outputs, all traceable.
NIST AI RMF: artifacts for the Measure (MS-1 to MS-4) and Manage (MG-1 to MG-4) functions, with mapped control evidence per model.
ISO/IEC 42001: AI management system evidence — risk assessment inputs, control logs, and continuous-monitoring output.
Model-risk review: model-card inputs covering training data provenance, sensitive-data exposure, and retrieval-path inventory.
Cyber insurance: AI data surface inventory, sensitive-data flow map, and incident-response artifacts underwriters now want at renewal.
The product meets all 110 CMMC controls for data sharing and is FedRAMP-aligned (80+ NIST SP 800-53 Rev 5 controls). Federal and defense deployments pull AI workloads inside the accreditation boundary with inherited evidence.
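The evidence flow described above — framework artifacts built from events the movement and policy workflows already emit — can be sketched as a simple grouping step. The event-type-to-control mapping below is purely illustrative; it is not an official crosswalk to any framework.

```python
from collections import defaultdict

# Hypothetical mapping from audit event types to the framework controls
# they help evidence. Control identifiers here are illustrative only.
CONTROL_MAP = {
    "transfer":  ["EU-AI-Act-Art12", "NIST-AI-RMF-MG-2"],
    "retrieval": ["EU-AI-Act-Art12"],
    "tool_call": ["NIST-AI-RMF-MS-2", "ISO-42001"],
}

def build_evidence_pack(events):
    """Group already-emitted audit events under the controls they evidence."""
    pack = defaultdict(list)
    for event in events:
        for control in CONTROL_MAP.get(event["type"], []):
            pack[control].append(event)
    return dict(pack)
```

The design point is that no new instrumentation runs at audit time: the pack is a view over events the boundaries produced while enforcing policy.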
DSPM for AI is not a new product category — it is the Understand pillar extended into AI pipelines, with Protect, Move, and Accelerate running alongside.
Training sets, weights, and inference traffic encrypted at rest and in flight. 100% post-quantum. Forwarding nodes never decrypt.
Content inspection, compliance framework matching, customer ontology matching, ABAC access gating — across transfers, RAG, fine-tunes, tool-calls.
Line-rate model delivery (70B in <60s), federated learning across sovereignty zones, Windows + Linux native.
EU AI Act Article 12, NIST AI RMF, ISO 42001, CMMC v1/v2/v3 — evidence packs built from the audit stream continuously.
PumaMesh is the reference architecture in product form. The fabric is the control plane — transfer, classification, and policy all sit in-path. Pulse is the federated analytics plane — posture, lineage, and evidence across every node and every AI platform. Pulse is the Understand pillar, delivered.
Control Plane
Fabric + Shield + Transit
ABAC checked at every transfer. Classification attached to records inline. Quantum-safe crypto posture held across sovereignty zones.
Federated Analytics
Pulse
Eleven views cover posture, discovery, UEBA, legal hold, audit, and AI Insights. Federated queries reach every node — no central collector.
AI Platform Coverage
Bedrock, Foundry, Vertex, Databricks, Snowflake Cortex
Gateway proxies for each platform capture retrieval, prompt, and tool-call events and feed them into the Pulse lineage graph.
Evidence Packs
Audit stream → framework-aligned artifacts
EU AI Act, NIST AI RMF, ISO/IEC 42001, CMMC v1/v2/v3 (all 110 controls for data sharing), and FedRAMP-aligned control packs — all built from the audit stream the fabric emits continuously.