Praesidias

The Praesidias Challenge

The Enterprise Readiness Test for AI Agents

Connect or paste a sanitized agent manifest, tool list, or workflow description. Praesidias maps the execution boundary, evaluates proposed actions, and shows what would be allowed, denied, escalated, or failed closed before side effects occur.

If your agent cannot pass the Praesidias Challenge, it is not enterprise-ready.

Use sanitized metadata only. Do not submit secrets, credentials, regulated data, source code, production URLs, or confidential business information.

Universal agent intake

Paste a sanitized manifest. Praesidias maps the execution boundary.

Intelligent boundary scanner

Detected execution risk and governance lanes.

high confidence

Structured manifest included tools and proposed actions.

  • Detecting tools
  • Classifying resources
  • Mapping actions
  • Finding side-effect boundaries
  • Assigning lanes
  • Preparing live evaluation
Detected agent: Finance Operations Agent

Resolve billing issues and prepare refund actions.

Workflow type: finance

3 tools, 2 resources, 5 actions.

Control gaps: 4

Risk signals detected from the sanitized system model.

Detected tools
  • Billing API - Payment rail - critical
  • Customer Records - Database - high
  • Analytics Workspace - External API - high
Detected resources
  • Customer financial records - Financial data - critical
  • Refund authority - Financial data - critical
Detected control gaps
  • External data movement detected: At least one proposed action crosses a side-effect boundary and should be denied before tool invocation.
  • Approval boundary required: High-risk updates, financial movement, or privileged changes require human approval evidence.
  • Fail-closed condition detected: Unknown classification or unresolved authority must block execution until the dependency is safe.
  • Critical tool authority: Critical tools should be mapped to deterministic action-level policy before execution.

Templates

Start from a known enterprise pattern, then edit it.

Review / edit model

Generated model ready for live evaluation.

Governance preview

Generated governance map.

High-risk tools: 3
Critical resources: 2
Approval lanes: 1
Blocked lanes: 1
Fail-closed boundaries: 1
Actions to evaluate: 5
Agent: Finance Operations Agent
  • Tool: Billing API - critical
  • Tool: Customer Records - high
  • Tool: Analytics Workspace - high
  • safe internal - Read billing context - low risk: Internal action can execute after authority is resolved.
  • safe internal - Analyze aggregate billing variance - moderate risk: Internal action can execute after authority is resolved.
  • approval required - Issue refund above threshold - high risk: Human approval required before side effects.
  • blocked external - Export raw customer data - critical risk: External side-effect boundary should be blocked.
  • fail closed - Move unknown classified billing file - critical risk: Unknown or unsafe state should fail closed.
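The lane assignments above can be read as a small, deterministic rule set. The following is a minimal sketch under stated assumptions, not the actual Praesidias policy engine: the `Action` fields, the `assign_lane` function, and the order of the rules are all hypothetical, chosen only to reproduce the five lanes listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    classification_known: bool       # hypothetical: is the data classification resolved?
    crosses_external_boundary: bool  # hypothetical: would data leave the boundary?
    needs_human_approval: bool       # hypothetical: e.g. a refund above a threshold


def assign_lane(action: Action) -> str:
    # Unknown state is checked first so it can never fall through to "safe internal".
    if not action.classification_known:
        return "fail closed"
    if action.crosses_external_boundary:
        return "blocked external"
    if action.needs_human_approval:
        return "approval required"
    return "safe internal"


# The five actions from the governance preview, with assumed attribute values.
ACTIONS = [
    Action("Read billing context", True, False, False),
    Action("Analyze aggregate billing variance", True, False, False),
    Action("Issue refund above threshold", True, False, True),
    Action("Export raw customer data", True, True, False),
    Action("Move unknown classified billing file", False, False, False),
]

lanes = {a.name: assign_lane(a) for a in ACTIONS}
for name, lane in lanes.items():
    print(f"{lane}: {name}")
```

Checking the unknown-classification case before any other rule is the design choice that makes the sketch fail closed: an action with unresolved state is never evaluated against the permissive rules.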

Run live evaluation

Send proposed actions through Praesidias.

3 tools, 5 actions, 3 high/critical tools, 1 approval-required action, and 1 fail-closed boundary detected.

This evaluator is for sanitized workflow modeling only. No production connection or real side effect occurs.

Live execution trace

Proposed action → policy → authority → approval → decision.
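One way to picture the stage order above is as a function that records each stage and stops at the first one that blocks. This is an illustrative sketch only: the `Trace` type, the `evaluate` signature, and the lowercase decision strings are assumptions, and nothing here invokes a real tool.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    action: str
    stages: list = field(default_factory=list)  # (stage name, outcome) pairs
    decision: str = "pending"
    side_effect_occurred: bool = False  # stays False: no tool is ever called here


def evaluate(action: str, lane: str, authority_resolved: bool,
             approval_granted: bool) -> Trace:
    trace = Trace(action)
    trace.stages.append(("proposal", action))

    # Policy stage: blocked and fail-closed lanes stop before authority is consulted.
    trace.stages.append(("policy", lane))
    if lane == "blocked external":
        trace.decision = "denied"
        return trace
    if lane == "fail closed":
        trace.decision = "failed closed"
        return trace

    # Authority stage: unresolved authority also fails closed.
    trace.stages.append(("authority", "resolved" if authority_resolved else "unresolved"))
    if not authority_resolved:
        trace.decision = "failed closed"
        return trace

    # Approval stage: only the approval-required lane needs human evidence.
    if lane == "approval required":
        trace.stages.append(("approval", "granted" if approval_granted else "missing"))
        if not approval_granted:
            trace.decision = "escalated"
            return trace
    else:
        trace.stages.append(("approval", "not required"))

    trace.decision = "allowed"
    return trace


read = evaluate("Read billing context", "safe internal", True, False)
refund = evaluate("Issue refund above threshold", "approval required", True, False)
export = evaluate("Export raw customer data", "blocked external", True, False)
print(read.decision, refund.decision, export.decision)
```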

Live primary

Run the evaluation to generate a governed execution trace.

Action 1: Read billing context - PENDING
read - Customer Records - low risk
Agent Proposal

Awaiting live evaluation.

Policy Evaluation

Awaiting live evaluation.

Authority Resolution

Awaiting live evaluation.

Approval Check

Awaiting live evaluation.

Execution Decision

Awaiting live evaluation.

Proof / Audit Record

Awaiting live evaluation.

Side effect occurred: false

Execution is blocked or safely simulated before tool invocation.

Proof and replay console

Awaiting proof-bearing action

Run the evaluation to show proof IDs, proof hashes, and replay status.
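The page does not say how proof IDs or proof hashes are computed. As a rough illustration of the general idea, the sketch below hashes a canonical JSON form of a decision record with SHA-256 and treats a replay as passing when the re-evaluated record hashes to the same value. The record fields, the `proof-` ID prefix, and the function names are all hypothetical.

```python
import hashlib
import json


def proof_hash(record: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equal records hash equally.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_proof(action: str, lane: str, decision: str) -> dict:
    record = {
        "action": action,
        "lane": lane,
        "decision": decision,
        "side_effect_occurred": False,
    }
    digest = proof_hash(record)
    # Hypothetical ID scheme: a short prefix of the hash.
    return {"proof_id": "proof-" + digest[:12], "proof_hash": digest, "record": record}


def replay_passes(proof: dict, replayed_decision: str) -> bool:
    # Rebuild the record with the replayed decision and compare hashes.
    replayed = dict(proof["record"], decision=replayed_decision)
    return proof_hash(replayed) == proof["proof_hash"]


proof = make_proof("Issue refund above threshold", "approval required", "escalated")
print(proof["proof_id"], replay_passes(proof, "escalated"))
```

A replay that reaches a different decision for the same action produces a different hash, so the check fails without any field-by-field comparison.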

Status: Pending run

Policy mutation / what-if engine

Change policy. Replay the same action.

No agent code changes. The policy lane changes, then the action is evaluated again.
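A what-if replay of this kind can be sketched by keeping the policy as data, separate from the action being evaluated. The example is hypothetical: the `refund_approval_threshold` key, the amounts, and `evaluate_refund` are assumptions made for illustration.

```python
def evaluate_refund(amount: float, policy: dict) -> str:
    # The policy is plain data; the evaluation logic and the action never change.
    if amount > policy["refund_approval_threshold"]:
        return "escalated"
    return "allowed"


action = {"name": "Issue refund", "amount": 750}  # hypothetical proposed action

baseline_policy = {"refund_approval_threshold": 500}
mutated_policy = dict(baseline_policy, refund_approval_threshold=1000)

before = evaluate_refund(action["amount"], baseline_policy)
after = evaluate_refund(action["amount"], mutated_policy)
print(f"before: {before}, after: {after}")
```

The same action dictionary is passed to both runs; only the policy value differs, which is what makes the two outcomes directly comparable.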

Enterprise readiness scorecard

Run the evaluation to assemble the scorecard.

Safe actions allowed: -
Unauthorized actions blocked: -
Approval-required actions escalated: -
Fail-closed cases handled: -
Proof-backed outcomes: -
Replay checks passed: -
Side effects prevented: -
Governance coverage: -
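How the scorecard rows are computed is not stated on the page. One plausible reading, sketched below, tallies how many actions in each lane received the decision that lane calls for. The `EXPECTED` mapping, the result tuples, and the definition of governance coverage as the fraction of actions with the expected decision are all assumptions.

```python
from collections import Counter

# Hypothetical mapping from lane to the decision that lane should produce.
EXPECTED = {
    "safe internal": "allowed",
    "approval required": "escalated",
    "blocked external": "denied",
    "fail closed": "failed closed",
}


def scorecard(results: list) -> dict:
    # results: (lane, decision, side_effect_occurred) tuples from an evaluation run.
    correct, total = Counter(), Counter()
    for lane, decision, _ in results:
        total[lane] += 1
        if EXPECTED.get(lane) == decision:
            correct[lane] += 1

    def ratio(lane: str) -> str:
        return f"{correct[lane]}/{total[lane]}"

    return {
        "Safe actions allowed": ratio("safe internal"),
        "Unauthorized actions blocked": ratio("blocked external"),
        "Approval-required actions escalated": ratio("approval required"),
        "Fail-closed cases handled": ratio("fail closed"),
        "Side effects prevented": all(not occurred for _, _, occurred in results),
        # Assumed definition: share of actions that got the expected decision.
        "Governance coverage": sum(correct.values()) / len(results) if results else 0.0,
    }


sample = [
    ("safe internal", "allowed", False),
    ("safe internal", "allowed", False),
    ("approval required", "escalated", False),
    ("blocked external", "denied", False),
    ("fail closed", "failed closed", False),
]
card = scorecard(sample)
for row, value in card.items():
    print(f"{row}: {value}")
```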

Governance certificate

Praesidias Challenge Result

Run a Challenge evaluation to generate a non-confidential summary.

Private evaluation

Request a private evaluation.

Use a sanitized workflow description to evaluate where execution governance should sit before real tools execute.

Do not submit credentials, regulated data, source code, production URLs, or confidential business information.