Which agent components need technical controls beyond a normal LLM application?

The Singapore IMDA Model AI Governance Framework for Agentic AI (v1.5) recommends adding controls for the components that distinguish an agent from a simple LLM application: planning and reasoning, memory, tools, and protocols, together with the larger attack surface and any multi-agent interactions. Baseline software and LLM controls still apply; the agentic additions sit on top. Examples in the framework include logging the agent's plan and reasoning for review, requiring strict input schemas for tools, withholding write access to sensitive databases unless strictly required, whitelisting and sandboxing protocol servers, and requiring agents to communicate through typed schemas rather than free text.

Why should controls be enforced structurally rather than through the system prompt?

The framework recommends deterministic, system-level safeguards over prompt-layer instructions, especially for higher-risk actions, because prompt-layer safeguards are inconsistently defined across users and can be bypassed, whereas system-level controls can be defined and enforced consistently. The worked illustration in the framework is access control: rather than instructing an agent not to use a tool, implement an access control at the tool layer so the tool cannot be called at all, or can only be called in a constrained way such as read-only. Where a risk is hard to codify into fixed rules, such as detecting harmful content, model-based safeguards can be more effective.

How do you test an agent workflow rather than just its final output?

Because an agent can take multiple steps in sequence without human involvement, the framework recommends testing the entire workflow, including reasoning and tool calling, not only the final output. Suggested test dimensions are overall task execution (can the agent complete the task accurately), policy compliance (does it follow defined standard operating procedures and route for human approval when required), tool calling (does it call the right tools, with the right permissions, inputs, and order), and robustness against errors and edge cases. Tests should run repeatedly across varied datasets in an environment that mirrors production, with evaluation methods matched to each part of the workflow.

What changes when more than one agent is involved (multi-agent testing)?

Multi-agent systems can exhibit emergent behaviour that cannot be predicted from testing each agent in isolation, so the framework recommends testing at the multi-agent system level to surface risks such as miscoordination, conflict, collusion, and the impact on other agents when one agent is compromised. In Modulos this multi-agent lens is governed by a conditional requirement that applies only when the system runs more than one agent; single-agent deployments mark it not applicable. Topology (sequential, supervisor, swarm) drives which additional controls apply, for example typed inter-agent contracts, scoped shared memory, per-sub-agent identity, and system-level emergency revocation.

How are agentic changes categorised and when does a change re-trigger risk assessment?

The framework recommends defining clear triggers for a change review process — technical (model updates, tool modifications), environmental (domain shifts, business-context changes), performance (anomalous or degraded behaviour), and regulatory (compliance changes) — and categorising the resulting changes by risk. Minor changes such as prompt refinements may follow a lighter review; material changes such as model updates or autonomy adjustments may require a full governance review; and critical changes affecting high-stakes decisions may mandate an immediate re-assessment of risk, which loops back to the assess-and-bound dimension.

How is the Model Context Protocol (MCP) treated as a governance gateway?

The framework notes that although the Model Context Protocol (MCP) is usually considered a connectivity protocol, it can act as a governance layer because it sits between the agent and the enterprise systems it accesses. Organisations can define controls at the protocol layer, such as filtering sensitive data passing through servers, logging all agent-to-system interactions, and whitelisting only trusted servers. In Modulos this is framed as one instantiation of a connector and gateway pattern, not a mandated technology: the obligation is to mediate high-impact protocol interactions through a governance layer, and MCP is one valid way to do so.

Implement Technical Controls and Processes

What does gradual deployment and continuous monitoring look like for agents?

The framework recommends rolling agents out gradually to limit risk exposure, staging the rollout by users (trained or experienced users first), tools and protocols (more secure or whitelisted servers first), and systems (lower-risk internal systems first). Once deployed, organisations should monitor and log agent behaviour at multiple layers (user-agent interaction, agent-tool invocation, and model reasoning), define alert thresholds with a defined intervention per alert type, keep logs immutable, integrate with observability platforms and tracing standards, and establish feedback loops that route monitoring insights back into training and evaluation datasets and into the upstream risk assessment.

The third dimension of Singapore's Model AI Governance Framework for Agentic AI is where the build-and-operate work happens. The agentic components that differentiate an agent from a simple LLM application — planning and reasoning, memory, tools, and protocols — introduce a larger attack surface and new failure modes, and the framework recommends additional, tailored technical controls at each stage of the implementation lifecycle: during design and development, before deployment (testing), and during deployment (gradual rollout, continuous monitoring, and change management).

This dimension is voluntary best-practice guidance, framed throughout in advisory terms ("organisations should consider", "consider implementing"). In Modulos it maps to four application-scope requirements on the MFF-17 template: MRF-314 (design-and-development controls), MRF-315 (pre-deployment testing), MRF-316 (gradual deployment, monitoring, and change management), and MRF-317 (multi-agent and cross-system governance — conditional). The conceptual foundations these requirements build on — the eight core components, the multi-agent patterns, and the agentic risk taxonomy — are covered on Agentic AI components and risks.

Primary source

This page is a structured guide to Dimension 3 (§2.3, Implement technical controls and processes) of the IMDA Model AI Governance Framework for Agentic AI, v1.5, published 20 May 2026 (updated 5 June 2026) by the Infocomm Media Development Authority (IMDA), Singapore. The authoritative wording, sample controls, and case studies are in the framework text at §2.3.1–§2.3.3. This page paraphrases that guidance and maps it to Modulos; it does not reproduce the framework in full.

Controls for the new agentic components (during design and development)

Per §2.3.1, baseline software and LLM controls remain relevant, but agents need controls added for three areas: the new agentic components (planning, reasoning, tools); the increased security concerns from a larger attack surface and new protocols; and multi-agent interactions. The framework offers a non-exhaustive set of sample controls organised by component, and points organisations to third-party catalogues (CSA's draft addendum on securing agentic AI and GovTech's agentic risk and capability framework) for a fuller list.

Component	Sample controls the framework cites
Planning	Prompt the agent to reflect on whether its plan adheres to user instructions; ask it to summarise its understanding and request clarification before proceeding; log the agent's plan and reasoning for the user to evaluate and verify.
Tools	Require strict input formats; apply least privilege to limit the tools available to each agent, enforced through robust authentication and authorisation; do not grant write access to sensitive database tables unless strictly required; let the user take over when keying in sensitive data such as passwords or API keys.
Protocols	Use standardised protocols where applicable (for example, agentic-commerce protocols when handling a financial transaction); for protocol servers such as MCP servers, whitelist trusted servers and only allow the agent to interact with those, and sandbox any code execution.
Multi-agent interactions	Require agents to communicate through structured schemas such as typed function calls rather than free text, to reduce unintended instructions passing between agents; limit shared-memory access between agents.

In Modulos these design-and-development controls are governed by MRF-314 — Implement technical controls during design and development, which carries MCF-330 (input and output filtering), MCF-331 (privilege control and least-privilege access), MCF-380 (complete mediation), MCF-385 (security controls enforced independently from the LLM), MCF-400 (rate limiting), MCF-403 (sandbox techniques), MCF-407 (limit queued actions and scale robustly), MCF-517 (intent binding and drift detection), MCF-518 (tool invocation policy gate), MCF-520 (attested registries and signed descriptors), MCF-522 (secure inter-agent communication and discovery), and MCF-524 (memory governance and rollback).
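The tool-row controls in the table above (strict input formats, least privilege per agent) can be sketched as a small check that runs before any tool call is dispatched. This is a minimal illustration under stated assumptions, not the framework's or Modulos's implementation: `ToolSpec`, `TOOLS`, `ALLOWED_TOOLS`, `validate_call`, and the example tools and agents are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical description of one tool and its strict input format."""
    name: str
    required: dict  # argument name -> expected Python type


# Hypothetical tool catalogue.
TOOLS = {
    "lookup_order": ToolSpec("lookup_order", {"order_id": str}),
    "refund_order": ToolSpec("refund_order", {"order_id": str, "amount_cents": int}),
}

# Hypothetical per-agent tool list (least privilege).
ALLOWED_TOOLS = {
    "support_agent": {"lookup_order"},
    "refund_agent": {"lookup_order", "refund_order"},
}


def validate_call(agent: str, tool: str, args: dict) -> None:
    """Raise if the agent may not use the tool or the arguments break the schema."""
    if tool not in ALLOWED_TOOLS.get(agent, set()):
        raise PermissionError(f"{agent} is not authorised to call {tool}")
    spec = TOOLS[tool]
    if set(args) != set(spec.required):
        raise ValueError(f"{tool} expects exactly {sorted(spec.required)}")
    for key, expected in spec.required.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(args[key], bool) or not isinstance(args[key], expected):
            raise TypeError(f"{key} must be {expected.__name__}")
```

Rejecting unknown or missing arguments outright, rather than silently dropping them, is one way to keep the input format strict; a production system would more likely use a schema library and real authentication.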

Prefer structural enforcement over the system prompt

The framework's central design principle is to prefer structural, rule-based safeguards that operate at the system level through predefined logic over prompt-layer instructions, especially for higher-risk actions. Its worked example is access control: rather than instructing an agent not to use a particular tool, implement an access control at the tool layer so the tool cannot be called at all — or can only be called in a constrained way such as read-only. Similarly, if an agent must follow a set procedure, building that sequence into the workflow is more robust than prompting the agent to follow it. Prompt-layer safeguards are also inconsistently defined across users, whereas system-level safeguards can be defined and enforced consistently.

The framework qualifies this: where a risk is hard to express as fixed rules (for example, detecting harmful content, which can manifest in many ways), model-based safeguards can be the most effective option. And because agents act in real time, static design-time safeguards may not catch every risk, so runtime controls complete the picture: rate limits to prevent excessive tool use, input validation to catch harmful responses before they are acted on, and in-execution interception.
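The access-control illustration can be made concrete with a wrapper that enforces the permission in code, so that no prompt wording can widen it. The sketch below is an assumption-laden example rather than anything prescribed by the framework: `Access`, `CustomerStore`, and `GuardedStore` are hypothetical names, and the in-memory dictionary stands in for a real database.

```python
from enum import Enum


class Access(Enum):
    NONE = "none"
    READ_ONLY = "read_only"
    READ_WRITE = "read_write"


class CustomerStore:
    """Hypothetical stand-in for a sensitive data store."""

    def __init__(self):
        self._rows = {"c1": "Ada"}

    def read(self, key):
        return self._rows[key]

    def write(self, key, value):
        self._rows[key] = value


class GuardedStore:
    """Tool-layer wrapper: the permission is checked in code on every call."""

    def __init__(self, store, access):
        self._store = store
        self._access = access

    def read(self, key):
        if self._access is Access.NONE:
            raise PermissionError("tool is not available to this agent")
        return self._store.read(key)

    def write(self, key, value):
        if self._access is not Access.READ_WRITE:
            raise PermissionError("tool is read-only for this agent")
        self._store.write(key, value)


# The agent only ever receives the guarded handle, never the raw store.
agent_tool = GuardedStore(CustomerStore(), Access.READ_ONLY)
```

In Python the leading underscore is only a naming convention, so a real deployment would place this check behind a process or service boundary (for example, database credentials that are themselves read-only) rather than rely on object encapsulation.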

Runtime, planning-layer, tool-layer, and protocol controls

The same MRF-314 controls span the layers the framework names. Runtime controls map to MCF-400 (rate limiting) and MCF-407 (limit queued actions); planning-layer controls are reflected in MCF-517 (intent binding and drift detection); tool-layer controls map to MCF-331, MCF-380, MCF-518, and the input/output filtering of MCF-330; memory controls map to MCF-524 (memory governance and rollback); and protocol controls map to MCF-522 (secure inter-agent communication and discovery) and MCF-520 (attested registries and signed descriptors), with MCF-403 providing sandboxing for any code execution.
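As an illustration of the runtime layer, a rate limit on tool calls can be implemented as a sliding-window counter. This is a generic sketch, not the logic behind MCF-400; the class name `SlidingWindowLimiter` and its parameters are assumptions, and the caller passes the current time explicitly so the behaviour is easy to test.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` tool calls in any `window_seconds` span."""

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._calls = deque()  # timestamps of accepted calls, oldest first

    def allow(self, now: float) -> bool:
        """Return True and record the call if it fits in the window ending at `now`."""
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window_seconds:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            return False
        self._calls.append(now)
        return True
```

In a live system `now` would typically come from `time.monotonic()`, and a refused call would be surfaced to monitoring rather than silently dropped.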

Connector and gateway pattern (MCP)

The framework observes that although MCP is usually treated as a connectivity protocol, it can act as a governance layer because it sits between the agent and the enterprise systems it accesses: organisations can filter sensitive data passing through servers, log all agent-to-system interactions, and whitelist only trusted servers at that layer. Modulos frames this as one instantiation of a connector and gateway pattern, not a mandated technology — the obligation is to mediate high-impact protocol interactions through a governance layer, and MCP is one valid way to do so. The Agent2Agent Protocol (A2A) plays the analogous role for agent-to-agent communication.
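The three gateway functions named above (whitelisting trusted servers, filtering sensitive data, logging interactions) can be sketched in a protocol-neutral way. The example deliberately does not use any MCP SDK: `Gateway`, `TRUSTED_SERVERS`, the server names, and the email-only redaction rule are all hypothetical, and a real gateway would sit in front of actual protocol servers.

```python
import re

# Hypothetical list of trusted servers.
TRUSTED_SERVERS = {"crm.internal.example", "docs.internal.example"}

# Simplified pattern for one kind of sensitive data (email addresses).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class Gateway:
    """Mediates every agent-to-server call: whitelist, redact, log."""

    def __init__(self, backends):
        self._backends = backends  # server name -> callable(request) -> str
        self.audit_log = []        # (agent, server, outcome) tuples

    def call(self, agent: str, server: str, request: str) -> str:
        if server not in TRUSTED_SERVERS:
            self.audit_log.append((agent, server, "blocked"))
            raise PermissionError(f"{server} is not a trusted server")
        response = self._backends[server](request)
        # Filter sensitive data before it reaches the agent.
        redacted = EMAIL.sub("[REDACTED_EMAIL]", response)
        self.audit_log.append((agent, server, "allowed"))
        return redacted
```

Because blocked attempts are logged as well as allowed ones, the same record can feed the threshold-based alerts described later on this page.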

How Dimension 3 differs from bounding by design (MRF-314 vs MRF-312)

It is worth being precise about where MRF-314 ends and where Dimension 1's Assess and bound the risks begins, because both touch the same components and both prefer structural enforcement.

MRF-312 — Bound agent authority by design sets what the agent is allowed to be: its identity, permissions, least-privilege envelope, sandboxing, and emergency revocation — the authority envelope.
MRF-314 — Implement technical controls during design and development sets how the agent's components are made safe within that envelope: the planning, memory, protocol, and runtime safeguards on the components themselves.

The structural-over-prompt principle is shared by both; the division is between defining the boundary and building the safeguards inside it. The two requirements share several controls (for example MCF-331, MCF-380, MCF-385, MCF-403, MCF-518) precisely because they cover adjacent aspects of the same components.

Test agent workflows before deployment

Per §2.3.2, organisations should test agents for safety and security before deployment. Software and LLM testing practices still apply — unit and integration testing, representative datasets, useful metrics and evaluators — but the framework recommends adapting the approach for agents along several axes:

Testing consideration	What the framework recommends
Test for new risks	Beyond incorrect outputs, agents can take unsafe or unintended actions through tools. Test overall task execution (can it complete the task accurately), policy compliance (does it follow defined SOPs and route for human approval when required), tool calling (right tools, right permissions, right inputs, right order), and robustness against errors and edge cases.
Test entire workflows	Because agents take multiple steps without human involvement, test the whole workflow — reasoning and tool calling — not only the final output.
Test agents individually and together	Test at the multi-agent system level to surface emergent risks and behaviours when agents collaborate, such as competitive behaviour or the effect on other agents when one is compromised.
Test in real or realistic environments	Use a properly configured execution environment that mirrors production (tool integrations, external APIs, sandboxes), while calibrating realism against the risk of letting agents touch the real world prematurely.
Test repeatedly across varied datasets	Agent behaviour is stochastic and context-dependent; test at scale across varied datasets, and re-run to check stability, including minor perturbations.
Evaluate results at scale	Use different evaluation methods for different parts of the workflow (deterministic checks for structured tool calls; LLM or human evaluation for unstructured reasoning), while still evaluating the agent holistically, often with LLM-as-judge approaches incorporating human-in-the-loop review.

In Modulos this is governed by MRF-315 — Test agent workflows and multi-agent behaviour before deployment, carrying MCF-205 (advanced model evaluation), MCF-334 (adversarial testing and attack simulations), MCF-518 (tool invocation policy gate), MCF-549 (multi-agent system testing), and MCF-551 (agent workflow testing and staged rollout). The framework notes that evaluating long, unstructured agentic workflows at scale is a known challenge, which is why the evaluation-method matching above matters.
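For the structured parts of a workflow, the deterministic checks mentioned in the table can run over a recorded tool-call trace. The sketch below assumes a hypothetical trace format (a list of dictionaries with a `tool` key) and a hypothetical `request_human_approval` tool; it checks tool order and approval routing, then reports a pass rate over repeated runs.

```python
def check_trace(trace, expected_order, approval_required):
    """Return a list of human-readable violations for one recorded run."""
    violations = []
    called = [step["tool"] for step in trace]
    if called != expected_order:
        violations.append(f"expected tools {expected_order}, got {called}")
    approved = False
    for step in trace:
        if step["tool"] == "request_human_approval":
            approved = True
        elif step["tool"] in approval_required and not approved:
            violations.append(f"{step['tool']} ran without prior human approval")
    return violations


def pass_rate(traces, expected_order, approval_required):
    """Fraction of repeated runs with no violations."""
    ok = sum(1 for t in traces if not check_trace(t, expected_order, approval_required))
    return ok / len(traces)


# Hypothetical expected workflow and two recorded runs.
EXPECTED = ["lookup_order", "request_human_approval", "refund_order"]
good_run = [{"tool": name} for name in EXPECTED]
bad_run = [{"tool": "lookup_order"}, {"tool": "refund_order"}]
```

Reporting a rate over many runs, rather than a single pass or fail, reflects the stochastic behaviour the framework asks testers to account for. Unstructured reasoning steps would still need LLM or human evaluation.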

Testing multi-agent behaviour and compromise propagation

The framework is explicit that some failure modes only appear at the system level. When more than one agent is involved, testing should evaluate emergent behaviour and compromise propagation — what happens to other agents when one agent is compromised — alongside the individual-agent behaviours. In Modulos, multi-agent testing is mandatory whenever the topology is anything other than single-agent: MCF-549 (multi-agent system testing) is shared between MRF-315 and the conditional multi-agent requirement MRF-317, so the testing obligation and the governance obligation reference the same control.
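One way to scope a compromise-propagation test is to compute which agents a compromised agent could reach through inter-agent channels. The sketch below is a static reachability check over a hypothetical channel map (`channels` and the agent names are assumptions); it gives an upper bound on exposure and does not replace dynamic testing of how agents actually behave when they receive tainted input.

```python
from collections import deque


def blast_radius(channels, compromised):
    """Agents reachable from `compromised` by following message channels."""
    reached = {compromised}
    queue = deque([compromised])
    while queue:
        current = queue.popleft()
        for downstream in channels.get(current, ()):
            if downstream not in reached:
                reached.add(downstream)
                queue.append(downstream)
    return reached - {compromised}


# Hypothetical topology: each agent maps to the agents it can send messages to.
channels = {
    "auditor": ["planner"],
    "planner": ["researcher", "writer"],
    "researcher": ["writer"],
    "writer": [],
}
```

Agents in the returned set are candidates for targeted test cases in which the upstream agent is deliberately fed adversarial input.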

Deploy gradually and monitor continuously

Per §2.3.3, pre-deployment testing establishes a baseline but must be complemented by continuous monitoring and testing during deployment, because agents adapt to real-time conditions and their behaviour may change — and for multi-agent systems, failure modes such as miscoordination or emergent behaviour may only manifest over time under realistic conditions.

Gradual deployment controls risk exposure during the highest-risk early period. The framework recommends controlling rollout by:

Users — roll out to trained or experienced users first.
Tools and protocols — restrict agents to more secure, whitelisted servers (MCP servers, in the framework's example) first.
Systems — use agents in lower-risk internal systems first.

This is the pattern in the framework's GovTech case study, where agentic coding assistants were first limited to internal employees, with no external protocol-server access (the case study describes this as "no MCP allowed") and only low-risk systems, while central logging, monitoring, and a protocol-governance path were built out before broadening the rollout. The case study, like all worked examples in the framework, is illustrative rather than a requirement.
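The three rollout dimensions (users, tools and protocols, systems) can be expressed as a stage definition that is checked on every request. This is an illustrative sketch only: the stage names, group labels, and the `is_permitted` helper are hypothetical and are not taken from the framework or the GovTech case study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutStage:
    """What one rollout stage opens up along each dimension."""
    name: str
    user_groups: frozenset
    servers: frozenset
    systems: frozenset


# Hypothetical two-stage plan; the pilot permits no protocol servers at all.
STAGES = [
    RolloutStage(
        "pilot",
        frozenset({"trained_internal"}),
        frozenset(),
        frozenset({"internal_low_risk"}),
    ),
    RolloutStage(
        "expanded",
        frozenset({"trained_internal", "all_internal"}),
        frozenset({"whitelisted"}),
        frozenset({"internal_low_risk", "internal_standard"}),
    ),
]


def is_permitted(stage, user_group, system, server=None):
    """True only if every dimension of the request is open in this stage."""
    if user_group not in stage.user_groups or system not in stage.systems:
        return False
    return server is None or server in stage.servers
```

Keeping the stage definition as data makes broadening the rollout a reviewable configuration change rather than a code change.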

In Modulos, gradual deployment, monitoring, and change management are governed by MRF-316 — Deploy gradually, monitor continuously, manage agentic change, carrying MCF-308 (logging), MCF-309 (monitoring activities), MCF-325 (change management), MCF-404 (comprehensive logging, monitoring and anomaly detection), MCF-521 (emergency revocation and quarantine), MCF-525 (blast-radius limits and circuit breakers), and MCF-551 (agent workflow testing and staged rollout). MCF-308, MCF-309, and MCF-325 carry the Agnostic tag — they are framework-agnostic logging, monitoring, and change-management controls reused across templates, so evidence recorded once also serves other attached frameworks.

Alert thresholds, immutable logs, and feedback loops

The framework's monitoring recommendations are concrete. Organisations should determine what to log from their monitoring objectives, prioritising high-risk activities such as database updates and financial transactions, and should monitor on multiple layers (user-agent interaction, agent-tool invocation, and model reasoning) to localise the source of a failure. The framework recommends defining alert thresholds of three kinds:

Programmatic, threshold-based — for example, an agent attempting unauthorised access or making too many repeated tool calls in a window.
Outlier / anomaly detection — data-science or deep-learning techniques over agent signals.
Agents monitoring other agents — agents designed to flag anomalies or inconsistencies in real time.

For each alert type the framework recommends defining a proportionate intervention: lower-priority alerts are flagged for scheduled review, higher-priority alerts halt agent execution until a human can assess, and catastrophic malfunction or compromise triggers termination and fallback. It also recommends keeping some degree of human-in-the-loop review to catch unanticipated emergent behaviour, integrating monitoring with observability platforms and tracing standards such as OpenTelemetry, ensuring log immutability so that problematic trajectories cannot be deleted, and establishing feedback loops that route monitoring insights back into training and evaluation datasets. In Modulos, those insights also feed back into the upstream risk assessment and design.
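Two of these recommendations lend themselves to a short sketch: mapping alert severity to an intervention, and making a log tamper-evident. Everything here is an assumption for illustration, including the threshold of 20 tool calls, the event field names, and the `HashChainedLog` class. A hash chain makes edits and mid-log deletions detectable but does not by itself prevent them; true immutability needs write-once storage or an external anchor.

```python
import hashlib
import json
from enum import Enum


class Severity(Enum):
    LOW = 1
    HIGH = 2
    CRITICAL = 3


INTERVENTIONS = {
    Severity.LOW: "flag_for_scheduled_review",
    Severity.HIGH: "halt_until_human_review",
    Severity.CRITICAL: "terminate_and_fall_back",
}


def classify(event):
    """Hypothetical threshold rules; real ones come from monitoring objectives."""
    if event.get("compromise_suspected"):
        return Severity.CRITICAL
    if event.get("unauthorised_access") or event.get("tool_calls_in_window", 0) > 20:
        return Severity.HIGH
    return Severity.LOW


class HashChainedLog:
    """Append-only log where each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def append(self, event):
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        payload = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((prev + payload).encode()).hexdigest()
        self.entries.append({"prev": prev, "event": event, "hash": digest})

    def verify(self):
        """Recompute the chain; False means an entry was altered or removed."""
        prev = self.GENESIS
        for entry in self.entries:
            payload = json.dumps(entry["event"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True
```

A monitoring loop would call `classify` on each event, look up the action in `INTERVENTIONS`, and append both the event and the action taken to the log.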

Manage agentic change (minor / material / critical)

Because small modifications can cascade in a complex agentic system, the framework recommends defining clear triggers for a change-review process and categorising changes by risk:

Change category	Examples	Review depth
Minor	Prompt refinements	Lighter review process
Material	Model updates, autonomy adjustments	Full governance review
Critical	Changes affecting high-stakes decisions	Immediate re-assessment of risk

Change triggers span four categories: technical (model updates, tool modifications), environmental (domain shifts, business-context changes), performance (anomalous behaviour, degraded performance), and regulatory (compliance changes). The re-assessment that a critical change triggers is what makes the four dimensions an iterative loop rather than a one-way pipeline: a material or critical change loops back to Assess and bound the risks.
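The categorisation table can be encoded as a small routing function so that every change request receives a review depth automatically. The field names (`kind`, `affects_high_stakes_decisions`) and the decision to treat unlisted change kinds as material are assumptions made for this sketch; the framework gives only the examples shown in the table.

```python
from enum import Enum


class Category(Enum):
    MINOR = "lighter review process"
    MATERIAL = "full governance review"
    CRITICAL = "immediate re-assessment of risk"


# Only the example from the table is listed as minor.
MINOR_KINDS = {"prompt_refinement"}


def categorise(change: dict) -> Category:
    """Map a change request to a category and, through its value, a review depth."""
    if change.get("affects_high_stakes_decisions"):
        return Category.CRITICAL
    if change.get("kind") in MINOR_KINDS:
        return Category.MINOR
    # Assumption: anything not explicitly known to be minor is reviewed as
    # material, which covers model updates and autonomy adjustments.
    return Category.MATERIAL
```

Defaulting unknown kinds to the heavier review is a conservative choice for the sketch; an organisation could equally maintain an explicit list for each category.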

Govern multi-agent and cross-system interactions

The framework treats multi-agent systems as a distinct concern (§1.1.2 multi-agent setups, §1.2.3 systemic and multi-agent risks, and the §2.3.1 multi-agent controls), because they introduce risks the single-agent controls do not cover: cascading or compounding effects, agent sprawl, miscoordination, conflict between agents optimising different goals, emergent collusion, unpredictability from non-deterministic agents working together, and the difficulty of testing interactions that cross system or organisational boundaries.

In Modulos this is the conditional requirement MRF-317 — Govern multi-agent and cross-system interactions.

Conditional requirement

MRF-317 applies only when the system runs more than one agent. A single-agent deployment marks it not applicable. It is the multi-agent lens over risks that the framework also touches at single-agent level in suitability (MRF-311), technical controls (MRF-314), and testing (MRF-315) — MRF-317 is where they are governed as a system.

MRF-317 carries MCF-518 (tool invocation policy gate), MCF-519 (agent identities and delegation controls), MCF-522 (secure inter-agent communication and discovery), MCF-524 (memory governance and rollback), MCF-525 (blast-radius limits and circuit breakers), MCF-545 (action-space and autonomy classification), and MCF-549 (multi-agent system testing). The topology — sequential, supervisor, or swarm — drives the baseline:

Any non-single-agent topology requires typed inter-agent contracts, output validation at boundaries, taint and provenance propagation, per-sub-agent identity, and multi-agent emergent-behaviour testing.
Supervisor topologies additionally require delegated-authority limits and no transitive privilege.
Swarm topologies additionally require shared-memory limits, fan-out caps, conflict handling, system-level emergency revocation, and collusion detection.
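The topology baseline above is additive, which makes it straightforward to express as a lookup. The sketch below copies the control names from the three lines above; the function name `required_controls` and the topology labels used as keys are hypothetical and do not reflect how Modulos stores this internally.

```python
BASELINE_MULTI_AGENT = [
    "typed inter-agent contracts",
    "output validation at boundaries",
    "taint and provenance propagation",
    "per-sub-agent identity",
    "multi-agent emergent-behaviour testing",
]
SUPERVISOR_EXTRA = [
    "delegated-authority limits",
    "no transitive privilege",
]
SWARM_EXTRA = [
    "shared-memory limits",
    "fan-out caps",
    "conflict handling",
    "system-level emergency revocation",
    "collusion detection",
]


def required_controls(topology: str) -> list:
    """Return the baseline controls for a topology label."""
    if topology == "single":
        return []  # the multi-agent requirement is not applicable
    if topology not in {"sequential", "supervisor", "swarm"}:
        raise ValueError(f"unknown topology: {topology}")
    controls = list(BASELINE_MULTI_AGENT)
    if topology == "supervisor":
        controls += SUPERVISOR_EXTRA
    if topology == "swarm":
        controls += SWARM_EXTRA
    return controls
```

For example, `required_controls("supervisor")` returns the five baseline items followed by the two supervisor-specific ones.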

Topology itself is captured as a dimension of the action-space and autonomy classification (MCF-545), which is the same classification artefact that drives risk scoping in Dimension 1 — see How these concepts drive risk scoping in Modulos.

How to operationalize Dimension 3 in Modulos

Dimension 3 is per-application build-and-operate work, so it lives entirely on the MFF-17 application template; these four requirements have no 1:1 organisation-scope mirror on OFF-17. (The org-side facet of change management lives in ORF-390 — Maintain agentic risk and change-management methodology — which is otherwise framed under Dimension 1.) A pragmatic sequence:

Design-and-development controls — fulfil MRF-314. Select the planning, tool, memory, protocol, and runtime controls appropriate to the agent's risk cell; prefer structural enforcement; treat any protocol layer as a governance gateway. Do not blanket-apply every control — match controls to the risk-cell rubric from Dimension 1.
Pre-deployment testing — fulfil MRF-315. Test whole workflows (reasoning and tool calls), policy compliance, tool calling, and robustness; run repeatedly across varied datasets in a realistic environment; for multi-agent systems, test emergent behaviour and compromise propagation.
Gradual deployment, monitoring, and change management — fulfil MRF-316. Roll out gradually; monitor on multiple layers with defined alert thresholds and interventions; keep logs immutable; establish feedback loops; categorise changes and route critical changes back to a fresh risk assessment.
Multi-agent governance — fulfil MRF-317 only if the system runs more than one agent; otherwise mark it not applicable, with the rationale recorded in the requirement's comments and logs.

Each requirement is evidenced the same way: implemented controls plus linked evidence drive a readiness signal, and the requirement owner then attests fulfilment. Reviews in Modulos apply to control status changes (and other reviewable objects), not to the requirements themselves. Scoping for this framework is not tag-driven — all MRF-31x requirements carry empty tags; risk scoping is driven by the action-space×autonomy classification and the impact×likelihood risk-cell rubric encoded in MCF-545, MCF-546, and MCF-547. For the full template structure and the readiness-plus-attestation loop, see Operationalizing the MGF for Agentic AI in Modulos. For where monitoring and runtime controls surface day to day, see the Runtime operating model.

Cross-framework mapping (preview)

Dimension 3 is the part of the MGF that maps most naturally onto adjacent agentic-security and AI-management frameworks, because it is the most technical:

OWASP Top 10 for Agentic Applications — the structural tool-layer controls, protocol-gateway pattern, and multi-agent testing in this dimension are the closest adjacency to OWASP's agentic threat catalogue (tool misuse, privilege compromise, memory poisoning, cascading multi-agent failures).
ISO/IEC 42001:2023 — the testing, monitoring, and change-management activities here produce the kind of evidence ISO 42001 expects under its performance-evaluation and operational-control clauses.
NIST AI RMF 1.0 — pre-deployment testing and continuous monitoring align with the Measure function's test-evaluation-verification-validation and production-monitoring outcomes.

Preview

These adjacencies are high-level orientation, not control-by-control mappings. Cross-framework reuse in Modulos is implicit at the control layer — several controls used by MFF-17 carry the Agnostic tag and serve multiple templates — rather than asserted as article-level mappings in requirement detail. Detailed mapping artefacts are out of scope for this page.

Related pages

MGF for Agentic AI overview

The four dimensions, the agentic risk taxonomy, and the MFF-17 / OFF-17 split

Agentic AI components and risks

The eight core components, multi-agent patterns, action-space vs autonomy, and the risk taxonomy

Dimension 1: Assess and bound the risks

Suitability, bounding authority by design, and the risk-cell rubric that scopes these controls

Operationalizing in Modulos

The MFF-17 / OFF-17 playbook, control library, and readiness-plus-attestation loop

Runtime operating model

Where monitoring, logging, and runtime controls surface in the platform

Source attribution

This page summarises Dimension 3 (§2.3, Implement technical controls and processes) of the IMDA Model AI Governance Framework for Agentic AI, v1.5, published 20 May 2026 (updated 5 June 2026) by the Infocomm Media Development Authority (IMDA), Singapore. The framework builds on the Model AI Governance Framework (2nd Edition, 2020). Sample controls, monitoring recommendations, and the change-categorisation guidance are drawn from §2.3.1–§2.3.3; worked examples (Terminal 3, Cyber Sierra, Stability Solutions, the Google–Singapore Government computer-use sandbox, City Developments Limited × Knovel Engineering, and GovTech) are illustrative, not framework requirements. Some sample-control concepts are adapted in the source from third-party material, including CSA's draft addendum on securing agentic AI and GovTech's agentic risk and capability framework. The Modulos requirement codes (MRF-314/315/316/317) and control codes reflect the MFF-17 template.

Disclaimer

This page is for general informational purposes and reflects the voluntary, non-binding nature of the MGF for Agentic AI; it is not legal advice and does not characterise the framework as law or a mandatory standard. For the authoritative text, consult the IMDA Model AI Governance Framework for Agentic AI (v1.5) directly.
