From Human-in-the-Loop to AI-on-the-Loop: Redesigning Oversight Architectures

In many operational domains, AI systems can now perform oversight and checking more effectively than human reviewers, especially at scale and over long time horizons. However, current governance and regulatory regimes in Europe and Singapore still assume that natural persons remain ultimately responsible for high‑risk uses, and they are not designed for a fully AI‑only oversight layer.

Clarifying what “human oversight” really means

“Human in the loop” (HITL) is a design choice, not a law of nature. It describes architectures where a human with real intervention power is embedded directly into the decision workflow. For high‑risk AI, the EU AI Act (Article 14) requires that natural persons can effectively oversee a system, including intervening in and overriding its operation. The NIST AI RMF and ISO/IEC 42001 are voluntary rather than legally binding, but they set comparable expectations, and all three expect the oversight capability to be documented and auditable.

Singapore’s Model AI Governance Framework, MAS FEAT/Veritas guidance and sectoral notices take a similar position: human accountability cannot be outsourced to algorithms, and organisations are expected to demonstrate that human decision‑makers can understand and contest AI‑supported outcomes in material use cases.

Once an AI system takes over the checking function from a human, the architecture shifts towards “AI‑on‑the‑loop”, with humans in supervisory or “human‑in‑command” roles rather than acting as transaction‑level reviewers. This remains compatible with both European and Singapore governance expectations if humans retain final authority and can meaningfully intervene, even if they no longer perform every individual check themselves.

When AI oversight outperforms manual review

In practice, hybrid or automated‑first oversight can outperform purely manual review on several dimensions.

Error detection at scale: AI systems can continuously monitor large volumes of decisions and telemetry for anomalies, model drift and policy violations that would be infeasible for humans to identify in real time.
Consistency and fatigue: Human reviewers exposed to high alert volumes can drift into rubber‑stamp approval, while AI‑based monitors can hold thresholds stable and escalate only genuinely high‑risk cases.
Structured simulations: Some architectures already use “dry‑run” simulations in which AI agents evaluate downstream effects of proposed actions before execution and present a condensed risk picture for human sign‑off.
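
The monitoring and escalation pattern in the points above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the `Decision` type, the `risk_score` field and the threshold values are hypothetical, and a simple z‑score against a baseline window stands in for whatever anomaly or drift detector a real deployment would use.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Decision:
    decision_id: str
    risk_score: float  # hypothetical score in [0, 1] from the operational AI


def escalate(decisions, baseline, z_threshold=3.0, hard_limit=0.9):
    """Return the IDs of decisions that should be escalated to a human.

    A decision is escalated if its score breaches a fixed hard limit or
    sits far above the baseline distribution (z-score test). Both
    thresholds are illustrative values that humans would govern.
    """
    mu = mean(baseline)
    sigma = pstdev(baseline) or 1e-9  # avoid division by zero on a flat baseline
    flagged = []
    for d in decisions:
        z = (d.risk_score - mu) / sigma
        if d.risk_score >= hard_limit or z >= z_threshold:
            flagged.append(d.decision_id)
    return flagged


if __name__ == "__main__":
    baseline = [0.10, 0.12, 0.11, 0.09, 0.10, 0.08]
    batch = [Decision("a", 0.11), Decision("b", 0.95),
             Decision("c", 0.13), Decision("d", 0.20)]
    print(escalate(batch, baseline))
```

The monitor applies the same thresholds to every decision, so only the outliers reach a human reviewer. Choosing `z_threshold` and `hard_limit` is exactly the kind of governance decision that stays with people.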

In these designs, the human role shifts from transaction‑level checking to governance of thresholds, escalation logic and exception handling.

From a risk‑engineering perspective, this can be more robust than requiring humans to review each action, provided the oversight AI’s configuration and monitoring are well‑controlled.

Governance changes when AI becomes the primary checker

If an AI‑based system becomes the primary checker, several elements of the governance and assurance model need to be redesigned.

Redefined human oversight: Oversight becomes the configuration and supervision of the checking AI (policies, constraints, KPIs, guardrails) rather than manual review of every decision.
Multi‑layered controls: At least two independent layers should be deployed: the operational AI and a separate assurance or monitoring AI with different training data, governance and change‑control, enforcing separation of duties.
Auditability and traceability: Logs must capture the operational AI’s outputs, the oversight AI’s evaluations, triggers and escalation recommendations, and the resulting human decisions, to satisfy accountability and regulatory audit requirements.
Human fallback and override: The EU AI Act and Singaporean regulators both expect that natural persons can intervene, suspend or deactivate a high‑risk system, so defined human intervention paths must be maintained even if day‑to‑day checking is automated.
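
To make the auditability point concrete, the sketch below shows one possible shape for a log record that ties together the operational AI’s output, the oversight AI’s evaluation and the resulting human decision. All field names and the `missing_human_decision` check are hypothetical; a real schema would follow the organisation’s own logging and retention requirements.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AuditRecord:
    decision_id: str
    operational_output: str          # what the operational AI proposed or did
    oversight_evaluation: str        # the oversight AI's assessment
    oversight_trigger: Optional[str]  # rule or signal that fired, if any
    escalation_recommended: bool
    human_decision: Optional[str] = None   # e.g. "approved", "rejected", "suspended"
    human_reviewer: Optional[str] = None   # identity of the accountable person
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(record):
    """Serialise a record as one JSON line for an append-only audit log."""
    return json.dumps(asdict(record), sort_keys=True)


def missing_human_decision(record):
    """True if escalation was recommended but no human outcome is recorded."""
    return record.escalation_recommended and record.human_decision is None


if __name__ == "__main__":
    rec = AuditRecord("d-1", "reroute traffic via ring road",
                      "possible policy violation", "closure_rule", True)
    print(missing_human_decision(rec))
    rec.human_decision, rec.human_reviewer = "rejected", "operator-7"
    print(to_log_line(rec))
```

A check such as `missing_human_decision` lets an auditor find escalations that never received a human outcome, which is one way to evidence that intervention paths are actually used.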

The key shift is from “human as last line of defence on every decision” to “human as designer and governor of a layered control system containing multiple AI components”.

For regulators in both the EU and Singapore, the test will be whether organisations can demonstrate that these control layers are explainable, monitored and subject to accountable human oversight, not whether a human clicked “approve” on each item.

Practical application for smart city and infrastructure operations

In smart city and critical infrastructure environments, a practical “AI as checker” architecture combines domain‑specific agents, independent oversight agents and a human‑in‑command layer.

Operational agents: Domain‑specific AI agents (e.g. traffic optimisation, building energy management, incident triage) operate within predefined policy envelopes and service‑level constraints.
Oversight agents: Separate AI models monitor policy compliance, anomalies, bias and safety signals over shared telemetry, with independent data paths, access controls and model governance to reduce correlated failure modes.
Human‑in‑command layer: Governance forums and operational control centres define which events auto‑execute, which auto‑execute with ex‑post sampling review, and which require human approval supported by AI‑generated simulations, rationales and risk summaries. Clear RACI definitions specify who is authorised to adjust thresholds, suspend agents or approve new policies.
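
The three execution tiers and the RACI authorisations described above could be expressed as configuration owned by the human‑in‑command layer. The sketch below is illustrative only: the event types, role names and policy tables are hypothetical examples, and routing unknown events to human approval is an assumed fail‑closed default rather than something the frameworks prescribe.

```python
import random
from enum import Enum


class Route(Enum):
    AUTO_EXECUTE = "auto_execute"
    AUTO_WITH_SAMPLING = "auto_execute_with_ex_post_sampling"
    HUMAN_APPROVAL = "human_approval_required"


# Hypothetical policy table, set by a governance forum.
POLICY = {
    "traffic_signal_retiming": Route.AUTO_EXECUTE,
    "building_setpoint_change": Route.AUTO_WITH_SAMPLING,
    "road_closure": Route.HUMAN_APPROVAL,
}

# Hypothetical RACI extract: roles authorised for each governance action.
AUTHORISED = {
    "adjust_threshold": {"risk_owner"},
    "suspend_agent": {"risk_owner", "control_centre_operator"},
    "approve_policy": {"governance_forum"},
}


def route_event(event_type):
    """Look up the execution tier; unknown event types fail closed."""
    return POLICY.get(event_type, Route.HUMAN_APPROVAL)


def selected_for_review(route, sample_rate, rng):
    """Decide whether an auto-executed event is sampled for ex-post review."""
    return route is Route.AUTO_WITH_SAMPLING and rng.random() < sample_rate


def is_authorised(role, action):
    """Check whether a role may perform a governance action."""
    return role in AUTHORISED.get(action, set())


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    for event in ("traffic_signal_retiming", "road_closure", "drone_dispatch"):
        print(event, route_event(event).value)
    print(is_authorised("control_centre_operator", "suspend_agent"))
```

Keeping these tables as reviewable configuration, rather than logic buried in the agents, makes it straightforward to show who changed a routing rule or authorisation and when.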

In this model, humans are not “in the loop” for every operation, but they remain accountable for designing, tuning and auditing the AI control stack. This aligns with emerging European standards and Singapore’s emphasis on demonstrable accountability, and it provides a more realistic path to scaling AI in high‑risk domains than either purely manual oversight or fully autonomous operation.