The federal approach to artificial intelligence governance is undergoing a significant shift, but key questions remain unresolved. In early June, the White House released an executive order on AI innovation and security, alongside National Security Presidential Memorandum-11, which addresses AI within national security frameworks. This comes as a new wave of AI models demonstrates increasingly powerful cyber capabilities.
Before the order's own review system was fully operational, the administration used export control authority to block Anthropic's Fable 5 and its underlying Mythos model. OpenAI, meanwhile, held back the release of GPT-5.6 pending government approval. These moves unfold as agencies and private companies race to integrate AI agents into routine operations. Beyond the national security headlines, however, lie deeper concerns about secure deployment and whether these systems can be trusted to act independently.
Agentic AI is fundamentally about delegation. A system that can draft emails, search databases, file forms, write code, monitor networks, or route requests is no longer just generating information; it is being trusted to act, often through multiple steps before human review. These systems are improving rapidly. The Model Evaluation and Threat Research organization, which assesses frontier AI systems, tracks how long a task would take a human expert and whether AI can complete it reliably. In 2025, the length of tasks that AI could complete reliably was doubling every seven months; recent estimates put the doubling time closer to four months. At a four-month doubling time, that task length grows eightfold in a year, compared with roughly threefold at seven months. The trajectory is clear, and institutions must build the capacity to govern agents while meaningful human control remains possible.
Trust and Security at Stake
Properly deployed, agents could transform citizen-government interactions. A small business navigating licensing could spend more time serving customers and less on paperwork. A veteran filing a benefits claim could shave weeks off a process that now takes months. Agencies could use agents to remove unnecessary steps, reduce backlogs, and deliver higher-quality services. But these possibilities depend on trust, reliability, and security. A poorly governed agent could move information to the wrong place, act outside its authority, or bury an error in a chain of automated steps that no one can later reconstruct. The results could include misallocation of benefits, crippling of critical infrastructure, or even conflict escalation. AI decision-support systems are already generating target recommendations for military commanders and being incorporated into sensitive systems, while guidance and technical standards lag behind.
Much of the AI policy debate focuses on access: who gets the models, chips, data, and energy. As AI systems begin to act, however, policymakers also face the urgent question of what makes delegation to these systems trustworthy—whether institutions can use them reliably, securely, accountably, and in ways understood by government and the public. To build the infrastructure that guides responsible use, we need trained personnel, sound procurement, clear lines of authority, audit logs, and the ability to reconstruct decisions after the fact. That deployment infrastructure will determine whether agentic AI strengthens public institutions or makes them more brittle.
Cybersecurity and Export Controls
Cybersecurity makes the stakes concrete. Anthropic's Mythos model, which has sharply outperformed prior systems at finding software vulnerabilities, shows how quickly agentic capabilities can serve both defenders and attackers. Industry has responded: Anthropic's Project Glasswing and OpenAI's Daybreak programs extend access to advanced tools to vetted defenders, an approach known as differential access. As noted above, the White House is focusing on access. Access alone, however, will not help the hospitals, utilities, state agencies, and municipal systems most exposed to cyber threats if they lack the staff, standards, integrations, and operational practices to use those tools responsibly. The Cybersecurity and Infrastructure Security Agency and the National Security Agency, along with Australian and allied cyber agencies, recently issued guidance on the careful adoption of agentic AI services, highlighting permissions, segmentation, monitoring, accountability, and human oversight.
Two priorities stand out. First, policymakers should invest in evaluation and auditing capacity. For the government to rely on agentic systems, it needs to understand how those systems use tools, whether they stay within authorized boundaries, how they behave under stress, how they perform on cyber-relevant tasks, and what risks appear with multi-agent interactions. The June executive order operationalized a version of this by asking developers to submit their most capable models for government review before broad release, but how it works in practice is still developing. Even so, evaluation science must keep pace with the frontier; models that know they are being evaluated, for example, complicate measurement and demand new tests. The Center for AI Standards and Innovation is responsible for much of this work, including the evaluation capacity any such review depends on. It has agreements to support pre-deployment evaluations and targeted research on frontier AI capabilities, including collaborations with OpenAI and Anthropic. That is the right direction, but the center lacks congressional authorization, and its $10 million budget is too small. Its deep evaluation expertise should complement the NSA's in national security deployments.
Second, policymakers should clarify and strengthen export controls and enforcement measures that protect U.S. and allied advantages in frontier AI. The question is not only who has the most capable models, but whose institutions and values shape how agentic systems are built, used, and relied upon. Recent bipartisan bills from the House Foreign Affairs Committee, including the one discussed in Ocasio-Cortez's push for antitrust action and a moratorium on AI data centers, highlight growing congressional attention to these issues. Meanwhile, public trust in federal institutions remains fragile: a recent Fox News poll showed record-low trust in the federal government, which underscores the stakes of getting AI governance right.
