Everything Microsoft announced about AI, agents, and PCs at Build 2026

16 Jun 2026 19:08

Microsoft’s Build 2026 keynote was all about “frontier intelligence” – from AI PCs and Windows dev tools to cloud-scale agent runtimes, new MAI models, and even quantum. Here’s a clear breakdown of the biggest announcements and what they mean for developers and companies.

The announcements all circle around one theme: how developers and companies can fully participate in a new “frontier intelligence” ecosystem. Instead of focusing on a single model or product, Microsoft is trying to build a complete stack for AI agents that spans devices, cloud, data, security, and even scientific discovery.

The new AI stack: from edge to cloud

Microsoft framed its AI platform as a layered stack. At the bottom is a ubiquitous compute fabric that stretches from edge devices (PCs and new form factors) to massive cloud data centers. On top of that sit models, context (your data and knowledge), and tools. The top layer is the runtime where agents and AI applications actually execute. All of this is wrapped in tooling, security, compliance, and governance.

The key idea: you’re not just calling a model anymore. You’re building long-running, tool-using agents that reason, act, and learn across this whole stack.

AI PCs and “unmetered intelligence” on Windows

On the edge, Microsoft is betting heavily on AI PCs. Every CPU, GPU, and NPU in a Windows machine is now seen as part of a local AI fabric. Many features in Microsoft 365 already run locally (like Outlook summaries, PowerPoint text features, and Teams super resolution), and Microsoft wants third-party developers to tap into that same power.

Windows ML and Windows AI are being expanded so developers can target the full installed base of GPUs and NPUs. Microsoft also announced two local models designed for Windows:

• MAI Instruct – a local reasoning model for general instructions and analysis.
• MAI Plan – a planning model for orchestrating multi-step tasks.

Together they enable a full “agentic loop” entirely on-device: you can give agents tools, local context, and let them act without any round trips to the cloud.
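Here is a rough sketch of what such an on-device loop could look like. The article does not describe the actual APIs for MAI Plan or MAI Instruct, so `plan`, `instruct`, and `run_agent` are hypothetical stand-ins that only illustrate the plan, act, reason cycle running in a single local process.

```python
from typing import Callable, Dict, List, Optional, Tuple

Step = Tuple[str, Optional[str]]

def plan(goal: str) -> List[Step]:
    """Hypothetical stand-in for a local planning model such as MAI Plan."""
    return [("read_file", "notes.txt"), ("summarize", None)]

def instruct(prompt: str) -> str:
    """Hypothetical stand-in for a local reasoning model such as MAI Instruct."""
    return f"summary of: {prompt}"

def run_agent(goal: str, tools: Dict[str, Callable[[str], str]]) -> str:
    """Plan, call local tools, and reason over the results in-process."""
    context = ""
    for tool_name, arg in plan(goal):
        if tool_name == "summarize":
            context = instruct(context)      # reasoning step: local model
        else:
            context = tools[tool_name](arg)  # tool step: local code
    return context

# Tools are plain local callables; nothing here touches the network.
local_tools = {"read_file": lambda path: f"contents of {path}"}
result = run_agent("summarize my notes", local_tools)
print(result)
```

In a real setup the two stubs would call whatever local inference runtime hosts the models; the loop structure is the part that carries over.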

New hardware: Surface Ultra, Surface RTX Spark, and DGX on Windows

To push local AI further, Microsoft highlighted a wave of new hardware from partners like Intel, Qualcomm, and NVIDIA. The headline products:

Surface Ultra – a next-generation Windows device built around an NVIDIA SoC that combines CPU, GPU, and AI acceleration with unified memory and a high-end display, aimed at creators and power users.

Surface RTX Spark DevBox – described as a “dream machine” for developers, with:

• 1 petaFLOP of AI compute
• 20 CPU cores
• 128 GB unified memory shared across CPU, GPU, and AI accelerators

This box is designed to run very large local models (100B+ parameters) and multiple agents in parallel without worrying about token costs.

On the extreme end, NVIDIA’s DGX Station is now being positioned as a “desktop data center” capable of running trillion-parameter models locally, with Windows support coming to that platform as well.

Windows as a first-class dev and agent platform

Microsoft is re-positioning Windows as the best place to build and run AI agents, not just traditional apps. Several developer-focused improvements were announced:

• Calm default dev experience – a distraction-free Windows configuration (no news feed, no widgets, dark mode by default) shipping as a public configuration file you can apply to your own device.

• Vertical taskbar – long-requested support for a vertical taskbar is coming to Windows Insider builds.

• DevBox configuration – a public repo with a configuration file that installs Python, Node, and other dev tools, and tunes Windows settings for development.

• PowerToys upgrades – new utilities like “Grab and Move” (move windows by holding Alt and dragging anywhere) and “End Task” directly from the taskbar.

• Dev Drive – a developer-optimized drive type with Defender scanning running asynchronously to improve performance, plus Git-aware views with branch names and file status.

• Intelligent Terminal – a new terminal experience with a built-in AI agent pane (GitHub Copilot or any other agent) that can detect errors in terminal output, suggest fixes, and work alongside you.

• Native containers and WSL improvements – first-class container support on Windows (including GPU access), better WSL profiles with common tools (zsh, Homebrew, btop), and a more Linux-friendly environment.

• Local large models for coding – Surface RTX Spark can serve large local coding models (e.g., 120B parameters) and track token usage entirely on-device.

Voice-driven development is also being pushed: Copilot’s voice feature can launch and direct agents using a local model, offloading many subtasks to the GPU for cost-efficient automation.

Microsoft’s cloud: training, inference, and agent runtimes

On the cloud side, Microsoft is optimizing everything around a simple equation: tokens per dollar per watt. The goal is to turn electrons into tokens as efficiently as possible, from data center design to silicon.
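The article does not spell out how the metric is calculated. One plausible reading, shown below as an assumption, divides tokens produced by the product of cost and power draw. The figures are made-up illustrations, not Microsoft data.

```python
def tokens_per_dollar_per_watt(tokens: float, dollars: float, watts: float) -> float:
    """One possible reading of the metric: tokens / (dollars * watts)."""
    if dollars <= 0 or watts <= 0:
        raise ValueError("dollars and watts must be positive")
    return tokens / (dollars * watts)

# Illustrative numbers only: the same 1M tokens served by two hypothetical setups.
baseline = tokens_per_dollar_per_watt(1_000_000, dollars=2.0, watts=500.0)
improved = tokens_per_dollar_per_watt(1_000_000, dollars=1.5, watts=400.0)

print(f"baseline: {baseline:.1f}")
print(f"improved: {improved:.1f}")
print(f"gain: {improved / baseline:.2f}x")
```

Under this reading, cutting either cost or power raises the score, which is why data center design and silicon both show up in the same equation.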

Microsoft’s Azure footprint now spans over 500 data centers, with more capacity added in the last 18 months than in the first decade of Azure. New AI-focused facilities like “Fairwater” are designed from the ground up for large-scale training and inference:

• Two-story architectures to pack GPUs densely with high-bandwidth, low-latency networking
• Rethought power delivery to minimize conversion losses
• Closed-loop cooling systems that can operate with effectively zero water consumption (annual water use comparable to a single restaurant)

Silicon: NVIDIA, AMD, and Microsoft’s own Cobalt CPUs

Microsoft is taking a heterogeneous approach to AI silicon:

• NVIDIA Grace Blackwell – Microsoft claims to be deploying the largest number of Grace Blackwell GPUs in the world, in facilities like Fairwater. The upcoming “Vera Rubin” architecture is optimized for agentic workloads and secure, encrypted data paths.

• Maia 200 – Microsoft’s own Maia 200 AI accelerator is live in Arizona and being rolled out globally. It reportedly delivers ~30% better tokens per dollar than leading GPUs today and will power services like Microsoft 365 Copilot.

• Cobalt CPUs – Microsoft’s in-house Cobalt 100 CPUs are optimized for cloud-native and agent workloads, with new Cobalt 200 VMs in preview. Using GitHub Copilot traces, Microsoft reports:

• 33% lower latency for agent calls
• 14% faster speed
• 23% higher throughput

The key point: agents are CPU-heavy as well as GPU-heavy, so co-designing CPUs and accelerators for agent patterns is becoming critical.

Project Solara: new agent-first devices

Beyond PCs and servers, Microsoft introduced “Project Solara” – a new ecosystem of agent-first devices designed specifically for AI agents rather than traditional apps. The core idea is that your “next computer” isn’t one device; it’s a constellation of devices that share the same agents and context.

Two early Solara device categories were shown:

1. Stationary desk device
Built on MediaTek silicon, this sits on your desk and acts as a dedicated ambient Copilot endpoint. It supports Windows Hello for Business for secure sign-in and can act as:

• A front-end to Microsoft 365 Copilot grounded in your work data
• A companion to a Windows Copilot+ PC
• A Windows 365 endpoint when connected to a monitor

2. Wearable access badge
Built on Qualcomm wearable silicon, this reimagines the standard workplace access badge as an agent device. With a fingerprint unlock and built-in camera and microphones, it can:

• Capture content (e.g., photos and video) for social posts or documentation
• Support healthcare workflows (patient check-ins, vitals, medication scanning, hands-free voice documentation)

The same underlying software stack can be adapted for retail, industrial, hospitality, financial services, legal, and more by swapping agents, sensors, and form factors.

Web IQ and Microsoft IQ: grounding agents in real data

Microsoft emphasized that agents are only as good as the context they’re given. To solve this, it introduced a unified “IQ” layer that combines web data, enterprise data, and work procedures.

Web IQ
Web IQ is a new web grounding service designed for LLMs and agents. It’s:

• Model-agnostic and MCP-native (so any agent runtime can use it)
• Built on Microsoft’s global search infrastructure but re-architected for AI
• Tuned for quality, speed, and cost

Agents can use Web IQ to pull fresh, verifiable web content (news, images, video) into their reasoning.
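“MCP-native” means an agent reaches the service through the Model Context Protocol, where a tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request. The tool name `web_search` and its `query` argument are hypothetical, since the article does not list the tools Web IQ actually exposes.

```python
import json

def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)

# Hypothetical tool name and argument, for illustration only.
payload = build_tool_call(1, "web_search", {"query": "current electricity prices"})
print(payload)
```

Because the envelope is the same for every MCP server, an agent runtime that can send this message to one tool can send it to any other, which is what makes the service model-agnostic.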

Microsoft IQ
Beyond the web, Microsoft IQ unifies your organization’s internal intelligence by combining:

• Fabric IQ – enterprise ontologies and live telemetry (e.g., a power grid model with real-time status)
• Work IQ – your procedures, playbooks, and documents in Microsoft 365 (SharePoint, OneDrive, etc.)
• Web IQ – external web knowledge

In a demo for a power utility, an agent:

• Used Web IQ to get current electricity prices
• Queried Fabric IQ to identify at-risk substations from a live grid model
• Applied Work IQ procedures to generate an incident response plan
• Sent a brief via Teams automatically

The key benefit: agents answer from the same live sources humans use, not from stale uploads. When a procedure changes, the agent’s behavior changes too, without re-uploading data or rebuilding prompts.
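The difference between live sources and stale uploads can be sketched in a few lines. The three callables below are hypothetical stand-ins for Web IQ, Fabric IQ, and Work IQ lookups, and the values they return are invented. The point is only that the agent fetches each source at call time, so an updated procedure changes the output with no re-upload.

```python
from typing import Callable, Dict, List

def build_brief(
    get_price: Callable[[], float],        # stand-in for a Web IQ lookup
    get_at_risk: Callable[[], List[str]],  # stand-in for a Fabric IQ query
    get_procedure: Callable[[], str],      # stand-in for a Work IQ lookup
) -> str:
    """Assemble an incident brief from sources read at call time."""
    price = get_price()
    substations = ", ".join(get_at_risk())
    procedure = get_procedure()
    return f"Price {price:.2f}/MWh; at risk: {substations}; action: {procedure}"

# Invented data: a procedure store that the organization edits over time.
procedures: Dict[str, str] = {"heatwave": "shed non-critical load"}
sources = dict(
    get_price=lambda: 84.5,
    get_at_risk=lambda: ["SUB-7", "SUB-12"],
    get_procedure=lambda: procedures["heatwave"],
)

before = build_brief(**sources)
procedures["heatwave"] = "dispatch field crew"  # the procedure changes...
after = build_brief(**sources)                  # ...and so does the agent's plan
print(before)
print(after)
```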

MXC: secure agent execution on Windows

As agents become more powerful and autonomous, Microsoft is adding a new OS-level containment layer called Microsoft Execution Containers (MXC). MXC lets Windows enforce isolation and policies around agent actions using native primitives.

MXC supports multiple levels of isolation:

• Process-level isolation – for lightweight agent actions
• Session-level isolation – for separating user sessions on Windows and Linux (including WSL)
• Full isolation via Windows 365 – for maximum separation in a managed cloud environment

The idea is that regardless of who built the agent, Windows can enforce containment policies around what it can access and modify.

OpenAI-style agents in the open: OpenClaw on Windows

One of the most concrete MXC demos involved OpenClaw, a popular open-source agent framework that can control your files, machines, and chats. Microsoft worked with the OpenClaw community to ship a native Windows companion app that:

• Helps users set up and connect to OpenClaw gateways
• Sandboxes tool calls using MXC
• Lets IT or users configure granular permissions (folders, clipboard, network access, etc.)

In a live demo, an OpenClaw agent tried to delete every file on the desktop. OpenClaw’s own safety layers were disabled, but MXC blocked the deletion because the agent only had read access to that folder. The files remained intact.

This shows how OS-level containment can protect users even when agents are misconfigured or overly permissive.
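To make the idea concrete, here is a minimal sketch of a containment check that sits outside the agent. It is not the MXC API, which the article does not describe. `ContainmentPolicy`, its grant format, and the example paths are all hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Set

class ContainmentPolicy:
    """Hypothetical policy: each folder maps to the actions allowed inside it."""

    def __init__(self, grants: Dict[str, Set[str]]) -> None:
        self.grants = {PurePosixPath(p): set(a) for p, a in grants.items()}

    def check(self, action: str, target: str) -> None:
        """Raise PermissionError unless some grant covers this action on target."""
        path = PurePosixPath(target)
        for folder, actions in self.grants.items():
            inside = path == folder or folder in path.parents
            if inside and action in actions:
                return
        raise PermissionError(f"{action} denied on {target}")

# The agent was only granted read access to the desktop folder.
policy = ContainmentPolicy({"/home/user/Desktop": {"read"}})

policy.check("read", "/home/user/Desktop/report.docx")  # allowed, returns quietly

try:
    policy.check("delete", "/home/user/Desktop/report.docx")
    blocked = False
except PermissionError:
    blocked = True

print("delete blocked:", blocked)
```

The check runs regardless of what the agent framework itself permits, which mirrors the demo: disabling the agent's own safety layers does not widen the grant.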

Foundry: a full agent platform in Azure

On the cloud side, Microsoft is turning Azure AI Foundry into a full application platform for the agent era. Foundry now offers:

• Hosted agent runtimes – long-running agents with their own durable state, sandboxed execution, and access to IQ tools.

• Tooling and safety – automatic sandbox provisioning, guardrails, and a self-improvement loop where agents can be evaluated and optimized over time.

• Model choice – over 121,000 models in the catalog, including OpenAI, Anthropic, MAI models, and now models from partners like Fireworks AI.

Foundry is also becoming a place to manage tools via MCP, connect to data, and deploy agents into Microsoft 365 or your own apps.

GitHub Copilot app and Rayfin: agents that actually ship

GitHub is evolving from a code-hosting service into a control plane for agents. Nearly every GitHub metric is growing faster thanks to agentic workflows: repo creation, PRs, API usage, and Actions.

To make this more manageable, Microsoft introduced a new standalone GitHub Copilot app:

• Session manager – you can spin up multiple parallel agent sessions, each with its own Git worktree, so agents can work on different issues without clobbering each other.

• Agent merge – Copilot can babysit PRs through CI, code review, and merging.

• My work view – a focused view of your repos, PRs, and automation tasks.

• Canvases – a new concept where agents can build custom UIs (e.g., a camera-based canvas that approves PRs with a thumbs-up gesture).
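The worktree-per-session idea maps onto a standard Git feature: `git worktree add` checks out a new branch into its own directory while sharing one repository. The sketch below only builds the commands. How the Copilot app does this internally is not described in the article, and the `agent/<session>` branch naming and directory layout are assumptions.

```python
from pathlib import PurePosixPath
from typing import List

def worktree_command(repo_root: PurePosixPath, session: str, base: str = "main") -> List[str]:
    """Build a `git worktree add` command giving one agent session its own checkout."""
    # Assumed layout: sibling directory named <repo>-<session>.
    worktree_dir = repo_root.parent / f"{repo_root.name}-{session}"
    return [
        "git", "-C", str(repo_root),
        "worktree", "add",
        "-b", f"agent/{session}",  # assumed branch naming scheme
        str(worktree_dir),
        base,
    ]

repo = PurePosixPath("/src/app")
commands = [worktree_command(repo, s) for s in ("issue-101", "issue-102")]
for cmd in commands:
    print(" ".join(cmd))
```

Each command could be passed to `subprocess.run(cmd, check=True)` inside a real repository. Because every session has a separate working directory and branch, two agents editing the same file never overwrite each other's uncommitted changes.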

To solve the back-end problem, Microsoft introduced Rayfin, an agent-first SDK that connects agents to a managed backend-as-a-service on Microsoft Fabric. With a single command like rayfin up, you can deploy a fully managed backend (storage, auth, database) for an AI app built by an agent, while keeping it inside your enterprise tenant.

Rayfin is also integrated with Replit, so you can build apps in Replit while deploying data and backend services into your own Fabric environment.

Agent 365: identity, governance, and security for agents

Microsoft is treating agents as first-class entities in the enterprise, similar to users, apps, and devices. Agent 365 is the control plane for this.

Key pieces include:

• Identities and access control – agents can have their own identities and licenses (e.g., “autopilot” agents in Microsoft 365) and can act on behalf of users with clear boundaries.

• Defender integration – real-time defense for agents, extending Microsoft Defender to AI workloads.

• Purview integration – always-on data protection, compliance, and auditing for agent actions.

The Agent 365 SDK is now generally available and being expanded to cover local agents on Windows and frameworks like OpenClaw.

MDASH: defending against AI-powered attacks

Security isn’t just about protecting AI; it’s also about using AI to find vulnerabilities. Microsoft introduced MDASH, a multimodel, agentic security harness that uses over 100 specialized agents and multiple models to discover exploitable bugs.

MDASH runs as a CLI and integrates into the GitHub Copilot app. It:

• Scans codebases for traditional and AI-specific vulnerabilities
• Uses multiple agents to spot suspicious patterns, debate them, and build proof-of-concept exploits
• Generates detailed reports and can propose fixes directly in your local environment

In one example, MDASH found a subtle bug in an open-source project where a time-based map of an object was out of date, leading to a crash. The flaw was spread across three files and even had reassuring comments from developers – exactly the kind of issue that fools single-model scanners but can be caught by a coordinated agent harness.

Copilot’s evolution: chat, cowork, code, and autopilots

Microsoft 365 Copilot is evolving from a chat assistant into a full “super app” with three main modes:

• Chat – conversational assistance grounded in your work data
• Cowork – multistep task execution and artifact creation (documents, presentations, etc.)
• Code – bringing GitHub Copilot-style coding into knowledge work

On top of these, Microsoft is introducing autopilots – enterprise-grade, long-running agents that live inside your tenant with full compliance. Autopilots can have their own names, personalities, connectors, context, and memory.

The first autopilot is Scout, which:

• Joins Teams group chats to monitor work and answer questions
• Handles email threads in Outlook
• Runs in the background to reduce toil and keep you up to date

Scout is available for Copilot Frontier customers now, with more autopilots and a full “digital team” concept coming in the months ahead.

MAI models: new family of reasoning, voice, and coding models

Microsoft’s MAI (Microsoft AI) group announced seven new models across images, voice, transcription, reasoning, and code, all tuned for practical, efficient use.

MAI Image 2.5 & Image 2.5 Flash
High-quality image generation and editing models:

• 2.5: maximum fidelity and professional-grade performance
• 2.5 Flash: faster, more efficient for production workloads

They’re already live in PowerPoint and rolling out to OneDrive, and available in Foundry at competitive quality-per-dollar.

MAI Transcribe 1.5
A state-of-the-art transcription model across 43 languages, optimized for real-world use:

• Higher accuracy than rival models from Google and OpenAI
• Up to 5x faster than competing transcription models
• Integrated into GitHub Copilot, Dynamics 365 Contact Center, and Foundry

MAI Voice 2 & Voice 2 Flash
Speech generation models with natural prosody and fine-grained control in 15 languages, with a Flash variant for ultra-low-latency voice experiences. Both include watermarking by default and protections against unauthorized voice cloning.

MAI Thinking 1
A 35B-parameter reasoning model focused on complex tasks:

• Human raters prefer it over Claude Sonnet 4.6 on overall quality
• 97% on AIME 2025 (reasoning benchmark)
• 53% on SWE-Bench Pro, comparable to Claude Opus 4.6 on tough coding tasks

Crucially, Thinking 1 was trained “from the bottom” without benchmark-specific distillation, using a clean, commercially licensed data pipeline for enterprise use.

MAI Code 1 Flash
An inference-efficient coding model tuned for VS Code and GitHub CLI:

• ~5B parameters (similar to Claude Haiku in size)
• 51% on SWE-Bench Pro
• Lower cost and strong performance for coding workloads

All MAI models are being optimized for Microsoft’s own Maia 200 accelerators, with reported 1.4x performance-per-watt gains compared to running on NVIDIA GB200. They’re also coming to Windows N1X devices, bringing frontier-level reasoning and coding to the edge.

Frontier tuning: your own hill-climbing machine

One of the most important strategic ideas in the keynote was “frontier tuning” – the notion that every serious company will need its own hill-climbing machine: a system that continuously improves AI performance against private objectives and evals.

With Frontier tuning in Foundry, you can:

• Start from a base model like MAI Thinking 1
• Define your own evaluation rubrics and tasks
• Run training loops (including custom reinforcement learning environments, or RLEs) against your data and workflows
• Continuously optimize models and agents using production traces
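The “hill-climbing machine” metaphor comes from a simple optimization pattern: try a nearby variant, keep it only if your private eval scores it higher, and repeat. The toy below applies that pattern to a single numeric setting. It is a stand-in for the idea, not Foundry's tuning method, which would update model weights rather than one number. `private_eval` is an invented objective.

```python
from typing import Callable

def hill_climb(score: Callable[[float], float], start: float, step: float, rounds: int) -> float:
    """Greedy hill climbing: move to a neighbor only when it scores strictly higher."""
    best = start
    best_score = score(best)
    for _ in range(rounds):
        improved = False
        for candidate in (best - step, best + step):
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best, best_score, improved = candidate, candidate_score, True
        if not improved:
            step /= 2  # no better neighbor: search more finely
    return best

def private_eval(setting: float) -> float:
    """Invented private objective that peaks when the setting is 0.7."""
    return -(setting - 0.7) ** 2

tuned = hill_climb(private_eval, start=0.0, step=0.4, rounds=20)
print(f"tuned setting: {tuned:.3f}")
```

The loop never needs to know why a candidate is better, only that the eval says so. That is why the quality of your rubrics and evals matters more than the search procedure itself.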

In one example, Microsoft tuned a model for Excel agentic use cases and achieved performance on par with GPT-4.5 at roughly 10x lower cost. For McKinsey, a tuned model reportedly beat GPT-4.5 on their internal tasks while also being 10x more cost-efficient.

Unlike shared model providers, Microsoft emphasized that the benefits of your RLEs and tuning stay with you: the resulting models and improvements are your moat, not shared back into a global model.

Frontier tuning in practice: Land O’Lakes

To show this in action, Microsoft demoed an environment built for Land O’Lakes, one of the largest agricultural businesses in the US. Their use case: generating highly precise butter reports, where 80% accuracy is not good enough.

Frontier tuning supports:

• Environments – encapsulating skills, knowledge, and tools for a domain (e.g., butter report generation).

• Skills and rubrics – not just what the agent can do, but how to judge whether it did it well. Rubrics define “what good looks like.”

• Automatic skill discovery – Microsoft 365 usage (Teams, Outlook, Word, Excel, PowerPoint) is used to suggest candidate skills and rubrics based on how your organization already works.

• Knowledge integration – connecting OneDrive and SharePoint for branding, guidelines, and domain knowledge.

• Tool simulation – agents can practice using tools in a simulated environment, learning without impacting live systems.

By generalizing the learnings into both the main model and the embedding model, Microsoft claims it achieved >90% accuracy on these complex tasks and ~10x efficiency versus baseline models. The result is an agent that writes butter reports that feel distinctly like Land O’Lakes, not a generic AI.
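A rubric in this sense can be as simple as a weighted checklist that turns “what good looks like” into a number a training loop can optimize. The sketch below is an illustration under that assumption. The criteria, weights, and sample reports are invented and are not from the Land O’Lakes demo.

```python
from typing import Callable, List, Tuple

# Each rubric item: (description, weight, check applied to the report text).
Rubric = List[Tuple[str, float, Callable[[str], bool]]]

def score_report(report: str, rubric: Rubric) -> float:
    """Return the weighted fraction of rubric items the report satisfies (0.0 to 1.0)."""
    total = sum(weight for _, weight, _ in rubric)
    earned = sum(weight for _, weight, check in rubric if check(report))
    return earned / total

# Invented criteria for illustration.
rubric: Rubric = [
    ("names the reporting period", 2.0, lambda r: "Q2" in r),
    ("includes a figure", 2.0, lambda r: any(ch.isdigit() for ch in r)),
    ("stays under 50 words", 1.0, lambda r: len(r.split()) <= 50),
]

good = score_report("Q2 butter volume rose to 120 units.", rubric)
weak = score_report("Butter volume rose.", rubric)
print(good, weak)
```

Real rubrics would likely use model-based graders rather than string checks, but the shape is the same: explicit criteria, explicit weights, one score.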

Microsoft Discovery: agentic scientific discovery

Looking beyond business applications, Microsoft is also applying the agentic approach to science through Microsoft Discovery. The goal is to turn the traditional, linear scientific method into a continuous, programmable loop.

Discovery combines:

• Models (for reasoning and simulation)
• HPC compute
• Scientific knowledge graphs
• Automated labs and simulations
• Agent orchestrators

In a demo focused on plastics recycling, a scientist used Discovery inside VS Code to:

1. Ask for a scientific paper on using proteins to recycle plastic instead of melting it.
2. Launch a Discovery engine composed of specialized agents following the scientific method.
3. Generate candidate protein sequences using HPC and evolutionary search (millions of variations).
4. Produce DNA sequences for the most promising proteins.
5. Send those sequences to an automated wet lab, controlled via an agent and Copilot interface, to synthesize and test the proteins.
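Steps 3 and 4 can be sketched at toy scale: mutate a protein sequence, keep the fittest variant each generation, then reverse-translate the winner into DNA. The fitness function here is an invented placeholder, because a real pipeline would score candidates with simulation on HPC. The codon table picks one valid codon per amino acid from the standard genetic code, without any codon optimization.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# One valid codon per amino acid (standard genetic code, not codon-optimized).
CODON = {
    "A": "GCT", "C": "TGT", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGT", "H": "CAT", "I": "ATT", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAT", "P": "CCT", "Q": "CAA", "R": "CGT",
    "S": "TCT", "T": "ACT", "V": "GTT", "W": "TGG", "Y": "TAT",
}

def fitness(seq: str) -> int:
    """Invented placeholder objective: count of S, H, and D residues."""
    return sum(seq.count(aa) for aa in "SHD")

def mutate(seq: str, rng: random.Random) -> str:
    """Replace one randomly chosen residue with a random amino acid."""
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice(AMINO_ACIDS) + seq[i + 1:]

def evolve(seed_seq: str, generations: int, population: int, rng: random.Random) -> str:
    """Each generation, keep the fittest of the current best and its mutants."""
    best = seed_seq
    for _ in range(generations):
        candidates = [mutate(best, rng) for _ in range(population)]
        best = max(candidates + [best], key=fitness)
    return best

def to_dna(protein: str) -> str:
    """Reverse-translate a protein sequence into one possible DNA sequence."""
    return "".join(CODON[aa] for aa in protein)

rng = random.Random(0)  # seeded so runs are repeatable
start = "MKTAYIAKQR"
best = evolve(start, generations=50, population=20, rng=rng)
dna = to_dna(best)
print(best, fitness(best))
print(dna)
```

The demo's version explores millions of variations with physics-based scoring, but the structure is the same: generate, score, select, and hand the winners to the lab as DNA.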

Every step is logged in a knowledge graph, giving scientists full visibility and control. This same pattern can be applied to materials science, life sciences, drug discovery, and more.

Quantum progress: Majorana 2

Microsoft also gave an update on its long-term quantum computing efforts. After announcing its first QPU and experimentally confirming a new state of matter (Majorana-based topological qubits), Microsoft has now reached “Majorana 2.”

Majorana 2 qubits reportedly achieve:

• Qubit lifetimes of 20 seconds to over a minute (versus microseconds in many other approaches)
• The ability to perform complex quantum computations within that lifetime
• A form factor small enough to fit ~1 million qubits on a chip smaller than a credit card

This combination of reliability, speed, and density is what Microsoft believes will eventually enable scalable, fault-tolerant quantum machines. Quantum is being developed alongside Microsoft Discovery so that once hardware is ready, it can plug directly into agentic scientific workflows.

What this all means for developers and companies

Across the keynote, a few themes stood out:

• Agents are the new apps – from Windows to Azure to Microsoft 365, everything is being reshaped around long-running, tool-using agents.

• Edge + cloud is one fabric – local models and cloud models are meant to work together, with “unmetered intelligence” at the edge and massive capacity in the cloud.

• Data and context are the real moat – Microsoft IQ, Frontier tuning, and RLEs are all about turning your data and workflows into a private hill-climbing machine that compounds over time.

• Security and governance are built-in – MXC, Agent 365, MDASH, and Purview integration are meant to make agents safe to deploy at scale.

• New form factors are coming – Project Solara and AI PCs show that AI will live in many devices, not just phones and laptops.

For developers, this means learning how to design, deploy, and govern agents across this stack. For companies, it means thinking beyond “using a model” and toward building your own frontier-tuned systems that reflect your unique knowledge and goals.

Microsoft’s bet is clear: the next decade of computing will be defined by agents, and it wants Windows, Azure, and Microsoft 365 to be the default place where those agents live.