How Anthropic’s Product Team Ships AI Features in Days, Not Months

22 May 2026 22:37
Anthropic’s product lead for Claude Code explains how her team ships AI-native features at an insane pace, what modern PMs actually need to be good at, and how to turn Claude Code and CoWork into real leverage instead of just shiny demos.

Anthropic’s product teams are shipping AI features at a pace most companies can’t even imagine. Timelines that used to be six months are now one month, one week, or even a single day. Underneath that speed is a very different way of doing product management, building with AI, and organizing teams.

This guide breaks down the core ideas behind how Anthropic’s Claude Code and CoWork teams operate, what skills modern PMs actually need, and how you can apply the same patterns to your own AI products and workflows.

How Anthropic’s Product Org Is Structured

Anthropic’s product organization is small but highly leveraged, with roughly 30–40 PMs across a few focused teams:

Model & Research PMs. This group collects feedback on Anthropic’s models, feeds it to the research org, and shepherds model launches end to end.

Claude Developer Platform. This team owns the Claude APIs and platform features like managed agents that other products (including Claude Code) are built on.

Claude Code & CoWork. This group builds the core AI coding assistant (Claude Code) and the general-purpose work assistant (CoWork).

Enterprise. Focused on everything needed for large customers: RBAC, security, cost controls, compliance, and deployment models that enterprises trust.

Growth. A cross-cutting team that drives adoption and usage across Claude Code, CoWork, and the developer platform.

Across all of these, the culture is the same: low process, high trust, and a bias toward shipping quickly.

Why AI-Native Product Management Is Different

Most PMs are still operating with a pre-AI mindset: long roadmaps, heavy coordination, and expensive engineering cycles. Anthropic’s experience suggests that doesn’t work anymore.

1. Timelines have collapsed. With AI-accelerated engineering and rapidly improving models, feature timelines have shrunk from six months to weeks or days. Planning on 6–12 month horizons is often too slow.

2. Code is cheap, decisions are expensive. As models write more of the code, the scarce skill isn’t implementation—it’s deciding what to build, why, and for whom. Product taste becomes the real leverage.

3. Roles are blurring. Engineers are doing PM work. Designers are writing code. PMs are building evals and even small tools. The most effective people are comfortable wearing multiple hats and caring more about outcomes than job labels.

4. The job is to compress idea → live feature. The best AI PMs obsess over shortening the distance between a promising idea and a real feature in users’ hands—ideally within a week.

If you want a deeper dive into using Claude Code this way in your own work, check out this guide to being truly productive with Claude Code.

The Process Behind Shipping Weekly (or Daily)

Anthropic’s speed isn’t just about having access to frontier models. It’s about how the teams work.

1. Ruthlessly clear goals

LLMs are general-purpose, which creates huge ambiguity: Who is this for? What problem is it solving? What should work out of the box?

PMs narrow this down aggressively. A typical goal might look like:

“Our key user is professional developers at enterprises. The problem: too many permission prompts and fatigue. The goal: safely get them to zero permission prompts for most workflows.”

That level of clarity instantly rules out a lot of noise and keeps the team focused.

2. Ship in “research preview” by default

Most Claude Code features launch first as clearly labeled research previews. That means:

– Users know it’s early, experimental, and may change or disappear.

– The team can ship in a week or two without committing to full long-term support.

– Real-world feedback arrives quickly, which shapes whether the feature becomes core, gets redesigned, or gets killed.

3. A tight launch pipeline, not heavyweight process

Instead of big launch committees, Anthropic uses a lightweight but reliable launch pipeline:

– Engineers dogfood a feature internally.

– When it feels ready, they post it in an internal “evergreen launch room.”

– Docs, PMM, and DevRel jump in immediately to write docs, announcements, and examples—often within a day.

This makes shipping the default, not the exception. PMs are responsible for setting up and maintaining this system so anyone can take an idea to production with minimal friction.

4. Metrics and principles instead of giant PRDs

Anthropic still uses PRDs, but only when a feature is truly ambiguous or infra-heavy. Day to day, two things matter more:

Weekly metrics readouts. The entire team reviews core metrics together, so everyone understands what’s working, what’s not, and what drives the business.

Team principles. Each team writes down who their key users are, why, and what tradeoffs they’re willing to make. This lets engineers and designers make decisions without waiting on PM approval.

The result: fewer documents, more shared context.

The New Core Skill: Product Taste for AI

As models get better, the hardest part isn’t coding—it’s knowing what to build and when. Anthropic keeps coming back to one phrase: product taste.

Product taste in the AI era looks like:

Understanding model capabilities now, not just in a sci-fi future. It’s easy to imagine a world where a super-intelligent model can do anything from a single text box. The hard part is designing products that get the most out of today’s imperfect models.

Guiding users onto the “golden path.” Good AI products gently steer users into workflows that align with model strengths and hide or patch over weaknesses.

Choosing what not to build. Anthropic gets tens of thousands of GitHub issues and requests. Taste is knowing which 5% are worth building and how to design them in a way that feels delightful, not bolted on.

Having a feel for effort vs. impact. An engineering background helps—at least for now—because it lets you quickly tell when something is a one-hour change vs. a multi-month project, and prioritize accordingly.
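
One rough way to make the effort-versus-impact judgment explicit is to score requests and rank them by impact per hour of effort. The sketch below is illustrative only: the Request fields, the 1-10 impact scale, and the sample backlog are assumptions made for this example, not something Anthropic describes using.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """A hypothetical feature request with rough, hand-assigned estimates."""
    title: str
    impact: int          # assumed 1-10 scale, higher is better
    effort_hours: float  # rough engineering estimate


def rank_by_leverage(requests: list[Request]) -> list[Request]:
    """Sort requests by impact per hour of effort, highest first."""
    return sorted(requests, key=lambda r: r.impact / r.effort_hours, reverse=True)


def top_fraction(requests: list[Request], fraction: float) -> list[Request]:
    """Keep the top `fraction` of ranked requests, but always at least one."""
    ranked = rank_by_leverage(requests)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


# Made-up backlog, purely for illustration.
backlog = [
    Request("Fewer permission prompts", impact=9, effort_hours=40),
    Request("Rename a menu item", impact=2, effort_hours=1),
    Request("Rewrite the sync engine", impact=7, effort_hours=600),
    Request("Keyboard shortcut for retry", impact=4, effort_hours=1),
]

for request in top_fraction(backlog, 0.5):
    print(request.title)
```

A ratio like this tends to favor tiny changes over larger bets, so it works best as a first pass that a person with product taste then overrides.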

How Anthropic Actually Uses Claude Code and CoWork

Claude Code and CoWork aren’t just products—they’re the internal engine that lets Anthropic move so fast. Here’s how they’re used in practice.

When to use Claude Code vs. Desktop vs. Web/Mobile

Claude Code in the terminal. Used for one-off or small batches of coding tasks, especially when you want the very latest features. This is the most powerful surface and often gets new capabilities first.

Claude Code Desktop. Ideal for front-end work and anything visual. The preview pane lets you see a live web app as Claude edits it. It’s also friendlier for non-technical users who don’t love the terminal.

Web and mobile. Best for kicking off tasks on the go. You don’t always want to be tethered to a laptop; mobile lets you dispatch and monitor tasks from anywhere.

If you want to go deeper on turning Claude Code into a real engineering multiplier, see this walkthrough of using Claude Code as an AI engineering team.

When to use CoWork

CoWork is for everything where the output is not code:

– Writing and polishing docs, specs, and launch plans

– Preparing customer meetings and follow-ups

– Summarizing Slack, email, and call notes

– Creating slide decks and internal presentations

The first step is always the same: connect your real work data. Anthropic PMs hook up:

– Google Calendar

– Slack

– Gmail

– Google Drive (and sometimes Figma)

Once CoWork has that context, it can do surprisingly heavy lifts. For example, one talk for Anthropic’s Code with Claude conference was drafted almost entirely by CoWork:

– The PM fed it: PMM notes, an old deck they didn’t like, internal launch channels, and public launch posts.

– CoWork proposed an outline, the PM edited it, then CoWork generated a full 20-page deck.

– Because it had access to Anthropic’s slide template and design system, the deck looked like it came from a designer, not an AI.

The PM’s job shifted from “build a deck from scratch” to “decide what story to tell and refine the details.”

Building Internal Tools on Top of Claude

One of the biggest second-order effects of Claude Code inside Anthropic is the explosion of small, custom internal tools.

Because it’s so easy to build prototypes, teams no longer wait for generic SaaS tools to fit their workflows. They just build what they need:

Example: Auto-customized sales decks. A sales rep built a web app that:

– Stores a library of proven Claude Code decks (101, 201, “mastering Claude Code”).

– Pulls customer context from Salesforce, Gong, and notes.

– Automatically customizes slides based on stack (e.g., Bedrock, Vertex), concerns (security, code review), and deployment model.

What used to be 20–30 minutes of manual work per customer is now a few seconds.
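
The core of a tool like this can be sketched as tag-based slide selection. This is a minimal illustration, not the sales rep’s actual app: the Slide and CustomerContext types, the tag names, and the sample library are all hypothetical, and a plain object stands in for whatever a real tool would pull from Salesforce, Gong, or notes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Slide:
    title: str
    tags: frozenset = field(default_factory=frozenset)  # empty = always include


@dataclass(frozen=True)
class CustomerContext:
    """Stand-in for context a real tool might pull from a CRM or call notes."""
    stack: str
    concerns: frozenset


def customize_deck(library: list[Slide], customer: CustomerContext) -> list[Slide]:
    """Keep untagged slides plus any slide tagged for this customer's stack or concerns."""
    wanted = {customer.stack} | set(customer.concerns)
    return [s for s in library if not s.tags or s.tags & wanted]


# Hypothetical slide library and customer.
library = [
    Slide("What is Claude Code?"),
    Slide("Deploying on Bedrock", frozenset({"bedrock"})),
    Slide("Deploying on Vertex", frozenset({"vertex"})),
    Slide("Security overview", frozenset({"security"})),
    Slide("Code review workflows", frozenset({"code review"})),
]
customer = CustomerContext(stack="bedrock", concerns=frozenset({"security"}))

deck = customize_deck(library, customer)
print([s.title for s in deck])
```

Library order is preserved, so the deck keeps its narrative flow while dropping slides that do not apply to this customer.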

Similar patterns show up across applied AI, support, and product: small, purpose-built apps that give teams leverage instead of forcing them into generic workflows.

How Anthropic Thinks About Models, Harnesses, and Product

Every new model release forces product teams to revisit what they’ve built. Two big patterns show up.

1. The model will eat your harness

Early versions of Claude Code needed a lot of scaffolding to get reliable behavior. For example, to ensure a refactor touched every call site, the team added a to-do list tool and repeatedly reminded the model to finish all items.

With newer models (Opus 4.x, Sonnet 4.6), Claude naturally uses that to-do list without heavy-handed prompting—or sometimes doesn’t need it at all. The feature is still nice for users, but it’s no longer a crutch.

With each model jump, the team:

– Rereads the entire system prompt.

– Removes reminders and hacks the model no longer needs.

– Simplifies the harness so the model can just do its job.
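
To make the idea concrete, here is a minimal sketch of a to-do list with a reminder that can be switched off once a model no longer needs it. It is not Claude Code’s actual harness: TodoList, end_of_turn_reminder, and the model_needs_reminders flag are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TodoList:
    """Tracks work items, e.g. each call site a refactor must touch."""
    items: dict[str, bool] = field(default_factory=dict)  # item -> done?

    def add(self, item: str) -> None:
        self.items.setdefault(item, False)

    def complete(self, item: str) -> None:
        self.items[item] = True

    def unfinished(self) -> list[str]:
        return [item for item, done in self.items.items() if not done]


def end_of_turn_reminder(todos: TodoList, model_needs_reminders: bool) -> Optional[str]:
    """Return a nudge for the model, or None when no nudge should be sent."""
    remaining = todos.unfinished()
    if not model_needs_reminders or not remaining:
        return None
    return "You still have unfinished items: " + ", ".join(remaining)


# Hypothetical refactor touching three call sites.
todos = TodoList()
for call_site in ["api/users.py", "api/orders.py", "cli/main.py"]:
    todos.add(call_site)
todos.complete("api/users.py")

# Older model: the harness nags until the list is empty.
print(end_of_turn_reminder(todos, model_needs_reminders=True))
# Newer model: the list stays as a user-facing feature, the nag is removed.
print(end_of_turn_reminder(todos, model_needs_reminders=False))
```

Keeping the reminder behind a single flag makes it cheap to delete when a new model arrives, which is the simplification step described above.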

2. New models unlock previously impossible features

Other times, model upgrades make previously marginal ideas suddenly viable. Code review is a good example:

– Earlier models could do basic code review, but not reliably enough to be trusted as a gate.

– With the latest models, Anthropic runs multiple review agents in parallel across the codebase, synthesizes real issues, and uses Claude’s review as a serious signal before merging PRs.
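
The fan-out-and-synthesize shape of that workflow can be sketched in a few lines. This is an assumption-heavy illustration rather than Anthropic’s pipeline: the three reviewer functions are simple string-matching stubs standing in for model-backed agents, and the "keep findings reported by at least two reviewers" rule is one possible way to synthesize results, chosen here for simplicity.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Reviewer = Callable[[str], set[str]]


def security_reviewer(diff: str) -> set[str]:
    """Stub agent: flags eval() calls."""
    return {"eval on untrusted input"} if "eval(" in diff else set()


def correctness_reviewer(diff: str) -> set[str]:
    """Stub agent: flags eval() calls and bare except clauses."""
    findings = set()
    if "eval(" in diff:
        findings.add("eval on untrusted input")
    if "except:" in diff:
        findings.add("bare except hides errors")
    return findings


def style_reviewer(diff: str) -> set[str]:
    """Stub agent: flags bare except clauses plus a noisy style nit."""
    return {"bare except hides errors", "line too long"} if "except:" in diff else set()


def review_in_parallel(diff: str, reviewers: list[Reviewer], min_votes: int = 2) -> list[str]:
    """Run every reviewer concurrently and keep findings that enough of them agree on."""
    with ThreadPoolExecutor(max_workers=len(reviewers)) as pool:
        results = list(pool.map(lambda review: review(diff), reviewers))
    votes = Counter(finding for findings in results for finding in findings)
    return sorted(f for f, count in votes.items() if count >= min_votes)


diff = "try:\n    result = eval(user_input)\nexcept:\n    pass\n"
reviewers = [security_reviewer, correctness_reviewer, style_reviewer]
print(review_in_parallel(diff, reviewers))
```

Requiring agreement filters out one-off noise (here, the "line too long" nit), which matters if the review is going to be treated as a serious signal before merging.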

The pattern: build the product you think will be possible in the next six months, then swap in new models as they arrive and see if the gap has closed.

What Modern PMs Need to Learn (and Unlearn)

If you’re a PM, founder, or cross-functional leader trying to stay relevant in an AI-native world, a few themes stand out.

1. Be the right amount of “AGI-pilled”

Everyone can imagine the end state where a super-intelligent model can do anything from one prompt. But you can’t build for that world yet.

Your job is to:

– Understand what today’s models can and can’t do.

– Design products that squeeze the maximum capability out of them.

– Continuously adjust as capabilities improve every few months.

2. Spend real time with the model

You can’t outsource this. The best PMs at Anthropic:

– Use Claude heavily in their own work.

– Ask the model to introspect: “Why did you skip this step?” “Why didn’t you test the UI?”

– Use those answers to refine prompts, tools, and harnesses.

– Build and run small evals for critical behaviors (e.g., memory quality, code review accuracy).

You don’t need hundreds of evals—10 good ones for a feature can be transformative.
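
A small eval suite does not need much machinery. The sketch below shows one possible shape, under stated assumptions: EvalCase, run_evals, and pass_rate are hypothetical names, the two cases are made up, and stub_model returns canned text in place of a real model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    name: str
    prompt: str
    passes: Callable[[str], bool]  # judges the model's output


def run_evals(model: Callable[[str], str], cases: list[EvalCase]) -> dict[str, bool]:
    """Run each case through the model and record pass/fail by case name."""
    return {case.name: case.passes(model(case.prompt)) for case in cases}


def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of cases that passed."""
    return sum(results.values()) / len(results)


def stub_model(prompt: str) -> str:
    """Placeholder for a real model call; returns canned text."""
    if "review" in prompt:
        return "Found 1 issue: possible null dereference on line 12."
    return "Done."


# Two made-up cases; a real suite might have around ten per feature.
cases = [
    EvalCase("flags_null_deref", "review: return user.name", lambda out: "null" in out.lower()),
    EvalCase("mentions_tests", "refactor the parser", lambda out: "test" in out.lower()),
]

results = run_evals(stub_model, cases)
print(results, pass_rate(results))
```

Because the model is just a callable, the same cases can be rerun unchanged against each new model or prompt revision to see whether a behavior improved or regressed.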

3. Find your “human eval” partners

Some users are much better than others at evaluating models and articulating what’s off. Anthropic leans heavily on:

– A small set of internal power users (e.g., the Claude Code team itself).

– Specialists like the person who shapes Claude’s character and tone.

Their qualitative feedback guides what hypotheses to test and what metrics to pull.

4. Build automations that are actually 100% reliable

Many people stop when an automation is “95% good.” But if it fails 1 in 20 times, you can’t trust it—and you’ll keep doing the work manually.

Anthropic’s advice:

– Pick a real workflow you do often (e.g., triaging email, preparing customer briefs, updating a dashboard).

– Invest the time to get it from 95% to 100% reliability.

– Only then treat it as an automation and move on to higher-leverage work.

This is where the real time savings come from.
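
A quick calculation shows why the last 5% matters. Assuming each run succeeds or fails independently (a simplification), the chance of hitting at least one failure across n runs is 1 minus the success rate raised to the power n. The function name below is made up for this example.

```python
def chance_of_any_failure(success_rate: float, runs: int) -> float:
    """Probability of at least one failure across independent runs."""
    return 1.0 - success_rate ** runs


for rate in (0.95, 0.99, 1.0):
    risk = chance_of_any_failure(rate, 20)
    print(f"{rate:.0%} reliable, 20 runs: {risk:.0%} chance of at least one failure")
```

At 95% reliability, 20 runs carry roughly a 64% chance of at least one failure, and even 99% leaves about 18%. That is why a "mostly works" automation still has to be checked by hand every time.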

How to Personally Thrive in an AI-First World

Underneath all the process and tooling, there’s a mindset shift that Anthropic looks for and encourages.

Lean into the chaos. The pace of change is only going to increase. People who stay calm, optimistic, and curious under constant P0s are the ones who don’t burn out.

Just do things. Don’t wait for permission or perfectly defined roles. If you see a gap—whether it’s a missing feature, a broken process, or an opportunity to help a customer—take a first step, learn, and adjust.

Use AI to remove your grunt work. Any time you find yourself doing the same manual task more than a couple of times, ask: could Claude Code or CoWork do this for me? Then invest in making that automation solid.

Focus on real, recurring value. It’s easy to build shiny one-off demos. The real leverage comes from tools and workflows you use every day that compound over time.

AI isn’t just another feature area. For teams like Anthropic’s, it’s the foundation of how they work, ship, and make decisions. The sooner you align your own habits and skills with that reality, the more upside you’ll capture from what’s coming next.
