Make your agents trustworthy at complex tasks

Open-source Claude Code plugin that wraps structure around your agent — workflows, quality gates, and automated reviews.

Get Started in Claude Code!
Click to copy, then paste into your terminal
$ claude plugin marketplace add Unsupervisedcom/deepwork ...

No telemetry. Open source. Ask Claude to run a security review on it.

“I used to spend an hour going back and forth to get a few pieces of content. With DeepWork I do 5x that and waste much less time on revisions.”

“I kick off a workflow, let it run in the background, and feel assured I’ll get great quality output when I check back.”

“I feel like I’m pair programming with Claude Code. DeepWork makes it more like I’m assigning tasks to someone whose output I can trust to be pretty good.”

How it works

You’re already using Claude Code. DeepWork makes the output reliable.

1

Create a job from any conversation

Midway through a task, realize it should be repeatable? Just say /deepwork create a job from what we just discussed. DeepWork reverse-engineers your conversation into a structured workflow — steps, quality gates, input/output contracts.
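To make the idea concrete, here is a toy sketch of what a captured job definition might contain. The field names, job name, and gate wording are invented for illustration; they are not DeepWork's actual format.

```python
# Hypothetical job definition: steps, quality gates, and input/output
# contracts captured from a conversation. All names are illustrative.
job = {
    "name": "weekly-changelog",
    "inputs": {"repo_url": "string"},           # input contract
    "outputs": {"changelog": "markdown file"},  # output contract
    "steps": [
        {
            "id": "collect",
            "instructions": "List merged PRs since the last release.",
            "quality_gate": "Every PR since the last tag appears exactly once.",
        },
        {
            "id": "draft",
            "instructions": "Summarize each PR in one user-facing sentence.",
            "quality_gate": "No entry mentions internal ticket numbers.",
        },
    ],
}

print(f"{job['name']}: {len(job['steps'])} steps, each with a gate")
```

The key structural point is that every step carries its own gate, so quality checks live in the definition rather than in a prompt.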

2

Quality gates catch mistakes before they compound

Each step must pass its quality gate before the agent moves on. The agent can’t skip ahead, invent shortcuts, or ignore its own errors. An MCP server enforces the process — not a prompt, not a suggestion.
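A minimal Python sketch of the enforcement idea: a step only completes when its gate passes, so advancing is structurally impossible rather than merely discouraged. The function names and the toy gate are invented for illustration, not DeepWork's actual MCP implementation.

```python
def run_job(steps, run_step, check_gate):
    """Run steps in order; a step completes only when its gate passes.

    run_step(step) -> output; check_gate(step, output) -> bool.
    (A real implementation would cap retries; omitted for brevity.)
    """
    results = []
    for step in steps:
        while True:
            output = run_step(step)
            if check_gate(step, output):
                break  # gate passed: the agent may move on
            # gate failed: the step is retried, never skipped
        results.append(output)
    return results

# Toy demo: each step's gate passes on the second attempt.
attempts = {}

def run_step(step):
    attempts[step] = attempts.get(step, 0) + 1
    return attempts[step]

def check_gate(step, output):
    return output >= 2

results = run_job(["outline", "draft"], run_step, check_gate)
print(results)  # each step needed two attempts to clear its gate
```

The point of the loop structure is that there is no code path that reaches the next step without a passing gate.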

3

Learn from output quality

Run /deepwork learn after a job completes. DeepWork diagnoses what went wrong, updates the job definition automatically, and makes the next run better. No manual editing required.

4

Reviews verify every change automatically

Define rules like “check every markdown doc for consistent formatting” or “verify all config files match the schema.” Targeted review agents fire only when matching files change. Scales to thousands of rules without slowing anything down.

Five systems that reinforce each other

Even human geniuses are not renowned for reliably following a process, and models share many of the same failure modes. These five systems reduce the variance.

Workflows

Make anything a repeatable process with quality gates the agent can’t skip.

See how

Do a task once with Claude — research, write, iterate until you’re happy. Then tell DeepWork to capture it. It reverse-engineers your conversation into a step-by-step process with quality gates between each step. Next time, the agent follows the same process. It literally can’t skip ahead until each gate passes.

Learn

After each run, DeepWork rewrites its own instructions to improve the next one.

See how

After a workflow runs, /deepwork learn reads the conversation and figures out what went well and what didn’t. It sorts findings into three buckets: things that improve the workflow for everyone, context specific to this project, and issues that can be prevented with automation. Then it rewrites the job instructions, drops context files where the agent will find them, and creates new review rules or schemas so the same class of mistake can’t happen again.

DeepPlan

Explores competing approaches and stress-tests the plan so it runs right the first time.

See how

DeepPlan explores competing approaches and asks you a few sharp questions, so the workflow it builds has already been stress-tested before it runs for the first time. The output isn’t a document — it’s a workflow definition, ready to go.

Reviews

Targeted rules that only run on files that actually changed.

See how

Review rules are config files that say “when these files change, check for these things.” Each rule is scoped to a file pattern, so you can have hundreds without slowing anything down. If no matching files changed, the rule doesn’t even wake up.
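The scoping mechanism can be sketched in a few lines of Python using glob-style matching. The rule names and patterns below are made up for illustration, not DeepWork's config format.

```python
from fnmatch import fnmatch

# Hypothetical rules, each scoped to a file pattern.
rules = [
    {"name": "markdown-style", "pattern": "docs/*.md"},
    {"name": "config-schema", "pattern": "config/*.yaml"},
    {"name": "sql-review", "pattern": "queries/*.sql"},
]

changed = ["docs/intro.md", "src/main.py"]

# Only rules whose pattern matches a changed file ever run;
# the other rules are never even evaluated.
triggered = [
    r["name"] for r in rules
    if any(fnmatch(f, r["pattern"]) for f in changed)
]
print(triggered)  # only the markdown rule wakes up
```

Because selection is a cheap pattern match, the cost of an idle rule is near zero, which is what lets the rule count grow without slowing reviews.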

DeepSchemas

Auto-generated file contracts that validate output the moment it’s written.

See how

Files have unwritten rules — this config needs these fields, this report needs that structure. DeepWork generates schemas automatically when you create a workflow, and Learn refines them after each run. Any file matching a schema gets validated the moment it’s written. The agent finds out immediately, not three reviews later.
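A toy sketch of validate-on-write, assuming a JSON report with required sections. The schema format and field names here are invented for illustration; DeepWork's generated schemas may look nothing like this.

```python
import json

# Hypothetical schema: a report file must contain these sections.
REPORT_SCHEMA = {"required": ["title", "summary", "findings"]}

def validate_on_write(text, schema):
    """Return the missing required keys; an empty list means valid."""
    data = json.loads(text)
    return [k for k in schema["required"] if k not in data]

# The agent writes a report that is missing a section...
written = json.dumps({"title": "Q3 teardown", "summary": "..."})
missing = validate_on_write(written, REPORT_SCHEMA)
print(missing)  # the gap is flagged at write time, not three reviews later
```

Running the check at write time is the design choice that matters: the feedback loop is one file-write long instead of one review cycle long.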

Example jobs

Each job is a multi-step workflow where the agent validates its own output at critical moments — so you get a finished result, not a first draft.

Blog Post from a Topic

Give it a topic and get back a publish-ready post. The agent researches, outlines, writes, reviews it from your reader’s perspective, and optimizes for SEO.

7 steps, checks itself twice

Product Research from a URL

Point it at a product and get a thorough research report. The agent browses the web autonomously, makes 10+ passes, and won’t stop until every section has real substance.

3 steps, checks itself three times

Competitive Intel from Live Websites

Run it monthly and get a report on what your competitors changed. The agent visits their sites, captures screenshots as evidence, and flags what’s different since last time.

4 steps, checks itself four times

Data Analysis from Your Warehouse

Connect a data source and get a presentation-ready analysis. The agent explores the data, takes notes, then critiques its own findings from six different angles before delivering.

6 steps, checks itself twice

Daily Email Triage from Gmail

Run it on your inbox every morning. The agent labels, archives, and drafts replies in your voice — handles 100+ emails without losing context.

2 steps, checks itself twice

Jobs can chain together — one job’s output becomes the next job’s input.
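Chaining reduces to ordinary function composition: one job's output contract is the next job's input contract. The job names and return shapes below are invented for illustration.

```python
# Hypothetical two-job chain: research feeds writing.
def research(topic):
    return {"report": f"notes on {topic}"}

def write_post(report):
    return {"post": f"blog post based on {report}"}

def run_chain(topic):
    r = research(topic)          # job 1 output...
    return write_post(r["report"])  # ...becomes job 2 input

result = run_chain("vector databases")
print(result["post"])
```

Matching output and input contracts is what makes the handoff safe: job 2 can assume the shape of what job 1 produced.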