Q2 2026 · Issue 2
Stack Quarterly: Quarterly deep dives on the tools real teams actually ship with.

Building a Marketing Agent: A Walkthrough

An end-to-end tutorial: a single specialist agent that takes a brief, drafts a blog post, runs it past an editor, and returns a result you can ship.

This is the kind of piece we publish less often than we should: a working tutorial. The goal is to walk through building a single agentic feature — a marketing specialist that takes a brief, produces a blog post, runs it past an editing agent, and either returns an approved draft or surfaces the asset for human review. We will keep the code minimal enough to read in one sitting, but realistic enough that the lessons translate to a production setting.

We will write the example in Python with the anthropic SDK for the LLM calls and pydantic for structured output. The structure of the code is what matters more than the specific library choices; you can translate the same shape to TypeScript or any other language you prefer. The principles are language-agnostic.

A note before we start. This tutorial assumes you understand the basics of LLM calls, prompt design, and Python. It does not assume any prior experience with agentic orchestration. By the end you will have a runnable feature and a clear sense of where the design choices we made were principled and where they were pragmatic.

What we are building

The feature has four moving parts.

The first part is the brief. The brief is a structured document that describes what the asset should be — the topic, the angle, the audience, the voice constraints, the off-limits items, and any specific claims to make or avoid. The brief is the input.

The second part is the drafter. The drafter is the specialist agent that takes a brief and returns a draft asset. It is mostly an LLM call with carefully designed prompts and a strict output schema.

The third part is the editor. The editor is the specialist agent that takes a draft and returns either an approval or a revision request. The editor’s job is to catch the failure modes the drafter is most likely to produce.

The fourth part is the loop. The loop runs the drafter and the editor in sequence, with a bounded number of revisions, and escalates to a human if the bound is hit.

Total scope: about a hundred and fifty lines of Python, plus prompts and schemas.

The brief

The brief is a typed object. Using a typed object — rather than free-form text — is the most important design choice in the whole tutorial. It means every downstream component knows exactly what to expect, and it means we can validate the brief at the boundary rather than discovering its shape is wrong six function calls in.

from typing import Literal
from pydantic import BaseModel, Field

class Brief(BaseModel):
    topic: str = Field(..., description="The subject of the asset.")
    angle: str = Field(..., description="The specific take.")
    audience: str = Field(..., description="Who the asset is for.")
    voice: Literal["dry", "warm", "punchy", "practitioner"]
    claims_to_make: list[str] = Field(default_factory=list)
    claims_to_avoid: list[str] = Field(default_factory=list)
    target_word_count: int = Field(default=800, ge=200, le=3000)

A few notes on this schema. voice is a Literal, which behaves like an enum rather than a free string, because we want the drafter to receive a categorical instruction rather than a fuzzy one. claims_to_make and claims_to_avoid are lists, because there are usually several of each and we want them all surfaced. target_word_count has bounds, because asking the model for a million-word blog post is not a thing we want to do by accident. None of these constraints is revolutionary. All of them prevent bugs.
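To make the "validate at the boundary" point concrete, here is a minimal sketch that feeds bad briefs to the schema. The Brief class is repeated (field descriptions omitted) so the snippet runs on its own, and it assumes pydantic v2. The helper name validation_errors is ours for illustration, not part of pydantic.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class Brief(BaseModel):
    topic: str
    angle: str
    audience: str
    voice: Literal["dry", "warm", "punchy", "practitioner"]
    claims_to_make: list[str] = Field(default_factory=list)
    claims_to_avoid: list[str] = Field(default_factory=list)
    target_word_count: int = Field(default=800, ge=200, le=3000)


def validation_errors(**fields) -> list[str]:
    """Return the names of the fields that failed validation (empty if none)."""
    try:
        Brief(**fields)
    except ValidationError as exc:
        # Each error dict carries a `loc` tuple; its first element is the field name.
        return sorted({str(err["loc"][0]) for err in exc.errors()})
    return []


base = {"topic": "MCP", "angle": "trade-offs", "audience": "engineers"}

# An unknown voice and an out-of-range word count are both rejected up front,
# before any LLM call is made.
print(validation_errors(**base, voice="sarcastic"))
print(validation_errors(**base, voice="dry", target_word_count=1_000_000))
print(validation_errors(**base, voice="dry"))
```

The third call passes validation, so it reports no failing fields.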

The draft

The draft is also a typed object. The shape is intentionally minimal: we want a title, a body, and a small set of meta fields the editor will consult. We do not, in this version, ask the model to also produce SEO meta-tags or an excerpt or a hero image prompt; those are different jobs for different specialists, and keeping the drafter focused makes its prompt simpler and its failures easier to diagnose.

class Draft(BaseModel):
    title: str
    body: str
    word_count: int
    claims_made: list[str] = Field(default_factory=list)
    claims_avoided: list[str] = Field(default_factory=list)

The claims_made and claims_avoided fields are how we make the drafter accountable to the brief. The drafter has to report which of the brief’s required claims it included and which of the off-limits ones it consciously avoided. The editor will use these to verify.
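Because these fields are self-reported, some of the verification can be done in plain Python before an LLM editor sees the draft. The sketch below is not part of the pipeline in this tutorial. The function name bookkeeping_problems and its parameters are hypothetical, it takes plain lists and strings so it runs without pydantic, and it assumes the drafter echoes each claim verbatim, which a paraphrasing drafter would break.

```python
def bookkeeping_problems(
    body: str,
    reported_word_count: int,
    claims_made: list[str],
    claims_avoided: list[str],
    claims_to_make: list[str],
    claims_to_avoid: list[str],
    target_word_count: int,
    tolerance: float = 0.15,
) -> list[str]:
    """Compare a draft's self-reported fields against the brief.

    Assumption: claims are reported verbatim, so set difference is enough.
    """
    problems: list[str] = []

    # Recount the words ourselves rather than trusting the model's arithmetic.
    actual_words = len(body.split())
    if actual_words != reported_word_count:
        problems.append(
            f"word_count reported as {reported_word_count}, counted {actual_words}"
        )
    if abs(actual_words - target_word_count) > tolerance * target_word_count:
        problems.append(
            f"{actual_words} words is outside {tolerance:.0%} of {target_word_count}"
        )

    # Every required claim should be reported as made.
    for claim in sorted(set(claims_to_make) - set(claims_made)):
        problems.append(f"required claim not reported in claims_made: {claim}")

    # Every off-limits claim should be reported as avoided.
    for claim in sorted(set(claims_to_avoid) - set(claims_avoided)):
        problems.append(f"off-limits claim not reported in claims_avoided: {claim}")

    return problems


problems = bookkeeping_problems(
    body="word " * 100,
    reported_word_count=120,
    claims_made=["A"],
    claims_avoided=[],
    claims_to_make=["A", "B"],
    claims_to_avoid=["C"],
    target_word_count=100,
)
for problem in problems:
    print(problem)
```

A check like this only confirms the bookkeeping is consistent. Whether a claim actually appears in the body is still a judgment call for the editor.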

The drafter

The drafter is an LLM call with a strict prompt and structured output. The prompt is the longest single piece of work in this tutorial, and it is the one I encourage you to iterate on most carefully if you adapt this code.

from anthropic import Anthropic
import json

client = Anthropic()

DRAFTER_SYSTEM = """You are a practitioner-voice marketing writer.
You will receive a structured brief and must produce a structured draft
that conforms to the schema. Rules:

1. Write in the voice specified by the brief.
2. Include every claim in `claims_to_make`. Report them in `claims_made`.
3. Avoid every claim in `claims_to_avoid`. Report them in `claims_avoided`.
4. Hit the target word count within 15%.
5. Never invent specifics (named companies, dollar figures, dates,
   quotes) that are not in the brief.
6. Output strictly conforming JSON. No prose around the JSON."""

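def draft(brief: Brief, prior_revision: str | None = None) -> Draft:
    # Build the optional feedback section outside the f-string: a backslash
    # inside an f-string expression is a syntax error before Python 3.12.
    feedback_block = (
        "Prior editor feedback to incorporate:\n" + prior_revision
        if prior_revision
        else ""
    )
    user_prompt = f"""Brief (JSON):
{brief.model_dump_json(indent=2)}

{feedback_block}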

Produce a Draft conforming to this JSON schema:
{json.dumps(Draft.model_json_schema(), indent=2)}
"""
    resp = client.messages.create(
        model="claude-opus-4-7-1m",
        max_tokens=4000,
        system=DRAFTER_SYSTEM,
        messages=[{"role": "user", "content": user_prompt}],
    )
    text = resp.content[0].text
    return Draft.model_validate_json(text)

A few things to notice about this function.

It does no retry logic. If the model returns malformed JSON, the validation will raise, and the caller decides what to do about it. (The loop shown later does not catch the error, so in this tutorial it propagates out of the pipeline.) That is the right division of responsibility. The drafter does one thing.
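One shape that decision could take is a bounded retry around the call. The sketch below is an option, not what the pipeline in this tutorial does. The helper with_validation_retry and the fake drafter are hypothetical, and the helper accepts any zero-argument callable so it can be exercised without an API key. It assumes pydantic v2.

```python
from typing import Callable, TypeVar

from pydantic import BaseModel, ValidationError

T = TypeVar("T")


def with_validation_retry(call: Callable[[], T], max_attempts: int = 2) -> T:
    """Run `call`, retrying only when the output fails schema validation.

    Any other exception (network errors, rate limits) propagates untouched;
    those deserve their own policy.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        try:
            return call()
        except ValidationError as exc:
            last_error = exc  # malformed output: try again, up to the bound
    assert last_error is not None
    raise last_error


class Draft(BaseModel):
    title: str
    body: str


# Stand-in for the real LLM call: the first response is missing a field.
responses = iter(['{"title": "no body"}', '{"title": "T", "body": "B"}'])


def fake_drafter() -> Draft:
    return Draft.model_validate_json(next(responses))


result = with_validation_retry(fake_drafter, max_attempts=2)
print(result.title)
```

The retry is bounded for the same reason the revision loop is: a model that keeps returning malformed JSON should surface as an error rather than spin.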

It accepts an optional prior_revision argument. When the editor asks for a revision, we pass the editor’s feedback to the drafter on the next call. The prompt template includes the prior feedback when present. The drafter does not know whether it is on its first attempt or its third; it just writes the best version it can given what the brief and the prior feedback say.

It uses a strict system prompt with numbered rules. This is not the only way to prompt a drafter. It is the approach that has worked best for me in production, because numbered rules give the drafter and the editor (which we will see in a moment) the same vocabulary to argue over.

The editor

The editor’s job is to check the draft against the brief and the failure-mode checklist. It returns either an approval or a structured revision request.

class EditorReview(BaseModel):
    approved: bool
    issues: list[str] = Field(
        default_factory=list,
        description="Specific lines or sections that need revision."
    )
    feedback: str = Field(
        default="",
        description="One-paragraph guidance for the next revision."
    )

EDITOR_SYSTEM = """You are a senior editor. You will receive a brief
and a draft. Your job is to verify:

1. Every claim in `claims_to_make` appears in the body and is
   accurately reported in `claims_made`.
2. No claim in `claims_to_avoid` appears in the body, and every one
   of them is reported in `claims_avoided`.
3. The voice matches the brief.
4. The word count is within 15% of the target.
5. No invented specifics (named companies, dollar figures, dates,
   quotes) that were not in the brief.

If any check fails, set `approved=false`, list the failing items in
`issues`, and write a single-paragraph `feedback` that tells the
drafter how to revise. If everything passes, set `approved=true`.

Output strictly conforming JSON. No prose around the JSON."""

def edit(brief: Brief, draft_obj: Draft) -> EditorReview:
    user_prompt = f"""Brief (JSON):
{brief.model_dump_json(indent=2)}

Draft (JSON):
{draft_obj.model_dump_json(indent=2)}

Produce an EditorReview conforming to this JSON schema:
{json.dumps(EditorReview.model_json_schema(), indent=2)}
"""
    resp = client.messages.create(
        model="claude-opus-4-7-1m",
        max_tokens=2000,
        system=EDITOR_SYSTEM,
        messages=[{"role": "user", "content": user_prompt}],
    )
    return EditorReview.model_validate_json(resp.content[0].text)

The editor’s prompt mirrors the drafter’s rules. That is intentional. When both specialists share a vocabulary, the editor’s feedback is something the drafter can act on directly without needing translation.
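One way to keep the two rule lists from drifting apart is to generate both from a single list. This is a sketch under our own assumptions rather than how the prompts above were written: SHARED_RULES and render_rules are hypothetical names, and the rule wording is abridged from the two system prompts.

```python
# Each entry pairs the drafter's instruction with the editor's matching check.
SHARED_RULES: list[tuple[str, str]] = [
    ("Write in the voice specified by the brief.",
     "The voice matches the brief."),
    ("Hit the target word count within 15%.",
     "The word count is within 15% of the target."),
    ("Never invent specifics that are not in the brief.",
     "No invented specifics that were not in the brief."),
]


def render_rules(role: str) -> str:
    """Return a numbered rule list for the 'drafter' or the 'editor'."""
    if role not in ("drafter", "editor"):
        raise ValueError(f"unknown role: {role}")
    index = 0 if role == "drafter" else 1
    return "\n".join(
        f"{number}. {pair[index]}"
        for number, pair in enumerate(SHARED_RULES, start=1)
    )


drafter_rules = render_rules("drafter")
editor_rules = render_rules("editor")
print(drafter_rules)
print(editor_rules)
```

Because both lists share numbering, an editor issue that cites "rule 2" points at the same rule in the drafter's prompt.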

The loop

The loop is the orchestration. It calls the drafter, calls the editor, and either accepts or revises. It is bounded — we set a maximum number of revisions, after which the asset is escalated to a human regardless of state.

from dataclasses import dataclass

@dataclass
class Outcome:
    status: Literal["approved", "escalated"]
    final_draft: Draft
    editor_history: list[EditorReview]

def run_pipeline(brief: Brief, max_revisions: int = 3) -> Outcome:
    history: list[EditorReview] = []
    current_draft = draft(brief)

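    # Review the initial draft plus up to max_revisions revisions.
    for revision_number in range(max_revisions + 1):
        review = edit(brief, current_draft)
        history.append(review)

        if review.approved:
            return Outcome(
                status="approved",
                final_draft=current_draft,
                editor_history=history,
            )

        # Not approved. Revise only while the budget allows, so the draft
        # we escalate is always the one the last review describes.
        if revision_number < max_revisions:
            current_draft = draft(brief, prior_revision=review.feedback)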

    # Out of revisions; escalate.
    return Outcome(
        status="escalated",
        final_draft=current_draft,
        editor_history=history,
    )

That is the entire orchestration. The loop is bounded. The escalation case is explicit. The result returns the full editor history, so the human reviewer (or a downstream agent) can see what the editor flagged and what the drafter tried to do about it.

The pattern of bounding the loop and escalating to a human on exhaustion is one I would not ship an agentic feature without. Unbounded loops are the most common failure mode I see in production agentic systems. Bounding them is one line of code that catches an entire class of incidents.
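The bound is easy to test without touching the API if the drafter and editor are passed in as functions. The sketch below is a stripped-down, dependency-free variant of the loop, not a drop-in replacement: run_bounded and the two fake specialists are hypothetical, drafts are plain strings, and in this variant every draft is reviewed before the loop escalates.

```python
from typing import Callable, Optional


def run_bounded(
    draft_fn: Callable[[Optional[str]], str],
    edit_fn: Callable[[str], tuple[bool, str]],
    max_revisions: int = 3,
) -> tuple[str, str, int]:
    """Return (status, final_draft, number_of_reviews)."""
    current = draft_fn(None)
    reviews = 0
    # One review for the initial draft, plus one per allowed revision.
    for revision_number in range(max_revisions + 1):
        approved, feedback = edit_fn(current)
        reviews += 1
        if approved:
            return "approved", current, reviews
        if revision_number < max_revisions:
            current = draft_fn(feedback)
    return "escalated", current, reviews


# Fake specialists: the editor never approves, so the loop must hit its bound.
calls = {"drafts": 0}


def fake_draft(feedback: Optional[str]) -> str:
    calls["drafts"] += 1
    return f"draft v{calls['drafts']}"


def never_approve(text: str) -> tuple[bool, str]:
    return False, f"{text} still needs work"


status, final, reviews = run_bounded(fake_draft, never_approve, max_revisions=3)
print(status, final, reviews)
```

A test like this pins down the worst-case cost of a run: with max_revisions=3 the fake drafter is called four times and the fake editor four times, and no more.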

A run, end to end

Putting it together:

brief = Brief(
    topic="Why a startup should adopt MCP for its tool layer",
    angle="Practitioner case: the trade-offs are real but worth it",
    audience="Senior engineers at small AI startups",
    voice="practitioner",
    claims_to_make=[
        "MCP gives you a uniform protocol for tools",
        "The protocol is most worth it once you have 3+ integrations",
        "Server-side rate limiting is the right place to enforce limits",
    ],
    claims_to_avoid=[
        "MCP is the first protocol of its kind",
        "Every team should use MCP for every project",
    ],
    target_word_count=900,
)

outcome = run_pipeline(brief, max_revisions=3)
print(outcome.status)
if outcome.status == "approved":
    print(outcome.final_draft.title)
    print(outcome.final_draft.body)
else:
    print("Escalated. Last editor review:")
    print(outcome.editor_history[-1].feedback)

The print statements are placeholders for whatever your real downstream is — a CMS publishing call, a human-review queue, an instrumentation hook. The pipeline returns enough structured information that the downstream can do whatever it needs to.

What we left out, on purpose

I want to be explicit about what this tutorial does not cover, because the omissions are real production work.

The tutorial does not cover persistence. In a real system, each draft and each review would be persisted to a database, keyed by an engagement and a revision number, so that the human reviewer can see the history and so that the system can recover from a crash mid-loop. The persistence is straightforward and entirely orthogonal to the agentic design; I left it out because it would have doubled the length of the code without teaching anything new.
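For a sense of scale, here is a minimal persistence sketch using sqlite3 from the standard library. Everything in it is an assumption for illustration: the table name revisions, the column names, and the helpers open_store, save_revision, and latest_revision. Drafts and reviews are stored as JSON strings, such as the output of model_dump_json.

```python
import sqlite3
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store, keyed by engagement and revision number."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS revisions (
               engagement_id   TEXT    NOT NULL,
               revision_number INTEGER NOT NULL,
               draft_json      TEXT    NOT NULL,
               review_json     TEXT,
               PRIMARY KEY (engagement_id, revision_number)
           )"""
    )
    return conn


def save_revision(
    conn: sqlite3.Connection,
    engagement_id: str,
    revision_number: int,
    draft_json: str,
    review_json: Optional[str] = None,
) -> None:
    # INSERT OR REPLACE lets us store the draft first and attach the review later.
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO revisions VALUES (?, ?, ?, ?)",
            (engagement_id, revision_number, draft_json, review_json),
        )


def latest_revision(conn: sqlite3.Connection, engagement_id: str):
    """Return (revision_number, draft_json, review_json), or None if absent."""
    return conn.execute(
        "SELECT revision_number, draft_json, review_json FROM revisions "
        "WHERE engagement_id = ? ORDER BY revision_number DESC LIMIT 1",
        (engagement_id,),
    ).fetchone()


store = open_store()
save_revision(store, "eng-1", 0, '{"title": "v0"}', '{"approved": false}')
save_revision(store, "eng-1", 1, '{"title": "v1"}')
print(latest_revision(store, "eng-1"))
```

After a crash, a recovery routine could call latest_revision and resume from the stored revision number instead of starting over.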

The tutorial does not cover observability. In a real system, every LLM call would be logged with full prompt-and-response detail to whichever observability layer the team has chosen. Again, straightforward and orthogonal.
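As a rough illustration, a logging wrapper around each call can be a few lines. The sketch below uses only the standard library. The logger name pipeline.llm, the helper logged_call, and the record fields are hypothetical, and a real team would likely send the record to its own observability layer rather than to logging.

```python
import json
import logging
import time
from typing import Callable

logger = logging.getLogger("pipeline.llm")


def logged_call(name: str, prompt: str, call: Callable[[str], str]) -> str:
    """Invoke `call(prompt)` and log prompt, response, and latency as one JSON record."""
    started = time.perf_counter()
    response = call(prompt)
    record = {
        "call": name,
        "prompt": prompt,
        "response": response,
        "seconds": round(time.perf_counter() - started, 3),
    }
    logger.info(json.dumps(record))
    return response


logging.basicConfig(level=logging.INFO)

# Stand-in for an LLM call so the sketch runs without an API key.
reply = logged_call("drafter", "write a title", lambda prompt: prompt.upper())
print(reply)
```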

The tutorial does not cover human-in-the-loop primitives. In a real system, the escalation case would surface a card to the human reviewer, who would either approve the current draft, edit it, or send it back to the loop with manual feedback. That surface is the topic of a different tutorial; it is also one of the places where the agentic workforce OS approach shines, because the platform provides the card surface as a primitive rather than asking the team to build it.

If you are building a stack that needs to scale beyond a few of these specialist pipelines, the platform path is probably the right call. You can read the case we have made for that approach in this issue’s opinion piece on workforce operating systems, and the working example we keep coming back to lives at Web4Guru’s product surface.

What to take away

The tutorial is small. The lessons are not.

Type your interfaces. A drafter that returns a typed object is a drafter the rest of your stack can rely on. A drafter that returns a string is a drafter you will fight with for the rest of the project.

Specialize your agents. The drafter writes. The editor edits. Do not have one agent do both jobs. The specialization makes each prompt simpler, each failure easier to diagnose, and each iteration cheaper.

Bound your loops. Three revisions, then escalate. Three retries, then surface. Three failed handoffs, then halt. Loops without bounds are the failure mode that takes down production.

Share vocabulary between specialists. The editor’s checks should map directly to the drafter’s rules. When the editor flags a violation, the drafter should be able to read the feedback and know exactly which rule to apply differently next time.

That is the tutorial. Take the code. Adapt it to your domain. Replace the drafter and editor prompts with ones that match your team’s voice. Ship the feature. Then come back and tell us what broke; we will write the follow-up.

— Reza Mokhtari