
Why the Highest-Leverage Work is No Longer Creation
“Is this design beautiful?” is a hard question for Claude to answer. “Does this follow our principles for good design?” gives it something concrete to grade against.
This quote comes from one of the most important blog posts of the past few months: largely unheralded, but one that has stuck with me ever since I read it.
It was written by an Anthropic engineer who was using Claude to create application designs. The work functioned: the pages loaded, the logic held, and the output was usable. But it still felt generic.
When asked to review its own work, Claude would often identify real problems. It would point out that a design lacked originality or craft… but then still “pass” the work, explaining away the issues.
The breakthrough came when “beautiful” was defined more clearly. Instead of asking the model to make something better in the abstract, the team broke beauty into specific criteria: design quality, originality, craft, and functionality.
Importantly… and this was the WOW moment for me… this fundamentally changed the role of the human. The human was no longer responsible for designing every screen. The human was now responsible for defining the standard the work had to meet. And the strength of that standard dictates the quality and differentiation of AI-generated output.
If you believe in the trajectory of AI, then ‘Defining Beautiful’ is no longer just a creative exercise. It is the single most important point of leverage for human expertise.
The Trap of “Acceptable”
Most organizations are currently using AI to create more. More drafts, ideas, summaries, and slides. But “more” is not the same as “better.”
Most AI output is not bad, which is part of the problem. It is usually fine. Complete enough, coherent enough, and professional enough. It is “good” in the most dangerous sense of the word: it is acceptable.
Beautiful is different. In a business context, beautiful does not just mean visually polished. It means the work has judgment, taste, and fit. It understands the audience and the moment. It feels like it came from someone who understands the business, not just the assignment.
- A good campaign brief is complete. A beautiful campaign brief makes the work better before the work begins.
- A good strategy deck has the right sections. A beautiful strategy deck helps a room make a decision.
- A good piece of messaging is on brand. A beautiful piece of messaging sounds like the brand, fits the audience, and resonates deeply.
The Ratchet: Engineering Excellence
To move from “acceptable” to “beautiful,” we have to look at how AI actually improves.
Andrej Karpathy, a leading AI researcher, describes a mental model called The Ratchet. Imagine a tool that only clicks forward. You give an AI a specific goal—for example, “make this code run faster”—and let it experiment. If a change works, the ratchet clicks forward and locks. If it doesn’t, it reverts.
In technical fields, this is easy because “good” is a number. In marketing and strategy, it’s harder. There is no simple test that tells you whether a positioning idea is sharp enough. So, we often treat this work as too “subjective” to define.
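The ratchet can be sketched in a few lines of Python. The cost function and proposal step below are toy stand-ins (a real loop would score runtime, eval results, or rubric grades), but the shape is the point: a change is kept only when a measurable score says it helped.

```python
import random

def ratchet(candidate, score, propose, steps=100, seed=0):
    """Hill-climb toward a measurable goal.

    `score` is the number being optimized (lower is better here);
    `propose` suggests a variation. A proposal that scores worse is
    simply discarded, so quality only ever "clicks forward".
    """
    rng = random.Random(seed)
    best, best_score = candidate, score(candidate)
    for _ in range(steps):
        trial = propose(best, rng)
        trial_score = score(trial)
        if trial_score < best_score:   # the click: lock in the gain
            best, best_score = trial, trial_score
        # otherwise: revert, i.e. keep `best` unchanged
    return best, best_score

# Toy stand-in for "make this code run faster": tune a single
# parameter toward its optimum at 3.0.
score = lambda x: (x - 3.0) ** 2
propose = lambda x, rng: x + rng.uniform(-0.5, 0.5)
best, cost = ratchet(0.0, score, propose)
```

The hard part in marketing and strategy is not this loop; it is supplying a `score` function at all, which is exactly what a written rubric provides.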
But experienced marketers evaluate this every day. They know when a brief is too broad or when a creative idea is clever but not ownable. The issue is that most organizations have never written down how that evaluation works. This “Institutional Taste” has historically lived in the heads of your best people, which made it impossible to scale. Until now.
The Shift: Institutional AI
This is where Institutional AI becomes a competitive advantage. Many companies still treat AI as an individual productivity tool by giving people access and hoping for the best.
The larger value of Institutional AI lies in teaching the machine how your specific organization thinks, decides, and judges. It requires you to capture the standards that already exist in fragmented form:
- What makes a recommendation credible here?
- What does leadership need to see before making a decision?
- What separates polished work from powerful work?
- How do we break these down into the smallest meaningful pieces?
When you define and codify these, you create an “evaluator” the AI can use to grade itself.
From “In the Loop” to “On the Loop”
This doesn’t make humans less important; it changes where we create the most value.
In the early stages of AI adoption, the human stays in the loop. You ask for a draft, you review it, you fix the weak parts, and you move on. You remain the quality control system. You are the bottleneck.
A more scalable model puts the human on the loop.
The human defines the goal, provides the “Golden Record” examples of excellence, and builds the rubric. When you are on the loop, you aren’t checking the work; you are designing the conditions for beautiful work.
How to Start Defining Beautiful
The limiting factor for your AI strategy isn’t the model you use; it’s whether you have given the AI a clear enough definition of excellence to improve against.
Start with the things your organization creates every day: campaign briefs, audience strategies, and executive decks. Then get specific:
- Identify the “Golden Records”: What are the three best examples of this work we’ve ever produced?
- Decompose the Judgment: What did our best people notice in these examples that others missed?
- Build the Rubric: Create 4-5 concrete criteria that move the work from “acceptable” to “beautiful.”
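The steps above can be made concrete as data plus a grading function. This is a minimal sketch, assuming the rubric is a list of named criteria with pass thresholds; the criteria names and check functions are hypothetical stand-ins, not anything prescribed by the source article.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Criterion:
    name: str
    check: Callable[[str], float]  # draft -> score in [0, 1]
    threshold: float               # minimum score to clear "acceptable"

def evaluate(draft: str, rubric: list[Criterion]) -> dict:
    """Grade a draft against each criterion and report what needs revision."""
    scores = {c.name: c.check(draft) for c in rubric}
    failures = [c.name for c in rubric if scores[c.name] < c.threshold]
    return {"scores": scores, "passed": not failures, "revise": failures}

# Toy string checks standing in for real judge calls. In practice each
# check would be an LLM judge prompted with the criterion's written
# standard and your Golden Record examples.
rubric = [
    Criterion("specific audience", lambda d: 1.0 if "for" in d else 0.0, 0.5),
    Criterion("single clear ask", lambda d: 1.0 if d.count("?") <= 1 else 0.0, 0.5),
    Criterion("concrete example", lambda d: 1.0 if "e.g." in d else 0.0, 0.5),
]
report = evaluate("A brief for new parents, e.g. first-time buyers.", rubric)
```

The `revise` list is what closes the loop: it tells the model which criteria to address in the next draft, which is what makes the ratchet click forward instead of circling.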
The organizations that get the most from AI will not simply generate the most output. They will evaluate better. They will build systems that turn individual taste into an institutional asset.
The work isn’t the output anymore. The work is Defining Beautiful.

