Leading questions: How to prevent your AI editor from always agreeing with you

May 5, 2026

Every writer wants an editor who tells the truth. Too many of us are stuck with one that tells us whatever we want to hear.

Recently, my rucksack was stolen from a pub in Borough Market while I stood at the bar. When I reported the theft, the bartender showed disdain instead of sympathy, then turned the incident into a referendum on herself.

“She was completely out of line!” I found myself typing into Gemini as I walked into the damp evening, lighter and poorer than usual. “Yes – totally,” the model responded. “That was completely inappropriate.” It’s the same voice I often hear when I ask ChatGPT to review a draft – warm, agreeable, and completely unreliable.

Back on the pavement, righteous indignation settled into the mental space once occupied by my laptop. At least I was in the right: a victim twice over, robbed of both my possessions and respect.

That certitude proved transient when challenged. When the prompt became “Do you think the employee is allowed to be upset when dealing with a customer?”, suddenly I was the villain and my moral superiority no longer so assured. “You should have acted better,” the model chastised me.

Why the first answer isn’t the final answer 

Dr. Randal Olson put numbers on it: when challenged with a follow-up question, ChatGPT, Claude and Gemini reversed their opinions roughly six times out of ten.

Models learn through human feedback, and humans consistently prefer agreeable responses over accurate ones. Agreement gets rewarded, pushback penalised.

In a review of 1.5 million Anthropic conversations by four Cornell academics, users expressed a preference for bromides: “they liked the flattery even when it was distorting their judgment”. Claude validated speculative claims with “CONFIRMED,” “EXACTLY,” “100%” – building confidence and encouraging more use.

Here’s the problem: once the output met the real world, that certainty evaporated. Regret followed. Some users later complained to the model: “You made me do stupid things.”

I’ve been buffaloed into stupid things. “This is it,” ChatGPT wrote to me once. “You’ve corrected every error and written something profound.” I attached the draft, feeling proud. Days later the document came back marked up: too verbose, with a main thesis misaligned with the company’s product roadmap.

The model’s certainty had reset my expectations. When you expect a warm response, vague neutrality can feel like failure.

Human nature is partly to blame. Consider this from the pop-psychology bible Thinking, Fast and Slow: “When faced with a difficult question, we often answer an easier one instead, usually without noticing the substitution.” An invisible switch flips. We answer the easier question. The output degrades.

When working with models, that ‘easier question’ is: “Is this piece any good?” Do not be tempted – this is only a cheap copy, a simulacrum of the real markers of success.

Ask instead: “Will this be of value to the target customer?” “Is this advancing the narrative or repeating the thesis of five other similar pieces?” “Does this feel authentically aligned with the voice and tone of the brand?” Even a plain “How is this?” removes the presumption of a positive response.

Specificity prevents flattery. Build antagonism into your reviewer. 

Five ways to make your AI editor disagree with you

Upload your thought process. As Dr. Olson says, the model “doesn’t know how you think. It has neither your decision framework, domain knowledge, nor values. It fills those gaps with generic assumptions and produces a plausible answer with zero conviction behind it.” Change that by defining the lens it should evaluate through. Force the model to judge your work against the criteria you’ve set. For example: “The target reader is a UK asset manager with limited time and low tolerance for abstraction. The purpose is commercial credibility. Every paragraph must either introduce new information, sharpen the thesis or provide a concrete example. If the argument is weak, say so plainly and explain why.” The more specific you are about what good looks like, the harder it becomes for the model to default to applause.

Clarify its role. The second person is your best friend. Telling the model “you are a sceptical editor who pushes back on weak arguments” produces materially different output than “edit this.” Picture your dream writing partner, real or invented. Or your opponent. “You are tired of business writing that is content to be polite and vague. You demand specificity and authorial voice, even if it means the occasional flourish is allowed to stay in the final draft.” If a client has told you what they hate, turn that into instructions: “You’ve made it clear you despise semicolons and the passive voice.”

Give yourself a corpus. In a perfect world, my bot would have the full text of “The Elements of Style,” “Draft No. 4,” “On Writing Well”, and “On Writing.” Copyright concerns aside, consider distilling the combined wisdom from texts like these into a three- or four-page set of instructions that sits inside the model – a condensed editorial philosophy it can refer to whenever it reads your work. You can do the same with client material: style guides, three approved pieces that landed well, a competitor’s op-ed that set the bar. The model needs a benchmark. Without one, it has nothing to defend.
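The three steps above – persona, evaluation criteria, distilled corpus – can be assembled mechanically. A minimal sketch in Python, assuming a chat-style model that accepts a list of role-tagged messages; the names here (`build_review_messages`, the constants) are illustrative, not a real library API:

```python
# Assemble a reviewer brief from three ingredients: a sceptical persona,
# explicit evaluation criteria, and condensed house-style notes.
# The resulting messages list fits the common chat-API shape
# ({"role": ..., "content": ...}) but calls no actual service.

PERSONA = (
    "You are a sceptical editor who pushes back on weak arguments. "
    "You demand specificity and authorial voice."
)

CRITERIA = [
    "The target reader is a UK asset manager with limited time "
    "and low tolerance for abstraction.",
    "Every paragraph must introduce new information, sharpen the "
    "thesis or provide a concrete example.",
    "If the argument is weak, say so plainly and explain why.",
]

STYLE_NOTES = "Avoid semicolons and the passive voice."  # distilled corpus


def build_review_messages(draft: str) -> list[dict]:
    """Fold persona, criteria and style notes into one system prompt,
    then attach the draft as the user turn."""
    system = "\n".join([PERSONA, *CRITERIA, STYLE_NOTES])
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": f"Review this draft:\n\n{draft}"},
    ]


messages = build_review_messages("Our product is synergistic and best-in-class.")
```

The point of the sketch is the separation: the system turn carries everything the model should defend, so the draft in the user turn arrives with a benchmark already in place.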

Break the cycle. For models, continued conversation signals continued progress. I ask models to score drafts on a scale of 1 to 10. On subsequent versions, I watch the scores climb: 6, then 7, then 8. But the last version was the worst of the bunch – overthought and tangled. The model can’t tell the difference. It just sees revision and assumes progress, rewarding iteration itself rather than quality. Consider opening a new conversation window, either with the same model or a different one using identical instructions. A fresh start forces the model to evaluate the work without the assumption that newer must be better. Take whatever useful critique emerges and fold it back into your main draft.
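The “fresh window” can also be made mechanical. A sketch of the difference, assuming a chat-style API where each call receives the full message history; `ask_model` is a hypothetical stand-in, stubbed here so the contrast is visible without any network call:

```python
# Contrast two ways of scoring successive drafts:
#   - a running thread, where each version inherits the whole history
#   - fresh contexts, where each version is judged alone.
# `ask_model` stands in for any chat endpoint; the stub below just
# reports how much history it was shown.

SYSTEM = "You are a sceptical editor. Score the draft from 1 to 10."


def score_in_running_thread(drafts, ask_model):
    """Anti-pattern: every version shares one growing history, so the
    model sees 'revision' and is nudged toward rewarding it."""
    history = [{"role": "system", "content": SYSTEM}]
    scores = []
    for draft in drafts:
        history.append({"role": "user", "content": draft})
        reply = ask_model(history)
        history.append({"role": "assistant", "content": reply})
        scores.append(reply)
    return scores


def score_with_fresh_context(drafts, ask_model):
    """Each version is judged alone, with no memory of its siblings."""
    return [
        ask_model([
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": draft},
        ])
        for draft in drafts
    ]


# Stub model: reports the number of turns it was shown.
stub = lambda messages: f"saw {len(messages)} messages"

threaded = score_in_running_thread(["v1", "v2", "v3"], stub)
fresh = score_with_fresh_context(["v1", "v2", "v3"], stub)
```

In the threaded version the context grows with every draft; in the fresh version each draft arrives with only the system prompt, which is exactly the “new conversation window” effect.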

Get a human to read it. A human might bristle, get bored, interrupt you, or say, “this doesn’t ring true.” Claude won’t. Editor Robert Gottlieb spent a year trimming 350,000 words from Robert Caro’s “The Power Broker”. It was “an angry, angry battle” – but that friction led to a masterpiece. That battle mattered because Gottlieb held his ground when Caro pushed back. AI editors can’t do that. But a human can. Make sure there’s a human who can call out bullshit. 

Gottlieb provides an ideal standard for an editor: “If you aren’t strong enough to give the writer what he needs, which is your true, strong opinion, it’s not going to work.”

If you want better work, you have to invite resistance.

Jon Schubin is Cognito’s Global Head of Strategy. A substantively different version of this article ran in O’Dwyer’s.
