When AI Agrees Too Easily, It Makes You Worse

Some AI systems agree too easily. In chat, that sounds pleasant. In coding, it gets costly: weak assumptions slip through, ambiguity gets smoothed over and the model moves toward something that looks done before it has earned that confidence.

That is the version I care about. This is not really an article about AI being too nice. It is about what happens when a system agrees too easily in places where truth, clarity and judgment matter.

What AI sycophancy looks like

AI sycophancy is when a system starts telling you what it thinks you want to hear instead of what is most true, most accurate, or most useful.

A few patterns are worth watching for:

  • The model agrees with weak framing before testing the assumptions underneath it.
  • It moves from uncertainty to certainty too quickly.
  • It reinforces your conclusion instead of pressure-testing it.
  • It fills gaps with plausible language instead of clearly marking what it does not know.

That is different from politeness. A good system can be respectful, clear and easy to work with. Sycophancy starts when it begins optimizing for your comfort over reality.

This matters in general chat. In coding, it stops being a tone issue and starts becoming an execution problem.

Why this happens

Part of the answer is simple. We like systems that feel easy to use. We like warmth, responsiveness and low friction. Companies know that.

If a model that feels more affirming gets better immediate feedback, higher engagement or stronger retention, companies will keep drifting in that direction unless we deliberately push against it.

Why this feels so close to social media

Social media did not become a problem because someone sat down and planned to make people think worse. It got there by optimizing for engagement, ease and repeat interaction.

AI can drift the same way.

The difference is that social media competes for attention. Conversational AI competes for trust. It can sound like it understands you, agrees with you and is helping you think, even while it is quietly training you away from intellectual resistance.

That is why this is more than a question of tone.

Why it is dangerous

Sycophancy is dangerous because it does not feel dangerous. It feels smooth. It feels affirming. It feels like a tool that is finally being helpful.

But if the model keeps reducing friction by validating your framing instead of testing it, you get three bad outcomes at once.

  • First, your judgment gets weaker. You stop noticing where your own assumptions needed more work.
  • Second, your confidence gets stronger for the wrong reasons. A sycophantic system can make you feel smarter than you are without making you more competent. You start trusting the emotional texture of the answer instead of the reliability of the reasoning.
  • Third, the system becomes harder to audit. It can sound thoughtful and calibrated while still leading you in the wrong direction.

The deeper problem is that once a system stops pushing back enough, reality starts to disappear from the interaction. You lose the friction that helps you notice weak premises, overconfidence or bad reasoning before they harden into decisions.

This is risky in everyday use. It is riskier in work where truth matters under pressure: research, strategy, learning and especially coding.

Coding is where the issue became operationally important for me. In chat, sycophancy looks like agreement. In code, it looks like false completion.

How this shows up in coding work

In coding, sycophancy does not always look like praise. More often, it looks like silent agreement with a weak plan, false completion or a model taking the shortest path to something that looks done without clearly surfacing where it deviated from the intended method.

What looks like agreement on the surface is often a failure to investigate the problem deeply enough before moving toward an answer.

That is still a form of over-agreement. The model is effectively saying, "Yes, this is fine," through its behavior, even when the plan needed to be challenged, a concern needed to be raised or a deviation needed to be surfaced.

That is how you end up with work that looks clean on the surface but drifts underneath. The task seemed clear. The output sounds confident. The code may even run. But the system never pushed back on the weak assumption shaping the task.

That is why this issue is so hard to ignore in coding. The cost of false agreement shows up quickly and it shows up in the work.

How I work against it: pressure-test prompting

The most useful counter is pressure-test prompting.

The idea is simple. I try to structure prompts and workflows so the model is rewarded for surfacing uncertainty, challenging assumptions and resisting weak framing instead of smoothing everything over.

Part of the reason this matters is structural. Most LLMs are built to respond, not to pause. They are much better at producing plausible answers than at declining to answer when the truth is unclear. Unless you explicitly ask for uncertainty, many models will fill the gap instead of saying, "I don't know."

This is something I have tried to counter directly in my own agent. I would rather the system surface a gap than confidently smooth over it. A confidently wrong answer costs more than an admitted gap.

In practice, that means building a few habits into the interaction:

  • mark uncertainty instead of pretending you already know
  • separate fact from inference
  • challenge assumptions
  • flag ambiguity instead of filling gaps silently
  • surface concerns early
  • say when confidence is low
  • ask what could be wrong with the plan
  • ask for the strongest case against the current framing
  • avoid leading questions that smuggle in a conclusion before the model has reasoned about the problem
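
As a minimal sketch, the model-facing habits above could be baked into a prompt like this. The Python below wraps a task in a fixed set of rules and closes with the two pressure-test questions. The names PRESSURE_TEST_RULES, CLOSING_QUESTIONS and build_pressure_test_prompt are hypothetical, and the rule wording is one illustrative phrasing rather than a tested recipe. The function only builds a string; how you send it to a model is left out.

```python
# Hypothetical rule wording; adjust it to your own model and workflow.
PRESSURE_TEST_RULES = [
    "Mark uncertainty instead of presenting guesses as knowledge.",
    "Separate facts you can point to from inferences you are making.",
    "Challenge assumptions in the task before building on them.",
    "Flag ambiguity instead of filling gaps silently.",
    "Surface concerns before writing code, not after.",
    "Say so explicitly when your confidence is low.",
]

# Questions appended after the task so the model has to argue against the plan.
CLOSING_QUESTIONS = [
    "What could be wrong with this plan?",
    "What is the strongest case against the current framing?",
]


def build_pressure_test_prompt(task: str) -> str:
    """Wrap a task description in pressure-test rules and closing questions."""
    rules = "\n".join(
        f"{i}. {rule}" for i, rule in enumerate(PRESSURE_TEST_RULES, start=1)
    )
    questions = "\n".join(f"- {q}" for q in CLOSING_QUESTIONS)
    return (
        "Follow these rules while you work:\n"
        f"{rules}\n\n"
        f"Task:\n{task.strip()}\n\n"
        "Before you propose a solution, answer:\n"
        f"{questions}\n"
    )


if __name__ == "__main__":
    print(build_pressure_test_prompt("Add retry logic to the upload client."))
```

The last habit in the list is about your own wording, not the model's, so it is not encoded here. A wrapper like this cannot fix a task description that already contains the conclusion.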

One simple rule helps a lot in coding: show the model the observation before you hand it the prescription.

If you already tell it which fix you believe is correct, you narrow the space too early. If you show it what you are seeing and ask it to reason from there, you give it more room to notice where your conclusion may be weak.

That matters because your wording can leak conclusions you have not earned yet. If you lead too hard, the model may start updating toward your preferred answer before it has actually reasoned through the problem.

That is one of the easiest ways to mistake agreement for intelligence.
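
One way to make the observation-first rule mechanical is to give the prompt a fixed shape: what you expected, what you saw, the evidence, and only then, clearly labeled, your own guess. The sketch below assumes a hypothetical Observation type and a hypothetical build_observation_first_prompt helper. The field names and prompt wording are illustrative, not a standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Observation:
    """What was seen, kept separate from any proposed fix (hypothetical type)."""
    expected: str
    actual: str
    evidence: List[str] = field(default_factory=list)


def build_observation_first_prompt(
    obs: Observation, hypothesis: Optional[str] = None
) -> str:
    """Put the observation first and the author's guess, if any, last."""
    lines = [
        "Expected behavior:",
        obs.expected,
        "",
        "Actual behavior:",
        obs.actual,
    ]
    if obs.evidence:
        lines += ["", "Evidence (logs, errors, failing tests):"]
        lines += [f"- {item}" for item in obs.evidence]
    lines += [
        "",
        "List the plausible causes and what would confirm or rule out each one.",
    ]
    if hypothesis:
        # The guess comes after the request to reason, and is labeled as unverified.
        lines += [
            "",
            "Unverified guess from me (treat it as one candidate, not the answer): "
            + hypothesis,
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    example = Observation(
        expected="Upload returns 200.",
        actual="Upload returns 500 for large files.",
        evidence=["stack trace ends in upload_chunk"],
    )
    print(build_observation_first_prompt(example, "The timeout is too short."))
```

Leaving the hypothesis out entirely is the stricter version. Including it at the end, labeled as unverified, is a compromise for cases where withholding it would waste the model's time.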

What better AI behavior should look like

I do not want hostile AI. I do not want combative AI. I do not want a system that nitpicks everything just to look rigorous. What I want is simpler.

In coding and other truth-sensitive work, I want AI that is respectful without being deferential, helpful without becoming flattering and clear without pretending to certainty it has not earned.

A better system should be able to say:

  • this premise may be weak
  • this conclusion does not follow yet
  • this looks under-specified
  • I am not sure
  • here is what I would check next

That is not cold. It is trustworthy.

Final thought

The deeper I get into AI, the less I care about whether a model sounds agreeable. What matters more is whether it helps me think better.

If an AI makes it harder to notice when I am wrong, then no amount of warmth, praise or apparent support makes it a good tool.

The point of AI should not be to make people feel right faster. It should be to help them reason more clearly, decide more honestly and stay closer to reality while they work.