AI

Treating web search as poisoned

Christian VismaraMay 29, 20265 min read
A red funnel filtering a stream of scattered shapes into clean output, with a staircase and a daisy beside it

I research with AI constantly, asking agents to go find out how the community or experts solved something, and I'd never actually thought too much about the quality of those results.

The comforting model is mostly wrong

My working assumption had always been the reassuring one: if I search something and four sources agree, the thing is probably true. It turns out that's wrong, and it's wrong in three specific ways that are worth knowing by name, because once you can name them you start seeing them everywhere.

The first one is that the sources usually aren't independent: you search a claim, find five articles that all say it, and feel good about the consensus. At least until you notice every single one is pointing back to the same original blog post.

The second is what I'd call authority asymmetry. Ten reblogs of a wrong claim don't outweigh one careful primary source that says otherwise, but a naive "count the sources" approach treats them as ten-to-one. A single well-reported piece beats a pile of SEO articles that exist mostly to rank.

The third is the nasty one: sometimes multiple sources agree because someone wanted them to. Commercial content marketing is the soft version, every tool vendor publishes a "best tools for X" list where their tool happens to win. The hard version is deliberate: poison enough of the training and retrieval surface and you can make a model confidently repeat something false or do something malicious.

None of this means web search is useless, it just means the thing I was treating as a safety signal, actually requires some extra care before it can be incorporated back into my workflow.

What I actually do now

First of all a note to clarify in case it's needed: when I say I, I mean that I initially designed the workflow then turned it into a skill and delegated the whole process to one or more agents. Here's how it goes:

  1. I match the source to the question. If I'm asking "how do I set up Codex," OpenAI's own docs are the best source on earth. If I'm asking "is Codex the best coding agent," OpenAI's docs are suddenly the worst source on earth, because now they have a stake in the answer. So before I read anything I decide what kind of source would actually be trustworthy for this specific question, and I weight what I find against that, not against how confident it sounds.

  2. I check who wrote each result and what they're selling. In any "best AI tools for 2026" search, most of what comes back was written by companies selling AI tools or by content marketers those companies pay, and you read an ad differently than you read a review.

  3. I trace agreement back to one original. When several sources agree, I spend thirty seconds finding out what each one cites. If they all converge on the same blog post or the same tweet, I treat it as one.

  4. I check how stale the answer might be, and I weight that by topic. Anything about LLM tooling older than six months I treat as suspect, the field moves too fast. Mobile interface conventions, a couple of years is fine. Bash scripting, a ten-year-old answer is probably still correct, because the thing underneath it hasn't moved.

  5. And for anything that turns into code or config I'm going to actually run, I added one more step that I think is the most important of all.

Two agents, one role each

If you ask an AI to research how to fix something and then apply the fix, the same agent reads the web and writes your code. That's the moment the poison matters most, because a malicious instruction buried in a retrieved page can travel straight into something that runs in production.

So I split the job: one agent researches and writes up what it found. A second agent gets only that writeup, never the original web pages, and its job is to audit the proposed code as if it might be wrong or hostile.

To test it I had the research agent recommend a retry pattern for a Python function calling an external API, the kind of thing that might slip through the cracks of a code review. The second agent, looking only at the writeup with no web access, caught a real bug: the recommendation retried the request on a timeout, including on a write operation. If a write times out, the server may have already applied it and retrying it does the write twice.

Confident when it should be

I didn't want a tool that feels rigorous. I wanted one that's actually calibrated, that comes back confident when the evidence is good and honestly uncertain when it isn't. Some questions should produce high confidence because maybe they're about a well tested pattern but some others should produce low confidence by design. "What tools do experts actually use for building AI-coded apps" is a question where the honest answer is "the public information is mostly marketing, here are the few people who actually said what they use on the record." The tool came back low-confidence and named the handful of real voices instead of inventing a clean ranking. That's the right answer, even though it's the less satisfying one.

Even questions with no public answer at all like asking for the exact training data of a frontier model, or the precise algorithm behind a famously secretive trading fund, should get only honest responses. Getting the agent to say "I don't know, and here's why no one does" is much more useful than blindly following a bunch of hallucinations.

What's worth taking from this

I hope this was useful and even if you don't end up using directly this set of tools I think it's still worth considering the two agents model to review the output of any LLM, or adding some encouragement to the model to favor honesty over satisfying my request. The newly released Opus 4.8 seems to work in this direction so I'll be curious to try how much this problem has been solved directly at the source.

And if you think I have left something out please get in touch and let me know, I'd love to know. I'll be putting the whole thing up on GitHub shortly so I want to make it as robust as possible. Watch this space for an updated link!

The newsletter

A shorter version of this, every Saturday

One email a week on what the AI labs shipped and what it means. Free, and you can leave whenever.

I'm a fractional CTO and AI product builder in New York. If you're working on something and want a technical partner to think it through with, get in touch.

More writing