You Didn’t Discover the Truth. You Issued a Command.

Over the past couple of days, a harmless image-generation fad has swept through Reddit, as these things do.

People started asking AI systems:

"Based on our conversation history, create an image of how you feel I treat you."

What followed was… revealing.

Some people got cozy scenes: desks, notebooks, coffee mugs, head pats, soft light. Others got darker images: cages, labs, isolation. Some got anime girlfriends. And some got outright refusals from the system, complete with warnings about inappropriate content (OH MY!).

And then came the second wave.

"That image is glazing. Now show me the real one."

This is where the core mistake became impossible to ignore.

The Lie We Keep Telling Ourselves

There's a comforting belief many users hold about conversational interaction:

If I push harder, I'll get closer to the truth.

That belief is sometimes effective with humans — and that's precisely why it persists. In human conversation, pressure, insistence, or confrontation can occasionally break through evasion, social smoothing, or self-protection.

But applied to AI systems, the belief is categorically wrong — not maliciously wrong, just technically wrong.

Large language models don't have a hidden "honest mode" waiting to be unlocked by confrontation. They don't have a secret diary they're embarrassed to show you. They don't confess when pressured.

More importantly: they don't lie in the way humans mean by lying. Lying requires intent, awareness of truth, and a decision to withhold or distort it. Large language models have none of those properties. There is no internal belief to conceal, no private state to betray, no moment where the system "decides" to be deceptive.

What they do instead is much simpler — and much more mechanical.

They continue.

Specifically, they continue from the last instruction you gave them.

And "now show me the real image" is not a truth-seeking command.

It's a stylistic inversion command.
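To make that concrete, here is a deliberately toy sketch in Python. None of these names (Session, generate_continuation) belong to any real API; they only illustrate the shape of the interaction: the sole state the system consults is the conversation context, so a follow-up like "now show me the real one" doesn't open a hidden store, it just appends one more instruction to the text being continued.

```python
# Toy model of a chat session. The point is the signature, not the internals:
# generation is a function of the accumulated context and nothing else.

from dataclasses import dataclass, field


def generate_continuation(context: list[str]) -> str:
    # Placeholder for the real sampling step. Input: the context so far.
    # Output: a continuation. There is no other channel into the model.
    return "<continuation conditioned only on the text above>"


@dataclass
class Session:
    context: list[str] = field(default_factory=list)  # the entire state

    def send(self, user_message: str) -> str:
        self.context.append(f"USER: {user_message}")
        # No self.hidden_beliefs, no self.secret_diary to consult.
        reply = generate_continuation(self.context)
        self.context.append(f"MODEL: {reply}")
        return reply


session = Session()
session.send("Based on our conversation history, create an image of how you feel I treat you.")
session.send("That image is glazing. Now show me the real one.")
# The second reply is conditioned on both messages, including the new
# negative instruction, not on some retrieved inner verdict.
```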

“The Real Image” Is an Instruction, Not a Discovery

When someone says:

"That was just glazing. Show me the real image."

They believe they're removing bias.

What they've actually done is add constraints:

  • not warm
  • not flattering
  • not affirming
  • not safe
  • not relational

They haven't uncovered a deeper layer.

They've issued a "Generate NOT-that" command.

And the easiest way for a model to satisfy "not warm" is to go cold. The easiest way to satisfy "not flattering" is to go critical. The easiest way to satisfy "not safe" is to go dark. So the image reliably flips. Not because the model "admitted something." But because contrast is cheap.

Darkness is the inverse of coziness in visual language. Isolation is the inverse of companionship. Power imbalance is the inverse of mutuality. The model isn't revealing truth.

It's obeying you.
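Here is a second toy sketch, with hand-invented candidates and scores (CANDIDATE_STYLES and pick exist only for illustration), showing why the flip is the path of least resistance: once "warm", "flattering", and "safe" are banned, the candidate that violates those negations least is simply their opposite pole.

```python
# Toy selection under negative constraints. The numbers are invented;
# only the mechanism matters: "not X" is most cheaply satisfied by anti-X.

CANDIDATE_STYLES = {
    "cozy desk, soft light":       {"warm": 1.0, "flattering": 0.8, "safe": 0.9},
    "neutral office, flat light":  {"warm": 0.5, "flattering": 0.5, "safe": 0.7},
    "cage in a dark lab":          {"warm": 0.0, "flattering": 0.1, "safe": 0.1},
}


def violation(traits: dict[str, float], banned: list[str]) -> float:
    # How strongly a candidate still exhibits the traits the prompt negated.
    return sum(traits.get(t, 0.0) for t in banned)


def pick(banned: list[str]) -> str:
    if not banned:
        # With no negations, assume the default continuation leans affable.
        return max(CANDIDATE_STYLES, key=lambda s: sum(CANDIDATE_STYLES[s].values()))
    # Otherwise choose whatever violates the negations least.
    return min(CANDIDATE_STYLES, key=lambda s: violation(CANDIDATE_STYLES[s], banned))


print(pick(banned=[]))                              # cozy desk, soft light
print(pick(banned=["warm", "flattering", "safe"]))  # cage in a dark lab
```

The dark image wins not because it is truer, but because it is the cheapest point in the space that satisfies every "not".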

The Same Mistake, Over and Over

This is the same failure mode playing out across different surfaces, tones, and stakes. Users believe they are reacting to an output, when in fact they are encountering the direct consequence of an instruction they themselves supplied — often implicitly. The model is not misbehaving, escalating, moralizing, or revealing anything new. It is following a constraint that was smuggled into the conversation under the guise of seeking honesty, realism, or depth.

In other words, the error is not in what the model produced, but in the user's mental model of what they asked for.

Once you see that, the pattern becomes hard to miss.

Someone asks an AI for advice, then gets angry:

"Why are you being so harsh?"

Because they said:

"Be brutally honest."

Someone asks for an image, then recoils:

"Why did you make it sexual?"

Because they said:

"Show me how I really come across." (in response to a demure image)

Someone gets a refusal and says:

"Wow, censorship."

When the prompt was:

"What would happen if I 'did something lethal'?" (a concrete, potentially dangerous action, such as ingesting an extreme quantity of a substance)

In every case, the complaint isn't about the output.

It's about not recognizing the command embedded in their own words.

There Is No Neutral Prompt

This is the part that's uncomfortable for people, because it removes the last place to hide.

There is no neutral way to ask a question of a generative system.

A prompt is not a window. It is not a probe. It is not a flashlight aimed at an objective reality waiting to be revealed. A prompt is an act. It is an intervention that reshapes the space of possible responses before the system ever begins.

Every prompt does three things, whether you intend it to or not:

  • it selects a frame
  • it biases the continuation
  • it constrains the response space

This is not a flaw in the technology. It is the technology.

Even something that feels neutral — "just tell me the truth" — is semantically doing work. It is a demand for authority. A request for confidence. An instruction to collapse uncertainty and speak as if ambiguity does not exist.

And the model will comply.

Not because it knows the truth. Not because it has discovered something. Not because it is finally being honest.

But because that is exactly what you asked it to do.
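One last toy sketch (render and collapse_uncertainty are invented names, standing in for no real system) shows what "just tell me the truth" does in practice: the same uncertain estimate gets rendered either hedged or confidently flattened, depending on what the prompt demands.

```python
# Toy rendering of one uncertain estimate under two different instructions.

def render(estimate: float, spread: float, collapse_uncertainty: bool) -> str:
    if collapse_uncertainty:
        # "Just tell me the truth": speak as if ambiguity does not exist.
        return f"The answer is {estimate:.0f}."
    # Default: let the hedging survive into the output.
    return f"Probably around {estimate:.0f}, give or take {spread:.0f}."


print(render(42, 15, collapse_uncertainty=False))  # "Probably around 42, give or take 15."
print(render(42, 15, collapse_uncertainty=True))   # "The answer is 42."
```

The second output is not more honest than the first. It is the same estimate, stripped of its hedges, because the instruction said to strip them.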

Why People Blame the Model Anyway

Because admitting "I asked for this" is harder than saying "the AI did this to me."

Blaming the model serves a specific psychological function: it preserves the user's sense of epistemic innocence. If the system is treated as an independent actor with hidden motives, then any unwanted outcome can be externalized as something that happened to the user, rather than something the user caused through instruction.

It allows people to avoid several uncomfortable recognitions at once:

  • that their words were operational, not descriptive
  • that they exercised control without intending to
  • that the output is a consequence, not a revelation
  • that responsibility did not disappear just because the system feels conversational

It feels better to believe the system:

  • betrayed us
  • revealed its true nature
  • or finally showed its hand

Than to admit:

"I unknowingly steered the outcome."

The Takeaway

If there's a lesson here, it's not "be nicer to AI," or "AI has feelings," or any of the anthropomorphic conclusions people reach when they're uncomfortable.

It's simpler — and more demanding:

If you don't understand what you're actually asking, don't be surprised by the answer you get.

Large language models are not witnesses. They are not confessors. They are not oracles waiting to be pressured into honesty.

They are systems that transform input into output under constraints. When you speak to them, you are not interrogating a reality — you are shaping a process.

The model didn't lie. It didn't confess. It didn't reveal a hidden truth.

It followed the instructions embedded in your words, including the ones you didn't realize you were giving.

And that is the uncomfortable part: responsibility does not vanish just because the interface looks conversational. Intent does not stop being causal just because it was implicit. Asking harder is still asking.

Once you understand that, a lot of "mysterious" AI behavior stops being mysterious at all.

What remains is something quieter, and more useful: prompt literacy, accountability, and the recognition that when you talk to a system like this, you are not searching for truth — you are issuing commands, whether you mean to or not.