Why There Are Two Rs in Strawberry
Taxonomy, Linguistics, and the Art of Not Grunting at ChatGPT


It looks like a strawberry. But try eating it.




When you ask ChatGPT how many Rs there are in strawberry, the answer you get might not be what you expect.

Let’s say it tells you: two.

At first, you might think: obviously wrong. The word strawberry contains three Rs.

Case closed. AI is dumb. Humanity wins.

But it’s not that simple. In fact, the mistake isn’t even on ChatGPT’s end — it’s on yours. Or more accurately, it’s in how language works, how taxonomy is structured, and how LLMs (like ChatGPT) are trained to interpret human noise and turn it into structured queries.

Let’s break it down.


First problem: What is a strawberry?

This sounds like a trick question, and in a way, it is.

You might be thinking of something red, juicy, and edible. You’re imagining something with seeds on the outside, maybe dipped in chocolate.

But that’s a physical object. And you can’t count letters in an object.

You can’t count Rs in a strawberry any more than you can measure the vitamin C in the word apple.

This is what René Magritte was getting at in The Treachery of Images: “This is not a pipe.” It’s a representation of a pipe.

Likewise, when you say “how many Rs in strawberry?” — what you’re trying to say is:

“Please count the number of times the letter R appears in the word ‘strawberry’.”

But unless you specify that, ChatGPT has to make an educated guess based on what a “strawberry” typically refers to.
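
Stated that precisely, the task is trivial. Here is the whole thing as a throwaway Python sketch:

```python
# The unambiguous version of the request: count how many times
# the letter R appears in the English word, ignoring case.
word = "strawberry"
print(word.lower().count("r"))  # 3
```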


Second problem: How ChatGPT stores knowledge

Let’s talk about language — and why scaling it isn’t as intuitive as it sounds.

Imagine you’re running a library. You could try to stock a full set of encyclopedias in all 7,159 living languages on Earth. Sounds noble, but the storage requirements would be absurd. Worse, you’d be duplicating the same knowledge thousands of times — just dressed up in different languages.

Or — you could take a different approach: one master encyclopedia in a single language, plus a comprehensive set of dictionaries to translate everything else. Suddenly, your library is lean, efficient, and scalable.

That’s roughly how modern large language models behave. They don’t memorize the internet in 7,159 tongues; they learn deepest in a dominant language and map the rest onto shared internal representations, the dictionary shelf of the analogy. It’s faster, cheaper, and, ironically, more inclusive.

ChatGPT stores compressed representations of patterns in language — not hard-coded facts. To build those patterns, it needed a training corpus. And while that corpus is multilingual, it is not evenly distributed. English is overrepresented. Scientific disciplines (like botany and zoology) lean heavily on Latin. So does medicine. So ChatGPT doesn’t just “think in English” — it also “thinks in Latin” when the subject area demands it.
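
To make “compressed representations” slightly more concrete: the model never sees letters at all. OpenAI publishes its tokenizer as the open-source tiktoken package, and a minimal sketch (assuming tiktoken is installed, and assuming ChatGPT’s input pipeline resembles this public encoding) shows what the model actually receives:

```python
# A minimal sketch, assuming the open-source tiktoken package
# (pip install tiktoken). cl100k_base is one of OpenAI's public
# encodings; whether a given ChatGPT model uses it is an assumption.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("strawberry")
print(tokens)  # a short list of integer token IDs, not ten letters

# Each ID maps back to a chunk of text, not a single character:
for t in tokens:
    print(t, enc.decode_single_token_bytes(t))
```

The word arrives as a few opaque chunks, not as a sequence of letters.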

This brings us to the issue of taxonomy.

When you say strawberry, ChatGPT doesn’t just hear the English word. It starts triangulating: do you mean the word, the fruit, or the plant as botany catalogs it?

And in botany, the genus of strawberries is called: Fragaria.

So if the model thinks you’re referring to the plant genus and not the word, it considers the question in Latin, pulls “Fragaria” as its best match, counts the Rs: two, and then responds in English.
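
You can check both readings yourself in two lines:

```python
# Same counting rule, two candidate referents: the English word
# and the Latin genus the model may have resolved it to.
for candidate in ("strawberry", "Fragaria"):
    print(candidate, candidate.lower().count("r"))
# strawberry 3
# Fragaria 2
```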

Is that a mistake?

Not really. You didn’t tell it which layer of abstraction you were referring to. It did the best it could. It tried to make a vague question meaningful.

That’s what large language models do.


Grunting ≠ Prompting

Something that has stuck in my head since high school is my Polish language teacher telling me to describe things precisely — don’t say buttery butter.

That advice lives rent-free in my brain every time someone yells a half-formed prompt at an AI and gets confused when it doesn’t respond with divine clarity.

Talking to ChatGPT is not the same as barking commands into a walkie-talkie or yelling at Siri.

It’s more like drafting a Google search for a Martian librarian: You need to tell it what realm of meaning you’re operating in — not just the sound of your grunts.

So if you say:

“How many Rs in strawberry?”

It has to interpret: are you asking about the word, the fruit, or the Latin genus?

If you want the word, say:

“Count the number of times the letter R appears in the word ‘strawberry’.”

If you want the Latin genus:

“What is the botanical name of the common strawberry?”

If you want a smoothie recipe, that’s a whole other prompt.
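
Each of those prompts hands the model a pre-resolved intent. As a toy sketch (the intent labels and routing below are hypothetical, a cartoon of the disambiguation the model otherwise does for you):

```python
# Hypothetical intent routing: a cartoon of the disambiguation
# an LLM performs implicitly when a question is vague.
def answer(intent: str) -> str:
    if intent == "count_rs_in_word":
        return str("strawberry".lower().count("r"))  # "3"
    if intent == "botanical_name":
        return "Fragaria (the genus of the common strawberry)"
    if intent == "smoothie_recipe":
        return "That's a whole other prompt."
    return "Ambiguous: which 'strawberry' do you mean?"

print(answer("count_rs_in_word"))
```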


The Takeaway

Large language models don’t actually “know” things the way you do.

They operate probabilistically, weighing options based on likely intent. And when you’re vague, they don’t just sit there confused — they try to help by narrowing the scope to a plausible field of knowledge.
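
As a cartoon of that weighing (the numbers are invented purely for illustration; real models score token sequences, not tidy labels):

```python
# Invented weights over plausible readings of "how many Rs in
# strawberry?" -- illustrative only, not how models represent intent.
interpretations = {
    "Rs in the English word 'strawberry'": 0.55,
    "Rs in the genus name 'Fragaria'": 0.30,
    "a question about the fruit itself": 0.15,
}
print(max(interpretations, key=interpretations.get))
```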

This means ChatGPT will sometimes “hallucinate” — but more often than not, it’s just interpreting your bad question in the most coherent way it can.

The better the prompt, the better the answer.

If you’re going to talk to the future of computing, skip the grunt.

Speak with intention. Use your words.