LLMs don't see words. They see tokens, which is why previously they had a hard time counting the r's.

You can certainly prove that mathematically, and giving that proof to an LLM it will give you the correct answer. Which is a prompting technique btw to improve accuracy of an LLMs results.

Although the more recent LLMs have built in approaches to answer this question.

Gemma4 for example:

-----

The user is asking for the number of 'r's in the word "strawberry".

* S

* T

* *R* (1)

* A

* W

* B

* E

* *R* (2)

* *R* (3)

* Y

There are 3 'r's.

-----

GPT-OSS-120b example.

-----

The user asks: "How many r's in strawberry." This is a straightforward counting problem: count the letter 'r' in the word "strawberry". The word "strawberry" contains letters: s t r a w b e r r y. Count r's: there are three 'r's? Let's check: s(1) t(2) r(3) a(4) w(5) b(6) e(7) r(8) r(9) y(10). Actually the word is "strawberry". Let's write out: s, t, r, a, w, b, e, r, r, y. So there are three r's: at positions 3, 8, 9. Yes, three r's. So answer: 3.

But we may also note that "r" appears three times.

Thus answer: 3.

We can provide a short answer.

Thus final: There are three r's in "strawberry".

----

kang8 hours ago | | | parent | | on: 47762761
Doubt if you can make a dumb model smart by feeding it proofs
EagnaIonat7 hours ago | | | parent | | on: 47763448
Tade05 hours ago | | | parent | | on: 47764180
Sohnds like a great way to fill up the context before you even start.
falcor845 hours ago | | | parent | | on: 47765050
Yes, what's your point? That is literally what it does - it adds relevant knowledge to the prompt before generating a response, in order to ground it me effectively.