My point is that this doesn't scale. You want the LLM to have knowledge embedded in its weights, not prompted in.
It scales fine if done correctly.
Even with the knowledge in the weights, the extra context lets the model move into the correct space.
Much the same as with humans, there are terms that are meaningless without knowing the context: "transformer" means something very different to an electrician than to an ML researcher.
Would it be possible to make GPT-3 from GPT-2 just by prompting? No: it doesn't work, and it doesn't scale.