The core idea is very hard to grasp here.
The story goes like this. We want to write programs. In languages. Programs are represented as text in such a language, i.e. code. We see the code. We can read and comprehend the code. It's not the program we want (say, it's blank). We want to change that. Not by adding or removing text. But via XXX. After doing XXX we see the new code. The new code is as expected. The new code represents a valid program, i.e. it is correct, compiles, and runs.
For that we learn a new language: XXX commands. That is cyclic: to avoid writing in one language, we end up writing in another. To break this cycle, XXX commands must be in the same language as the program. But we weren't supposed to be writing in that language.
A similar problem, with its solution, is this. We want to write programs. In languages. Programs are represented as text. We alter programs by editing such text. The more text we type, the greater the chance of getting it wrong. We want to write as little text as possible. We make abstractions. We split up concerns. We give complex procedures short names. We type only the short name instead of typing out lengthy procedures. The less we type, the less we worry about getting it wrong. We eventually end up with one last short name, the name of our program.
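
The naming idea above can be sketched in a few lines. This is only an illustration, not part of the original story: Python is an arbitrary choice, and every name here (`read_numbers`, `total`, `report`, `program`) is hypothetical. Each procedure gets a short name, and the short names collapse into one last name.

```python
# Hypothetical sketch: all names below are invented for illustration.

def read_numbers(text):
    """Parse whitespace-separated integers out of a string."""
    return [int(token) for token in text.split()]

def total(numbers):
    """Add the numbers up."""
    return sum(numbers)

def report(value):
    """Format the result for display."""
    return f"total = {value}"

def program(text):
    """The one last short name: it stands for everything above."""
    return report(total(read_numbers(text)))

print(program("1 2 3"))  # prints "total = 6"
```

Once the pieces are named, the only thing left to type is `program`. The text behind that name still had to be written once, though, which may be the same cycle in a smaller form.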