When developers talk about building software (cross compiling) for a different machine (target) compared to their own (host) they use the term toolchain
to describe the set of tools necessary to build binary files. That's because when you need to build an executable binary, you need more than a compiler.
You need routines (crt0.o) to initialize runtime according to requirements of operating system and standard libraries. You need standard set of libraries and those libraries need to be aware of the kernel on target because of the system calls API and several os level configurations (f.e. page size) and data structures (f.e. time structures).
On the hardware side, there are different set of ARM architectures. Architectures can be backward compatible but a toolchain by nature is binary and targeted for a specific architecture. You can have the most widespread architecture by default but then that won't be too fruitful for an already constraint environment (embedded device). If you have the latest architecture, then it won't be useful for older architecture based targets.
When you build a binary on your host for your host, compiler can look up all the necessary bits from its own environment or use what's on the host - so most of the above details are invisible to developer. However when you build for a different target than your host type, toolchain must know about hardware, os and standard library details. The way you tell these to toolchain is... by building it according to those details which might require some level of bootstrapping. (or you can do this via extensive set of parameters if toolchain supports / built for it.)
So when there is a generic (stock) cross compile toolchain, it has already some target specifics set and that might not meet your requirements. Please see this recent question about the situation on Ubuntu for an example.