Let's say I have an x86 instruction like this, which reads data from an address in memory:
mov eax, word_123456
Presumably this fetches the data from memory. Now let's say I store it back:
mov word_123456, eax
I know from CPU architecture diagrams that there are caches between main memory (RAM) and the CPU. If I store the contents of a register to memory, does the data always go to the L1 cache first? Who decides which cache level it ends up in? Also, can x86 instructions be written or hinted to specify whether the data being moved should be kept in the cache, or whether it is a rare read/write that should not be cached?
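To make the last question concrete, here is a rough sketch of the kind of hinting I have in mind. It is written in C with SSE/SSE2 compiler intrinsics rather than raw assembly so that it compiles as-is on an x86 target; each intrinsic corresponds to a single x86 instruction, named in the comments. The variable `word_123456` stands in for the memory operand above, and the function name `store_with_hints` is made up for illustration. I am assuming these are the relevant instructions; whether the CPU actually honors the hints is exactly what I am unsure about.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the SSE ones used below */

static int word_123456;  /* stands in for the memory operand in the question */

/* Hypothetical helper: stores to the same location in several ways. */
int store_with_hints(int value)
{
    /* PREFETCHT0: hint that the line will be needed soon, so fetch it
       into the cache hierarchy ahead of time. */
    _mm_prefetch((const char *)&word_123456, _MM_HINT_T0);

    /* Ordinary store, i.e. a plain "mov [mem], reg". */
    word_123456 = value;

    /* MOVNTI: non-temporal store, a hint that the data will not be
       reused soon and should not pollute the caches. */
    _mm_stream_si32(&word_123456, value + 1);

    /* SFENCE: order the non-temporal store before any later stores. */
    _mm_sfence();

    /* CLFLUSH: write back (if modified) and invalidate the cache line
       holding this address. */
    _mm_clflush(&word_123456);

    /* Reload the value; this thread always sees its own last store. */
    return word_123456;
}
```

Calling `store_with_hints(41)` returns 42, because the non-temporal store is the last write to the location. The hints should only affect where the data lives and how fast the accesses are, not the value that is read back.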