I'm trying to disable all levels of cache on my machine (Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz) under Xen. I wrote a tool that calls the following assembly code to disable or enable the cache and to show the value of the CR0 register.
case XENMEM_disable_cache:
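__asm__ __volatile__(
"movq %%cr0, %0\n\t"
"orq $0x40000000, %0\n\t" /* set CR0.CD (bit 30) */
"movq %0, %%cr0\n\t"
"movq %%cr0, %0\n\t" /* read CR0 back for the log */
"wbinvd" /* write back and invalidate this core's caches */
: "=r"(cr0) /* no push/pop of rax: a pop could overwrite %0 if it is rax */
:
: "memory");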
// gdprintk(XENLOG_WARNING, "gdprintk:XENMEM_disable_cache disable cache!
// TODO IMPLEMENT\n");
printk("<1>printk: disable cache! cr0=%#018lx\n", cr0);
rc = 0;
break;
case XENMEM_enable_cache:
__asm__ __volatile__(
"movq %%cr0, %0\n\t"
"andq $0xffffffffbfffffff, %0\n\t" /* ~0x40000000: clear CR0.CD (bit 30) */
"movq %0, %%cr0\n\t"
"movq %%cr0, %0" /* read CR0 back for the log */
: "=r"(cr0)
:
: "memory");
printk("<1>printk: enable cache; cr0=%#018lx\n", cr0);
rc = 0;
break;
case XENMEM_show_cache:
__asm__ __volatile__(
"movq %%cr0, %0" /* reads CR0 of the core this code happens to run on */
: "=r"(cr0));
// gdprintk(XENLOG_WARNING, "gdprintk:XENMEM_show_cache_status! CR0 value is
// %#018lx\n", cr0);
printk("<1>printk: XENMEM_show_cache_status! CR0 value is %#018lx\n", cr0);
return (long)cr0;
The code compiles and runs. After I run the disable-cache code, the system becomes extremely slow, which suggests the cache really is disabled. In addition, the CR0 value printed by the disable-cache code shows the CD bit set.
However, when I run the show-cache code, the output shows the CD bit of CR0 as 0, regardless of whether I have just disabled or enabled the cache.
My question is:
Is the CD bit (bit 30) of the CR0 register always set to 1 while the cache is disabled?
If so, there must be something wrong with my code. Could you please help me find the error?
ANSWER:
The code above only sets the CD bit of the CR0 register on the core where the code is running. We need to use smp_call_function() to run the code on all cores!
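To illustrate why the show-cache call can report CD = 0 after a successful disable, here is a minimal user-space model, not hypervisor code. Everything in it is hypothetical: `NR_CPUS`, the starting CR0 value, and the `model_*` names are made up for the sketch, and `model_on_each_cpu()` merely stands in for running a function on the calling core plus all others. As far as I know, Linux's smp_call_function() runs the function on the other cores only, so the calling core would need to run it too.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS    6                      /* hypothetical core count for the model */
#define X86_CR0_CD (UINT64_C(1) << 30)    /* CR0.CD, the cache-disable bit */

/* One simulated CR0 per core; every core has its own copy of the register. */
static uint64_t model_cr0[NR_CPUS];

/* Give every core a plausible starting CR0 value with CD clear (assumption). */
static void model_init(void)
{
    for (unsigned cpu = 0; cpu < NR_CPUS; cpu++)
        model_cr0[cpu] = UINT64_C(0x80050033);
}

/* Models the disable code from the question: only the executing core changes. */
static void model_disable_cache(unsigned cpu)
{
    assert(cpu < NR_CPUS);
    model_cr0[cpu] |= X86_CR0_CD;
}

/* Models the enable code from the question, again for one core only. */
static void model_enable_cache(unsigned cpu)
{
    assert(cpu < NR_CPUS);
    model_cr0[cpu] &= ~X86_CR0_CD;
}

/* Models the show code: reports CD for whichever core it runs on. */
static int model_cd_is_set(unsigned cpu)
{
    assert(cpu < NR_CPUS);
    return (model_cr0[cpu] & X86_CR0_CD) != 0;
}

/* Stand-in for a cross-call that covers the calling core and all others. */
static void model_on_each_cpu(void (*fn)(unsigned cpu))
{
    for (unsigned cpu = 0; cpu < NR_CPUS; cpu++)
        fn(cpu);
}
```

In this model, calling `model_disable_cache(0)` and then `model_cd_is_set(1)` gives 0, which mirrors the symptom in the question. Only after `model_on_each_cpu(model_disable_cache)` does every core report CD set.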
My new question is:
If I disable the cache and then enable it again using the code above, the CD bit of CR0 is cleared. However, the system is still very slow, just as it was while the cache was disabled. So it looks as if the enable-cache code does NOT work, even though the cleared CD bit says it should have. So the question is: how long do I have to wait after enabling the cache before performance returns to what it was before I disabled the cache?
BTW, when I run the enable-cache code, the printk output shows that CR0's CD bit is 0.
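When reading the printed CR0 values it can help to decode both cache-control bits, CD (bit 30) and NW (bit 29), rather than CD alone. The helper below is a small user-space sketch. The enum and function names are made up, and the mode labels are my paraphrase of the CD/NW table in the Intel SDM, so treat them as an approximation and check the manual for the exact wording.

```c
#include <assert.h>
#include <stdint.h>

#define X86_CR0_NW (UINT64_C(1) << 29)   /* CR0.NW, not write-through */
#define X86_CR0_CD (UINT64_C(1) << 30)   /* CR0.CD, cache disable */

/* The AND mask used by the enable code is the complement of the CD bit. */
static_assert((uint64_t)~X86_CR0_CD == UINT64_C(0xffffffffbfffffff),
              "enable mask must clear exactly bit 30");

/* Hypothetical labels for the four CD/NW combinations. */
enum cr0_cache_mode {
    CACHE_NORMAL,        /* CD=0, NW=0: caching enabled */
    CACHE_INVALID,       /* CD=0, NW=1: invalid combination */
    CACHE_NO_FILL,       /* CD=1, NW=0: no new cache fills */
    CACHE_NO_FILL_NW     /* CD=1, NW=1: no fills, coherency not maintained */
};

/* Classify a raw CR0 value, e.g. one copied from the printk output. */
static enum cr0_cache_mode cr0_cache_mode(uint64_t cr0)
{
    int cd = (cr0 & X86_CR0_CD) != 0;
    int nw = (cr0 & X86_CR0_NW) != 0;

    if (cd)
        return nw ? CACHE_NO_FILL_NW : CACHE_NO_FILL;
    return nw ? CACHE_INVALID : CACHE_NORMAL;
}
```

This only interprets a value that has already been read on one core. It says nothing about the other cores, and nothing about other settings that may affect caching.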
smp_call_function()? It's theoretically possible that your show-cache code is running on a different processor. I also recommend you read Intel's Software Developer Manual, Volume 3, Chapter 11 (Memory Cache Control), specifically section 11.5.1 "Cache Control Registers and Bits". – Iwillnotexist Idonotexist
include/linux/smp.h – Iwillnotexist Idonotexist