3
votes

Context and goal

I'd like to run two fully standalone applications on my Olimex A20 Lime board, which has a dual-core ARM Cortex-A7. The goal is to dedicate one core to each application. So far so good.

Now I'd like to divide the L2 cache between the cores in the following manner:

       L2 cache (256KB)
---------------------------
|    CPU0    |    CPU1    |
|   (128KB)  |   (128KB)  |
---------------------------

Therefore, each core would only have access to its private 128KB of L2 cache.

Question

How can I divide the L2 cache between the cores on an ARM Cortex-A7?

From my understanding, an external cache controller like the PL310 was often used on previous SoCs. Newer cores like the Cortex-A15 and Cortex-A7 instead use an integrated cache controller, which is somehow part of the SCU (Snoop Control Unit).

I've found some cache-related registers in the CP15 coprocessor, like CSSELR, CCSIDR and CLIDR, and even the System Control Register (SCTLR). But none of them seems to let me configure a cache size per core. Is that still possible to do?
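
For what it's worth, this is roughly how I read the geometry back from those registers (a minimal sketch I put together, assuming privileged PL1 execution; the function name is mine, and note it can only read the geometry, not change it):

    #include <stdint.h>

    /* Minimal sketch, assuming privileged (PL1) ARMv7 execution:
     * select the L2 data/unified cache via CSSELR, then read its
     * geometry from CCSIDR. These registers only describe the cache;
     * nothing here allows resizing or partitioning it per core. */
    static uint32_t l2_size_bytes(void)
    {
        uint32_t ccsidr;

        /* CSSELR: Level field [3:1] = cache level - 1 = 1, InD [0] = 0. */
        asm volatile("mcr p15, 2, %0, c0, c0, 0" :: "r"(1u << 1));
        asm volatile("isb");

        /* CCSIDR describes the currently selected cache. */
        asm volatile("mrc p15, 1, %0, c0, c0, 0" : "=r"(ccsidr));

        uint32_t line_bytes = 1u << ((ccsidr & 0x7) + 4);     /* bytes per line */
        uint32_t ways       = ((ccsidr >> 3) & 0x3FF) + 1;    /* associativity */
        uint32_t sets       = ((ccsidr >> 13) & 0x7FFF) + 1;  /* number of sets */

        return sets * ways * line_bytes;                      /* total size */
    }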

Thanks for your help.

Edit

Here, by standalone application I in fact mean a Linux OS. So the goal is to dedicate one core to one OS; each OS therefore runs on (sees) a uniprocessor system underneath. The whole framework is already running, so far so good.

Thanks to the answers I've received, I now understand that it should be OK for the cores to both use the L2 even though they are standalone OSes not using the same virtual mapping. It's actually the same as two processes having their own virtual address spaces.

However the last thing that bothers me is the security aspect:

If both cores share the whole L2 cache, is it technically possible for one core to access cached data of the other core?

Comments

What problem are you trying to solve by doing this? – unixsmurf
It may be possible depending on the configuration of the L2 cache (PL310 or PL4xx), but as unixsmurf implies, it may not be helpful. Say task 1 is memory-bound and task 2 is CPU-bound; then you want the L2 to go to the first task. So while it is possible (given different AXI bus interfaces to the L2), it might not be beneficial. There are probably better ways to spend your effort to make the system better. – artless noise
Think about it: the two OS instances can hit the same cache entries if and only if they access the same physical addresses; if accesses to two different physical addresses could return the same data, the cache would be fundamentally broken. If you really care about isolation, then run SMP Linux with KVM on the board and run a single-core application VM pinned to each host CPU. – Notlikethat
Are you using TrustZone? The L2 is TrustZone-aware and will attempt to keep secure L2 lines locked and not evicted by normal-world L2 activity (same for L1). There is a remote information leak along the lines of Colin Percival's hyper-threading cache-miss attack, but it is even more difficult with TrustZone, as the context-switch granularity is larger. If you don't use TrustZone, then either OS may map the other's physical memory, and the cache is the least of the problems. – artless noise
HDL is hardware description language. ARM supplies the code for the L2 logic, and a vendor may set parameters for this cache. They may have two AXI bus interfaces to the L2 with some sort of prioritization between them; but not all PL310s have this feature, as it is a parameter. There are feature registers in the PL310 interface to determine which parameters were used. – artless noise

2 Answers

3
votes

Two pieces of code that don't use the same physical memory will not cause any cache conflicts, as the cache is physically tagged on A7 processors (as on any ARM processor with the virtualization extensions).

On the A7, cache lines are also VMID-tagged. So if you want to enforce separation between the code running on the two cores, you could set up a second-stage page table for each core and mark them with different VMIDs. Any violation of the address space by EL0/EL1 will then cause a trap to EL2 (hypervisor). This is very similar to how EL1 enforces the separation of EL0 address spaces.

To configure this you will need access to the boot code. Usually the boot code switches directly from secure EL1/EL3 to non-secure EL1. You will have to modify this flow to switch to EL2 instead. While in EL2, set up and enable a non-intersecting second-stage page table for each core, and also set up an EL2 vector table to catch the second-stage MMU faults.
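
A minimal sketch of that per-core setup (on ARMv7 "EL2" is HYP mode; the CP15 encodings below are from the ARMv7-A architecture manual, but pt_base, the VTCR value and hyp_vectors are placeholders you would have to fill in for your own memory map):

    #include <stdint.h>

    extern uint32_t hyp_vectors[];  /* hypothetical HYP/EL2 vector table */

    /* Run once per core, in HYP mode, before dropping into the guest OS. */
    static void enable_stage2(uint64_t pt_base, uint32_t vmid)
    {
        uint32_t hcr;

        /* HVBAR: where second-stage faults (and other HYP traps) vector to. */
        asm volatile("mcr p15, 4, %0, c12, c0, 0" :: "r"(hyp_vectors));

        /* VTCR: stage-2 translation control. Bit 31 is RES1; T0SZ, SL0 and
         * the cacheability fields are left at zero purely for illustration. */
        asm volatile("mcr p15, 4, %0, c2, c1, 2" :: "r"(0x80000000u));

        /* VTTBR: stage-2 table base with this core's VMID in bits [55:48].
         * Distinct VMIDs keep each core's translations tagged separately. */
        uint64_t vttbr = (pt_base & 0x0000FFFFFFFFF000ULL)
                       | ((uint64_t)(vmid & 0xFF) << 48);
        asm volatile("mcrr p15, 6, %Q0, %R0, c2" :: "r"(vttbr));

        /* HCR.VM (bit 0): enable the second stage of translation. */
        asm volatile("mrc p15, 4, %0, c1, c1, 0" : "=r"(hcr));
        hcr |= 1u;
        asm volatile("mcr p15, 4, %0, c1, c1, 0" :: "r"(hcr));
        asm volatile("isb");
    }

With each core's VTTBR pointing at non-overlapping second-stage tables, a guest touching the other core's physical memory faults into the EL2 vector table instead of silently hitting the shared L2.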

This will result in a minor drop in performance. It will still be more efficient than using KVM (last time I checked, KVM was not well suited to ARMv7 and caused a lot of overhead due to its design). Xen is better suited to ARM, but will require a lot of setup on your side.

If you are not planning to use the virtualization extensions / second-stage page tables / SMP, you could also probably turn off the ACTLR.SMP bit. This might give you a bit of a performance boost, as the L1 cache coherency logic will be turned off.
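
Clearing that bit looks roughly like this (a sketch only: SMP is bit 6 of ACTLR on the Cortex-A7, the write needs secure privileged access unless NSACR allows it, and the caches must be cleaned and disabled first; that sequence is omitted here):

    #include <stdint.h>

    /* Sketch only: drop this core out of SMP coherency by clearing
     * ACTLR.SMP (bit 6 on the Cortex-A7). The caches must be cleaned
     * and disabled before this point; that sequence is omitted here. */
    static void leave_smp_coherency(void)
    {
        uint32_t actlr;

        asm volatile("mrc p15, 0, %0, c1, c0, 1" : "=r"(actlr)); /* read ACTLR */
        actlr &= ~(1u << 6);                                     /* clear SMP */
        asm volatile("mcr p15, 0, %0, c1, c0, 1" :: "r"(actlr)); /* write back */
        asm volatile("isb");
    }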

Note: this answer is for the edited question.

1
vote

In addition to being a cache, the L2 also helps with cache coherency between the L1 caches of the different cores. If you somehow manage to pull this off (a private L2 partition for each core), you will lose your SMP characteristics. Moreover, the L2 cache controller already takes care of filling the cache with the data/code used by all cores; this serves you better than statically dividing the cache at boot.