January 14, 2025

Driving an LED panel - what is up with the ARM cache

This page is by no means specific to running a HUB75 panel. This is entirely about caches on the Zynq.

I am working with an xc7z010clg400-1 on an S9 Antminer board.

This particular Zynq has a dual core ARM section. The cores are ARM Cortex-A9 (ARMv7-A) and each has a 32K instruction and a 32K data cache. There is also a 512K L2 cache shared by both cores.

The I and D caches are 4 way set associative. The L2 cache is 8 way set associative.

How do we enable and/or disable these caches?
How do we determine if they are enabled or not?

I'll also note that enabling and disabling a cache is not as simple as flipping the right bit somewhere. Before you disable a data cache you must flush it, which typically involves a loop through the cache lines by set and way, and so requires knowledge of the cache structure. Enabling a cache requires proper setup of the MMU. Some parts of the address space (such as device registers) should not be cached, and the MMU can enable or disable caching for each page in the address space.
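To make the "loop through cache lines" idea concrete, here is a minimal sketch of how the operand for the ARMv7 set/way cache maintenance operations is built, and how a loop would visit every line. All of the function names are made up for illustration, and count_line is only a host-side stand-in so the loop can be exercised off target. The target-only part assumes GCC-style inline assembly and privileged mode.

```c
#include <assert.h>
#include <stdint.h>

/* ceil(log2(n)) for small n >= 1 */
static unsigned ceil_log2(uint32_t n)
{
    unsigned bits = 0;
    while ((1u << bits) < n)
        bits++;
    return bits;
}

/* Build the register value for a set/way operation.
 * level is 0 for L1.  Layout: way in the top bits, set starting at
 * bit log2(line size in bytes), level in bits [3:1]. */
static uint32_t setway_operand(unsigned level, uint32_t set, uint32_t way,
                               uint32_t line_bytes, uint32_t ways)
{
    unsigned a = ceil_log2(ways);
    unsigned l = ceil_log2(line_bytes);
    uint32_t op = (set << l) | ((uint32_t)level << 1);

    assert(way < ways);
    if (a > 0)                  /* a direct mapped cache has no way field */
        op |= way << (32 - a);
    return op;
}

typedef void (*line_op_fn)(uint32_t operand);

/* Apply op to every line of one cache level. */
static void for_each_line(unsigned level, uint32_t sets, uint32_t ways,
                          uint32_t line_bytes, line_op_fn op)
{
    uint32_t set, way;

    for (way = 0; way < ways; way++)
        for (set = 0; set < sets; set++)
            op(setway_operand(level, set, way, line_bytes, ways));
}

/* Host-side stand-in: just count how many lines get visited. */
static unsigned long lines_visited;

static void count_line(uint32_t operand)
{
    (void)operand;
    lines_visited++;
}

#if defined(__arm__)
/* Target only: DCCISW, clean and invalidate one D cache line by set/way. */
static void dccisw(uint32_t operand)
{
    __asm__ volatile("mcr p15, 0, %0, c7, c14, 2" : : "r"(operand) : "memory");
}
#endif
```

For a 32K, 4 way cache with 32 byte lines, for_each_line(0, 256, 4, 32, count_line) visits 1024 lines. On the target, dccisw would be passed in place of count_line, followed by the appropriate barriers.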

Kyu startup messages

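I put a bunch of code in armv7/cache.c that reads and displays various cache related registers. These tell us a lot about the cache structure, but not whether the various caches are enabled. Note that the "L2" entries just repeat the L1 values: CLIDR reports only one level of cache here, because the 512K L2 is handled by a separate controller that these CP15 registers do not describe.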
CLIDR = 09200003
 CLIDR - LoUU = 1
 CLIDR - Loc = 1
 CLIDR - LoUIS = 1
 CLIDR - L1 type = 3 I/D
CTR = 83338003
 CTR - minimum line in I cache = 32
 CTR - minimum line in D cache = 32
 CTR - CWG = 32
 CTR - ERG = 32
CCSIDR, L1-D = 701fe019
 supports Write back
 supports Read allocate
 supports Write allocate
 256 sets as 4 way
 line size = 8 words (32 bytes)
CCSIDR, L1-I = 201fe019
 supports Read allocate
 256 sets as 4 way
 line size = 8 words (32 bytes)
CCSIDR, L2-D = 701fe019
 supports Write back
 supports Read allocate
 supports Write allocate
 256 sets as 4 way
 line size = 8 words (32 bytes)
CCSIDR, L2-I = 201fe019
 supports Read allocate
 256 sets as 4 way
 line size = 8 words (32 bytes)
ARM cache line size: 32
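As a cross-check on the numbers above, here is a small sketch that decodes a CCSIDR value and the line size fields of CTR using the ARMv7 field layout. It is not the code from armv7/cache.c; the struct and function names are invented for this example.

```c
#include <stdint.h>

/* Cache geometry and policy bits pulled out of a CCSIDR value. */
struct cache_geom {
    uint32_t line_bytes;
    uint32_t ways;
    uint32_t sets;
    uint32_t total_bytes;
    int write_through;
    int write_back;
    int read_alloc;
    int write_alloc;
};

static struct cache_geom ccsidr_decode(uint32_t ccsidr)
{
    struct cache_geom g;

    /* LineSize field is log2(words per line) - 2, and a word is 4 bytes */
    g.line_bytes = 1u << ((ccsidr & 0x7) + 4);
    g.ways = ((ccsidr >> 3) & 0x3ff) + 1;     /* Associativity - 1 */
    g.sets = ((ccsidr >> 13) & 0x7fff) + 1;   /* NumSets - 1 */
    g.total_bytes = g.line_bytes * g.ways * g.sets;

    g.write_through = (ccsidr >> 31) & 1;
    g.write_back    = (ccsidr >> 30) & 1;
    g.read_alloc    = (ccsidr >> 29) & 1;
    g.write_alloc   = (ccsidr >> 28) & 1;
    return g;
}

/* CTR holds log2(words) for the smallest I and D cache lines. */
static uint32_t ctr_iminline_bytes(uint32_t ctr)
{
    return 4u << (ctr & 0xf);
}

static uint32_t ctr_dminline_bytes(uint32_t ctr)
{
    return 4u << ((ctr >> 16) & 0xf);
}
```

Feeding in 0x701fe019 gives 256 sets, 4 ways and 32 byte lines, and 256 * 4 * 32 is 32768 bytes, which agrees with the 32K L1 D cache. CTR = 0x83338003 gives 32 bytes for both minimum line sizes.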

The SCTLR register

This is a fundamental ARM register, the "system control register". It is a 32 bit register with bits to control the caches, MMU, and other resources.
Bit 0 -- 0001 -- enable the MMU
Bit 1 -- 0002 -- enable alignment checking
Bit 2 -- 0004 -- enable cache
Bit 12 -- 1000 -- enable I cache
So, what about bit 2, which cache(s) is/are being enabled? This is the "C" bit, which the ARM architecture manual describes as the enable for data and unified caches. On the Cortex-A9, where the L1 caches are split, that means the D cache.

During Kyu boot, I display these values:

orig SCTLR = 08c5187f
     SCTLR = 08c5187d
The "orig" value shows the state the processor is in as set up by U-boot. The only change I make is to clear the alignment checking bit. This lets me do improperly aligned writes to memory without generating a processor fault. Indeed the ARM can handle misaligned writes correctly, albeit with a performance penalty. There are quite a few of these that pop up in network code, and turning off this bit is by far the easiest way to deal with those.

We can see that bit 2 is indeed set, enabling the D cache, as well as bit 12 to enable the I cache. It would be an interesting experiment to disable the I cache (which would involve flushing it first) to see what the performance penalty would be.
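The bit tests and the one bit change described above can be sketched in C as follows. The macro and function names are my own for this example and are not taken from Kyu. The accessors at the end assume GCC-style inline assembly on an ARMv7 target running in a privileged mode, and they are only compiled when building for ARM.

```c
#include <stdint.h>
#include <stdio.h>

/* SCTLR bits discussed in the text */
#define SCTLR_M (1u << 0)    /* MMU enable */
#define SCTLR_A (1u << 1)    /* alignment checking */
#define SCTLR_C (1u << 2)    /* D cache enable */
#define SCTLR_I (1u << 12)   /* I cache enable */

static int sctlr_test(uint32_t sctlr, uint32_t mask)
{
    return (sctlr & mask) != 0;
}

/* Return the value with alignment checking turned off. */
static uint32_t sctlr_no_align_check(uint32_t sctlr)
{
    return sctlr & ~SCTLR_A;
}

static void sctlr_show(const char *label, uint32_t sctlr)
{
    printf("%s = %08x  MMU=%d align=%d D-cache=%d I-cache=%d\n",
           label, (unsigned)sctlr,
           sctlr_test(sctlr, SCTLR_M), sctlr_test(sctlr, SCTLR_A),
           sctlr_test(sctlr, SCTLR_C), sctlr_test(sctlr, SCTLR_I));
}

#if defined(__arm__)
/* Target only: read and write SCTLR via CP15 c1, c0, 0. */
static inline uint32_t sctlr_read(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(v));
    return v;
}

static inline void sctlr_write(uint32_t v)
{
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 0\n\tisb" : : "r"(v) : "memory");
}
#endif
```

Applying sctlr_no_align_check to the U-boot value 0x08c5187f yields 0x08c5187d, the value shown above. On the target this would be sctlr_write(sctlr_no_align_check(sctlr_read())).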

What about the L2 cache?

This can be different from chip to chip, i.e. it is not dictated by the ARM core. The Zynq TRM has a section on this beginning on page 92 (section 3.4).

Things get spooky. Page 101 in the TRM says that before anything else, you must write 0x020202 to the register at 0xF8000A1C. This is in the slcr (system level control registers), and when you look in that section, the register is labeled as "reserved" with mandatory values for certain fields.
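A minimal sketch of that mandatory write, assuming the slcr block starts at 0xF8000000 so the register sits at offset 0xA1C. The macro and function names are hypothetical. I believe the slcr also has a write lock that may need to be opened first; that step is not shown here. The function takes the base as a pointer so it can be tried against an ordinary array off target.

```c
#include <stdint.h>

#define SLCR_BASE        0xF8000000u  /* assumed base of the slcr block */
#define SLCR_L2C_RAM_OFF 0xA1Cu       /* hypothetical name for the "reserved" register */
#define SLCR_L2C_RAM_VAL 0x020202u    /* value required by the TRM */

/* Write the mandatory value.  slcr points at the start of the slcr block. */
static void slcr_write_l2c_ram(volatile uint32_t *slcr)
{
    slcr[SLCR_L2C_RAM_OFF / 4] = SLCR_L2C_RAM_VAL;
}
```

On the board the call would be slcr_write_l2c_ram((volatile uint32_t *)SLCR_BASE).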

The L2 cache controller has its own registers that are not part of the slcr. These are documented on page 1393 of the TRM. There are lots of complexities. See section 3.4 in the TRM for all the details. The control register at 0xF8F02100 has a single bit that enables or disables the L2 cache. Writing some code to look at this register, I find:

Zynq L2 control: 00000000
So, it looks like the L2 cache is off. I will need to study all the details and try turning this on.
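The check itself is tiny. This is a sketch rather than the actual Kyu code, with names invented for the example. It assumes the enable is bit 0 of the control register, and takes a pointer so it can be tested against a plain variable.

```c
#include <stdint.h>

#define L2C_CONTROL_ADDR   0xF8F02100u  /* L2 controller control register */
#define L2C_CONTROL_ENABLE (1u << 0)    /* assumed: bit 0 is the enable */

/* Return 1 if the L2 cache is enabled, 0 if not. */
static int l2c_enabled(const volatile uint32_t *control)
{
    return (*control & L2C_CONTROL_ENABLE) != 0;
}
```

On the board this would be called as l2c_enabled((const volatile uint32_t *)L2C_CONTROL_ADDR). With the value 00000000 read above, it returns 0.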
Have any comments? Questions? Drop me a line!

Tom's Electronics pages / tom@mmto.org