January 18, 2023

ARM Processor 101 -- 32 bit special registers

This is for 32 bit ARM. 64 bit plays the same game, but with different rules.

There are a bunch of these registers and a special duo of instructions to read and write them. As an example, consider this code:

        mrs     r0, cpsr                /* disable interrupts */
        orr     r0, r0, #PSR_INT_DIS    /* both IRQ and FIQ */
        msr     cpsr, r0

The "mrs" instruction reads a special register (here it reads the special register "cpsr" into the general register "r0").

The "msr" instruction writes a special register (here it writes the value contained in the general register "r0" into the special register "cpsr").
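
The same read-modify-write can be sketched in C. The value of PSR_INT_DIS is not shown above, so this sketch assumes it is the I and F mask bits (0xc0), which fits the "both IRQ and FIQ" comment. The accessor macros and function names are made up for illustration, and the macros only assemble when building for 32 bit ARM in a privileged mode.

```c
#include <stdint.h>

/* Assumed values: I (IRQ mask) is bit 7 and F (FIQ mask) is bit 6. */
#define PSR_IRQ_DIS     0x80u
#define PSR_FIQ_DIS     0x40u
#define PSR_INT_DIS     (PSR_IRQ_DIS | PSR_FIQ_DIS)

/* Hypothetical accessors; "cpsr_c" writes only the low 8 control bits. */
#define get_CPSR(val)   asm volatile ( "mrs %0, cpsr" : "=r" ( val ) )
#define set_CPSR_c(val) asm volatile ( "msr cpsr_c, %0" : : "r" ( val ) : "memory" )

/* The "orr" step as a plain function, so it can be checked on any machine. */
static inline uint32_t psr_mask_interrupts(uint32_t cpsr)
{
    return cpsr | PSR_INT_DIS;
}

#if defined(__arm__)
/* mrs, orr, msr: the three instructions from the example above. */
static inline void disable_interrupts(void)
{
    uint32_t val;

    get_CPSR(val);
    set_CPSR_c(psr_mask_interrupts(val));
}
#endif
```

Keeping the bit arithmetic in its own function is just a convenience here: it lets the mask logic be tested on a desktop machine, while the two macros stay as thin wrappers around the instructions.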

The CPSR register

This is the all important "current program status register". It holds the condition flags (N, Z, C and V, in the top 4 bits) and 8 control bits (the low 8 bits). The control bits are the I and F interrupt masks, the T (Thumb) bit, and the 5 "M" bits that determine the processor mode. Most modes have to do with exceptions and interrupts. Only 7 of the 32 possible mode values are defined in the classic architecture. Supervisor mode is what is used by a typical operating system.

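Some newer processors define other modes like 0x16 (Monitor) and 0x1a (Hypervisor). These are 32 bit modes that come with the Security and Virtualization extensions. arm64 does not use mode numbers like this and has exception levels instead.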

System mode is special, and I have never used it. It is privileged, but it uses the same registers as user mode, so switching to it is not as disruptive as switching to supervisor mode.

On a project I am working on right now I see the CPSR = 0x60000193. So the mode is 0x13 (supervisor) as handed to me by U-Boot. It looks like IRQ is masked off (the I bit is set) but FIQ is not (the F bit is clear).
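
The CPSR value above can be picked apart with a few masks. This is a minimal sketch; the macro and function names are my own for illustration, and the bit positions are the ARMv7 ones.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPSR_MODE_MASK  0x1fu       /* M[4:0], processor mode */
#define CPSR_T          (1u << 5)   /* Thumb state */
#define CPSR_F          (1u << 6)   /* FIQ masked when set */
#define CPSR_I          (1u << 7)   /* IRQ masked when set */
#define CPSR_A          (1u << 8)   /* asynchronous abort masked when set */
#define CPSR_V          (1u << 28)  /* overflow flag */
#define CPSR_C          (1u << 29)  /* carry flag */
#define CPSR_Z          (1u << 30)  /* zero flag */
#define CPSR_N          (1u << 31)  /* negative flag */

static inline uint32_t cpsr_mode(uint32_t cpsr)
{
    return cpsr & CPSR_MODE_MASK;
}

static inline bool cpsr_irq_masked(uint32_t cpsr)
{
    return (cpsr & CPSR_I) != 0;
}

static inline bool cpsr_fiq_masked(uint32_t cpsr)
{
    return (cpsr & CPSR_F) != 0;
}

/* Names for the 7 classic mode values. */
static inline const char *cpsr_mode_name(uint32_t cpsr)
{
    switch (cpsr_mode(cpsr)) {
    case 0x10: return "user";
    case 0x11: return "fiq";
    case 0x12: return "irq";
    case 0x13: return "supervisor";
    case 0x17: return "abort";
    case 0x1b: return "undefined";
    case 0x1f: return "system";
    default:   return "other";
    }
}
```

For the value handed over by U-Boot this gives mode 0x13 (supervisor), IRQ masked and FIQ not masked. The top nibble 0x6 means the Z and C flags are set, and bit 8 (the A bit) is set as well.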

The SPSR register

This saves the CPSR register during an interrupt or exception, something we aren't going to talk about here or now.

While we are not talking about exceptions, I'll note that during an exception some registers are "banked", i.e. the processor switches to a different register specific to that exception. This sort of thing is what puts the "fast" into FIQ. When an FIQ interrupt happens, a new banked set of registers r8 to r14 gets switched to. All other exceptions switch to a new r13 and r14 during the exception.

Coprocessor registers

This is one of the ugliest and most obnoxious things about ARM 32. They redid all of this in ARM 64, making the world wonderful again. The idea here is that we have a pair of mrc and mcr instructions, very much like the mrs and msr instructions that have already been described. These read and write a coprocessor register. The acronym "MRC" is intended to stand for "move to arm Register from Coprocessor register".

The important thing here is "Coprocessor 15", which is the system coprocessor. This is really a part of the ARM processor (and an essential part I might add) but you deal with it as if it was a coprocessor. As a taste of all this, here is some gcc inline assembly to access the CCNT register:

#define get_CCNT(val)   asm volatile ( "mrc p15, 0, %0, c9, c13, 0" : "=r" ( val ) )
#define set_CCNT(val)   asm volatile ( "mcr p15, 0, %0, c9, c13, 0" : : "r" ( val ) )

Here "p15" specifies coprocessor 15. The rest of the rubbish (an opcode, two coprocessor register numbers, and a second opcode) you look up in some table that tells you that it is the magic to access the CCNT register.

Lots of important registers are accessed in this way. Most sane people look them up and then define macros or routines with sensible names to access them, as I did above.

Registers to set up the MMU, control caches, and such are all accessed through this crazy scheme.
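
To see where those operands end up, here is a sketch that builds the 32 bit ARM (not Thumb) instruction word for an mrc or mcr by hand. The struct, the table, and the function names are hypothetical. The table holds only a handful of registers, with the operand values copied from the instructions shown in this article plus the main ID register.

```c
#include <assert.h>
#include <stdint.h>

/* Operands that select one CP15 register. */
struct cp15_reg {
    const char *name;
    unsigned opc1, crn, crm, opc2;
};

static const struct cp15_reg cp15_regs[] = {
    { "MIDR",  0, 0,  0, 0 },   /* main ID register */
    { "SCTLR", 0, 1,  0, 0 },   /* system control */
    { "ACTLR", 0, 1,  0, 1 },   /* auxiliary control */
    { "CCNT",  0, 9, 13, 0 },   /* cycle counter */
};

/*
 * Encode "mrc/mcr pN, opc1, rT, cN, cM, opc2" with the "always" condition.
 * A nonzero to_arm selects mrc (coprocessor to ARM register).
 */
static inline uint32_t encode_coproc_move(int to_arm, unsigned coproc,
                                          unsigned opc1, unsigned rt,
                                          unsigned crn, unsigned crm,
                                          unsigned opc2)
{
    assert(coproc < 16 && rt < 16 && crn < 16 && crm < 16);
    assert(opc1 < 8 && opc2 < 8);

    return 0xee000010u
         | (opc1 << 21)
         | ((to_arm ? 1u : 0u) << 20)
         | (crn << 16)
         | (rt << 12)
         | (coproc << 8)
         | (opc2 << 5)
         | crm;
}

/* mrc p15, ... : read a CP15 register into general register rt. */
static inline uint32_t encode_cp15_read(const struct cp15_reg *reg, unsigned rt)
{
    return encode_coproc_move(1, 15, reg->opc1, rt, reg->crn, reg->crm, reg->opc2);
}

/* mcr p15, ... : write general register rt to a CP15 register. */
static inline uint32_t encode_cp15_write(const struct cp15_reg *reg, unsigned rt)
{
    return encode_coproc_move(0, 15, reg->opc1, rt, reg->crn, reg->crm, reg->opc2);
}
```

With this table, encode_cp15_read(&cp15_regs[3], 0) gives 0xee190f1d, which is the instruction word for "mrc p15, 0, r0, c9, c13, 0". Every operand is just a small bit field dropped into the instruction.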

Honestly, why was this indignity foisted upon us? It is almost as though we are forced to do what an assembler should do and hand generate the instruction encoding for these register accesses. Why instead doesn't the assembler do the work and allow us to write sensible things like:

    mrc	r0, CCNT
    mcr	CCNT, r0

Why indeed? We can certainly write code like this for ARM 64. In fact for ARM 64 the whole mrc, mcr and coprocessor business has been done away with and all these registers have been promoted to the status of proper citizens.

    mrs	x0, mpidr_el1
    msr	daifclr, #0x2

The SCTLR and ACTLR registers

    MCR p15, 0, <Rd>, c1, c0, 0    ; Write CP15 System Control Register
    MCR p15, 0, <Rd>, c1, c0, 1    ; Write CP15 Auxiliary Control Register

The lowest bit of the SCTLR (M, bit 0) is interesting. It enables the MMU. The idea is that all registers (including those for the MMU) would be set up before enabling the MMU (this bit is zero after reset).
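
As a sketch of how the M bit (bit 0 of the SCTLR) might be handled from C, here are accessor macros in the same style as the CCNT ones, plus a few bit definitions. The names are my own, the bit positions are the ARMv7-A ones, and the choice to turn on both caches along with the MMU is only an assumption for the example.

```c
#include <stdint.h>

#define SCTLR_M   (1u << 0)    /* MMU enable */
#define SCTLR_A   (1u << 1)    /* alignment check enable */
#define SCTLR_C   (1u << 2)    /* data cache enable */
#define SCTLR_Z   (1u << 11)   /* branch prediction enable */
#define SCTLR_I   (1u << 12)   /* instruction cache enable */
#define SCTLR_V   (1u << 13)   /* high exception vectors */

/* Hypothetical accessors, 32 bit ARM only. On ARMv7 a write is normally followed by "isb". */
#define get_SCTLR(val)  asm volatile ( "mrc p15, 0, %0, c1, c0, 0" : "=r" ( val ) )
#define set_SCTLR(val)  asm volatile ( "mcr p15, 0, %0, c1, c0, 0" : : "r" ( val ) : "memory" )

/* Value to write back once the page tables and related registers are ready. */
static inline uint32_t sctlr_with_mmu_and_caches(uint32_t sctlr)
{
    return sctlr | SCTLR_M | SCTLR_C | SCTLR_I;
}
```

The intended order is to read the SCTLR with get_SCTLR, set up everything the MMU needs, and only then write back the value with the M bit set.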

The ACTLR (auxiliary control register)

The ARMv7-A manual calls this "implementation defined". What this means is that it is unique to each Cortex variant, and you need to be on your toes.

The tables below give a bird's-eye view of things, but you will almost certainly have to read the details in the specific Cortex reference manual.

Cortex-A9

Bits     Name    Function
[31:10]  -       Reserved.
[9]      PARITY  Support for parity checking, if implemented.
[8]      AOW     Enable allocation in one cache way only.
[7]      EXCL    Exclusive L1/L2 cache control.
[6]      SMP     Enables coherent requests to the processor.
[5:4]    -       Reserved.
[3]      WFLZM   Enable write full line of zeros mode.
[2]      L1PE    Dside prefetch.
[1]      -       Reserved.
[0]      FW      Cache and TLB maintenance broadcast.

Cortex-A8

Bits     Name        Function
[31]     L2RSTDIS    L2 hardware reset disable.
[30]     L1RSTDIS    L1 hardware reset disable.
[29:21]  -           Reserved.
[20]     CMP         Disable pipelined cache maintenance
[19]     CLKSTOPREQ  Disable clock stop
[18]     CPSERIAL    serialize CP14 and CP15 instructions
[17]     CPWAIT      wait for memory for all CP14 and CP15 instructions
[16]     CPFLUSH     flush pipeline for all CP14 and CP15 instructions
[15]     ETM         prevent ETM clock from being stopped
[14]     NEON        prevent NEON clock from being stopped
[13]     MAIN        prevent MAIN clock from being stopped
[12]     NEON_S      force NEON single issue
[11]     LDSTR_S     force load/store single issue
[10]     ALL_S       force ALL instructions to be single issue
[9]      PLDNOP      execute PLD instruction as a NOP
[8]      WFINOP      execute WFI instruction as a NOP
[7]      BTB         prevent BTB mispredicts
[6]      IBE         invalidate BTB enable
[5]      L1NEON      enable NEON caching in L1
[4]      ASA         enable speculative accesses on AXI
[3]      L1PE        enable L1 cache parity detection
[2]      -           Reserved
[1]      L2EN        enable L2 cache
[0]      L1ALIAS     enable L1 cache alias checks

Cortex-A7

Bits     Name     Function
[31:29]  -        Reserved.
[28]     DDI      Disable Dual Issue
[27:16]  -        Reserved.
[15]     DDVM     Disable Distributed Virtual Memory transactions.
[14:13]  L1PCTL   L1 Data prefetch control.
[12]     L1RADIS  L1 Data Cache read-allocate mode disable.
[11]     L2RADIS  L2 Data Cache read-allocate mode disable.
[10]     DODMBS   Disable optimized data memory barrier behavior.
[9:7]    -        Reserved.
[6]      SMP      Enables coherent requests to the processor.
[5:0]    -        Reserved.

Cortex-A5

Bits     Name    Function
[31:29]  -       Reserved.
[28]     DBDI    Disable Branch Dual Issue
[27:19]  -       Reserved.
[18]     BTDIS   Disable indirect Branch Target Address Cache (BTAC).
[17]     RSDIS   Disable return stack operation.
[16:15]  BP      Branch prediction policy.
[14:13]  L1PCTL  L1 Data prefetch control.
[12]     RADIS   Disable Data Cache read-allocate mode.
[11]     DWBST   Disable AXI data write bursts to Normal memory.
[10]     DODMBS  Disable optimized data memory barrier behavior.
[9:8]    -       Reserved.
[7]      EXCL    Exclusive L1/L2 cache control.
[6]      SMP     Enables coherent requests to the processor.
[5:1]    -       Reserved.
[0]      FW      Cache and TLB maintenance broadcast.
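
One thing the tables do agree on is the SMP bit, which sits at bit 6 on the A9, A7 and A5, while the A8 has no such bit. A minimal sketch of dealing with that per core follows. The enum, the function, and the accessor macros are hypothetical names, and the bit position comes straight from the tables above.

```c
#include <stdint.h>

#define ACTLR_SMP   (1u << 6)   /* Cortex-A9, A7 and A5 */

/* Hypothetical accessors in the style of the CCNT macros, 32 bit ARM only. */
#define get_ACTLR(val)  asm volatile ( "mrc p15, 0, %0, c1, c0, 1" : "=r" ( val ) )
#define set_ACTLR(val)  asm volatile ( "mcr p15, 0, %0, c1, c0, 1" : : "r" ( val ) )

enum cortex_core { CORTEX_A5, CORTEX_A7, CORTEX_A8, CORTEX_A9 };

/* Return the ACTLR value with SMP set, or unchanged if the core has no SMP bit. */
static inline uint32_t actlr_enable_smp(enum cortex_core core, uint32_t actlr)
{
    switch (core) {
    case CORTEX_A5:
    case CORTEX_A7:
    case CORTEX_A9:
        return actlr | ACTLR_SMP;
    default:
        return actlr;   /* Cortex-A8: bit 6 is IBE, leave it alone */
    }
}
```

Passing the core type in explicitly is a reminder that the same bit number can mean something quite different on another Cortex variant.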


Have any comments? Questions? Drop me a line!

Kyu / tom@mmto.org