January 18, 2023

ARM Processor 101 -- the MMU

All of my 32 bit ARM boards implement the ARMv7-A instruction set. This means that what I learn about the MMU for one applies to all of them.

Manuals and documentation

ARM makes a vast and overwhelming amount of documentation available. Care must be taken that you are reading the documents that pertain to ARMv7-A (or whatever architecture you are working with). I have found myself mistakenly reading about MMU details for some other architecture, which may be very similar but differ in vital details.

Basics

The ARM MMU can be used in many different ways. In the most general case it is a 2 level MMU with (for example) 4k pages. This is how something like the Linux operating system might use it.

The way I use it is as a 1 level MMU with 1M "sections". My aim is to make it transparent and forget about it. You might ask, "why not just turn it off entirely", but as near as I can tell that will not allow caches to be enabled. Also, marking as much of the address space as possible "invalid" will catch certain bugs.

Translation table

The translation table must begin on a 16K boundary. When a virtual address needs to be translated, the top 12 bits of that address select the first level entry in the translation table. Hence each first level entry deals with 1M of the virtual address space. Each entry is 32 bits in size.
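The lookup described above is simple enough to write down. This is only a sketch of the arithmetic, not anything the hardware needs from me; the names l1_index and l1_entry_addr are my own inventions for illustration.

```c
#include <stdint.h>

#define L1_ENTRIES    4096u               /* 4G of address space / 1M per entry */
#define L1_TABLE_SIZE (L1_ENTRIES * 4u)   /* 4 bytes per entry = 16K, same as the alignment */

/* Index of the first level entry for a virtual address: its top 12 bits. */
static inline uint32_t l1_index(uint32_t va)
{
    return va >> 20;
}

/* Address of that entry, given the (16K aligned) base of the table. */
static inline uint32_t l1_entry_addr(uint32_t table_base, uint32_t va)
{
    return (table_base & ~(L1_TABLE_SIZE - 1u)) + l1_index(va) * 4u;
}
```

For example, virtual address 0x40123456 lands on entry 0x401, which sits 0x1004 bytes into the table.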

There are 4 types of table entries as indicated by the low 2 bits of the entry:

As a side note, some ARM devices support 16M "supersections", but none that I am working with.

I use section descriptors, which are as follows. My old printed manual (for ARMv5 it turns out) showed bits 19-12 as "should be zero". ARMv7 has added many new definitions. Also note that attribute meanings have been redefined. It is vital to use the relevant online documents.

NS is "non secure" and is only significant on chips with security extensions.
NG is "non global" and could be used to divide memory into global/non-global regions.
S is "shareable".
AP (3 bits) are "access permission".
TEX along with C and B are the new "attribute bits". The v7 document says that these bits now have more general meaning than their names imply.
XN is "execute never" and prohibits instruction fetch if set.
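Here is a sketch of how I would put a section descriptor together in C. The bit positions are my reading of the ARMv7-A short-descriptor section format, so check them against the manual for your chip before trusting them. The macro names and the helpers sec_ap and make_section are made up for this example.

```c
#include <stdint.h>

/* Section descriptor fields (assumed ARMv7-A short-descriptor layout). */
#define SEC_TYPE      (2u << 0)                          /* low 2 bits = 10 */
#define SEC_B         (1u << 2)
#define SEC_C         (1u << 3)
#define SEC_XN        (1u << 4)
#define SEC_DOMAIN(d) (((uint32_t)(d) & 0xFu) << 5)
#define SEC_AP10(ap)  (((uint32_t)(ap) & 0x3u) << 10)    /* AP[1:0] */
#define SEC_TEX(t)    (((uint32_t)(t) & 0x7u) << 12)
#define SEC_AP2       (1u << 15)                         /* AP[2] */
#define SEC_S         (1u << 16)
#define SEC_NG        (1u << 17)
/* bit 18 stays 0 for a section (1 would mean supersection) */
#define SEC_NS        (1u << 19)

/* Spread a 3 bit AP value over its two separate fields. */
static inline uint32_t sec_ap(unsigned ap)
{
    return SEC_AP10(ap) | ((ap & 4u) ? SEC_AP2 : 0u);
}

/* Descriptor for the 1M section containing physical address pa. */
static inline uint32_t make_section(uint32_t pa, uint32_t flags)
{
    return (pa & 0xFFF00000u) | flags | SEC_TYPE;
}
```

As a usage example, make_section(0x40000000, sec_ap(3) | SEC_DOMAIN(0) | SEC_C | SEC_B) gives 0x40000C0E.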

A bit in the SCTLR register (namely the TRE bit) can change the meaning of the TEX bits.
When TRE (TEX remap enable) is 0, then TEX along with C and B work in the traditional way.
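A small sketch for checking or clearing TRE, given a value already read from SCTLR. I am assuming TRE is bit 28 of SCTLR, which is what I find in the ARMv7-A manual. The helper names are hypothetical, and getting the value in and out of the register (CP15 c1, c0, 0) is left out so the code can be tried anywhere.

```c
#include <stdint.h>
#include <stdbool.h>

#define SCTLR_TRE (1u << 28)   /* assumed position of the TEX remap enable bit */

/* True if TEX remap is enabled in the given SCTLR value. */
static inline bool tex_remap_enabled(uint32_t sctlr)
{
    return (sctlr & SCTLR_TRE) != 0;
}

/* Same SCTLR value with TRE cleared, so TEX, C and B work the traditional way. */
static inline uint32_t sctlr_clear_tre(uint32_t sctlr)
{
    return sctlr & ~SCTLR_TRE;
}
```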

The 5 bits give 32 possibilities, many of which are "reserved", see this table:

The AP bits are as per this table. Note that there are actually 3 bits, but separated in the descriptor. Note that 011 will give full access (r/w).

The DOM bits specify one of 16 domains. The DACR (domain access control register) has a 2 bit field controlling access to each domain. A reasonable approach for me is to put everything into domain 0 and set up the DACR accordingly.
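With the descriptor settled, filling a one level table is just a loop. This is a sketch of the scheme I described (identity map what I need, leave the rest invalid), not code lifted from a working project. The function name fill_flat_map and the section_bits argument are hypothetical. I am relying on the fact that an entry whose low 2 bits are 00 yields a translation fault.

```c
#include <stdint.h>

#define L1_ENTRIES 4096u

/*
 * Identity map `count` megabytes starting at megabyte number `first`.
 * Every other entry is left invalid so stray accesses fault.
 * section_bits holds everything in the descriptor except the base address.
 */
static void fill_flat_map(uint32_t *table, uint32_t first, uint32_t count,
                          uint32_t section_bits)
{
    for (uint32_t i = 0; i < L1_ENTRIES; i++) {
        if (i >= first && i - first < count)
            table[i] = (i << 20) | section_bits;
        else
            table[i] = 0;   /* low 2 bits 00: translation fault */
    }
}
```

On real hardware the table itself needs the 16K alignment, which with gcc would be something like `__attribute__((aligned(16384)))` on the array.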

The DACR register

This is a 32 bit register consisting of 16 2-bit groups. The 2 bits specify access for the given domain as follows:
MCR p15, 0, Rt, c3, c0, 0    ; Write Rt to DACR
To entirely get this business out of your hair, set the domain field in all the descriptors to zero (actually any value will do), then do this:
set_DACR ( 0xffffffff );
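If you would rather build the DACR value than hard code it, something like this sketch would do. The encodings (00 no access, 01 client, 10 reserved, 11 manager) are as I understand them from the ARMv7-A manual. The enum and the helpers dacr_set and dacr_all are names I made up.

```c
#include <stdint.h>

enum dacr_access {
    DACR_NO_ACCESS = 0,   /* any access gives a domain fault */
    DACR_CLIENT    = 1,   /* accesses are checked against the AP bits */
    DACR_RESERVED  = 2,
    DACR_MANAGER   = 3    /* accesses are not checked against the AP bits */
};

/* Return dacr with the 2 bit field for one domain (0-15) replaced. */
static inline uint32_t dacr_set(uint32_t dacr, unsigned domain, enum dacr_access a)
{
    unsigned shift = (domain & 0xFu) * 2u;
    return (dacr & ~(3u << shift)) | ((uint32_t)a << shift);
}

/* The same access value replicated into all 16 fields. */
static inline uint32_t dacr_all(enum dacr_access a)
{
    return (uint32_t)a * 0x55555555u;
}
```

Note that dacr_all(DACR_MANAGER) is the 0xffffffff used above, which makes every domain a "manager", so the AP bits in the descriptors are not checked. If you want the AP bits to matter, client access (0x55555555 for all domains) would be the thing to try.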

Translation table registers

There are 3 of them. Two base registers and a control register
MCR p15,0,Rt,c2,c0,2    ; Write CP15 Translation Table Base Control Register
MRC p15,0,Rt,c2,c0,0    ; Read CP15 Translation Table Base Register 0
MRC p15,0,Rt,c2,c0,1    ; Read CP15 Translation Table Base Register 1

TTBCR - translation table base control register. Above all else, this determines whether TTBR0 or TTBR1 supplies the base address of the translation table.
It is zero after reset, all bits are zero except the following:

Setting N to 0 says to use TTBR0 for the base address and to use bits 31-14 as the base. This is how I intend to do business. Setting non-zero values activates logic that selects TTBR0 or TTBR1 depending on the virtual address. Note that when the PD bits are set, values in the TLB will still be used. If the TLB is flushed and the PD bits are set, translations will yield a translation fault.
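For the curious, the selection logic for non-zero N can be sketched as follows. This is my understanding of the ARMv7-A rules: N is the low 3 bits of TTBCR, TTBR0 is used when the top N bits of the virtual address are all zero, and the TTBR0 base then occupies bits 31 down to 14-N. The function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define TTBCR_N_MASK 0x7u        /* N lives in bits 2-0 */
#define TTBCR_PD0    (1u << 4)   /* assumed: disable table walks via TTBR0 */
#define TTBCR_PD1    (1u << 5)   /* assumed: disable table walks via TTBR1 */

/* True if a table walk for va would go through TTBR1. */
static inline bool uses_ttbr1(uint32_t ttbcr, uint32_t va)
{
    unsigned n = ttbcr & TTBCR_N_MASK;
    return n != 0 && (va >> (32u - n)) != 0;
}

/* Mask of the TTBR0 bits that hold the table base: bits 31 down to 14-N. */
static inline uint32_t ttbr0_base_mask(uint32_t ttbcr)
{
    unsigned n = ttbcr & TTBCR_N_MASK;
    return ~((1u << (14u - n)) - 1u);
}
```

With TTBCR set to 0, uses_ttbr1 is false for every address and the base mask is 0xFFFFC000, which is the 16K alignment again.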

I see no reason not to just set the TTBCR register to 0.

TTBR1 - translation table base register 1. I don't intend to use this register (I will set TTBCR to zero), so I skip describing it.

TTBR0 - translation table base register 0.

This register has an alternate definition when "multiprocessing extensions" are enabled. Also, I have not plumbed the depths of regions, inner, and outer, for which, see the online pages:


Have any comments? Questions? Drop me a line!

Kyu / tom@mmto.org