January 18, 2023
ARM Processor 101 -- the MMU
All of my 32 bit ARM boards implement the ARMv7-A architecture.
This means that what I learn about the MMU for one applies to all of them.
Manuals and documentation
ARM makes a vast and overwhelming amount of documentation available.
Care must be taken that you are (for example) reading the documents that
pertain to the ARMv7-A (or whatever architecture you are working with).
I have found myself mistakenly reading about MMU details for some other
architecture -- which may be very similar, but differ in vital details.
Basics
The ARM MMU can be used in many different ways.
In the most general case it is a 2 level MMU with 4k pages (as an example).
This is how something like the Linux operating system might use it.
The way I use it is as a 1 level MMU with 1M "sections".
My aim is to make it transparent and forget about it.
You might ask, "why not just turn it off entirely?", but as near as I can tell that will
not allow caches to be enabled. Also, marking as much of the address space as
possible "invalid" will catch certain bugs.
Translation table
The translation table must begin on a 16K boundary.
When a virtual address needs to be translated, the top 12 bits of that address select
the first level entry in the translation table. Hence each first level entry deals
with 1M of the virtual address space. Each entry is 32 bits in size.
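The arithmetic above is easy to check in C. This is only a sketch; the names l1_index and l1_entry_addr are made up for illustration. 4096 entries of 4 bytes each is exactly 16K, which fits with the 16K alignment requirement.

```c
#include <assert.h>
#include <stdint.h>

#define L1_ENTRIES    4096u               /* 2^12 entries, one per 1M */
#define L1_TABLE_SIZE (L1_ENTRIES * 4u)   /* 16K bytes in total */

/* Index of the first level entry that covers a virtual address. */
static uint32_t l1_index(uint32_t va)
{
    return va >> 20;                      /* keep the top 12 bits */
}

/* Address of that entry, given the base address of the table. */
static uint32_t l1_entry_addr(uint32_t table_base, uint32_t va)
{
    assert((table_base & 0x3fffu) == 0);  /* table must be 16K aligned */
    return table_base + l1_index(va) * 4u;    /* 4 bytes per entry */
}
```

For example, with a table at 0x80004000, the virtual address 0x40123456 uses entry 0x401, which lives at 0x80005004.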
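There are 4 types of table entries as indicated by the low 2 bits of the entry:
- 00 - unmapped (access will give a translation fault).
- 01 - pointer to a 2nd level page table (the "coarse" table of older manuals)
- 10 - section descriptor
- 11 - reserved on ARMv7-A unless the PXN attribute is implemented (this was the "fine" 2nd level table on ARMv5)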
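Decoding the type is just a matter of masking the low 2 bits. Here is a small sketch with hypothetical names; type 3 is left unnamed since its meaning depends on the architecture version.

```c
#include <assert.h>
#include <stdint.h>

/* The four encodings of bits 1-0, in the order of the list above. */
enum l1_type {
    L1_TYPE_FAULT   = 0,    /* 00: unmapped */
    L1_TYPE_TABLE   = 1,    /* 01: points to a 2nd level table */
    L1_TYPE_SECTION = 2,    /* 10: 1M section */
    L1_TYPE_3       = 3     /* 11: see the list above */
};

static enum l1_type l1_entry_type(uint32_t entry)
{
    return (enum l1_type)(entry & 3u);
}

/* Physical base address held in a section descriptor (bits 31-20). */
static uint32_t section_base(uint32_t entry)
{
    assert(l1_entry_type(entry) == L1_TYPE_SECTION);
    return entry & 0xfff00000u;
}
```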
As a side note, some ARM devices support 16M "supersections", but none that I am working with.
I use section descriptors, which are as follows.
My old printed manual (for ARMv5 it turns out) showed bits 19-12 as "should be zero".
ARMv7 has added many new definitions. Also note that attribute meanings have been redefined.
It is vital to use the relevant online documents.
- bits 31-20 (12 bits) section base address
- bit 19 NS
- bit 18 0 (a 1 here would make this a supersection)
- bit 17 NG
- bit 16 S
- bit 15 AP[2]
- bits 14-12 (3 bits) TEX
- bits 11-10 AP[1:0]
- bit 9 IMP (implementation dependent)
- bits 8-5 DOM (4 bits) domain (16 possible)
- bit 4 XN
- bit 3 C - cacheable
- bit 2 B - bufferable
- bit 1-0 "10" to make this a section descriptor
NS is "non secure" and only is significant on chips with security extensions
NG is "non global" and could be used to divide memory into global/non-global regions
S is "shareable"
AP (3 bits) are "access permission"
TEX along with C,B are the new "attribute bits".
The v7 document says that these bits now have more general meaning than their names imply
XN is "execute never" and prohibits instruction fetch if set.
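To make the layout concrete, here is a sketch of how a section descriptor could be assembled in C. The macro names and make_section are my own invention for illustration; the bit positions come straight from the list above.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions, taken from the bit list above. */
#define SEC_NS       (1u << 19)
#define SEC_NG       (1u << 17)
#define SEC_S        (1u << 16)
#define SEC_AP2      (1u << 15)
#define SEC_TEX(x)   (((x) & 7u) << 12)
#define SEC_AP10(x)  (((x) & 3u) << 10)
#define SEC_DOM(x)   (((x) & 15u) << 5)
#define SEC_XN       (1u << 4)
#define SEC_C        (1u << 3)
#define SEC_B        (1u << 2)
#define SEC_TYPE     2u                   /* bits 1-0 = 10 */

/* Combine a 1M aligned physical address with attribute flags. */
static uint32_t make_section(uint32_t phys, uint32_t flags)
{
    assert((phys & 0x000fffffu) == 0);    /* base must be 1M aligned */
    assert((flags & 0xfff00003u) == 0);   /* flags live in bits 19-2 only */
    return phys | flags | SEC_TYPE;
}
```

As a usage example, make_section(0x40000000, SEC_AP10(3) | SEC_DOM(0) | SEC_C | SEC_B) yields 0x40000c0e.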
A bit in the SCTLR register (namely the TRE bit) can change the meaning of the TEX bits.
When TRE (TEX remap enable) is 0, then TEX along with C and B work in the traditional way.
The 5 bits give 32 possibilities, many of which are "reserved", see this table:
The AP bits are as per this table.
Note that there are actually 3 bits, but separated in the descriptor.
Note that 011 will give full access (r/w).
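Since the three AP bits are not adjacent in the descriptor, a pair of helpers can hide the split. This is a sketch with hypothetical function names; the positions (AP[2] at bit 15, AP[1:0] at bits 11-10) are from the section descriptor layout.

```c
#include <assert.h>
#include <stdint.h>

/* Spread a 3-bit AP value (AP[2:0]) into its descriptor positions. */
static uint32_t section_ap_encode(uint32_t ap)
{
    assert(ap < 8);
    return ((ap >> 2) << 15)          /* AP[2]   -> bit 15 */
         | ((ap & 3u) << 10);         /* AP[1:0] -> bits 11-10 */
}

/* Gather the 3-bit AP value back out of a section descriptor. */
static uint32_t section_ap_decode(uint32_t entry)
{
    return (((entry >> 15) & 1u) << 2) | ((entry >> 10) & 3u);
}
```

The full access value 011 encodes to 0x00000c00.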
The DOM bits specify one of 16 domains. The DACR (domain access control register) has a 2 bit field controlling access to each domain.
A reasonable approach for me is to put everything into domain 0 and set up the DACR accordingly.
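Pulling the pieces together, a "transparent" table of 1M sections might be filled in like this. It is only a sketch under my own assumptions: fill_identity is a hypothetical helper, the mapped range is whatever the caller says it is, and any TEX/C/B attribute bits are passed in rather than chosen here.

```c
#include <assert.h>
#include <stdint.h>

#define L1_ENTRIES 4096u

/* Hypothetical helper: identity map count_mb megabytes starting at
 * megabyte first_mb, and leave every other entry as a fault entry. */
static void fill_identity(uint32_t *table, uint32_t first_mb,
                          uint32_t count_mb, uint32_t attr)
{
    uint32_t i;

    assert(first_mb <= L1_ENTRIES && count_mb <= L1_ENTRIES - first_mb);
    assert((attr & 0xfff00003u) == 0);    /* attr carries bits 19-2 only */

    for (i = 0; i < L1_ENTRIES; i++)
        table[i] = 0;                     /* type 00: translation fault */

    for (i = first_mb; i < first_mb + count_mb; i++)
        table[i] = (i << 20)              /* physical = virtual */
                 | attr
                 | (3u << 10)             /* AP = 011: full access */
                 | (0u << 5)              /* domain 0 */
                 | 2u;                    /* section descriptor */
}
```

On real hardware the table itself would need to sit on a 16K boundary; that is not enforced in this sketch.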
The DACR register
This is a 32 bit register consisting of 16 2-bit groups.
The 2 bits specify access for the given domain as follows:
- 00 - no access
- 01 - client access (as per bits in translation tables)
- 10 - undefined and unpredictable
- 11 - manager access (bits in translation tables are not checked).
MCR p15, 0,
To entirely get this business out of your hair, set the domain field in all the descriptors
to zero (actually any value will do), then do this:
set_DACR ( 0xffffffff );
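The value 0xffffffff is simply manager access (11) repeated for all 16 domains. Here is a sketch of how such values can be built up field by field. The helper names are hypothetical, and write_dacr is only my guess at what a routine like set_DACR might look like with GCC inline assembly (DACR is CP15 register c3, c0, 0); it is compiled only for ARMv7-A targets.

```c
#include <assert.h>
#include <stdint.h>

enum dacr_access { DACR_NONE = 0, DACR_CLIENT = 1, DACR_MANAGER = 3 };

/* Return dacr with the 2-bit field for one domain replaced. */
static uint32_t dacr_set(uint32_t dacr, unsigned domain, enum dacr_access acc)
{
    unsigned shift;

    assert(domain < 16);
    shift = domain * 2;
    dacr &= ~(3u << shift);               /* clear the old field */
    return dacr | ((uint32_t)acc << shift);
}

/* Same access setting for all 16 domains. */
static uint32_t dacr_all(enum dacr_access acc)
{
    uint32_t v = 0;
    unsigned d;

    for (d = 0; d < 16; d++)
        v = dacr_set(v, d, acc);
    return v;
}

#if defined(__ARM_ARCH_7A__)
/* Assumed implementation, not taken from any particular code base. */
static inline void write_dacr(uint32_t val)
{
    __asm__ volatile ("mcr p15, 0, %0, c3, c0, 0" : : "r" (val));
}
#endif
```

dacr_all(DACR_MANAGER) gives 0xffffffff, and dacr_all(DACR_CLIENT) gives 0x55555555, which would make the AP bits in the descriptors count.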
Translation table registers
There are 3 of them: two base registers and a control register.
MCR p15,0,
TTBCR - translation table base control register.
Above all else, this selects whether TTBR0 or TTBR1 supplies the translation table.
It is zero after reset. All bits are zero except the following fields:
- Bit 5 - PD1 - disables translations when TTBR1 is selected.
- Bit 4 - PD0 - disables translations when TTBR0 is selected.
- Bits 2-0 (3 bits) N - determines width of base field in TTBR0
Setting N to 0 says to always use TTBR0, with bits 31-14 as the base address.
This is how I intend to do business. Setting a non-zero value activates logic that selects TTBR0 or
TTBR1 depending on the virtual address.
Note that when the PD bits are set, values in the TLB will still be used.
If the TLB is flushed and the PD bits are set, translations will yield a translation fault.
I see no reason not to just set the TTBCR register to 0.
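For the curious, the selection logic for non-zero N can be sketched in a few lines. This is my reading of the architecture manual, not something I have exercised on hardware: TTBR0 is used when the top N bits of the virtual address are all zero, and TTBR1 otherwise. The names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TTBCR_PD1   (1u << 5)
#define TTBCR_PD0   (1u << 4)
#define TTBCR_N(n)  ((n) & 7u)

/* Returns 0 if TTBR0 handles the address, 1 if TTBR1 does. */
static int ttbr_select(uint32_t ttbcr, uint32_t va)
{
    unsigned n = ttbcr & 7u;

    if (n == 0)
        return 0;                     /* N = 0: TTBR0 covers everything */
    return (va >> (32 - n)) != 0;     /* any of the top N bits set? */
}

/* Size in bytes of the table that TTBR0 points to for a given N. */
static uint32_t ttbr0_table_bytes(unsigned n)
{
    assert(n < 8);
    return 0x4000u >> n;              /* 16K for N = 0, halved per step */
}
```

With N = 1, for instance, the lower 2G of the address space goes through TTBR0 and the upper 2G through TTBR1.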
TTBR1 - translation table base register 1.
I don't intend to use this register (I will set TTBCR to zero), so I skip describing it.
TTBR0 - translation table base register 0.
- Bits 31-14 Base address of the table
- Bits 13-6 should be zero
- Bit 5 NOS "not outer shareable"
- Bits 4-3 "region"
- Bit 1 S "shareable"
- Bit 0 C "inner cacheable"
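A TTBR0 value for the N = 0 case could be put together as below. This is a sketch using only the fields listed above, with hypothetical names. The write_ttbr0 routine assumes GCC inline assembly and the CP15 encoding c2, c0, 0 for TTBR0, and is compiled only for ARMv7-A targets.

```c
#include <assert.h>
#include <stdint.h>

#define TTBR_NOS     (1u << 5)
#define TTBR_RGN(x)  (((x) & 3u) << 3)
#define TTBR_S       (1u << 1)
#define TTBR_C       (1u << 0)

/* Combine a 16K aligned table address with the flag bits. */
static uint32_t make_ttbr0(uint32_t table_base, uint32_t flags)
{
    assert((table_base & 0x3fffu) == 0);  /* bits 13-0 of the base are zero */
    assert((flags & ~0x3bu) == 0);        /* only bits 5, 4-3, 1, 0 allowed */
    return table_base | flags;
}

#if defined(__ARM_ARCH_7A__)
/* Assumed implementation, shown for illustration only. */
static inline void write_ttbr0(uint32_t val)
{
    __asm__ volatile ("mcr p15, 0, %0, c2, c0, 0" : : "r" (val));
}
#endif
```

Passing flags of 0 gives a plain base address with all the sharing and caching hints for table walks turned off.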
This register has an alternate definition when "multiprocessing extensions" are enabled.
Also, I have not plumbed the depths of regions, inner, and outer, for which, see the online pages:
Have any comments? Questions?
Drop me a line!
Kyu / tom@mmto.org