September 29, 2016

U-boot internals

If you are looking for a comprehensive study of U-boot, you are not going to find it here. I have spent a little time reading and using the U-boot sources to learn what I can about the program. It is big. It inherits a lot of the Linux include file interdependency hell. It has its own way of doing things, but all in all it is a well-written program that is worth spending some time with.

By my count (a quick look at the 2015.10 version) there are about 15,400 source files. There are 1065 different config files in the "configs" directory. This is no small potatoes by anyone's count. The uncompressed tar file is 69 megabytes. This is a big program. The good news is that a lot of it can be ignored.

My interest is in learning what I can about the BBB (BeagleBone Black). Here is a vital tip. Almost all of the source files are irrelevant for any one specific target! You are constantly digging through reams of material you ought to just ignore. But the hot tip is this. Once you configure and build for your chosen target, the object files act like breadcrumbs, and you can use them to learn which source files are actually involved in the build for your target! For the BBB this turns out to be about 200 files. Now that is a manageable number! Here is the list from the 2015.01 sources. I need to repeat the exercise for the 2015.10 source tree one of these days, but this will have to do for now. Most of the file names have not changed.
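The breadcrumb trick boils down to turning each object file path back into the path of the source it came from. Here is a minimal sketch of that mapping, assuming each .o file sits beside its .c or .S source. The function name object_to_source is my own invention for illustration and is not part of U-boot.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of U-boot): given the path of an object
 * file left behind by the build, write the path of the source file it
 * would have come from, using the extension in src_ext (".c" or ".S").
 * Returns 0 on success, -1 if the path does not end in ".o" or the
 * output buffer is too small.
 */
int object_to_source(const char *obj, const char *src_ext,
                     char *out, size_t out_size)
{
        size_t len;
        int n;

        assert(obj != NULL && src_ext != NULL && out != NULL);

        len = strlen(obj);
        if (len < 3 || strcmp(obj + len - 2, ".o") != 0)
                return -1;

        /* copy everything except the trailing ".o", then add the extension */
        n = snprintf(out, out_size, "%.*s%s", (int)(len - 2), obj, src_ext);
        if (n < 0 || (size_t)n >= out_size)
                return -1;
        return 0;
}
```

Feeding it arch/arm/lib/cache-cp15.o with the extension ".c" yields arch/arm/lib/cache-cp15.c. A real script would walk the build tree, try both extensions for each object file, and keep whichever source file exists.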

My immediate interest is in learning about how U-boot sets up the mmu and cache for the BBB. A quick look at the above list suggests examining the following files:

So this approach quickly narrows down our study to 8 files in 2 directories. Imagine groping around through a tree with over 15,000 source files. I can easily imagine it because I have done it.

Note that this approach leads you quickly to the relevant C source files. The header files are another ball of string. I have thought of writing a script to extract only those files relevant to the BBB build into a special pruned source tree for study. The makedepend program has the logic to determine which files a given C file depends on, and perhaps it could be pressed into service.
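As a rough sketch of that idea: dependency tools such as makedepend (or gcc -M) emit make-style rules of the form "target: dep dep ...", sometimes continued across lines with a backslash. The function below pulls the dependency paths out of one such rule so that they could be copied into a pruned tree. The function name, the DEP_PATH_MAX limit, and the array layout are assumptions made for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define DEP_PATH_MAX 256        /* assumed limit on the length of one path */

/*
 * Hypothetical helper: split one make-style rule ("target: dep dep ...")
 * into its dependency paths.  Backslash-newline continuations are treated
 * as whitespace.  Stores at most max_deps paths and returns how many were
 * stored; a path too long for the buffer is skipped.
 */
size_t parse_dep_rule(const char *rule, char deps[][DEP_PATH_MAX],
                      size_t max_deps)
{
        const char *p;
        size_t count = 0;

        assert(rule != NULL && deps != NULL);

        p = strchr(rule, ':');
        if (p == NULL)
                return 0;               /* not a rule at all */
        p++;                            /* step past the colon */

        while (*p != '\0' && count < max_deps) {
                const char *start;
                size_t len;

                /* skip separators: whitespace and continuation backslashes */
                while (isspace((unsigned char)*p) || *p == '\\')
                        p++;
                if (*p == '\0')
                        break;

                /* collect one path up to the next separator */
                start = p;
                while (*p != '\0' && !isspace((unsigned char)*p) && *p != '\\')
                        p++;
                len = (size_t)(p - start);

                if (len < DEP_PATH_MAX) {
                        memcpy(deps[count], start, len);
                        deps[count][len] = '\0';
                        count++;
                }
        }
        return count;
}
```

Running this over the rules for every object file in the BBB build, and merging the results, would give the set of headers worth keeping.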

BBB cache and mmu setup

It turns out that the above 8 files are no help here, but it was a nice thought. The term "omap" is some kind of Texas Instruments product lingo for the family of parts the AM3359 is derived from or is part of.
We find in omap-cache.c this function:
void enable_caches(void)
{
        /* Enable D-cache. I-cache is already enabled in start.S */
        dcache_enable();
}
There are also two other functions in this file that merit careful study (I am not taking space to show the source here).
void dram_bank_mmu_setup(int bank)
void arm_init_domains(void)
We find start.S (and two other files) in the following location. I will note in passing that the file cache_v7.c has the code I "borrowed" to flush and invalidate selective cache lines.
Other extremely relevant files are these:
The jackpot seems to be arch/arm/lib/cache-cp15.c. In this file dcache_enable() is conditionally defined based on the macro CONFIG_SYS_DCACHE_OFF and it is a null routine if this macro is defined. The MMU setup is done in this file also and it is worthy of careful study.
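The conditional definition described above follows a common pattern, sketched here so that it can be compiled and run on any host. This is an illustration, not the U-boot source: the variable fake_cr is my own stand-in for the real control register, and the enabled branch leaves out the MMU setup that the real routine performs. Only the names CONFIG_SYS_DCACHE_OFF, dcache_enable, and CR_C come from U-boot.

```c
#include <stdint.h>

#define CR_C    (1 << 2)        /* Dcache enable bit, as in U-boot */

/* Assumption: a plain variable standing in for the CP15 control register. */
static uint32_t fake_cr;

#ifdef CONFIG_SYS_DCACHE_OFF
/* With the macro defined, the routine compiles to nothing. */
void dcache_enable(void)
{
}
#else
/* Otherwise it sets the D-cache bit (the real code also sets up the MMU). */
void dcache_enable(void)
{
        fake_cr |= CR_C;
}
#endif
```

Callers such as enable_caches() do not need to know which variant was built; they call dcache_enable() either way, and the build configuration decides whether anything happens.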

Build U-boot with the D cache disabled

I add the line CONFIG_SYS_DCACHE_OFF=y to my bbb_defconfig file and rebuild U-boot to see if this will provide an environment Xinu can boot from. It does not work.

U-boot cache handling

Relevant files:
Interesting code follows, collected from both of the above files. Note that the D cache, the MMU, and the I cache are all controlled by bits in the CP15 control register: CR_M controls the MMU, CR_C controls the D cache, and CR_I controls the I cache.
/*
 * CR1 bits (CP#15 CR1)
 */
#define CR_M    (1 << 0)        /* MMU enable                           */
#define CR_A    (1 << 1)        /* Alignment abort enable               */
#define CR_C    (1 << 2)        /* Dcache enable                        */
#define CR_W    (1 << 3)        /* Write buffer enable                  */
#define CR_P    (1 << 4)        /* 32-bit exception handler             */
#define CR_D    (1 << 5)        /* 32-bit data address range            */
#define CR_L    (1 << 6)        /* Implementation defined               */
#define CR_B    (1 << 7)        /* Big endian                           */
#define CR_S    (1 << 8)        /* System MMU protection                */
#define CR_R    (1 << 9)        /* ROM MMU protection                   */
#define CR_F    (1 << 10)       /* Implementation defined               */
#define CR_Z    (1 << 11)       /* Implementation defined               */
#define CR_I    (1 << 12)       /* Icache enable                        */
#define CR_V    (1 << 13)       /* Vectors relocated to 0xffff0000      */
#define CR_RR   (1 << 14)       /* Round Robin cache replacement        */
#define CR_L4   (1 << 15)       /* LDR pc can set T bit                 */
#define CR_DT   (1 << 16)
#define CR_IT   (1 << 18)
#define CR_ST   (1 << 19)
#define CR_FI   (1 << 21)       /* Fast interrupt (lower latency mode)  */
#define CR_U    (1 << 22)       /* Unaligned access operation           */
#define CR_XP   (1 << 23)       /* Extended page tables                 */
#define CR_VE   (1 << 24)       /* Vectored interrupts                  */
#define CR_EE   (1 << 25)       /* Exception (Big) Endian               */
#define CR_TRE  (1 << 28)       /* TEX remap enable                     */
#define CR_AFE  (1 << 29)       /* Access flag enable                   */
#define CR_TE   (1 << 30)       /* Thumb exception enable               */

#define PGTABLE_SIZE            (4096 * 4)

static inline unsigned int get_cr(void)
{
        unsigned int val;
        asm volatile("mrc p15, 0, %0, c1, c0, 0 @ get CR" : "=r" (val) : : "cc");
        return val;
}

static inline void set_cr(unsigned int val)
{
        asm volatile("mcr p15, 0, %0, c1, c0, 0 @ set CR"
          : : "r" (val) : "cc");
        isb();
}

static void cp_delay (void)
{
        volatile int i;

        /* copro seems to need some delay between reading and writing */
        for (i = 0; i < 100; i++)
                nop();
        asm volatile("" : : : "memory");
}

/* cache_bit must be either CR_I or CR_C */
static void cache_enable(uint32_t cache_bit)
{
        uint32_t reg;

        /* The data cache is not active unless the mmu is enabled too */
        if ((cache_bit == CR_C) && !mmu_enabled())
                mmu_setup();
        reg = get_cr(); /* get control reg. */
        cp_delay();
        set_cr(reg | cache_bit);
}

/* cache_bit must be either CR_I or CR_C */
static void cache_disable(uint32_t cache_bit)
{
        uint32_t reg;

        reg = get_cr();
        cp_delay();

        if (cache_bit == CR_C) {
                /* if cache isn't enabled no need to disable */
                if ((reg & CR_C) != CR_C)
                        return;
                /* if disabling data cache, disable mmu too */
                cache_bit |= CR_M;
        }
        reg = get_cr();
        cp_delay();
        if (cache_bit == (CR_C | CR_M))
                flush_dcache_all();
        set_cr(reg & ~cache_bit);
}

// -------------------------------------------------

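static inline unsigned int get_cr(void)
{
        unsigned int val;
        asm volatile("mrc p15, 0, %0, c1, c0, 0 @ get CR" : "=r" (val) : : "cc");
        return val;
}

static inline void set_cr(unsigned int val)
{
        asm volatile("mcr p15, 0, %0, c1, c0, 0 @ set CR"
          : : "r" (val) : "cc");
        isb();
}

Note the isb instruction at the end of set_cr(). It is an instruction synchronization barrier, which ensures that the instructions that follow are fetched and executed with the new control register setting in effect.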

The D cache can be enabled or disabled by a read/modify/write exercise on the bits in this register. Note that before the cache is disabled, it must be flushed, so that any dirty lines are written back to memory.
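The read/modify/write sequence, with the flush placed before the disable, can be sketched on a host machine as follows. Everything prefixed with sim_ is an assumption made for this demonstration: a variable replaces the CP15 control register and a counter replaces the real cache flush, so none of this touches hardware. Only the CR_M and CR_C bit values are taken from the U-boot listing above.

```c
#include <stdint.h>

#define CR_M    (1 << 0)        /* MMU enable    */
#define CR_C    (1 << 2)        /* Dcache enable */

/* Assumptions for a host-side demo: a variable replaces the control
 * register, and a counter replaces the real cache flush. */
static uint32_t sim_cr;
static int sim_flush_count;

static uint32_t sim_get_cr(void)       { return sim_cr; }
static void sim_set_cr(uint32_t val)   { sim_cr = val; }
static void sim_flush_dcache_all(void) { sim_flush_count++; }

/* Enable the D cache: read, set the bits, write back.  The MMU bit is set
 * too because the D cache is not active unless the MMU is enabled. */
void sim_dcache_enable(void)
{
        uint32_t reg = sim_get_cr();

        sim_set_cr(reg | CR_M | CR_C);
}

/* Disable the D cache: flush first, then clear the D cache and MMU bits. */
void sim_dcache_disable(void)
{
        uint32_t reg = sim_get_cr();

        if ((reg & CR_C) == 0)
                return;                 /* already off, nothing to flush */
        sim_flush_dcache_all();
        sim_set_cr(reg & ~(uint32_t)(CR_C | CR_M));
}
```

Other bits in the register, such as the I cache enable, pass through both routines untouched, which is the whole point of reading the register before writing it. On real hardware the enable path must also build the page tables before setting CR_M.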

Feedback? Questions? Drop me a line!

Tom's Computer Info