Here is the code for the whole project on Github:
And here is the heart of the code. I am not showing all of it here (see Github for that). A detailed discussion follows.
An important note first. This file is blink.S. The capital "S" extension is very important. It triggers the C preprocessor to run, allowing me to use #define macros as part of the assembly code, which I consider a life saver. See the makefile: I use gcc rather than "as" to assemble the code. Yes, that probably sounds odd, but it works and allows the use of macros.
/* This is just one of 32 bits */
#define R_IO_BANK0      0x20

#define GPIO_25         0x02000000

/* An odd thing.  The disassembly (dump) shows "subs", but we must
 * write it here as "sub" or we get errors.  Some gnu as quirk.
 *
 * This fits in only 76 bytes as compared to 124 for the C version
 */

        .cpu cortex-m0
        .thumb

        .text

@ execution starts here.
@ note that we don't need a stack
@ in fact we don't need SRAM at all.

start:
        @ reset IO Bank 0
        ldr r1,=RESET_BASE_CLR
        mov r0,#R_IO_BANK0
        str r0, [r1]

        @ loop/poll until done
        ldr r1,=RESET_BASE_RW
wait_loop:
        ldr r2, [r1,#8]
        tst r0, r2
        beq wait_loop

        @ Set function select to software IO
        @ for GPIO 25
        ldr r1,=IO_BANK0_BASE_RW
        mov r2, #0xcc
        mov r0,#5
        str r0, [r1,r2]

        @ Enable output for GPIO 25
        ldr r1,=SIO_BASE
        ldr r0,=GPIO_25
        str r0, [r1,#SIO_OE_SET]

blink_loop:
        str r0, [r1,#SIO_OUT_SET]
        bl blink_delay
        str r0, [r1,#SIO_OUT_CLR]
        bl blink_delay
        b blink_loop

@ ==================================
@ This blinks at about 1 Hz
#define DELAY_COUNT     0x80000

blink_delay:
        ldr r3,=DELAY_COUNT
delay_loop:
        sub r3, r3, #1
        cmp r3, #0
        bne delay_loop
        bx lr

The heart of the code is the "blink_loop". Everything else is just setup.
Let's talk about registers. I use r3 in the delay function, so it is important to remember this and not use r3 for anything else in the mainline code. Other than that, I get everything done using only the 3 registers r0, r1, and r2. I use r2 to hold one offset. I use r1 to hold a base address and r0 to hold data.
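Why does only one offset (0xcc) need a register of its own, while the others sit directly in the instruction? A likely explanation is the Thumb-1 encoding: the immediate form of "ldr"/"str" with a base register carries a 5-bit offset scaled by 4, so only word-aligned offsets from 0 to 124 fit. The C sketch below is not part of the project and the function name is made up for illustration. It just checks that rule.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Thumb-1 "str Rt, [Rn, #imm]" (and the matching "ldr") encodes a
 * 5-bit immediate that is multiplied by 4, so the offset must be
 * word aligned and no larger than 31 * 4 = 124. */
bool fits_str_imm_offset(uint32_t offset)
{
    return (offset % 4u) == 0u && offset <= 124u;
}
```

With this rule, the offset 8 used in the wait loop fits in the instruction, while 0xcc (204) does not, which is consistent with the listing loading 0xcc into r2 and using the register-offset form "str r0, [r1,r2]".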
Notice how I use the "=" trick to make the assembler find a place to stick data. The way this works is that the assembler puts the value in a literal pool near the code and emits a load instruction that fetches it from there.
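It may help to see why some constants in the listing use a plain "mov" while others use the "=" form. In Thumb-1 a "mov" with an immediate can only carry an 8-bit value (0 to 255), so anything larger has to come from memory. The C sketch below is an illustration only; the function name is hypothetical and not from the project.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * A Thumb-1 "mov Rd, #imm" holds an 8-bit immediate, so a single
 * mov can only load values from 0 to 255. */
bool fits_mov_imm8(uint32_t value)
{
    return value <= 0xFFu;
}
```

By this test R_IO_BANK0 (0x20), 0xcc and 5 all fit in a "mov", while GPIO_25 (0x02000000) and DELAY_COUNT (0x80000) do not, which matches where the listing switches to "ldr rX,=...".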
Consider addresses. We need 32 bit addresses. What the assembler does for us is to generate PC relative addresses. The instruction holds an offset from the current PC, which is small enough to fit somewhere in the 16 bit thumb opcode. It does this for constants and for branch targets (such as "wait_loop"). Take note of two things. One is that this is all transparent. I just code up a label and a branch and the assembler does the dirty work. The other is a happy side effect. The code is position independent. With no hard addresses in the code, it can be relocated anywhere.
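The PC relative load can be modeled in a few lines of C. This is only a sketch of how the Thumb "ldr Rt, [pc, #imm]" form computes its address, assuming the usual ARMv6-M rule that the PC reads as the instruction address plus 4, rounded down to a multiple of 4. The function names and the example immediate are invented for illustration.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Address fetched by a Thumb "ldr Rt, [pc, #imm8*4]" that sits at
 * insn_addr.  The PC value used is (insn_addr + 4) rounded down to
 * a word boundary, and the 8-bit immediate counts 32-bit words. */
uint32_t literal_address(uint32_t insn_addr, uint8_t imm8)
{
    uint32_t pc = (insn_addr + 4u) & ~3u;
    return pc + (uint32_t)imm8 * 4u;
}

/* Distance from the instruction to its literal.  Only this distance
 * is stored in the opcode, never the absolute address. */
uint32_t literal_distance(uint32_t insn_addr, uint8_t imm8)
{
    return literal_address(insn_addr, imm8) - insn_addr;
}
```

For an instruction at 0x10000000 with an immediate of 16 words, the literal is read from 0x10000044. Move the same instruction to 0x90000000 and the distance is still 0x44, which is why the code does not care where it is loaded (as long as the new address keeps the same word alignment).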
You might be asking just where this code does run. This is specified in the linker script (see blink.lds). Linker scripts are one of the things embedded programmers have to deal with that regular programmers never care or know about. The one for this project is very simple, as follows:
SECTIONS
{
    . = 0x10000000;
    .text : { *(.text*) }
}
This gathers all of the text sections together and locates them at 0x10000000. The careful reader will notice that this is the address of the XIP (execute in place) area in the RP2040 chip. However, don't be fooled by this.
Given that all the code is position independent, with PC relative addressing used every place where a definite address might be used, the code could be loaded anywhere and would run! The value in the linker script can be anything at all. In fact I changed it to 0x90000000 and everything worked just fine.
Finding out just what really goes on will require some study of the bootrom code. Whether or not XIP is being used (and I doubt seriously that it is in this case) is entirely unknown and probably academic.
Here is what we know so far. The bootrom pulls 256 bytes (or a bit less), loads them someplace into SRAM, and starts that code running. This is intended to be a second stage boot loader. In our case it is the entire program, which works fine for this tiny demo. For something bigger we will have to learn more.
Both the bootrom details and XIP are a topic for another writeup when I learn more about them.
So there is a place for assembly coding after all. This code avoids use of the stack entirely, and in fact does not use SRAM at all. But as far as the end result goes (blinking the LED), both the C version and the assembly version get the job done.
A negative aspect of the assembly code is several hard coded constants (like 0xcc for the offset to get to the proper function select register). The C code was able to let the compiler figure this out by using a well designed "struct". The code has several other constants, and honestly they all should be set up using informative macros (such as one for the value "5" that selects software IO). This doesn't matter terribly much for this little demo project, but for a bigger project it would lead to code that would be hard to understand and maintain.
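To show where 0xcc comes from, here is a minimal C sketch of the struct approach. It is not the struct from the project's C version (that code is not shown here); the type, macro and function names are hypothetical. It assumes the RP2040 IO Bank 0 layout of one status register followed by one control register per GPIO.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only.
 * Assumed layout: each GPIO has a STATUS word followed by a CTRL
 * word, and the function select lives in CTRL. */
struct gpio_regs {
    volatile uint32_t status;
    volatile uint32_t ctrl;
};

struct io_bank0 {
    struct gpio_regs gpio[30];
};

/* The bare "5" in the assembly listing: software IO. */
#define FUNCSEL_SIO 5u

/* Byte offset of the CTRL register for a given pin, derived from
 * the struct layout instead of being typed in by hand. */
size_t gpio_ctrl_offset(unsigned pin)
{
    return offsetof(struct io_bank0, gpio)
         + (size_t)pin * sizeof(struct gpio_regs)
         + offsetof(struct gpio_regs, ctrl);
}
```

Calling gpio_ctrl_offset(25) gives 25 * 8 + 4 = 204, which is the 0xcc that had to be hard coded in the assembly. In the assembly file the same idea could be expressed with #define macros, since the C preprocessor runs on blink.S.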
Tom's Computer Info / tom@mmto.org