January 20, 2025

Antminer S9 board - examination of BOOT.BIN

In my U-Boot build, in the spl directory, we find:
-rw-r--r--  1 tom tom  135484 Jan 19 22:15 boot.bin
I have a program "bootex" that dumps the Zynq bootrom header. Here is what it sees:
Image read: 135484 bytes
Source offset: 000008c0
Image load : 00000000
Image start : 00000000
Image length : 0002113c (135484 bytes)
Image total length : 0002113c (135484 bytes)
Image header pointer : 00000000
Partition header pointer : 00000000
At least two things do not look right.

One is that the image length is shown as 135484 bytes, which is the length of the entire BOOT.BIN file. It should instead describe the image contained in the file, which is 135484 minus the size of the bootrom header. The bootrom header is 2240 bytes (0x8c0), so the image size should be 135484 - 2240 = 133244, which is exactly the size of u-boot-dtb.bin as it should be. The value in the header is wrong, but it won't hurt, as it will just cause the bootrom to load some extra rubbish. It could hurt if the size of the SPL got near the 192K limit and that extra 2240 bytes pushed us over.
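The arithmetic above is easy to check. Here is a minimal sketch in Python that uses only the numbers from the file listing and the bootex dump. The variable names are made up for illustration, and "192K" is assumed to mean 192 * 1024 bytes.

```python
# Numbers copied from the directory listing and the bootex dump above.
file_size = 135484        # size of boot.bin on disk
source_offset = 0x8c0     # where the image starts inside the file
claimed_length = 0x2113c  # "Image length" as written in the header
size_limit = 192 * 1024   # assumption: "192K" means 192 * 1024 bytes

# The image proper is whatever follows the bootrom header.
expected_length = file_size - source_offset

# How many extra bytes the bootrom would load because of the wrong length.
excess = claimed_length - expected_length

print(f"header size     : {source_offset} bytes")
print(f"expected length : {expected_length} (0x{expected_length:x})")
print(f"claimed length  : {claimed_length} (0x{claimed_length:x})")
print(f"excess loaded   : {excess} bytes")
print(f"room below limit: {size_limit - claimed_length} bytes")
```

The excess is exactly the header size, which is what you would expect if the tool wrote the file length where the image length belongs.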

The other is the start address of 0. I thought that this was an issue, but it is not at all.

What about that start address?

I use objdump to disassemble the ELF file (u-boot-spl). This yields incomplete and shabby output for reasons that I don't understand. For example, why does it not disassemble the first 32 bytes? When I run OpenOCD, it can disassemble them, and that turned out to be very useful. In particular, it informs me that the instruction at address 0 is a branch to 0x44!!
00000000 <__image_copy_start>:
       0:   0f 00 00 ea 14 f0 9f e5 14 f0 9f e5 14 f0 9f e5     ................
      10:   14 f0 9f e5 14 f0 9f e5 14 f0 9f e5 14 f0 9f e5     ................
00000020 <_undefined_instruction>:
      20:   00000040    .word   0x00000040
00000024 <_software_interrupt>:
      24:   00000040    .word   0x00000040
00000028 <_prefetch_abort>:
      28:   00000040    .word   0x00000040
0000002c <_data_abort>:
      2c:   00000040    .word   0x00000040
00000030 <_not_used>:
      30:   00000040    .word   0x00000040
00000034 <_irq>:
      34:   00000040    .word   0x00000040
00000038 <_fiq>:
      38:   00000040    .word   0x00000040
      3c:   deadbeef    .word   0xdeadbeef
00000040 :
      40:   eafffffe    b   40 

00000044 :
      44:   ea000012    b   94 
00000048 :
      48:   e10f0000    mrs r0, CPSR
      4c:   e200101f    and r1, r0, #31
I do a hex dump of u-boot-spl-align.bin and see the following that corresponds to the disassembly above of the elf file:
00000000 0f00 00ea 14f0 9fe5 14f0 9fe5 14f0 9fe5
00000010 14f0 9fe5 14f0 9fe5 14f0 9fe5 14f0 9fe5
00000020 4000 0000 4000 0000 4000 0000 4000 0000
00000030 4000 0000 4000 0000 4000 0000 efbe adde
00000040 feff ffea 1200 00ea 0000 0fe1 1f10 00e2
00000050 1a00 31e3 1f00 c013 1300 8013 c000 80e3
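The hex dump shows bytes in file order, while the disassembly shows 32-bit words, so the two look different at first glance. Here is a small sketch of the conversion, using the first and fourth lines of the dump above as input and assuming the code is little-endian (which the match with the disassembly suggests).

```python
import struct

# Two lines copied from the hex dump above (offsets 0x00 and 0x30).
line_00 = "0f00 00ea 14f0 9fe5 14f0 9fe5 14f0 9fe5"
line_30 = "4000 0000 4000 0000 4000 0000 efbe adde"

def dump_line_to_words(line):
    """Turn one 16-byte hex dump line into four little-endian 32-bit words."""
    raw = bytes.fromhex(line.replace(" ", ""))
    return struct.unpack("<4I", raw)

words_00 = dump_line_to_words(line_00)
words_30 = dump_line_to_words(line_30)

for base, words in ((0x00, words_00), (0x30, words_30)):
    for i, word in enumerate(words):
        print(f"{base + 4 * i:08x}: {word:08x}")
```

Read this way, the bytes "0f00 00ea" become the word ea00000f and "efbe adde" becomes deadbeef, matching the disassembly.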
The label "__image_copy_start" is something that relocation code uses. We should not be (and hopefully are not) relocating the SPL code, though if we are, maybe that is what is making things blow up?

I use "readelf -h u-boot-spl" and get (among other things):

Entry point address:               0x0
A look at u-boot-spl.lds helps explain why we don't just start out with code from start.S -- it drags in vectors from somewhere.
 .text :
 {
  __image_copy_start = .;
  *(.vectors)
  arch/arm/cpu/armv7/start.o (.text*)

So, try the 0x44 address

I rebuild a BOOT.BIN with the 0x44 start address. I try it and get nothing on the serial port. Attaching JTAG tells me:
halt
zynq.cpu0: MPIDR level2 0, cluster 0, core 0, multi core, no SMT
target halted in Thumb state due to debug-request, current mode: Supervisor
cpsr: 0x400001f3 pc: 0xfffff934
MMU: disabled, D-Cache: disabled, I-Cache: enabled
This is really no different from when I started it running at 0.
I examine memory at the start of OCM, and things are as expected:
mdw 0 32
0x00000000: ea00000f e59ff014 e59ff014 e59ff014 e59ff014 e59ff014 e59ff014 e59ff014
0x00000020: 00000040 00000040 00000040 00000040 00000040 00000040 00000040 deadbeef
0x00000040: eafffffe ea000012 e10f0000 e200101f e331001a 13c0001f 13800013 e38000c0
0x00000060: e129f000 ee110f10 e3c00a02 ee010f10 e59f0084 ee0c0f10 eb000006 eb00001d

arm disassemble 0 32
0x00000000  000f ea00	b	#0x44
0x00000004  f014 e59f	ldr	pc, [pc, #0x14]
0x00000008  f014 e59f	ldr	pc, [pc, #0x14]
0x0000000c  f014 e59f	ldr	pc, [pc, #0x14]
0x00000010  f014 e59f	ldr	pc, [pc, #0x14]
0x00000014  f014 e59f	ldr	pc, [pc, #0x14]
0x00000018  f014 e59f	ldr	pc, [pc, #0x14]
0x0000001c  f014 e59f	ldr	pc, [pc, #0x14]
0x00000020  0040 0000	andeq	r0, r0, r0, asr #32
0x00000024  0040 0000	andeq	r0, r0, r0, asr #32
0x00000028  0040 0000	andeq	r0, r0, r0, asr #32
0x0000002c  0040 0000	andeq	r0, r0, r0, asr #32
0x00000030  0040 0000	andeq	r0, r0, r0, asr #32
0x00000034  0040 0000	andeq	r0, r0, r0, asr #32
0x00000038  0040 0000	andeq	r0, r0, r0, asr #32
0x0000003c  beef dead	cdple	p14, #0xa, c11, c13, c15, #7
0x00000040  fffe eaff	b	#0x40
0x00000044  0012 ea00	b	#0x94

0x00000048  0000 e10f	mrs	r0, apsr
0x0000004c  101f e200	and	r1, r0, #0x1f
0x00000050  001a e331	teq	r1, #0x1a
0x00000054  001f 13c0	bicne	r0, r0, #0x1f
0x00000058  0013 1380	orrne	r0, r0, #0x13
0x0000005c  00c0 e380	orr	r0, r0, #0xc0
0x00000060  f000 e129	msr	cpsr_fc, r0
0x00000064  0f10 ee11	mrc	p15, #0, r0, c1, c0, #0
0x00000068  0a02 e3c0	bic	r0, r0, #0x2000
0x0000006c  0f10 ee01	mcr	p15, #0, r0, c1, c0, #0
.....
arm disassemble 0x94 32
0x00000094  ffeb eaff	b	#0x48
0x00000098  0000 e3a0	mov	r0, #0
0x0000009c  0f17 ee08	mcr	p15, #0, r0, c8, c7, #0
0x000000a0  0f15 ee07	mcr	p15, #0, r0, c7, c5, #0
0x000000a4  0fd5 ee07	mcr	p15, #0, r0, c7, c5, #6
0x000000a8  f04f f57f	dsb	sy
0x000000ac  f06f f57f	isb	sy
This is all very interesting. First of all, we can see that there is nothing wrong with starting at address 0, as that will just branch to 0x44 anyway. Who knows why my objdump-generated disassembly was incomplete and misleading. Also notice that 0x44 branches to 0x94, which branches back to 0x48, which seems to start the show.
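The branch targets can be checked by hand from the raw words. Below is a minimal sketch of the decoding for the two instruction forms that appear in the first 0x98 bytes: the ARM-state B instruction and the "ldr pc, [pc, #imm]" literal load used by the vectors. The function names are made up for illustration, and the condition field is ignored.

```python
def arm_branch_target(addr, word):
    """Return the target of an ARM-state B/BL instruction located at addr."""
    if ((word >> 25) & 0x7) != 0b101:
        raise ValueError(f"{word:08x} is not a B/BL instruction")
    imm24 = word & 0x00ffffff
    if imm24 & 0x00800000:           # sign-extend the 24-bit word offset
        imm24 -= 1 << 24
    # In ARM state the PC reads as the instruction address plus 8.
    return (addr + 8 + (imm24 << 2)) & 0xffffffff

def ldr_literal_slot(addr, word):
    """Return the address read by an ARM-state 'ldr Rt, [pc, #imm]' at addr."""
    if (word & 0x0f7f0000) != 0x051f0000:
        raise ValueError(f"{word:08x} is not a PC-relative ldr")
    imm12 = word & 0xfff
    if not word & (1 << 23):         # U bit clear means subtract the offset
        imm12 = -imm12
    return addr + 8 + imm12

# Words copied from the "mdw" and "arm disassemble" output above.
chain = [(0x00, 0xea00000f), (0x44, 0xea000012),
         (0x94, 0xeaffffeb), (0x40, 0xeafffffe)]
for addr, word in chain:
    print(f"b at {addr:#04x} -> {arm_branch_target(addr, word):#04x}")

# The vector at 0x04 loads its target from the word at this address.
print(f"ldr at 0x04 reads {ldr_literal_slot(0x04, 0xe59ff014):#04x}")
```

This reproduces the chain 0 to 0x44 to 0x94 to 0x48, and shows that the branch at 0x40 targets itself. The vector at 0x04 reads the word at 0x20, which holds 0x40, so it ends up in the same loop.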

Note the spin loop at 0x40 that all of the vectors take you to. Next I try this:

resume 0x44
halt
target halted in ARM state due to debug-request, current mode: Abort
cpsr: 0x40000197 pc: 0x00000040
MMU: disabled, D-Cache: disabled, I-Cache: enabled
Data fault registers        DFSR: 00000001, DFAR: 78000023
Instruction fault registers IFSR: 00000000, IFAR: 00000000
Well, look at that. Now it has made its way to the spin loop at 0x40.
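The fault registers in that halt message say a little more about how it got there. Here is a minimal sketch of a DFSR decoder, assuming the ARMv7 short-descriptor layout (status in bits [3:0] plus bit 10, write flag in bit 11). The table is deliberately partial and the function name is made up.

```python
# Partial table of ARMv7 short-descriptor fault status codes.
# Only a few common entries are listed; anything else reports as "other".
FAULT_STATUS = {
    0b00001: "alignment fault",
    0b00101: "translation fault (section)",
    0b00111: "translation fault (page)",
    0b01101: "permission fault (section)",
    0b01111: "permission fault (page)",
}

def decode_dfsr(dfsr):
    """Return (fault description, was_write) for a DFSR value."""
    fs = (((dfsr >> 10) & 1) << 4) | (dfsr & 0xf)
    was_write = bool(dfsr & (1 << 11))
    return FAULT_STATUS.get(fs, "other"), was_write

# Values copied from the halt message above.
fault, was_write = decode_dfsr(0x00000001)
dfar = 0x78000023
print(f"{fault}, {'write' if was_write else 'read'} at {dfar:#010x}")
print(f"address is 4-byte aligned: {dfar % 4 == 0}")
```

Under that assumption, DFSR 00000001 decodes as an alignment fault on a read, and the DFAR value 78000023 is indeed not word aligned.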
Now I try this:
> resume 0x94
> halt
target halted in Thumb state due to debug-request, current mode: Supervisor
cpsr: 0x400001b3 pc: 0xfffffd86
MMU: disabled, D-Cache: disabled, I-Cache: enabled
> resume 0x94
> halt
target halted in ARM state due to debug-request, current mode: Undefined instruction
cpsr: 0x4000019b pc: 0x00000040
MMU: disabled, D-Cache: disabled, I-Cache: enabled
It seems to be a crap shoot whether it goes to the 0x40 spin loop to die or transitions to Thumb mode, running in the last page (0xfffffxxx).
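OpenOCD derives the "state" and "mode" in these halt messages from the cpsr value. Here is a minimal sketch of that decoding, using the standard ARMv7-A bit positions (mode in bits [4:0], T in bit 5, F in bit 6, I in bit 7). The function name is made up for illustration.

```python
# ARMv7-A processor modes, keyed by CPSR bits [4:0].
MODES = {
    0x10: "User", 0x11: "FIQ", 0x12: "IRQ", 0x13: "Supervisor",
    0x16: "Monitor", 0x17: "Abort", 0x1a: "Hyp",
    0x1b: "Undefined instruction", 0x1f: "System",
}

def decode_cpsr(cpsr):
    """Return (state, mode, irq_masked, fiq_masked) for a CPSR value."""
    state = "Thumb" if cpsr & (1 << 5) else "ARM"
    mode = MODES.get(cpsr & 0x1f, "unknown")
    return state, mode, bool(cpsr & (1 << 7)), bool(cpsr & (1 << 6))

# cpsr values copied from the four halt messages above.
for cpsr in (0x400001f3, 0x40000197, 0x400001b3, 0x4000019b):
    state, mode, irq_off, fiq_off = decode_cpsr(cpsr)
    print(f"{cpsr:#010x}: {state} state, {mode}, "
          f"IRQ masked={irq_off}, FIQ masked={fiq_off}")
```

The same fields are what the code at 0x48 manipulates: it reads the CPSR, compares the mode bits with 0x1a, otherwise forces the mode to 0x13 (Supervisor), and sets 0xc0 to mask IRQ and FIQ.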

Source for the startup code is in arch/arm/cpu/armv7/start.S

Dig deeper using JTAG

We should be able to set a breakpoint at "main" using:
bp 0x4e0
But it doesn't work. Time to start reading, starting with my own old notes which have useful links.

Tom's Computer Info / tom@mmto.org