April 19, 2026

Allwinner H5 network driver -- debug an abort in the EMAC driver

A status report is in order first. I finished work on the PHY driver, adding code to detect 1000 Mbit autonegotiation and to set up the EMAC driver accordingly. This driver is now shared by both the h3 and h5, and the code works fine on the h3 chip.

I then decided to take the working emac driver for h3 and share it between the h3 and h5. This means that I set aside the fresh start I had made on an emac driver for the h5. Sharing the driver also required a small amount of code to be added to support 1000 Mbit operation.

Then I just gave it a try. The console messages all look fine, but the first sign of trouble is that the Kyu shell does not respond in any way. After a few seconds I get:

Synchronous Abort
cur_thread: 400ae9c8 (net-timer)

SP:      40581eb0
LR:      40004678
ELR:     4002a2b8

A synchronous abort is an ARM processor exception. It is synchronous in that it is caused by the instruction being executed, rather than arriving at some unrelated time the way an interrupt does. Typically memory is being accessed via a wild pointer, at a misaligned address, or something of the sort.

The ELR address is the address of interest. I look for it in kyu.dump. Interestingly enough, it falls inside memcpy(). That the abort is taking place in the net-timer thread is perplexing, as that thread is part of the TCP code.

There is another ARM register of interest that I would like to add to this dump, and that is the DFAR (Data Fault Address Register). This register gives the exact address that was referenced to cause the exception. However DFAR is an arm32 register. On aarch64 the equivalent is the Fault Address Register FAR_ELx (FAR_EL1 for an exception taken to EL1).
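On aarch64 the fault address register has a companion, ESR_ELx (Exception Syndrome Register), which encodes why the exception was taken. The following is a minimal sketch of decoding an ESR value for a dump like the one above. The field positions and class values are from the ARMv8-A architecture, but every function name here is hypothetical and not part of Kyu, and the register read at the end assumes the exception is taken at EL1, which may not match how Kyu runs.

```c
#include <stdint.h>

/* Exception class values, ESR_ELx bits [31:26] (ARMv8-A). */
#define EC_INSN_ABORT_LOWER  0x20   /* instruction abort from a lower EL */
#define EC_INSN_ABORT_SAME   0x21   /* instruction abort from the same EL */
#define EC_PC_ALIGN          0x22   /* PC alignment fault */
#define EC_DATA_ABORT_LOWER  0x24   /* data abort from a lower EL */
#define EC_DATA_ABORT_SAME   0x25   /* data abort from the same EL */
#define EC_SP_ALIGN          0x26   /* SP alignment fault */

/* Exception class: ESR bits [31:26]. */
static inline unsigned esr_ec(uint64_t esr)
{
    return (unsigned)((esr >> 26) & 0x3f);
}

/* Data fault status code: ESR bits [5:0], meaningful for data aborts. */
static inline unsigned esr_dfsc(uint64_t esr)
{
    return (unsigned)(esr & 0x3f);
}

/* WnR bit: ESR bit 6, 1 when the faulting access was a write. */
static inline int esr_is_write(uint64_t esr)
{
    return (int)((esr >> 6) & 1);
}

/* Short description of the most common data abort causes. */
static inline const char *esr_dfsc_name(uint64_t esr)
{
    unsigned dfsc = esr_dfsc(esr);

    if (dfsc == 0x21)
        return "alignment fault";
    switch (dfsc & 0x3c) {
    case 0x04: return "translation fault";
    case 0x08: return "access flag fault";
    case 0x0c: return "permission fault";
    default:   return "other";
    }
}

/* Hypothetical register readers; only valid in privileged code at EL1.
 * KYU_EL1_EXAMPLE is a made-up guard so this never builds by accident. */
#if defined(__aarch64__) && defined(KYU_EL1_EXAMPLE)
static inline uint64_t read_far_el1(void)
{
    uint64_t v;
    __asm__ volatile("mrs %0, far_el1" : "=r"(v));
    return v;
}

static inline uint64_t read_esr_el1(void)
{
    uint64_t v;
    __asm__ volatile("mrs %0, esr_el1" : "=r"(v));
    return v;
}
#endif
```

For example, an ESR value of 0x96000021 decodes as a data abort from the same exception level caused by an alignment fault on a read. Printing the decoded ESR and the FAR alongside SP, LR and ELR would show directly whether memcpy() was handed a wild or a misaligned address.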

I comment out the call to emac_activate(), which means that the emac receiver and transmitter are not enabled and no interrupts should occur -- now Kyu starts up fine. I see the following threads running, which all looks as it should:

  Thread:       name (  &tp   )    state     sem       pc       sp     pri
* Thread:    blinker (400ae438)  REPEAT I          00000008 40584000   10
  Thread:      shell (400af4e8)   READY I          00000008 40578000   11
  Thread:     net-in (400aef58)     SEM J  net-inq 00000008 4057c000   20
  Thread:    net-out (400aec90)     SEM J net-outq 00000008 4057e000   21
  Thread:  net-timer (400ae9c8)  REPEAT I          00000008 40580000   22
  Thread:       idle (400af220)   READY C          4000df68 4057a000 1234

Cache routines

Checking these was on my list of things to do, since they are important to the emac driver on aarch64. I see two calls:
flush_dcache_range ( addr_t start, addr_t stop )
invalidate_dcache_range ( addr_t start, addr_t stop )

These are routines in Kyu/src/armv8/cache_v8.c and they call assembler code in Kyu/src/armv8/startup.S. Notes there indicate that the code came from U-Boot in arch/arm/cpu/armv8/cache.S -- so this looks proper. These are not stubs waiting for work, as I had feared.
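Range-based cache maintenance works on whole cache lines, so a driver usually widens a buffer to line boundaries before calling routines like the two above. Here is a minimal sketch of that arithmetic. It assumes a 64 byte data cache line (the Cortex-A53 value), that addr_t is a pointer-sized unsigned integer, and that "stop" is an exclusive end address. The helper name and struct are hypothetical, not taken from Kyu.

```c
#include <stdint.h>

typedef unsigned long addr_t;      /* assumption: pointer-sized, as in Kyu */

#define CACHE_LINE 64UL            /* assumption: Cortex-A53 D-cache line size */

/* A start/stop pair widened to cache line boundaries. */
struct cache_range {
    addr_t start;                  /* rounded down to a line boundary */
    addr_t stop;                   /* rounded up, exclusive */
};

/* Widen [buf, buf+len) so that it covers only whole cache lines. */
static inline struct cache_range cache_align_range(addr_t buf, unsigned long len)
{
    struct cache_range r;

    r.start = buf & ~(CACHE_LINE - 1);
    r.stop  = (buf + len + CACHE_LINE - 1) & ~(CACHE_LINE - 1);
    return r;
}

/* Intended use in a driver (not compiled here):
 *
 *   struct cache_range r = cache_align_range((addr_t) buf, len);
 *   flush_dcache_range(r.start, r.stop);        before the device reads buf
 *   invalidate_dcache_range(r.start, r.stop);   before the CPU reads DMA data
 */
```

Widening an invalidate can throw away unrelated data that happens to share the first or last line, which is one reason DMA buffers and descriptors are normally allocated cache line aligned and padded to a multiple of the line size.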

More PHY setup

There is more to dealing with this chip than autonegotiation. After autonegotiation finishes, you know what the options are, but you still need to set bits to select speed and duplex. My code had defaulted to 100 Mbit, and needed to be improved so that it could select either 100 or 1000 (or 10 in that unlikely case). I had hopes this would solve my problems, but it did not.
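Picking the speed and duplex after autonegotiation can be sketched as follows. This uses only the standard IEEE 802.3 MII registers (4 and 5 for 10/100 abilities, 9 and 10 for 1000BASE-T) and chooses the best mode that both ends support. The function and struct names are hypothetical, and this is not the code in the Kyu PHY driver, which may well read a vendor-specific status register instead.

```c
#include <stdint.h>

/* Ability bits shared by MII register 4 (our advertisement)
 * and register 5 (link partner ability). */
#define ADV_10_HALF       0x0020
#define ADV_10_FULL       0x0040
#define ADV_100_HALF      0x0080
#define ADV_100_FULL      0x0100

/* MII register 9: 1000BASE-T control (what we advertise). */
#define CTL1000_HALF      0x0100
#define CTL1000_FULL      0x0200

/* MII register 10: 1000BASE-T status (what the partner can do).
 * These sit two bits above the matching control bits. */
#define STAT1000_LP_HALF  0x0400
#define STAT1000_LP_FULL  0x0800

struct link_mode {
    int speed;          /* 10, 100 or 1000; 0 if nothing in common */
    int full_duplex;    /* 1 for full duplex, 0 for half */
};

/* Resolve the highest common mode from the four register values. */
static inline struct link_mode resolve_link(uint16_t adv, uint16_t lpa,
                                            uint16_t ctl1000, uint16_t stat1000)
{
    struct link_mode m = { 0, 0 };
    uint16_t common    = adv & lpa;
    uint16_t common_gb = ctl1000 & (stat1000 >> 2);

    if (common_gb & CTL1000_FULL)      { m.speed = 1000; m.full_duplex = 1; }
    else if (common_gb & CTL1000_HALF) { m.speed = 1000; m.full_duplex = 0; }
    else if (common & ADV_100_FULL)    { m.speed = 100;  m.full_duplex = 1; }
    else if (common & ADV_100_HALF)    { m.speed = 100;  m.full_duplex = 0; }
    else if (common & ADV_10_FULL)     { m.speed = 10;   m.full_duplex = 1; }
    else if (common & ADV_10_HALF)     { m.speed = 10;   m.full_duplex = 0; }

    return m;
}
```

The result would then be used to set the matching speed and duplex bits, so that both sides of the link agree on the mode.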

More debugging

Now I get this:

Synchronous Abort
cur_thread: 400ae438 (blinker)

SP:      40585eb0
LR:      40004678
ELR:     4002a2f4

The ELR value again puts this in memcpy(), but reporting it as coming from the "blinker" thread is very suspicious. The blinker thread is about 10 lines of code, run from a repeat timer, that does nothing but make gpio calls -- there is no way it should call memcpy(). This sort of confusion starts to look like general tromping on memory, perhaps stack corruption. It only happens when the emac is active.

My guess is that this involves pointers and the move from 32 to 64 bit. See more on this on the next page.
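To illustrate the kind of problem this guess points at, here is a sketch of how a DMA descriptor can silently change shape when code moves from 32 to 64 bit. It assumes a descriptor that the hardware reads as four 32 bit words. That layout and all of the names below are assumptions for illustration only, not taken from the Kyu driver.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical descriptor as a 32 bit driver might declare it.
 * With 4 byte pointers this is 16 bytes; with 8 byte pointers it grows,
 * and the hardware would no longer find the fields where it expects them. */
struct desc_with_pointers {
    uint32_t status;
    uint32_t size;
    char *buf;
    struct desc_with_pointers *next;
};

/* The same descriptor with fixed-width fields.
 * It is 16 bytes regardless of the pointer size. */
struct desc_fixed {
    uint32_t status;
    uint32_t size;
    uint32_t buf;       /* 32 bit bus address of the buffer */
    uint32_t next;      /* 32 bit bus address of the next descriptor */
};

_Static_assert(sizeof(struct desc_fixed) == 16,
               "descriptor must be four 32 bit words");

/* Convert an address to the 32 bit form a descriptor field can hold.
 * Returns 0 on success, -1 if the address does not fit in 32 bits. */
static inline int dma_addr32(uint64_t addr, uint32_t *out)
{
    if (addr > UINT32_MAX)
        return -1;
    *out = (uint32_t) addr;
    return 0;
}
```

A caller would pass (uintptr_t) ptr for the address. If a driver written for 32 bit keeps pointers in descriptor fields like these, the device would be handed a descriptor with a different size and field offsets on a 64 bit build, and DMA writes landing in the wrong place would look a lot like random memory corruption.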


Have any comments? Questions? Drop me a line!

Tom's electronics pages / tom@mmto.org