August 8, 2025

GCC inline assembly - using it with the m68k

Nine years after I first put these notes together, here I am again, although this time with the m68k. I am doing some work with sun3 machines that date back to 1988 or so -- nearly 40 years ago. In particular, I am working on building bootrom images from source. Anything that involves more than a week of work is a project.

I am using objdump to see results and trying various things. Here is our first example:

	int rv;
	asm volatile ( "movel #5, %0" : "=d" (rv) );
	return rv;
In the above, "rv" is an int variable in the C code. Note in particular the "=d" designation, which is specific to the m68k: it indicates that we want a D (data) register to be used.
This yields:
   a:   7005            moveq #5,%d0
   c:   2d40 fffc       movel %d0,%fp@(-4)
  10:   202e fffc       movel %fp@(-4),%d0
The compiler chooses the "d0" register for the instruction. Note the unfortunate and needless "bounce" through the stack location that the compiler assigned to the "rv" variable.

The constraint "=d" is our way of indicating we want the value to end up in a D register. Using "=r" or (surprisingly) even "=a" yields the same result.
Take a look at the list of m68k constraint codes in the GCC documentation (under "Constraints for Particular Machines").

Use the optimizer

It works great! When I add "-O" to the gcc line, the generated code is dramatically better. Here is the C code:
u_long
getfc3 ( u_long size, char *addr )
{
        u_long rv;

        asm volatile ( "moveq #3, %d0" );
        asm volatile ( "movec %d0, %dfc" );
        asm volatile ( "movel #5, %0" : "=r" (rv) );
        return rv;
}
And here is what gets generated. It couldn't be any better.
00000000 <getfc3>:
   0:   7003            moveq #3,%d0
   2:   4e7b 0001       movec %d0,%dfc
   6:   7005            moveq #5,%d0
   8:   4e75            rts
No space is allocated on the stack, and the value goes into the d0 register, as it should for a value returned from a function.

Read from a memory address

This is a simple matter of using the "m" constraint.
u_long
getfc3 ( u_long size, char *addr )
{
        u_long rv;

        asm volatile ( "moveq #3, %d0" );
        asm volatile ( "movec %d0, %dfc" );
        asm volatile ( "movesb %1, %0" : "=r" (rv) : "m" (addr) );
        return rv;
}
With the "-O" option still being used, we get this, which again is as compact as it could be.
00000000 <getfc3>:
   0:   7003            moveq #3,%d0
   2:   4e7b 0001       movec %d0,%dfc
   6:   0e2f 0000 0008  movesb %sp@(8),%d0
   c:   4e75            rts
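One detail in the listing above deserves a second look: the operand %sp@(8) is the stack slot that holds "addr" itself, because "m" (addr) names the pointer variable as the memory operand. If the intent is to read the byte that "addr" points to, the usual GCC idiom is "m" (*addr). Here is a minimal sketch of that idiom. The name read_byte_at and the fallback branch for non-m68k hosts are made up for illustration, and it uses plain "moveb" rather than the privileged "movesb" so that it could run in user mode.

```c
/* Hypothetical helper: fetch the byte that addr points to. */
unsigned char read_byte_at(const unsigned char *addr)
{
    unsigned char rv;

#if defined(__m68k__)
    /* "m" (*addr) makes the pointed-to byte the memory operand, so the
     * compiler loads addr into an address register first and the
     * template sees something like %a0@ instead of %sp@(8).
     * "=d" is used because a byte move cannot target an A register. */
    asm volatile ("moveb %1, %0" : "=d" (rv) : "m" (*addr));
#else
    /* Stand-in for other hosts so the example still builds and runs. */
    rv = *addr;
#endif
    return rv;
}
```

Swapping "moveb" back to "movesb" should give the alternate-address-space version. It is also worth checking one thing against the Motorola manual, which describes "moves" reads as going through SFC and "moves" writes as going through DFC.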
I'll note that the gcc documentation says the optimizer is free to move individual "asm()" statements around, even volatile ones, and that a sequence of them is not guaranteed to stay consecutive. If you need a specific order, put the statements in question into a single "asm()".
Something like this works:
       asm volatile ( "moveq #3, %d0; \
                        movec %d0, %dfc" );
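
Another way to keep the two instructions from being separated, as a sketch: hand the constant to the compiler as an input operand, so that only the "movec" lives in the template. The compiler then picks and loads the register itself, and there is no hard-coded %d0 to worry about. The name set_dfc and the fake_dfc stand-in used on non-m68k hosts are invented for this example.

```c
/* Stand-in for the DFC register when not building for the m68k. */
unsigned long fake_dfc;

/* Hypothetical helper: load the destination function code register.
 * movec is privileged, so on real hardware this has to run in
 * supervisor mode (as bootrom code does). */
void set_dfc(unsigned long fc)
{
#if defined(__m68k__)
    /* "r" lets the compiler choose any D or A register for fc and emit
     * the load (for example moveq #3,%d0) ahead of the movec itself.
     * Because this is extended asm, the fixed register is spelled %%dfc. */
    asm volatile ("movec %0, %%dfc" : : "r" (fc));
#else
    /* Fallback so the example builds and can be exercised elsewhere. */
    fake_dfc = fc;
#endif
}
```

A call such as set_dfc(3) would then take the place of the moveq/movec pair.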

The final result

Here is the code I finally ended up with for sun3/machdep.c
u_long
getfc3 ( u_long size, char *addr )
{
        u_long rv;

        // asm volatile ( "moveq #3, %d0; \
        //                 movec %d0, %dfc" );

        // This works, but I shouldn't have to put the useless "r" constraint in.
        // but it complains if the field is empty
        asm volatile ( "moveq #3, %%d0; \
                        movec %%d0, %%dfc" : : "r" (size) : "%d0" );

        if ( size == sizeof(u_char) ) {
            asm volatile ( "movesb %1, %0" : "=r" (rv) : "m" (addr) );
            return rv;
        }

        if ( size == sizeof(u_short) ) {
            asm volatile ( "movesw %1, %0" : "=r" (rv) : "m" (addr) );
            return rv;
        }

        asm volatile ( "movesl %1, %0" : "=r" (rv) : "m" (addr) );
        return rv;
}

Several things are worth noting.

The numbers in %0 and %1 refer to the order in which the operands are listed, counting the output list first and then the input list.

For some reason, the compiler would not let me leave the input operand list empty, so I stuck a useless "r" declaration there to make it happy.

Declaring %d0 in the clobber list was essential. Without it, the compiler kept a live value in %d0 across the asm, and that value got clobbered.

Note that there definitely are two "flavors" of inline assembly. We have "basic", without any colons or input/output lists, and we have "extended", which adds those lists. Here, note in particular that we have to use a double "%%" in front of registers in the extended format, whereas that was not required in the basic format. In fact, in basic mode a double "%%d0" causes an error.

When I look at the generated code, it is correct, but even with "-O" it does some unnecessary branching. I could do better, but would need to resort to whole hog assembly programming.


Feedback? Questions? Drop me a line!

Tom's Computer Info / tom@mmto.org