October 29, 2019

Lessons from "podkalicki" ATtiny13 blink demo

Before I dive into this, I should point out that the code written by Mr. Podkalicki is actually ideal. He makes use of facilities in the avr-libc library that any sensible programmer would be wise to use. Doing so makes his code simple, clear, and understandable to the majority of AVR programmers. You might say it embraces a sort of AVR C programming culture. I am digging deeper simply because it pleases me to do so. As it turns out, this yields a rich payoff of insight into AVR devices in general, as well as the facilities provided by avr-libc.

This simple example includes a couple of files:

#include <avr/io.h>
#include <util/delay.h>
These are part of the avr-libc package and on my system get installed in /usr/avr/include. I don't see a "-I" switch in the Makefile giving the path to them, so the search path must be built into the avr-gcc package.

Studying these two header files (and the files that they in turn include) is a valuable source of information, and can avoid a fair bit of "reinventing the wheel".

The Makefile is also a source of interesting things:

MCU=attiny13
FUSE_L=0x6A
FUSE_H=0xFF

AVRDUDE=avrdude
TARGET=main

flash:
        ${AVRDUDE} -p ${MCU} -c usbasp -B10 -B10 -U flash:w:${TARGET}.hex:i -F -P usb

fuse:
        $(AVRDUDE) -p ${MCU} -c usbasp -B10 -U hfuse:w:${FUSE_H}:m -U lfuse:w:${FUSE_L}:m
In particular, note the fuse settings. I am quite pleased that this fellow uses straightforward Makefiles, and includes in them (as I do!) the avrdude commands to program the flash and fuses. Some of the avrdude switches are worth noting:
-B10 -- sets the bitclock period to 10 microseconds
-U -- prefixes a memory operation
-F  -- says to force the operation regardless of the device signature
-P usb -- specifies the "port" (not needed on my system)
The bitclock setting is important, as the default would probably be too fast for the attiny13a.
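
As a rough check on that bitclock value, here is a minimal host-side sketch. It rests on two assumptions that are not stated above: that the chip runs at 1.2 MHz (the 9.6 MHz internal oscillator divided by 8), and the usual rule of thumb that the programming clock must stay below one quarter of the CPU clock. The function names are made up for this illustration.

```c
#include <stdbool.h>

/* Nominal ISP clock in Hz for an avrdude -B period given in microseconds. */
double isp_clock_hz(double bitclock_us)
{
    return 1000000.0 / bitclock_us;
}

/* Rule of thumb (assumption): the ISP clock must be below F_CPU / 4. */
bool isp_clock_ok(double f_cpu_hz, double bitclock_us)
{
    return isp_clock_hz(bitclock_us) < f_cpu_hz / 4.0;
}
```

Under those assumptions -B10 gives a nominal 100 kHz, comfortably under the 300 kHz limit for a 1.2 MHz part, while a 2 microsecond period (500 kHz) would not pass.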

The business of the fuse values is discussed elsewhere.

Try it!

No hassles. Type make; make fuse; make flash -- then install the programmed part on a breadboard with a resistor and LED connected to pin 5, and away it goes.

This uses 72 bytes. It also depends on avr-libc -- which is fine, but I am always curious about what goes on "bare metal", so I use this project as a starting point for a series of modifications.

First let's try an asymmetric blink period: off for 1.9 seconds, then a 0.1 second flash. This changes the code to the following (I also changed the wiring so that the ATtiny sinks current to turn on the LED, so driving the pin low lights it).

        while (1) {
                PORTB = _BV( LED_PIN );
                _delay_ms(1900);
                PORTB = 0;
                _delay_ms(100);
        }
Now, can we wire the LED to pin 1 instead of pin 5?
This requires some real understanding of the following statements.
#define LED_PIN PB0
DDRB = 0b00000001; // set LED pin as OUTPUT
PORTB = 0b00000000; // set all pins to LOW
First, DDRB has got to be the "data direction register", but why PORTB? Is there a port A? Apparently not, at least not in this chip. The low 6 bits of these registers control port bits 0-5. But why is the low bit (PB0) mapped to pin 5? The mapping is as follows:
PB0 - pin 5
PB1 - pin 6
PB2 - pin 7
PB3 - pin 2
PB4 - pin 3
PB5 - pin 1
So, the code needs to become something like this:
#define LED_PIN PB5
DDRB = _BV ( LED_PIN );
More than this is required, though. Pin 1 is the external reset pin by default. Some alternate function gyrations are required to use it as a GPIO pin. Actually, it is deeper than that. The only way to do this is to program the fuse (RSTDISBL) that says to use pin 1 as an IO pin rather than reset. Once you do this, you will no longer be able to reprogram the device with an ordinary programmer like the usbasp! You will have to use the dreaded "high voltage programming" (which I do not have a setup to perform) to restore this fuse.

We will avoid this for now. Using PB3 with this code works fine to blink using pin 2.
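
The port-bit to pin mapping listed above is easy to get backwards, so here is a minimal host-side sketch that captures it as a lookup table. The array and function names are hypothetical, and the pin numbers are simply the ones from the list.

```c
/* Physical pin for each PORTB bit, copied from the list above:
 * index = port bit (PB0..PB5), value = package pin number. */
static const unsigned char portb_pin[6] = { 5, 6, 7, 2, 3, 1 };

/* Returns the physical pin for port bit 0..5, or 0 if the bit is out of range. */
unsigned char pin_for_portb_bit(unsigned char bit)
{
    return (bit < 6) ? portb_pin[bit] : 0;
}
```

For example, pin_for_portb_bit(3) gives 2, matching the PB3 experiment, and pin_for_portb_bit(5) gives 1, the reset pin that is best left alone.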

Macros in the avr-libc include files

My goal now is to eliminate the use of any include files and to make my blink code "self contained".

As you might guess _BV stands for "bit value" and must be defined somewhere to be something like 1<<x -- I sidestep the use of the macro and just define a "mask" in hex (as is often done in code like this).
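
To make the "bit value" idea concrete, here is a small sketch with a stand-in macro. MY_BV and the two mask names are invented for this illustration; the shift is the "something like 1<<x" guessed at above, not a copy of the avr-libc header.

```c
/* Hypothetical stand-in for _BV(): the value with only bit n set. */
#define MY_BV(n) (1 << (n))

/* Hex masks of the kind used in place of the macro. */
#define LED_PB0_MASK 0x01 /* same value as MY_BV(0) */
#define LED_PB3_MASK 0x08 /* same value as MY_BV(3) */

/* Compile-time checks (C11) that the hand-written masks agree with the macro. */
_Static_assert(MY_BV(0) == LED_PB0_MASK, "PB0 mask should be bit 0");
_Static_assert(MY_BV(3) == LED_PB3_MASK, "PB3 mask should be bit 3");
```

The hex form is a little more opaque, but it removes the dependency on the header, which is the point of the exercise.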

The setup for DDRB and PORTB is surprisingly complex. The macros provided as part of avr-libc are as follows:

#define DDRB _SFR_IO8(0x17)
#define PORTB _SFR_IO8(0x18)

#define _MMIO_BYTE(mem_addr) (*(volatile uint8_t *)(mem_addr))
#define _SFR_IO8(io_addr) _MMIO_BYTE((io_addr) + __SFR_OFFSET)
Note that the value of __SFR_OFFSET depends on __AVR_ARCH__, which seems to be one of those things defined by the compiler. For this chip it must be 0x20, due to a scheme described below.
The following is a working blink example without any include statements.
#define LED_PIN_MASK    0x01 /* PB0 - pin 5 */

typedef unsigned int uint8_t __attribute__((__mode__(__QI__)));

#define DDRB  (* ((volatile uint8_t *) (0x20 + 0x17)))
#define PORTB (* ((volatile uint8_t *) (0x20 + 0x18)))

#define CYCLES_PER_MS   (F_CPU / 1000)

extern void __builtin_avr_delay_cycles ( unsigned long );

static void
my_delay( unsigned int ms )
{
        __builtin_avr_delay_cycles ( CYCLES_PER_MS * ms );
}

int
main(void)
{
        DDRB = LED_PIN_MASK; // set LED pin as OUTPUT

        for ( ;; ) {
                PORTB = LED_PIN_MASK;
                my_delay(1900);
                PORTB = 0;
                my_delay(100);
        }
}
This code works! I tried writing the definitions for DDRB and PORTB without the 0x20 offset and I get the wrong instruction. These are SFR (special function registers) and are normally written using the AVR "out" instruction. Without the 0x20 offset the assignments simply compile to memory writes using the AVR "sts" instruction, and to the wrong place. The explanation is that the AVR maps its I/O registers into the data address space: addresses 0x00-0x1F are the 32 general purpose registers, and the 64 I/O registers follow at 0x20-0x5F, so I/O register 0x17 is also data address 0x37. When avr-gcc sees a write to a constant address in that range, it subtracts the offset and generates an "out" instruction instead of a memory write. Not what I expected, and I will need to learn more about AVR architecture and address spaces to comment further. The relevant avr-gcc built-in macro is "__AVR_SFR_OFFSET__", which is described as follows:
Instructions that can address I/O special function registers directly like IN, OUT, SBI, etc. may use a different address
as if addressed by an instruction to access RAM like LD or STS.
This offset depends on the device architecture and has to be subtracted from the RAM address in order to get the respective I/O address.
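
The quoted description boils down to a fixed offset between two numbering schemes for the same registers. Here is a minimal host-side sketch of that arithmetic, assuming the 0x20 offset used in the definitions above and the 64-register reach of the in and out instructions. The function names are invented for this illustration.

```c
#include <stdbool.h>

#define SFR_OFFSET 0x20 /* the offset used in the DDRB/PORTB definitions above */

/* Data-space address (as used by LD or STS) of an I/O register,
 * given the number that IN or OUT would use. */
unsigned int sfr_data_addr(unsigned int io_addr)
{
    return io_addr + SFR_OFFSET;
}

/* True if a data-space address falls in the 64-register window
 * that IN and OUT can reach once the offset is subtracted. */
bool reachable_by_out(unsigned int data_addr)
{
    return data_addr >= SFR_OFFSET && data_addr < SFR_OFFSET + 64;
}
```

With these, DDRB (0x17) lands at data address 0x37, which is inside the window, while a bare 0x17 is not. That is consistent with the experiment above where dropping the offset produced "sts" rather than "out".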

What about this strange __QI__ attribute? This is one of a host of such attributes, described as follows:

QI: An integer that is as wide as the smallest addressable unit, usually 8 bits.
HI: An integer, twice as wide as a QI mode integer, usually 16 bits.
SI: An integer, four times as wide as a QI mode integer, usually 32 bits.
DI: An integer, eight times as wide as a QI mode integer, usually 64 bits.
SF: A floating point value, as wide as a SI mode integer, usually 32 bits.
DF: A floating point value, as wide as a DI mode integer, usually 64 bits.
No telling why the people who wrote the avr-libc headers did things this way. The attribute forces the "unsigned int" declaration to be 8 bits, which is weird. It works just fine to replace this with "unsigned char" and ditch the cryptic attribute, as follows:
typedef unsigned char uint8_t;

#define DDRB  (* ((volatile uint8_t *) (0x20 + 0x17)))
#define PORTB (* ((volatile uint8_t *) (0x20 + 0x18)))
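
A quick way to convince yourself that the two declarations are equivalent is to compile both and compare them. This is a minimal sketch that should build with any gcc (the mode attribute is a GNU extension); the type and function names are hypothetical.

```c
/* The avr-libc style declaration: "unsigned int" forced into QI (one byte) mode. */
typedef unsigned int u8_mode __attribute__((__mode__(__QI__)));

/* The plain alternative. */
typedef unsigned char u8_plain;

/* Compile-time checks (C11) that both types occupy exactly one byte. */
_Static_assert(sizeof(u8_mode) == 1, "QI mode should be one byte");
_Static_assert(sizeof(u8_plain) == 1, "unsigned char is one byte");

/* Both should wrap from 255 to 0, as an 8-bit unsigned type does. */
unsigned int wrap_mode(void)  { u8_mode  x = 255; x++; return x; }
unsigned int wrap_plain(void) { u8_plain x = 255; x++; return x; }
```

If either static assertion failed, the build would stop, so a clean compile is already most of the answer.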

Try some inline assembly

I find this whole business of using the 0x20 offset to trick the compiler into generating an "out" instruction somewhat distasteful, though it does allow writing C statements as the usual assignment statements. What is really needed is some kind of "__IO__" attribute that tells the compiler to do this in a more direct way -- but I doubt if any compiler writers are reading this.

Be that as it may, I decided to try the experiment of defining a couple of functions that use inline assembly to generate the out (and in) instructions to access the SFR (special function registers) to do IO. I ended up with the following. It does work, but is 82 bytes rather than 78 for some reason. My guess is that the write of 0 does not get optimized as well as it could be. avr-gcc keeps a zero in a dedicated register (__zero_reg__, which is r1) that could be used in such a case.
Here is my final blink.c using inline assembly:

#define LED_PIN_MASK    0x01 /* PB0 - pin 5 */

/* This inline assembly gets rejected if -std=c99 is used (plain "asm" is a GNU extension; "__asm__" still works) */
/* This code never uses the "read_sfr" function, but here it is anyway. */
static inline unsigned char
read_sfr ( unsigned char reg )
{
    unsigned char rv;

    asm volatile ("in %0, %1": "=r" (rv): "I" (reg) );

    return rv;
}

static inline void
write_sfr ( unsigned char reg, unsigned char val )
{
    // Write SFR
    // out 0x17,r24
    asm volatile ("out %0, %1":: "I" (reg), "r" (val) );
}

#define SFR_DDRB                0x17
#define SFR_PORTB               0x18

extern void __builtin_avr_delay_cycles ( unsigned long );

#define CYCLES_PER_MS   (F_CPU / 1000)

static void
my_delay( unsigned int ms )
{
        __builtin_avr_delay_cycles ( CYCLES_PER_MS * ms );
}

int
main(void)
{
        // set LED pin as OUTPUT
        // DDRB = LED_PIN_MASK;
        write_sfr ( SFR_DDRB, LED_PIN_MASK );

        for ( ;; ) {
                // PORTB = LED_PIN_MASK;
                write_sfr ( SFR_PORTB, LED_PIN_MASK );
                my_delay(1900);
                // PORTB = 0;
                write_sfr ( SFR_PORTB, 0 );
                my_delay(100);
        }
}
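Note the use of the "I" constraint to indicate that the SFR number must be an "immediate" (constant) value in the generated assembly. On AVR this constraint accepts constants from 0 to 63, which matches the range of the in and out instructions, and it only works here because the functions are inlined and called with constant register numbers. The whole business of writing gcc inline assembly is its own special topic. There are two different syntaxes (and I am using the old one). Also, every architecture has a flavor of its own, with particular details for ARM, x86, and AVR.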

What about this __builtin_avr_delay_cycles function?

This seems to be part of avr-gcc. Using the "-S" switch to gcc to get assembler output, it is possible to see that this turns into code that looks something like the following. If I squint at this, it looks like it is using 3 registers to set up a 24 bit value and then decrements it to zero, which works for me. The curious "rjmp ." and "nop" at the end are beyond my knowledge of AVR at this point, but worth noting.
        ldi r18,lo8(455999)
        ldi r19,hi8(455999)
        ldi r20,hlo8(455999)
1:      subi r18,1
        sbci r19,0
        sbci r20,0
        brne 1b
        rjmp .
        nop
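
Counting cycles in that listing is a useful exercise. The sketch below is a host-side calculation, assuming the usual AVR timings: one cycle each for ldi, subi, sbci and nop, two for rjmp, and two for brne when the branch is taken but one when it falls through. The function name is made up for this illustration.

```c
/* Total cycles used by the delay sequence for a given 24-bit loop count. */
unsigned long delay_loop_cycles(unsigned long count)
{
    unsigned long setup = 3;               /* three ldi instructions */
    unsigned long taken = (count - 1) * 5; /* subi + sbci + sbci + brne (taken) */
    unsigned long last  = 4;               /* final pass: brne falls through */
    unsigned long pad   = 2 + 1;           /* rjmp . and nop */
    return setup + taken + last + pad;
}
```

For the count in the listing, delay_loop_cycles(455999) comes to 2280000, which is exactly 1900 ms if F_CPU is 1200000. If these timings are right, the "rjmp ." (a two-cycle jump to the very next instruction) and the "nop" are just padding for the few cycles that a five-cycle loop cannot hit on its own.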
This is about as deep as I want to dig into this simple blink example. It certainly taught me a lot, and generated a new set of questions about the AVR architecture.


Have any comments? Questions? Drop me a line!

Tom's Electronics pages / tom@mmto.org