How can I get this SAMD21E18 startup code a little sturdier?

This article is part of a series.

Related Repo: https://github.com/carlynorama/StrippedDownChipRosetta

At the end of the last post I mentioned that I wasn’t going to comment on the code or the linker script, but that I didn’t recommend editing them yet. The code from the last post worked because it had workable hooks where they needed to be for what it was trying to do. Many minimal examples work because they do nothing. Adding more code in the wrong place can break things with no error messages to help.

When programming for a machine with an operating system, folks can generally start with the app’s entry point, “main”, and go from there. On a microcontroller the software has to be both application and “operating system”. Thankfully IDEs and frameworks hide a lot of that so folks can get right to work on their idea and only dig deeper if they have a special concern.

When one does need to go under the hood, the process of setting up a new chip has similarities to learning how to implement a new Protocol. There’s a contract to deliver on and you have to learn what it is.

The first step involves providing a start up script that at least:

This post is less tutorial and more record of what I did, what resources I needed, and what resources I think others might find helpful to fill in the gaps.

References

Tutorials and repos here. Topic specific links in their sections.

Official

Writing start up scripts and linker files correctly requires intimate knowledge of the specific microcontroller’s physical layout. Smart people focused on concrete goals use manufacturer-provided versions of these files because the manufacturer in theory knows their chip (and any errata) best. Finding all the information needed can take a little digging. These are all the sources I looked at. I will state which one contributed to what as I go.

More Instructional

How to search github

When looking for start up scripts to verify my work wasn’t totally off base, github searches of:

Turned up some of the most useful results.

To go a little more scatter shot, the following phrases (plus your Microchip part number) are an indicator that a file started from the official source. I’m sure you can come up with more “Statistically Improbable Phrases” from the demo projects, etc. above.

- “\brief Header file for”
- “Microchip Technology Inc. and its subsidiaries”
- “Copyright (wild card) Atmel Corporation”
- “\brief This module contains”

How To Compare Assembly Start Up Files

This will not be a course in assembly, although I’ve tried to provide alternate examples and some explanations.

I mentioned searching github, but assembly start up scripts can look wildly different even for the same chip:

It can be reaaaaly hard to tell what’s a meaningful difference between two files and what is not. Hopefully the links below will help when my comments aren’t good enough.

A - A mildly better version

Finding what I needed to make a vector table

The “vector table” lists all the addresses of the critical low level functions the chip will need to know what to do when “something happens”. They must live at exactly the location defined by the architecture and the manufacturer. They are incredibly device dependent, although the nice thing about Arm chips is that they share some values across cores and architectures.

Some of the types of things that get answered by the vector table are:

Misc Overviews

How does that all work on the SAMD21 specifically

A vector table for a SAMD21 has to deliver on the Arm® Cortex®-M0+ Processor Core MCU requirements as well as support its own peripherals. The above link is by far the best information I found for this chip.

Images below from that page:

Table from above page

picture from that page

But this just tells you about the Cortex-M0+ values. The chip-specific information can be tracked down in the Datasheet. Unfortunately I couldn’t find a comprehensive list in one easy place like there is for the Atmega328 (not Arm, different beast, table 11-1 in the 328 datasheet) or the STM32F0 (Arm Cortex-M0, not M0+, Section 11, Table 36 in the reference manual). If it wasn’t for this one page I’m not sure how one would have pulled together the full list except by using sample code.

Product Mapping

Datasheet section 9, Product Mapping: Figure 9-1, SAM D21 Product Mapping.

Chip Specific Hardware List

The 0 next to PM - Power Manager lines up with exception number 16 (IRQ 0) in the very first image, and with the PM handler’s slot at address 0x00000040 (16 entries × 4 bytes each).
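
The arithmetic behind that lineup can be sketched in a few lines of C. This is only an illustration, assuming the 16 core slots and 4-byte entries described above; the function and constant names are made up for this sketch and do not come from any vendor header.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum {
    CORE_EXCEPTION_SLOTS = 16, /* initial stack pointer + 15 core exception slots */
    BYTES_PER_VECTOR     = 4   /* each table entry is one 32-bit word */
};

/* Exception number of a peripheral interrupt line: IRQ 0 is exception 16. */
static inline uint32_t exception_number(uint32_t irq)
{
    return CORE_EXCEPTION_SLOTS + irq;
}

/* Byte offset of that IRQ's entry from the start of the vector table. */
static inline uint32_t vector_offset(uint32_t irq)
{
    return exception_number(irq) * BYTES_PER_VECTOR;
}
```

With the table at the start of flash (address 0), `vector_offset(0)` gives 0x40 for PM, which is the address quoted above.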

Datasheet 11.2 Nested Vector Interrupt Controller, 11.2.2 Interrupt Line Mapping

!!!WARNING!!! FAMILY Datasheet. Cross reference the below with section 2 Configuration Summary (not shown) for which ones your chip specifically implements.

table - TODO Switch to HTML table

Updating Hello Code With Minimal Vector Table & Handlers

Between your start up function and the linker file the idea is to defend the expected amount of space at the expected location and fill it with either useful or guaranteed harmless information.

My first example will only specify Cortex-M0+ needed values. Since embedded chips are small it’s not uncommon to not leave space for the features of the chip you know your code (AND CIRCUIT!) won’t use.

.syntax unified
.cpu cortex-m0
.fpu softvfp
.thumb 

@ vector_table will be at the top of text, because it is at the top of text
@ and we have no other files. 
.text
vector_table:
    .word   sp_initial_value //edge of stack, set in linker
    .word   reset_handler   //reset handler
    .word   nmi_handler
    .word   hard_fault_handler
    .word   0//not in M0+
    .word   0//not in M0+
    .word   0//not in M0+
    .word   0//not in M0+
    .word   0//not in M0+
    .word   0//not in M0+
    .word   0//not in M0+
    .word   svc_handler
    .word   0//not in M0+
    .word   0//not in M0+
    .word   pendsv_handler
    .word   systick_handler

  .weakref nmi_handler, default_handler
  .weakref hard_fault_handler, default_handler
  .weakref svc_handler, default_handler
  .weakref pendsv_handler, default_handler
  .weakref systick_handler, default_handler


.word default_handler
.thumb_func
default_handler:
   ldr r3, =#0x8BADF00D
   @ hangs the program.
   b .


.global reset_handler
.thumb_func
reset_handler:
  LDR  r0, =sp_initial_value
  MOV  sp, r0

  LDR  r7, =0xF0CACC1A
  MOVS r0, #0
  main_loop:
    @ Add 1 to register 'r0'.
    ADDS r0, r0, #1
    @ Loop back.
    B    main_loop

I’m using a different syntax than the vivonomicon version. This version of the code WILL NOT WORK for non-thumb mode arm processors. I’ll revert to that more core-flexible syntax, using the gcc assembler directive .type, in code where I care about portability.

Each interrupt gets a weakref (search on this gnu docs page) to a default handler function. This lets a future user of this code override their values in another file. I could have just put default_handler everywhere I put a unique function name in the vector table. You might see this weak ref creation in other files as:

A two step process

    .weak       SysTick_Handler
    .thumb_set  SysTick_Handler, Default_Handler

A different flavor of assembly (armasm, not the GNU assembler syntax that Arm now recommends) in some of the official docs

SysTick_Handler           PROC
                          EXPORT SysTick_Handler           [WEAK] 
                          B       .
                          ENDP

Handled by a macro as suggested in Arm provided code (set vs thumb_set depending on the instruction set)

.macro IRQ handler
    .weak  \handler
    .set   \handler, Default_Handler
.endm

IRQ SysTick_Handler

Or even good old

void SysTick_Handler(void) __attribute__ ((weak, alias ("Default_Handler")));

These all serve the same purpose.
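
To see the override idea in action without a board, here is a small host-side sketch of the C flavor, assuming GCC (or Clang) on an ELF target such as Linux. The `last_marker` variable is made up for this sketch, and the default handler records a marker and returns instead of hanging so it can be called safely on a desktop.

```c
#include <stdint.h>

/* Hypothetical marker variable so the sketch has something observable. */
static volatile uint32_t last_marker = 0;

/* Stand-in default handler: on the chip this would hang with `b .`. */
void Default_Handler(void)
{
    last_marker = 0x8BADF00Du;
}

/* Weak aliases: each name resolves to Default_Handler unless another
 * object file provides a strong definition with the same name. */
void SysTick_Handler(void) __attribute__((weak, alias("Default_Handler")));
void PendSV_Handler(void)  __attribute__((weak, alias("Default_Handler")));
```

Calling `SysTick_Handler()` here runs `Default_Handler`. Defining a normal `void SysTick_Handler(void)` in another file would replace the alias at link time, with no changes to this file.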

Another thing you might see in thumb mode vector tables is + 1 added to the function address, or some .align N where N is odd before a function declaration later in the code, or something somewhere that shifts the handler addresses so the least significant bit of the address will be 1. I put some links in the assembly link section above and in yesterday’s post that cover how the processor uses the lsb of a branch target to tell whether it is jumping to Thumb code. On a Thumb-only core like the Cortex-M0+ that bit has to be 1 in every vector table entry. It comes up a lot.

Thankfully .thumb_func should be taking care of that for me. The -mthumb in the gcc calls should do that too when using the .type directive (.type name, %function), but I haven’t confirmed that yet.
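
As a rough sketch of what that low bit means, the helpers below split a vector table entry into its Thumb flag and the actual code address. The function names are invented for this example, and the addresses used with them are arbitrary rather than taken from a real build.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Bit 0 of a branch target marks it as Thumb code. */
static inline bool is_thumb_target(uint32_t vector_entry)
{
    return (vector_entry & 1u) != 0u;
}

/* The instruction itself lives at the entry with bit 0 cleared. */
static inline uint32_t code_address(uint32_t vector_entry)
{
    return vector_entry & ~1u;
}

/* What the "+ 1" trick produces for a halfword-aligned function address. */
static inline uint32_t thumb_entry(uint32_t function_address)
{
    return function_address | 1u;
}
```

So a handler assembled at 0x000000C0 would show up in the table as 0x000000C1, which is handy to know when reading a disassembly.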

Mildly Improving the Linker

This program relies on the vector table being defined in the one source file immediately after the .text compiler directive is invoked. I’ll make a more resilient setup next. The compiler already has opinions about what to do with .text, so you could leave it out in other circumstances.

How does this differ from the previous file?

ENTRY(reset_handler);
MEMORY {
    flash(rx): ORIGIN = 0x00000000, LENGTH = 0x00040000 /*256K*/
    sram(rwx): ORIGIN = 0x20000000, LENGTH = 0x00008000 /*32K*/
}
sp_initial_value = ORIGIN(sram) + LENGTH(sram);  /* 0x20008000 */

SECTIONS {
    .text   : { *(.text)  } > flash
    .data : { *(.data) } > sram AT> flash
    .bss    : { *(.bss)    } > sram
}

More links on linker files later.

B - Separating the Startup from The Program Code

This code works (use the makefile), but it’s still fragile. What the chip needs and our wants are all mushed together and everything relies on putting things in the right place in the file. And we haven’t even done some of the important startup basics.

1 - Pull out the Main Loop, Still one File

First I changed hello_improved to hello_main, pulling out the loop in the same file via branching. I did not follow all the proper calling conventions.

hello_main also has new section markers, which were tested by adding them in the linker and running the shiny new disassemble command in the makefile.

This is all in the repo here

2 - Doing more for the start up

repo folder

When looking for the right assembly syntax for the reset handler I was influenced by the following scripts:

TODO: Benchmark them. I’m not doing anything fancy enough with this code to really make a difference.

I haven’t commented every line. Some assistance (also last post’s links…):

The new reset handler code

.section .text.reset_handler
@ The Reset Handler 
.global reset_handler
.thumb_func
reset_handler:

  /* reset handler started */
  /* r7 will hold my notes to self  */
  ldr  r7, =0x0000AAAA
 
  /* set stack pointer */
  ldr  r1, =_end_stack
  mov  sp, r1

  /* set  .bss section to 0 (SRAM) */
  /* uses the jump to exit style loop */
  zero_bss:
    /* get 0 handy in two registers */
    movs r0, #0
    movs r1, #0
    /* get the bounds */
    ldr  r2, = _start_bss
    ldr  r3, = _end_bss
    loop_start_bss_zero:
      /* compare first, in case the .bss section is empty */
      cmp r2, r3
      /* unsigned higher or same: exit once current (r2) reaches end (r3) */
      bhs loop_end_bss_zero
      /* store (2 words in the commented-out version, 1 in the active line) and increase the pointer after */
      @ stmia r2!, {r0,r1}
      stmia r2!, {r0} /* decided against moving bss to after data in linker  */
      b loop_start_bss_zero
    loop_end_bss_zero:

  /* copy .data section to SRAM */
  /* uses the jump to entry style loop */
  move_data_to_ram:
    /* get the bounds */
    ldr r0, = _start_data   
    ldr r1, = _end_data
    /* read head location */
    ldr r2, = _sidata
    b loop_enter_data_copy

    loop_action_data_copy:
      ldmia r2!, {r3}
      stmia r0!, {r3}
    loop_enter_data_copy:
      cmp r0, r1
      bcc loop_action_data_copy

  /* call the clock system initialization function.*/
  /* not done */

  /* call init of libraries (stdlib when loop in C) that need it.*/
  /* not done */

  /* note to self handler finished */
  LDR  r7, =0x0000BBBB

  /* Call the application's entry point.*/
  bl my_main             
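
For anyone who reads C more easily than assembly, the two loops above can be sketched roughly like this. It is a host-side illustration with made-up function names, working on ordinary arrays instead of the linker symbols, and not a drop-in replacement for the handler.

```c
#include <stdint.h>

/* Hypothetical helper: mirrors the zero_bss loop.
 * Tests first, so an empty range does nothing. */
static void zero_words(uint32_t *cur, const uint32_t *end)
{
    while (cur < end) {
        *cur++ = 0u;      /* like: stmia r2!, {r0} */
    }
}

/* Hypothetical helper: mirrors the move_data_to_ram loop.
 * src plays the role of _sidata (the copy stored in flash),
 * dst and dst_end play _start_data and _end_data (the home in SRAM). */
static void copy_words(const uint32_t *src, uint32_t *dst, const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = *src++;  /* like: ldmia r2!, {r3} then stmia r0!, {r3} */
    }
}
```

Both loops move one 32-bit word at a time, which is why the linker script keeps the section boundaries aligned to 4 bytes.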

And the top of the file with lots of variables to test

.bss 
/* note, bss is marked noload in the linker */
/* no matter how long, shouldn't affect size of elf */
/* assembler wont let you store values that aren't 0 */
make: .word 0
something: .word 0
be: .word 0
here: .word 0

.data
/* like define, but tricksy. */
.equ pelargonium, 0x600dCA7
.equ rosebud, 0xF0CACC1A

/* variables with values */
ennie:  .word  124
meenie: .word  125
mineie: .word  126
moe:    .word  127

@ this is NOT how you create unallocated memory
@ these all point to the same place in memory. 
breakfast:  .word
lunch:      .word
dinner:     .word
midnight_snack: .word 27 

adr_ennie: .word ennie
adr_meenie: .word meenie
adr_mineie: .word mineie 
adr_moe: .word moe
adr_breakfast: .word breakfast
adr_lunch: .word lunch
adr_dinner: .word dinner
adr_midnight_snack: .word midnight_snack

adr_make: .word make

increment: .word 1

.section .text.program_code

.word my_main
.thumb_func
my_main:
  /* note to self main started */
  LDR  r7, =0x0000CCCC

  @ values of the unallocated should match
  @ ennie-moe after done with this. 
  LDR r0, =ennie
  ldmia r0!, {r1-r4}
  LDR r0, =make
  stmia r0!, {r1-r4}
  
  /* start with 0 */
  movs r0, #0
  loop_start:
    // Add 1 to register 'r0'.
    ADDS r0, r0, #1
  b loop_start

The Matching Linker

ENTRY(reset_handler);
MEMORY {
    flash(rx): ORIGIN = 0x00000000, LENGTH = 0x00040000 /*256K*/
    sram(rwx): ORIGIN = 0x20000000, LENGTH = 0x00008000 /*32K*/
}

_end_stack = ORIGIN(sram) + LENGTH(sram);  /* 0x20008000 */

SECTIONS {
    .text : { 
        hello_startup.o(.vectors)  
        KEEP(*(.vectors .vectors.*))
        *(.text.reset_handler)
        *(.text.default_handler)
        /* the rest of the .text and .text sub sections */
        *(.text .text.*)
        /* all files read only data and read only data sub sections */
        *(.rodata, .rodata*) /* read only data */
       
    } > flash /* nothing relocatable */
    . = ALIGN(4);
    _end_text = .; /* store current location counter value in _end_text */

    /* Used by the startup to initialize data with the "originating" address */
    /* not so relevant on the M0+, more significant diff on beefier cores */
    _sidata = LOADADDR(.data);

    .data : { 
        . = ALIGN(4);
        _start_data = .; /* store current location in _start_data */
        *(.data) 
        . = ALIGN(4);
        _end_data = .; /* store current location in _end_data */
    } > sram AT> flash

    .bss(NOLOAD) : { 
        . = ALIGN(4);
        _start_bss = .;
        *(.bss) 
         *(COMMON) 
     } > sram
     /* todo, inside or outside */
     . = ALIGN(4);
     _end_bss = .;
}
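
The `. = ALIGN(4);` lines round the location counter up to the next multiple of 4, so the word-at-a-time loops in the reset handler never start or stop mid-word. A small sketch of that rounding in C, with a made-up function name and assuming the alignment is a power of two:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper showing the rounding that ALIGN(n) performs. */
static inline uint32_t align_up(uint32_t location, uint32_t alignment)
{
    /* The bit trick below only works for powers of two. */
    assert(alignment != 0u && (alignment & (alignment - 1u)) == 0u);
    return (location + alignment - 1u) & ~(alignment - 1u);
}
```

For example, a section that ends at 0x20000001 would have its end symbol moved up to 0x20000004, while one that already ends on a 4-byte boundary stays where it is.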

C - Splitting The Code Into Separate Files

Honestly. It just worked.

Direct link

The real change is finally adding in all the peripherals! (See Table 2-1, Configuration Summary. Some of those are 0s b/c they’re not on the E version of the chip.)

.section  .vectors
.global vector_table
vector_table:
    //-------------- arm core list
    .word   _end_stack //edge of stack, set in linker
    .word   reset_handler //reset handler
    .word   nmi_handler
    .word   hard_fault_handler
    .word   0//not in M0+
    .word   0//not in M0+
    .word   0//not in M0+
    .word   0//not in M0+
    .word   0//not in M0+
    .word   0//not in M0+
    .word   0//not in M0+
    .word   svc_handler
    .word   0//not in M0+
    .word   0//not in M0+
    .word   pendsv_handler
    .word   systick_handler
    //-------------- peripherals list
    .word   PM_Handler
    .word   SYSCTRL_Handler
    .word   WDT_Handler
    .word   RTC_Handler
    .word   EIC_Handler
    .word   NVMCTRL_Handler
    .word   DMAC_Handler
    .word   USB_Handler
    .word   EVSYS_Handler
    .word   SERCOM0_Handler
    .word   SERCOM1_Handler
    .word   SERCOM2_Handler
    .word   SERCOM3_Handler
    .word   0
    .word   0
    .word   TCC0_Handler
    .word   TCC1_Handler
    .word   TCC2_Handler
    .word   TC3_Handler
    .word   TC4_Handler
    .word   TC5_Handler
    .word   0
    .word   0
    .word   ADC_Handler
    .word   AC_Handler
    .word   DAC_Handler
    .word   PTC_Handler
    .word   I2S_Handler
    .word   0
    .word   0

.text

.section .text.default_handler
.word default_handler
.thumb_func
default_handler:
   ldr r3, =#0x8BADF00D
   //hangs the program.
   b .

.weakref nmi_handler, default_handler
.weakref hard_fault_handler, default_handler
.weakref svc_handler, default_handler
.weakref pendsv_handler, default_handler
.weakref systick_handler, default_handler

.weakref PM_Handler, default_handler
.weakref SYSCTRL_Handler, default_handler
.weakref WDT_Handler, default_handler
.weakref RTC_Handler, default_handler
.weakref EIC_Handler, default_handler
.weakref NVMCTRL_Handler, default_handler
.weakref DMAC_Handler, default_handler
.weakref USB_Handler, default_handler
.weakref EVSYS_Handler, default_handler
.weakref SERCOM0_Handler, default_handler
.weakref SERCOM1_Handler, default_handler
.weakref SERCOM2_Handler, default_handler
.weakref SERCOM3_Handler, default_handler
//.weakref SERCOM4_Handler, default_handler
//.weakref SERCOM5_Handler, default_handler
.weakref TCC0_Handler, default_handler
.weakref TCC1_Handler, default_handler
.weakref TCC2_Handler, default_handler
.weakref TC3_Handler, default_handler
.weakref TC4_Handler, default_handler
.weakref TC5_Handler, default_handler
//.weakref TC6_Handler, default_handler
//.weakref TC7_Handler, default_handler
.weakref ADC_Handler, default_handler
.weakref AC_Handler, default_handler
.weakref DAC_Handler, default_handler
.weakref PTC_Handler, default_handler
.weakref I2S_Handler, default_handler
.weakref AC1_Handler, default_handler
//.weakref TCC3_Handler, default_handler

Summary

I already have a WIP with all the peripherals commented, a better default handler, etc. But I think this post is quite long enough!

Hopefully, fingers crossed, next time, blinking?
