How can I get this SAMD21E18 startup code a little sturdier?
This article is part of a series.
- Part 1: Hello LED on an AVR (ATtiny45) in C
- Part 2: How can I make programming an ARM chip as hard as possible?
- Part 3: This Article
- Part 4: It's ALIVE! (SAMD21E18A, Assembly, No SDK)
- Part 5: And now for 3 ways to set an internal pullup
- Part 6: It'd make sense to do some toolchain clean up
- Part 7: Neat, switching to the ItsyBitsy just... works
Related Repo: https://github.com/carlynorama/StrippedDownChipRosetta
At the end of the last post I mentioned that I wasn’t going to comment on the code or the linker script, but that I didn’t recommend editing them yet. The code from the last post worked because it had workable hooks where they needed to be for what it was trying to do. Many minimal examples work because they do nothing. Adding more code in the wrong place can break things with no error messages to help.
When programming for a machine with an operating system, generally folks can start with the app’s entry point, “main”, and go from there. On a microcontroller the software has to be both application and “operating system”. Thankfully IDEs and frameworks hide a lot of that so folks can get right to work on their idea and only dig deeper if they have a special concern.
When one does need to go under the hood, the process of setting up a new chip has similarities to learning how to implement a new Protocol. There’s a contract to deliver on and you have to learn what it is.
The first step involves providing a start up script that at least:
- Defines and places a critical list of function pointers called a “vector table”. (On Arm® Cortex®-M0+ Processor Core MCU chips the official name of the interrupt controller that works from this table is the Nested Vectored Interrupt Controller, or NVIC.)
- Initializes two memory areas (.data and .bss)
- Starts the program!
This post is less tutorial and more record of what I did, what resources I needed and what resources I think others might find helpful to fill in the gaps.
References
Tutorials and repos here. Topic specific links in their sections.
Official
Writing start up scripts and linker files correctly requires intimate knowledge of the specific microcontroller’s physical layout. Smart people focused on concrete goals use manufacturer-provided versions of these files because the manufacturer in theory knows their chip best (and any errata). Finding all the information needed can take a little digging. These are all the sources I looked at. I will state which one contributed to what as I go.
- Most Important Link For Vector Table. Do. Not. Skip.
- Product Reference Page
- https://www.microchip.com/en-us/product/ATSAMD21E18
- Reachable from here:
- arm-gcc compiler tools, (come with Atmel Studio if you have that) https://www.microchip.com/en-us/tools-resources/develop/microchip-studio/gcc-compilers
- http://packs.download.atmel.com/ (rename .atpack files to .zip) (packs on github)
- https://www.keil.arm.com/devices/ for CMSIS? Link provided by Microchip. hunh.
- Full Family Data Sheet
- Cortex-M0+ Technical Reference Manual (Revision: r0p1)
- ARMv6-M Architecture Reference Manual
- Demo project repo, SAMD21N Getting Started. Needs MPLAB software to run, uses C, but I wanted to look at the list of hardware defs and linker scripts to double check my work.
More Instructional
- Carry Over: STM32 Based Fastbit Embedded Brain Academy “Bare metal embedded” playlist. Especially videos 2 & 3
- Continuation: “Bare Metal” STM32 Programming (Part 2)
- If you’re using a Pico (RP2040), stop reading this and go
- https://forum.digikey.com/t/getting-started-with-the-sam-d21-xplained-pro-without-asf/13176
- https://jacobmossberg.se/posts/2017/01/10/minimalistic-assembler-program-on-stm32-h103.html
- armasm source: https://pygmy.utoh.org/riscy/cortex/led-stm32.html
How to search github
When looking for start up scripts to verify my work wasn’t totally off base, github searches of:
SAMD21E18 path:*.s
path:*SAMD21* path:*.s
path:*SAMD21* path:*.s path:*startup*
#ifndef _SAMD21E18A_
Turned up some of the most useful results.
To go a little more scatter shot, the following phrases (plus your Microchip part number) are an indicator that file started from the official source. I’m sure you can come up with more “Statistically Improbable Phrases” from the demo projects, etc above.
- “\brief Header file for”
- “Microchip Technology Inc. and its subsidiaries”
- “Copyright (wild card) Atmel Corporation”
- “\brief This module contains”
How To Compare Assembly Start Up Files
This will not be a course in assembly, although I’ve tried to provide alternate examples and some explanations.
I mentioned searching github, but assembly start up scripts can look wildly different even for the same chip:
- for tool chain reasons
- for syntax preference reasons
- for what parts of the chip’s functionality are actually needed reasons
- for compatibility with other code reasons (will there be C?, is this just for one project or the basis of a library?)
- for stylistic decisions about how to ensure code blocks end up where they’re supposed to be reasons (physical placement, naming conventions, manually setting the counter value)
- for stylistic decisions about how much gets handled in the code files vs the linker file reasons (where do you see “align”)
It can be reaaaaly hard to tell what’s a meaningful difference between two files and what is not. Hopefully the links below will help when my comments aren’t good enough.
- Nice syntax explanation in a relevant example of a start up (Not M0 though)
- Intro to Assembly for Arm programming for playlist from last post.
- MIT Open Courseware 4. Assembly Language & Computer Architecture from MIT 6.172 Performance Engineering of Software Systems, Fall 2018
A - A mildly better version
Finding what I needed to make a vector table
The “vector table” lists all the addresses of the critical low level functions the chip will need to know what to do when “something happens”. They must live at exactly the location defined by the architecture and the manufacturer. They are incredibly device dependent, although the nice thing about Arm chips is that they share some values across cores and architectures.
Some of the types of things that get answered by the vector table are:
- what’s the first thing I should do?
- what do I do when I fail?
- what do I do with this message?
- it’s been 27 ticks - I check where after 27 ticks?
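To make the idea concrete, here is a minimal hosted C model (it runs on a desktop, not on the chip) of a table of handler addresses indexed by exception number. The names `demo_table`, `demo_dispatch`, `last_exception` and the three `demo_*` handlers are made up for illustration. On real hardware the core does this lookup itself; nothing calls a dispatch function.

```c
#include <stddef.h>

/* Hypothetical model: each table slot holds the address of a handler. */
typedef void (*handler_fn)(void);

int last_exception = -1;   /* demo only: records which handler ran */

static void demo_reset(void)     { last_exception = 1;  }
static void demo_hardfault(void) { last_exception = 3;  }
static void demo_systick(void)   { last_exception = 15; }

/* Cortex-M0+ core slots: 0 is the initial stack pointer (not a function),
   1 Reset, 2 NMI, 3 HardFault, 11 SVCall, 14 PendSV, 15 SysTick.
   Slots not listed here stay NULL, like the ".word 0" entries. */
static handler_fn demo_table[16] = {
    [1]  = demo_reset,
    [3]  = demo_hardfault,
    [15] = demo_systick,
};

/* Look up and run the handler for an exception number.
   Returns 0 on success, -1 if the slot is out of range or empty. */
int demo_dispatch(unsigned exception_number)
{
    if (exception_number >= 16u || demo_table[exception_number] == NULL) {
        return -1;
    }
    demo_table[exception_number]();
    return 0;
}
```

Calling `demo_dispatch(15)` runs the SysTick stand-in, which is the "it's been 27 ticks, where do I go?" question from the list above. An empty slot returns -1 in this model; on the chip an empty slot is a much worse day.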
Misc Overviews
- Program, Interrupted - Computerphile https://www.youtube.com/watch?v=54BrU82ANww (6:40)
- https://en.wikipedia.org/wiki/Vectored_interrupt
- 18.2.6 Strong Priority Systems https://www.youtube.com/watch?v=PmOq8G_hs4o from Computation Structures. Uses keyboards and printers, but micros have other “peripherals”.
How does that all work on the SAMD21 specifically
- READ THIS!!!!! —-> https://developerhelp.microchip.com/xwiki/bin/view/products/mcu-mpu/32bit-mcu/sam/samd21-mcu-overview/samd21-processor-overview/samd21-nvic-overview/ <———
A vector table for a SAMD21 has to deliver on the Arm® Cortex®-M0+ Processor Core MCU requirements as well as support its own peripherals. The above link is by far the best information I found for this chip.
Images below from that page:
But this just tells you about the Cortex-M0+ values. The chip specific information can be tracked down in the Datasheet. Unfortunately I couldn’t find a comprehensive list in one easy place like for the Atmega328 (not Arm, different beast, table 11-1 in 328 datasheet) or the STM32F0 (Arm Cortex-M0 (not M0+), Section 11, Table 36 in the reference manual). If it wasn’t for this one page I’m not sure how one would have pulled together the full list except by using sample code.
Product Mapping
Datasheet 9 Product Mapping Figure 9-1. SAM D21 Product Mapping
Chip Specific Hardware List
The 0 next to PM - Power Manager lines up with the 16 IRQ 0 in the very first image, and the 0 PM_Manager (address 0x00000040).
Datasheet 11.2 Nested Vector Interrupt Controller 11.2.2 Interrupt Line Mapping
!!!WARNING!!! FAMILY Datasheet. Cross reference the below with section 2 Configuration Summary (not shown) for which ones your chip specifically implements.
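The mapping from an interrupt line number to a slot in the vector table is plain arithmetic: the first 16 entries belong to the core and every entry is one 32-bit word. A small sketch in C, assuming the table sits at address 0x00000000 as it does in this post. The function and macro names are mine, not from any Microchip header.

```c
#include <stdint.h>

/* The Cortex-M0+ keeps the first 16 table entries for the core itself. */
#define CORE_VECTOR_COUNT 16u
#define VECTOR_ENTRY_SIZE 4u   /* each entry is one 32-bit word */

/* Vector table index for a peripheral interrupt line (PM is line 0). */
uint32_t irq_to_vector_index(uint32_t irq_line)
{
    return CORE_VECTOR_COUNT + irq_line;
}

/* Byte address of that entry, assuming the table starts at 0x00000000. */
uint32_t irq_to_vector_address(uint32_t irq_line)
{
    return irq_to_vector_index(irq_line) * VECTOR_ENTRY_SIZE;
}
```

`irq_to_vector_address(0)` comes out as 0x40, which matches the PM entry called out above. Which line numbers your part actually implements still has to come from the datasheet tables.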
Updating Hello Code With Minimal Vector Table & Handlers
Between your start up function and the linker file the idea is to defend the expected amount of space at the expected location and fill it with either useful or guaranteed harmless information.
My first example will only specify Cortex-M0+ needed values. Since embedded chips are small it’s not uncommon to not leave space for the features of the chip you know your code (AND CIRCUIT!) won’t use.
.syntax unified
.cpu cortex-m0
.fpu softvfp
.thumb
@ vector_table will be at the top of text, because it is at the top of text
@ and we have no other files.
.text
vector_table:
.word sp_initial_value //edge of stack, set in linker
.word reset_handler //reset handler
.word nmi_handler
.word hard_fault_handler
.word 0//not in M0+
.word 0//not in M0+
.word 0//not in M0+
.word 0//not in M0+
.word 0//not in M0+
.word 0//not in M0+
.word 0//not in M0+
.word svc_handler
.word 0//not in M0+
.word 0//not in M0+
.word pendsv_handler
.word systick_handler
.weakref nmi_handler, default_handler
.weakref hard_fault_handler, default_handler
.weakref svc_handler, default_handler
.weakref pendsv_handler, default_handler
.weakref systick_handler, default_handler
.word default_handler
.thumb_func
default_handler:
ldr r3, =#0x8BADF00D
@ hangs the program.
b .
.global reset_handler
.thumb_func
reset_handler:
LDR r0, =sp_initial_value
MOV sp, r0
LDR r7, =0xF0CACC1A
MOVS r0, #0
main_loop:
@ Add 1 to register 'r0'.
ADDS r0, r0, #1
@ Loop back.
B main_loop
I’m using a different syntax than the vivonomicon version. This version of the code WILL NOT WORK for non-thumb mode arm processors. I’ll revert to that more core flexible syntax, using the gcc assembler directive .type, in code where I care about portability.
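Each interrupt gets a weakref (search on this gnu docs page) to a default handler function. The intent is to let a future user of this code override those entries in another file. One caveat from the GNU as documentation: a .weakref alias is handled entirely inside the assembler and never makes it to the symbol table, so if the override has to come from a different object file, the .weak plus .set/.thumb_set forms below are the ones to reach for. I could have just put default_handler everywhere I put a unique function name in the vector table. You might see this weak ref creation in other files as: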
.weak SysTick_Handler
.thumb_set SysTick_Handler, Default_Handler
A different flavor of assembly (armasm, rather than the GNU assembler syntax Arm now recommends) in some of the official docs
SysTick_Handler PROC
EXPORT SysTick_Handler [WEAK]
B .
ENDP
Handled by a macro as suggested in Arm provided code (set vs thumb_set depending on the instruction set)
.macro IRQ handler
.weak \handler
.set \handler, Default_Handler
.endm
IRQ SysTick_Handler
Or even good old
void SysTick_Handler(void) __attribute__ ((weak, alias ("Default_Handler")));
These all serve the same purpose.
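The C form is the easiest one to poke at on a desktop. A minimal sketch, assuming GCC or Clang targeting ELF (arm-none-eabi-gcc or a Linux host; the `alias` attribute is not available on every platform). The `default_hits` counter and `demo_default_hits` are demo-only additions so there is something to observe.

```c
/* Assumes GCC or Clang on an ELF target. */
static volatile unsigned default_hits = 0;  /* demo-only counter */

void Default_Handler(void)
{
    default_hits++;   /* real startup code would usually hang here instead */
}

/* Weak aliases: each name resolves to Default_Handler unless another
   object file supplies a strong definition with the same name. */
void SysTick_Handler(void) __attribute__((weak, alias("Default_Handler")));
void PendSV_Handler(void)  __attribute__((weak, alias("Default_Handler")));

/* Demo helper: how many times the default handler has run. */
unsigned demo_default_hits(void)
{
    return default_hits;
}
```

With nothing else linked in, calling `SysTick_Handler()` lands in `Default_Handler`. Add a second file that defines a normal `void SysTick_Handler(void)` and the linker picks that one instead, with no changes to this file.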
Another thing you might see in thumb mode vector tables is + 1 added to the function address, or some .align N where N is odd before a function declaration later in the code, or something somewhere that shifts the handler addresses so the least significant bit of the address will be 1. I put some links in the assembly link section above and yesterday that cover how the processor uses the lsb of a branch target to tell whether it is headed into Thumb code. On a Thumb-only core like the Cortex-M0+ that bit has to be 1. It comes up a lot.
- https://stackoverflow.com/questions/27118795/cortex-m0-linker-script-and-startup-code/27128307#27128307
- https://stackoverflow.com/questions/77060865/thumb-func-directive-is-not-accounted-for
- https://stackoverflow.com/questions/4423211/when-are-gas-elf-the-directives-type-thumb-size-and-section-needed
- https://wiki.segger.com/Correct_typing_of_Thumb_functions
- https://community.arm.com/support-forums/f/architectures-and-processors-forum/4167/difference-between-thumb-machine-directives
Thankfully .thumb_func should be taking care of that for me. The -mthumb in the gcc calls should do that too when using the .type directive (.type name, %function), but I haven’t confirmed that yet.
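The bit twiddling itself is tiny. A sketch in C of what "the lsb is 1" means for a table entry; the function names are hypothetical and the addresses in the note below are made up, not taken from a real build of this project.

```c
#include <stdint.h>
#include <stdbool.h>

/* Vector table entry for a handler whose code starts at code_address.
   Bit 0 set marks the target as Thumb code. */
uint32_t thumb_vector_entry(uint32_t code_address)
{
    return code_address | 1u;
}

/* Where the instructions actually live: the entry with bit 0 cleared. */
uint32_t thumb_code_address(uint32_t vector_entry)
{
    return vector_entry & ~UINT32_C(1);
}

/* True if a table entry has the Thumb bit set. */
bool is_thumb_entry(uint32_t vector_entry)
{
    return (vector_entry & 1u) != 0u;
}
```

So a handler assembled at a made-up address of 0xC0 should show up in the vector table as 0xC1. That is a handy thing to look for in a disassembly when checking whether .thumb_func did its job.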
Mildly Improving the Linker
This program relies on the vector table being defined in the one source file immediately after the .text assembler directive is invoked. I’ll make a more resilient set up next. The toolchain already has opinions about what to do with .text, so you could leave it out in other circumstances.
How does this differ from the previous file?
- changes the entry point officially ENTRY
- uses the SECTIONS command with some of the linker software special-treatment section names (text, data, bss) [todo links]
- sweeps in those sections from ALL files.
- places those sections with > and AT> based on the defined memory locations
ENTRY(reset_handler);
MEMORY {
flash(rx): ORIGIN = 0x00000000, LENGTH = 0x00040000 /*256K*/
sram(rwx): ORIGIN = 0x20000000, LENGTH = 0x00008000 /*32K*/
}
sp_initial_value = ORIGIN(sram) + LENGTH(sram); /* 0x20008000 */
SECTIONS {
.text : { *(.text) } > flash
.data : { *(.data) } > sram AT> flash
.bss : { *(.bss) } > sram
}
More links on linker files later.
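The numbers in the MEMORY block are worth sanity checking by hand, since a typo there fails silently. A small C sketch of the same arithmetic, with the origin and length values copied from the script above. The struct and function names are mine.

```c
#include <stdint.h>

/* Mirrors the MEMORY block; values copied from the linker script. */
typedef struct {
    uint32_t origin;
    uint32_t length;
} mem_region;

static const mem_region FLASH = { 0x00000000u, 0x00040000u };
static const mem_region SRAM  = { 0x20000000u, 0x00008000u };

/* ORIGIN(region) + LENGTH(region): the first address past the region. */
uint32_t region_end(mem_region r)
{
    return r.origin + r.length;
}

/* Region size in KiB, to check the comments in the script. */
uint32_t region_kib(mem_region r)
{
    return r.length / 1024u;
}

/* Same expression the script uses for sp_initial_value. */
uint32_t demo_initial_sp(void) { return region_end(SRAM); }

/* Flash size in KiB. */
uint32_t demo_flash_kib(void)  { return region_kib(FLASH); }
```

`demo_initial_sp()` gives 0x20008000, one past the last byte of SRAM. That is fine because the Cortex-M stack is full-descending: the first push decrements the stack pointer before storing anything.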
B - Separating the Startup from The Program Code
This code works (use the makefile), but it’s still fragile. What the chip needs and our wants are all mushed together and everything relies on putting things in the right place in the file. And we haven’t even done some of the important startup basics.
1 - Pull out the Main Loop, Still one File
First I changed hello_improved to hello_main, pulling out the loop in the same file via branching. I did not follow all the proper calling conventions.
hello_main also has new section markers, which were tested by adding them in the linker and running the shiny new disassemble command in the makefile.
This is all in the repo here
2 - Doing more for the start up
When looking for the right assembly syntax for the reset handler I was influenced by the following scripts:
- https://vivonomicon.com/2018/04/20/bare-metal-stm32-programming-part-2-making-it-to-main/
- Same with different copyrights? (armasm)
- https://github.com/microsoft/uf2-samdx1/blob/4c900345561949f1c37a367d72eb8e420b01f72f/lib/samd21/samd21a/armcc/Device/SAMD21/Source/ARM/startup_SAMD21.s
- https://github.com/Microchip-MPLAB-Harmony/dev_packs/blob/345dc12d42a4fdec72117b64a8c3527023bdeca9/Microchip/SAMD21_DFP/3.6.144/samd21a/armcc/armcc/startup_samd21e18a.s
- https://github.com/apache/mynewt-core/blob/158c30fa78f6a93eb267896380f2a03a93cc1527/hw/bsp/arduino_zero/src/arch/cortex_m0/startup_samd21xx.s
- https://electronics.stackexchange.com/a/452035
- https://gist.github.com/ppannuto/672328eb8184abdb9559
TODO: Benchmark them. I’m not doing anything fancy enough with this code to really make a difference.
I haven’t commented every line. Some assistance (also last post’s links…):
- thumb instruction quick reference card by arm
- Fancier branching:
- Multi register handling:
- What’s the deal with _sidata
- since data is relocatable (unlike text), we need the “real” address once available.
- https://en.wikipedia.org/wiki/Data_segment
- LOADADDR in the docs: “Return the absolute load address of the named section. This is normally the same as ADDR, but it may be different if the AT keyword is used in the section definition”
- NOLOAD starts showing up next to bss in fancier linkers - discussion
- more about linkers:
- working with variables in gcc assembly (to test this worked)
- ALIGN (right under LOADADDR) and why memory alignment is important.
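On the ALIGN point: the reset handler below moves one 32-bit word at a time with ldmia/stmia, and the Cortex-M0+ faults on unaligned word accesses, so the section bounds need to be multiples of 4. What `. = ALIGN(4)` does to the location counter can be sketched in C like this (the function name is mine, and the power-of-two requirement is an assumption of this bit trick):

```c
#include <stdint.h>

/* Round address up to the next multiple of alignment.
   alignment must be a power of two (4 in the linker script here). */
uint32_t align_up(uint32_t address, uint32_t alignment)
{
    return (address + alignment - 1u) & ~(alignment - 1u);
}
```

An address that is already aligned comes back unchanged, so sprinkling extra ALIGN(4) lines through a linker script costs nothing when things already line up.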
The new reset handler code
.section .text.reset_handler
@ The Reset Handler
.global reset_handler
.thumb_func
reset_handler:
/* reset handler started */
/* r7 will hold my notes to self */
ldr r7, =0x0000AAAA
/* set stack pointer */
ldr r1, =_end_stack
mov sp, r1
/* set .bss section to 0 (SRAM) */
/* uses the jump to exit style loop */
zero_bss:
/* get 0 handy in two registers */
movs r0, #0
movs r1, #0
/* get the bounds */
ldr r2, = _start_bss
ldr r3, = _end_bss
loop_start_bss_zero:
/* compare first, in case there is no bss at all. */
cmp r2, r3
/* unsigned higher or same: exit once current (r2) has reached end (r3) */
bhs loop_end_bss_zero
/* two register version: would store 2 words and advance the pointer after */
@ stmia r2!, {r0,r1}
/* stores 1 zero word and advances the pointer after */
stmia r2!, {r0} /* decided against moving bss to after data in linker */
b loop_start_bss_zero
loop_end_bss_zero:
/* copy .data section to SRAM */
/* uses the jump to entry style loop */
move_data_to_ram:
/* get the bounds */
ldr r0, = _start_data
ldr r1, = _end_data
/* read head location */
ldr r2, = _sidata
b loop_enter_data_copy
loop_action_data_copy:
ldmia r2!, {r3}
stmia r0!, {r3}
loop_enter_data_copy:
cmp r0, r1
bcc loop_action_data_copy
/* call the clock system initialization function.*/
/* not done */
/* call init of libraries (stdlib when loop in C) that need it.*/
/* not done */
/* note to self handler finished */
LDR r7, =0x0000BBBB
/* Call the application's entry point.*/
bl my_main
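For anyone who reads C faster than assembly, here is a hosted model of the two loops in the reset handler. It is a sketch that runs on a desktop with ordinary arrays standing in for flash and SRAM. The `demo_*` and `fake_*` names are invented for this example, and the 124 to 127 values are borrowed from the test variables in this post.

```c
#include <stdint.h>
#include <stddef.h>

/* Zero every word in [start, end). Mirrors the zero_bss loop:
   test first, store a zero, advance the pointer. */
void demo_zero_words(uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = 0u;
    }
}

/* Copy words from the load image into [dst_start, dst_end).
   Mirrors move_data_to_ram (ldmia / stmia with write-back). */
void demo_copy_words(uint32_t *dst_start, const uint32_t *dst_end,
                     const uint32_t *src)
{
    while (dst_start < dst_end) {
        *dst_start++ = *src++;
    }
}

/* Hosted stand-ins for flash and SRAM so this runs on a desktop. */
static const uint32_t fake_flash_data[4] = { 124u, 125u, 126u, 127u };
static uint32_t fake_sram_data[4];
static uint32_t fake_sram_bss[4] = { 0xDEADBEEFu, 0xDEADBEEFu,
                                     0xDEADBEEFu, 0xDEADBEEFu };

/* Run both startup steps and return the sum of the resulting words. */
uint32_t demo_run_startup(void)
{
    demo_zero_words(fake_sram_bss, fake_sram_bss + 4);
    demo_copy_words(fake_sram_data, fake_sram_data + 4, fake_flash_data);

    uint32_t sum = 0u;
    for (size_t i = 0; i < 4; i++) {
        sum += fake_sram_data[i] + fake_sram_bss[i];
    }
    return sum;
}
```

If this were ported to a real C startup file, the array bounds would be replaced by the linker symbols (for example `extern uint32_t _start_bss;` and taking its address). That is an assumption about how one might do it, not something this series does.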
And the top of the file with lots of variables to test
.bss
/* note, bss is marked noload in the linker */
/* no matter how long, shouldn't affect size of elf */
/* assembler won't let you store values that aren't 0 */
make: .word 0
something: .word 0
be: .word 0
here: .word 0
.data
/* like define, but tricksy. */
.equ pelargonium, 0x600dCA7
.equ rosebud, 0xF0CACC1A
/* variables with values */
ennie: .word 124
meenie: .word 125
mineie: .word 126
moe: .word 127
@ this is NOT how you create unallocated memory
@ these all point to the same place in memory.
breakfast: .word
lunch: .word
dinner: .word
midnight_snack: .word 27
adr_ennie: .word ennie
adr_meenie: .word meenie
adr_mineie: .word mineie
adr_moe: .word moe
adr_breakfast: .word breakfast
adr_lunch: .word lunch
adr_dinner: .word dinner
adr_midnight_snack: .word midnight_snack
adr_make: .word make
increment: .word 1
.section .text.program_code
.word my_main
.thumb_func
my_main:
/* note to self main started */
LDR r7, =0x0000CCCC
@ values of the unallocated should match
@ ennie-moe after done with this.
LDR r0, =ennie
ldmia r0!, {r1-r4}
LDR r0, =make
stmia r0!, {r1-r4}
/* start with 0 */
movs r0, #0
loop_start:
// Add 1 to register 'r0'.
ADDS r0, r0, #1
b loop_start
The Matching Linker
ENTRY(reset_handler);
MEMORY {
flash(rx): ORIGIN = 0x00000000, LENGTH = 0x00040000 /*256K*/
sram(rwx): ORIGIN = 0x20000000, LENGTH = 0x00008000 /*32K*/
}
_end_stack = ORIGIN(sram) + LENGTH(sram); /* 0x20008000 */
SECTIONS {
.text : {
hello_startup.o(.vectors)
KEEP(*(.vectors .vectors.*))
*(.text.reset_handler)
*(.text.default_handler)
/* the rest of the .text and .text sub sections */
*(.text .text.*)
/* all files read only data and read only data sub sections */
*(.rodata, .rodata*) /* read only data */
} > flash /* nothing relocatable */
. = ALIGN(4);
_end_text = .; /* store current location counter value in _end_text */
/* Used by the startup to initialize data with the "originating" address */
/* not so relevant on the M0+, more significant diff on beefier cores */
_sidata = LOADADDR(.data);
.data : {
. = ALIGN(4);
_start_data = .; /* store current location in _start_data */
*(.data)
. = ALIGN(4);
_end_data = .; /* store current location in _end_data */
} > sram AT> flash
.bss(NOLOAD) : {
. = ALIGN(4);
_start_bss = .;
*(.bss)
*(COMMON)
} > sram
/* todo, inside or outside */
. = ALIGN(4);
_end_bss = .;
}
C - Splitting The Code Into Separate Files
Honestly. It just worked.
- split the files
- change the name of the hard-coded include in the linker
- updated the Makefile to pull in two files (very unpolished. I’ll fix it later.)
The real change is finally adding in all the peripherals! (see Table 2-1 (Configuration Summary). Some of those are 0s b/c not on the E version of the chip)
.section .vectors
.global vector_table
vector_table:
//-------------- arm core list
.word _end_stack //edge of stack, set in linker
.word reset_handler //reset handler
.word nmi_handler
.word hard_fault_handler
.word 0//not in M0+
.word 0//not in M0+
.word 0//not in M0+
.word 0//not in M0+
.word 0//not in M0+
.word 0//not in M0+
.word 0//not in M0+
.word svc_handler
.word 0//not in M0+
.word 0//not in M0+
.word pendsv_handler
.word systick_handler
//-------------- peripherals list
.word PM_Handler
.word SYSCTRL_Handler
.word WDT_Handler
.word RTC_Handler
.word EIC_Handler
.word NVMCTRL_Handler
.word DMAC_Handler
.word USB_Handler
.word EVSYS_Handler
.word SERCOM0_Handler
.word SERCOM1_Handler
.word SERCOM2_Handler
.word SERCOM3_Handler
.word 0
.word 0
.word TCC0_Handler
.word TCC1_Handler
.word TCC2_Handler
.word TC3_Handler
.word TC4_Handler
.word TC5_Handler
.word 0
.word 0
.word ADC_Handler
.word AC_Handler
.word DAC_Handler
.word PTC_Handler
.word I2S_Handler
.word 0
.word 0
.text
.section .text.default_handler
.word default_handler
.thumb_func
default_handler:
ldr r3, =#0x8BADF00D
//hangs the program.
b .
.weakref nmi_handler, default_handler
.weakref hard_fault_handler, default_handler
.weakref svc_handler, default_handler
.weakref pendsv_handler, default_handler
.weakref systick_handler, default_handler
.weakref PM_Handler, default_handler
.weakref SYSCTRL_Handler, default_handler
.weakref WDT_Handler, default_handler
.weakref RTC_Handler, default_handler
.weakref EIC_Handler, default_handler
.weakref NVMCTRL_Handler, default_handler
.weakref DMAC_Handler, default_handler
.weakref USB_Handler, default_handler
.weakref EVSYS_Handler, default_handler
.weakref SERCOM0_Handler, default_handler
.weakref SERCOM1_Handler, default_handler
.weakref SERCOM2_Handler, default_handler
.weakref SERCOM3_Handler, default_handler
//.weakref SERCOM4_Handler, default_handler
//.weakref SERCOM5_Handler, default_handler
.weakref TCC0_Handler, default_handler
.weakref TCC1_Handler, default_handler
.weakref TCC2_Handler, default_handler
.weakref TC3_Handler, default_handler
.weakref TC4_Handler, default_handler
.weakref TC5_Handler, default_handler
//.weakref TC6_Handler, default_handler
//.weakref TC7_Handler, default_handler
.weakref ADC_Handler, default_handler
.weakref AC_Handler, default_handler
.weakref DAC_Handler, default_handler
.weakref PTC_Handler, default_handler
.weakref I2S_Handler, default_handler
.weakref AC1_Handler, default_handler
//.weakref TCC3_Handler, default_handler
Summary
I already have a WIP with all the peripherals commented, a better default handler, etc. But I think this post is quite long enough!
Hopefully, fingers crossed, next time, blinking?