Bootloader with OTA Phase 2: Minimal Bootloader

Roadmap

  • Phase 2A: Create a bootloader project. The goal is to create a simple bootloader program running at 0x08000000, blinking an LED, and staying in an infinite loop.
  • Phase 2B: Create a separate project for the application. It will also blink an LED but with a different interval. The application runs at 0x08010000 and we will modify the linker script to achieve this.
  • Phase 2C: We will make a jump_to_application function inside the bootloader to transfer control from the bootloader to the application.

Phase 2A: Create a Bootloader Project

Let’s create a simple program that blinks an LED using HAL functions. It has no bootloading features yet, but it will become the bootloader later on. Open Core/Src/main.c and modify the main function:

int main(void)
{
  /* MCU Configuration */
  HAL_Init();
  SystemClock_Config();
  MX_GPIO_Init();
  MX_USART1_UART_Init();  // If you configured UART

  /* USER CODE BEGIN 2 */

  printf("\r\n");
  printf("========================================\r\n");
  printf("    BOOTLOADER v1.0                    \r\n");
  printf("========================================\r\n");
  printf("Running at address: 0x%08lX\r\n", (uint32_t)&main);
  printf("Bootloader running...\r\n");
  printf("\r\n");

  /* USER CODE END 2 */

  /* Infinite loop */
  while (1)
  {
    /* USER CODE BEGIN 3 */

    // Blink LED slowly (bootloader pattern)
    HAL_GPIO_TogglePin(GPIOG, GPIO_PIN_13);  // Green LED
    HAL_Delay(500);  // 500ms on, 500ms off

    /* USER CODE END 3 */
  }
}

Claude then posed a few checkpoint questions:

  • Do you see these files?: Debug/Bootloader.elf (executable with debug symbols) and Debug/Bootloader.bin (raw binary)
  • When you run the bootloader, what address do you see printed for &main? Is it close to 0x08000000?
  • When we create the application project, what will be different compared to the bootloader project?

Q1. Where Is the .bin File and Why Is It Needed?

If you are running a program in STM32CubeIDE without any additional configuration, then you’ll probably see only these files in the Debug folder:

// My project name is `Basic-Bootloader`
Debug
├── Basic-Bootloader.elf
├── Basic-Bootloader.list
├── Basic-Bootloader.map
...

The .bin file doesn’t appear at first because STM32CubeIDE does not generate it by default. We need to configure the build to create it.

Here’s how to configure the post-build steps:

  1. Right-click your project (Your-Project-Name) → Properties
  2. Navigate to: C/C++ Build → Settings
  3. Go to: MCU Post build outputs tab
  4. Check the box: ☑ Convert to binary file (-O binary)
  5. Click Apply and Close
  6. Rebuild your project

Here is what each file is for:

  • .elf - Full executable with debug symbols, and it is used by the debugger.
  • .bin - Raw binary firmware image. This gets written to flash.
  • .map - Memory map showing where everything is located. It’s also useful for debugging.

This .bin file is the actual executable file of the bootloader. We will keep updating and writing this .bin file to flash.

Q2. What Address Do You See Printed for &main? Is It Close to 0x08000000?

I got 0x080005AD for the &main address. Why is it not exactly 0x08000000? Because the main() function is not the very first thing in flash. Here’s what’s happening:

Flash Memory Layout:
0x08000000: Vector Table (first ~400 bytes)
            - Stack pointer, reset handler, interrupt handlers, etc.
0x08000XXX: Startup code (system initialization)
            - SystemInit(), clock config, C runtime setup
0x080005AD: Your main() function starts HERE ← This is what you see!
0x0800XXXX: Other functions, HAL library code, etc.

Before main() runs:

  1. The vector table occupies roughly the first 0x1AC bytes (107 entries × 4 bytes on the STM32F429)
  2. Reset_Handler (the startup code) runs
  3. The startup code calls SystemInit() and sets up the C runtime (copies initialized data to RAM, zeroes the .bss section)
  4. Finally, it calls main()

That’s why the main() function address is quite close to 0x08000000, but not exactly 0x08000000. At this point, it was getting overwhelming as unfamiliar terms were pouring out. But these concepts will continue to appear in Phase 2 and beyond, so I’ll explain them soon in sufficient detail.
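
One more detail about the printed value: it is odd. Cortex-M cores execute only Thumb instructions, and a function pointer records that in bit 0, so the code itself is stored one byte lower. A minimal sketch of that relationship, written as plain C you can run on a PC (the helper name thumb_code_address is made up for illustration and is not part of the firmware):

```c
#include <stdint.h>

/* Hypothetical helper: strip the Thumb bit (bit 0) from a Cortex-M
 * function pointer value to get the address where the code is stored. */
uint32_t thumb_code_address(uint32_t function_pointer_value)
{
    return function_pointer_value & ~UINT32_C(1);
}
```

For the value printed above, thumb_code_address(0x080005AD) evaluates to 0x080005AC, which is where the first instruction of main() actually sits.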

Q3. When Writing the Application, What Is Different Compared to the Bootloader Project?

This is a key question for Phase 2.

The answer is the memory address. The bootloader starts at 0x08000000, while the application starts at 0x08010000. This requires a linker script modification, which we will handle in the next step.

This raises a fundamental question: why does the application need a different memory address from the bootloader? It might seem obvious, but I found it hard to give a specific answer right away.

Claude answered:

Think about what happens at power-on. The processor always starts executing from address 0x08000000. It reads the stack pointer from 0x08000000, reads the reset vector from 0x08000004, and jumps to that address to start running bootloader code. Reset always starts the bootloader by hardware requirement. Then the bootloader decides whether it should update the application or just jump to it. To enable this “jump to application”, the application must be stored at a separate address.

That is true. Let’s think about the opposite scenario. If the application were located at the same address, 0x08000000, flashing the application would simply overwrite the bootloader.

What about placing the application immediately after the bootloader? It might seem good from a memory efficiency perspective, but it’s not a good idea. It’s hard to predict exactly how big the bootloader binary will be, and its size changes whenever you update it. You would have to move the application every time the bootloader changes, or else a larger bootloader would overwrite the first part of the application. A fixed address with some headroom avoids this. Flash is also erased in whole sectors, and on the STM32F429 0x08010000 is where sector 4 begins, so the bootloader and the application never share a sector.
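
To make the trade-off concrete, here is a small host-side sketch (plain C, not firmware) that checks whether a bootloader image still fits in the space reserved for it. The 64 KB reservation follows from the two addresses used in this post; the function name is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define BOOTLOADER_START 0x08000000u  /* bootloader base, from this post */
#define APP_START        0x08010000u  /* application base, from this post */

/* Space reserved for the bootloader: everything below the application. */
#define BOOTLOADER_RESERVED (APP_START - BOOTLOADER_START)  /* 0x10000 = 64 KB */

/* Hypothetical check: true if a bootloader .bin of this size (in bytes)
 * ends before the application begins. */
bool bootloader_fits(uint32_t bin_size_bytes)
{
    return bin_size_bytes <= BOOTLOADER_RESERVED;
}
```

With a fixed application address, a bootloader that grows from, say, 20 KB to 30 KB needs no change on the application side. Only crossing the 64 KB limit would force the layout to change.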

Phase 2B: Create an Application Project and Modify the Linker Script

To make the application run at 0x08010000 instead of 0x08000000, we first need to edit the linker script, STM32F429ZITX_FLASH.ld, which sits in the root of the application project:

Application/
└── STM32F429ZITX_FLASH.ld

In the file, you’ll see something like:

/* Memories definition */
MEMORY
{
  RAM    (xrw)    : ORIGIN = 0x20000000,   LENGTH = 192K
  CCMRAM (xrw)    : ORIGIN = 0x10000000,   LENGTH = 64K
  FLASH  (rx)     : ORIGIN = 0x08000000,   LENGTH = 2048K
}

Modify the FLASH line from:

FLASH  (rx)     : ORIGIN = 0x08000000,   LENGTH = 2048K

To:

FLASH  (rx)     : ORIGIN = 0x08010000,   LENGTH = 960K
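
The LENGTH value has to keep ORIGIN + LENGTH inside the flash. A quick host-side sketch of that arithmetic (plain C; the helper name is hypothetical, and the 2048K total is taken from the original FLASH line):

```c
#include <stdint.h>

#define FLASH_BASE_ADDR 0x08000000u  /* ORIGIN of the original FLASH line */
#define FLASH_TOTAL_KIB 2048u        /* LENGTH of the original FLASH line */

/* Hypothetical helper: first address past a region that starts at
 * `origin` and spans `length_kib` KiB. */
uint32_t region_end(uint32_t origin, uint32_t length_kib)
{
    return origin + length_kib * 1024u;
}
```

region_end(0x08010000, 960) gives 0x08100000, so the 960K used here ends at the 1 MB mark and stays well inside the 2 MB device. The largest value that would still fit is 2048K - 64K = 1984K, which ends exactly at 0x08200000.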

Fix VECT_TAB_OFFSET in system_stm32f4xx.c

In the Core/Src folder, there is a file named system_stm32f4xx.c. In it, change:

#define VECT_TAB_OFFSET 0x00000000U /*!< Vector Table offset field. This value must be a multiple of 0x200. */

to:

#define VECT_TAB_OFFSET 0x00010000U /* Application starts at 0x08010000 */

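The application needs to tell the Cortex-M core where its interrupt vector table is located. By default, it assumes 0x08000000, but our application is at 0x08010000 (an offset of 0x10000 bytes = 64 KB). Depending on the version of the firmware package, this offset may only take effect if the USER_VECT_TAB_ADDRESS define in the same file is also uncommented, so check how VECT_TAB_OFFSET is used in your copy.

Without this change, SystemInit() in the application can point the core back at the bootloader’s vector table, so when an interrupt fires while the application is running, the CPU looks for the handler at the wrong address (and that’s why my initial try didn’t work).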
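
In the versions of system_stm32f4xx.c I am aware of, SystemInit() combines the flash base address with this offset when it writes the vector table register. The sketch below reproduces that calculation on a PC and checks the alignment rule quoted in the comment above (a multiple of 0x200). The helper names are made up for illustration; treat it as a model, not as the library's code.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLASH_BASE_ADDR 0x08000000u  /* start of internal flash */

/* Model of the value written to SCB->VTOR: base address OR'ed with the offset. */
uint32_t vector_table_address(uint32_t vect_tab_offset)
{
    return FLASH_BASE_ADDR | vect_tab_offset;
}

/* The offset must be a multiple of 0x200 (per the comment in system_stm32f4xx.c). */
bool is_valid_vect_tab_offset(uint32_t vect_tab_offset)
{
    return (vect_tab_offset % 0x200u) == 0u;
}
```

With the new define, vector_table_address(0x00010000) yields 0x08010000, which matches the ORIGIN set in the linker script. The two values have to be changed together.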

Application Code

The application code is also very simple. The only difference from the bootloader is that it toggles the LED every 100 ms, whereas the bootloader toggles it every 500 ms.

int main(void)
{
  /* MCU Configuration */
  HAL_Init();
  SystemClock_Config();
  MX_GPIO_Init();
  MX_USART1_UART_Init();

  /* USER CODE BEGIN 2 */

  printf("\r\n");
  printf("========================================\r\n");
  printf("    APPLICATION v1.0                   \r\n");
  printf("========================================\r\n");
  printf("Running at address: 0x%08lX\r\n", (uint32_t)&main);
  printf("Application is running!\r\n");
  printf("\r\n");

  /* USER CODE END 2 */

  /* Infinite loop */
  while (1)
  {
    /* USER CODE BEGIN 3 */

    // Blink LED FAST (application pattern - different from bootloader)
    HAL_GPIO_TogglePin(GPIOG, GPIO_PIN_13);  // Green LED
    HAL_Delay(100);  // 100ms on, 100ms off (MUCH FASTER than bootloader)

    /* USER CODE END 3 */
  }
}

Phase 2C: jump_to_application Function

Now this is the core feature of the bootloader: jump to the application. I’ll explain each line of code one by one.

void jump_to_application(uint32_t app_address)
{
    printf("Preparing to jump to application at 0x%08lX...\r\n", app_address);

    // 1. Read the application's vector table
    //    First entry: Initial Stack Pointer
    //    Second entry: Reset Handler (entry point)
    uint32_t app_stack_pointer = *((__IO uint32_t*)app_address);
    uint32_t app_entry_point = *((__IO uint32_t*)(app_address + 4));

    printf("  App Stack Pointer: 0x%08lX\r\n", app_stack_pointer);
    printf("  App Entry Point:   0x%08lX\r\n", app_entry_point);

    // 2. Sanity check: Is the stack pointer valid?
    //    It should point to RAM (0x20000000 - 0x20030000 for STM32F429)
    if ((app_stack_pointer < 0x20000000) || (app_stack_pointer > 0x20030000))
    {
        printf("ERROR: Invalid stack pointer! Application may not be valid.\r\n");
        return;  // Don't jump to invalid application
    }

    printf("Jumping to application NOW!\r\n\r\n");
    HAL_Delay(100);  // Give UART time to send the message

    // 3. Disable interrupts
    __disable_irq();  // Sets PRIMASK; it stays set across the jump

    // 4. Disable the peripheral clocks the bootloader may have enabled (important!)
    __HAL_RCC_GPIOA_CLK_DISABLE();
    __HAL_RCC_GPIOB_CLK_DISABLE();
    __HAL_RCC_GPIOC_CLK_DISABLE();
    __HAL_RCC_GPIOD_CLK_DISABLE();
    __HAL_RCC_GPIOE_CLK_DISABLE();
    __HAL_RCC_GPIOF_CLK_DISABLE();
    __HAL_RCC_GPIOG_CLK_DISABLE();
    __HAL_RCC_GPIOH_CLK_DISABLE();
    __HAL_RCC_USART1_CLK_DISABLE();
    __HAL_RCC_USB_OTG_FS_CLK_DISABLE();
    __HAL_RCC_USB_OTG_HS_CLK_DISABLE();

    // Display and external-memory peripherals
    __HAL_RCC_DMA2D_CLK_DISABLE();
    __HAL_RCC_LTDC_CLK_DISABLE();
    __HAL_RCC_FMC_CLK_DISABLE();

    // 5. Deinitialize HAL
    HAL_DeInit();

    // 6. Disable SysTick
    SysTick->CTRL = 0;
    SysTick->LOAD = 0;
    SysTick->VAL = 0;

    // 7. Clear all interrupt pending flags
    for (int i = 0; i < 8; i++)
    {
        NVIC->ICPR[i] = 0xFFFFFFFF;
    }

    // 8. Set the vector table address to the application's vector table
    SCB->VTOR = app_address;

    // 9. Set the stack pointer to the application's initial stack pointer
    __set_MSP(app_stack_pointer);

    // 10. Set control register
    __set_CONTROL(0);

    // 11. Jump to the application's reset handler
    void (*app_reset_handler)(void) = (void (*)(void))app_entry_point;
    app_reset_handler();

    // Should never reach here
    while (1);
}

Line 8-9: Application’s Stack Pointer and Entry Point

    uint32_t app_stack_pointer = *((__IO uint32_t*)app_address);
    uint32_t app_entry_point = *((__IO uint32_t*)(app_address + 4));

This code reads the first two entries from the application’s vector table.

  1. The first is the initial value of the main stack pointer (MSP). It means “where the app wants its stack”.
  2. The second is the address of the reset handler (Reset_Handler), the application’s entry point. It means “where the app code starts”. Note that this is the startup code, which calls main() later, not main() itself.

Put differently, we treat app_address as a pointer to the application’s vector table.

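If we flashed the application correctly, the output should look something like this. The exact values depend on your build, but the stack pointer must be a RAM address and the entry point must be a flash address above 0x08010000:

  App Stack Pointer: 0x20030000
  App Entry Point:   0x08010195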
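
The read-and-check logic can be tried out on a PC by modelling the start of the application's flash as a plain array. This is only a sketch: the struct, the function names, and the sample values are made up for illustration, and the entry-point check (an odd address inside the application's flash) is an extra test that the bootloader code above does not perform.

```c
#include <stdbool.h>
#include <stdint.h>

#define RAM_START  0x20000000u  /* same bounds as the sanity check above */
#define RAM_END    0x20030000u
#define APP_START  0x08010000u
#define FLASH_END  0x08200000u  /* 0x08000000 + 2048K */

typedef struct {
    uint32_t stack_pointer;  /* vector table entry 0 */
    uint32_t entry_point;    /* vector table entry 1 (Reset_Handler) */
} app_header_t;

/* Read the first two words of a vector table image. */
app_header_t read_app_header(const uint32_t *vector_table)
{
    app_header_t header = { vector_table[0], vector_table[1] };
    return header;
}

/* True if the two words look like the start of a real application. */
bool app_header_is_plausible(app_header_t header)
{
    bool sp_in_ram = header.stack_pointer >= RAM_START &&
                     header.stack_pointer <= RAM_END;
    bool entry_in_flash = header.entry_point >= APP_START &&
                          header.entry_point < FLASH_END;
    bool entry_is_thumb = (header.entry_point & 1u) != 0u;
    return sp_in_ram && entry_in_flash && entry_is_thumb;
}
```

Erased flash reads back as 0xFFFFFFFF, so an empty application slot fails the stack pointer check. That is the case the bootloader's sanity check is meant to catch.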

Wait… So What Is a Vector Table?

From the explanation above, we now know that there are pointer or address values stored in a vector table. Then what exactly is a vector table?

It is a lookup table of memory addresses that tells the processor where to jump when specific events occur.

Claude provided a brilliant analogy to explain the concept:

Think of it like an emergency contact list.

  • “If there’s a fire, call 911”
  • “If the power goes out, call the electrician at 555-1234”
  • “If someone breaks in, call security at 555-5678”

The vector table says:

  • If the chip resets, jump to address 0x080101C5
  • If a HardFault occurs, jump to address 0x080101E9
  • If UART1 receives data, jump to address 0x08010245

To put it in other terms, a vector table can be thought of as an array stored in flash memory, where each entry is a 32-bit address.

The ARM Cortex-M vector table always has:

  • Entry 0: Initial value for the stack pointer (not a handler address!)
  • Entries 1 and up: Addresses of exception/interrupt handler functions, starting with the reset handler

While studying the vector table, I came up with a question: does every application have its own vector table? If multiple programs are stored in flash, does each of them have a separate one?

The short answer is YES. Every application has its own vector table embedded in its binary. You can have multiple programs in flash, each with its own vector table, but only one table is active at any moment: the one that VTOR (Vector Table Offset Register) points to.
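
As a mental model, picture two images in flash, each starting with its own table, and one register that selects which table is live. The sketch below imitates that on a PC with ordinary arrays. Every name and value in it is made up for illustration (the application's handler addresses reuse the examples from the analogy above), and on real hardware the core performs this lookup itself.

```c
#include <stdint.h>

/* The first few vector table slots on any Cortex-M. */
enum {
    ENTRY_INITIAL_SP = 0,
    ENTRY_RESET      = 1,
    ENTRY_NMI        = 2,
    ENTRY_HARDFAULT  = 3,
    ENTRY_COUNT      = 4
};

/* Two images, each with its own table (illustrative values only). */
static const uint32_t bootloader_table[ENTRY_COUNT] =
    { 0x20030000u, 0x080001C5u, 0x080001D9u, 0x080001E9u };
static const uint32_t application_table[ENTRY_COUNT] =
    { 0x20030000u, 0x080101C5u, 0x080101D9u, 0x080101E9u };

/* Stand-in for VTOR: only one table is active at a time. */
static const uint32_t *active_table = bootloader_table;

void select_table(const uint32_t *table)
{
    active_table = table;
}

/* Where the core would jump for the given entry, per the active table. */
uint32_t handler_address(int entry)
{
    return active_table[entry];
}
```

Calling select_table(application_table) changes the result of every later handler_address() call without touching either table, which is the effect that writing VTOR has on the real chip.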

Line 26: Disable Interrupts

__disable_irq();

An Interrupt Request (IRQ) is a signal from hardware (such as a UART or a timer) telling the CPU to pause its current task and handle an event. __disable_irq() does not switch off one particular device; it masks all interrupts with configurable priority at the CPU, so nothing interrupts the handoff that follows.

Why do we need to disable interrupt requests (IRQ)?

The bootloader itself has configured interrupts (UART, timers, etc.). If one of them fired during or right after the jump, its handler would run against peripheral and software state that no longer matches what it expects, and the program would likely crash. Keep in mind that the interrupts stay masked after the jump, so the application has to call __enable_irq() before it relies on interrupt-driven code such as HAL_Delay().

Line 29-44: Disable Peripheral Clocks

__HAL_RCC_GPIOA_CLK_DISABLE();
__HAL_RCC_USART1_CLK_DISABLE();
// etc...

For similar reasons, the bootloader needs to “clean up” the peripheral state before jumping to the application. The bootloader enabled clocks for UART, GPIO, USB, etc., but the application will configure its own peripherals in its own way. If we leave peripherals running when we jump, it can cause conflicts: the application may try to reconfigure a running peripheral, or its initialization might not work correctly.

Line 47-52: HAL and SysTick

HAL_DeInit();
// ...
SysTick->CTRL = 0; // SysTick Control and Status Register
SysTick->LOAD = 0; // Reload Value Register
SysTick->VAL = 0;  // Current Value Register

HAL maintains internal state, such as its tick counter and which peripherals have been initialized. The application will call HAL_Init() too and expects everything to be in a reset state.

Disabling SysTick is another critical step. SysTick is the timer that drives HAL_Delay() and HAL_GetTick() (which caused me a lot of trouble during debugging). The bootloader configured it to tick at 1ms intervals, and if left running, it could fire an interrupt during the handoff, before the application has set up its own tick.

  • We clear CTRL (which disables the timer), LOAD (the reload value), and VAL (the current value) to fully reset it.
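
For reference, the 1 ms tick comes from a reload value derived from the core clock. A small host-side sketch of that arithmetic (plain C; the function name is hypothetical, and the 180 MHz figure in the note below is just an example clock, not something this post configures):

```c
#include <stdint.h>

/* Hypothetical helper: value for SysTick->LOAD that gives one interrupt
 * per millisecond. Returns 0 if the clock is below 1 kHz. */
uint32_t systick_reload_for_1ms(uint32_t core_clock_hz)
{
    uint32_t ticks_per_ms = core_clock_hz / 1000u;

    /* The counter runs from LOAD down to 0, so N ticks need LOAD = N - 1. */
    return (ticks_per_ms == 0u) ? 0u : ticks_per_ms - 1u;
}
```

With a 180 MHz core clock the reload value would be 179999. The application's own HAL_Init() programs the timer again, so zeroing the registers in the bootloader loses nothing.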

Line 55-58: Clear All Interrupt Pending Flags

for (int i = 0; i < 8; i++)
{
    NVIC->ICPR[i] = 0xFFFFFFFF;
}

The NVIC (Nested Vectored Interrupt Controller) is the hardware block that manages all interrupts on ARM Cortex-M.

Why do we clear the pending state in the NVIC? Some interrupt might have been triggered but not serviced yet (it’s “pending”). Writing a 1 to a bit in an ICPR (Interrupt Clear-Pending Register) clears that interrupt’s pending flag, so writing 0xFFFFFFFF clears all 32 flags in a register at once. The loop covers 8 registers because that is how many the Cortex-M4 NVIC reserves (8 × 32 bits, enough for the up to 240 interrupts the core supports); the STM32F429 implements only 91 of them, and writing to the unused bits is harmless. Without this, a pending bootloader interrupt could fire in the application and run a handler that is not prepared for it.
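
Each interrupt owns one bit in these registers. The following host-side sketch shows how an interrupt number maps to a register index and a bit mask. The helper names are made up for illustration, and 37 in the note below is USART1's interrupt number on the STM32F4.

```c
#include <stdint.h>

/* Which NVIC->ICPR[] element holds the clear-pending bit for this IRQ. */
uint32_t nvic_register_index(uint32_t irq_number)
{
    return irq_number / 32u;
}

/* Bit mask of this IRQ inside that register. */
uint32_t nvic_bit_mask(uint32_t irq_number)
{
    return UINT32_C(1) << (irq_number % 32u);
}
```

For IRQ 37 this gives register 1 and mask 0x20, so clearing only USART1's pending flag would be NVIC->ICPR[1] = 0x20. Writing 0xFFFFFFFF to every register, as the loop does, clears all of them in one sweep.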

Line 61: Relocate the Vector Table

SCB->VTOR = app_address;

  • SCB stands for System Control Block, which holds the core configuration registers of the ARM Cortex-M.
  • VTOR stands for Vector Table Offset Register.

While the bootloader runs, VTOR points to 0x08000000 (the bootloader’s vector table). When an interrupt occurs, the processor reads the handler address from VTOR + exception_number × 4. We change VTOR to 0x08010000 so that interrupts use the application’s handlers. Without this, all interrupts would still jump to bootloader handlers even though we’re running application code, which would most likely end in a crash.
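
The lookup can be written out as a one-line formula. Note that the index is the exception number, which for a peripheral interrupt is its IRQ number plus 16, because the first 16 entries belong to the stack pointer and the core's own exceptions. A host-side sketch with hypothetical helper names:

```c
#include <stdint.h>

#define SYSTEM_ENTRIES 16u  /* initial SP plus the core exceptions */

/* Address of the vector table slot for a given exception number. */
uint32_t vector_address_for_exception(uint32_t vtor, uint32_t exception_number)
{
    return vtor + exception_number * 4u;
}

/* Same, but starting from a peripheral IRQ number (0 = first device interrupt). */
uint32_t vector_address_for_irq(uint32_t vtor, uint32_t irq_number)
{
    return vector_address_for_exception(vtor, irq_number + SYSTEM_ENTRIES);
}
```

With VTOR set to 0x08010000, the reset vector (exception 1) is read from 0x08010004, and IRQ 37 (USART1 on the STM32F4) is read from 0x080100D4. With VTOR left at 0x08000000, the same interrupt would be looked up at 0x080000D4, inside the bootloader.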

Line 64: Set the Stack Pointer

__set_MSP(app_stack_pointer);

The stack is where local variables, function return addresses, and interrupt contexts are stored. The application’s initial stack pointer was fixed when it was linked, and it is stored in the first entry of its vector table. At a real reset the hardware loads MSP from that entry, so the bootloader has to do the same by hand before jumping to the app’s code.

Without this, the application would start out on whatever stack position the bootloader left behind. ST’s startup code typically reloads the stack pointer in Reset_Handler anyway, but setting MSP here means the jump does not depend on that.

Line 67: Set Control Register

__set_CONTROL(0);

The control register determines which stack pointer is active (MSP vs PSP) and the privilege level (privileged vs unprivileged). Setting it to 0 ensures MSP (Main Stack Pointer) is active and thread mode is privileged. This matches the reset state the application expects.
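
To see what the value 0 means bit by bit, here is a small decoder that runs on a PC. The bit positions (bit 0 for privilege, bit 1 for stack selection) come from the ARMv7-M architecture; the macro and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define CONTROL_NPRIV (1u << 0)  /* 1 = thread mode is unprivileged */
#define CONTROL_SPSEL (1u << 1)  /* 1 = thread mode uses PSP instead of MSP */

/* True if thread mode runs privileged under this CONTROL value. */
bool thread_mode_is_privileged(uint32_t control)
{
    return (control & CONTROL_NPRIV) == 0u;
}

/* True if thread mode uses the main stack pointer under this CONTROL value. */
bool thread_mode_uses_msp(uint32_t control)
{
    return (control & CONTROL_SPSEL) == 0u;
}
```

Both functions return true for 0, which is the combination the bootloader restores before the jump.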

With jump_to_application in place, the bootloader’s main function calls it after blinking the LED three times:

int main(void)
{
  /* MCU Configuration */
  HAL_Init();
  SystemClock_Config();
  MX_GPIO_Init();
  MX_USART1_UART_Init();

  /* USER CODE BEGIN 2 */

  printf("\r\n\r\n");
  printf("========================================\r\n");
  printf("    BOOTLOADER v1.0                    \r\n");
  printf("========================================\r\n");
  printf("Running at address: 0x%08lX\r\n", (uint32_t)&main);
  printf("\r\n");

  // Blink LED a few times to show bootloader is running
  printf("Bootloader running... (LED blinks 3 times)\r\n");
  for (int i = 0; i < 3; i++)
  {
      HAL_GPIO_WritePin(GPIOG, GPIO_PIN_13, GPIO_PIN_SET);
      HAL_Delay(200);
      HAL_GPIO_WritePin(GPIOG, GPIO_PIN_13, GPIO_PIN_RESET);
      HAL_Delay(200);
  }

  printf("\r\n");
  printf("Attempting to jump to application...\r\n");
  HAL_Delay(500);  // Brief pause

  // Jump to application!
  jump_to_application(0x08010000);

  // If we reach here, jump failed
  printf("\r\n");
  printf("ERROR: Failed to jump to application!\r\n");
  printf("Staying in bootloader mode.\r\n");

  /* USER CODE END 2 */

  /* Infinite loop */
  while (1)
  {
    /* USER CODE BEGIN 3 */

    // Slow blink indicates bootloader fallback mode
    HAL_GPIO_TogglePin(GPIOG, GPIO_PIN_13);
    HAL_Delay(500);

    /* USER CODE END 3 */
  }
}

Line 70-71: Jump to Application

void (*app_reset_handler)(void) = (void (*)(void))app_entry_point;
app_reset_handler();

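Finally, we create a function pointer app_reset_handler that points to the application’s Reset_Handler. To do so, we cast app_entry_point, which is just a number like 0x08010195, to a function pointer type, and then call it. The number is odd because bit 0 of a Cortex-M code address marks Thumb code. The call does not return, since the application takes over from here.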
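
The same cast can be tried on a PC, with one twist: you cannot jump to an arbitrary number there, so this sketch first takes the address of a real function, stores it as a plain integer, and then turns it back into a function pointer. All names are made up for illustration, and converting between function pointers and integers is implementation-defined in C, although it works on mainstream compilers.

```c
#include <stdint.h>

static int calls = 0;

/* Stand-in for the application's Reset_Handler. */
static void fake_reset_handler(void)
{
    calls++;
}

/* Returns the handler's address as a plain number, like app_entry_point. */
uintptr_t fake_entry_point(void)
{
    return (uintptr_t)&fake_reset_handler;
}

/* Casts the number back to a function pointer, calls it, and reports
 * how many times the handler has run so far. */
int jump_to(uintptr_t entry_point)
{
    void (*handler)(void) = (void (*)(void))entry_point;
    handler();
    return calls;
}
```

Here jump_to(fake_entry_point()) returns 1 on the first call because the stand-in handler simply returns. On the microcontroller the real Reset_Handler never comes back.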