Introduction to Arm Cortex-M Microcontrollers (STM32F4 Discovery Board)

This introduction to ARM Cortex-M microcontrollers leans toward the STM32F4 family at some points, because of its great success and wide popularity in the embedded industry. That popularity is the main reason I am designing this course. We will not discuss ARM in much depth, as I have already covered that in the ARM7 classic course with the LPC2148. For a quick ARM reference, click here; in this article we only highlight the Cortex-M microcontroller family. So let's start with an overview of Cortex microcontrollers and their development environment.

The development board we are going to use is the STM32F429 Discovery board (picture provided at the end of the article), which has a built-in DSP engine and many more features that we will discuss later. But the first thing is to understand the development ecosystem of ARM Cortex microcontrollers.

The STM32F4xx series of MCUs is based on the ARM Cortex-M4 core. To work with this kind of core, you normally use a specific library called CMSIS (Cortex Microcontroller Software Interface Standard). In simple words, it is a core-specific library, or software standard, for developing applications. Besides CMSIS, silicon vendors (in our case STMicroelectronics) also provide standard peripheral drivers that let you use the full feature set of the device. To learn more about CMSIS and the ARM Cortex development environment, see the following links:

Overview of CMSIS

ARM Processors Architecture Overview

Understanding Development Environment of STM32F4 Discovery Board
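
To give a feel for what such a software standard looks like in code, here is a minimal sketch of the register-struct style that CMSIS device headers and vendor drivers are built around. The names DemoGpio, demo_pin_as_output and demo_pin_toggle are hypothetical and only used for illustration; they are not part of CMSIS or of ST's drivers, and the struct is a simplified stand-in with just two registers. It runs on a PC because the "peripheral" is an ordinary variable instead of a fixed hardware address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified peripheral description. Real device headers
   describe each peripheral as a struct of volatile registers like this,
   but with the full register set at the correct offsets. */
typedef struct {
    volatile uint32_t MODER; /* mode register: 2 bits per pin, 01 = output */
    volatile uint32_t ODR;   /* output data register: 1 bit per pin */
} DemoGpio;

/* Configure one pin (0..15) as a general-purpose output using a
   read-modify-write, so the other pins keep their configuration. */
void demo_pin_as_output(DemoGpio *port, unsigned pin)
{
    assert(port != 0 && pin < 16u);
    uint32_t mode = port->MODER;
    mode &= ~(3u << (pin * 2u)); /* clear the 2-bit field of this pin */
    mode |= (1u << (pin * 2u));  /* write 01: output mode */
    port->MODER = mode;
}

/* Invert the output level of one pin (0..15). */
void demo_pin_toggle(DemoGpio *port, unsigned pin)
{
    assert(port != 0 && pin < 16u);
    port->ODR ^= (1u << pin);
}
```

On real hardware the vendor header provides the struct type and a pointer to the peripheral's base address, so application code only has to write to named registers as shown here.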

Why ARM Cortex-M Microcontrollers?


Differences between Cortex-M0, M0+, M1, M3, M4 and M7


The ARM Cortex™-M0 processor is the smallest ARM processor available. The exceptionally small silicon area, low power and minimal code footprint of the processor enable developers to achieve 32-bit performance at an 8-bit price point.

  • Has a 3-stage pipeline, the Thumb instruction set with a small subset of Thumb-2, and an optional single-cycle (32×32) hardware multiplier.
  • Supports 1 NMI and up to 32 physical interrupts.
  • Has only 56 instructions and a ‘C-friendly’ architecture.


The ARM Cortex™-M0+ processor is an adaptation of the Cortex-M0 with improved performance and a reduced energy footprint.

  • Has a 2-stage pipeline, the Thumb instruction set with a small subset of Thumb-2, and an optional single-cycle (32×32) hardware multiplier.


The ARM Cortex™-M1 processor is the first ARM processor designed specifically for implementation in FPGAs.

  • Has a 3-stage pipeline, the Thumb instruction set with a small subset of Thumb-2, and big- or little-endian configuration.


The ARM Cortex™-M3 processor is the industry-leading 32-bit processor for highly deterministic real-time applications, specifically developed to enable partners to develop high-performance low-cost platforms for a broad range of devices including microcontrollers, automotive body systems, industrial control systems and wireless networking and sensors.

  • The Cortex-M3 NVIC is highly configurable at design time to deliver up to 240 system interrupts with individual priorities, dynamic re-prioritization and integrated system clock.
  • Supports 1 NMI and up to 240 physical interrupts with 8 to 256 priority levels.
  • Supports hardware divide, single-cycle multiply and saturating arithmetic.
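
The "8 to 256 priority levels" figure follows from how many priority bits the chip designer implements: each interrupt has an 8-bit priority field, between 3 and 8 of those bits exist in hardware, and the implemented bits sit at the top of the field. The sketch below shows that arithmetic. The function names priority_levels and encode_priority are made up for this illustration and are not CMSIS functions.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct priority levels for a given number of implemented
   priority bits (3..8 on Cortex-M3): 2^bits. */
unsigned priority_levels(unsigned implemented_bits)
{
    assert(implemented_bits >= 3u && implemented_bits <= 8u);
    return 1u << implemented_bits;
}

/* Raw 8-bit value as it would appear in the priority field: the logical
   level is shifted into the most significant bits, and the unimplemented
   low bits stay zero. */
uint8_t encode_priority(unsigned level, unsigned implemented_bits)
{
    assert(level < priority_levels(implemented_bits));
    return (uint8_t)(level << (8u - implemented_bits));
}
```

For example, with 4 implemented bits (which is what the STM32F4 parts use, as far as I know) there are 16 levels, and logical level 5 is stored as 0x50. A lower number means a higher urgency.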


The ARM Cortex™-M4 processor is specifically developed to address digital signal control markets that demand an efficient, easy-to-use blend of control and signal processing capabilities.

  • Has a 3-stage pipeline with branch speculation and the Thumb-2 instruction set.
  • Supports hardware divide and 8/16-bit SIMD arithmetic.
  • Supports an optional single-precision floating-point unit and a DSP engine.
  • Supports an optional memory protection unit and deterministic operation.
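
To make "16-bit SIMD" and "saturated math" concrete, here is a portable C model of a dual 16-bit saturating add: two signed 16-bit values are packed into each 32-bit word, the two lanes are added independently, and each result is clamped instead of wrapping around. On a Cortex-M4 the DSP extension can do this kind of operation in a single instruction (QADD16, as far as I recall); the function name dual_sat_add16 and its helpers below are my own and only model the behaviour in plain C.

```c
#include <stdint.h>

/* Interpret a 16-bit lane (0..0xFFFF) as a signed two's complement value. */
static int32_t lane_to_signed(uint32_t lane)
{
    return (lane & 0x8000u) ? (int32_t)lane - 0x10000 : (int32_t)lane;
}

/* Clamp a value to the signed 16-bit range instead of letting it wrap. */
static int32_t saturate16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return v;
}

/* Add two pairs of signed 16-bit values packed in 32-bit words,
   saturating the high and low lanes independently. */
uint32_t dual_sat_add16(uint32_t a, uint32_t b)
{
    int32_t lo = saturate16(lane_to_signed(a & 0xFFFFu) +
                            lane_to_signed(b & 0xFFFFu));
    int32_t hi = saturate16(lane_to_signed(a >> 16) +
                            lane_to_signed(b >> 16));
    return (((uint32_t)hi & 0xFFFFu) << 16) | ((uint32_t)lo & 0xFFFFu);
}
```

Saturation matters in signal processing: when an audio sample or a control value overflows, clamping to the maximum is a small distortion, while wrapping around to a large negative number is a glitch.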


The ARM Cortex™-M7 processor is specifically developed to target IoT and high-end applications such as motor control, industrial automation, and advanced audio and image processing.

  • Supports 6-stage superscalar pipeline with branch prediction.
  • Performance of up to 2.14 DMIPS/MHz (Dhrystone 2.1), with hardware divide (2-12 cycles).
  • ST's Cortex-M7 based devices add the Adaptive real-time accelerator (ART Accelerator™, supporting 64-bit transfers) alongside the L1 caches for data and instructions, allowing 0-wait-state execution from embedded Flash memory and external memories, together with powerful peripherals.
  • Supports Memory protection unit and DSP engine.


Why Start With the STM32F4 Discovery Board


  • The STM32F family is one of the most successful families in embedded system applications nowadays.
  • The STM32F4 is a balance between the newer Cortex cores (M7 and above) and older entry-level cores such as M0 and M0+. It is neither too old a technology nor the very latest.
  • The main reason for using the STM32F family is its large community support and the wide range of tools provided by ST.

Some Features of the STM32F4 Family

  • Up to 2 Mbytes of Flash memory
  • Performance of up to 1.25 DMIPS/MHz (Dhrystone 2.1), for example 210 DMIPS at 168 MHz
  • Some devices include a built-in DSP engine with the Adaptive real-time accelerator (ART Accelerator™)
  • Power supervision: POR, PDR, PVD and BOR
  • Low-power modes: Sleep, Stop and Standby
  • Large variety of communication peripheral interfaces like I2C, SPI, Ethernet, USB OTG, IrDA, LIN, CAN, UART, etc.
  • Built-in ADC, DAC, DMA, PWM, etc.
  • Built-in CRC calculation unit and random number generator, which are very important features in some critical applications.
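
To show what the CRC calculation unit does, here is a software model that can run on a PC. It assumes the unit's default configuration as I understand it from the STM32F4 reference manual: the CRC-32 (Ethernet) polynomial 0x04C11DB7, an initial value of 0xFFFFFFFF, and input processed one 32-bit word at a time, most significant bit first, with no final inversion. The function name crc32_words is my own; please check the reference manual before relying on a bit-exact match with the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY 0x04C11DB7u /* CRC-32 (Ethernet) polynomial */
#define CRC32_INIT 0xFFFFFFFFu /* assumed reset value of the data register */

/* Bitwise CRC over an array of 32-bit words, MSB first, no bit
   reflection and no final XOR. The hardware unit does the same work
   without occupying the CPU for the inner loop. */
uint32_t crc32_words(const uint32_t *words, size_t count)
{
    assert(words != NULL || count == 0);
    uint32_t crc = CRC32_INIT;
    for (size_t i = 0; i < count; ++i) {
        crc ^= words[i];
        for (int bit = 0; bit < 32; ++bit) {
            /* Shift out the top bit; apply the polynomial if it was set. */
            crc = (crc & 0x80000000u) ? (crc << 1) ^ CRC32_POLY
                                      : (crc << 1);
        }
    }
    return crc;
}
```

A typical use is integrity checking: compute the CRC over a firmware image or a received data block and compare it with a stored value, so that corruption is detected before the data is used.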