Skip to main content

2. Tracing OS events


In these tutorials, we will make intensive use a commercial tool to illustrate FreeRTOS mechanisms. This tool is Tracealyzer® from Percepio®.

image_000.png

 

You may take some time to visit Percepio® website:

If you cannot get access to a Tracealyzer license (sorry for you), you may skip this tutorial and still profit from the several screenshots that you will find along these tutorials.

Otherwise, and if you want to be able to record and analyze what's happening in your FreeRTOS projects, you are highly encouraged to setup your project in order to get those OS traces. That's really helpful, and highly instructive.

The only thing you need to know at the moment is that tracing OS events requires two pieces of software:

  • The recorder is running on your target MCU while OS is executing. It captures and records events at runtime in a memory buffer . It is provided as a set of source files (just like FreeRTOS) you need to include and configure in your MCU project. You can see it as a spying program working from the inside of the firmware.

  • The Tracealyzer is a computer application which provides visual tools to help analyze OS behavior based on stored events. It uses the information gathered by the recorder and provide views into a user-friendly graphical interface.

 

There are two methods for transferring events stored in the MCU memory by the recorder to the Tracealyzer computer application:

  • Using a memory dump from the debugger while program execution is suspended (snapshot mode)

  • Using a real-time streaming while program executes (streaming mode)

This tutorial only covers the snapshot mode, which I find easier to setup for a start.

At time of writing, Tracealyzer version is 4.8.1. Again, I strongly encourage you to use the same version as me until you get enough understanding of the recorder integration. You can get Tracealyzer installers from the Percepio® website: https://percepio.com/tracealyzer/update. Pick up the 4.8.1 version from September 26th 2023 to be in sync with this tutorial and proceed with the installation on your computer.

 

1. Integrating the Recorder

1.1. Getting the Recorder source files into your project

You may get recorder source files from two locations:

  • The FreeRTOS archive comes with various addons including the Percepio recorder source files. These are located in FreeRTOS-Plus\Source\FreeRTOS-Plus-Trace\ folder.

  • After you have downloaded and installed Percepio Tracealyzer on your computer, you'll find the recorder source files in the installation directory (e.g. C:\Program Files\Percepio\Tracealyzer 4\FreeRTOS\TraceRecorder\).

You can open the file \config\trcConfig.h in both distributions to see what you have. Be aware that if the recorder is prior to the FreeRTOS version, it might not work properly. Actually, the FreeRTOS release (v202212.01) comes with a FreeRTOS kernel in version  v10.5.1 and packages the Percepio recorder in version v4.6.0 that has apparently no support update since FreeRTOS v10.4.1.

On the other hand, the recorder that comes with Tracealyzer is in version 4.8.1 and displays support for FreeRTOS up to version 10.5.1. (even through the configuration for v10.5.1 refers to v10.4.0). Anyway, we will take that one!

FreeRTOS archivePercepio installation folder (e.g. Program Files)

FreeRTOS-Plus-Trace\config\trcConfig.h

/*
 * Trace Recorder for Tracealyzer v4.6.0
 * Copyright 2021 Percepio AB
 * www.percepio.com
 */

FreeRTOS\TraceRecorder\config\trcConfig.h

/*
 * Trace Recorder for Tracealyzer v4.8.1
 * Copyright 2023 Percepio AB
 * www.percepio.com
 */

After you've located the recorder source files on your computer, create a new source folder 'TraceRecorder' in your project structure and Copy/Paste source files.

Again, some cleanup is necessary:

  • Delete the /extras folder

  • In the /streamports folder, delete all folders except the /Jlink_RTT folder

  • Remove all remaining files that are neither source (.c) nor header (.h)

By the end of the cleanup, your project structure should look like this. That's indeed a lot of new files, but don't worry, we will just deal with a few configuration headers.

 

Next, add the TraceRecorder\include\ path to the build configuration and apply changes:

 

1.2. Configuring the Recorder

Configuring the recorder requires some attention, especially because our target has very a limited amount of RAM (16kB). Most of this configuration takes place in headers located in the /config folder of the recorder source tree:

  • trcConfig.h

  • trcKernelPortConfig.h

  • trcSnapshotConfig.h

  • trcKernelPortSnapshotConfig.h

Let us visit each of these header and see what configuration we have to do.

  1. Open trcConfig.h:

    Comment out the error message and include our target header:

    /******************************************************************************
     * Include of processor header file
     *
     * Here you may need to include the header file for your processor. This is
     * required at least for the ARM Cortex-M port, that uses the ARM CMSIS API.
     * Try that in case of build problems. Otherwise, remove the #error line below.
     *****************************************************************************/
    // #error "Trace Recorder: Please include your processor's header file here and remove this line."
    #include "stm32f0xx.h"

    Specify the FreeRTOS hardware port we are using:

    /**
     * @def TRC_CFG_HARDWARE_PORT
     * @brief Specify what hardware port to use (i.e., the "timestamping driver").
     *
     * All ARM Cortex-M MCUs are supported by "TRC_HARDWARE_PORT_ARM_Cortex_M".
     * This port uses the DWT cycle counter for Cortex-M3/M4/M7 devices, which is
     * available on most such devices. In case your device don't have DWT support,
     * you will get an error message opening the trace. In that case, you may
     * force the recorder to use SysTick timestamping instead, using this define:
     *
     * #define TRC_CFG_ARM_CM_USE_SYSTICK
     *
     * For ARM Cortex-M0/M0+ devices, SysTick mode is used automatically.
     *
     * See trcHardwarePort.h for available ports and information on how to
     * define your own port, if not already present.
     */
    #define TRC_CFG_HARDWARE_PORT TRC_HARDWARE_PORT_ARM_Cortex_M

    Then apply following tuning:

    #define TRC_CFG_SCHEDULING_ONLY             0
    #define TRC_CFG_INCLUDE_MEMMANG_EVENTS      0
    #define TRC_CFG_INCLUDE_USER_EVENTS         0
    #define TRC_CFG_INCLUDE_ISR_TRACING         0
    #define TRC_CFG_INCLUDE_READY_EVENTS        1
    #define TRC_CFG_INCLUDE_OSTICK_EVENTS       0
    
    #define TRC_CFG_ENABLE_STACK_MONITOR        0
    #define TRC_CFG_STACK_MONITOR_MAX_TASKS     10
    #define TRC_CFG_STACK_MONITOR_MAX_REPORTS   1
    
    #define TRC_CFG_CTRL_TASK_PRIORITY          1
    #define TRC_CFG_CTRL_TASK_DELAY             10
    #define TRC_CFG_CTRL_TASK_STACK_SIZE        1024
    
    #define TRC_CFG_RECORDER_BUFFER_ALLOCATION  TRC_RECORDER_BUFFER_ALLOCATION_STATIC
    #define TRC_CFG_MAX_ISR_NESTING             8
    
    #define TRC_CFG_ISR_TAILCHAINING_THRESHOLD  0
    #define TRC_CFG_RECORDER_DATA_INIT          1
    
    #define TRC_CFG_RECORDER_DATA_ATTRIBUTE 
    
    #define TRC_CFG_USE_TRACE_ASSERT            0

     

  2. Open trcKernelPortConfig.h:

    The recorder can operate in Snapshot mode or in Streaming mode. The Streaming mode allows for recording long traces but it requires a dedicated tutorial. Here, we deal with the Snapshot mode:

    /**
     * @def TRC_CFG_RECORDER_MODE
     * @brief Specify what recording mode to use. Snapshot means that the data is saved in
     * an internal RAM buffer, for later upload. Streaming means that the data is
     * transferred continuously to the host PC.
     *
     * For more information, see http://percepio.com/2016/10/05/rtos-tracing/
     * and the Tracealyzer User Manual.
     *
     * Values:
     * TRC_RECORDER_MODE_SNAPSHOT
     * TRC_RECORDER_MODE_STREAMING
     */
    #define TRC_CFG_RECORDER_MODE TRC_RECORDER_MODE_SNAPSHOT

    Then we need to specify our FreeRTOS version:

    /**
     * @def TRC_CFG_FREERTOS_VERSION
     * @brief Specify what version of FreeRTOS that is used (don't change unless using the
     * trace recorder library with an older version of FreeRTOS).
     *
     * TRC_FREERTOS_VERSION_7_3_X				If using FreeRTOS v7.3.X
     * TRC_FREERTOS_VERSION_7_4_X				If using FreeRTOS v7.4.X 
     * TRC_FREERTOS_VERSION_7_5_X				If using FreeRTOS v7.5.X
     * TRC_FREERTOS_VERSION_7_6_X				If using FreeRTOS v7.6.X
     * TRC_FREERTOS_VERSION_8_X_X				If using FreeRTOS v8.X.X
     * TRC_FREERTOS_VERSION_9_0_0				If using FreeRTOS v9.0.0
     * TRC_FREERTOS_VERSION_9_0_1				If using FreeRTOS v9.0.1
     * TRC_FREERTOS_VERSION_9_0_2				If using FreeRTOS v9.0.2
     * TRC_FREERTOS_VERSION_10_0_0				If using FreeRTOS v10.0.0
     * TRC_FREERTOS_VERSION_10_0_1				If using FreeRTOS v10.0.1
     * TRC_FREERTOS_VERSION_10_1_0				If using FreeRTOS v10.1.0
     * TRC_FREERTOS_VERSION_10_1_1				If using FreeRTOS v10.1.1
     * TRC_FREERTOS_VERSION_10_2_0				If using FreeRTOS v10.2.0
     * TRC_FREERTOS_VERSION_10_2_1				If using FreeRTOS v10.2.1
     * TRC_FREERTOS_VERSION_10_3_0				If using FreeRTOS v10.3.0
     * TRC_FREERTOS_VERSION_10_3_1				If using FreeRTOS v10.3.1
     * TRC_FREERTOS_VERSION_10_4_0				If using FreeRTOS v10.4.0
     * TRC_FREERTOS_VERSION_10_4_1				If using FreeRTOS v10.4.1
     * TRC_FREERTOS_VERSION_10_4_2				If using FreeRTOS v10.4.2
     * TRC_FREERTOS_VERSION_10_4_3				If using FreeRTOS v10.4.3
     * TRC_FREERTOS_VERSION_10_5_0				If using FreeRTOS v10.5.0
     * TRC_FREERTOS_VERSION_10_5_1				If using FreeRTOS v10.5.1 or later
     */
    #define TRC_CFG_FREERTOS_VERSION TRC_FREERTOS_VERSION_10_5_1 // <-- Change this if different

    Then turn off all other monitoring options:

    #define TRC_CFG_INCLUDE_EVENT_GROUP_EVENTS     0
    #define TRC_CFG_INCLUDE_TIMER_EVENTS           0
    #define TRC_CFG_INCLUDE_PEND_FUNC_CALL_EVENTS  0
    #define TRC_CFG_INCLUDE_STREAM_BUFFER_EVENTS   0
    #define TRC_CFG_ACKNOWLEDGE_QUEUE_SET_SEND     0

     

  3. Open trcSnapshotConfig.h:

    In Snapshot mode, kernel events are recorded in RAM (as an array in a structured variable, to be precise). Because of the limited availability of RAM in our target (16kB), the number of recorded events is also limited. When we start the program, the OS life is tracked until the recorder buffer is full. If 'STOP_WHEN_FULL' option is selected, the recorder then contains the n firsts events that came after the application started. Otherwise, if 'RING_BUFFER' is selected, the recorder keeps recording, replacing older events by new ones. When the execution is suspended (manually, in the debug session), the recorder therefore contains the n lasts events being observed. Note that under intense activity, the recorder may only contains few seconds (if not milliseconds) of the OS life.

    Here, let's try to catch the beginning of our OS life:

    /**
     * @def TRC_CFG_SNAPSHOT_MODE
     * @brief Macro which should be defined as one of:
     * - TRC_SNAPSHOT_MODE_RING_BUFFER
     * - TRC_SNAPSHOT_MODE_STOP_WHEN_FULL
     * Default is TRC_SNAPSHOT_MODE_RING_BUFFER.
     *
     * With TRC_CFG_SNAPSHOT_MODE set to TRC_SNAPSHOT_MODE_RING_BUFFER, the
     * events are stored in a ring buffer, i.e., where the oldest events are
     * overwritten when the buffer becomes full. This allows you to get the last
     * events leading up to an interesting state, e.g., an error, without having
     * to store the whole run since startup.
     *
     * When TRC_CFG_SNAPSHOT_MODE is TRC_SNAPSHOT_MODE_STOP_WHEN_FULL, the
     * recording is stopped when the buffer becomes full. This is useful for
     * recording events following a specific state, e.g., the startup sequence.
     */
    #define TRC_CFG_SNAPSHOT_MODE TRC_SNAPSHOT_MODE_STOP_WHEN_FULL

    Then set the size of the recorder buffer:

    /**
     * @def TRC_CFG_EVENT_BUFFER_SIZE
     * @brief Macro which should be defined as an integer value.
     *
     * This defines the capacity of the event buffer, i.e., the number of records
     * it may store. Most events use one record (4 byte), although some events
     * require multiple 4-byte records. You should adjust this to the amount of RAM
     * available in the target system.
     *
     * Default value is 1000, which means that 4000 bytes is allocated for the
     * event buffer.
     */
    #define TRC_CFG_EVENT_BUFFER_SIZE 1000

    And leave other options at their default values, except the size of the symbol table that can be lowered:

    /**
     * @def TRC_CFG_SYMBOL_TABLE_SIZE
     * @brief Macro which should be defined as an integer value.
     *
     * This defines the capacity of the symbol table, in bytes. This symbol table
     * stores User Events labels and names of deleted tasks, queues, or other kernel
     * objects. If you don't use User Events or delete any kernel
     * objects you set this to a very low value. The minimum recommended value is 4.
     * A size of zero (0) is not allowed since a zero-sized array may result in a
     * 32-bit pointer, i.e., using 4 bytes rather than 0.
     *
     * Default value is 800.
     */
    #define TRC_CFG_SYMBOL_TABLE_SIZE 200

     

  4. Open trcKernelPortSnapshotConfig.h:

    Adjust (lower) the maximum number of kernel objects that you want to monitor at a time

    #define TRC_CFG_NTASK           6
    #define TRC_CFG_NISR            4
    #define TRC_CFG_NQUEUE          2
    #define TRC_CFG_NSEMAPHORE      6
    #define TRC_CFG_NMUTEX          2
    #define TRC_CFG_NTIMER          1
    #define TRC_CFG_NEVENTGROUP     2
    #define TRC_CFG_NSTREAMBUFFER   2
    #define TRC_CFG_NMESSAGEBUFFER  2

    And finally keep string length at their nominal values:

    #define TRC_CFG_NAME_LEN_TASK           15
    #define TRC_CFG_NAME_LEN_ISR            15
    #define TRC_CFG_NAME_LEN_QUEUE          15
    #define TRC_CFG_NAME_LEN_SEMAPHORE      15
    #define TRC_CFG_NAME_LEN_MUTEX          15
    #define TRC_CFG_NAME_LEN_TIMER          15
    #define TRC_CFG_NAME_LEN_EVENTGROUP     15
    #define TRC_CFG_NAME_LEN_STREAMBUFFER   15
    #define TRC_CFG_NAME_LEN_MESSAGEBUFFER  15

 

1.3. Enable the Recorder

The final step is to enable the tracing feature within FreeRTOS.

  • Turn on the configUSE_TRACE_FACILITY definition in FreeRTOSConfig.h :

/* Run time and task stats gathering related definitions. */
#define configGENERATE_RUN_TIME_STATS           	0
#define configUSE_TRACE_FACILITY                	1		// <-- Turn tracing ON
#define configUSE_STATS_FORMATTING_FUNCTIONS    	0
  • Then add the following at the very end of FreeRTOSConfig.h:

...

/* Integrates the Tracealyzer recorder with FreeRTOS */
#if ( configUSE_TRACE_FACILITY == 1 )
#include "trcRecorder.h"
#endif

#endif /* FREERTOS_CONFIG_H */
  • And finally start the recorder in the main() function:

// Main function
int main()
{
	// Configure System Clock
	SystemClock_Config();

	// Initialize LED & Button pin
	BSP_LED_Init();
	BSP_PB_Init();

	// Initialize Debug Console
	BSP_Console_Init();
	my_printf("Console ready!\r\n");

	// Start Trace Recording
	xTraceEnable(TRC_START);		// <- Recorder starts now

	// Create Tasks
	xTaskCreate(vTask1, "Task_1", 256, NULL, 1, NULL);
	xTaskCreate(vTask2, "Task_2", 256, NULL, 2, NULL);

	// Start the Scheduler
	vTaskStartScheduler();


	while(1)
	{
		// The program should never be here...
	}
}

 

Note that the recorder must be started before any OS event occurs, therefore before any object or task creation and obviously before the scheduler starts.

Save all and make sure that the project builds without any error or warnings. Repair if needed by reviewing the above settings.

You should take care of the reported level of bss memory since it includes the amount of memory that we reserved for both the FreeRTOS heap (7kB) and for the recorder (4kB). Remember that STM32F072 only features 16kB of RAM. Here, it's almost full...

 

Fortunately, the linker won't build if you exceed the amount of available RAM. You can try and double the recorder capacity for instance:

#define TRC_CFG_EVENT_BUFFER_SIZE 2000

and you'll see:

nucleo_64_F072_FreeRTOS.elf section `.bss' will not fit in region `RAM'
region `RAM' overflowed by 1976 bytes
collect2.exe: error: ld returned 1 exit status
make: *** [makefile:69: nucleo_64_F072_FreeRTOS.elf] Error 1
"make all" terminated with exit code 2. Build might be incomplete.

Then go back to 1000...

 

2. Recording OS events

Before experimenting the tracing, let us speed things up in vTask1() and vTask2() in order to get more recorded events in a short time window. Set the LED toggle period to 30ms, and the message display every 100ms:

/*
 *	Task1 toggles LED every 30ms
 */
void vTask1 (void *pvParameters)
{
	while(1)
	{
		BSP_LED_Toggle();
		vTaskDelay(30);		// <-- Change here
	}
}

/*
 *	Task2 sends a message to console every 100ms
 */
void vTask2 (void *pvParameters)
{
	uint16_t count;
	count = 0;

	while(1)
	{
		my_printf("Hello %2d from task2\r\n", count);
		count++;
		vTaskDelay(100);	// <-- Change here
	}
}

 

Save all , make sure that the project still builds without any error or warnings.

gitlab- commit Commit name "Trace recorder"
- push Push onto Gitlab

 

Launch a debug session . You should get the debugger waiting at the beginning of main():

 

Using the Expression view, add the following expressions (global variables):

  • RecorderData

  • RecorderDataPtr

  • sizeof(RecorderData)

  • &RecorderData

Then select &RecorderData, and copy into the clipboard its address in memory:

 

Also note that the size of the RecorderData variable is 5060 bytes. We'll need this information in a moment.

Step over the code until you pass the xTraceEnable(TRC_START) function and take a look (unfold) the RecorderData structure in the Expression view. You'll see that a lot of initializations have been performed. Notice that numEvents member is set to 1 and that RecorderDataPtr is a pointer that now points to the DataRecorder structured variable.

 

Then step over again the two xTaskCreate() functions while watching the numEvents member of DataRecorder. You'll see that each task creation corresponds to 2 events in the recorder. This demonstrates that the tracking mechanism is up and running.

Now, just run the program , wait a little while (a couple of seconds). Watch the LED and the console to make sure that the program works as expected (very fast blinking/printing), and then click the suspend button. How high is the numEvent variable ? If it reached 1000, that means that recorder is full, and that it doesn't record anything anymore. In that case, the recorderActive flag should be 0:

 

At this step, we have a very valuable information stored in the recorder memory, but it's not human-readable in its current form. What we need to do now is to transfer the MCU memory segment that corresponds to the DataRecorder variable into a file on your host computer. That action is called a memory dump.

  • Open the Memory view

  • Click Add Memory Monitor

  • Paste the previously copied address (alternatively, you can also enter &DataRecorder, or RecorderDataPtr) and click OK.


     
  • Click the Export button in the Memory view:


     

  • Then carefully fill up the popup window:

    • Select RAW Binary as export Format

    • Make sure that Start  address is the one of the RecorderData structure you got above

    • Set the Length to anything bigger than the recorder length (you remember? It is 5060 bytes, so let say 6000)

    • Give a name such as ram.bin to the destination file

 

Click OK. You should get a file ram.bin created on your computer, which is a copy of the actual STM32 memory content. 

If you have an Hex editor at hand on your computer (by the way, that's a very useful utility. Here, I'm using HxD, which is available here https://mh-nexus.de/en/hxd/), you can open and read the ram.bin file and see that it contents the very same bytes than the ones you see in the memory view of the IDE.

 

And that's when Tracealyzer now enters the game.

 

3. Analyzing the trace

From Windows launch menu, start Tracealyzer.

 

Within Tracealyzer, select File →Open→Open File and browse to ram.bin. You can also just drag and drop the ram.bin file into the Tracealyzer window. The trace opens...

 

On the left, vertical lanes represent the system tasks execution along a vertical time line. Task priority is increasing from left to right. You can scroll and zoom into this view using the mouse wheel and CTRL key and use right filters to display various information.

 

You can measure time by click-dragging the mouse in the main view. For instance let us verify that Task_1 is executed every 30ms as programmed:

 

Selecting the task IDLE (left lane), you get an idea of the CPU load. In the figure below, the IDLE task is running 97.9% of time. In other words, the CPU is only busy with our Task_1 and Task_2 for 2.1% of the time... Not too busy after all...

 

Using the View menu, you get access to more information. Open for instance View→Trace Overview. This view provides a summary of the recorder trace. You can see that 1000 events have been recorded in 3.8s. If you need longer traces, you need a larger recorder buffer, which is difficult with a limited amount of RAM. In the future you will learn how to run the recorder in the steaming mode that allows for virtually unlimited trace length recording.

 

Now try and open Views → Actor Instance Graphs. Then in the drop-down menu, select Wait time. You get a view of waiting time for each task. The waiting time corresponds to the time a task waits, from the moment it is ready to execute, to the moment it actually runs. In the graph below, one can see that Task_2 (yellow) never waits. That's fortunate, because Task_2 is of highest priority. Task_1 (green) wait time is also zero most of the time, but on a regular period, it had to wait about 2ms before being executed.

 

Double-clicking the point of interest in the above view align all the Tracealyzer opened views so that you can get a better understanding of a local situation. In this case, we understand that, at regular time interval, both Task_1 and Task_2 are simultaneously ready. Task_2 has higher priority, therefore Task_1 waits for the CPU to be available (i.e. until Task_2 terminates).

 

4. Using Percepio® Eclipse plugin

The process of memory dumping and loading of Tracealyzer is not difficult, but it can be made even easier using a Percepio plugin within STM32CubeIDE.

Open Help → Eclipse Marketplace... and search for percepio plugins. Then install the Trace Exporter for STM32CubeIDE.

 

When done, you'll need to restart the IDE. Note that a new Percepio entry has been added in the top menu:

 

Go to Percepio → Preferences and review the settings, in particular the Tracealyzer Path that must indicate the full path to the Tracealyzer executable. I also recommend filling the Trace Filename field with the file of your choice (e.g., "ram.bin") so that every new trace just overwrites the previous one and then avoids filling your hard drive with tons of trace files.

You may notice that the plugin does nothing more than automating the memory dump process that you've done manually just before. As you did, it uses RecoredDataPtr (the pointer) to locate the RecorderData structure in memory and then uses the size sizeof(*RecoredDataPtr) to adjust the dump length. No magic here! Still, very useful.

 

When done, the process for loading a trace into Tracealyzer is as simple as:

  • Start a debug session

  • Run the program for a little while to let the recorder capture some events

  • Suspend execution

  • Go to Percepio → Save Snapshot Trace. It opens Tracealyzer and loads the last recorded events automatically. If already opened, you'll get a dialog box asking for reloading.