4.1. Digital filtering


1. Introduction

The purpose of this tutorial is to drive you though the design of a digital filter that takes an analog input signal applied to the ADC, then processes incoming samples, and finally provides an analog output using a DAC.

 

image011.png

In the first part, the tutorial provides some basic background on digital filters. You may need some prerequisite knowledge, mostly on Laplace transform and z-transform, to get a full understanding of exposed matter, but you can still grab some useful tricks even if you are new to the field of digital filtering.

 

2. First order digital filter

 

2.1. The z-domain transfer function

The common representation of first-order Low-Pass filter transfer function in the Laplace domain is:

image013.png

The cutoff frequency of such filter is given by:

image015.png

Let us suppose that we want a cutoff frequency image017.png. We therefore have:

image019.png
image021.png

The Laplace variable p (or s if you live anywhere in the world but in France :-)) is useful to represent transfer functions in the continuous-time domain. When dealing with digital systems, we usually sample incoming signal at regular time intervals. The Laplace transform is no more representative of such a discrete-time system.

The Laplace-transform equivalent for sampled signals is called the z-transform.

A common approach to move a Laplace transfer function into the z-domain is to use something called the “bilinear transform”. The bilinear transform is obtained by mapping z-plane to p-plane using:

image023.png

where T is the sampling period (in seconds). In the tutorial, we will use a 10ms sampling period (i.e. we will get a new input from ADC every 10ms).

The inverse mapping gives:

image025.png

Using first term of ln series expansion, we have:

image027.png

Applying the bilinear transform to our first order Low-Pass filter then gives:

image013.png
 
image029.png
 
image031.png
 
image033.png
 
image035.png
 
image037.png
 
image039.png

 

The z-domain transfer function of our digital Low-Pass filter, sampled at 100Hz (T=10ms), and having a cutoff frequency of fc=2Hz is finally:

image041.png

 

2.2. On the use of Scilab® as a z-domain transfer function calculator

Luckily for you, scientific computing software such as Matlab® or Scilab® are able to process linear systems transfer function from continuous-time to discrete-time (and reversely) quite easily.

The following Scilab® script does the same bilinear transform as seen above:

clear;

s=poly(0,'s');

sl=syslin('c',(1)/(1+0.08*s));   // Continuous-time system in transfer form
slss=tf2ss(sl);                  // Now in state-space form

figure(1);                       // Plot Bode diagram
clf(1);
bode(sl, 0.1, 10000);

t=0:0.01:0.6;                    // Time scale
y = csim('step',t,slss);         // Simulate step response

figure(2);                       // Plot step response
clf(2);
plot(t,y,'r.-');

sld=cls2dls(slss,0.01);          // Continuous to discrete transform
sldt=ss2tf(sld);                 // Converts in transfer form
disp(sldt);                      // Display z-transform

Executing the above script provides the following outputs. The obtained z transfer function confirms the one we calculated earlier.

  • Bode diagram
image044.png
 
  • Step response
image046.png

 

  • z-domain transfer function
image048.png
 
 

2.3. From z-domain transfer function to recursion equation

The z-transform is a discrete-time representation of the filter that relates output samples to input samples:

image050.png

Where X(z) represents the input samples and Y(z) represents the output samples.

Let us turn this into something more practical from a programming perspective:

image056.png
 
image058.png

Above equation is processed each time a new sample is available. While X(z) represents the current incoming signal sample, X(z)z represents the sample that will come (in the future) the next sampling period. More generally, multiplying X(z) or Y(z) by zn is the same as putting an index n on the sample being processed.
    n = 0   → current sample
    n = 1   → next sample
    n = -1 → previous sample

Because no filter can predict what the future will be, we can divide all the parts of the above equation by z to have:

image062.png
 
image064.png

The z-1 factor represents a delay of one sample. The above equation can therefore be written:

image068.png
 
image070.png

where:

  •     yn    is the output sample computed at the nth iteration
  •     yn-1 is the output sample computed at the (n-1)th iteration
  •     xn    is the input sample read at the nth iteration
  •     xn-1 is the input sample read at the (n-1)th iteration

The above equation is the called the recursion equation, and can be directly applied to the filter computation within a program.

For instance, you can test the above equation with Excel (or any other tool you like):

  •  Create a column of sample indexes n
  •  Create a column of x input samples (some ‘0’ and then some ‘1’ for a step response)
  •  Create a column for y outputs by using the above equation
  •  Plot both x and y data with n as x-axis.

 

image080.png

    

2.4. Further simplification

In the previous equation, we observe that each output sample calculation requires:

  • One addition          → image082.png
  • Two products         → image084.png and image086.png
  • Another addition    → image088.png

 

If we assume that the sampling frequency if high compared to input signal frequency (which is usually the case), we can further simplify our equation by assuming that xn-1 and xn are not very different.

image070.png with image094.png

image096.png

 

Now, we can notice that:

image098.png
 
image100.png
 
image102.png

 

The above equation rewrites:

image104.png

Let us write

image106.png

image108.png

With image110.png

We have removed one addition. Using again:

T=0.01s      τ=0.08s

We have finally:

image113.png

 

The plot below compares both step responses of first (y) and simplified (y’) version of our filter. Based on that result, we can accept the assumption that xn-1xn and use the simplified form of the filter. We can also see that 63% of the final output value is reached at sample #18. That is 8 samples after the step is applied. With a sampling period of 0.01s, we confirm that τ=0.08. It now makes clear that the sampling period strongly determines the cutoff frequency of the filter. In other words, running the above equation at different sampling rate would provide different cutoff frequency.

image117.png
 

3. Basic “analog follower” project

Before going into the filter implementation, let us prepare an application that simply copies the input voltage from the ADC channel to the DAC output. The next sections review the necessary setup for involved peripherals.

 

3.1. ADC configuration

Below is the setup for ADC. It is a one channel conversion therefore DMA is not involved:

/*
 * ADC_Init()
 * Initialize ADC for single channel conversion
 * - Channel 11 -> pin PC1
 */

void BSP_ADC_Init()
{
	// Enable GPIOC clock
	RCC->AHBENR |= RCC_AHBENR_GPIOCEN;

	// Configure pin PC1 as analog
	GPIOC->MODER &= ~(GPIO_MODER_MODER1_Msk);
	GPIOC->MODER |= (0x03 <<GPIO_MODER_MODER1_Pos);

	// Enable ADC clock
	RCC->APB2ENR |= RCC_APB2ENR_ADC1EN;

	// Reset ADC configuration
	ADC1->CR 	= 0x00000000;
	ADC1->CFGR1  = 0x00000000;
	ADC1->CFGR2  = 0x00000000;
	ADC1->CHSELR = 0x00000000;

	// Enable 12-bit continuous conversion mode
	ADC1->CFGR1 |= (0x00 <<ADC_CFGR1_RES_Pos) | ADC_CFGR1_CONT;

	// Select PCLK/2 as ADC clock
	ADC1->CFGR2 |= (0x01 <<ADC_CFGR2_CKMODE_Pos);

	// Set sampling time to 28.5 ADC clock cycles
	ADC1->SMPR = 0x03;

	// Select channel 11
	ADC1->CHSELR |= ADC_CHSELR_CHSEL11;

	// Enable ADC
	ADC1->CR |= ADC_CR_ADEN;

	// Start conversion
	ADC1->CR |= ADC_CR_ADSTART;
}

 
3.2. DAC configuration

Same thing for the DAC setup. We have a basic configuration:

/*
 * DAC_Init()
 * Initialize DAC for a single output
 * on channel 1 -> pin PA4
 */

void BSP_DAC_Init()
{
	// Enable GPIOA clock
	RCC->AHBENR |= RCC_AHBENR_GPIOAEN;

	// Configure pin PA4 as analog
	GPIOA->MODER &= ~GPIO_MODER_MODER4_Msk;
	GPIOA->MODER |= (0x03 <<GPIO_MODER_MODER4_Pos);

	// Enable DAC clock
	RCC->APB1ENR |= RCC_APB1ENR_DACEN;

	// Reset DAC configuration
	DAC->CR = 0x00000000;

	// Enable Channel 1
	DAC->CR |= DAC_CR_EN1;
}

 

3.3. Timebase configuration

TIM6 timebase is tuned for 10ms period between update interrupts:

/*
 * BSP_TIMER_Timebase_Init()
 * TIM6 at 48MHz
 * Prescaler   = 48000  -> Counting frequency is 1kHz
 * Auto-reload = 10 	-> Update period is 10ms
 * Enable Update Interrupt
 */

void BSP_TIMER_Timebase_Init()
{
	// Enable TIM6 clock
	RCC->APB1ENR |= RCC_APB1ENR_TIM6EN;

	// Reset TIM6 configuration
	TIM6->CR1 = 0x0000;
	TIM6->CR2 = 0x0000;

	// Set TIM6 prescaler
	// Fck = 48MHz -> /48000 = 1kHz counting frequency
	TIM6->PSC = (uint16_t) 48000 -1;

	// Set TIM6 auto-reload register for 10ms
	TIM6->ARR = (uint16_t) 10 -1;

	// Enable auto-reload preload
	TIM6->CR1 |= TIM_CR1_ARPE;

	// Enable Interrupt upon Update Event
	TIM6->DIER |= TIM_DIER_UIE;

	// Start TIM6 counter
	TIM6->CR1 |= TIM_CR1_CEN;
}

 

And NVIC is configure to allow TIM6 interrupts with priority 1:

/*
 * BSP_NVIC_Init()
 * Setup NVIC controller for desired interrupts
 */

void BSP_NVIC_Init()
{
	// Set priority level 1 for TIM6 interrupt
	NVIC_SetPriority(TIM6_DAC_IRQn, 1);

	// Enable TIM6 interrupts
	NVIC_EnableIRQ(TIM6_DAC_IRQn);
}

 
3.4. Follower stage

Finally, the main() function implementing follower stage is given below. in and out variables have been set as global in order to allow STM-Studio® monitoring:

// Global variables

...

uint8_t	timebase_irq = 0;
uint16_t	in, out;

// Main program

void main()
{
	// Configure System Clock
	SystemClock_Config();

	// Initialize 10ms timebase
	BSP_TIMER_Timebase_Init();
	BSP_NVIC_Init();

	// Initialize ADC
	BSP_ADC_Init();

	// Initialize DAC
	BSP_DAC_Init();

	// Initialize LED pin
	BSP_LED_Init();

	// Main loop
	while(1)
	{
		// Do every 10ms
		if (timebase_irq == 1)
		{
			// Start measure
			BSP_LED_On();

			// Input signal <-- ADC
			while ( (ADC1->ISR & ADC_ISR_EOC) != ADC_ISR_EOC );
			in = ADC1->DR;

			// Follower stage
			out = in;

			// Output signal --> DAC
			DAC->DHR12R1 = out;

			// Stop measure
			BSP_LED_Off();
			timebase_irq = 0;
		}
	}
}

 

In order to test the follower stage, let us connect the user button pin PC13 to the ADC input pin PC1. This is a convenient way to generate full-scale input steps (0V → 3.3V) to the filter input pin PC1. Make sure that you have no code processing the push-button event. You don’t even need to call the BSP_PB_Init() function as all pins are input upon reset and we’re not going to read button state anyway.

image124.png

 

Save all, build and run the application.

 

If you have an oscilloscope at hand, you can make some verifications:

  •     Probing PA5, you should see a short pulse that occurs every 10ms, according to the settings of the timebase:

 

image134.png
 
 
  • Zooming horizontal, the width of the PA5 pulse tells you about the time it takes to execute the filter loop. At this moment, the process takes 1.5µs to complete. Remember that we are doing nothing more than output = input (follower stage). It has to be short…

 

image137.png

 

  • Probing both PC1 and PA4 and playing with the user push-button, you can make sure that the follower stage actually follows (DAC output on PA4 is a copy of ADC input on PC1):

 

image139.png

 

Alternatively, you can monitor both in and out variables with STM-Studio® while playing with the user button:

image141.png

 

When you have checked that everything is working as expected, you are ready to start the filter implementation.

 

  Commit name "Digital filter - Follower"
  Push onto Gitlab

 

4. Filter implementation

Remember that our filter is described by the following recursion equation:

image142.png
 
image144.png

In a first time, we will implement this equation directly using floats for the two coefficients c1 and c2.

 

4.1. Floating point arithmetic

Edit the main() function as shown below.


...

void main()
{
	float c1, c2;	// Filter coefs
	float x = 0;		// Filter input
	float y = 0;		// FIlter ouput

	...

	// Setup filter coefs
	c1 = 0.11765f;
	c2 = 0.88235f;

	// Main loop
	while(1)
	{
		// Do every 10ms
		if (timebase_irq == 1)
		{
			// Start measure
			BSP_LED_On();

			// Input signal <-- ADC
			while ( (ADC1->ISR & ADC_ISR_EOC) != ADC_ISR_EOC );
			in = ADC1->DR;

			// Filter stage
			x = (float)in;
			y = (c1*x) + (c2*y);
			out = (uint16_t)y;

			// Output signal --> DAC
			DAC->DHR12R1 = out;

			// Stop measure
			BSP_LED_Off();
			timebase_irq = 0;
		}
	}
}

 

Note that in y = (c1*x) + (c2*y) we already have yn on the left-hand side of '=' and yn-1 on the right-hand side.

Build the project and run.

You can use STM-Studio® to get a quick view on the result:

image147.png

 

For precise measurements, it’s better using an oscilloscope. Probe both input (PC1) and output (PA4) pins:

image148.png

 
Remember that we want  τ =80ms. About 95% of the final value should be reached within 3τ = 240ms. The figure above is a detailed view of the step (push-button) response. We can see that the filter behaves perfectly as expected. The filter final value is almost reached after 5τ.

Next, probe PA5 (filter processing time).  In the figure below, we first note that the filter is processed quite fast compared to the sampling period (10ms), meaning that CPU is spending most of the time doing nothing, waiting for the next timebase interrupt.

We can measure an execution time of about 16µs:

image151.png

 

If you play with the user Push-Button, you may notice that processing time is not the same when input signal is high or low. It is actually a little bit faster (13.4µs) when input is 0 (button down) because arithmetic operations are likely simplified with null operands. Therefore, one can see that computation time is not deterministic, and depends on operands values. That’s a consequence of conditional branches in the assembly code that result from floating point arithmetic synthesis.

 

  Commit name "Digital Filter - Floating point arithmetics"
  Push onto Gitlab

 

4.2. Fixed point arithmetic

Using float type to represent real numbers such as our coefficients c1 and c2 is a natural thing, but you may know that real numbers may also be represented by integers, assuming a virtual decimal point. Such representation is called fixed-point representation (‘float’ being for floating-point representation).

We are aware that Cortex-M0 CPU does not have any dedicated hardware to perform floating point arithmetic (Floating Point Unit, FPU). So it is likely that using floats is slowing down code execution (i.e. floating point arithmetic produces many assembly lines to complete).

Try the code below where filter variables are represented by a 12-bit decimal place fixed-point:

...

// Main program

void main()
{
	uint16_t c1, c2;		// Filter coefs
	uint16_t x = 0;		// Filter input
	uint32_t y = 0;		// Filter ouput

	...

	// Setup filter coefs
	c1 = 482;	// 0.11765f *2^12
	c2 = 3614;	// 0.88235f *2^12

	// Main loop
	while(1)
	{
		// Do every 10ms
		if (timebase_irq == 1)
		{
			// Start measure
			BSP_LED_On();

			// Input signal <-- ADC
			while ( (ADC1->ISR & ADC_ISR_EOC) != ADC_ISR_EOC );
			in = ADC1->DR;

			// Filter stage
			x = in;
			y = (c1*x) + (c2*y);
			y = (y >>12U);		// Remove decimal part
			out = y;

			// Output signal --> DAC
			DAC->DHR12R1 = out;

			// Stop measure
			BSP_LED_Off();
			timebase_irq = 0;
		}
	}
}

Build the project and flash the target.

Make sure that the filter behaves exactly as before (i.e.  when using floats):

image154.png

 

If you are using STM-Studio®, you can also see the filter behavior. Just set the sampling period to its maximum (sequential loop) and display all the data points:

image156.png

 

When this is done, use the PA5 pin to measure the filter processing time. You’ll find that this implementation is about 8x faster (2µs) than the one making use of floats. For the very same behavior!

image158.png

 

  Commit name "Digital filter - Fixed point arithmetics"
  Push onto Gitlab

 

5. Summary

In this tutorial, you have implemented a digital low-pass filter. With a little help from Scilab® you should be able to synthetize in the digital domain any filter as long as you have its transfer function.

The tutorial took you into 2 coding approaches, the second being 8x faster that the first one for the very same result.

Processing efficiency depends on your coding skills. It has consequences on application performance and more generally on the business (money).

A code that is more efficient allows you:

  • to buy a less powerful, less expensive processor for the same task
  • to reduce MCU clock speed and therefore reduce power consumption (i.e. increase battery life in the case of mobile application, reduce heat dissipation requirements…)
  • spend more time in low-power modes

These benefits are not just “icing on the cake”. It is fundamental that you produce efficient code every time you can.