1. Introduction
The purpose of this tutorial is to drive you though the design of a digital filter that takes an analog input signal applied to the ADC, then processes incoming samples, and finally provides an analog output using a DAC.
In the first part, the tutorial provides some basic background on digital filters. You may need some prerequisite knowledge, mostly on Laplace transform and ztransform, to get a full understanding of exposed matter, but you can still grab some useful tricks even if you are new to the field of digital filtering.
2. First order digital filter
2.1. The zdomain transfer function
The common representation of firstorder LowPass filter transfer function in the Laplace domain is:
The cutoff frequency of such filter is given by:
Let us suppose that we want a cutoff frequency . We therefore have:
The Laplace variable p (or s if you live anywhere in the world but in France :)) is useful to represent transfer functions in the continuoustime domain. When dealing with digital systems, we usually sample incoming signal at regular time intervals. The Laplace transform is no more representative of such a discretetime system.
The Laplacetransform equivalent for sampled signals is called the ztransform.
A common approach to move a Laplace transfer function into the zdomain is to use something called the “bilinear transform”. The bilinear transform is obtained by mapping zplane to pplane using:
where T is the sampling period (in seconds). In the tutorial, we will use a 10ms sampling period (i.e. we will get a new input from ADC every 10ms).
The inverse mapping gives:
Using first term of ln series expansion, we have:
Applying the bilinear transform to our first order LowPass filter then gives:
The zdomain transfer function of our digital LowPass filter, sampled at 100Hz (T=10ms), and having a cutoff frequency of fc=2Hz is finally:
2.2. On the use of Scilab® as a zdomain transfer function calculator
Luckily for you, scientific computing software such as Matlab® or Scilab® are able to process linear systems transfer function from continuoustime to discretetime (and reversely) quite easily.
The following Scilab® script does the same bilinear transform as seen above:
clear;
s=poly(0,'s');
sl=syslin('c',(1)/(1+0.08*s)); // Continuoustime system in transfer form
slss=tf2ss(sl); // Now in statespace form
figure(1); // Plot Bode diagram
clf(1);
bode(sl, 0.1, 10000);
t=0:0.01:0.6; // Time scale
y = csim('step',t,slss); // Simulate step response
figure(2); // Plot step response
clf(2);
plot(t,y,'r.');
sld=cls2dls(slss,0.01); // Continuous to discrete transform
sldt=ss2tf(sld); // Converts in transfer form
disp(sldt); // Display ztransform
Executing the above script provides the following outputs. The obtained z transfer function confirms the one we calculated earlier.
 zdomain transfer function
2.3. From zdomain transfer function to recursion equation
The ztransform is a discretetime representation of the filter that relates output samples to input samples:
Where X(z) represents the input samples and Y(z) represents the output samples.
Let us turn this into something more practical from a programming perspective:
Above equation is processed each time a new sample is available. While X(z) represents the current incoming signal sample, X(z)z represents the sample that will come (in the future) the next sampling period. More generally, multiplying X(z) or Y(z) by z^{n} is the same as putting an index n on the sample being processed.
n = 0 → current sample
n = 1 → next sample
n = 1 → previous sample
Because no filter can predict what the future will be, we can divide all the parts of the above equation by z to have:
The z^{1} factor represents a delay of one sample. The above equation can therefore be written:
where:
 y_{n} is the output sample computed at the n^{th} iteration
 y_{n1} is the output sample computed at the (n1)^{th} iteration
 x_{n} is the input sample read at the n^{th} iteration
 x_{n1} is the input sample read at the (n1)^{th} iteration
The above equation is the called the recursion equation, and can be directly applied to the filter computation within a program.
For instance, you can test the above equation with Excel (or any other tool you like):
 Create a column of sample indexes n
 Create a column of x input samples (some ‘0’ and then some ‘1’ for a step response)
 Create a column for y outputs by using the above equation
 Plot both x and y data with n as xaxis.
2.4. Further simplification
In the previous equation, we observe that each output sample calculation requires:
 One addition →
 Two products → and
 Another addition →
If we assume that the sampling frequency if high compared to input signal frequency (which is usually the case), we can further simplify our equation by assuming that x_{n1} and x_{n} are not very different.
with
Now, we can notice that:
The above equation rewrites:
Let us write
With
We have removed one addition. Using again:
T=0.01s τ=0.08s
We have finally:
The plot below compares both step responses of first (y) and simplified (y’) version of our filter. Based on that result, we can accept the assumption that x_{n1}≈x_{n} and use the simplified form of the filter. We can also see that 63% of the final output value is reached at sample #18. That is 8 samples after the step is applied. With a sampling period of 0.01s, we confirm that τ=0.08. It now makes clear that the sampling period strongly determines the cutoff frequency of the filter. In other words, running the above equation at different sampling rate would provide different cutoff frequency.
3. Basic “analog follower” project
Before going into the filter implementation, let us prepare an application that simply copies the input voltage from the ADC channel to the DAC output. The next sections review the necessary setup for involved peripherals.
3.1. ADC configuration
Below is the setup for ADC. It is a one channel conversion therefore DMA is not involved:
/*
* ADC_Init()
* Initialize ADC for single channel conversion
*  Channel 11 > pin PC1
*/
void BSP_ADC_Init()
{
// Enable GPIOC clock
RCC>AHBENR = RCC_AHBENR_GPIOCEN;
// Configure pin PC1 as analog
GPIOC>MODER &= ~(GPIO_MODER_MODER1_Msk);
GPIOC>MODER = (0x03 <<GPIO_MODER_MODER1_Pos);
// Enable ADC clock
RCC>APB2ENR = RCC_APB2ENR_ADC1EN;
// Reset ADC configuration
ADC1>CR = 0x00000000;
ADC1>CFGR1 = 0x00000000;
ADC1>CFGR2 = 0x00000000;
ADC1>CHSELR = 0x00000000;
// Enable 12bit continuous conversion mode
ADC1>CFGR1 = (0x00 <<ADC_CFGR1_RES_Pos)  ADC_CFGR1_CONT;
// Select PCLK/2 as ADC clock
ADC1>CFGR2 = (0x01 <<ADC_CFGR2_CKMODE_Pos);
// Set sampling time to 28.5 ADC clock cycles
ADC1>SMPR = 0x03;
// Select channel 11
ADC1>CHSELR = ADC_CHSELR_CHSEL11;
// Enable ADC
ADC1>CR = ADC_CR_ADEN;
// Start conversion
ADC1>CR = ADC_CR_ADSTART;
}
3.2. DAC configuration
Same thing for the DAC setup. We have a basic configuration:
/*
* DAC_Init()
* Initialize DAC for a single output
* on channel 1 > pin PA4
*/
void BSP_DAC_Init()
{
// Enable GPIOA clock
RCC>AHBENR = RCC_AHBENR_GPIOAEN;
// Configure pin PA4 as analog
GPIOA>MODER &= ~GPIO_MODER_MODER4_Msk;
GPIOA>MODER = (0x03 <<GPIO_MODER_MODER4_Pos);
// Enable DAC clock
RCC>APB1ENR = RCC_APB1ENR_DACEN;
// Reset DAC configuration
DAC>CR = 0x00000000;
// Enable Channel 1
DAC>CR = DAC_CR_EN1;
}
3.3. Timebase configuration
TIM6 timebase is tuned for 10ms period between update interrupts:
/*
* BSP_TIMER_Timebase_Init()
* TIM6 at 48MHz
* Prescaler = 48000 > Counting frequency is 1kHz
* Autoreload = 10 > Update period is 10ms
* Enable Update Interrupt
*/
void BSP_TIMER_Timebase_Init()
{
// Enable TIM6 clock
RCC>APB1ENR = RCC_APB1ENR_TIM6EN;
// Reset TIM6 configuration
TIM6>CR1 = 0x0000;
TIM6>CR2 = 0x0000;
// Set TIM6 prescaler
// Fck = 48MHz > /48000 = 1kHz counting frequency
TIM6>PSC = (uint16_t) 48000 1;
// Set TIM6 autoreload register for 10ms
TIM6>ARR = (uint16_t) 10 1;
// Enable autoreload preload
TIM6>CR1 = TIM_CR1_ARPE;
// Enable Interrupt upon Update Event
TIM6>DIER = TIM_DIER_UIE;
// Start TIM6 counter
TIM6>CR1 = TIM_CR1_CEN;
}
And NVIC is configure to allow TIM6 interrupts with priority 1:
/*
* BSP_NVIC_Init()
* Setup NVIC controller for desired interrupts
*/
void BSP_NVIC_Init()
{
// Set priority level 1 for TIM6 interrupt
NVIC_SetPriority(TIM6_DAC_IRQn, 1);
// Enable TIM6 interrupts
NVIC_EnableIRQ(TIM6_DAC_IRQn);
}
3.4. Follower stage
Finally, the main() function implementing follower stage is given below. in and out variables have been set as global in order to allow STMStudio® monitoring:
// Global variables
...
uint8_t timebase_irq = 0;
uint16_t in, out;
// Main program
void main()
{
// Configure System Clock
SystemClock_Config();
// Initialize 10ms timebase
BSP_TIMER_Timebase_Init();
BSP_NVIC_Init();
// Initialize ADC
BSP_ADC_Init();
// Initialize DAC
BSP_DAC_Init();
// Initialize LED pin
BSP_LED_Init();
// Main loop
while(1)
{
// Do every 10ms
if (timebase_irq == 1)
{
// Start measure
BSP_LED_On();
// Input signal < ADC
while ( (ADC1>ISR & ADC_ISR_EOC) != ADC_ISR_EOC );
in = ADC1>DR;
// Follower stage
out = in;
// Output signal > DAC
DAC>DHR12R1 = out;
// Stop measure
BSP_LED_Off();
timebase_irq = 0;
}
}
}
In order to test the follower stage, let us connect the user button pin PC13 to the ADC input pin PC1. This is a convenient way to generate fullscale input steps (0V → 3.3V) to the filter input pin PC1. Make sure that you have no code processing the pushbutton event. You don’t even need to call the BSP_PB_Init() function as all pins are input upon reset and we’re not going to read button state anyway.
Save all, build and run the application.
If you have an oscilloscope at hand, you can make some verifications:
 Probing PA5, you should see a short pulse that occurs every 10ms, according to the settings of the timebase:
 Zooming horizontal, the width of the PA5 pulse tells you about the time it takes to execute the filter loop. At this moment, the process takes 1.5µs to complete. Remember that we are doing nothing more than output = input (follower stage). It has to be short…
 Probing both PC1 and PA4 and playing with the user pushbutton, you can make sure that the follower stage actually follows (DAC output on PA4 is a copy of ADC input on PC1):
Alternatively, you can monitor both in and out variables with STMStudio® while playing with the user button:
When you have checked that everything is working as expected, you are ready to start the filter implementation.

Commit name "Digital filter  Follower" Push onto Gitlab 
4. Filter implementation
Remember that our filter is described by the following recursion equation:
In a first time, we will implement this equation directly using floats for the two coefficients c1 and c2.
4.1. Floating point arithmetic
Edit the main() function as shown below.
...
void main()
{
float c1, c2; // Filter coefs
float x = 0; // Filter input
float y = 0; // FIlter ouput
...
// Setup filter coefs
c1 = 0.11765f;
c2 = 0.88235f;
// Main loop
while(1)
{
// Do every 10ms
if (timebase_irq == 1)
{
// Start measure
BSP_LED_On();
// Input signal < ADC
while ( (ADC1>ISR & ADC_ISR_EOC) != ADC_ISR_EOC );
in = ADC1>DR;
// Filter stage
x = (float)in;
y = (c1*x) + (c2*y);
out = (uint16_t)y;
// Output signal > DAC
DAC>DHR12R1 = out;
// Stop measure
BSP_LED_Off();
timebase_irq = 0;
}
}
}
Note that in y = (c1*x) + (c2*y) we already have y_{n} on the lefthand side of '=' and y_{n1} on the righthand side.
Build the project and run.
You can use STMStudio® to get a quick view on the result:
For precise measurements, it’s better using an oscilloscope. Probe both input (PC1) and output (PA4) pins:
Remember that we want τ =80ms. About 95% of the final value should be reached within 3τ = 240ms. The figure above is a detailed view of the step (pushbutton) response. We can see that the filter behaves perfectly as expected. The filter final value is almost reached after 5τ.
Next, probe PA5 (filter processing time). In the figure below, we first note that the filter is processed quite fast compared to the sampling period (10ms), meaning that CPU is spending most of the time doing nothing, waiting for the next timebase interrupt.
We can measure an execution time of about 16µs:
If you play with the user PushButton, you may notice that processing time is not the same when input signal is high or low. It is actually a little bit faster (13.4µs) when input is 0 (button down) because arithmetic operations are likely simplified with null operands. Therefore, one can see that computation time is not deterministic, and depends on operands values. That’s a consequence of conditional branches in the assembly code that result from floating point arithmetic synthesis.

Commit name "Digital Filter  Floating point arithmetics" Push onto Gitlab 
4.2. Fixed point arithmetic
Using float type to represent real numbers such as our coefficients c1 and c2 is a natural thing, but you may know that real numbers may also be represented by integers, assuming a virtual decimal point. Such representation is called fixedpoint representation (‘float’ being for floatingpoint representation).
We are aware that CortexM0 CPU does not have any dedicated hardware to perform floating point arithmetic (Floating Point Unit, FPU). So it is likely that using floats is slowing down code execution (i.e. floating point arithmetic produces many assembly lines to complete).
Try the code below where filter variables are represented by a 12bit decimal place fixedpoint:
...
// Main program
void main()
{
uint16_t c1, c2; // Filter coefs
uint16_t x = 0; // Filter input
uint32_t y = 0; // Filter ouput
...
// Setup filter coefs
c1 = 482; // 0.11765f *2^12
c2 = 3614; // 0.88235f *2^12
// Main loop
while(1)
{
// Do every 10ms
if (timebase_irq == 1)
{
// Start measure
BSP_LED_On();
// Input signal < ADC
while ( (ADC1>ISR & ADC_ISR_EOC) != ADC_ISR_EOC );
in = ADC1>DR;
// Filter stage
x = in;
y = (c1*x) + (c2*y);
y = (y >>12U); // Remove decimal part
out = y;
// Output signal > DAC
DAC>DHR12R1 = out;
// Stop measure
BSP_LED_Off();
timebase_irq = 0;
}
}
}
Build the project and flash the target.
Make sure that the filter behaves exactly as before (i.e. when using floats):
If you are using STMStudio®, you can also see the filter behavior. Just set the sampling period to its maximum (sequential loop) and display all the data points:
When this is done, use the PA5 pin to measure the filter processing time. You’ll find that this implementation is about 8x faster (2µs) than the one making use of floats. For the very same behavior!

Commit name "Digital filter  Fixed point arithmetics" Push onto Gitlab 
5. Summary
In this tutorial, you have implemented a digital lowpass filter. With a little help from Scilab® you should be able to synthetize in the digital domain any filter as long as you have its transfer function.
The tutorial took you into 2 coding approaches, the second being 8x faster that the first one for the very same result.
Processing efficiency depends on your coding skills. It has consequences on application performance and more generally on the business (money).
A code that is more efficient allows you:
 to buy a less powerful, less expensive processor for the same task
 to reduce MCU clock speed and therefore reduce power consumption (i.e. increase battery life in the case of mobile application, reduce heat dissipation requirements…)
 spend more time in lowpower modes
These benefits are not just “icing on the cake”. It is fundamental that you produce efficient code every time you can.