Monday, October 22, 2012

Modified PID Controller with Constrained Cubic Spline Error Function

A PID controller is a good tool to have in your belt. You can use it as the first approach for most of the control problems a practical electronics engineer faces. You don't have to have the complete dynamic model of your system to be able to use it. What you have to know is how to tune it. After some time you will start feeling how the gains affect the system, and how to test the limits of the system to determine stability.

My comments here will most likely freak out my control theory teachers, but what the heck. Sometimes (most of the time, to be more precise) you don't have all the tools you need to evaluate your system, for several reasons. And most of the time a PID controller will be good enough to move your project forward.

Just to give some context here, I'm not talking about the industrial PID controllers implemented as a box you can buy and connect to your boiler. Here PID refers to the algorithm and its implementation as a piece of software.

In this post I will present a modification of the classical PID controller to enable a kind of symmetrical behaviour when the input changes to saturation limits.

This idea occurred to me when I was working with video cameras and having some issues with the auto exposure system, particularly with the control of the electromechanical iris in the lenses. The iris was controlled by a standard PID, which produced a very noticeable difference in convergence speed when moving the camera from a bright to a dark scene compared to the opposite move (dark to bright). The result was an annoying, very bright picture that could last for some seconds. I had no luck tweaking the gains of the controller: whatever solved the problem in one condition created instabilities in the other.

Analysis of the problem led me to develop the improvement described here.

The PID equation

The equation for the PID controller in its parallel form is:
\[u(t)=K_pe(t)+K_i\int_{0}^{t}{e(\tau)}\,{d\tau} + K_d\frac{d}{dt}e(t)\]
\(K_p\): Proportional gain
\(K_i\): Integral gain
\(K_d\): Derivative gain
\(e\): Error
The error is the difference between the set point and the process variable:
\[e(t) = SP - PV(t)\]
\(SP\): Set Point
\(PV\): Process variable (input)
I often use this form to directly implement a discrete PID controller.

In a PID controller, the proportional and integral terms contribute to the convergence speed.
The integral term is necessary to cancel the steady-state error (offset).
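
As an illustration of implementing the parallel form directly in discrete time, here is a minimal sketch in C. It assumes a fixed sample period, rectangular integration and a backward-difference derivative; the structure and function names (`struct pid`, `pid_init`, `pid_step`) are hypothetical and not taken from any real controller of mine.

```c
#include <assert.h>

/* Hypothetical controller state; all names are illustrative. */
struct pid {
    double kp, ki, kd;   /* gains of the parallel form */
    double dt;           /* sample period in seconds */
    double integral;     /* running approximation of the integral of e */
    double prev_err;     /* error at the previous sample */
};

static void pid_init(struct pid *c, double kp, double ki, double kd, double dt)
{
    assert(dt > 0.0);
    c->kp = kp;
    c->ki = ki;
    c->kd = kd;
    c->dt = dt;
    c->integral = 0.0;
    c->prev_err = 0.0;
}

/* One controller step: e = SP - PV, rectangular integration,
   backward-difference derivative. */
static double pid_step(struct pid *c, double sp, double pv)
{
    double err = sp - pv;
    double deriv = (err - c->prev_err) / c->dt;

    c->integral += err * c->dt;
    c->prev_err = err;
    return c->kp * err + c->ki * c->integral + c->kd * deriv;
}
```

A real implementation would normally also clamp the output and the integral; that is left out here to keep the mapping to the equation obvious.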

The Problem

For any physical system there will be saturation in every single part. The input will saturate at a maximum and a minimum level. You can design your controller to work with any range of inputs, which is most probably impractical, or you can artificially limit the input to a certain range. For digital controllers there is a good chance that this saturation is imposed by your A/D converter or another analog signal conditioning circuit. Another good reason to saturate the input is to avoid numerical instability problems.

There are some classes of systems where, for some reason, you want to limit the convergence speed, or where you cannot improve the speed without destabilizing the system.

Fig. 1
Now let's consider what happens if a very fast change in the input leads to saturation. The error will be constant until the accumulated integral part is enough to offset it. The error derivative is zero at this point, as the error is not changing.

For a quick analysis, let's consider that the integral part is much bigger than the proportional one and dominates the equation. While this situation persists the output will increase, or decrease, at a constant rate. This is represented in Fig. 1: between 1 and 2.5 seconds the input saturates at its maximum, and starting from 3.5 seconds it saturates at its minimum.

Observe that the time needed to recover from a saturated minimum and the time needed to recover from a saturated maximum are very different. When the input is saturated at the maximum it takes 1.5 seconds, and when it is saturated at the minimum it takes 6 seconds. That is 4 times longer. The reason is that our set point is at 1/4 of the maximum value. The integral term is the dominating factor:
\[u(t) \approx K_i\int_{0}^{t}{e(\tau)}\,{d\tau}\]
As we are saturated, the error term being integrated is constant:
\[u(t) \approx K_i\,e\,t\]

In our example:

Leads to:

Linear Error Function

The error function for the classical PID controller is:
\[e(t) = SP - PV(t)\]

Fig. 2

In Fig. 2 we can see the error 'curve' plotted for 3 different set points in a saturation limited (bounded) system. Observe that if the set point is set to half of the allowed range (0.5 in this case), the error will have the same absolute value at both limits, leading to equivalent rise and fall times for the integral and, as a consequence, for the convergence.


What I propose is to replace the error function in the PID controller by a constrained cubic spline with some special requirements:
  • The three points of the curve are: P1=(0, 0.5); P2=(SP, 0) and P3=(1, -0.5)
  • The curve has to be smooth at P2
  • The 1st derivative at P2 should be -1
  • The 1st derivative should always be negative, i.e. there will be no overshoot
  • It has to be simple enough to be computed at runtime
Fig. 3 shows a plot of the proposed function.

Fig. 3
In Fig. 3 we can see the same 3 cases depicted previously. But now, regardless of the set point value, the errors at the two saturation points have the same absolute value: 0.5 at the minimum and -0.5 at the maximum. This way the system will converge at approximately the same rate in both directions. Notice also that for SP = 0.5 the curve is the same line as in the original error function.

The idea is to replace the error function by a spline. The proposed solution is a two-segment cubic spline whose polynomials are:

\[e_1(t) = a_{1} + b_{1}PV(t) + c_{1}PV(t)^2 + d_{1}PV(t)^3\]
\[e_2(t) = a_{2} + b_{2}PV(t) + c_{2}PV(t)^2 + d_{2}PV(t)^3\]

Thus, the error function is given by:

\[e(t) = \begin{cases}
e_1(t) & PV(t) \leq SP\\
e_2(t) & PV(t) > SP
\end{cases}\]

The coefficients of the polynomials are functions of the set point and must be computed each time this value changes. To calculate the coefficients we must solve the spline equations including the proposed constraints.

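The equations below are a solution for the two-segment cubic spline with the constraints presented above. Both segments share the same first derivative at the middle point, so \(f'_{1}(x_{1}) = f'_{2}(x_{1})\).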

\[f'_{1}(x_{0})= \frac{3(y_{1}-y_{0})}{2(x_{1}-x_{0})}-\frac{f'_{1}(x_{1})}{2}\]
\[f''_{1}(x_{0})=\frac{-2(f'_{1}(x_{1}) + 2 f'_{1}(x_{0}))}{(x_{1} - x_{0})} + \frac{6 (y_{1} - y_{0})}{(x_{1} - x_{0})^2}\]
\[f''_{1}(x_{1})=\frac{2(2 f'_{1}(x_{1}) + f'_{1}(x_{0}))}{(x_{1} - x_{0})} - \frac{6 (y_{1} - y_{0})}{(x_{1} - x_{0})^2}\]
\[d_{1} = \frac{f''_{1}(x_{1}) - f''_{1}(x_{0})}{6 (x_{1} - x_{0})}\]
\[c_{1} = \frac{x_{1} f''_{1}(x_{0}) - x_{0} f''_{1}(x_{1})}{2(x_{1} - x_{0})}\]
\[b_{1} = \frac{(y_{1} - y_{0}) - c_{1}(x_{1}^2 - x_{0}^2) - d_{1}( x_{1}^3 - x_{0}^3)}{x_{1} - x_{0}}\]
\[a_{1} = y_{0} - b_{1} x_{0} - c_{1} x_{0}^2 - d_{1} x_{0}^3\]
\[f'_{2}(x_{1})=\frac{2}{\frac{x_{2} - x_{1}}{y_{2} - y_{1}} + \frac{x_{1} - x_{0}}{y_{1} - y_{0}}}\]
\[f'_{2}(x_{2})= \frac{3(y_{2} - y_{1})}{2(x_{2} - x_{1})}-\frac{f'_{2}(x_{1})}{2}\]
\[f''_{2}(x_{1})=\frac{-2(f'_{2}(x_{2}) + 2 f'_{2}(x_{1}))}{(x_{2}-x_{1})} + \frac{6 (y_{2} - y_{1})}{(x_{2} - x_{1})^2}\]
\[f''_{2}(x_{2})=\frac{2(2 f'_{2}(x_{2}) + f'_{2}(x_{1}))}{(x_{2} - x_{1})} - \frac{6 (y_{2} - y_{1})}{(x_{2} - x_{1})^2}\]
\[d_{2} = \frac{f''_{2}(x_{2}) - f''_{2}(x_{1})}{6 (x_{2} - x_{1})}\]
\[c_{2} = \frac{x_{2} f''_{2}(x_{1}) - x_{1} f''_{2}(x_{2})}{2(x_{2} - x_{1})}\]
\[b_{2} = \frac{(y_{2} - y_{1}) - c_{2}(x_{2}^2 - x_{1}^2) - d_{2}( x_{2}^3 - x_{1}^3)}{x_{2} - x_{1}}\]
\[a_{2} = y_{1} - b_{2} x_{1} - c_{2} x_{1}^2 - d_{2} x_{1}^3\]
\[x_{0} = 0\]
\[y_{0} = 0.5\]
\[x_{1} = SP\]
\[y_{1} = 0\]
\[x_{2} = 1\]
\[y_{2} = -0.5\]
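
Since the coefficients must be recomputed whenever the set point changes, it may help to see the equations written as code. What follows is only a sketch in C, not my original Matlab (Octave) program; the names `struct seg`, `seg_solve`, `spline_error_coeffs` and `spline_error` are made up for this illustration. It uses \(2(x_{1} - x_{0})\) as the denominator of the \(c\) coefficient, which is what reproduces the coefficient values listed in this post.

```c
#include <assert.h>

/* Coefficients of one cubic segment: e = a + b*x + c*x^2 + d*x^3 */
struct seg { double a, b, c, d; };

/* Solve one segment from its end points and its end slopes s0, s1. */
static struct seg seg_solve(double x0, double y0, double x1, double y1,
                            double s0, double s1)
{
    struct seg p;
    double h = x1 - x0;
    double dy = y1 - y0;
    /* second derivatives at both ends of the segment */
    double f2_0 = -2.0 * (s1 + 2.0 * s0) / h + 6.0 * dy / (h * h);
    double f2_1 = 2.0 * (2.0 * s1 + s0) / h - 6.0 * dy / (h * h);

    p.d = (f2_1 - f2_0) / (6.0 * h);
    p.c = (x1 * f2_0 - x0 * f2_1) / (2.0 * h);
    p.b = (dy - p.c * (x1 * x1 - x0 * x0)
              - p.d * (x1 * x1 * x1 - x0 * x0 * x0)) / h;
    p.a = y0 - p.b * x0 - p.c * x0 * x0 - p.d * x0 * x0 * x0;
    return p;
}

/* Compute both segments for a set point strictly between 0 and 1. */
static void spline_error_coeffs(double sp, struct seg *e1, struct seg *e2)
{
    const double x0 = 0.0, y0 = 0.5, y1 = 0.0, x2 = 1.0, y2 = -0.5;
    double x1 = sp;
    double s0, s1, s2;

    assert(sp > 0.0 && sp < 1.0);
    /* slope at the middle point; evaluates to -1 for these three points */
    s1 = 2.0 / ((x2 - x1) / (y2 - y1) + (x1 - x0) / (y1 - y0));
    /* slopes at the two end points */
    s0 = 3.0 * (y1 - y0) / (2.0 * (x1 - x0)) - s1 / 2.0;
    s2 = 3.0 * (y2 - y1) / (2.0 * (x2 - x1)) - s1 / 2.0;

    *e1 = seg_solve(x0, y0, x1, y1, s0, s1);
    *e2 = seg_solve(x1, y1, x2, y2, s1, s2);
}

/* Evaluate the error for a process variable, clamped to [0, 1]. */
static double spline_error(double sp, const struct seg *e1,
                           const struct seg *e2, double pv)
{
    const struct seg *p = (pv <= sp) ? e1 : e2;

    if (pv < 0.0)
        pv = 0.0;
    if (pv > 1.0)
        pv = 1.0;
    return p->a + pv * (p->b + pv * (p->c + pv * p->d));
}
```

In a controller, `spline_error_coeffs()` would be called only when SP changes, and `spline_error()` would replace the plain `SP - PV` subtraction on every sample.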

To help with the calculation of the polynomials' coefficients I've developed a small Matlab (Octave) program. Below are some results.

  SP = 0.125
  a1=   0.50000 b1=  -5.50000 c1=   0.00000 d1=  96.00000
  a2=   0.13703 b2=  -1.19679 c2=   0.83965 d2=  -0.27988

  SP = 0.25
  a1=   0.50000 b1=  -2.50000 c1=   0.00000 d1=   8.00000
  a2=   0.29630 b2=  -1.38889 c2=   0.88889 d2=  -0.29630

  SP = 0.5
  a1=   0.50000 b1=  -1.00000 c1=   0.00000 d1=   0.00000
  a2=   0.50000 b2=  -1.00000 c2=   0.00000 d2=   0.00000

  SP = 0.75
  a1=   0.50000 b1=  -0.50000 c1=   0.00000 d1=  -0.29630
  a2=  -6.00000 b2=  21.50000 c2= -24.00000 d2=   8.00000

  SP = 0.875
  a1=   0.50000 b1=  -0.35714 c1=   0.00000 d1=  -0.27988
  a2= -91.00000 b2= 282.50000 c2=-288.00000 d2=  96.00000


When SP is close to the limits (0 or 1) the derivatives (slopes) become very steep and may cause numerical problems. So this method should be used with caution near the extremes.

Real Application

This modified control method was devised when I was developing a digital controller for the auto-exposure system of video surveillance cameras. The objective of this system is to control the brightness of the image being captured by the camera. It has to be able to perform under very extreme light conditions, such as direct sunlight and poorly illuminated indoor areas.
There are three parameters to control in the camera in order to regulate the exposure:
  • Image sensor gain
  • Shutter speed
  • Iris opening
Not all lenses have a controllable iris, so the system can operate in two different modes depending on the type of lens installed:
  • Iris mode - regulates the amount of light entering the camera
  • Shutter mode - regulates the exposure time of each captured frame
I will present some examples of improvements in both modes.

Iris Mode

In this mode the dominant parameter to be controlled is the amount of light that enters the camera. This is accomplished by regulating the opening of a mechanical iris embedded in the lens.
The video below shows two similar sequences, the first with the normal PID and the second with the modified version. The sequences consist of moving the camera from a dark to a bright scene.

As we can observe in this video, during the first sequence the camera overshot and got "blind" for a little more than 2 seconds. This effect is due to the very high hysteresis of the electromechanical iris. The next sequence shows a dark picture for a mere half second. This represents roughly a fourfold improvement over the original design.
Figures 4 and 5 are the output of a real-time scope that was monitoring the controller operation when the videos were shot. Figure 4 shows the normal PID and figure 5 the version modified with the spline error. The major horizontal divisions represent the time in seconds (10 seconds in total). The vertical axis is the interval from -1 to 1; all the variables were normalized to fit this interval.
The traces captured are:
  • Blue: Set point (illumination reference)
  • Red: Input (current measured illumination in the image sensor)
  • Green: Integral term (normalized to the interval -1 to 1)
Fig. 4 - Iris control with standard PID error

Fig. 5 - Iris controller with modified spline error

Shutter Mode

In this mode the amount of light entering the camera is fixed; what is controlled is the frame's exposure time. The mechanism that allows us to do this is an electronic shutter implemented in the image sensor itself.
The next video is an example of moving from bright to dark. We don't see as dramatic a visual improvement as in the previous case. But, as the graphs below show, the controller took 4 seconds to converge with the normal PID and 2 seconds with the improved version.

It is also worth mentioning that the shutter model is linear, so we don't observe an overshoot as in the previous case.

Figures 6 and 7 were captured when the above video sequences were taken. Note that the graphs show a little more than the video sequences: they include moving the camera from dark to bright, which is the point where the red line goes up suddenly.

Fig. 6 - Shutter control with standard PID error
Fig. 7 - Shutter control with modified spline error

Friday, October 19, 2012

Interprocess RPC generation tool


This post discusses a methodology to create a Remote Procedure Call interface for process-to-process communication, that is, communication between two processes on the same host.
The general solution for the problem is summarized here as a design pattern. A tool to automatically generate the RPC stub code, called irpcgen, is presented as well.

The Problem

We have two processes in an embedded system, let's call them H and I.

  • H - is a hard real-time process with strict deadlines. This can be one or more control loops or an acquisition system, for example.
  • I - controls the operation of H and performs other non time-sensitive tasks. It may implement the user interface, operational logging, etc., but its main task is to create H and act as its watchdog.

The question that arises is, what's the best approach to create a communication channel between H and I? More specifically, we want to answer two questions:

  1. which IPC mechanism will be best suited for the task?
  2. how to send and receive structured information over these channels?

The Solution

To answer 1. I created a small set of programs to benchmark several alternative IPC mechanisms in Linux. See my previous posts on the subject: Embedded Linux Interprocess Communication Mechanisms Benchmark - Part 2

With this data in hand it occurred to me that a natural channel would be two pipes. I have used this approach on other occasions, but I had never considered using unnamed pipes for the task. That's what I propose here: to use a pair of unnamed pipes connected to stdin and stdout. This was the best thing to do, as our controlling process I is the one forking H, and H has only a single controller attached to it. This way we create a two-way process-to-process IPC channel. Now we have to be able to send and receive structured data through it.
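
To make the pipe arrangement concrete, here is a minimal sketch of how a controlling process could fork a child with a pair of unnamed pipes attached to the child's stdin and stdout. It uses only standard POSIX calls (`pipe`, `fork`, `dup2`, `execl`); the function name `spawn_with_pipes` is hypothetical and this is not the code used in libirpc.

```c
#include <sys/types.h>
#include <unistd.h>

/* Fork and exec `path` with its stdin and stdout connected to two unnamed
   pipes. On success returns the child's pid and stores the controller's
   ends in *to_child (write end) and *from_child (read end).
   Returns -1 on error. Name and signature are illustrative only. */
static pid_t spawn_with_pipes(const char *path, int *to_child, int *from_child)
{
    int down[2]; /* controller -> child */
    int up[2];   /* child -> controller */
    pid_t pid;

    if (pipe(down) < 0)
        return -1;
    if (pipe(up) < 0) {
        close(down[0]);
        close(down[1]);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(down[0]);
        close(down[1]);
        close(up[0]);
        close(up[1]);
        return -1;
    }

    if (pid == 0) {
        /* child: replace stdin/stdout with the pipe ends */
        dup2(down[0], STDIN_FILENO);
        dup2(up[1], STDOUT_FILENO);
        close(down[0]);
        close(down[1]);
        close(up[0]);
        close(up[1]);
        execl(path, path, (char *)NULL);
        _exit(127); /* only reached if exec failed */
    }

    /* parent: keep only its own ends */
    close(down[0]);
    close(up[1]);
    *to_child = down[1];
    *from_child = up[0];
    return pid;
}
```

The child needs no knowledge of the pipes at all: it simply reads requests from stdin and writes replies to stdout.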

My requirements for the data exchange mechanism were:
  • Simple to program and extend
  • The communication has to be synchronous
  • The programming interface has to be at high level, RPC like
The first thing to do was to transform an asynchronous channel into a synchronous one. To do this a small, low-overhead protocol was introduced. It just defines a framing structure to delimit the message boundaries and a scheme to multiplex different message types. It also introduces control messages for synchronization and link management.

The next step was to create a way of encapsulating C structures into the messages and labelling them so they can be demultiplexed on reception. No marshalling is necessary because both processes are on the same host. This involved defining, for each message, a function to be called to transmit it and a corresponding callback to be invoked on reception.
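
The framing and labelling idea can be sketched as follows. This is an assumed layout for illustration only: the header fields, their sizes and the function names (`frame_send`, `frame_recv`) are hypothetical, and the real libirpc wire format may differ. Because both processes run on the same host, the C structure is copied as raw bytes with no marshalling.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical frame header; libirpc's real wire format may differ. */
struct frame_hdr {
    uint16_t type; /* message label used to demultiplex on reception */
    uint16_t len;  /* payload length in bytes */
};

/* Write exactly len bytes, looping over short writes. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes, looping over short reads. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one labelled frame: header first, then the raw structure bytes. */
static int frame_send(int fd, uint16_t type, const void *payload, uint16_t len)
{
    struct frame_hdr h = { type, len };

    if (write_all(fd, &h, sizeof(h)) < 0)
        return -1;
    return write_all(fd, payload, len);
}

/* Receive one frame into payload (at most max bytes).
   Returns the payload length, or -1 on error or oversized frame. */
static int frame_recv(int fd, uint16_t *type, void *payload, uint16_t max)
{
    struct frame_hdr h;

    if (read_all(fd, &h, sizeof(h)) < 0)
        return -1;
    if (h.len > max)
        return -1;
    if (read_all(fd, payload, h.len) < 0)
        return -1;
    *type = h.type;
    return h.len;
}
```

On reception, the `type` field is what a dispatcher would use to pick the callback registered for that message.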

This can be done manually. As a matter of fact that is just what I did in the first product developed with this approach. It was also a way of validating the strategy without spending too much effort on tooling.
But for this to be generally useful, a tool to automatically generate the code was needed.

The irpcgen tool

To make development easier I created a tool to generate the stubs for the server and client, as well as sample server service calls. The program works pretty much like the Sun RPC rpcgen tool, except that instead of reading an RPCL input file (.x) it reads a standard C header (.h) file. This simplifies things a lot. You just need to write your API in a header file and use the functions on the client's side. The implementation of the functions will be on the server's side.

irpcgen reads the header file and creates stubs for all declared functions that can be used as RPCs. These functions represent the server's API and have to follow some rules:

1 - The return type has to be bool;
2 - The function must have at most 2 arguments;
3 - It cannot be declared static;
4 - If a second argument is provided, it has to be a pointer to something other than void.

Functions that fail to conform to any of these rules are not considered IRPC and no stub will be generated for them.

Furthermore, the direction of the data transmission is derived from the position and type of the arguments. The following cases are possible:

No arguments

Sends nothing and returns nothing (but invokes the corresponding callback on the server side).

bool my_rpc(void);

Single argument passed by value.

This is a server input value. This is more or less obvious, as the client cannot read anything back.

bool my_rpc_set(int val);
bool my_rpc_set(struct my_req req);

Single argument passed by constant reference.

This is similar to the previous case: the single argument is a server input value.

bool my_rpc_set(const struct my_req * req);

Single argument passed by reference.

In this case the argument is a return value from the server. The client should provide a pointer to a variable that will receive the data.

bool my_rpc_get(int * val);
bool my_rpc_get(struct my_rsp * rsp);

Two arguments

In this case the first argument is an input value and the second a return value from the server. The client should provide a pointer to a variable that will receive the data. Note that the second argument must be a non-constant reference. The first argument can be of any type except a void pointer (void *).

bool my_rpc_set_and_get(struct my_req * req, struct my_rsp * rsp);
bool my_rpc_set_and_get(const struct my_req * req, struct my_rsp * rsp);
bool my_rpc_set_and_get(struct my_req req, struct my_rsp * rsp);
bool my_rpc_set_and_get(int req, int * rsp);


If any argument is passed as a char pointer (char *) it will be treated as a NULL-terminated string. The rule for a single argument is the same as for references, i.e. if it's declared const it represents a client-to-server message, and the direction is reversed for non-const strings.

bool my_rpc_set_and_get(char * req, char * rsp);
bool my_rpc_set_and_get(const char * req, char * rsp);

bool my_rpc_set(const char * req);
bool my_rpc_get(char * rsp);

Service calls

The irpcgen tool will optionally create a ".h" file with "_svc" appended to the input file name. E.g. if the input is "my_rpc.h" the generated file will be "my_rpc_svc.h". The file will contain the signatures of the services to be implemented.

The file:

bool my_rpc_get(int * val);
bool my_rpc_set(int val);

Will produce:

bool my_rpc_get_svc(int * val);
bool my_rpc_set_svc(int val);

The "_svc" functions must implement the server behaviour. Optionally a "*_svc.c" file can be created with dummy functions. All you need to do is fill in the bodies of these functions to have a functional RPC system.
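
As an example of what filling in those bodies might look like, here is a hand-written sketch of the two services from the header above. It is not irpcgen output. The `stored_val` variable is invented for the illustration, and the code assumes that the bool return value reports success or failure to the caller.

```c
#include <stdbool.h>
#include <stddef.h>

/* Value held by the server process; purely illustrative. */
static int stored_val;

/* Server side of my_rpc_set(): receives the client's value. */
bool my_rpc_set_svc(int val)
{
    stored_val = val;
    return true;
}

/* Server side of my_rpc_get(): fills in the reply for the client.
   Assumption: returning false signals a failed call. */
bool my_rpc_get_svc(int *val)
{
    if (val == NULL)
        return false;
    *val = stored_val;
    return true;
}
```

On the client side, the application would simply call `my_rpc_set(42)` and `my_rpc_get(&v)` as if they were local functions.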

The libirpc

The libirpc is the companion of the irpcgen. The generated code depends on this library to run.

Source code

The irpcgen tool is GPL open-source and can be downloaded from: irpcgen.tar.gz
The package also contains the libirpc and a sample. The library is LGPL licensed.

There is a Makefile in each of the directories irpcgen, libirpc and sample. You need to compile irpcgen and libirpc before compiling the sample.

If you want to cross-compile the library and the sample for an embedded platform, set the environment variable CROSS_COMPILE to the prefix of your toolchain, e.g. export CROSS_COMPILE=arm-gnu-linux-.

Friday, October 12, 2012

YARD-ICE goes Open Source


YARD-ICE stands for Yet Another Remote Debugger - In Circuit Emulator. It is a hardware and software platform I recently made public on Google Code. The project goal is to design the hardware and software of a JTAG tool to program and debug ARM microcontrollers. The target audience includes developers of deep embedded systems with shallow pockets.

Link to the Project: YARD-ICE on Google Code

Why Another JTAG Tool?

There are tons of tools on the market. Why another one? There are three main reasons:

  1. Performance. Some basic, low-cost tools available on the market are really slow. One of the main reasons is that low-level operations are performed by the host PC, and the USB round trip is the one to blame. YARD-ICE solves this problem with an FPGA handling the serialization and other bit handling.
  2. Support for Linux/Mac platforms. Most ICE hardware lacks decent support for non-Windows platforms. There are some exceptions, but those are expensive tools with TCP/IP support. YARD-ICE is a TCP/IP based tool with an embedded GDB server. It's designed to work with any IDE supporting GDB, like Eclipse.
  3. Flexibility. Some tools are OK for some processors, but their best performance is tied to a certain proprietary tool. Scripting is not always an option, and when it exists it's in some obscure language or API with Windows DLL dependencies, and too slow. Why not write a simple shell or Python script on the host to automate a test or to program your systems in the factory? YARD-ICE provides a simple csh-like scripting capability, and you can run small scripts remotely through an ordinary TCP connection. And better than this: if you don't like the way we do things, or want to customize your tool, no worries. It's LGPL open source, meaning that you have what you need to do just that.

Apart from that I really like bit scrubbing. It's a good way of knowing the processor cores in depth.

Friday, October 5, 2012

Unix Select+Timers


When developing real-time network protocols and other time-sensitive embedded systems, it is common to have to read from one or more file descriptors while keeping track of various timeouts at the same time.

This post discusses a method to implement timers and file descriptor polling in a single loop. It uses very few resources and is relatively fast for a small number of timers and file descriptors. These conditions are usually met in embedded systems, where a device is neither allowed nor expected to serve too many clients.

The solution is fairly portable among UNIX-like OSs as it uses POSIX calls. I won't claim that this is the best method to do it, but I have to say it has been used successfully in some time-sensitive protocol implementations.


To implement the timers we have to keep track of the time, which is done by a clock. The clock is a monotonic counter obtained through the clock_gettime(CLOCK_MONOTONIC) system call. Usually it represents the absolute time since system start-up. The reference or epoch of the clock doesn't really matter; the important thing is that it can't be subject to jumps when the system time is corrected, e.g. by NTP.

The timers are represented in the same way as the clock. Active timers (not expired) have their times set in the future. Timers with a time in the past are expired and are considered inactive. We compare the clock value with the timer values to determine when a timer expires so that an appropriate action can be taken. One approach is to associate callback functions with the timers.

To improve the performance on 32-bit systems, we use only the lower 32 bits of the value for the clock and timers, so the time wraps every 4294 seconds, or about 71 minutes. That means that, to unequivocally determine whether a timer timeout is in the future, it should be at most 1/2 of the wrapping value, or about 35 minutes, ahead. This is more than enough for most real-time applications. If this is not your case, consider using a milliseconds clock reference (see below).

To set up a timer timeout it's just a matter of adding the timeout time in microseconds to the clock. In the example below we use a value of 0 to represent an inactive timer. So if the value of the clock plus the timeout time wraps to 0 we add 1 microsecond to avoid this condition. Other, more elaborate methods can be used, but this one has the advantage of avoiding an extra memory reference when polling the timers.
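
The wrap-around arithmetic is easy to get wrong, so here is a small isolated sketch of it. The helper names `timer_deadline` and `timer_pending` are hypothetical. The key point is that the comparison is made on the signed 32-bit difference between the deadline and the clock, which stays correct across a wrap as long as the timeout is shorter than half the wrapping period.

```c
#include <stdbool.h>
#include <stdint.h>

/* Longest timeout that can still be told apart from the past:
   half of the 32-bit wrapping period, in microseconds. */
#define TMO_MAX_US 0x7fffffffu

/* Expiry time for a timeout of tmo_us microseconds counted from now_us.
   0 is reserved for "inactive", so a result of 0 is nudged to 1. */
static uint32_t timer_deadline(uint32_t now_us, uint32_t tmo_us)
{
    uint32_t t = now_us + tmo_us; /* wraps modulo 2^32 */

    return (t == 0) ? 1 : t;
}

/* True if the deadline is still in the future; safe across wraps
   provided the original timeout was at most TMO_MAX_US. */
static bool timer_pending(uint32_t deadline, uint32_t now_us)
{
    return (int32_t)(deadline - now_us) > 0;
}
```

For instance, with the clock at 0xffffff00 and a timeout of 0x200 microseconds, the deadline wraps to 0x100 and is still reported as pending until the clock itself wraps and reaches 0x100.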


The idea is to use the select() system call to poll the file descriptors, adjusting the timeout parameter according to the expiration time of the timers. We compare all the timers with the current clock and select the smallest difference, higher than zero, between the expiration time and the current time.

The select() system call has the advantage of being fast and conservative regarding resource usage for a small number of file descriptors. The cost of the call depends less on the number of file descriptors than on the value of the highest file descriptor in the set. Another advantage of select() is portability.

#include <stdlib.h>
#include <stdint.h>
#include <errno.h>
#include <time.h>
#include <sys/select.h>

#define ONE_SECOND 1000000
#define TMR_MAX 8
#define FD_MAX 8

/* get the system monotonic clock value in microseconds
   (only the lower 32 bits are kept, so it wraps every ~71 minutes) */
static uint32_t get_clock_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ((uint32_t)ts.tv_sec * 1000000) + (uint32_t)(ts.tv_nsec / 1000);
}

/* the maximum timer timeout allowed is
   2147 seconds ~ 35 minutes */

uint32_t tmr[TMR_MAX]; /* List of timers */
unsigned int tmr_cnt; /* Number of timers in the list */

int fd[FD_MAX]; /* List of file descriptors */
unsigned int fd_cnt; /* Number of descriptors in the list */

static void * my_task(void * arg)
{
    struct timeval tv;
    uint32_t clock;
    int32_t dt_min;
    int fd_max;
    fd_set rs;
    int ret;
    unsigned int i;

    (void)arg;

    for (;;) {
        /* get the current time in microseconds */
        clock = get_clock_us();

        /* clear the fd set */
        FD_ZERO(&rs);
        /* initialize dt_min to 1 minute */
        dt_min = 60 * ONE_SECOND;
        /* initialize fd_max */
        fd_max = 0;

        for (i = 0; i < tmr_cnt; i++) {
            int32_t dt;

            if (tmr[i] == 0) /* timer is inactive */
                continue;
            if ((dt = (int32_t)(tmr[i] - clock)) <= 0) {
                /* timer timeout: handle the expiry here (e.g. call the
                   timer's callback); clearing the timer is an assumed
                   placeholder for that handling */
                tmr[i] = 0;
            } else if (dt < dt_min) {
                /* adjust the minimum timeout time */
                dt_min = dt;
            }
        }

        /* split the timeout so that tv_usec stays below one second */
        tv.tv_sec = dt_min / ONE_SECOND;
        tv.tv_usec = dt_min % ONE_SECOND;

        for (i = 0; i < fd_cnt; i++) {
            if (fd[i] != -1) {
                FD_SET(fd[i], &rs);
                if (fd[i] > fd_max)
                    fd_max = fd[i];
            }
        }

        ret = select(fd_max + 1, &rs, NULL, NULL, &tv);

        if (ret < 0) {
            if (errno == EINTR) /* select() interrupted */
                continue;
            /* select() failed */
            return NULL;
        }

        for (i = 0; i < fd_cnt; i++) {
            if ((fd[i] != -1) && FD_ISSET(fd[i], &rs)) {
                /* read from the file descriptor */
            }
        }
    }
}

void timer_set(unsigned int id, unsigned int tmo_us)
{
    tmr[id] = get_clock_us() + tmo_us;
    if (tmr[id] == 0) /* 0 means inactive, so skip it */
        tmr[id]++;
}

I've tried to keep the example as short as possible, so the structure is far from ideal in terms of encapsulation. If the list of timers or file descriptors changes dynamically, a mutual exclusion mechanism should be implemented as well. This is to avoid race conditions when evaluating the timers or the file descriptors.

Minor improvements

It may be a good idea to avoid arithmetic divisions on platforms that don't have a div instruction, like ARM v4 and v5 (ARM7-9). This will improve the performance a little bit. The following code is an alternative to the original one that uses sums of shifts to get an approximation of the 'by 1000' division when calculating the number of microseconds.

static inline uint32_t get_sys_clock_us(void)
{
   struct timespec ts;

   clock_gettime(CLOCK_MONOTONIC, &ts);
   /* This is a fast, no division, good approximation to:
      tv_nsec / 1000. The maximum error is 74 microseconds.
      It costs only 5 instructions on ARMv5 */
   return (ts.tv_sec * 1000000) + (ts.tv_nsec >> 10) +
   (ts.tv_nsec >> 15) - (ts.tv_nsec >> 17) + (ts.tv_nsec >> 21);
}
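
If you want to check the quality of such a shift-based approximation on your own host before committing to it, a small test helper like the one below can be used. It is only a sketch: `approx_div1000` repeats the expression above, while `max_shortfall` is a hypothetical helper that samples one second's worth of nanosecond values and reports how far the approximation falls below the exact division.

```c
#include <stdint.h>

/* Shift-and-add approximation of ns / 1000 (same expression as above). */
static uint32_t approx_div1000(uint32_t ns)
{
    return (ns >> 10) + (ns >> 15) - (ns >> 17) + (ns >> 21);
}

/* Largest shortfall (exact minus approximate) found when sampling
   every `step` nanoseconds over one second. step must be non-zero. */
static uint32_t max_shortfall(uint32_t step)
{
    uint32_t worst = 0;
    uint32_t ns;

    for (ns = 0; ns < 1000000000u; ns += step) {
        uint32_t exact = ns / 1000;
        uint32_t approx = approx_div1000(ns);

        if (exact > approx && exact - approx > worst)
            worst = exact - approx;
    }
    return worst;
}
```

A small `step` gives a more thorough scan at the cost of a longer run; the shortfall grows with the nanosecond value, so the worst cases sit near the end of each second.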

If timers longer than 35 minutes are needed, the clock function can be modified to count in milliseconds instead of microseconds. The non-division implementation of the clock function follows, along with the conversion to microseconds used to set up the timeval struct:

static uint32_t get_clock_ms(void)
{
   struct timespec ts;

   clock_gettime(CLOCK_MONOTONIC, &ts);
   /* This is a fast, no division approximation to:
      tv_nsec / 1000000 (it reads about 2.6% low) */
   return (ts.tv_sec * 1000) + (ts.tv_nsec >> 20) + (ts.tv_nsec >> 25) -
      (ts.tv_nsec >> 26) + (ts.tv_nsec >> 28) + (ts.tv_nsec >> 29);
}

...

tv.tv_usec = dt_min * 1000;

Friday, February 10, 2012

The Espresso Machine

This tale begins in Brazil, in winter time. I mean winter in the northern hemisphere. Naturally it was summer in South America, where we fled to escape the peak of Canada's cold (it turns out that the winter was not that bad this year). Anyway, my wife and I were on vacation visiting our relatives there. While my wife went to the north-east part of the country, I had to go to the capital of Minas Gerais state, the city of Belo Horizonte. That is where my younger sister has been living.

I won't say that I do not appreciate a good espresso, but I'm more of a tea kind of guy. Even someone as inexperienced as I am has to admit that there is something rather pleasant in the taste of a good coffee extracted by a skilled barista. That was surely the case when we went to a coffee shop called Kahlúa. On the recommendation of my brother-in-law, as well as my sister, I tasted two 'single origin' coffees ('single origin' being the opposite of 'blend', as I learned from them). The first one is called Araponga and the other one Sul-de-Minas Especial (South of Minas Gerais Special); to be more precise, we tasted the latter first. I may fail to describe the sensation of smelling the 'exquisite' aroma, a mix of the brew and the freshly roasted beans. They were roasting the coffee while we were at the store. All that I can say is that the coffees were amazing, neither bitter nor sour, just perfect. So much so that I couldn't help myself and bought two packets right away. One for myself, my wife and dog (you have to know the dog to understand), and the other one for a couple of friends who were 'dog sitting' our little cockapoo. It is worth mentioning that the beans were medium roasted, packed and sealed while we were in the store. This preserves most of the characteristics of the coffee, I suppose.

All very well, except for the fact that we had neither the grinder to get a coffee powder nor the espresso machine to brew it into something worth drinking. On returning to Toronto, the first thing I did was to look for machines and learn a little bit about the art of espresso making. Well, there is a plethora of ways to brew coffee and a lot of different types of machines to make espresso variants. The choice of a particular type of machine will depend, as we learned, on how much you want to be involved in the process of coffee making. It can range from completely manual to fully automated. In some matters, such as food and beverages, I like to be in control of the preparation whenever possible, or at least be part of it. Although I don't classify myself as a gourmet, I like to fancy myself a reasonable cook. So I decided to venture into this new endeavour of espresso making.

After some googling around, I settled on the Rancilio Silvia espresso machine and the Baratza Vario grinder. The main reasons were the good reviews of both machines on several sites like CoffeeGeek, and the fact that the bundle was within the budget we had available. I located a store in Mississauga (a city near Toronto) which had this particular combination in a promotional package, along with some accessories and 1 kg of coffee beans. On the first Saturday after arriving home, we went there. I must say that I was very impressed by the store, which turned out to be much larger than I expected. The person who took care of us there was very kind and knowledgeable. We had the opportunity to test the machines on the spot, clarify some doubts and taste coffees. Needless to say, we bought the package and other stuff we deemed necessary to complete the espresso experience. These included: a calibrated tamper, a knock box, some 'vacuum' sealed containers for the beans and a new water filter. In the picture below you can see how the two machines are happily installed in our dining room.

Rancilio Silvia and Baratza Vario

ARM-GCC Toolchain How-To

Once in a while I have to compile the GCC toolchain (Binutils, GCC, GDB) for a new platform, either because I want some new feature, because of a bug fix, or after installing a new operating system. As I don't do this often, I always have trouble remembering some of the steps. That's why I'm posting it here.

Before you go any further I want to point out that we will not cover here how to compile the C++ compiler (g++) - this requires the compilation of a runtime library, and is a little more challenging. Only the C language will be supported, and no C library (libc) will be generated either. This will be, for sure, a limiting factor for almost everybody except those who are developing system software.

This tutorial explains how to compile a cross GCC toolchain for ARM processors on an Ubuntu 10.04 LTS host machine. It will probably work fine on other Ubuntu releases as well, but please be aware that there is a good chance of these procedures failing if you use a different combination of OS and source code (other versions of GCC, binutils or GDB).
So here we go. First of all, let's get the packages:

Downloading the source code

cd /tmp

Now let's prepare the environment to compile and install. I usually install the tools in a subdirectory under the /opt directory. In this case we will be installing in the /opt/arm-none-eabi directory. The binaries (gcc, gdb and such) will be located in the /opt/arm-none-eabi/bin subdirectory and will be prefixed by "arm-none-eabi" (arm-none-eabi-as, arm-none-eabi-gcc, ...).

Installing development libraries

sudo apt-get install libmpfr-dev libgmp3-dev libmpc-dev
sudo apt-get install libz-dev

The first line installs the MPFR, GMP and MPC development libraries, which have been required to compile GCC since version 4.3.
The last line adds the zlib development package, as you may get an error when compiling the zlib provided with GCC.

Creating a build tree

Assuming that all the source code files were downloaded to the /tmp directory, and that we will compile in our home directory:

mkdir gcc-toolchain
cd gcc-toolchain
bzip2 -dc /tmp/binutils-2.22.tar.bz2 | tar -vxf -
bzip2 -dc /tmp/gcc-core-4.6.2.tar.bz2 | tar -vxf -
bzip2 -dc /tmp/gdb-7.4.tar.bz2 | tar -vxf -
mkdir arm-none-eabi
cd arm-none-eabi
mkdir binutils-2.22
mkdir gcc-4.6.2
mkdir gdb-7.4
export PATH=/opt/arm-none-eabi/bin:/bin:/usr/bin

The last line sets up the PATH for the compilation. Notice that the first entry (/opt/arm-none-eabi/bin) does not exist yet, but it will be created when installing the binutils and is necessary for compiling GCC.

Compiling GNU binutils

First let's do the basics: assembler, archiver, linker and object files utilities.

cd binutils-2.22
../../binutils-2.22/configure --prefix=/opt/arm-none-eabi --target=arm-none-eabi --disable-nls
make -j 8
sudo make install
cd ..

Compiling GCC

If everything went well, we are good to compile the cross-compiler. To make sure, check the /opt/arm-none-eabi/bin directory: the whole "arm-none-eabi-*" family of binutils must be there.
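The check above can also be scripted. This is only a minimal sketch: the helper name `missing_tools` is made up for illustration, and the list of programs (as, ar, ld, nm, objcopy, objdump) is my assumption of a reasonable subset of what binutils installs, not an exhaustive list.

```shell
# List which of the expected binutils programs are missing from a bin directory.
# Usage: missing_tools BIN_DIR TOOL...
# (missing_tools is a hypothetical helper, not part of binutils.)
missing_tools() {
    bin_dir=$1
    shift
    for tool in "$@"; do
        # Print the full program name when it is absent or not executable.
        [ -x "$bin_dir/arm-none-eabi-$tool" ] || echo "arm-none-eabi-$tool"
    done
}

# Prints nothing when every listed tool is in place.
missing_tools /opt/arm-none-eabi/bin as ar ld nm objcopy objdump
```

If the call prints any name, the binutils installation step most likely did not complete and should be repeated before configuring GCC.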

cd gcc-4.6.2
../../gcc-4.6.2/configure --prefix=/opt/arm-none-eabi --target=arm-none-eabi --disable-nls --disable-libssp --disable-zlib --enable-languages="c"
make -j 8
sudo make install
cd ..

GCC is up, let's see if it's running:
$ arm-none-eabi-gcc
arm-none-eabi-gcc: fatal error: no input files
compilation terminated.

If you got that message your compiler is fine.
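A slightly stronger check is to compile a real source file. The sketch below is an assumption about how one might do it: the file name smoke.c and the function in it are invented for the test. It compiles only (-c) and uses -ffreestanding, because no C library was built and therefore there is nothing to link against.

```shell
# Write a tiny freestanding C source: no headers and no libc calls.
workdir=$(mktemp -d)
cat > "$workdir/smoke.c" <<'EOF'
int add(int a, int b)
{
    return a + b;
}
EOF

# Compile only (-c); the link step is skipped on purpose.
if command -v arm-none-eabi-gcc >/dev/null 2>&1; then
    arm-none-eabi-gcc -ffreestanding -c "$workdir/smoke.c" -o "$workdir/smoke.o" \
        && echo "object file written to $workdir/smoke.o"
else
    echo "arm-none-eabi-gcc not found in PATH" >&2
fi
```

If the object file is produced, both the compiler and the assembler from the binutils step are working together.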

Compiling GDB
As an optional step, you can compile GDB. This will allow you, with the right tool, to remotely debug your embedded application.

cd gdb-7.4
../../gdb-7.4/configure --prefix=/opt/arm-none-eabi --target=arm-none-eabi --disable-nls
make -j 8
sudo make install
cd ..

Update your PATH

You have to include the newly created toolchain bin directory into your PATH environment. Edit .bashrc, in your home directory, and add the following line:

export PATH=$PATH:/opt/arm-none-eabi/bin

For the changes to take effect you will have to restart the terminal or source your .bashrc with:

$ source ~/.bashrc
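One side effect of sourcing .bashrc repeatedly is that the plain export line appends the directory to PATH every time. A minimal sketch of a guard that could go in .bashrc instead is shown here; `path_append` is a name I made up, not a standard shell function.

```shell
# Append a directory to PATH only if it is not already listed.
# (path_append is a hypothetical helper for illustration.)
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

path_append /opt/arm-none-eabi/bin
export PATH
```

Wrapping PATH in colons before matching makes sure only whole entries are compared, so a directory such as /opt/arm-none-eabi/bin-old would not be mistaken for the one we want.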

/!\ Attention: the -j 8 parameter in the make line allows for parallel building, which speeds up the compilation process quite a lot. But, from my experience, I recommend not using -j alone (without a number), as this places no limit on the number of parallel jobs and may result in a non-responsive computer, and sometimes the compilation itself or some other applications may crash. For that reason always set the number of jobs to match the number of cores or threads your machine has. For example, I'm using an Intel Core i7 with 4 cores and 2 threads per core (Intel Hyper-Threading), so I use -j 8.
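If you don't want to hard-code the number, the job count can be derived from the machine itself. The following is a sketch under the assumption that your host provides `getconf _NPROCESSORS_ONLN` (glibc-based systems do); the function name `make_jobs` is my own invention, and it falls back to a single job when the query fails.

```shell
# Pick a job count for "make -j" from the number of online CPUs.
# (make_jobs is a hypothetical helper; getconf _NPROCESSORS_ONLN is assumed
# to be available on the host.)
make_jobs() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case "$n" in
        ''|*[!0-9]*) n=1 ;;         # empty or non-numeric answer: use one job
    esac
    echo "$n"
}

# Show the make command line that would be used on this machine.
echo "make -j $(make_jobs)"
```

In the build steps you would then write make -j "$(make_jobs)" in place of make -j 8.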

Other tips


In some operating systems the GMP and MPFR libraries required to compile GCC are outdated. To solve this, we download their source code and tell the configure script where to find them:

bzip2 -dc /tmp/gmp-4.3.2.tar.bz2 | tar -vxf -
bzip2 -dc /tmp/mpfr-3.1.0.tar.bz2 | tar -vxf -


Depending on what your projects are, you will most probably need a C library. NewLib is a good option and you can compile it along with GCC.
Quoting from Newlib's website: "Newlib is a C library intended for use on embedded systems. It is a conglomeration of several library parts, all under free software licenses that make them easily usable on embedded products."

cd /tmp
cd ~/gcc-toolchain
gzip -dc /tmp/newlib-1.20.0.tar.gz | tar -vxf -

In the GCC compilation step you need to tell configure that you want to use NewLib as your default C library: --with-newlib
As NewLib provides support for building the run-time elements of C++, we can enable C++ in the GCC compilation as well: --enable-languages="c,c++"

cd gcc-4.6.2
../../gcc-4.6.2/configure --prefix=/opt/arm-none-eabi --target=arm-none-eabi --disable-nls --disable-libssp --disable-zlib --enable-languages="c,c++" --with-newlib --with-headers=../../newlib-1.20.0/newlib/libc/include 
make -j 8
sudo make install
cd ..

Now you can compile the library:

mkdir newlib-1.20.0
cd newlib-1.20.0
../../newlib-1.20.0/configure --prefix=/opt/arm-none-eabi --target=arm-none-eabi
make -j 8
sudo make install
cd ..

NewLib Notes

If you want to use NewLib to do I/O, dynamic memory allocation, file operations and some other functions, you will need to create an OS adaptation layer. I may write a tutorial on the subject one of these days.


I've tried the compilation sequence on my netbook running Lubuntu 12.04 and it worked like a charm. If you have a resource-limited or old computer, or simply can't stomach the new Gnome/Ubuntu interface, I really recommend Lubuntu: