Mostly Embedded: Embedded systems and other personal interests.<br />
Robinson Mittmann, 2015-11-27<br />
Script Language for Memory Constrained Embedded Systems<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
A little digression on my quest for the ideal embedded scripting language.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<br /></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
For quite some time I have wanted to add scripting capability to a system I created as an open source project: <a href="https://code.google.com/p/yard-ice/" style="color: #1155cc;" target="_blank">https://code.google.com/<wbr></wbr>p/yard-ice/</a></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<br /></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
After some back and forth with languages like eLua, I concluded that, although they are powerful, their run-time footprint was far beyond what I could afford. At the time I couldn't find anything suitable.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
Then recently, while developing a tool for the company I'm working for right now, I felt the need for some mechanism to flexibly configure the system. My answer to provide this flexibility was, again, an embedded scripting language.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<br /></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
I revisited some options previously considered and also looked at new ones. Frustrated with the outcome, I decided to write my own scripting language and run-time environment. I started studying compiler theory, attended some online classes, and after a while I was ready to go.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<br /></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
The first thing was to create a list of requirements for the language:</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- The language is not meant to be a general purpose one, so we can do without some complex constructs.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- The main use is to bind existing functions of the system in order to provide dynamic response to events. A sort of user defined behavioural response (the ultimate flexibility for configuration).</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- Small runtime footprint.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- Should run portable p-code (bytecode) in a compact virtual machine. It should be possible to generate the bytecode stream on a host and transfer it to run on the target.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- Support complex integer arithmetic and logic expressions.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- 32-bit integer variables are a requirement.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- Exception handling is desirable as a clean way of dealing with errors.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- Strings are optional, especially string manipulation routines, because they require dynamic memory allocation.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- For strings, it must support constant strings in non-volatile memory.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- It's not required to support user defined functions.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- The system can support multiple scripts. Possibly one script for each type of event.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- Support global user defined variables shared among scripts, meaning that when one script is fired in response to an event it can record some information to be used by another script later on.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- The syntax must be close to a common programming language, like C or Java.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- The compiler must be compact enough to be embedded in the target system.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- No optimization of the code is necessary.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- The syntax analysis should be relatively strong (difficult to measure). The idea is to do a certain amount of static analysis to reduce the run-time checking as much as possible. Some common constructs can be left out of the language for this reason.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- If possible, avoid complex run-time memory management support. It tends to be expensive in terms of memory and/or CPU usage.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- It should be possible to compile the code one chunk at a time. The compiler has to save its state and resume its operation when more code becomes available. This is needed in order to compile code embedded as arrays of strings in JSON files without resorting to stitching the strings together before compiling the full text.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<br /></div>
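<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
To illustrate the last requirement, here is a minimal sketch (in Python, for readability on a host) of a scanner that keeps its state between chunks. The names <code>ChunkedLexer</code>, <code>feed</code> and <code>finish</code> are hypothetical and this is not the MicroJS scanner; it only shows how a token split across two strings can be completed when the next chunk arrives.</div>

```python
class ChunkedLexer:
    """Toy resumable scanner: words, integers and single-character symbols.

    Hypothetical illustration only; not the MicroJS scanner.
    """

    def __init__(self):
        self.pending = ""   # partial word or number carried across chunks
        self.tokens = []    # tokens completed so far

    def _flush(self):
        # Emit the pending word, if any, as a finished token.
        if self.pending:
            self.tokens.append(self.pending)
            self.pending = ""

    def feed(self, chunk):
        """Consume one chunk; a token may continue in the next chunk."""
        for ch in chunk:
            if ch.isalnum() or ch == "_":
                self.pending += ch          # still inside a word or number
            else:
                self._flush()               # the word ended at this character
                if not ch.isspace():
                    self.tokens.append(ch)  # single-character symbol

    def finish(self):
        """Signal end of input and return all tokens."""
        self._flush()
        return self.tokens


# Source text split at arbitrary points, as if stored as a JSON array of strings.
chunks = ["var cou", "nt = 1", "0; count = count + 1;"]
lexer = ChunkedLexer()
for piece in chunks:
    lexer.feed(piece)
tokens = lexer.finish()
print(tokens)
```

<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
The only state carried between calls is the partial token, so the chunks never have to be concatenated in memory. A real compiler would also have to carry its parser state in the same way.</div>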
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
From these initial requirements, some others were added due to the limitations of syntax analysis with very little memory:</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- It has to be a one-pass compiler (translator); no parse tree or abstract syntax tree can be generated due to the memory limitation.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- It can't be a recursive compiler because recursion is too heavy on the stack. Also, memory limit checks for the stack are hard to implement. So a recursive descent parser, although simple to implement, was out of the equation.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<br /></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
These requirements narrowed down the solution to an LL(1) grammar and a table-based Syntax Directed Translator. The problem now was how to generate a compact lexical analyzer and parser. The lexer (scanner) was solved with handcrafted, specialized code, which was not particularly difficult.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
The next dragon to slay was the parser. I tried to find tools to generate the tables for LL(1) grammars but found nothing good. Some tools crashed as soon as the grammar grew a little more complex. Also, the generated tables were very large, unsuitable for what I wanted. The solution was to write my own parser generator for LL(1) grammars. But wait: searching the internet I found a perfect starting point, a tool developed by Prof. Ivo Mateljan from the University of Split, Croatia. It was code he wrote for his students in a computer science class. The program, called ELL, had almost everything I needed. It could parse a grammar, create the first and follow sets, and create the list of predictions for each rule. I asked Prof. Ivo to modify his program, which he promptly and generously did. Then I added code to generate the tables as C code and an extension to insert semantic actions inside the productions in the grammar. The big trick was a method I devised to binary search for the correct rule in the predictions list instead of using a single lookup table. ELL now generates 4 tables, 2 functions and a set of constant definitions that form the skeleton of a Syntax Directed Translator. It's pretty neat. I hope to create an open source project out of it, as it may help other people as well.</div>
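<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
The tables ELL generates are not shown here, so the following is only a rough sketch of the general idea, written in Python and under my own assumptions: each nonterminal gets a short list of (lookahead, rule) predictions sorted by lookahead, the rule is found with a binary search, and the parser uses an explicit stack instead of recursion. The toy grammar, the symbol codes and the names <code>PREDICT</code>, <code>find_rule</code> and <code>parse</code> are all hypothetical.</div>

```python
from bisect import bisect_left

# Terminals are encoded as small integers so they can be sorted and compared.
NUM, PLUS, LPAR, RPAR, EOF = range(5)
# Nonterminals use codes above the terminal range.
E, E_TAIL, T = 100, 101, 102

# Right-hand side of each rule of a toy expression grammar.
RULES = [
    (T, E_TAIL),        # 0: E      -> T E_TAIL
    (PLUS, T, E_TAIL),  # 1: E_TAIL -> + T E_TAIL
    (),                 # 2: E_TAIL -> (empty)
    (NUM,),             # 3: T      -> num
    (LPAR, E, RPAR),    # 4: T      -> ( E )
]

# Prediction list per nonterminal: (lookahead, rule), sorted by lookahead.
PREDICT = {
    E:      [(NUM, 0), (LPAR, 0)],
    E_TAIL: [(PLUS, 1), (RPAR, 2), (EOF, 2)],
    T:      [(NUM, 3), (LPAR, 4)],
}


def find_rule(nonterminal, lookahead):
    """Binary search the sorted prediction list; None means syntax error."""
    entries = PREDICT[nonterminal]
    # (lookahead,) sorts just before any (lookahead, rule) pair.
    i = bisect_left(entries, (lookahead,))
    if i < len(entries) and entries[i][0] == lookahead:
        return entries[i][1]
    return None


def parse(tokens):
    """Non-recursive LL(1) driver; returns the sequence of rules applied."""
    stack = [EOF, E]            # explicit stack replaces recursion
    applied = []
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top < 100:           # terminal: it must match the input
            if top != look:
                raise SyntaxError("unexpected token at position %d" % pos)
            pos += 1
        else:                   # nonterminal: expand using the predicted rule
            rule = find_rule(top, look)
            if rule is None:
                raise SyntaxError("unexpected token at position %d" % pos)
            applied.append(rule)
            stack.extend(reversed(RULES[rule]))
    return applied


# num + ( num )
rules_used = parse([NUM, PLUS, LPAR, NUM, RPAR, EOF])
print(rules_used)
```

<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
Compared with a full nonterminal-by-terminal lookup table, which is mostly empty for a real grammar, the sorted lists store only the valid combinations, at the cost of a few comparisons per expansion.</div>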
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<br /></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
The result is a language which I'm provisionally calling "MicroJS" because it resembles JavaScript.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
As it stands right now, the minimum compiled code targeting an ARM Cortex-M3 microcontroller takes 9064 bytes of FLASH (code) and 416 bytes of RAM, plus some 128 bytes for the stack. This includes the compiler, the virtual machine and a small library with some 9 functions, including a "printf".</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
A more realistic example is a system that includes a serial driver, a console shell, a small basic flash filesystem, and some basic commands to manage files and to upload scripts using the Xmodem protocol. All of that costs 17720 bytes of code, 968 bytes of memory and around 1256 bytes of stack (the problem here is the Xmodem, which requires a 1 KiB buffer; a better implementation could reuse the MicroJS space for the Xmodem buffer, which would reduce the total memory requirement to around 1.5 KiB).</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<br /></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
In case you are wondering what type of code I can run, here are 2 examples of the code I used to test the system described above:</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<br /></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
Example 1:</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<pre style="color: black; font-size: medium; white-space: pre-wrap;"><span style="color: blue;">//</span>
<span style="color: blue;">// Generate the Fibonacci sequence up to the maximum 32 bits signed integer</span>
<span style="color: blue;">//</span>
<span style="color: teal;">var</span> x0 = 0, x1 = 1, x;
<span style="color: #804040; font-weight: bold;">try</span> <span style="color: teal;">{</span>
    <span style="color: teal;">var</span> i = 1;
    <span style="color: #804040; font-weight: bold;">while</span> (1) <span style="color: teal;">{</span>
        <span style="color: blue;">// Check whether the next sum will overflow or not</span>
        <span style="color: #804040; font-weight: bold;">if</span> (0x7fffffff - x1 <= x0) <span style="color: teal;">{</span>
            <span style="color: #804040; font-weight: bold;">throw</span> 1; <span style="color: blue;">// overflow</span>
        <span style="color: teal;">}</span>
        x = x1 + x0;
        x0 = x1;
        x1 = x;
        printf(<span style="color: magenta;">"%2d | %10u</span><span style="color: slateblue;">\n</span><span style="color: magenta;">"</span>, i, x);
        i = i + 1;
    <span style="color: teal;">}</span>
<span style="color: teal;">}</span> <span style="color: #804040; font-weight: bold;">catch</span> (err) <span style="color: teal;">{</span>
    printf(<span style="color: magenta;">" - overflow error!</span><span style="color: slateblue;">\n</span><span style="color: magenta;">"</span>);
<span style="color: teal;">}</span></pre>
<pre style="white-space: pre-wrap;"><div style="font-family: arial; font-size: small; white-space: normal;">
<pre style="white-space: pre-wrap;"><div style="font-family: arial; white-space: normal;">
Produces the output:</div>
</pre>
</div>
<div>
<pre style="white-space: pre-wrap;"><span style="color: black; font-size: small;">[JS]$ js fib.js
"fib.js"
Code: 85 bytes.
Data: 12 bytes.
1 | 1
2 | 2
3 | 3
4 | 5
5 | 8
6 | 13
7 | 21
8 | 34
9 | 55
10 | 89
11 | 144
12 | 233
13 | 377
14 | 610
15 | 987
16 | 1597
17 | 2584
18 | 4181
19 | 6765
20 | 10946
21 | 17711
22 | 28657
23 | 46368
24 | 75025
25 | 121393
26 | 196418
27 | 317811
28 | 514229
29 | 832040
30 | 1346269
31 | 2178309
32 | 3524578
33 | 5702887
34 | 9227465
35 | 14930352
36 | 24157817
37 | 39088169
38 | 63245986
39 | 102334155
40 | 165580141
41 | 267914296
42 | 433494437
43 | 701408733
44 | 1134903170
45 | 1836311903
- overflow error!
[JS]$</span></pre>
</div>
<div style="font-family: arial; font-size: small; white-space: normal;">
</div>
<div style="font-family: arial; font-size: small; white-space: normal;">
Example 2:</div>
</pre>
</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<span style="color: teal;"></span><br />
<pre style="color: black; font-size: medium; white-space: pre-wrap;"></pre>
<span style="color: teal;">
<pre style="color: black; font-size: medium; white-space: pre-wrap;"><pre style="white-space: pre-wrap;"><span style="color: blue;">// Print a list of 100 random prime numbers</span>
<span style="color: blue;">//</span>
<span style="color: teal;">var</span> j, cnt = 0;
srand(time()); <span style="color: blue;">// initialize random number generator</span>
printf(<span style="color: magenta;">"----------------------</span><span style="color: slateblue;">\n</span><span style="color: magenta;">"</span>);
<span style="color: #804040; font-weight: bold;">for</span> (j = 0; j < 100; ) <span style="color: teal;">{</span>
    <span style="color: teal;">var</span> n = rand();
    <span style="color: teal;">var</span> prime;
    <span style="color: #804040; font-weight: bold;">if</span> (n <= 3) <span style="color: teal;">{</span>
        prime = n > 1;
    <span style="color: teal;">}</span> <span style="color: #804040; font-weight: bold;">else</span> <span style="color: teal;">{</span>
        <span style="color: #804040; font-weight: bold;">if</span> (n % 2 == 0 || n % 3 == 0) <span style="color: teal;">{</span>
            prime = <span style="color: magenta;">false</span>;
        <span style="color: teal;">}</span> <span style="color: #804040; font-weight: bold;">else</span> <span style="color: teal;">{</span>
            <span style="color: teal;">var</span> i;
            <span style="color: teal;">var</span> m;
            m = sqrt(n) + 1;
            prime = <span style="color: magenta;">true</span>;
            <span style="color: #804040; font-weight: bold;">for</span> (i = 5; (i < m) && (prime); i = i + 6) <span style="color: teal;">{</span>
                <span style="color: #804040; font-weight: bold;">if</span> (n % i == 0 || n % (i + 2) == 0) <span style="color: teal;">{</span>
                    prime = <span style="color: magenta;">false</span>;
                <span style="color: teal;">}</span>
            <span style="color: teal;">}</span>
        <span style="color: teal;">}</span>
    <span style="color: teal;">}</span>
    <span style="color: #804040; font-weight: bold;">if</span> (prime) <span style="color: teal;">{</span>
        j = j + 1;
        printf(<span style="color: magenta;">"%3d %12d</span><span style="color: slateblue;">\n</span><span style="color: magenta;">"</span>, j, n);
    <span style="color: teal;">}</span>
    cnt = cnt + 1;
<span style="color: teal;">}</span>
printf(<span style="color: magenta;">"----------------------</span><span style="color: slateblue;">\n</span><span style="color: magenta;">"</span>);
<span style="color: teal;">var</span> x = (j * 10000) / cnt;
printf(<span style="color: magenta;">"%d out of %d are prime, %d.%02d %%.</span><span style="color: slateblue;">\n</span><span style="color: magenta;">"</span>,
       j, cnt, x / 100, x % 100);
printf(<span style="color: magenta;">"---</span><span style="color: slateblue;">\n\n</span><span style="color: magenta;">"</span>);</pre>
</pre>
</span></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
The result from the console (the intermediate values were cut):</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<pre style="white-space: pre-wrap;"><span style="color: black; font-size: small;">[JS]$ js prime.js
"prime.js"
Code: 230 bytes.
Data: 12 bytes.
----------------------
1 1840531613
2 1518954509
...</span></pre>
<pre style="color: black; font-size: medium; white-space: pre-wrap;"> 98 1946156671
99 821160383
100 359376917
----------------------
100 out of 2022 are prime, 4.49 %.
---
[JS]$
</pre>
</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<br /></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
This can give you an idea of the syntax and capabilities of the language.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
The compiled code size is reasonable, and the execution speed is considerably good. But this is just the impression I have from the complexity of the prime algorithm: it took 28 seconds to test 2022 32-bit numbers for primality on a 16 MHz machine. It won't break any cryptosystem, but it seems good enough for an embedded scripting language.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<br /></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
Some observations about the language:</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<br /></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- There are no increment (++) or decrement (--) operators.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- The only assignment operator allowed is equals (=), in contrast with C's compound assignments like +=, *= ...</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- The "for", "if" and "while" structures require the statements to be surrounded by braces "{ }". This avoids the famous dangling "else" issue, which is hard to handle with LL(1) grammars.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- There is no "switch/case" construct in the language.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- All variables are 32-bit signed integers. Although the language accepts strings, chars and booleans, they are stored and treated internally as signed integers.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- There is no support (for the moment) for "break" and "continue" statements.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- There is no "goto" construct.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- No user defined functions. All callable functions are provided by a library defined at compile time. This was a difficult decision. Although it's not that complicated to allow functions, their misuse can lead to problems that are difficult to handle, like recursive calls that may exhaust the stack very quickly. Also, static analysis is much simpler with no function calls to deal with. Library calls are easy to handle as they don't use the VM's memory space to run.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- Variadic functions are allowed. Yippee. I can't live without printf()...</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- Functions returning multiple values are allowed (work in progress), with the not so common construct:</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<span style="font-family: "courier new" , monospace;">(x, y) = get_point();</span></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- There is a default catch all exception handler which silently terminates the script. The exception number is returned as a return value of the virtual machine. So it's better not to throw a 0 exception, which will be difficult to catch.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- There are no real differences between logical and bitwise AND and OR operators, which will perform as expected on boolean values anyway. So "&" and "&&" are interchangeable.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- BUG: There is a small limitation (which I plan to fix soon) in the precedence of the "*" and "/" operators: they are at the same level and evaluated from right to left (easy to solve).</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- TODO: arrays. This is a tricky one for a non-typed language (but hey, we are intrinsically typed: everything is an integer). The problem is twofold. The first issue is how to correctly allocate memory for arrays. This is easy to solve if we force the size to be defined in the declaration; alternatively, static analysis could do the trick. But how do we check bounds at run time without too much metadata being managed by the VM? Does someone have the answer for that? The other issue is the utility of arrays if we can't have types other than integers. Maybe the trade-offs do not favor implementing arrays: extra complexity with no real benefit for the intended use. Another idea is to implement library defined arrays only. At least you could do something like:</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<span style="font-family: "courier new" , monospace;"> x = sensor[2];</span></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<span style="font-family: "courier new" , monospace;"> valve[1] = x * 4;</span></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
This can be easily implemented by a syntax action which calls access functions (get()/set()).</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
- TODO: packaging the bytecode for a remote target, that is, how to carry the required library information without taking too much space.</div>
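<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
The library defined arrays idea from the list above could look roughly like the sketch below (Python, for illustration). Everything in it is an assumption: the table layout, the names <code>array_get</code> and <code>array_set</code>, and the exception number are hypothetical. The point is that the bounds check lives in the library, so the VM needs no array metadata of its own.</div>

```python
ERR_BAD_INDEX = 2  # hypothetical exception number reported to the script


class ScriptError(Exception):
    """Stand-in for an exception thrown into the running script."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


# Stand-ins for hardware: sensor inputs are read-only, valves are writable.
sensor_readings = [512, 498, 505, 520]
valve_positions = [0, 0]


def valve_write(index, value):
    valve_positions[index] = value


# One entry per library array: (size, getter, setter or None if read-only).
LIBRARY_ARRAYS = {
    "sensor": (len(sensor_readings), lambda i: sensor_readings[i], None),
    "valve": (len(valve_positions), lambda i: valve_positions[i], valve_write),
}


def array_get(name, index):
    """What `x = sensor[2];` could be translated into."""
    size, getter, _ = LIBRARY_ARRAYS[name]
    if not 0 <= index < size:
        raise ScriptError(ERR_BAD_INDEX)
    return getter(index)


def array_set(name, index, value):
    """What `valve[1] = x * 4;` could be translated into."""
    size, _, setter = LIBRARY_ARRAYS[name]
    if setter is None or not 0 <= index < size:
        raise ScriptError(ERR_BAD_INDEX)
    setter(index, value)


x = array_get("sensor", 2)
array_set("valve", 1, x * 4)
print(x, valve_positions)
```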
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<br /></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
Well I think that's enough for now.</div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
<br /></div>
<div style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 12.8px;">
Thanks for listening :)...</div>
Robinson Mittmann, 2012-10-22<br />
Modified PID Controller with Constrained Cubic Spline Error Function<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-MML-AM_CHTML"></script><br />
A PID controller is a good tool to have in your belt. You can use it as the first approach for most of the control problems a practical electronics engineer faces. You don't have to have the complete dynamic model of your system to be able to use it. What you have to know is how to tune it. After some time you will start feeling how the gains affect the system, and how to test the limits of the system to determine stability.<br />
<br />
My comments here will most likely freak out my control theory teachers, but what the heck. Sometimes (most of the time, to be more precise) you don't have all the tools you need to evaluate your system, for several reasons. And most of the time a PID controller will be good enough to move your project forward.<br />
<br />
Just to give some context here, I'm not talking about the industrial PID controllers implemented as a box you can buy and connect to your boiler. Here PID refers to the algorithm and its implementation as a piece of software.<br />
<br />
In this post I will present a modification of the classical PID controller to enable a kind of symmetrical behaviour when the input changes to saturation limits.<br />
<br />
This idea occurred to me when I was working with video cameras and having some issues with the auto exposure system, particularly with the control of the electromechanical iris in the lenses. The iris was controlled by a standard PID, which produced a very noticeable difference in convergence speed when moving the camera from a bright to a dark scene compared to the opposite (dark to bright). This produced an annoying, very bright picture that could last for some seconds. I had no luck tweaking the gains of the controller because, while that solved the problem in one condition, it created instabilities in the other.<br />
<br />
Analysis of the problem led me to develop the improvement described here.<br />
<h3>
The PID equation</h3>
The equation for the PID controller in its parallel form is:<br />
\[u(t)=K_pe(t)+K_i\int_{0}^{t}{e(\tau)}\,{d\tau} + K_d\frac{d}{dt}e(t)\]<br />
where:<br />
\(K_p\): Proportional gain<br />
\(K_i\): Integral gain<br />
\(K_d\): Derivative gain<br />
\(e\): Error<br />
The error is the difference between the set point and the process variable:<br />
\(e=SP-PV\)<br />
\(SP\): Set Point<br />
\(PV\): Process variable (input)<br />
<span style="background-color: white;">I often use this form to directly implement a discrete PID controller.</span><br />
<span style="background-color: white;"></span><br />
<span style="background-color: white;">In a PID controller, the proportional and integral terms contribute to the convergence speed.</span><br />
<span style="background-color: white;">The integral term is also necessary to eliminate the steady-state error (offset).</span><br />
<br />
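As a rough sketch of what a direct discrete implementation of the parallel form can look like (Python, for illustration): the integral is approximated by a running sum and the derivative by a backward difference. The class name <code>DiscretePID</code>, the fixed sample time <code>dt</code> and the gain values are assumptions of mine, not taken from a real controller.<br />

```python
class DiscretePID:
    """Textbook parallel-form PID, discretized with a fixed sample time."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.dt = dt              # sample period in seconds
        self.integral = 0.0       # running approximation of the error integral
        self.prev_error = 0.0     # error from the previous sample

    def update(self, sp, pv):
        """Compute the controller output u for one sample."""
        error = sp - pv                                   # e = SP - PV
        self.integral += error * self.dt                  # rectangular integration
        derivative = (error - self.prev_error) / self.dt  # backward difference
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


# Two samples with hypothetical gains and a 0.5 s sample period.
pid = DiscretePID(kp=2.0, ki=1.0, kd=0.5, dt=0.5)
u1 = pid.update(sp=1.0, pv=0.0)
u2 = pid.update(sp=1.0, pv=0.5)
print(u1, u2)
```

Note that this naive version has no output limiting or anti-windup, and the derivative reacts to the very first sample as if the previous error had been zero.<br />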
<h3>
<span style="background-color: white;">The Problem</span></h3>
For any physical system there will be saturation in every single part. The input will saturate at a maximum and a minimum level. You can design your controller to work with any range of inputs, which is most probably impractical, or you can artificially limit the input to a certain range. For digital controllers there is a big chance of this saturation being imposed by your A/D converter or another analog signal conditioning circuit. Another good reason to saturate the input is to avoid numerical instability problems.<br />
<br />
There are some classes of systems where, for some reason, you want to limit the convergence speed, or where you cannot improve the speed without destabilizing the system.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEguzJbw9Vua3V5896WUec20RzmQ123BqU8qfGo5rrjcmudA6I10e5u1Ab__0LILhXVm4NvW7BuaddCo91GPcI0xYmeM_qZmArmx9EeO2SB3axiKyaHAEAQfDcsTG_myaJ1kCFoUXzBIx8wc/s1600/pid_sat_1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="280" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEguzJbw9Vua3V5896WUec20RzmQ123BqU8qfGo5rrjcmudA6I10e5u1Ab__0LILhXVm4NvW7BuaddCo91GPcI0xYmeM_qZmArmx9EeO2SB3axiKyaHAEAQfDcsTG_myaJ1kCFoUXzBIx8wc/s400/pid_sat_1.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig. 1</td></tr>
</tbody></table>
<span style="background-color: white;">Now let's consider what happens if a very fast change in the input leads to saturation. The error will be constant until the accumulated integral part is enough to offset it. The error derivative is zero at this point, as the error is not changing. </span><br />
<span style="background-color: white;"><br />
</span> <span style="background-color: white;">For a quick analysis, let's consider that the integral part is much bigger than the proportional one and dominates the equation. While this situation persists, the output will increase, or decrease, at a constant rate. This is represented in Fig. 1. Between 1 and 2.5 seconds the input is saturated at its maximum, and starting from 3.5 seconds it is saturated at its minimum.</span><br />
<span style="background-color: white;"><span style="background-color: white;"><br />
</span></span> <span style="background-color: white;"><span style="background-color: white;">Observe that the time necessary to recover from a saturated minimum and the time to recover from a saturated maximum are very different. When saturated at the maximum it takes 1.5 seconds, and when saturated at the minimum it takes 6 seconds. That is 4 times longer. The reason is that, with our set point, the magnitude of the error at the minimum is only 1/4 of the error at the maximum, and the integral term is the dominating factor:</span></span><br />
<h4>
<span style="background-color: white;"><span style="background-color: white;"><div style="font-weight: normal;">
\[{I}(t)=K_{i}\int_{0}^{t}{e(\tau)}\,{d\tau}\]<br />
<br />
<div>
As we are saturated, the error term being integrated is constant:</div>
<div>
<br /></div>
<div>
\(e(\tau)_{max}=SP-PV_{max}\)</div>
<div>
\(e(\tau)_{min}=SP-PV_{min}\)</div>
<div>
<br /></div>
<div>
In our example:</div>
<div>
\(SP=0.2\)<br />
\(PV_{max}=1\)<br />
\(PV_{min}=0\)</div>
<div>
<br /></div>
<div>
Leads to:</div>
<div>
<div>
\(e(\tau)_{max}=-0.8\)</div>
<div>
\(e(\tau)_{min}=0.2\)</div>
</div>
</div>
</span></span></h4>
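To make the 4:1 ratio concrete, here is a small numerical sketch (Python). It assumes the integral term alone has to change by some fixed amount before the controller leaves saturation; with a constant error the integral term moves at a rate of \(K_i|e|\), so the time needed is the required change divided by that rate. The gain <code>KI</code> and the amount <code>DELTA</code> are hypothetical values, picked only so that the times land near the ones read from Fig. 1.<br />

```python
def recovery_time(delta_i, ki, error):
    """Time for the integral term to change by delta_i under a constant error."""
    return delta_i / (ki * abs(error))


SP, PV_MAX, PV_MIN = 0.2, 1.0, 0.0   # values from the example above

e_max = SP - PV_MAX   # error while saturated at the maximum
e_min = SP - PV_MIN   # error while saturated at the minimum

KI = 1.0      # hypothetical integral gain
DELTA = 1.2   # hypothetical change the integral term must accumulate

t_max = recovery_time(DELTA, KI, e_max)
t_min = recovery_time(DELTA, KI, e_min)

print("recovery from max: %.2f s" % t_max)
print("recovery from min: %.2f s" % t_min)
print("ratio: %.2f" % (t_min / t_max))
```

Whatever values are chosen for <code>KI</code> and <code>DELTA</code>, the ratio of the two times depends only on the ratio of the error magnitudes, 0.8 / 0.2 = 4.<br />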
<h4>
<span style="background-color: white;"><span style="background-color: white;">Linear Error Function</span></span></h4>
<div>
<span style="background-color: white;"><span style="background-color: white;">The error function for the classical PID controller is:</span></span></div>
<div>
<span style="background-color: white;"><span style="background-color: white;"><br />
\[e=SP-PV\]<br />
</span></span><br />
<div>
<span style="background-color: white;"><span style="background-color: white;"><br />
</span></span></div>
<div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9jwoMkaYvntIt5DK1F5TOptNFkS-Nga7t_lvu5Tcu6ycGHrhlNqCIyKmYub-mLQ_U-4FgWIy_7RCZkpitdWemhgsv8rpMwwyioMmnTBqbcJZYyoVngWKVkfWiAvpBFEKqd5YvfnjJHdnY/s1600/error_linear_small.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9jwoMkaYvntIt5DK1F5TOptNFkS-Nga7t_lvu5Tcu6ycGHrhlNqCIyKmYub-mLQ_U-4FgWIy_7RCZkpitdWemhgsv8rpMwwyioMmnTBqbcJZYyoVngWKVkfWiAvpBFEKqd5YvfnjJHdnY/s1600/error_linear_small.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig. 2</td></tr>
</tbody></table>
<span style="background-color: white;"><span style="background-color: white;"><br />
</span></span></div>
</div>
<div>
<span style="background-color: white;"><span style="background-color: white;"><br />
</span></span></div>
<span style="background-color: white;"><span style="background-color: white;">In Fig. 2 we can see the error 'curve' plotted for 3 different set points in a saturation-limited (bounded) system. Observe that if the set point is at half of the allowed range </span></span><span style="background-color: white;">(0.5 in this case)</span><span style="background-color: white;">, the error will have the same absolute value at both limits, leading to equivalent rise and fall times for the integral and, as a consequence, for the convergence.</span><br />
<h3>
<span style="background-color: white;"><span style="background-color: white;">Solution</span></span></h3>
<span style="background-color: white;"><span style="background-color: white;">What I propose is to replace the error function in the PID controller by a constrained cubic spline with some special requirements:</span></span><br />
<ul>
<li><span style="background-color: white;">The three points of the curve are: P1=(0, 0.5); P2=(SP, 0) and P3=(1, -0.5)</span></li>
<li><span style="background-color: white;">The curve has to be smooth at P2</span></li>
<li><span style="background-color: white;">The 1st derivative at P2 should be -1</span></li>
<li>The 1st derivative should always be negative, i.e. there will be no overshoot</li>
<li>It has to be simple enough to be computed at runtime </li>
</ul>
Fig. 3 shows a plot of the proposed function.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBJuG-HICypyV1BhwWDAOeBCg0ofcga6BS6ssPqoIdA5uC0y-ap-35ck8SyKy_PNXEstI7CtcAk5JlKDrkbDxk2Sq0CHeItPhBkPl91c69Ym0r_x4N26b8YZJch1BLZ-5s_cHfGjZuuGlB/s1600/error_spline_small.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBJuG-HICypyV1BhwWDAOeBCg0ofcga6BS6ssPqoIdA5uC0y-ap-35ck8SyKy_PNXEstI7CtcAk5JlKDrkbDxk2Sq0CHeItPhBkPl91c69Ym0r_x4N26b8YZJch1BLZ-5s_cHfGjZuuGlB/s1600/error_spline_small.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig. 3</td></tr>
</tbody></table>
In Fig. 3 we can see the same 3 cases depicted previously. But now, regardless of the set point value, the errors at both saturation points have the same absolute value: 0.5 and -0.5. This way the system will converge at approximately the same rate in both directions. Notice also that for SP = 0.5 the curve is the same line as in the original error function.<br />
<br />
<div>
<br />
The idea is to replace the error function by a spline. The proposed solution is a two-segment cubic spline whose polynomials are:<br />
<br />
\[e_1(t) = a_{1} + b_{1}PV(t) + c_{1}PV(t)^2 + d_{1}PV(t)^3\]<br />
\[e_2(t) = a_{2} + b_{2}PV(t) + c_{2}PV(t)^2 + d_{2}PV(t)^3\]<br />
<br />
Thus, the error function is given by:<br />
<br />
\[e(t)=\left\{\begin{matrix}<br />
e_1(t) & PV(t) \leq SP\\ <br />
e_2(t) & PV(t) > SP<br />
\end{matrix}\right.\]<br />
<br />
The coefficients of the polynomials are functions of the set point and must be recomputed each time this value changes. To calculate the coefficients we must solve the spline equations including the proposed constraints.<br />
<br /></div>
The equations below are a solution for the two-segment cubic spline with the constraints presented above.<br />
<br />
\[f'_{1}(x_{1})=\frac{2}{\frac{x_{2}-x_{1}}{y_{2}-y_{1}}+\frac{x_{1}-x_{0}}{y_{1}-y_{0}}}\]<br />
\[f'_{1}(x_{0})= \frac{3(y_{1}-y_{0})}{2(x_{1}-x_{0})}-\frac{f'_{1}(x_{1})}{2}\]<br />
\[f''_{1}(x_{0})=\frac{-2(f'_{1}(x_{1}) + 2 f'_{1}(x_{0}))}{(x_{1} - x_{0})} + \frac{6 (y_{1} - y_{0})}{(x_{1} - x_{0})^2}\]<br />
\[f''_{1}(x_{1})=\frac{2(2 f'_{1}(x_{1}) + f'_{1}(x_{0}))}{(x_{1} - x_{0})} - \frac{6 (y_{1} - y_{0})}{(x_{1} - x_{0})^2}\]<br />
\[d_{1} = \frac{f''_{1}(x_{1}) - f''_{1}(x_{0})}{6 (x_{1} - x_{0})}\]<br />
\[c_{1} = \frac{x_{1} f''_{1}(x_{0}) - x_{0} f''_{1}(x_{1})}{2(x_{1} - x_{0})}\]<br />
\[b_{1} = \frac{(y_{1} - y_{0}) - c_{1}(x_{1}^2 - x_{0}^2) - d_{1}( x_{1}^3 - x_{0}^3)}{x_{1} - x_{0}}\]<br />
\[a_{1} = y_{0} - b_{1} x_{0} - c_{1} x_{0}^2 - d_{1} x_{0}^3\]<br />
\[f'_{2}(x_{1})=\frac{2}{\frac{x_{2} - x_{1}}{y_{2} - y_{1}} + \frac{x_{1} - x_{0}}{y_{1} - y_{0}}}\]<br />
\[f'_{2}(x_{2})= \frac{3(y_{2} - y_{1})}{2(x_{2} - x_{1})}-\frac{f'_{2}(x_{1})}{2}\]<br />
\[f''_{2}(x_{1})=\frac{-2(f'_{2}(x_{2}) + 2 f'_{2}(x_{1}))}{(x_{2}-x_{1})} + \frac{6 (y_{2} - y_{1})}{(x_{2} - x_{1})^2}\]<br />
\[f''_{2}(x_{2})=\frac{2(2 f'_{2}(x_{2}) + f'_{2}(x_{1}))}{(x_{2} - x_{1})} - \frac{6 (y_{2} - y_{1})}{(x_{2} - x_{1})^2}\]<br />
\[d_{2} = \frac{f''_{2}(x_{2}) - f''_{2}(x_{1})}{6 (x_{2} - x_{1})}\]<br />
\[c_{2} = \frac{x_{2} f''_{2}(x_{1}) - x_{1} f''_{2}(x_{2})}{2(x_{2} - x_{1})}\]<br />
\[b_{2} = \frac{(y_{2} - y_{1}) - c_{2}(x_{2}^2 - x_{1}^2) - d_{2}( x_{2}^3 - x_{1}^3)}{x_{2} - x_{1}}\]<br />
\[a_{2} = y_{1} - b_{2} x_{1} - c_{2} x_{1}^2 - d_{2} x_{1}^3\]<br />
\[x_{0} = 0\]<br />
\[y_{0} = 0.5\]<br />
\[x_{1} = SP\]<br />
\[y_{1} = 0\]<br />
\[x_{2} = 1\]<br />
\[y_{2} = -0.5\]<br />
<br />
To help with the calculation of the polynomials' coefficients I've developed a small <a href="https://docs.google.com/open?id=0BzTuEidJ5iLedDZySkl5dWltTm8">Matlab (Octave) program</a>. Below are some results.<br />
<br />
<span courier="" monospace="" new="">SP=0.1250</span><br />
<span courier="" monospace="" new=""> a1= 0.50000 b1= -5.50000 c1= 0.00000 d1= 96.00000</span><br />
<span courier="" monospace="" new=""> a2= 0.13703 b2= -1.19679 c2= 0.83965 d2= -0.27988</span><br />
<span courier="" monospace="" new=""><br />
</span> <span courier="" monospace="" new="">SP=0.2500</span><br />
<span courier="" monospace="" new=""> a1= 0.50000 b1= -2.50000 c1= 0.00000 d1= 8.00000</span><br />
<span courier="" monospace="" new=""> a2= 0.29630 b2= -1.38889 c2= 0.88889 d2= -0.29630</span><br />
<span courier="" monospace="" new=""><br />
</span> <span courier="" monospace="" new="">SP=0.5000</span><br />
<span courier="" monospace="" new=""> a1= 0.50000 b1= -1.00000 c1= 0.00000 d1= 0.00000</span><br />
<span courier="" monospace="" new=""> a2= 0.50000 b2= -1.00000 c2= 0.00000 d2= 0.00000</span><br />
<span courier="" monospace="" new=""><br />
</span> <span courier="" monospace="" new="">SP=0.7500</span><br />
<span courier="" monospace="" new=""> a1= 0.50000 b1= -0.50000 c1= 0.00000 d1= -0.29630</span><br />
<span courier="" monospace="" new=""> a2= -6.00000 b2= 21.50000 c2= -24.00000 d2= 8.00000</span><br />
<span courier="" monospace="" new=""><br />
</span> <span courier="" monospace="" new="">SP=0.8750</span><br />
<span courier="" monospace="" new=""> a1= 0.50000 b1= -0.35714 c1= 0.00000 d1= -0.27988</span><br />
<span courier="" monospace="" new=""> a2= -91.00000 b2= 282.50000 c2=-288.00000 d2= 96.00000</span><br />
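<br />
The same calculation can be sketched in C for use at runtime. This is only a minimal sketch, not the Matlab program linked above: the names <b>struct cubic</b>, <b>segment()</b>, <b>spline_error_init()</b> and <b>spline_error()</b> are made up for illustration. It assumes 0 &lt; SP &lt; 1 and divides the quadratic coefficient by 2(x<sub>b</sub> - x<sub>a</sub>), which is what reproduces the results listed above.<br />

```c
/* Coefficients of one cubic segment: e = a + b*x + c*x^2 + d*x^3 */
struct cubic {
    double a, b, c, d;
};

/* Solve one segment between (xa, ya) and (xb, yb) given the end
   slopes fa = f'(xa) and fb = f'(xb). */
static struct cubic segment(double xa, double ya, double xb, double yb,
                            double fa, double fb)
{
    struct cubic s;
    double h = xb - xa;
    double dy = yb - ya;
    /* second derivatives at both ends of the segment */
    double f2a = -2.0 * (fb + 2.0 * fa) / h + 6.0 * dy / (h * h);
    double f2b = 2.0 * (2.0 * fb + fa) / h - 6.0 * dy / (h * h);

    s.d = (f2b - f2a) / (6.0 * h);
    s.c = (xb * f2a - xa * f2b) / (2.0 * h);
    s.b = (dy - s.c * (xb * xb - xa * xa)
              - s.d * (xb * xb * xb - xa * xa * xa)) / h;
    s.a = ya - s.b * xa - s.c * xa * xa - s.d * xa * xa * xa;
    return s;
}

/* Compute both segments for a set point 0 < sp < 1. */
static void spline_error_init(double sp, struct cubic *e1, struct cubic *e2)
{
    const double x0 = 0.0, y0 = 0.5, y1 = 0.0, x2 = 1.0, y2 = -0.5;
    double x1 = sp;
    /* slope at the middle point, shared by both segments */
    double f1 = 2.0 / ((x2 - x1) / (y2 - y1) + (x1 - x0) / (y1 - y0));
    /* slopes at the two outer points */
    double f0 = 3.0 * (y1 - y0) / (2.0 * (x1 - x0)) - f1 / 2.0;
    double f2 = 3.0 * (y2 - y1) / (2.0 * (x2 - x1)) - f1 / 2.0;

    *e1 = segment(x0, y0, x1, y1, f0, f1);
    *e2 = segment(x1, y1, x2, y2, f1, f2);
}

/* Evaluate the error for a process value pv in [0, 1] (Horner form). */
static double spline_error(double sp, const struct cubic *e1,
                           const struct cubic *e2, double pv)
{
    const struct cubic *s = (pv <= sp) ? e1 : e2;

    return s->a + pv * (s->b + pv * (s->c + pv * s->d));
}
```

<br />
In a controller, spline_error_init() would be called only when the set point changes, and spline_error() once per sample in place of SP - PV.<br />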
<h4>
Limitations</h4>
<div>
When SP is close to the limits (0 or 1) the derivative (slope) becomes very steep and may cause numeric problems, so this method should be used with caution near the extremes.</div>
<h3>
<b>Real Application</b></h3>
This modified controlling method was devised when I was developing a digital controller for the auto-exposure system of video surveillance cameras. The objective of this system is to control the brightness of the image being captured by the camera. It has to be able to perform under very extreme light conditions such as direct sunlight and poorly illuminated indoor areas.<br />
There are three parameters to control in the camera in order to regulate the exposure:<br />
<ul>
<li>Image sensor gain</li>
<li>Shutter speed</li>
<li>Iris opening</li>
</ul>
Not all lenses have a controllable iris, so the system can operate in two different modes depending on the type of lens installed:<br />
<ul>
<li>Iris mode - regulates the amount of light entering the camera</li>
<li>Shutter mode - regulates the exposure time of each captured frame</li>
</ul>
I will present some examples of improvements in both modes.<br />
<h4>
Iris Mode</h4>
<div>
In this mode the dominant parameter to be controlled is the amount of light that enters the camera. This is accomplished by regulating the opening of a mechanical iris embedded in the lens. </div>
<div>
The video below shows two similar sequences, the first with the normal PID and the second with the modified version. The sequences consist of moving the camera from a dark to a bright scene. </div>
<div class="separator" style="clear: both; text-align: center;">
<br /><iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.blogger.com/video.g?token=AD6v5dwrLIkuqtwnoQMUJ4skdOSiR4yZnU-zjTygS1XxB6ruvKtI5wHokfUVs7TcEljZbCM4DNuziwa_dz4buJxnJg' class='b-hbp-video b-uploaded' frameborder='0'></iframe></div>
<br />
<div>
</div>
<div>
As we can observe in this video, during the first sequence the camera overshot and went "blind" for a little more than 2 seconds. This effect is due to the very high hysteresis of the electromechanical iris. The next sequence shows a dark picture for a mere half second. This represents roughly a 4-fold improvement over the original design.</div>
<div>
<div>
Figures 4 and 5 are the output of a real-time scope that was monitoring the controller operation when the videos were shot. Figure 4 shows the normal PID and figure 5 the version modified with the spline error. The major horizontal divisions represent the time in seconds (10 seconds in total). The vertical axis is the interval from -1 to 1; all the variables were normalized to fit this interval.</div>
<div>
The traces captured are:</div>
<div>
<ul>
<li>Blue: <b>Set point</b> (Illumination reference)</li>
<li>Red: <b>Input</b> (current measured illumination in the image sensor)</li>
<li>Green: <b>Integral term</b> (normalized to the interval 1, -1)</li>
</ul>
</div>
</div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieE5Pm-YoTUGMVNxj5tGl0mZcKOWGV-ByFZFGmjj6u3baUXqLUihe_TqInY4I6-ROuJgvBKBc71vtjkeed4RiRPNazVT6tS9ap6WtgIVfuq-e7K4k3uAPhrg8IeRX05XILhREmlvXZS-4O/s1600/ai-dark-bright-linear.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="446" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieE5Pm-YoTUGMVNxj5tGl0mZcKOWGV-ByFZFGmjj6u3baUXqLUihe_TqInY4I6-ROuJgvBKBc71vtjkeed4RiRPNazVT6tS9ap6WtgIVfuq-e7K4k3uAPhrg8IeRX05XILhREmlvXZS-4O/s640/ai-dark-bright-linear.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig. 4 - Iris control with standard PID error</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><div class="separator" style="clear: both; text-align: center;">
<br /></div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjyH4PL3Wh2jvB9zqj41G268j4MymJrcNOwP6Wy9MU7KJhPduhfwLYqIneHHNEqrE2flnpY3UTiCvM9IymXAChlAwLTjsA2UewMbAllhulc7g30ZqpcPvClnCF95zsdgWwOOpALrwSZYaFj/s1600/ai-dark-bright-spline.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="446" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjyH4PL3Wh2jvB9zqj41G268j4MymJrcNOwP6Wy9MU7KJhPduhfwLYqIneHHNEqrE2flnpY3UTiCvM9IymXAChlAwLTjsA2UewMbAllhulc7g30ZqpcPvClnCF95zsdgWwOOpALrwSZYaFj/s640/ai-dark-bright-spline.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig. 5 - Iris controller with modified spline error</td></tr>
</tbody></table>
<h4>
Shutter Mode</h4>
<div>
In this mode the amount of light entering the camera is fixed; what is controlled is the frame's exposure time. The mechanism that allows us to do this is an electronic shutter implemented in the image sensor itself.</div>
<div>
The next video is an example of moving from bright to dark. Visually, the improvement is not as dramatic as in the previous case. But, as the graphs below show, the controller took 4 seconds to converge with the normal PID and 2 seconds with the improved version.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.blogger.com/video.g?token=AD6v5dxyvDd9x69s_jQSX5S2KNnHeeoldSA--lxPghq6XT-5KV9XnfDzqPe-mzEK7saICUPEEkXySKcUSlHBWDmJ_w' class='b-hbp-video b-uploaded' frameborder='0'></iframe></div>
<br />
It is also worth mentioning that the shutter model is linear, so we don't observe an overshoot as in the previous case.</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div>
<br />
Figures 6 and 7 were captured when the above video sequences were taken. Note that the graphs show a little more than the video sequences: they include moving the camera from dark to bright, which is the point where the red line goes up suddenly.</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmfGiYl-Dvg3K2TSKZdmaEtkt2L1EwKUAFPV8ipC-xr3zvkoA3XFYI_Xmciqhdqe2bY0agc0j-StYJV9Jy1ppo0CdV6h2ygU9d8AfgTvl0HNj0m4CQ8e_MvX9TKgH0Ljnjt3q1WajevV3a/s1600/ae-linear.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="364" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmfGiYl-Dvg3K2TSKZdmaEtkt2L1EwKUAFPV8ipC-xr3zvkoA3XFYI_Xmciqhdqe2bY0agc0j-StYJV9Jy1ppo0CdV6h2ygU9d8AfgTvl0HNj0m4CQ8e_MvX9TKgH0Ljnjt3q1WajevV3a/s640/ae-linear.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig. 6 - Shutter control with standard PID error</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipW6A7oW9ANcm-CkjVEii_E-FAr77y3lVqBsnIu9dmdxeDPZ4NzANbr05M20mMtvB3gMzZ2gXAqmOD7d_NLMZ9mrFtlSBNprr0z9oqM3AMeathmT4w5gTdJ3BatMozXM1-jSflb-Gnxwg7/s1600/ae-spline.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="364" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipW6A7oW9ANcm-CkjVEii_E-FAr77y3lVqBsnIu9dmdxeDPZ4NzANbr05M20mMtvB3gMzZ2gXAqmOD7d_NLMZ9mrFtlSBNprr0z9oqM3AMeathmT4w5gTdJ3BatMozXM1-jSflb-Gnxwg7/s640/ae-spline.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig. 7 - Shutter control with modified spline error</td></tr>
</tbody></table>
Robinson Mittmannhttp://www.blogger.com/profile/00297238237576764278noreply@blogger.com2tag:blogger.com,1999:blog-8812665732576096785.post-3774703225373975972012-10-19T07:35:00.001-07:002012-10-20T20:55:36.937-07:00Interprocess RPC generation tool<h3>
Introduction</h3>
This post discusses a methodology to create a Remote Procedure Call interface for process-to-process intercommunication, that is, to communicate between two processes on the same host.<br />
The general solution for the problem is summarized here as a design pattern. A tool to automatically generate the RPC stub code, called <b>irpcgen,</b> is presented as well.<br />
<br />
<h3>
The Problem</h3>
We have two processes in an embedded system, let's call them <b>H</b> and <b>I</b>.<br />
<br />
<ul>
<li><b>H</b> - is a hard real-time process with strict deadlines. This can be one or more control loops or an acquisition system for example.</li>
<li><b>I</b> - controls the operation of <b>H</b> and performs other non-time-sensitive tasks. It may implement the user interface, operational logging, etc., but its main task is to create and watchdog <b>H</b>.</li>
</ul>
<br />
The question that arises is, what's the best approach to create a communication channel between <b>H</b> and <b>I</b>? More specifically, we want to answer two questions:<br />
<br />
<ol>
<li>which IPC mechanism will be best suited for the task?</li>
<li>how to send and receive structured information over these channels?</li>
</ol>
<br />
<h3>
The Solution</h3>
To answer <b>1.</b> I created a small set of programs to benchmark several alternative IPC mechanisms in Linux. See my previous posts on the subject: <a href="http://bobmittmann.blogspot.com/2011/12/embedded-linux-interprocess_05.html?spref=bl">Embedded Linux Interprocess Communication Mechanisms Benchmark - Part 2</a><br />
<div>
<br /></div>
<div>
With these data in hand it occurred to me that a natural channel would be a pair of pipes. I have used this approach on other occasions, but I had never considered using unnamed pipes for the task. That's what I propose here: to use a pair of unnamed pipes connected to the <b>stdin</b> and <b>stdout</b> of <b>H</b>. This is a natural fit, as our controlling process <b>I</b> is the one forking <b>H</b>, and <b>H</b> has only a single controller attached to it. This way we create a two-way process-to-process IPC channel. Now we have to be able to send and receive structured data through it.</div>
<div>
<br /></div>
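<div>
The channel setup can be sketched as follows. This is a minimal illustration, not the libirpc code: the names <b>spawn_h()</b> and <b>h_main()</b> are hypothetical, and <b>h_main()</b> is just an echo loop standing in for the real-time process, which in a real system could be started with exec() instead.</div>

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the real-time process H:
   echoes everything it reads from stdin to stdout. */
static void h_main(void)
{
    char buf[64];
    ssize_t n;

    while ((n = read(STDIN_FILENO, buf, sizeof(buf))) > 0) {
        if (write(STDOUT_FILENO, buf, (size_t)n) != n)
            break;
    }
}

/* Fork H with a pair of unnamed pipes attached to its stdin and stdout.
   Returns the child's pid, or -1 on error.
   On success *wr_fd writes to H and *rd_fd reads from H. */
static pid_t spawn_h(int *wr_fd, int *rd_fd)
{
    int to_h[2];   /* I writes to_h[1], H reads to_h[0] */
    int from_h[2]; /* H writes from_h[1], I reads from_h[0] */
    pid_t pid;

    if (pipe(to_h) < 0)
        return -1;
    if (pipe(from_h) < 0) {
        close(to_h[0]);
        close(to_h[1]);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(to_h[0]);
        close(to_h[1]);
        close(from_h[0]);
        close(from_h[1]);
        return -1;
    }

    if (pid == 0) {
        /* child (H): connect the pipe ends to stdin and stdout */
        dup2(to_h[0], STDIN_FILENO);
        dup2(from_h[1], STDOUT_FILENO);
        close(to_h[0]);
        close(to_h[1]);
        close(from_h[0]);
        close(from_h[1]);
        h_main();
        _exit(0);
    }

    /* parent (I): keep only its own ends of the pipes */
    close(to_h[0]);
    close(from_h[1]);
    *wr_fd = to_h[1];
    *rd_fd = from_h[0];
    return pid;
}
```

<div>
Because <b>H</b> only sees its standard input and output, it needs no knowledge of how the channel was created.</div>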
<div>
My requirements for the data exchange mechanism were:</div>
<div>
<ul>
<li>Simple to program and extend</li>
<li>The communication has to be synchronous</li>
<li>The programming interface has to be at high level, RPC like</li>
</ul>
</div>
<div>
The first thing to do was to transform an asynchronous channel into a synchronous one. To do this a protocol with a small overhead was introduced. It defines a framing structure to delimit the message boundaries and a scheme to multiplex different message types. It also introduces control messages for synchronization and link management.</div>
<div>
<br /></div>
<div>
The next step was to create a way of encapsulating C structures into the messages and to label them in order to be able to demultiplex on reception. No marshalling is necessary because both processes are on the same host. This involved defining, for each message, a function to be called to transmit it and a corresponding callback to be invoked on reception.</div>
<div>
<br /></div>
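<div>
To make the idea concrete, here is a small sketch of a labeled frame carrying a raw C structure. The actual frame layout used by libirpc is not described in this post, so everything here is an assumption made for illustration: the <b>msg_hdr</b> layout, the <b>MSG_SET_GAIN</b> label, the <b>set_gain_msg</b> structure and all function names are hypothetical.</div>

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical frame header: message label plus payload size. */
struct msg_hdr {
    uint16_t type;
    uint16_t len;
};

/* Hypothetical message, carried as a plain C structure. */
struct set_gain_msg {
    int32_t channel;
    int32_t gain;
};

enum { MSG_SET_GAIN = 1 };

/* Transmit function: header followed by the raw structure bytes.
   No marshalling, both ends share the same structure layout. */
static int send_set_gain(int fd, const struct set_gain_msg *m)
{
    unsigned char frame[sizeof(struct msg_hdr) + sizeof(*m)];
    struct msg_hdr hdr = { MSG_SET_GAIN, sizeof(*m) };

    memcpy(frame, &hdr, sizeof(hdr));
    memcpy(frame + sizeof(hdr), m, sizeof(*m));
    return write(fd, frame, sizeof(frame)) == (ssize_t)sizeof(frame) ? 0 : -1;
}

/* Reception callback; here it only records the last request. */
static struct set_gain_msg last_gain;

static void on_set_gain(const struct set_gain_msg *m)
{
    last_gain = *m;
}

/* Read exactly len bytes, a pipe may deliver less per read(). */
static int read_full(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Receive one frame and dispatch it to the callback matching its label. */
static int recv_dispatch(int fd)
{
    struct msg_hdr hdr;
    struct set_gain_msg m;

    if (read_full(fd, &hdr, sizeof(hdr)) < 0)
        return -1;
    switch (hdr.type) {
    case MSG_SET_GAIN:
        if (hdr.len != sizeof(m) || read_full(fd, &m, sizeof(m)) < 0)
            return -1;
        on_set_gain(&m);
        return 0;
    default:
        return -1; /* unknown label */
    }
}
```

<div>
Each new message type needs one more transmit function, one more callback and one more case in the dispatcher, which is exactly the kind of repetitive code worth generating.</div>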
<div>
This can be done manually. As a matter of fact, that's exactly what I did in the first product developed with this approach. It was also a way of validating the strategy without investing too much effort in tooling.</div>
<div>
But for this to be generally useful, a tool to automatically generate the code was needed. </div>
<br />
<div>
<h3>
The irpcgen tool</h3>
To make development easier I created a tool to generate the stubs for the server and client as well as sample server service calls. The program works pretty much like the Sun RPC rpcgen tool, except that instead of reading an RPCL input file (.x) it reads a standard C header (.h) file. This greatly simplifies things. You just need to write your API in a header file and use the functions on the client side. The implementation of the functions will be on the server side.<br />
<br />
The <b>irpcgen</b> tool will read the header file and create stubs for all declared functions that can be used as RPC. These functions represent the server's API and have to follow some rules:<br />
<br />
1 - The return has to be a bool type;<br />
2 - There must be at most 2 arguments to the function;<br />
3 - It cannot be declared static;<br />
4 - If a second argument is provided it has to be a pointer to something except a void pointer;<br />
<br />
Functions that fail to conform to any of these rules are not considered IRPC and no stub will be generated for them.<br />
<br />
Furthermore, the direction of the data transmission is derived from the position and type of the arguments. The following cases are possible:<br />
<br />
<h4>
No arguments</h4>
Sends nothing and returns nothing (but invokes the corresponding callback on the server side).<br />
<br />
<pre class="prettyprint lang-c">bool my_rpc(void);
</pre>
<br />
<h4>
Single argument passed by value.</h4>
<div>
This is a server input value. This is more or less obvious, as the client cannot read anything back. </div>
<br />
<pre class="prettyprint lang-c">bool my_rpc_set(int val);
bool my_rpc_set(struct my_req req);
</pre>
<br />
<h4>
Single argument passed by constant reference.</h4>
<div>
This is similar to the previous case: the single argument is a server input value.</div>
<br />
<pre class="prettyprint lang-c">bool my_rpc_set(const struct my_req * req);
</pre>
<br />
<h4>
Single argument passed by reference.</h4>
<div>
In this case the argument is a return value from the server. The client should provide a pointer to a variable that will receive the data.</div>
<br />
<pre class="prettyprint lang-c">bool my_rpc_get(int * val);
bool my_rpc_get(struct my_rsp * rsp);
</pre>
<br />
<h4>
Two arguments</h4>
<div>
In this case the first argument is an input value and the second is a return value from the server. The client should provide a pointer to a variable that will receive the data. Note that the second argument must be a non-constant reference. The first argument can be of any type, except a void pointer (void *). </div>
<br />
<pre class="prettyprint lang-c">bool my_rpc_set_and_get(struct my_req * req, struct my_rsp * rsp);
bool my_rpc_set_and_get(const struct my_req * req, struct my_rsp * rsp);
bool my_rpc_set_and_get(struct my_req req, struct my_rsp * rsp);
bool my_rpc_set_and_get(int req, int * rsp);
</pre>
<br />
<b>Strings</b><br />
<br />
If any argument is passed as a char pointer (char *) it will be treated as a NULL terminated string. The rule for a single argument is the same as for a reference, i.e. if it's declared as const it represents a client-to-server message, and the direction is reversed for non-const strings.<br />
<br />
<pre class="prettyprint lang-c">bool my_rpc_set_and_get(char * req, char * rsp);
bool my_rpc_set_and_get(const char * req, char * rsp);
bool my_rpc_set(const char * req);
bool my_rpc_get(char * rsp);
</pre>
<br />
<h3>
Service calls</h3>
<div>
The irpcgen tool will optionally create a ".h" file with "_svc" appended to the input file name. E.g. if the input is "my_rpc.h" the generated file will be "my_rpc_svc.h". The file will contain the signatures of the services to be implemented.</div>
<div>
<br /></div>
<div>
The file:</div>
<br />
<pre class="prettyprint lang-c">bool my_rpc_get(int * val);
bool my_rpc_set(int val);
</pre>
<br />
<div>
Will produce:</div>
<br />
<pre class="prettyprint lang-c">bool my_rpc_get_svc(int * val);
bool my_rpc_set_svc(int val);
</pre>
<br />
The "_svc" functions must implement the server behaviour. Optionally a "*_svc.c" file can be created with dummy functions. All you need to do is fill in the bodies of these functions to have a functional RPC system.</div>
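<div>
As an illustration, the two services from the example above could be filled in as shown below. This is only a sketch: the <b>stored_val</b> variable is a hypothetical piece of server state, and it is assumed here that the bool return value reports whether the call succeeded.</div>

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state owned by the server process. */
static int stored_val;

/* Service for "bool my_rpc_set(int val)": val arrives from the client. */
bool my_rpc_set_svc(int val)
{
    stored_val = val;
    return true; /* assumed to mean "call succeeded" */
}

/* Service for "bool my_rpc_get(int * val)": *val is sent back to the client. */
bool my_rpc_get_svc(int * val)
{
    if (val == NULL)
        return false;
    *val = stored_val;
    return true;
}
```
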
<h3>
The libirpc</h3>
<div>
The libirpc is the companion of the irpcgen. The generated code depends on this library to run.</div>
<div>
<br />
<h3>
Source code</h3>
The <b>irpcgen</b> tool is GPL open-source and can be downloaded from: <a href="https://docs.google.com/file/d/0BzTuEidJ5iLebWItVkpkNjdiMkE/edit">irpcgen.tar.gz</a></div>
<div>
The package also contains the libirpc and a sample. The library is LGPL licensed.<br />
<br />
There is a Makefile in each of the directories irpcgen, libirpc and sample. You need to compile irpcgen and libirpc before compiling the sample.<br />
<br />
If you want to cross-compile the library and the sample for an embedded platform, set the environment variable CROSS_COMPILE to the prefix of your tool-chain, e.g. <b>export CROSS_COMPILE=arm-gnu-linux-</b>.<br />
<br />
<br />
<br />
<br />
<br /></div>
Robinson Mittmannhttp://www.blogger.com/profile/00297238237576764278noreply@blogger.com0tag:blogger.com,1999:blog-8812665732576096785.post-91457921438753047682012-10-12T15:43:00.001-07:002012-10-12T15:47:07.143-07:00YARD-ICE goes Open Source<h2>
YARD-ICE</h2>
<b>YARD-ICE </b>stands for Yet Another Remote Debugger - In Circuit Emulator. It is a hardware and software platform I recently made public on Google Code. The project goal is to design the hardware and software of a JTAG tool to program and debug ARM microcontrollers. The target audience includes developers of deep embedded systems with shallow pockets.<br />
<br />
Link to the Project: <a href="http://code.google.com/p/yard-ice/">YARD-ICE on Google Code</a><br />
<br />
<h2>
Why Another JTAG Tool?</h2>
There are tons of tools on the market. Why another one? There are three main reasons:<br />
<br />
<ol>
<li> Performance. Some basic, low-cost tools available on the market are really slow. One of the main reasons is that low-level operations are performed by the host PC, and the USB round trip is to blame. <b>YARD-ICE</b> solves this problem with an FPGA handling the serialization and other bit handling.</li>
<li> Support for Linux/Mac platforms. Most ICE hardware lacks decent support for non-Windows platforms. There are some exceptions, but those are expensive tools with TCP/IP support. <b>YARD-ICE</b> is a TCP/IP based tool with an embedded GDB server. It's designed to work with any IDE supporting GDB, like Eclipse.</li>
<li>Flexibility. Some tools are OK for some processors, but their best performance is tied to a certain proprietary tool. Scripting is not always an option, and when this possibility exists it's through some obscure language or API with Windows DLL dependencies, and too slow. Why not write a simple shell or Python script on the host to automate a test or to program your systems in the factory? YARD-ICE provides a simple <b>csh</b>-like scripting capability; you can run small scripts remotely through an ordinary TCP connection. And better than this: if you don't like the way we do things or want to customize your tool, no worries. It's LGPL open source, meaning that you have what you need to do just that.</li>
</ol>
<br />
Apart from that I really like bit scrubbing. It's a good way of knowing the processor cores in depth.<br />
<br />Robinson Mittmannhttp://www.blogger.com/profile/00297238237576764278noreply@blogger.com1Toronto, ON, Canada43.653226 -79.383184343.469412 -79.69904129999999 43.837039999999995 -79.0673273tag:blogger.com,1999:blog-8812665732576096785.post-27746304176080137722012-10-05T12:57:00.000-07:002013-11-28T12:00:36.064-08:00Unix Select+Timers<h3>
Introduction</h3>
When developing real-time network protocols and other time-sensitive embedded systems, it is common to have to read from one or more file descriptors while keeping track of various timeouts at the same time.<br />
<br />
This post discusses a method to implement timers and file-descriptor polling in a single loop. It uses very few resources and is relatively fast for a small number of timers and file descriptors. These conditions are usually met in embedded systems, where a device is neither allowed nor expected to serve too many clients.<br />
<br />
The solution is fairly portable among UNIX-like OSs as it uses POSIX calls. I won't claim that this is the best method to do it, but I have to say it's being successfully used in some time-sensitive protocol implementations.<br />
<br />
<h3>
Timers</h3>
To implement the timers we have to keep track of the time, which is done by a <b>clock</b>. The <b>clock</b> is a monotonic counter obtained through the <span style="font-family: Courier New, Courier, monospace;">clock_gettime(</span><span style="background-color: white; font-size: 13px;"><span style="font-family: Courier New, Courier, monospace;">CLOCK_MONOTONIC</span></span><span style="font-family: Courier New, Courier, monospace;">)</span> system call and, usually, it represents the time elapsed since system start-up. The reference or epoch of the clock doesn't really matter; the important thing is that it can't be subject to jumps caused by corrections like NTP.<br />
<br />
The timers are represented in the same way as the <b>clock</b>. Active timers (not expired) have their times set in the future. Timers with a time in the past are expired and are considered inactive. We compare the clock value with the timer values to determine when a timer expires, so that an appropriate action can be taken. One approach is to associate callback functions with the timers.<br />
<br />
To improve the performance on 32-bit systems, we use only the low 32 bits of the value for the clock and timers, so the time wraps every 4294 seconds, or about 71 minutes. That means that, to unequivocally determine if a timer timeout is in the future, it should be at most 1/2 of the wrapping value, or about 35.8 minutes. This is more than enough for most real-time applications. If this is not your case, consider using a millisecond clock reference (see below).<br />
<br />
To set up a timer timeout it's just a matter of adding the timeout time in microseconds to the <b>clock</b>. In the example below we use a value of 0 to represent an inactive timer, so if the value of the clock plus the timeout time wraps to 0 we add 1 microsecond to avoid this condition. Other more elaborate methods can be used, but this one has the advantage of avoiding an extra memory reference when polling the timers.<br />
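<br />
The wrap-around arithmetic can be isolated in two small helpers. This is a sketch for illustration only: the names <b>timer_arm()</b> and <b>timer_remaining()</b> are hypothetical, and it assumes the usual two's complement conversion from uint32_t to int32_t.<br />

```c
#include <stdint.h>

/* Arm a timer tmo_us microseconds after 'now'.
   The value 0 is reserved for "inactive". */
static uint32_t timer_arm(uint32_t now, uint32_t tmo_us)
{
    uint32_t t = now + tmo_us; /* unsigned arithmetic wraps modulo 2^32 */

    return (t == 0) ? 1 : t;
}

/* Microseconds left until expiration; zero or negative means expired.
   Valid while the timeout is below 2^31 us (about 35.8 minutes). */
static int32_t timer_remaining(uint32_t tmr, uint32_t now)
{
    return (int32_t)(tmr - now);
}
```

<br />
The subtraction is done in unsigned arithmetic and only the result is interpreted as signed, so the comparison stays correct even when the timer value has wrapped and the clock has not.<br />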
<br />
<h3>
Polling</h3>
<div>
The idea is to use the <span style="font-family: Courier New, Courier, monospace;">select()</span> system call to poll the file descriptors, adjusting the timeout parameter according to the expiration time of the timers. We compare all the timers with the current clock and select the smallest difference, greater than zero, between the expiration time and the current time.<br />
<br /></div>
The <span style="font-family: Courier New, Courier, monospace;">select()</span> system call has the advantage of being fast and conservative regarding resource usage for a small number of file descriptors. The cost of the call does not depend so much on the number of file descriptors as on the value of the highest file descriptor in the set. Another advantage of <span style="font-family: Courier New, Courier, monospace;">select()</span> is portability.<br />
<br />
<pre class="prettyprint lang-c">#include <stdlib.h>
#include <stdint.h>
#include <errno.h>
#include <time.h>
#include <sys/select.h>

#define ONE_SECOND 1000000
#define TMR_MAX 8
#define FD_MAX 8

/* application callbacks, implemented elsewhere */
void on_timeout(unsigned int id);
void on_recv(int fd);

/* get the system monotonic clock value in microseconds
   (low 32 bits only, wraps about every 71 minutes). */
static uint32_t get_clock_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ((uint32_t)ts.tv_sec * 1000000u) + (uint32_t)(ts.tv_nsec / 1000);
}

/* the maximum timer timeout allowed is
   2147 seconds ~ 35 minutes */
uint32_t tmr[TMR_MAX]; /* List of timers */
unsigned int tmr_cnt; /* Number of timers in the list */
int fd[FD_MAX]; /* List of file descriptors */
unsigned int fd_cnt; /* Number of descriptors in the list */

static void * my_task(void * arg)
{
    struct timeval tv;
    uint32_t now;
    int32_t dt_min;
    int fd_max;
    fd_set rs;
    int ret;
    unsigned int i;

    (void)arg;

    for (;;) {
        /* get the current time in microseconds */
        now = get_clock_us();
        /* clear the fd set */
        FD_ZERO(&rs);
        /* initialize dt_min to 1 minute */
        dt_min = 60 * ONE_SECOND;
        /* initialize fd_max */
        fd_max = 0;

        for (i = 0; i < tmr_cnt; i++) {
            int32_t dt;

            if (tmr[i] == 0) /* timer is inactive */
                continue;
            if ((dt = (int32_t)(tmr[i] - now)) <= 0) {
                /* timer timeout; the callback is expected to
                   re-arm the timer or clear it (tmr[i] = 0) */
                on_timeout(i);
            } else if (dt < dt_min) {
                /* adjust the minimum timeout time */
                dt_min = dt;
            }
        }

        /* tv_usec must stay below one second */
        tv.tv_sec = dt_min / ONE_SECOND;
        tv.tv_usec = dt_min % ONE_SECOND;

        for (i = 0; i < fd_cnt; i++) {
            if (fd[i] != -1) {
                FD_SET(fd[i], &rs);
                if (fd[i] > fd_max)
                    fd_max = fd[i];
            }
        }

        ret = select(fd_max + 1, &rs, NULL, NULL, &tv);
        if (ret < 0) {
            if (errno == EINTR) /* select() interrupted */
                continue;
            /* select() failed */
            return NULL;
        }

        for (i = 0; i < fd_cnt; i++) {
            if ((fd[i] != -1) && FD_ISSET(fd[i], &rs)) {
                /* read from the file descriptor */
                on_recv(fd[i]);
            }
        }
    }
}

void timer_set(unsigned int id, unsigned int tmo_us)
{
    /* expiration time = current clock + timeout; 0 means inactive */
    tmr[id] = get_clock_us() + tmo_us;
    if (tmr[id] == 0)
        tmr[id]++;
}
</pre>
I've tried to keep the example as short as possible, so the structure is far from ideal in terms of encapsulation. If the list of timers or file descriptors changes dynamically, a mutual exclusion mechanism should be implemented as well, to avoid race conditions when evaluating the timers or the file descriptors. <br />
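<br />
One possible way to add that mutual exclusion, sketched here with a POSIX mutex. This is an assumption on my part, not the only option: the names <span style="font-family: Courier New, Courier, monospace;">tmr_lock</span>, <span style="font-family: Courier New, Courier, monospace;">timer_set_locked</span> and <span style="font-family: Courier New, Courier, monospace;">timer_snapshot</span> are hypothetical, and the current clock is passed in to keep the sketch self-contained.<br />
<br />
<pre class="prettyprint lang-c">#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define TMR_MAX 8

static uint32_t tmr[TMR_MAX]; /* 0 means inactive, as in the example */
static pthread_mutex_t tmr_lock = PTHREAD_MUTEX_INITIALIZER;

/* Arm timer 'id' to fire 'tmo_us' microseconds after 'now'.
 * May be called from any thread. */
static void timer_set_locked(unsigned int id, uint32_t now, uint32_t tmo_us)
{
    uint32_t deadline = now + tmo_us;

    if (id >= TMR_MAX)
        return;
    if (deadline == 0)
        deadline = 1;
    pthread_mutex_lock(&tmr_lock);
    tmr[id] = deadline;
    pthread_mutex_unlock(&tmr_lock);
}

/* Copy the timer list under the lock, so the polling loop can
 * evaluate a consistent snapshot without holding the lock
 * while it calls the handlers. */
static void timer_snapshot(uint32_t out[TMR_MAX])
{
    pthread_mutex_lock(&tmr_lock);
    memcpy(out, tmr, sizeof(tmr));
    pthread_mutex_unlock(&tmr_lock);
}
</pre>
Working on a snapshot keeps the critical section short, at the cost of a timer armed during the evaluation being noticed only on the next pass of the loop.<br />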
<h3>
Minor improvements</h3>
It may be a good idea to avoid arithmetic divisions on platforms that don't have a hardware <b>div</b> instruction, like ARM v4 and v5 (ARM7-9). This improves the performance a little. The following code is an alternative to the original one that uses sums of shifts to approximate the division by 1000 when calculating the number of microseconds.<br />
<br />
<pre class="prettyprint lang-c">static inline uint32_t get_sys_clock_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    /* This is a fast, no division, good approximation to:
       tv_nsec / 1000. The maximum error is 74 microseconds.
       It costs only 5 instructions on ARMv5 */
    return (ts.tv_sec * 1000000) + (ts.tv_nsec >> 10) +
        (ts.tv_nsec >> 15) - (ts.tv_nsec >> 17) + (ts.tv_nsec >> 21);
}
</pre>
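<br />
To see where the error comes from: 2^-10 + 2^-15 - 2^-17 + 2^-21 is about 0.00099993, slightly less than 1/1000, so the approximation falls behind as tv_nsec grows. A small sketch to check it on a host machine (the function names are just for this illustration):<br />
<br />
<pre class="prettyprint lang-c">#include <stdint.h>

/* Shift-based approximation of nsec / 1000, same terms as above. */
static uint32_t approx_div1000(uint32_t nsec)
{
    return (nsec >> 10) + (nsec >> 15) - (nsec >> 17) + (nsec >> 21);
}

/* Exact result minus the approximation, in microseconds. */
static int32_t approx_error_us(uint32_t nsec)
{
    return (int32_t)(nsec / 1000) - (int32_t)approx_div1000(nsec);
}
</pre>
At the last nanosecond of a second (tv_nsec = 999999999) the approximation gives 999926 instead of 999999, that is 73 microseconds short.<br />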
<br />
If timers longer than 35 minutes are needed, the clock function can be modified to count in milliseconds instead of microseconds. The following is the non-division implementation of the clock function, and the conversion needed to set up the timeval struct: <br />
<br />
<pre class="prettyprint lang-c">static uint32_t get_clock_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    /* This is a fast, no division, approximation to: tv_nsec / 1000000.
       It is coarse: the result reads about 2.6% low, close to
       27 milliseconds at the end of each second. */
    return (ts.tv_sec * 1000) + (ts.tv_nsec >> 20) +
        (ts.tv_nsec >> 25) - (ts.tv_nsec >> 26) +
        (ts.tv_nsec >> 28) + (ts.tv_nsec >> 29);
}
...
/* dt_min is now in milliseconds, tv_usec must stay below one second */
tv.tv_sec = dt_min / 1000;
tv.tv_usec = (dt_min % 1000) * 1000;
</pre>
Robinson Mittmannhttp://www.blogger.com/profile/00297238237576764278noreply@blogger.com1tag:blogger.com,1999:blog-8812665732576096785.post-71962769477420118802012-02-10T14:07:00.000-08:002012-02-23T06:45:07.595-08:00The Espresso MachineThis tale begins in Brazil, in winter time. I mean winter in the northern hemisphere. Naturally it was summer in South America, where we had fled to escape the peak of Canada's cold (it turns out the winter was not that bad this year). Anyway, my wife and I were on vacation visiting our relatives there. While my wife went to the north-east of the country, I had to go to the capital of the state of Minas Gerais, the city of Belo Horizonte, where my younger sister has been living.<br />
<br />
I won't say that I do not appreciate a good espresso, but I'm more of a tea kind of guy. Still, even someone as inexperienced as I am has to admit that there is something rather pleasant in the taste of a good coffee extracted by a skilled barista. That was certainly the case when we went to a coffee shop called Kahlúa. On the recommendation of my brother-in-law and my sister, I tasted two 'single origin' coffees ('single origin' as opposed to 'blends', as I learned from them). The first was called Araponga and the other Sul-de-Minas Especial (South of Minas Gerais Special); to be more precise, we tasted the latter first. I may fail to describe the sensation of smelling the 'exquisite' aroma, a mix of the brew and the freshly roasted beans. They were roasting the coffee while we were at the store. All I can say is that the coffees were amazing, neither bitter nor sour, just perfect. So much so that I couldn't help but buy two packets right away: one for myself, my wife and our dog (you have to know the dog to understand), and the other for a couple of friends who were 'dog sitting' our little cockapoo. It is worth mentioning that the beans were medium roasted, then packed and sealed while we were in the store. This helps preserve most of the characteristics of the coffee, I suppose.<br />
<br />
All very well, except for the fact that we didn't have a grinder to turn the beans into coffee powder, nor an espresso machine to brew it into something worth drinking. Back in Toronto, the first thing I did was to look for machines and learn a little about the art of espresso making. Well, there is a plethora of ways to brew coffee and a lot of different types of machines for making espresso variants. The choice of a particular type of machine depends, as we learned, on how much you want to be involved in the process of coffee making. It can range from completely manual to fully automated. In some matters, such as food and beverages, I like to be in control of the preparation whenever possible, or at least be part of it. Although I don't consider myself a gourmet, I like to fancy myself a reasonable cook. So I decided to venture into this new endeavour of espresso making.<br />
<br />
After some googling around, I settled on the Rancilio Silvia espresso machine and the Baratza Vario grinder. The main reasons were the good reviews of both machines on several sites like CoffeeGeek (<a href="http://coffeegeek.com/reviews/consumer/rancilio_silvia">http://coffeegeek.com/reviews/consumer/rancilio_silvia</a>, <a href="http://coffeegeek.com/proreviews/firstlook/baratzavario/details">http://coffeegeek.com/proreviews/firstlook/baratzavario/details</a>), and the fact that the bundle fit the budget we had available. I located a store (<a href="http://www.espressoplanet.com/">http://www.espressoplanet.com/</a>) in Mississauga (a city near Toronto) which had this particular combination in a promotional package, along with some accessories and 1 kg of coffee beans. On the first Saturday after arriving home, we went there. I must say that I was very impressed by the store, which turned out to be much larger than I expected. The person who took care of us there was very kind and knowledgeable. We had the opportunity to test the machines on the spot, clear up some doubts and taste coffees. Needless to say, we bought the package and other items we deemed necessary to complete the espresso experience. These included a calibrated tamper, a knock box, some 'vacuum' sealed containers for the beans and a new water filter. In the picture below you can see the two machines happily installed in our dining room.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTUj8UDvC5cnrhFeq8pbJB3_twSVtBWceYPUQpld44pwOATtCuxGGHCxYm-Kae8hfcMgoD4_ZUSUU8tvoXG5NM5NJB47PJlZ39aC-9qrzPtqmv4_WECcsIXInXijnlq0SOWH3t7ninJaWf/s1600/espresso-silvia-vario.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="426" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTUj8UDvC5cnrhFeq8pbJB3_twSVtBWceYPUQpld44pwOATtCuxGGHCxYm-Kae8hfcMgoD4_ZUSUU8tvoXG5NM5NJB47PJlZ39aC-9qrzPtqmv4_WECcsIXInXijnlq0SOWH3t7ninJaWf/s640/espresso-silvia-vario.jpg" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Rancilio Silvia and Baratza Vario</td></tr>
</tbody></table>Robinson Mittmannhttp://www.blogger.com/profile/00297238237576764278noreply@blogger.com0Toronto, ON, Canada43.6654711 -79.376258243.6640351 -79.37872569999999 43.666907099999996 -79.3737907tag:blogger.com,1999:blog-8812665732576096785.post-3470575108786216332012-02-10T11:52:00.000-08:002012-10-16T07:19:54.648-07:00ARM-GCC Toolchain How-ToOnce in a while I have to compile the GCC toolchain (Binutils, GCC, GDB) for a new platform, either because I want some new feature, because of a bug fix, or after installing a new operating system. As I don't do this often, I always have trouble remembering some of the steps. That's why I'm posting them here.<br />
<br />
Before you go any further I want to point out that we will not cover here how to compile the C++ compiler (g++) - this requires the compilation of a runtime library, and is a little more challenging. Only the C language will be supported, and no C library (libc) will be generated either. This will be, for sure, a limiting factor for almost everybody except those who are developing system software.<br />
<br />
This tutorial will explain how to compile a cross GCC toolchain for ARM processors on an <b>Ubuntu 10.04 LTS </b>host machine. It will probably work fine on other Ubuntu releases as well, but p<span style="background-color: white;">lease be aware that there is a good chance of these procedures failing if you intend to use a different combination of OS and source code (other versions of GCC, binutils or GDB).</span><br />
<br />
So there we go. First of all, let's get the packages:<br />
<br />
<span class="Apple-style-span" style="font-size: large;">Downloading the source code</span><br />
<br />
<pre class="prettyprint lang-bsh">cd /tmp
wget ftp://ftp.gnu.org/pub/gnu/binutils/binutils-2.22.tar.bz2
wget ftp://ftp.gnu.org/pub/gnu/gcc/gcc-4.6.2/gcc-core-4.6.2.tar.bz2
wget ftp://ftp.gnu.org/pub/gnu/gdb/gdb-7.4.tar.bz2
</pre>
<br />
Now let's prepare the environment to compile and install. I usually install the tools in a subdirectory under the <b>/opt</b> directory. In this case we will be installing in the <b>/opt/arm-none-eabi</b> directory. The binaries (programs such as gcc and gdb) will be located in the <b>/opt/arm-none-eabi/bin</b> subdirectory and will be prefixed by "<b>arm-none-eabi-</b>" (arm-none-eabi-as, arm-none-eabi-gcc, ...).<br />
<br />
<span class="Apple-style-span" style="font-size: large;">Installing development libraries</span><br />
<br />
<pre class="prettyprint lang-bsh">sudo apt-get install libmpfr-dev libgmp3-dev libmpc-dev
sudo apt-get install libz-dev
</pre>
<br />
The first line installs the MPFR, GMP and MPC development libraries, which are required to compile GCC (GMP and MPFR since version 4.3, MPC since version 4.5).<br />
The last line adds the zlib development package, as you may get an error when compiling the zlib provided with GCC.<br />
<br />
<span class="Apple-style-span" style="font-size: large;">Creating a build tree</span><br />
<br />
Assuming that all the source code files were downloaded into the <b>/tmp</b> directory, and that we will compile in our home directory:<br />
<br />
<pre class="prettyprint lang-bsh">cd
mkdir gcc-toolchain
cd gcc-toolchain
bzip2 -dc /tmp/binutils-2.22.tar.bz2 | tar -vxf -
bzip2 -dc /tmp/gcc-core-4.6.2.tar.bz2 | tar -vxf -
bzip2 -dc /tmp/gdb-7.4.tar.bz2 | tar -vxf -
mkdir arm-none-eabi
cd arm-none-eabi
mkdir binutils-2.22
mkdir gcc-4.6.2
mkdir gdb-7.4
export PATH=/opt/arm-none-eabi/bin:/bin:/usr/bin
</pre>
<br />
The last line sets up the PATH for the compilation. Notice that the first entry (/opt/arm-none-eabi/bin) does not exist yet, but it will be created when installing the binutils and will be necessary for compiling GCC.<br />
<br />
<span class="Apple-style-span" style="font-size: large;">Compiling GNU binutils</span><br />
<br />
First let's do the basics: assembler, archiver, linker and object files utilities.<br />
<br />
<pre class="prettyprint lang-bsh">cd binutils-2.22
../../binutils-2.22/configure --prefix=/opt/arm-none-eabi --target=arm-none-eabi --disable-nls
make -j 8
sudo make install
cd ..
</pre>
<br />
<span class="Apple-style-span" style="font-size: large;">Compiling GCC</span><br />
<br />
If everything went well, we are good to compile the cross-compiler. To make sure, check the <b>/opt/arm-none-eabi/bin</b> directory: the whole "arm-none-eabi-*" family of binutils must be there.<br />
<br />
<pre class="prettyprint lang-bsh">cd gcc-4.6.2
../../gcc-4.6.2/configure --prefix=/opt/arm-none-eabi --target=arm-none-eabi --disable-nls --disable-libssp --disable-zlib --enable-languages="c"
make -j 8
sudo make install
cd ..
</pre>
<br />
GCC is up, let's see if it's running:<br />
<pre class="prettyprint lang-bsh">$ arm-none-eabi-gcc
arm-none-eabi-gcc: fatal error: no input files
compilation terminated.
</pre>
<br />
If you got that message your compiler is fine.<br />
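<br />
Since this toolchain has no C library, a quick smoke test has to be freestanding code. A minimal sketch, assuming you save it under a name of your choice (say <b>checksum8.c</b>, a hypothetical name) and compile it with <b>arm-none-eabi-gcc -ffreestanding -c checksum8.c -o checksum8.o</b>:<br />
<br />
<pre class="prettyprint lang-c">/* Freestanding code: no headers, no C library calls. */

/* Sum of the bytes in a buffer, modulo 256. */
unsigned char checksum8(const unsigned char *buf, unsigned int len)
{
    unsigned char sum = 0;
    unsigned int i;

    for (i = 0; i < len; i++)
        sum = (unsigned char)(sum + buf[i]);
    return sum;
}
</pre>
If the object file is generated, you can look at the resulting ARM code with <b>arm-none-eabi-objdump -d checksum8.o</b>.<br />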
<br />
<span class="Apple-style-span" style="font-size: large;">Compiling GDB</span><br />
As an optional step, you can compile GDB. This will allow you, with the right tool, to remotely debug your embedded application.<br />
<br />
<pre class="prettyprint lang-bsh">cd gdb-7.4
../../gdb-7.4/configure --prefix=/opt/arm-none-eabi --target=arm-none-eabi --disable-nls
make -j 8
sudo make install
cd ..</pre>
<h4>
Update your PATH</h4>
You have to include the newly created toolchain <b>bin</b> directory into your PATH environment. Edit <b>.bashrc</b>, in your home directory, and add the following line:<br />
<br />
<pre class="prettyprint lang-bsh">export PATH=$PATH:/opt/arm-none-eabi/bin
</pre>
<br />
For the changes to take effect you will have to restart the terminal or source your <b>.bashrc</b> with:<br />
<br />
<pre class="prettyprint lang-bsh">$ source ~/.bashrc
</pre>
<br />
<b>/!\ Attention</b>: the <b>-j 8</b> parameter in the make line allows for parallel building, which speeds up the compilation process quite a lot. But, from my experience, I recommend not using <b>-j</b> without a number, as this may result in a non-responsive computer, and sometimes the compilation itself or some other applications may crash. For that reason always set the number of tasks to match the number of cores or threads your machine has. For example, I'm using an Intel Core i7 with 4 cores and 2 threads per core (Intel Hyper-Threading), so I use <b>-j 8</b>.<br />
<br />
<hr />
<br />
<span style="font-size: x-large;">Other tips</span><br />
<br />
<span style="font-size: large;">GMP and MPFR</span><br />
<br />
On some operating systems the GMP and MPFR libraries required to compile GCC are outdated. To solve this, we download their source code and tell the configure script where to find them:<br />
<br />
<pre class="prettyprint lang-bsh">wget -P /tmp ftp://ftp.gnu.org/pub/gnu/gmp/gmp-4.3.2.tar.bz2
wget -P /tmp http://www.mpfr.org/mpfr-current/mpfr-3.1.0.tar.bz2
bzip2 -dc /tmp/gmp-4.3.2.tar.bz2 | tar -vxf -
bzip2 -dc /tmp/mpfr-3.1.0.tar.bz2 | tar -vxf -
</pre>
<br />
<span style="font-size: large;">Newlib</span><br />
<br />
Depending on what your projects are you most probably will need a C library. NewLib is a good option and you can compile it along with GCC.<br />
Quoting from the Newlib's website: "Newlib is a C library intended for use on embedded systems. It is a conglomeration of several library parts, all under free software licenses that make them easily usable on embedded products."<br />
<br />
<pre class="prettyprint lang-bsh">cd /tmp
wget ftp://sources.redhat.com/pub/newlib/newlib-1.20.0.tar.gz
cd ~/gcc-toolchain
gzip -dc /tmp/newlib-1.20.0.tar.gz | tar -vxf -
</pre>
<br />
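In the GCC compilation step you need to tell the configure script that you want to use Newlib as your default C library: <i>--with-newlib</i><br />
As Newlib provides the support needed to build the C++ run-time elements, we can enable C++ in the GCC compilation as well:<i> --enable-languages="c,c++"</i>. Note that the gcc-core tarball only contains the C front end, so for C++ the matching gcc-g++-4.6.2.tar.bz2 package has to be unpacked over the same source tree first.<br />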
<br />
<pre class="prettyprint lang-bsh">cd ~/gcc-toolchain/arm-none-eabi/gcc-4.6.2
../../gcc-4.6.2/configure --prefix=/opt/arm-none-eabi --target=arm-none-eabi --disable-nls --disable-libssp --disable-zlib --enable-languages="c,c++" --with-newlib --with-headers=../../newlib-1.20.0/newlib/libc/include
make -j 8
sudo make install
cd ..
</pre>
<br />
Now you can compile the library:<br />
<br />
<pre class="prettyprint lang-bsh">mkdir newlib-1.20.0
cd newlib-1.20.0
../../newlib-1.20.0/configure --prefix=/opt/arm-none-eabi --target=arm-none-eabi
make -j 8
sudo make install
cd ..</pre>
<h4>
NewLib Notes</h4>
If you want to use NewLib to do I/O, dynamic memory allocation, file operations and some other functions, you will need to create an OS adaptation layer. I may write a tutorial on the subject one of these days.<br />
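<br />
Just to give an idea of what such a layer looks like, here is a rough sketch of two of the stubs, the ones behind malloc() and printf(). Everything board related is an assumption: <span style="font-family: Courier New, Courier, monospace;">board_putc</span>, the <span style="font-family: Courier New, Courier, monospace;">console</span> buffer and the static <span style="font-family: Courier New, Courier, monospace;">heap</span> array are stand-ins I made up, and the exact set of stubs and their prototypes depends on the Newlib version and configuration.<br />
<br />
<pre class="prettyprint lang-c">#include <errno.h>
#include <stddef.h>

#define HEAP_SIZE 4096 /* hypothetical heap size */

static char heap[HEAP_SIZE]; /* stand-in for a linker-defined heap */
static size_t heap_used;

static char console[256]; /* stand-in for a UART transmit register */
static size_t console_len;

/* Hypothetical board hook: ship one character to the console. */
static void board_putc(char c)
{
    if (console_len < sizeof(console))
        console[console_len++] = c;
}

/* Called by the library's malloc() to grow the heap.
 * This simple version never gives memory back. */
void *_sbrk(int incr)
{
    char *prev = heap + heap_used;

    if (incr < 0 || (size_t)incr > HEAP_SIZE - heap_used) {
        errno = ENOMEM;
        return (void *)-1;
    }
    heap_used += (size_t)incr;
    return prev;
}

/* Called by the library's stdio to output data (printf, puts, ...). */
int _write(int file, char *ptr, int len)
{
    int i;

    (void)file; /* every stream goes to the same console here */
    for (i = 0; i < len; i++)
        board_putc(ptr[i]);
    return len;
}
</pre>
On a real target <span style="font-family: Courier New, Courier, monospace;">board_putc</span> would write to the UART and the heap limits would come from the linker script.<br />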
<br />
<span style="font-size: large;">Updates</span><br />
<br />
I've tried the compilation sequence on my <b>netbook</b> running Lubuntu 12.04 and it worked like a charm. If you have an old or resource-limited computer, or simply can't stomach the new Gnome/Ubuntu interface, I really recommend Lubuntu: <a href="http://lubuntu.net/">http://lubuntu.net/</a>.<br />
<br />Robinson Mittmannhttp://www.blogger.com/profile/00297238237576764278noreply@blogger.com0Toronto, ON, Canada43.6525 -79.381667-1.5292459999999934 -160.241042 88.834246000000007 1.4777080000000069tag:blogger.com,1999:blog-8812665732576096785.post-43058403927502269132011-12-05T17:44:00.001-08:002012-02-23T06:47:27.650-08:00Embedded Linux Interprocess Communication Mechanisms Benchmark 2nd Part<span class="Apple-style-span" style="font-family: inherit;">This is the second part of the benchmark of some IPC on Embedded Linux. See the previous post: <a href="http://bobmittmann.blogspot.com/2011/12/embedded-linux-interprocess.html?spref=bl">Embedded Linux Interprocess Communication Mechanisms Benchmark - Part 1</a>.</span><br />
<span class="Apple-style-span" style="font-family: inherit;"><br />
</span><br />
<span class="Apple-style-span" style="font-family: inherit; font-size: x-large;">Source Code</span><br />
The source code with the tests can be downloaded from here: <a href="https://docs.google.com/open?id=0BzTuEidJ5iLeNDZlMTBjNDYtYTkzYy00YWU3LWJjMzItZGMxOGI5M2E4OTBl">ipc_bm.tar.gz</a>.<br />
To compile, just set the CROSS_COMPILE variable in the main Makefile and run <b>make all</b>.<br />
<div>
<span class="Apple-style-span" style="font-size: x-large;">Results</span><br />
The listings below show the output of the tests.<br />
<span class="Apple-style-span" style="font-size: large;">POSIX Message Queue</span><br />
<pre class="prettyprint lang-bsh">* IPC Benchmark start
- POSIX mq server...
- POSIX mq client...
- Large message send test:
0.52 secs, 3870.7 Msgs/s, 3870.7 KiB/s
- Large message receive test:
0.52 secs, 3858.2 Msgs/s, 3858.2 KiB/s
- Medium message send test:
0.50 secs, 4019.3 Msgs/s, 502.4 KiB/s
- Medium message receive test:
0.46 secs, 4386.2 Msgs/s, 548.3 KiB/s
- Small message send test:
0.50 secs, 4021.6 Msgs/s, 62.8 KiB/s
- Small message receive test:
0.45 secs, 4426.0 Msgs/s, 69.2 KiB/s
- Event posting test:
0.34 secs, 5918.5 Msgs/s, 23.1 KiB/s
* IPC Benchmark end.
</pre>
<span class="Apple-style-span" style="font-size: large;">Shared Memory</span><br />
<pre class="prettyprint lang-bsh">* IPC Benchmark start
- POSIX shared memory server...
- POSIX shared memory client...
- Large message send test:
0.82 secs, 2453.5 Msgs/s, 2453.5 KiB/s
- Large message receive test:
0.82 secs, 2447.1 Msgs/s, 2447.1 KiB/s
- Medium message send test:
0.80 secs, 2503.6 Msgs/s, 313.0 KiB/s
- Medium message receive test:
0.80 secs, 2494.6 Msgs/s, 311.8 KiB/s
- Small message send test:
0.79 secs, 2519.9 Msgs/s, 39.4 KiB/s
- Small message receive test:
0.80 secs, 2515.5 Msgs/s, 39.3 KiB/s
- Event posting test:
0.79 secs, 2540.8 Msgs/s, 9.9 KiB/s
* IPC Benchmark end.
</pre>
<span class="Apple-style-span" style="font-size: large;">ONC RPC</span><br />
<pre class="prettyprint lang-bsh">* IPC Benchmark start
- ONC RPC Server...
- ONC RPC Client...
- Large message send test
- 1.73 secs, 1156.8 Msgs/s, 1156.8 KiB/s
- Large message receive test
- 1.74 secs, 1148.4 Msgs/s, 1148.4 KiB/s
- Medium message send test
- 1.65 secs, 1209.0 Msgs/s, 151.1 KiB/s
- Medium message receive test
- 1.65 secs, 1211.2 Msgs/s, 151.4 KiB/s
- Small message send test
- 1.64 secs, 1219.2 Msgs/s, 19.0 KiB/s
- Small message receive test
- 1.66 secs, 1202.1 Msgs/s, 18.8 KiB/s
- Event posting test
- 0.25 secs, 7904.5 Msgs/s, 30.9 KiB/s
* IPC Benchmark end.</pre>
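<br />
Dividing the KiB/s column by the Msgs/s column gives the payload per message: 1 KiB for the large test, 128 bytes for medium, 16 bytes for small and 4 bytes for events. These sizes are inferred from the listings, not quoted from the benchmark source. The relation between the two rates can be sketched as follows (<span style="font-family: Courier New, Courier, monospace;">struct rate</span> and <span style="font-family: Courier New, Courier, monospace;">compute_rate</span> are illustrative names, not taken from ipc_bm):<br />
<br />
<pre class="prettyprint lang-c">#include <stddef.h>

struct rate {
    double msgs_per_sec;
    double kib_per_sec;
};

/* Derive both rates from a message count, the payload size in
 * bytes and the elapsed time in seconds. */
static struct rate compute_rate(unsigned long count, size_t msg_size,
                                double elapsed_sec)
{
    struct rate r;

    r.msgs_per_sec = (double)count / elapsed_sec;
    r.kib_per_sec = r.msgs_per_sec * (double)msg_size / 1024.0;
    return r;
}
</pre>
As an example, 2000 messages of 128 bytes in 0.5 seconds give 4000 Msgs/s and 500 KiB/s, which is in the same range as the medium message lines of the POSIX message queue listing.<br />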
<div>
<span class="Apple-style-span" style="font-size: x-large;">Comments</span><br />
In order to run all the tests you must be sure that the following options are enabled in the kernel:<br />
<br />
<pre>General Setup:
[*] POSIX Message Queues
...
Configure standard kernel features (for small systems) :
[*] Use full shmem filesystem
File systems:
...
Pseudo filesystems:
...
[*] Virtual memory file system support (former shm fs) </pre>
</div>
<br />
<a href="http://bobmittmann.blogspot.com/2011/12/embedded-linux-interprocess.html?spref=bl"><< Embedded Linux Interprocess Communication Mechanisms Benchmark - Part 1</a><br />
<br /></div>Robinson Mittmannhttp://www.blogger.com/profile/00297238237576764278noreply@blogger.com38Toronto, ON, Canada43.653226 -79.383184343.469412 -79.69904129999999 43.837039999999995 -79.0673273tag:blogger.com,1999:blog-8812665732576096785.post-17430332412925818362011-12-04T12:20:00.000-08:002012-02-23T06:47:27.646-08:00Embedded Linux Interprocess Communication Mechanisms BenchmarkHi there,<br />
<br />
this is my first attempt to blog, so please excuse me for not having or following any stylistic conventions for this kind of writing. As a matter of fact, writing is not something I do often. That being said, I'm open to any criticism regarding either misuse of the English language or errors/omissions in the information I will eventually present. So feel free to post comments and such.<br />
<br />
This first post is intended to be the initial part of a benchmark test of some IPC (Interprocess Communication) mechanisms that I'm evaluating for use in a commercial product. I'm not going to disclose what the project or product is about, but I will outline the requirements of the subsystem involved in the test.<br />
<br />
<span style="font-size: large;">Overview</span><br />
<br />
Before we start dealing with the problem, I would like to make some comments regarding the usefulness of the results I intend to achieve. Most of the reviews and comparisons out there lack information about the platform on which the tests were performed. In my opinion, this makes things a little confusing when you try to figure out whether a given solution is appropriate for your system. Differences in cache sizes and architectures, memory bandwidth, library implementation and other factors may affect the results, so favoring one solution over another depends on taking these conditions into account. Therefore, whatever the results of my particular tests may be, I will only recommend a particular approach to someone who has a similar platform.<br />
<br />
The system I'm working on right now is based on an ARM926EJ-S processor with 8 KiB of data cache and 16 KiB of instruction cache. The processor clock is close to 300 MHz and the system memory is 128 MiB of 16-bit DDR2 running at roughly 540 MHz.<br />
The Linux kernel version is 2.6.32 and the C library is glibc version 2.8.<br />
<br />
The test setup consists of two processes, a server and a client, that perform three types of conversation:<br />
<br />
1 - Synchronous request - the client issues a request to the server, which performs some tasks and returns some data as a result. The size of the reply may vary from 4 to 1024 bytes, depending on the service being requested. The client waits (blocks) for the server to reply.<br />
2 - Synchronous send - the client sends some data to the server (4 to 1024 bytes) and waits for the server to process it and reply with a status.<br />
3 - Asynchronous notification - the client sends a notification (event) to the server without waiting for an acknowledgment.<br />
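<br />
One way to picture these three conversations is as a single message layout shared by all of them. The sketch below is purely illustrative: the type names, field names and header layout are my assumptions here, not the format used by the benchmark code.<br />
<br />
<pre class="prettyprint lang-c">#include <stddef.h>
#include <stdint.h>

#define MSG_PAYLOAD_MAX 1024 /* largest payload in the tests */

enum msg_type {
    MSG_REQUEST = 1, /* client waits for data from the server */
    MSG_SEND    = 2, /* client sends data, waits for a status */
    MSG_NOTIFY  = 3  /* client posts an event, does not wait */
};

struct msg {
    uint32_t type; /* one of enum msg_type */
    uint32_t len;  /* payload bytes in use */
    uint8_t  payload[MSG_PAYLOAD_MAX];
};

/* Only the asynchronous notification is fire-and-forget. */
static int msg_needs_reply(const struct msg *m)
{
    return m->type != MSG_NOTIFY;
}

/* Bytes that actually have to cross the IPC channel. */
static size_t msg_wire_size(const struct msg *m)
{
    return offsetof(struct msg, payload) + m->len;
}
</pre>
With this layout a 4-byte event costs 12 bytes on the channel, while the largest message costs 1032.<br />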
<br />
<br />
The mechanisms being tested are:<br />
<br />
1 - POSIX Message Queues (mq): in this case the messages are sent, received and synchronized through the mechanism itself. It's a very straightforward implementation.<br />
2 - POSIX Shared Memory + POSIX Semaphores: this is a little more involved, as we need one mechanism to transfer the data (shared memory) and another for synchronization and mutual exclusion (in case more than one client accesses the server's shared memory resources).<br />
3 - ONC RPC (Open Network Computing Remote Procedure Call, a.k.a. Sun RPC): it may seem a little odd to even consider this, but some of the reasons are:<br />
<br />
<ul>
<li>It will simplify the interface creation through the use of XDR (a kind of IDL)</li>
<li>It will enable the same API to be used for remote access, which is another requirement of the product.</li>
<li>It has provision for UNIX sockets for local transactions (although I have never used it myself)</li>
<li>It will be fun to do it.</li>
</ul>
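To give a feeling for how straightforward the message queue option is, here is a minimal sketch of a round trip through a POSIX queue. It runs in a single process only to keep it short, whereas the real test uses a separate server and client. The function name <span style="font-family: Courier New, Courier, monospace;">mq_roundtrip</span> and the queue attributes are my own choices for this illustration.<br />
<br />
<pre class="prettyprint lang-c">#include <fcntl.h>
#include <mqueue.h>
#include <string.h>
#include <sys/stat.h>

#define MSG_MAX 1024 /* largest message, in bytes */

/* Create queue 'name', send the string 'text' through it and read
 * it back into 'out' (which must hold at least MSG_MAX bytes).
 * Returns the number of bytes received, or -1 on error. */
static int mq_roundtrip(const char *name, const char *text, char *out)
{
    struct mq_attr attr;
    ssize_t n = -1;
    mqd_t q;

    memset(&attr, 0, sizeof(attr));
    attr.mq_maxmsg = 4;        /* queue depth */
    attr.mq_msgsize = MSG_MAX; /* receive buffers must be this large */

    mq_unlink(name); /* drop a stale queue, errors are ignored */
    q = mq_open(name, O_CREAT | O_RDWR, 0600, &attr);
    if (q == (mqd_t)-1)
        return -1;
    /* send the string including its terminator, priority 0 */
    if (mq_send(q, text, strlen(text) + 1, 0) == 0)
        n = mq_receive(q, out, MSG_MAX, NULL);
    mq_close(q);
    mq_unlink(name);
    return (int)n;
}
</pre>
Queue names must start with a slash, for example "/ipc_demo". Depending on the C library version, linking may require <b>-lrt</b>, and the kernel must have POSIX message queues enabled.<br />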
<div>
<a href="http://bobmittmann.blogspot.com/2011/12/embedded-linux-interprocess_05.html?spref=bl">Embedded Linux Interprocess Communication Mechanisms Benchmark - Part 2 >></a><br />
<br /></div>Robinson Mittmannhttp://www.blogger.com/profile/00297238237576764278noreply@blogger.com2Toronto, ON, Canada43.66541158014936 -79.37626399346925243.528225080149362 -79.637776993469245 43.802598080149359 -79.114750993469258