[Xenomai] Isolation of CPU (isolcpus=1). Unexpected better performance when RT thread is on core0.

Yann le Chevoir yann.le-chevoir at etu.esisar.grenoble-inp.fr
Mon Feb 5 16:50:00 CET 2018


Hello,

I am an engineering student and I am trying to prove that a 4000Hz hard
real-time application can run on an ARM board rather than on a more
powerful machine.

I work with an i.MX6 dual-core and Xenomai Cobalt 3.0.4. I use the POSIX skin.
By the way, I first installed Xenomai Cobalt 3.0.5, but my first
experiments revealed that the Alchemy API did not work properly (for
example, the latency test did not work). It would have needed more
investigation, but when I tried the previous version, it worked. I did not
test v3.0.6.

For now, my point is that I observe some unexpected behavior when
isolating cpu1, and perhaps you can explain some of it to me.

My application looks like:

main(){

     create a POSIX Xenomai thread1, prio = 99, cpu = 1
     /*cpu1 is isolated, see below*/

     create a POSIX Xenomai thread0, prio = 98, cpu = 0
     /*I observed that thread1 latency is better if an RT thread
       remains on core 0*/

     start both threads

     while(1){
          print_stat();
     }

}

thread1(){

     struct timespec start, stop, next, interval = 250us;

     /* Initialization of the periodicity */
     clock_gettime(CLOCK_REALTIME, &next);
     next += interval;

     while(1){
          /*Releases at specified rate*/
          clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &next, NULL);
          /*Get time to check jitter and execution time*/
          clock_gettime(CLOCK_REALTIME, &start);
          do_job();
          /*Get time to check execution time*/
          clock_gettime(CLOCK_REALTIME, &stop);
          do_stat(); //jitter = start-next; exec_time = stop-start
          next += interval;
     }

}

thread0(){

     while(1){
          usleep(10000);
          /* Do nothing. I just observed that thread1 latency is better
           * if an RT thread remains on core 0. Do you have an
           * explanation for that too?*/
     }

}
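
For readers who want to reproduce the timing part: "next += interval" in the pseudocode above is shorthand, since a struct timespec cannot be added to directly. A minimal sketch of the same loop in plain POSIX C could look like the following. The names timespec_add_ns, timespec_diff_ns, loop_stats, stats_update, periodic_loop and noop_job are hypothetical and are not taken from the original test_4000Hz program; noop_job merely stands in for do_job().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L
#define PERIOD_NS    (NSEC_PER_SEC / 4000)  /* 4000 Hz -> 250000 ns = 250 us */

/* Advance *ts by ns nanoseconds (0 <= ns < 1 s), keeping tv_nsec normalized. */
static void timespec_add_ns(struct timespec *ts, long ns)
{
    assert(ns >= 0 && ns < NSEC_PER_SEC);
    ts->tv_nsec += ns;
    if (ts->tv_nsec >= NSEC_PER_SEC) {
        ts->tv_nsec -= NSEC_PER_SEC;
        ts->tv_sec += 1;
    }
}

/* Return (a - b) in nanoseconds. */
static int64_t timespec_diff_ns(const struct timespec *a,
                                const struct timespec *b)
{
    return (int64_t)(a->tv_sec - b->tv_sec) * NSEC_PER_SEC
           + (a->tv_nsec - b->tv_nsec);
}

/* Hypothetical container for the figures quoted in this mail. */
struct loop_stats {
    int64_t min_exec_ns;
    int64_t max_exec_ns;
    int64_t max_jitter_ns;
};

/* jitter = start - next; exec_time = stop - start (as in do_stat()). */
static void stats_update(struct loop_stats *s, const struct timespec *next,
                         const struct timespec *start,
                         const struct timespec *stop)
{
    int64_t jitter = timespec_diff_ns(start, next);
    int64_t exec = timespec_diff_ns(stop, start);

    if (exec < s->min_exec_ns) s->min_exec_ns = exec;
    if (exec > s->max_exec_ns) s->max_exec_ns = exec;
    if (jitter > s->max_jitter_ns) s->max_jitter_ns = jitter;
}

/* Stand-in for do_job(); the real workload is not shown in the mail. */
static void noop_job(void) { }

/* Run `iterations` periods of PERIOD_NS, calling job() once per period. */
static void periodic_loop(long iterations, void (*job)(void),
                          struct loop_stats *s)
{
    struct timespec next, start, stop;

    s->min_exec_ns = INT64_MAX;
    s->max_exec_ns = INT64_MIN;
    s->max_jitter_ns = INT64_MIN;

    /* Initialization of the periodicity */
    clock_gettime(CLOCK_REALTIME, &next);
    timespec_add_ns(&next, PERIOD_NS);

    for (long i = 0; i < iterations; i++) {
        /* Sleep until the absolute release time */
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &next, NULL);
        clock_gettime(CLOCK_REALTIME, &start);
        job();
        clock_gettime(CLOCK_REALTIME, &stop);
        stats_update(s, &next, &start, &stop);
        timespec_add_ns(&next, PERIOD_NS);
    }
}
```

Calling periodic_loop(240000, noop_job, &stats) would cover 60 seconds at 4000Hz. This sketch only uses standard POSIX calls; in a Xenomai application the same calls are normally routed to Cobalt by building with the flags reported by xeno-config for the POSIX skin.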

In fact, I expected that my thread would perform better if it were the only
one on core1. The idea was that all the Linux stuff stays on core0.
So, I added the boot argument isolcpus=1.
Note that these processes remain on core1. I think that is normal:

  PID   PSR   RSS   CLS   RTPRIO   CMD
  11     1     0    FF      99     [migration/1]
  12     1     0    TS      -      [ksoftirqd/1]
  13     1     0    TS      -      [kworker/1:0]
  14     1     0    TS      -      [kworker/1:0H]
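
As a side note for reproducing this setup: isolcpus only keeps the Linux scheduler from placing ordinary tasks on cpu1, so thread1 still has to be pinned there explicitly. A minimal sketch of thread attributes matching "prio = 99, cpu = 1" is shown below, assuming the GNU extension pthread_attr_setaffinity_np is available. The helper name rt_attr_init is hypothetical, and this is not necessarily how the original program sets up its threads.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/*
 * Prepare attr so that a thread created with it runs under SCHED_FIFO at
 * priority `prio` and is allowed to run only on CPU `cpu`.
 * Returns 0 on success or an errno-style code on failure.
 */
static int rt_attr_init(pthread_attr_t *attr, int prio, int cpu)
{
    struct sched_param param = { .sched_priority = prio };
    cpu_set_t cpus;
    int ret;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);

    ret = pthread_attr_init(attr);
    if (ret)
        return ret;

    /* Without this, policy and priority are inherited from the creator. */
    ret = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (ret)
        goto fail;

    /* Set the policy before the priority so the range check matches it. */
    ret = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
    if (ret)
        goto fail;

    ret = pthread_attr_setschedparam(attr, &param);
    if (ret)
        goto fail;

    /* Restrict the thread to a single CPU. */
    CPU_ZERO(&cpus);
    CPU_SET(cpu, &cpus);
    ret = pthread_attr_setaffinity_np(attr, sizeof(cpus), &cpus);
    if (ret)
        goto fail;

    return 0;

fail:
    pthread_attr_destroy(attr);
    return ret;
}
```

The resulting attribute object would then be passed to pthread_create(), for example with rt_attr_init(&attr, 99, 1) for thread1 and rt_attr_init(&attr, 98, 0) for thread0. On plain Linux, creating a SCHED_FIFO thread usually requires root or CAP_SYS_NICE.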

Then, I use the script /usr/xenomai/bin/dohell to load Linux (core0)
for 90 seconds: ./dohell -m /mnt 90

And I launch my process for 60 seconds.
Here is the thread list:

  PID   TID   PSR   RSS   CLS   RTPRIO   CMD
  585   585    0   1952    TS     -      /bin/sh ./dohell -m /mnt 90
  586   586    0   1588    TS     -      /bin/sh ./dohell -m /mnt 90
  587   587    0   1640    TS     -      /bin/sh ./dohell -m /mnt 90
  588   588    0   1604    TS     -      /bin/sh ./dohell -m /mnt 90
  589   589    0   352     TS     -      dd if=/dev/zero of=/dev/null
  590   590    0   1524    TS     -      /bin/sh ./dohell -m /mnt 90
  591   591    0   344     TS     -      sleep 90
  881   881    0   29332   TS     -      ./test_4000Hz
  881   890    0   29332   TS     -      ./test_4000Hz
  881   1013   1   29332   FF     99     ./test_4000Hz
  881   1014   0   29332   FF     98     ./test_4000Hz

TID 881 is the main thread.
I am not sure why there is a TID 890 thread. Is it a Xenomai one (main)?
TID 1013 is the thread I am interested in.
TID 1014 is the thread I keep on core0 to get better latency on core1.


Here is the point:

With the configuration described above, I get the graph
"Core1.png" showing latency and execution time.
Note that I always plot thread1 statistics; it is the only thread
I am interested in.

Reminder of the configuration when plotting "Core1.png":
Core0: Linux stressed + main + thread0
Core1: thread1

Min execution time is 32us.
Max execution time is 82us.
I am a bit disappointed by such large execution-time variations.
How can we explain that?


Then, trying permutations to understand these variations, I decided to put
thread1 on CPU0. Linux, main, thread0 and dohell continue doing their stuff.
Note that the isolcpus=1 argument is still there, so nothing is on CPU1.
I am surprised to get better execution-time statistics. Is it a known
situation and how can we explain that? See "Core0.png".

Reminder of the configuration when plotting "Core0.png":
Core0: Linux stressed + main + thread0 + thread1
Core1: -

Min execution time is 32us.
Max execution time is 65us.


Then, given these results, as I had the feeling that a mono-core processor
performs better than a dual-core one, I tried removing the isolcpus=1
argument to prove the contrary.

Here is the configuration when plotting "NoIsolation.png":
Core0: Linux stressed + main + thread0
Core1: Linux stressed + thread1

As you can see, the graph looks like the first one, but execution time
is even worse at 94us.

Is there something I am doing wrong?

Thanks for your help, and thanks for your work on Xenomai :)

Yann
-------------- next part --------------
A non-text attachment was scrubbed...
Name: NoIsolation.png
Type: image/png
Size: 15937 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20180205/771a79a2/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Core0.png
Type: image/png
Size: 15550 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20180205/771a79a2/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Core1.png
Type: image/png
Size: 16104 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20180205/771a79a2/attachment-0002.png>

